WO2020142059A1 - Demapping based on machine learning

Demapping based on machine learning

Info

Publication number
WO2020142059A1
Authority
WO
WIPO (PCT)
Prior art keywords
neural networks
symbol
values
nodes
computing device
Application number
PCT/US2018/068123
Other languages
French (fr)
Inventor
Ori Shental
Original Assignee
Nokia Technologies Oy
Nokia Usa Inc.
Application filed by Nokia Technologies Oy, Nokia Usa Inc. filed Critical Nokia Technologies Oy
Priority to PCT/US2018/068123 priority Critical patent/WO2020142059A1/en
Publication of WO2020142059A1 publication Critical patent/WO2020142059A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L27/00 Modulated-carrier systems
    • H04L27/32 Carrier systems characterised by combinations of two or more of the types covered by groups H04L27/02, H04L27/10, H04L27/18 or H04L27/26
    • H04L27/34 Amplitude- and phase-modulated carrier systems, e.g. quadrature-amplitude modulated carrier systems
    • H04L27/38 Demodulator circuits; Receiver circuits
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Definitions

  • Various example embodiments relate to methods and systems for demapping based on machine learning.
  • a demapper may process received symbols, and may make decisions indicating likelihood that particular bits indicated by the received symbols correspond to ones or zeros.
  • the demapper may process every symbol that the receiver receives. Inefficient demapping architectures may lead to higher consumption of resources and/or power.
  • a computing device comprising one or more neural networks may receive a symbol comprising one or more values.
  • the computing device may input, to the one or more neural networks, the one or more values.
  • the computing device may demap, using the one or more neural networks and based on the one or more values, the symbol to generate one or more confidence scores corresponding to one or more bits indicated by the symbol.
  • the computing device may send, to a decoder, the one or more confidence scores.
  • the one or more confidence scores may approximate one or more log-likelihood ratios that the one or more bits correspond to one or more ones or one or more zeros.
  • each of the one or more neural networks may correspond to a bit of the one or more bits.
  • each of the one or more neural networks may comprise an input layer comprising one or more first nodes corresponding to the one or more values.
  • Each of the one or more neural networks may comprise a hidden layer comprising one or more second nodes.
  • Each of the one or more neural networks may comprise an output layer comprising a third node corresponding to a confidence score of the one or more confidence scores.
  • each of the one or more neural networks may comprise an input layer comprising one or more first nodes corresponding to the one or more values.
  • Each of the one or more neural networks may comprise a plurality of hidden layers comprising a plurality of second nodes.
  • Each of the one or more neural networks may comprise an output layer comprising a third node corresponding to a confidence score of the one or more confidence scores.
  • the demapping of the symbol may comprise demapping, using a single neural network, the symbol.
  • the single neural network may comprise one or more output nodes corresponding to the one or more confidence scores.
  • the symbol may be associated with a modulation constellation used by the computing device.
  • each of the one or more neural networks may comprise a plurality of parameters.
  • the computing device may train, based on additional symbols and by adjusting the plurality of parameters, each of the one or more neural networks, such that confidence scores, generated by the one or more neural networks, corresponding to the additional symbols approach log-likelihood ratios corresponding to the additional symbols.
  • the computing device may update the one or more neural networks.
  • the computing device may periodically update, based on an updating frequency, the one or more neural networks.
  • the one or more values may comprise two values.
  • the one or more bits indicated by the symbol may comprise at least four bits.
  • a computing device may comprise means for receiving a symbol comprising one or more values.
  • the computing device may comprise means for implementing one or more neural networks.
  • the computing device may comprise means for inputting, to the one or more neural networks, the one or more values.
  • the computing device may comprise means for demapping, using the one or more neural networks and based on the one or more values, the symbol to generate one or more confidence scores corresponding to one or more bits indicated by the symbol.
  • the computing device may comprise means for sending, to a decoder, the one or more confidence scores.
  • FIG. 1A is a schematic diagram showing an example communication system in which features described herein may be implemented.
  • FIG. 1B shows an example constellation diagram and an example correspondence between bit sequences and symbols.
  • FIG. 2 is a schematic diagram showing an example neural network with which features described herein may be implemented.
  • FIG. 3 is a schematic diagram showing an example system for demapping based on machine learning.
  • FIG. 4 is a schematic diagram showing another example system for demapping based on machine learning.
  • FIG. 5 is a schematic diagram showing an example process for training a neural network for demapping.
  • FIGS. 6A-6B are a flowchart showing an example method for demapping based on machine learning.
  • FIG. 7 shows an example apparatus that may be used to implement one or more aspects described herein.
  • FIG. 1A is a schematic diagram showing an example communication system in which features described herein may be implemented.
  • the system may comprise one or more transmitters (e.g., transmitter 101), one or more receivers (e.g., receiver 103), and one or more communication media (e.g., communication medium 105).
  • the transmitter 101 may be configured to send information to the receiver 103 via the communication medium 105.
  • the transmitter 101 may comprise, for example, a wireless transmitter, a wired transmitter, an optical transmitter, a telecommunications transmitter, a Wi-Fi transmitter, a cellular network transmitter, a fifth generation wireless systems (5G) transmitter, a television radio transmitter, a Data Over Cable Service Interface Specification (DOCSIS) transmitter, a digital subscriber line (DSL) transmitter, a G.fast transmitter, or any other type of device configured to send information.
  • the transmitter 101 may be implemented in any type of computing device, such as a smartphone, a cell phone, a mobile communication device, a personal computer, a server, a tablet, a desktop computer, a laptop computer, a gaming device, a virtual reality headset, a base station, a television, the computing device as described in connection with FIG. 7, etc.
  • the transmitter 101 may implement various modulation methods, such as pulse- amplitude modulation (PAM), quadrature amplitude modulation (QAM), modulation methods based on constellation diagrams, and/or other types of modulation methods.
  • the transmitter 101 may comprise, for example, one or more encoders (e.g., encoder 107) and one or more mappers (e.g., mapper 109).
  • the transmitter 101 may comprise additional and/or alternative components for carrying out functions of the transmitter 101.
  • the encoder 107 may be configured to receive data (e.g., binary digits of information) to be sent to the receiver 103, and to convert the data into another code.
  • the encoder 107 may convert received data into an error detection and/or correction code, such as a repetition code, a parity bit code, a checksum code, a Hamming code, a binary Golay code, a low-density parity-check code, etc.
  • the mapper 109 may be configured to map the output of the encoder 107 to symbols used in modulation (e.g., PAM, QAM, etc.).
  • the symbols may be represented by points on a constellation diagram of a modulation method.
  • Each symbol may comprise one or more values. For example (e.g., in QAM), a first value of a symbol may correspond to an in-phase component, and a second value of the symbol may correspond to a quadrature component.
  • Such a symbol may be represented by a complex number (e.g., -1+3j).
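The mapping step described above can be sketched in code. This is an illustrative sketch only: the patent does not spell out a full bit-to-symbol table, so the per-axis Gray labeling below is an assumption, chosen so that the bit sequence 0110 maps to -1+3j as in the FIG. 1B example.

```python
# Assumed per-axis Gray labeling for a 16-QAM sketch (hypothetical table,
# chosen so that 0110 maps to -1+3j as in the document's example).
GRAY_2BIT = {"00": -3, "01": -1, "11": 1, "10": 3}

def map_16qam(bits):
    """Map a bit string (length divisible by 4) to 16-QAM symbols:
    two bits select the in-phase level, two bits the quadrature level."""
    symbols = []
    for k in range(0, len(bits), 4):
        i_val = GRAY_2BIT[bits[k:k + 2]]      # in-phase component
        q_val = GRAY_2BIT[bits[k + 2:k + 4]]  # quadrature component
        symbols.append(complex(i_val, q_val))
    return symbols

# The encoder output 01100110 (repetition-coded 0110) maps to two symbols:
print(map_16qam("01100110"))  # -> [(-1+3j), (-1+3j)]
```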
  • the transmitter 101 may be configured to convert (e.g., modulate) the symbols generated by the mapper 109 onto carrier signals (e.g., carrier pulses, carrier waves, etc.), and may generate modulated signals carrying the symbols. For example (e.g., in QAM), the transmitter 101 may change the amplitudes of two carrier waves (e.g., of the same frequency and out of phase with each other by 90 degrees) according to two corresponding values of a symbol output by the mapper 109, and may add the two modulated carrier waves together for sending via the communication medium 105.
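The two-carrier modulation described above can be sketched as follows. This is a minimal baseband illustration (the carrier frequency, sample count, and function name are hypothetical), showing the in-phase value scaling a cosine carrier and the quadrature value scaling a sine carrier 90 degrees out of phase, with the two modulated carriers summed.

```python
import math

def modulate(symbol, f_c=1.0, n_samples=8):
    """Generate passband samples for one QAM symbol: the in-phase value
    scales a cosine carrier, the quadrature value scales a sine carrier
    (90 degrees out of phase), and the two modulated carriers are added."""
    out = []
    for n in range(n_samples):
        t = n / n_samples
        out.append(symbol.real * math.cos(2 * math.pi * f_c * t)
                   - symbol.imag * math.sin(2 * math.pi * f_c * t))
    return out

samples = modulate(complex(-1, 3))
# At t = 0 the sine term vanishes, so the first sample equals the in-phase value.
print(samples[0])  # -> -1.0
```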
  • the communication medium 105 may comprise, for example, wire, cable, fiber, microwave, satellite, radio, infrared, or any other type of communication medium. Signals (e.g., from the transmitter 101) may propagate through the communication medium 105. Noise may be added to the signals. The added noise may comprise, for example, additive white Gaussian noise (AWGN), attenuation, phase-shift, etc. A signal may be transmitted via the communication medium 105 from the transmitter 101 to the receiver 103.
  • the receiver 103 may comprise, for example, a wireless receiver, a wired receiver, an optical receiver, a telecommunications receiver, a Wi-Fi receiver, a cellular network receiver, a 5G receiver, a television radio receiver, a DOCSIS receiver, a DSL receiver, a G.fast receiver, or any other type of device configured to receive information.
  • the receiver 103 may be implemented in any type of computing device, such as a smartphone, a cell phone, a mobile communication device, a personal computer, a server, a tablet, a desktop computer, a laptop computer, a gaming device, a virtual reality headset, a base station, a television, the computing device as described in connection with FIG. 7, etc.
  • the receiver 103 may be configured to demodulate signals received from the transmitter 101, and to obtain the data sent by the transmitter 101.
  • the receiver 103 may comprise, for example, one or more demappers (e.g., demapper 115), one or more channel estimators (e.g., channel estimator 117), and one or more decoders (e.g., decoder 119).
  • the receiver 103 may comprise additional and/or alternative components for carrying out functions of the receiver 103.
  • a channel equalizer (not shown) configured to perform channel equalization may be implemented, in the receiver 103, to process received signals.
  • the receiver 103 may be configured to convert (e.g., demodulate) a signal received from the transmitter 101 via the communication medium 105, and may generate one or more demodulated symbols corresponding to the received signal. For example (e.g., in QAM), the receiver 103 may multiply the received signal separately with a cosine signal and/or a sine signal (e.g., of the same frequency as the received signal) to generate estimates of the values of symbols carried by the received signal.
  • the channel estimator 117 may be configured to estimate the condition of a communication channel, and may send the estimation to the demapper 115. For example, the channel estimator 117 may monitor the condition of a communication channel between the mapper 109 and the demapper 115.
  • the channel gain h and the channel AWGN variance σ² may, for example, be monitored by the channel estimator 117 using any type of channel estimation method (e.g., using a reference signal and/or a signal commonly known to the transmitter 101 and the receiver 103), and may be sent to the demapper 115 (e.g., for determining log-likelihood ratios).
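One common way to estimate h and σ² from reference symbols known to both transmitter and receiver is a least-squares fit; the sketch below illustrates that approach. The patent does not specify an estimation method, so the function and its details are assumptions for illustration only.

```python
def estimate_channel(ref_sent, ref_received):
    """Estimate the complex channel gain h and the AWGN variance from
    reference symbols known to both ends (least-squares sketch)."""
    n = len(ref_sent)
    # LS estimate of the gain: h = sum(r * conj(s)) / sum(|s|^2)
    num = sum(r * s.conjugate() for s, r in zip(ref_sent, ref_received))
    den = sum(abs(s) ** 2 for s in ref_sent)
    h = num / den
    # Noise variance from the residuals r - h*s
    var = sum(abs(r - h * s) ** 2 for s, r in zip(ref_sent, ref_received)) / n
    return h, var

sent = [complex(1, 1), complex(-1, 1), complex(1, -1), complex(-1, -1)]
h_hat, var_hat = estimate_channel(sent, [0.5 * s for s in sent])
print(h_hat)  # -> (0.5+0j) for a noiseless channel with gain 0.5
```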
  • the demapper 115 may comprise neural network(s) used to perform operations that may otherwise be performed using a soft demapping function to generate one or more confidence scores corresponding to one or more bits indicated by a symbol received from the transmitter 101.
  • a confidence score corresponding to a bit may indicate a level of confidence in a determination that the bit corresponds to a one (1) or a zero (0).
  • a soft demapping function for determining a confidence score based on a log-likelihood ratio L(b_i|r) is shown below as equation (1):
  • L(b_i|r) = log( Σ_{s ∈ S_i^(0)} exp(−|r − h·s|² / σ²) ) − log( Σ_{s ∈ S_i^(1)} exp(−|r − h·s|² / σ²) ) (1)
  • r may represent a received symbol (e.g., a complex number)
  • b_i may represent the ith bit indicated by the received symbol
  • L(b_i|r) may represent the log-likelihood ratio of the ith bit indicated by the received symbol
  • s may represent a symbol in the modulation constellation diagram used, S_i^(1) may represent the set of constellation symbols whose ith bits correspond to ones (1), S_i^(0) may represent the set of constellation symbols whose ith bits correspond to zeros (0), h may represent the channel gain, and σ² may represent the variance of the channel AWGN.
  • a higher log-likelihood ratio corresponding to a particular bit may indicate a higher confidence level that the bit corresponds to a zero (0)
  • a lower log-likelihood ratio corresponding to the bit may indicate a higher confidence level that the bit corresponds to a one (1).
  • a log-likelihood ratio may additionally or alternatively be approximated according to the following equation:
  • L(b_i|r) ≈ (1/σ²) · ( min_{s ∈ S_i^(1)} |r − h·s|² − min_{s ∈ S_i^(0)} |r − h·s|² ) (2)
  • in equation (2), S_i^(1) and S_i^(0) may represent the sets of constellation symbols whose ith bits correspond to ones (1) and zeros (0), respectively.
  • the equation (2) may be an approximation of the exact equation (1) for determining log-likelihood ratios.
  • the approximation equation (2) may avoid the exponential and/or logarithmic operations associated with the equation (1).
  • the approximation equation (2) may require O(2^m) operations (where m may represent the number of bits indicated by a symbol).
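The exact log-likelihood ratio and its max-log approximation can be sketched directly from the definitions above. The sign convention matches the text: a higher score means the bit is more likely a zero. The toy bit partition at the bottom (zeros at {-3, -1}, ones at {+1, +3}) is a hypothetical 4-PAM-style example, not a constellation from the patent.

```python
import math

def llr_exact(r, h, var, ones, zeros):
    """Equation (1): exact log-likelihood ratio for one bit of symbol r.
    `zeros`/`ones` are the constellation points whose label has a 0/1 in
    that bit position; h is the channel gain, var the AWGN variance."""
    p0 = sum(math.exp(-abs(r - h * s) ** 2 / var) for s in zeros)
    p1 = sum(math.exp(-abs(r - h * s) ** 2 / var) for s in ones)
    return math.log(p0 / p1)

def llr_maxlog(r, h, var, ones, zeros):
    """Equation (2): max-log approximation, avoiding exp/log operations."""
    d1 = min(abs(r - h * s) ** 2 for s in ones)
    d0 = min(abs(r - h * s) ** 2 for s in zeros)
    return (d1 - d0) / var

# Hypothetical 4-PAM-style split of one bit: zeros at {-3,-1}, ones at {+1,+3}
zeros, ones = [-3 + 0j, -1 + 0j], [1 + 0j, 3 + 0j]
r = complex(-1.1, 0)  # received close to the -1 point
print(llr_exact(r, 1.0, 1.0, ones, zeros) > 0)   # -> True (bit likely 0)
print(llr_maxlog(r, 1.0, 1.0, ones, zeros) > 0)  # -> True
```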
  • the confidence scores as determined by the demapper 115 may be sent to the decoder 119.
  • the decoder 119 may be configured to decode received symbols, and to determine the data sent by the transmitter 101, for example, in connection with the demapper 115 and using any type of soft-decision algorithm, such as maximum a posteriori (MAP) decoding of error correction codes, etc. Based on the confidence scores as determined by the demapper 115, the decoder 119 may decode the error correction code associated with the received symbols, and may determine the data sent by the transmitter 101 (e.g., data prior to being encoded by the encoder 107).
  • MAP maximum a posteriori
  • data (e.g., four bits 0110) may be sent to the encoder 107.
  • the encoder 107 may convert the data (e.g., 0110) into another code.
  • the encoder 107 may use a repetition code to encode the data (e.g., 0110), and may generate 01100110.
  • the output of the encoder 107 (e.g., 01100110) may be sent to the mapper 109.
  • the mapper 109 may map 01100110 to one or more symbols, for example, based on a correspondence between bit sequences and symbols.
  • FIG. 1B shows an example constellation diagram 150 and an example correspondence between bit sequences and symbols (e.g., implemented by the mapper 109).
  • the constellation diagram 150 may comprise one or more axes (e.g., a horizontal real axis and a vertical imaginary axis).
  • the constellation diagram 150 may be regarded as a complex plane.
  • Each symbol (e.g., a complex number) on the constellation diagram 150 may correspond to a bit sequence.
  • a symbol 151 (e.g., -1+3j) may correspond to a bit sequence (e.g., 0110).
  • the mapper 109 may, for example, map the output of the encoder 107 (e.g., 01100110) to two symbols -1+3j, -1+3j.
  • the symbols may be modulated (e.g., with an in-phase carrier signal and a quadrature carrier signal), and the modulated signal may be sent to the receiver 103 via the communication medium 105.
  • the receiver 103 may demodulate the received signal, and may generate two demodulated symbols (e.g., -0.76+1.3j, -1.1+2.15j).
  • the demodulated symbols may be different than the two symbols generated by the mapper 109 (e.g., because of channel noise).
  • the demapper 115 may determine the demodulated symbols, and may determine confidence scores for the demodulated symbols. For example, the confidence scores for the four bits of the first demodulated symbol may be 0.7, -1.3, -0.1, 1.4, and the confidence scores for the four bits of the second demodulated symbol may be 3.1, -0.7, -1.1, 2.3.
  • a higher confidence score corresponding to a bit may indicate the bit is more likely to be a zero (0).
  • a lower confidence score corresponding to a bit may indicate the bit is more likely to be a one (1).
  • a zero confidence score corresponding to a bit may indicate that the bit is equally likely to be a one (1) or a zero (0).
  • the confidence scores as determined by the demapper 115 may be sent to the decoder 119.
  • the decoder 119 may determine, based on the confidence scores, data (e.g., 0110) corresponding to the data that was input to the encoder 107 (e.g., 0110).
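The worked example above can be sketched end to end for the repetition code. This assumes LLR-style scores combine by addition across a bit and its repeat (a standard soft-combining choice, not stated in the patent), with the sign convention from the text: higher score means more likely 0.

```python
def decode_repetition(scores, n_data_bits):
    """Soft-decision decode of a rate-1/2 repetition code: add the
    confidence score of each data bit to the score of its repeat, then
    decide by sign (positive -> 0, negative -> 1, per the text)."""
    combined = [scores[i] + scores[i + n_data_bits] for i in range(n_data_bits)]
    return "".join("0" if c > 0 else "1" for c in combined)

# Scores from the example: first symbol's four bits, then second symbol's
scores = [0.7, -1.3, -0.1, 1.4, 3.1, -0.7, -1.1, 2.3]
print(decode_repetition(scores, 4))  # -> 0110, the data input to the encoder
```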
  • the demapper 115 may perform its functions (e.g., converting received symbols to confidence scores) on every symbol received by the receiver 103. As the amount of information communicated between devices increases, the demapper 115 may contribute to higher consumption of resources (e.g., computing resources, silicon area, etc.) and/or power. Using neural networks to implement the demapper 115 may help alleviate the challenges discussed above. For example, neural networks may be used to reduce complexity of the demapper 115, and to reduce resource and/or power consumption of the demapper 115. In some examples, with a neural network implementation, the demapper 115 may be of a complexity order O(m). Additionally or alternatively, neural networks may learn directly from the equation (1) for determining log-likelihood ratios, and may produce robust imitated results for sending to a decoder under various degrees of transceiver and/or channel impairments.
  • FIG. 2 is a schematic diagram showing an example neural network 200 with which features described herein may be implemented.
  • the neural network 200 may be used to implement the demapper 115.
  • the neural network 200 may comprise a multilayer perceptron (MLP).
  • the neural network 200 may include an input layer 201, one or more hidden layers (e.g., hidden layers 203A-203B), and an output layer 205. There may be additional or alternative hidden layers in the neural network 200.
  • Each of the layers may include one or more nodes.
  • the nodes in the input layer 201 may receive data from outside the neural network 200.
  • the nodes in the output layer 205 may output data to outside the neural network 200.
  • Data (e.g., value(s) of a symbol) received by the nodes in the input layer 201 may flow through the nodes in the hidden layers 203A-203B to the nodes in the output layer 205.
  • Nodes in one layer (e.g., the input layer 201) may associate with nodes in a next layer (e.g., the hidden layer 203A) via one or more connections. Each of the connections may have a weight.
  • the value of one node in the hidden layers 203A-203B or the output layer 205 may correspond to the result of applying an activation function to a sum of the weighted inputs to the one node (e.g., a sum of the value of each node in a previous layer multiplied by the weight of the connection between the each node and the one node).
  • the activation function may be a linear or non-linear function.
  • the activation function may include a hyperbolic tangent function, a sigmoid function, a rectified linear unit (ReLU), a leaky rectified linear unit (Leaky ReLU), etc.
  • x = f(w_1·a_1 + w_2·a_2 + ⋯ + w_n·a_n + b) (3)
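Equation (3) describes one node's output: an activation function applied to the weighted sum of its inputs plus a bias. A minimal sketch, using the hyperbolic tangent activation mentioned in the text:

```python
import math

def node_output(inputs, weights, bias):
    """Equation (3): apply the activation function f (tanh here) to the
    weighted sum of the node's inputs plus a bias term."""
    z = sum(w * a for w, a in zip(weights, inputs)) + bias
    return math.tanh(z)

# Two inputs whose weighted contributions cancel: z = 0.5 - 0.5 = 0
print(node_output([1.0, -2.0], [0.5, 0.25], 0.0))  # -> 0.0
```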
  • the neural network 200 may be used, for example, to imitate demapping functions (e.g., the equations (1) and/or (2) for calculating log-likelihood ratios).
  • the neural network 200 may receive a symbol via the nodes in the input layer 201 (e.g., the value of each node in the input layer 201 may correspond to a value of the symbol).
  • the value(s) of the symbol may flow through the neural network 200, and the nodes in the output layer 205 may indicate confidence score(s).
  • connection weights and/or other parameters (e.g., the bias of the activation function) of the neural network 200 may initially be configured with random values. Based on the initial connection weights and/or other parameters, the neural network 200 may generate output values different from the values that may be generated by the demapping functions. To optimize its output, the neural network 200 may be trained by adjusting the weights and/or other parameters (e.g., using gradient descent and/or backpropagation). For example, the neural network 200 may process one or more symbols, and may generate one or more corresponding outputs. Loss values may be calculated based on the outputs and what soft demapping functions (e.g., the equations (1) and/or (2)) may generate based on the one or more symbols.
  • soft demapping functions e.g., the equations (1) and/or (2)
  • the weights and/or other parameters of the neural network 200 may be adjusted starting from the output layer 205 to the input layer 201 to minimize the loss values.
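The training loop above can be sketched numerically. This is an illustrative toy, not the patent's procedure: a tiny one-input network is fitted by gradient descent to a tanh-shaped target standing in for a soft-demapper output, and numerical (finite-difference) gradients stand in for backpropagation, which computes the same derivatives analytically from the output layer backwards.

```python
import math, random

def mlp(params, x):
    """Tiny 1-input network: two tanh hidden nodes and a linear output node."""
    w1, b1, w2, b2, v1, v2, c = params
    return v1 * math.tanh(w1 * x + b1) + v2 * math.tanh(w2 * x + b2) + c

def loss(params, samples):
    """Mean squared error between network outputs and target scores."""
    return sum((mlp(params, x) - t) ** 2 for x, t in samples) / len(samples)

random.seed(0)
# Targets: a tanh-shaped soft-decision curve standing in for what a soft
# demapping function would generate for a 1-D symbol value (assumption).
samples = [(x / 10.0, math.tanh(x / 10.0)) for x in range(-20, 21)]
params = [random.uniform(-0.5, 0.5) for _ in range(7)]

loss0 = loss(params, samples)
before = loss0
eps, lr = 1e-5, 0.1
for _ in range(300):
    # Forward-difference gradients; backprop would give the same values.
    grads = []
    for i in range(len(params)):
        bumped = params[:]
        bumped[i] += eps
        grads.append((loss(bumped, samples) - before) / eps)
    params = [p - lr * g for p, g in zip(params, grads)]
    before = loss(params, samples)
print(loss0, before)  # the loss value shrinks as training proceeds
```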
  • the neural network 200 may be used to implement the demapper 115.
  • the nodes in the input layer 201 may receive one or more values of a symbol.
  • the nodes in the output layer 205 may generate confidence scores to be used by the decoder 119.
  • the weights and/or other parameters of the neural network 200 may be determined as described herein.
  • FIG. 3 is a schematic diagram showing an example system for demapping based on machine learning.
  • the system may comprise the decoder 119 and one or more neural networks (e.g., neural networks 301A-301D).
  • the system may comprise additional and/or alternative neural networks.
  • the neural networks 301A-301D may be used to implement the demapper 115.
  • a neural network of the neural networks 301A-301D may comprise, for example, a multilayer perceptron (e.g., the neural network 200), a convolutional neural network, a recurrent neural network, a deep neural network, or any other type of neural network.
  • a neural network of the neural networks 301A-301D may comprise an input layer, one or more hidden layers (e.g., one hidden layer or a plurality of hidden layers), and an output layer.
  • Each of the one or more hidden layers may comprise one or more nodes.
  • the input layer of a neural network of the neural networks 301A-301D may comprise one or more nodes.
  • the input layer may comprise one node.
  • the input layer may comprise two nodes.
  • the input layer may comprise three or more nodes.
  • a symbol (e.g., received by the receiver 103) may comprise one or more values.
  • the symbol may comprise one value.
  • the symbol may comprise two values (e.g., values 311, 313).
  • the symbol may comprise three or more values.
  • the one or more values of the symbol may be sent to the one or more nodes in the input layer of each of the one or more neural networks (e.g., the neural networks 301A-301D).
  • the values of the symbol may flow through one or more hidden layers of each of the one or more neural networks to the output layer of each of the one or more neural networks.
  • the output layer of a neural network of the neural networks 301A-301D may comprise, for example, one node.
  • the number of neural networks 301A-301D may correspond to the number of bits indicated by a symbol (e.g., received by the receiver 103). For example, if 16-QAM is used, a 16-QAM symbol (e.g., received by the receiver 103) may indicate four bits, and four neural networks 301A-301D may be used.
  • the node in the output layer of a neural network of the neural networks 301A-301D may generate a confidence score corresponding to one particular bit.
  • the neural network 301 A may generate a confidence score 321 (e.g., corresponding to a first bit)
  • the neural network 301B may generate a confidence score 323 (e.g., corresponding to a second bit)
  • the neural network 301C may generate a confidence score 325 (e.g., corresponding to a third bit)
  • the neural network 301D may generate a confidence score 327 (e.g., corresponding to a fourth bit).
  • the confidence scores 321, 323, 325, 327 may be sent to the decoder 119 for further processing (e.g., for obtaining the data sent by a transmitter by decoding of error correction codes).
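The per-bit architecture above (one small network per bit, each taking the symbol's two values and emitting one score) can be sketched structurally. The class name, hidden-layer width, and weights below are placeholders; in practice the weights would be trained to approximate equation (1) or (2) as described later.

```python
import math, random

class BitDemapperNet:
    """One tiny MLP per bit: two input nodes (in-phase and quadrature
    values), one tanh hidden layer, and a single output node whose
    value is that bit's confidence score."""
    def __init__(self, hidden, rng):
        self.w_in = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(hidden)]
        self.b_h = [rng.uniform(-1, 1) for _ in range(hidden)]
        self.w_out = [rng.uniform(-1, 1) for _ in range(hidden)]
        self.b_out = rng.uniform(-1, 1)

    def score(self, i_val, q_val):
        h = [math.tanh(w[0] * i_val + w[1] * q_val + b)
             for w, b in zip(self.w_in, self.b_h)]
        return sum(w * a for w, a in zip(self.w_out, h)) + self.b_out

rng = random.Random(0)
# Four networks for 16-QAM (one per bit); weights here are untrained
# placeholders, for illustrating the structure only.
nets = [BitDemapperNet(hidden=3, rng=rng) for _ in range(4)]
scores = [net.score(-0.76, 1.3) for net in nets]
print(len(scores))  # -> 4, one confidence score per bit
```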
  • in an example in which 64-QAM is used, symbols (e.g., received by the receiver 103) may be sent to six neural networks.
  • Each of the 64-QAM symbols may correspond to six bits.
  • Each of the symbols may comprise two values (e.g., a first value corresponding to an in-phase component, and a second value corresponding to a quadrature component).
  • Each of the six neural networks may correspond to a separate bit of six bits indicated by a 64-QAM symbol, and may comprise an input layer (e.g., comprising two nodes for receiving the two values of each symbol), a hidden layer (e.g., comprising one or more nodes), and an output layer comprising one node (e.g., corresponding to a confidence score for a bit of the indicated six bits).
  • a first neural network corresponding to a first bit may comprise one hidden layer of two nodes
  • a second neural network corresponding to a second bit may comprise one hidden layer of five nodes
  • a third neural network corresponding to a third bit may comprise one hidden layer of five nodes
  • a fourth neural network corresponding to a fourth bit may comprise one hidden layer of two nodes
  • a fifth neural network corresponding to a fifth bit may comprise one hidden layer of five nodes
  • a sixth neural network corresponding to a sixth bit may comprise one hidden layer of five nodes.
  • a hyperbolic tangent function may, for example, be used as activation function(s) for each of the neural networks.
  • the six neural networks may be trained to closely approximate demapping functions, such as the equations (1) or (2). Additionally, over various degrees of channel noise, the performance of a demapper comprising the six neural networks (e.g., as reflected by the bit error rate of the output of a decoder connected to the demapper) may be substantially similar to the performance of a demapper using soft demapping functions, such as the equations (1) or (2).
  • a demapper comprising the six neural networks may have a complexity of O(24), while a demapper using soft demapping functions may have a complexity of O(64). Complexity of a demapper may be reduced using neural networks.
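The two complexity figures follow from the counts already given: summing the hidden-layer sizes of the six per-bit networks gives 24, while a soft demapping function scans all 2^6 = 64 constellation points of 64-QAM. A quick check:

```python
# Hidden-layer sizes of the six per-bit networks from the 64-QAM example
hidden_sizes = [2, 5, 5, 2, 5, 5]
nn_complexity = sum(hidden_sizes)  # hidden nodes dominating the NN demapper
soft_complexity = 2 ** 6           # constellation points a soft demapper scans
print(nn_complexity, soft_complexity)  # -> 24 64
```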
  • the number of nodes in a hidden layer of a neural network corresponding to a particular bit may be determined and/or adjusted, for example, based on the degree of non-linearity of soft demapping function(s) (e.g., the equations (1) or (2)), corresponding to that particular bit, with respect to the in-phase value and/or the quadrature value of the symbol.
  • If a soft demapping function, corresponding to that particular bit, with respect to the in-phase value and/or the quadrature value is, for example, close to linear, the number of nodes in the hidden layer of the neural network may be set to a small number (e.g., two). If a soft demapping function, corresponding to that particular bit, with respect to the in-phase value and/or the quadrature value comprises, for example, a plurality of crests and/or troughs, the number of nodes in the hidden layer of the neural network may be set to a large number (e.g., five). Additionally or alternatively, if the soft demapping function comprises more crests and/or troughs, the number of nodes in the hidden layer of the neural network may be set to be larger.
  • in an example in which 1024-QAM is used, symbols (e.g., received by the receiver 103) may be sent to ten neural networks.
  • Each of the 1024-QAM symbols may correspond to ten bits.
  • Each of the symbols may comprise two values (e.g., a first value corresponding to an in-phase component, and a second value corresponding to a quadrature component).
  • Each of the ten neural networks may correspond to a separate bit of ten bits indicated by a 1024-QAM symbol, and may comprise an input layer (e.g., comprising two nodes for receiving the two values of each symbol), a hidden layer (e.g., comprising one or more nodes), and an output layer comprising one node (e.g., corresponding to a confidence score for a bit of the indicated ten bits).
  • a first neural network corresponding to a first bit may comprise one hidden layer of two nodes
  • a second neural network corresponding to a second bit may comprise one hidden layer of ten nodes
  • a third neural network corresponding to a third bit may comprise one hidden layer of ten nodes
  • a fourth neural network corresponding to a fourth bit may comprise one hidden layer of fifteen nodes
  • a fifth neural network corresponding to a fifth bit may comprise one hidden layer of twenty nodes
  • a sixth neural network corresponding to a sixth bit may comprise one hidden layer of two nodes
  • a seventh neural network corresponding to a seventh bit may comprise one hidden layer of ten nodes
  • an eighth neural network corresponding to an eighth bit may comprise one hidden layer of ten nodes
  • a ninth neural network corresponding to a ninth bit may comprise one hidden layer of fifteen nodes
  • a tenth neural network corresponding to a tenth bit may comprise one hidden layer of twenty nodes.
  • a hyperbolic tangent function may, for example, be used as activation function(s) for each of the neural networks.
  • the ten neural networks may be trained to closely approximate demapping functions, such as the equations (1) or (2).
  • the performance of a demapper comprising the ten neural networks (e.g., as reflected by the bit error rate of the output of a decoder connected to the demapper) may be substantially similar to the performance of a demapper using soft demapping functions, such as the equations (1) or (2).
  • a demapper comprising the ten neural networks may have a complexity of O(114), while a demapper using soft demapping functions may have a complexity of O(1024). Complexity of a demapper may be reduced using neural networks.
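The per-bit arrangement above can be sketched as a forward pass in NumPy. This is a minimal illustration, not the claimed implementation: the function name `per_bit_demap`, the random weights, and the example (I, Q) values are hypothetical stand-ins for trained parameters.

```python
import numpy as np

def per_bit_demap(symbol_iq, net):
    """Forward pass of one per-bit demapper network: two input nodes
    (in-phase and quadrature), one tanh hidden layer, one output node."""
    hidden = np.tanh(symbol_iq @ net['W1'] + net['b1'])   # hidden layer
    return (hidden @ net['W2'] + net['b2']).item()        # confidence score

# Ten networks for 1024-QAM with the hidden-layer sizes listed above.
hidden_sizes = [2, 10, 10, 15, 20, 2, 10, 10, 15, 20]
rng = np.random.default_rng(0)
networks = [{'W1': rng.normal(size=(2, h)), 'b1': np.zeros(h),
             'W2': rng.normal(size=(h, 1)), 'b2': np.zeros(1)}
            for h in hidden_sizes]

symbol = np.array([0.7, -0.3])                            # received (I, Q) values
scores = [per_bit_demap(symbol, net) for net in networks]
print(len(scores))  # 10 -- one confidence score per bit
```

Each network sees the same two input values but produces the score for only its own bit.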
  • As another example, if 256-QAM is used, symbols may be sent to eight neural networks.
  • Each of the 256-QAM symbols may correspond to eight bits.
  • Each of the symbols may comprise two values (e.g., a first value corresponding to an in-phase component, and a second value corresponding to a quadrature component).
  • Each of the eight neural networks may correspond to a separate bit of eight bits indicated by a 256-QAM symbol, and may comprise an input layer (e.g., comprising two nodes for receiving the two values of each symbol), a hidden layer (e.g., comprising one or more nodes), and an output layer comprising one node (e.g., corresponding to a confidence score for a bit of the indicated eight bits).
  • a first neural network corresponding to a first bit may comprise one hidden layer of two nodes
  • a second neural network corresponding to a second bit may comprise one hidden layer of two nodes
  • a third neural network corresponding to a third bit may comprise one hidden layer of six nodes
  • a fourth neural network corresponding to a fourth bit may comprise one hidden layer of six nodes
  • a fifth neural network corresponding to a fifth bit may comprise one hidden layer of six nodes
  • a sixth neural network corresponding to a sixth bit may comprise one hidden layer of six nodes
  • a seventh neural network corresponding to a seventh bit may comprise one hidden layer of ten nodes
  • an eighth neural network corresponding to an eighth bit may comprise one hidden layer of ten nodes.
  • a hyperbolic tangent function may, for example, be used as activation function(s) for each of the neural networks.
  • the eight neural networks may be trained to closely approximate demapping functions, such as the equations (1) or (2).
  • the performance of a demapper comprising the eight neural networks (e.g., as reflected by the bit error rate of the output of a decoder connected to the demapper) may be substantially similar to the performance of a demapper using soft demapping functions, such as the equations (1) or (2).
  • a demapper comprising the eight neural networks may have a complexity of O(48), while a demapper using soft demapping functions may have a complexity of O(256).
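The stated complexities follow from totaling the hidden-layer nodes across the per-bit networks, under the (assumed) cost model that each hidden node contributes roughly constant work per symbol:

```python
# Hidden-layer sizes from the two examples above.
qam1024_hidden = [2, 10, 10, 15, 20, 2, 10, 10, 15, 20]
qam256_hidden = [2, 2, 6, 6, 6, 6, 10, 10]

print(sum(qam1024_hidden))  # 114 -> O(114), vs. O(1024) constellation points
print(sum(qam256_hidden))   # 48  -> O(48),  vs. O(256) constellation points
```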
  • a demapper comprising neural networks may be implemented in a 5G receiver (e.g., in a physical downlink shared channel (PDSCH) of the 5G receiver), and the PDSCH throughput of a 5G link may be substantially similar to where demapping functions (e.g., the equations (1) or (2)) are used for demapping.
  • FIG. 4 is a schematic diagram showing another example system for demapping based on machine learning.
  • the system may comprise the decoder 119 and one or more neural networks (e.g., neural network 401).
  • the neural network 401 may be used to implement the demapper 115.
  • the demapper 115 may be implemented by a single neural network 401.
  • the neural network 401 may comprise, for example, a multilayer perceptron (e.g., the neural network 200), a convolutional neural network, a recurrent neural network, a deep neural network, or any other type of neural network.
  • the neural network 401 may comprise an input layer, one or more hidden layers, and an output layer.
  • the input layer of the neural network 401 may comprise one or more nodes.
  • the input layer may comprise one node.
  • the input layer may comprise two nodes.
  • the input layer may comprise three or more nodes.
  • a symbol (e.g., received by the receiver 103) may comprise one or more values.
  • the symbol may comprise one value.
  • the symbol may comprise two values (e.g., values 411, 413).
  • the symbol may comprise three or more values.
  • the one or more values of the symbol may be sent to the one or more nodes in the input layer of the neural network 401.
  • the values of the symbol may flow through one or more hidden layers of the neural network 401 to the output layer of the neural network 401.
  • the output layer of the neural network 401 may comprise one or more nodes (e.g., corresponding to the number of bits indicated by a symbol received by the receiver 103). For example, if 16-QAM is used, a symbol (e.g., received by the receiver 103) may indicate four bits, and the output layer of the neural network 401 may comprise four nodes.
  • the one or more nodes in the output layer of the neural network 401 may generate one or more corresponding confidence scores (e.g., confidence scores 421, 423, 425, 427).
  • the one or more confidence scores may be sent to the decoder 119 for further processing (e.g., for obtaining the data sent by a transmitter by decoding of error correcting codes).
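By contrast with the per-bit networks of FIG. 3, the single-network arrangement of FIG. 4 maps one (I, Q) input to all bit scores at once. A minimal sketch follows; the shapes, weights, and hidden width are hypothetical, and 16-QAM with four output nodes is assumed:

```python
import numpy as np

def demap_shared(symbol_iq, W1, b1, W2, b2):
    """One shared network demaps a symbol to confidence scores for all bits.
    Shapes: W1 (2 x H), b1 (H,), W2 (H x B), b2 (B,), B bits per symbol."""
    hidden = np.tanh(symbol_iq @ W1 + b1)
    return hidden @ W2 + b2                  # one score per output node

rng = np.random.default_rng(1)
H, B = 8, 4                                  # hidden width; 4 bits for 16-QAM
scores = demap_shared(np.array([0.5, -1.0]),
                      rng.normal(size=(2, H)), np.zeros(H),
                      rng.normal(size=(H, B)), np.zeros(B))
print(scores.shape)  # (4,) -- e.g., confidence scores 421, 423, 425, 427
```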
  • the demapper 115 may be implemented using a combination of the various types of neural networks as discussed in connection with FIGS. 3-4. For example, for some bits indicated by a symbol, one neural network may be used for each bit, and for other bits indicated by the symbol, one neural network may be used for some or all of the other bits.
  • FIG. 5 is a schematic diagram showing an example process for training a neural network for demapping.
  • the process may be implemented by one or more computing devices (e.g., the computing device as described in connection with FIG. 7). The process may be distributed across multiple computing devices, or may be performed by a single computing device.
  • the process may use a neural network 501, a demapping function 503, and a comparison function 505.
  • the neural network 501 may comprise any type of neural network, may be trained to perform demapping functions, and may be used to implement a demapper (e.g., the demapper 115).
  • the demapping function 503 may comprise, for example, the equation (1) for calculating log-likelihood ratios, the approximation equation (2) for calculating log- likelihood ratios, or any other soft demapping function.
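Equations (1) and (2) themselves are not reproduced in this excerpt. As an illustration only, a conventional exact per-bit LLR and its max-log approximation — the general forms such soft demapping functions typically take — can be sketched as follows; the 4-PAM constellation and Gray labeling below are illustrative assumptions:

```python
import numpy as np

def llr_exact(y, const_points, bit_of_point, sigma2):
    """Exact per-bit LLR over a constellation (the general form an
    equation like (1) may take): log-ratio of summed Gaussian metrics."""
    metric = np.exp(-np.abs(y - const_points) ** 2 / sigma2)
    p1 = metric[bit_of_point == 1].sum()
    p0 = metric[bit_of_point == 0].sum()
    return np.log(p1 / p0)

def llr_maxlog(y, const_points, bit_of_point, sigma2):
    """Max-log approximation (the general form an equation like (2) may
    take): keep only the nearest point in each bit-hypothesis set."""
    d2 = np.abs(y - const_points) ** 2
    return (d2[bit_of_point == 0].min() - d2[bit_of_point == 1].min()) / sigma2

# 4-PAM toy example; most significant bit under a hypothetical Gray labeling.
points = np.array([-3, -1, 1, 3], dtype=complex)
msb = np.array([0, 0, 1, 1])
print(round(llr_maxlog(0.9, points, msb, sigma2=1.0), 3))  # 3.6
```

A positive score indicates the bit is more likely 1; the received sample 0.9 sits near the points labeled 1, so the score is positive.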
  • the demapping function 503 may be used to produce desired results, which the neural network 501 may be trained to imitate (e.g., in a supervised learning setting).
  • a symbol used for training the neural network 501 may be sent to the neural network 501 and the demapping function 503.
  • the symbol may be determined, for example, from a modulation constellation for which the neural network 501 is trained.
  • the symbol may be a random point in the modulation constellation.
  • the neural network 501 may comprise an output layer comprising one or more nodes.
  • the symbol may flow through the neural network 501, and the one or more output nodes may produce one or more first confidence scores (e.g., corresponding to one or more of the bits indicated by the symbol).
  • the demapping function 503 may process the symbol, and may generate one or more second confidence scores (e.g., corresponding to the same one or more bits indicated by the symbol).
  • the comparison function 505 may compare the one or more first confidence scores and the one or more second confidence scores. Based on the comparison, one or more corresponding loss values may be generated.
  • the loss value(s) may indicate difference(s) between the first confidence score(s) and the corresponding second confidence score(s).
  • the loss value(s) may be used to adjust the parameters of the neural network 501 (e.g., using backpropagation). For example, a loss value corresponding to an output node of the neural network 501 may indicate that a second confidence score (as determined by the demapping function 503) corresponding to the output node is higher than a first confidence score (as determined by the neural network 501) corresponding to the output node.
  • the parameters, associated with the output node, of the neural network 501 may be adjusted in such a manner that the output node may produce a higher confidence score (e.g., approaching the second confidence score) based on the input symbol.
  • the nodes, in the layer preceding the output layer, connected to the output node may be determined, and the weights of the connections that positively contributed to the value of the output node may be increased.
  • the value of a node in the preceding layer may be increased (e.g., by adjusting the weights of connections contributing to the node in the preceding layer) if the connection corresponding to the node positively contributed to the value of the output node.
  • the value of a node in the preceding layer may be decreased if the connection corresponding to the node negatively contributed to the value of the output node.
  • the parameters associated with preceding layers of the neural network 501 may in turn be adjusted in a similar manner. Additionally or alternatively, a bias in an activation function of the neural network 501 may be adjusted (e.g., increased or decreased) so that the output node may produce a higher confidence score.
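The training flow of FIG. 5 — feed a symbol to both the network and the demapping function, compare the two scores, and backpropagate the difference — can be sketched for a small 2-H-1 tanh network. The squared-error loss and manual gradients below are illustrative assumptions, not the patent's specific comparison function 505:

```python
import numpy as np

def train_step(x, target, p, lr=0.01):
    """One backpropagation step pushing the network's first confidence
    score toward the demapping function's second confidence score."""
    h = np.tanh(x @ p['W1'] + p['b1'])          # hidden activations
    out = (h @ p['W2'] + p['b2']).item()        # first confidence score
    loss = 0.5 * (out - target) ** 2            # loss value (comparison)
    g_out = out - target                        # d loss / d out
    p['W2'] -= lr * g_out * h[:, None]          # output-layer weights
    p['b2'] -= lr * g_out                       # output-layer bias
    g_h = g_out * p['W2'][:, 0] * (1 - h ** 2)  # back through tanh
    p['W1'] -= lr * np.outer(x, g_h)            # preceding-layer weights
    p['b1'] -= lr * g_h
    return loss

rng = np.random.default_rng(2)
params = {'W1': rng.normal(size=(2, 5)), 'b1': np.zeros(5),
          'W2': rng.normal(size=(5, 1)), 'b2': np.zeros(1)}
x, target = np.array([0.3, -0.8]), 1.25        # symbol and desired score
losses = [train_step(x, target, params) for _ in range(200)]
print(losses[-1] < losses[0])  # True: the network imitates the target
```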
  • FIGS. 6A-6B are a flowchart showing an example method for demapping based on machine learning.
  • the method may be performed, for example, by one or more of the systems as discussed in connection with FIGS. 1A, 3-4, and/or using one or more of the processes as discussed in connection with FIG. 5.
  • the steps of the method may be described as being performed by particular components and/or computing devices for the sake of simplicity, but the steps may be performed by any component and/or computing device.
  • the steps of the method may be performed by a single computing device or by multiple computing devices.
  • One or more steps of the method may be omitted, added, and/or rearranged as desired by a person of ordinary skill in the art.
  • a computing device may determine whether a symbol (e.g., a PAM symbol, a QAM symbol, or any other type of constellation symbol) is received. For example, a transmitter (e.g., the transmitter 101) may send a signal to the computing device, and the computing device may demodulate the signal, and may generate a corresponding symbol. If the computing device receives a symbol (step 601: Y), the method may proceed to step 603.
  • the received symbol may be input to one or more neural networks.
  • the received symbol may correspond to (e.g., indicate) one or more bits.
  • For example, if 16-QAM is used, a symbol may correspond to four bits.
  • If 1024-QAM is used, a symbol may correspond to ten bits.
  • the one or more neural networks may be configured to demap the received symbol to confidence scores corresponding to one or more bits indicated by the received symbol.
  • the received symbol may comprise one value.
  • the received symbol may comprise two values.
  • the received symbol may comprise three or more values.
  • the value(s) of the received symbol may be received by node(s) in an input layer of a neural network.
  • one single neural network may be used for demapping the received symbol.
  • one or more neural networks may be used for demapping the received symbol, where each neural network corresponds to one bit of the one or more bits indicated by the received symbol.
  • a plurality of neural networks may be used for demapping the received symbol, where at least one neural network of the plurality of neural networks corresponds to multiple bits indicated by the received symbol, and at least one neural network of the plurality of neural networks corresponds to a single bit indicated by the same received symbol.
  • the computing device may demap, using the neural network(s), the received symbol to generate one or more confidence scores corresponding to one or more bits indicated by the received symbol.
  • the value(s) of the received symbol may be received by the node(s) in an input layer of a neural network of the neural network(s), and may flow through one or more hidden layers of the neural network to an output layer of the neural network.
  • the value(s) of the node(s) in the output layer may correspond to confidence score(s) associated with the received symbol.
  • one single neural network may be used for demapping the received symbol, and the nodes in the output layer of the one single neural network may generate the confidence scores corresponding to the bits indicated by the received symbol.
  • a plurality of neural networks may be used for demapping the received symbol, and each neural network may be responsible for generating one confidence score corresponding to one particular bit of the bits indicated by the received symbol.
  • a plurality of neural networks may be used for demapping the received symbol, where at least one neural network of the plurality of neural networks may be responsible for generating one confidence score corresponding to one particular bit of the bits indicated by the received symbol, and at least one neural network of the plurality of neural networks may be responsible for generating multiple confidence scores corresponding to multiple bits of the bits indicated by the same received symbol.
  • the computing device may send, to a decoder (e.g., the decoder 119), one or more confidence scores (as determined in step 605) corresponding to one or more bits indicated by the received symbol.
  • the decoder may determine, based on the one or more confidence scores (and/or confidence score(s) associated with symbol(s) received previously or afterward), data that was sent to the computing device (e.g., data prior to being encoded by an encoder, such as the encoder 107).
  • In step 651, the computing device may determine whether to update one or more neural networks used to implement functions of a demapper (e.g., the demapper 115).
  • the computing device may make this determination in various manners. For example, the computing device may update the one or more neural networks periodically at an updating frequency (e.g., once every 5 seconds).
  • the updating frequency may be, for example, specified by an administrator and/or user of the computing device. Additionally or alternatively, the updating frequency may be modified dynamically, for example, based on a type of the computing device and/or based on other factors. For example, the updating frequency for the computing device may be set to be higher if the computing device is a mobile device, and may be set to be lower if the computing device is a stationary device.
  • the computing device may update the one or more neural networks if a degree of change in network conditions related to the computing device satisfies (e.g., meets or exceeds) a degree threshold (e.g., 5%).
  • a channel estimator (e.g., the channel estimator 117) may monitor, for example, the channel gain h, the variance of the channel AWGN σ², and/or other parameters indicating the channel condition.
  • the computing device may update the one or more neural networks, so that the accuracy of the one or more neural networks in demapping received symbols may be maintained.
  • the computing device may repeat step 601 (FIG. 6A). If the computing device determines to update the one or more neural networks used to implement functions of a demapper (step 651: Y), the method may proceed to step 653.
  • the computing device may determine a demapping function, and/or parameters associated with the demapping function, to be used for training the one or more neural networks.
  • the demapping function may comprise, for example, the equation (1) for calculating log-likelihood ratios, the approximation equation (2) for calculating log-likelihood ratios, or any other demapping function.
  • the demapping function may be used to produce desired results, which the one or more neural networks may be trained to imitate (e.g., in a supervised learning setting).
  • the determination of a demapping function may be made based on, for example, the amount of available resources of the computing device. For example, if the computing device uses a substantial portion of its resources (e.g., computing resources, power, etc.) for processes other than the training of the one or more neural networks, the computing device may select a demapping function with lower complexity and/or resource consumption (e.g., selecting the approximation equation (2) instead of the equation (1)).
  • the parameters associated with the determined demapping function may be determined. For example, the channel gain h and/or the variance of the channel AWGN σ² may be determined for using the equation (1) or the approximation equation (2).
  • the parameters may be monitored and/or determined, for example, by a channel estimator (e.g., the channel estimator 117) implemented on the computing device.
  • the computing device may determine a training set for training the one or more neural networks.
  • the training set may comprise, for example, one or more symbols (e.g., -4-4j, -3.9-4j, -3.8-4j, etc.) selected from the constellation diagram of the modulation system used by the computing device (e.g., 16-QAM, 1024-QAM, etc.).
  • the one or more symbols may be evenly selected from the constellation diagram at a certain density (e.g., selecting a training symbol every 0.1 magnitude change along the real axis and/or the imaginary axis).
  • Some training symbols may be selected from locations near the intended symbol locations (e.g., -4-4j, etc.) in the constellation. Additionally or alternatively, the one or more symbols for the training may be randomly selected from the constellation diagram. Additionally or alternatively, the computing device may use received symbols (e.g., from a transmitter) for the training (e.g., as the symbols are being received by the computing device).
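The training-set construction described above — a regular grid over the constellation area plus randomly drawn points — might look like the following; the 0.1 spacing comes from the text, while the ±4 extent of a 16-QAM-style diagram and the count of random points are assumptions:

```python
import numpy as np

# Grid of training symbols: one every 0.1 along the real and imaginary axes.
step = 0.1
axis = np.arange(-4.0, 4.0 + step / 2, step)       # -4.0, -3.9, ..., 4.0
grid = np.array([re + 1j * im for re in axis for im in axis])

# Plus some randomly selected points from the same constellation area.
rng = np.random.default_rng(3)
random_points = rng.uniform(-4, 4, size=50) + 1j * rng.uniform(-4, 4, size=50)

training_set = np.concatenate([grid, random_points])
print(len(axis), len(training_set))  # 81 points per axis -> 81*81 + 50 symbols
```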
  • the computing device may use the demapping function (as determined in step 653) to process the training set. For example, for each symbol in the training set, the computing device may use the demapping function to calculate one or more confidence scores (e.g., log-likelihood ratios) corresponding to one or more bits indicated by the symbol.
  • the computing device may use the one or more neural networks to process the training set. For example, for each symbol in the training set, the computing device may use the one or more neural networks to calculate one or more confidence scores corresponding to one or more bits indicated by the symbol.
  • the computing device may adjust the parameters of the one or more neural networks. For example, the computing device may determine one or more loss values corresponding to a particular bit of the one or more bits of a symbol.
  • the loss values may indicate, for example, the differences between confidence scores, as determined by the demapping function, corresponding to that bit and confidence scores, as determined by a neural network of the one or more neural networks, corresponding to that bit.
  • the computing device may adjust, based on the loss values, the parameters of the neural network configured to generate confidence scores for that bit. For example, the loss values may indicate whether the value of the output node corresponding to that bit is to be increased or decreased.
  • the weights and/or other parameters of the neural network may accordingly be adjusted from the output layer to the preceding layers (e.g., using backpropagation).
  • the computing device may determine whether to perform additional training. For example, the computing device may set an amount of time to be used for training the one or more neural networks, and if the time has expired, the computing device may determine not to perform additional training. Additionally or alternatively, the computing device may determine to perform additional training for a neural network if the loss value(s) associated with the neural network satisfies (e.g., meets or exceeds) a loss value threshold. For example, the computing device may determine not to perform additional training for a neural network if the loss value(s) associated with the neural network falls below the loss value threshold. Additionally or alternatively, the computing device may adjust the number of nodes in the hidden layer(s) of the one or more neural networks.
  • the computing device may, for example, increase the number of nodes in the hidden layer(s) of the neural network, and may perform additional training on the neural network comprising the adjusted number of nodes in its hidden layer(s).
  • If the computing device determines to perform additional training (step 663: Y), the method may repeat step 655.
  • In repeated step 655, the computing device may determine a training set to be used for the additional training. The computing device may use the same training set used in the previous training, or may use a training set that is different from the training set used in the previous training. If the computing device determines not to perform additional training (step 663: N), the method may proceed to step 665.
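The stopping logic of steps 655–663 — train until a time budget expires or the loss falls below a threshold, widening the hidden layer(s) if progress stalls — can be sketched as a control loop. The names `train_epoch` and `widen_hidden` and all thresholds are hypothetical placeholders:

```python
import time

def train_until_done(train_epoch, widen_hidden, time_budget_s=5.0,
                     loss_threshold=1e-3, patience=3):
    """Keep training until the loss falls below a threshold (step 663: N)
    or the time budget expires; on a plateau, add hidden nodes and retrain."""
    deadline = time.monotonic() + time_budget_s
    stalled, best = 0, float('inf')
    while time.monotonic() < deadline:
        loss = train_epoch()                 # one pass over a training set
        if loss < loss_threshold:
            return True                      # converged: proceed to step 665
        if loss < best * 0.99:
            best, stalled = loss, 0          # still making progress
        else:
            stalled += 1
            if stalled >= patience:          # plateau: widen hidden layer(s)
                widen_hidden()
                stalled = 0
    return False                             # time expired before convergence

# Toy usage: a fake epoch whose loss halves on each call converges quickly.
losses = iter([0.5 / 2 ** k for k in range(60)])
print(train_until_done(lambda: next(losses), widen_hidden=lambda: None))  # True
```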
  • In step 665, the computing device may configure the one or more trained neural networks to process symbols received by the computing device. For example, the one or more trained neural networks may be configured to implement the demapper 115.
  • the computing device may use the one or more neural networks and/or demapping functions (e.g., the equations (1) or (2)) for demapping received symbols. Following step 665, step 601 may be repeated.
  • the one or more neural networks may be trained as symbols are received by the computing device (e.g., from the transmitter 101).
  • the computing device may determine whether to train the one or more neural networks (e.g., in a similar manner as in step 663). For example, the computing device may determine whether differences between confidence scores, as determined by the one or more neural networks, corresponding to a received symbol, and confidence scores, as determined by demapping functions (such as the equations (1) or (2)), corresponding to the received symbol satisfy (e.g., meet or exceed) a difference threshold.
  • the computing device may use demapping functions (such as the equations (1) or (2)) for processing symbols as they are being received, and may use the results produced by the processing to train the one or more neural networks. If training of the one or more neural networks is completed (e.g., if the differences fall below the difference threshold), the computing device may configure the one or more neural networks to process future symbols, and/or may stop using the demapping functions for processing the symbols. If the computing device detects the one or more neural networks are to be updated (e.g., in step 651), the computing device may switch back to using the demapping functions for processing symbols (e.g., until the updating of the one or more neural networks is completed). In this manner, resources used for the training may also be used for the actual demapping of received symbols by the computing device.
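The hand-over behavior described here — demapping with the functions while the neural network trains, then switching once its scores track the function's closely enough — can be sketched as follows; the function names, the `state` dictionary, and the 0.1 difference threshold are all hypothetical:

```python
def demap(symbol, nn_score, fn_score, state, diff_threshold=0.1):
    """Use the demapping function's score while the network trains;
    switch to the network once its output tracks the function's."""
    if state['use_nn']:
        return nn_score                               # network fully trained
    state['train'](symbol, fn_score)                  # keep imitating the function
    if abs(nn_score - fn_score) < diff_threshold:     # training completed
        state['use_nn'] = True
    return fn_score                                   # still rely on the function

state = {'use_nn': False, 'train': lambda s, t: None}
print(demap(0.9 - 0.3j, nn_score=1.18, fn_score=1.20, state=state))  # 1.2
print(state['use_nn'])  # True -- subsequent symbols use the neural network
```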
  • a second computing device may train the one or more neural networks, and may send, to the computing device, the trained one or more neural networks. For example, if the computing device determines that the one or more neural networks are to be updated (e.g., in step 651), the computing device may send, to the second computing device, instructions to update the one or more neural networks.
  • the instructions may indicate the demapping functions and/or associated parameters to be used for the training (e.g., the demapping functions and/or associated parameters may be determined in step 653).
  • the second computing device may perform the training of the one or more neural networks. If the training of the one or more neural networks is completed, the second computing device may send, to the computing device, the trained one or more neural networks.
  • the computing device may configure the received one or more neural networks to process symbols received by the computing device.
  • the second computing device may comprise, for example, a server, a datacenter, etc., and may be used to perform the training so that the computing device may avoid consuming resources for the training.
  • FIG. 7 illustrates an example apparatus, in particular a computing device 712, that may be used in a communication system such as the one shown in FIG. 1A, to implement any or all of the transmitter 101, the receiver 103, the encoder 107, the mapper 109, the demapper 115, the channel estimator 117, the decoder 119, any or all of the example processes in FIG. 5, and/or other computing devices to perform the steps described above and in FIGS. 6A-6B.
  • Computing device 712 may include a controller 725.
  • the controller 725 may be connected to a user interface control 730, display 736 and/or other elements as shown.
  • Controller 725 may include circuitry, such as for example one or more processors 728 and one or more memory 734 storing software 740 (e.g., computer executable instructions).
  • the software 740 may comprise, for example, one or more of the following software options: user interface software, server software, etc., including the encoder 107, the mapper 109, the channel estimator 117, the demapper 115, the decoder 119, the constellation diagram 150, the neural networks 200, 301A-301D, 401, 501, the demapping function 503, the comparison function 505, etc.
  • Device 712 may also include a battery 750 or other power supply device, speaker 753, and one or more antennae 754.
  • Device 712 may include user interface circuitry, such as user interface control 730.
  • User interface control 730 may include controllers or adapters, and other circuitry, configured to receive input from or provide output to a keypad, touch screen, voice interface - for example via microphone 756, function keys, joystick, data glove, mouse and the like.
  • the user interface circuitry and user interface software may be configured to facilitate user control of at least some functions of device 712 through use of a display 736.
  • Display 736 may be configured to display at least a portion of a user interface of device 712. Additionally, the display may be configured to facilitate user control of at least some functions of the device (for example, display 736 could be a touch screen).
  • Software 740 may be stored within memory 734 to provide instructions to processor 728 such that when the instructions are executed, processor 728, device 712 and/or other components of device 712 are caused to perform various functions or methods such as those described herein (for example, as depicted in FIGS. 3-5, 6A-6B).
  • the software may comprise machine executable instructions and data used by processor 728 and other components of computing device 712 and may be stored in a storage facility such as memory 734 and/or in hardware logic in an integrated circuit, ASIC, etc.
  • Software may include both applications and operating system software, and may include code segments, instructions, applets, pre-compiled code, compiled code, computer programs, program modules, engines, program logic, and combinations thereof.
  • Memory 734 may include any of various types of tangible machine-readable storage medium, including one or more of the following types of storage devices: read only memory (ROM) modules, random access memory (RAM) modules, magnetic tape, magnetic discs (for example, a fixed hard disk drive or a removable floppy disk), optical disk (for example, a CD-ROM disc, a CD-RW disc, a DVD disc), flash memory, and electrically erasable programmable read-only memory (EEPROM).
  • processor 728 may include any of various types of processors whether used alone or in combination with executable instructions stored in a memory or other computer-readable storage medium.
  • processors should be understood to encompass any of various types of computing structures including, but not limited to, one or more microprocessors, special-purpose computer chips, field-programmable gate arrays (FPGAs), controllers, application-specific integrated circuits (ASICs), hardware accelerators, artificial intelligence (AI) accelerators, digital signal processors, software defined radio components, combinations of hardware/firmware/software, or other special or general-purpose processing circuitry.
  • circuitry may refer to any of the following: (a) hardware-only circuit implementations (such as implementations in only analog and/or digital circuitry); (b) combinations of circuits and software (and/or firmware), such as (as applicable): (i) a combination of processor(s) or (ii) portions of processor(s)/software (including digital signal processor(s)), software, and memory(ies) that work together to cause an apparatus, such as a mobile phone, server, or other computing device, to perform various functions; and (c) circuits, such as a microprocessor(s) or a portion of a microprocessor(s), that require software or firmware for operation, even if the software or firmware is not physically present.
  • circuitry applies to all uses of this term in this application, including in any claims.
  • circuitry would also cover an implementation of merely a processor (or multiple processors) or portion of a processor and its (or their) accompanying software and/or firmware.
  • circuitry would also cover, for example, a radio frequency circuit, a baseband integrated circuit or applications processor integrated circuit for a mobile phone or a similar integrated circuit in a server, a cellular network device, or other network device.
  • Device 712 or its various components may be mobile and be configured to receive, decode and process various types of transmissions including transmissions in Wi-Fi networks according to a wireless local area network standard (e.g., the IEEE 802.11 WLAN standards 802.11h, 802.11ac, etc.), short range wireless communication networks (e.g., near-field communication (NFC)), and/or wireless metro area network (WMAN) standards (e.g., 802.16), through one or more WLAN transceivers 743 and/or one or more WMAN transceivers 741.
  • device 712 may be configured to receive, decode and process transmissions through various other transceivers, such as FM/AM Radio transceiver 742, and telecommunications transceiver 744 (e.g., cellular network receiver such as CDMA, GSM, 4G LTE, 5G, etc.).
  • a wired interface 745 e.g., an Ethernet interface, a DOCSIS interface
  • a wired communication medium e.g., fiber, cable, twisted pair or other conductors
  • FIG. 7 generally relates to a mobile device
  • other devices or systems may include the same or similar components and perform the same or similar functions and methods.
  • a computer communicating over a wired network connection may include the components or a subset of the components described above, and may be configured to perform the same or similar functions as device 712 and its components.
  • Further computing devices as described herein may include the components, a subset of the components, or a multiple of the components (e.g., integrated in one or more servers) configured to perform the steps described herein.

Abstract

Systems, apparatuses, and methods are described for demapping based on machine learning. A computing device may receive one or more symbols. The computing device may use one or more neural networks to demap the one or more symbols. The one or more neural networks may comprise various configurations and/or parameters, and may be trained to demap symbols.

Description

DEMAPPING BASED ON MACHINE LEARNING
TECHNICAL FIELD
[01] Various example embodiments relate to methods and systems for demapping based on machine learning.
BACKGROUND
[02] In a receiver, a demapper may process received symbols, and may make decisions indicating likelihood that particular bits indicated by the received symbols correspond to ones or zeros. The demapper may process every symbol that the receiver receives. Inefficient demapping architectures may lead to higher consumption of resources and/or power.
BRIEF SUMMARY
[03] This Brief Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Brief Summary is not intended to identify key features or essential features of the various embodiments, nor is it intended to be used to limit the scope of the claims.
[04] Systems, apparatuses, and methods are described for demapping based on machine learning. A computing device comprising one or more neural networks may receive a symbol comprising one or more values. The computing device may input, to the one or more neural networks, the one or more values. The computing device may demap, using the one or more neural networks and based on the one or more values, the symbol to generate one or more confidence scores corresponding to one or more bits indicated by the symbol. The computing device may send, to a decoder, the one or more confidence scores.
[05] In some examples, the one or more confidence scores may approximate one or more log-likelihood ratios that the one or more bits correspond to one or more ones or one or more zeros. In some examples, each of the one or more neural networks may correspond to a bit of the one or more bits. In some examples, each of the one or more neural networks may comprise an input layer comprising one or more first nodes corresponding to the one or more values. Each of the one or more neural networks may comprise a hidden layer comprising one or more second nodes. Each of the one or more neural networks may comprise an output layer comprising a third node corresponding to a confidence score of the one or more confidence scores.
[06] In some examples, each of the one or more neural networks may comprise an input layer comprising one or more first nodes corresponding to the one or more values. Each of the one or more neural networks may comprise a plurality of hidden layers comprising a plurality of second nodes. Each of the one or more neural networks may comprise an output layer comprising a third node corresponding to a confidence score of the one or more confidence scores. In some examples, the demapping of the symbol may comprise demapping, using a single neural network, the symbol. The single neural network may comprise one or more output nodes corresponding to the one or more confidence scores. In some examples, the symbol may be associated with a modulation constellation used by the computing device.
[07] In some examples, each of the one or more neural networks may comprise a plurality of parameters. The computing device may train, based on additional symbols and by adjusting the plurality of parameters, each of the one or more neural networks, such that confidence scores, generated by the one or more neural networks, corresponding to the additional symbols approach log-likelihood ratios corresponding to the additional symbols. In some examples, based on detecting a change in network conditions, the computing device may update the one or more neural networks. In some examples, the computing device may periodically update, based on an updating frequency, the one or more neural networks. In some examples, the one or more values may comprise two values. The one or more bits indicated by the symbol may comprise at least four bits.
[08] In some examples, a computing device may comprise means for receiving a symbol comprising one or more values. The computing device may comprise means for implementing one or more neural networks. The computing device may comprise means for inputting, to the one or more neural networks, the one or more values. The computing device may comprise means for demapping, using the one or more neural networks and based on the one or more values, the symbol to generate one or more confidence scores corresponding to one or more bits indicated by the symbol. The computing device may comprise means for sending, to a decoder, the one or more confidence scores.
[09] Additional examples are discussed below.
BRIEF DESCRIPTION OF THE DRAWINGS
[10] Some example embodiments are illustrated by way of example and not limitation in the accompanying figures, in which like reference numerals indicate similar elements and in which:
[11] FIG. 1A is a schematic diagram showing an example communication system in which features described herein may be implemented.
[12] FIG. 1B shows an example constellation diagram and an example correspondence between bit sequences and symbols.
[13] FIG. 2 is a schematic diagram showing an example neural network with which features described herein may be implemented.
[14] FIG. 3 is a schematic diagram showing an example system for demapping based on machine learning.
[15] FIG. 4 is a schematic diagram showing another example system for demapping based on machine learning.
[16] FIG. 5 is a schematic diagram showing an example process for training a neural network for demapping.
[17] FIGS. 6A-6B are a flowchart showing an example method for demapping based on machine learning.
[18] FIG. 7 shows an example apparatus that may be used to implement one or more aspects described herein.
DETAILED DESCRIPTION
[19] In the following description of various illustrative embodiments, reference is made to the accompanying drawings, which form a part hereof, and in which are shown by way of illustration various embodiments in which the disclosure may be practiced. It is to be understood that other embodiments may be utilized and structural and functional modifications may be made without departing from the scope of the present disclosure.
[20] FIG. 1A is a schematic diagram showing an example communication system in which features described herein may be implemented. The system may comprise one or more transmitters (e.g., transmitter 101), one or more receivers (e.g., receiver 103), and one or more communication media (e.g., communication medium 105). The transmitter 101 may be configured to send information to the receiver 103 via the communication medium 105.
[21] The transmitter 101 may comprise, for example, a wireless transmitter, a wired transmitter, an optical transmitter, a telecommunications transmitter, a Wi-Fi transmitter, a cellular network transmitter, a fifth generation wireless systems (5G) transmitter, a television radio transmitter, a Data Over Cable Service Interface Specification (DOCSIS) transmitter, a digital subscriber line (DSL) transmitter, a G.fast transmitter, or any other type of device configured to send information. The transmitter 101 may be implemented in any type of computing device, such as a smartphone, a cell phone, a mobile communication device, a personal computer, a server, a tablet, a desktop computer, a laptop computer, a gaming device, a virtual reality headset, a base station, a television, the computing device as described in connection with FIG. 7, etc.
[22] The transmitter 101 may implement various modulation methods, such as pulse- amplitude modulation (PAM), quadrature amplitude modulation (QAM), modulation methods based on constellation diagrams, and/or other types of modulation methods. The transmitter 101 may comprise, for example, one or more encoders (e.g., encoder 107) and one or more mappers (e.g., mapper 109). The transmitter 101 may comprise additional and/or alternative components for carrying out functions of the transmitter 101.
[23] The encoder 107 may be configured to receive data (e.g., binary digits of information) to be sent to the receiver 103, and to convert the data into another code. For example, the encoder 107 may convert received data into an error detection and/or correction code, such as a repetition code, a parity bit code, a checksum code, a Hamming code, a binary Golay code, a low-density parity-check code, etc.
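The repetition code mentioned above is the one used in the worked example later in this description (0110 encoded as 01100110 by repeating the whole block). A minimal sketch of such a block-repetition encoder; the function name is illustrative, not from the disclosure:

```python
def repetition_encode(bits, factor=2):
    """Repeat the whole bit block `factor` times, e.g. [0,1,1,0] -> [0,1,1,0,0,1,1,0]."""
    return list(bits) * factor
```

With `factor=2` this is the rate-1/2 code of the example in paragraph [36].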
[24] The mapper 109 may be configured to map the output of the encoder 107 to symbols used in modulation (e.g., PAM, QAM, etc.). The symbols may be represented by points on a constellation diagram of a modulation method. Each symbol may comprise one or more values. For example (e.g., in QAM), a first value of a symbol may correspond to an in-phase component, and a second value of the symbol may correspond to a quadrature component. Such a symbol may be represented by a complex number (e.g., -1+3j).
[25] The transmitter 101 may be configured to convert (e.g., modulate) the symbols generated by the mapper 109 onto carrier signals (e.g., carrier pulses, carrier waves, etc.), and may generate modulated signals carrying the symbols. For example (e.g., in QAM), the transmitter 101 may change the amplitudes of two carrier waves (e.g., of the same frequency and out of phase with each other by 90 degrees) according to two corresponding values of a symbol output by the mapper 109, and may add the two modulated carrier waves together for sending via the communication medium 105.
[26] The communication medium 105 may comprise, for example, wire, cable, fiber, microwave, satellite, radio, infrared, or any other type of communication medium. Signals (e.g., from the transmitter 101) may propagate through the communication medium 105. Noise may be added to the signals. The added noise may comprise, for example, additive white Gaussian noise (AWGN), attenuation, phase-shift, etc. A signal may be transmitted via the communication medium 105 from the transmitter 101 to the receiver 103.
[27] The receiver 103 may comprise, for example, a wireless receiver, a wired receiver, an optical receiver, a telecommunications receiver, a Wi-Fi receiver, a cellular network receiver, a 5G receiver, a television radio receiver, a DOCSIS receiver, a DSL receiver, a G.fast receiver, or any other type of device configured to receive information. The receiver 103 may be implemented in any type of computing device, such as a smartphone, a cell phone, a mobile communication device, a personal computer, a server, a tablet, a desktop computer, a laptop computer, a gaming device, a virtual reality headset, a base station, a television, the computing device as described in connection with FIG. 7, etc.
[28] The receiver 103 may be configured to demodulate signals received from the transmitter 101, and to obtain the data sent by the transmitter 101. The receiver 103 may comprise, for example, one or more demappers (e.g., demapper 115), one or more channel estimators (e.g., channel estimator 117), and one or more decoders (e.g., decoder 119). The receiver 103 may comprise additional and/or alternative components for carrying out functions of the receiver 103. For example, a channel equalizer (not shown) configured to perform channel equalization may be implemented, in the receiver 103, to process received signals.
[29] The receiver 103 may be configured to convert (e.g., demodulate) a signal received from the transmitter 101 via the communication medium 105, and may generate one or more demodulated symbols corresponding to the received signal. For example (e.g., in QAM), the receiver 103 may multiply the received signal separately with a cosine signal and/or sine signal (e.g., of the same frequency as the received signal) to generate estimates of the values of symbols carried by the received signal.
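The quadrature modulation and demodulation steps described in paragraphs [25] and [29] can be sketched as follows. The carrier frequency, sample rate, and sign conventions below are illustrative assumptions, not values from the disclosure; the demodulator recovers the I/Q values by mixing with cosine/sine carriers and averaging over whole carrier cycles:

```python
import numpy as np

def qam_modulate(symbol, fc=1000.0, fs=100000.0, n_samples=1000):
    """Carry one complex symbol on two carriers 90 degrees out of phase:
    s(t) = I*cos(2*pi*fc*t) - Q*sin(2*pi*fc*t)."""
    t = np.arange(n_samples) / fs
    return symbol.real * np.cos(2 * np.pi * fc * t) - symbol.imag * np.sin(2 * np.pi * fc * t)

def qam_demodulate(signal, fc=1000.0, fs=100000.0):
    """Mix with cosine/sine carriers and average to recover the I/Q values."""
    t = np.arange(len(signal)) / fs
    i = 2.0 * np.mean(signal * np.cos(2 * np.pi * fc * t))
    q = -2.0 * np.mean(signal * np.sin(2 * np.pi * fc * t))
    return complex(i, q)
```

Because the defaults span an integer number of carrier cycles, the cross terms average to zero and the original symbol is recovered up to floating-point error.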
[30] The channel estimator 117 may be configured to estimate the condition of a communication channel, and may send the estimation to the demapper 115. For example, the channel estimator 117 may monitor the condition of a communication channel between the mapper 109 and the demapper 115. In some examples, the channel between the mapper 109 and the demapper 115 may be represented by r = hs + CN(0, σ²), where r may represent the output of the channel (e.g., the symbol received by the demapper 115), s may represent the input of the channel (e.g., the symbol sent by the mapper 109), h may represent the channel gain, and CN(0, σ²) may represent the AWGN of the channel with a zero (0) average and a σ² variance. The channel gain h and the channel AWGN variance σ² may, for example, be monitored by the channel estimator 117 using any type of channel estimation method (e.g., using a reference signal and/or a signal commonly known to the transmitter 101 and the receiver 103), and may be sent to the demapper 115 (e.g., for determining log-likelihood ratios).
[31] As explained in more detail below, the demapper 115 may comprise neural network(s) used to perform operations that may otherwise be performed using a soft demapping function to generate one or more confidence scores corresponding to one or more bits indicated by a symbol received from the transmitter 101. A confidence score corresponding to a bit may indicate a level of confidence in a determination that the bit corresponds to a one (1) or a zero (0). A soft demapping function for determining a confidence score based on a log-likelihood ratio L(b_i|r) is shown below as equation (1):
L(b_i | r) = log [ ( Σ_{s ∈ Ω_i^0} exp( −|r − h·s|² / σ² ) ) / ( Σ_{s ∈ Ω_i^1} exp( −|r − h·s|² / σ² ) ) ]    (1)
[32] In the equation (1), which assumes that all bits are equiprobable, r may represent a received symbol (e.g., a complex number), b_i may represent the ith bit indicated by the received symbol, L(b_i|r) may represent the log-likelihood ratio of the ith bit indicated by the received symbol, s may represent a symbol in the modulation constellation diagram used, Ω_i^1 may represent the set of constellation symbols whose ith bits correspond to ones (1), Ω_i^0 may represent the set of constellation symbols whose ith bits correspond to zeros (0), h may represent the channel gain, and σ² may represent the variance of the channel AWGN. For example, a higher log-likelihood ratio corresponding to a particular bit may indicate a higher confidence level that the bit corresponds to a zero (0), and a lower log-likelihood ratio corresponding to the bit may indicate a higher confidence level that the bit corresponds to a one (1).
[33] A log-likelihood ratio may additionally or alternatively be approximated according to the following equation:
L(b_i | r) ≈ ( min_{s ∈ Ω_i^1} |r − h·s|² − min_{s ∈ Ω_i^0} |r − h·s|² ) / σ²    (2)
[34] The equation (2) may be an approximation of the exact equation (1) for determining log-likelihood ratios. The approximation equation (2) may avoid the exponential and/or logarithmic operations associated with the equation (1). For a constellation (e.g., associated with PAM, QAM, etc.) comprising 2^m constellation symbols (e.g., m bits per symbol), the approximation equation (2) may require O(2^m) operations.
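For concreteness, the exact log-sum demapping rule of equation (1) and the max-log approximation of equation (2) can be sketched for an arbitrary constellation given as a symbol-to-bits table. The function names and table format are illustrative, not from the disclosure:

```python
import numpy as np

def exact_llr(r, constellation, bit_index, h=1.0, sigma2=1.0):
    """Equation (1): log-ratio of summed likelihoods (bit = 0 over bit = 1)."""
    num = sum(np.exp(-abs(r - h * s) ** 2 / sigma2)
              for s, bits in constellation.items() if bits[bit_index] == 0)
    den = sum(np.exp(-abs(r - h * s) ** 2 / sigma2)
              for s, bits in constellation.items() if bits[bit_index] == 1)
    return float(np.log(num) - np.log(den))

def approx_llr(r, constellation, bit_index, h=1.0, sigma2=1.0):
    """Equation (2): keep only the nearest constellation symbol in each set."""
    d0 = min(abs(r - h * s) ** 2 for s, bits in constellation.items() if bits[bit_index] == 0)
    d1 = min(abs(r - h * s) ** 2 for s, bits in constellation.items() if bits[bit_index] == 1)
    return (d1 - d0) / sigma2
```

At moderate noise the two agree closely; the approximation replaces each sum of exponentials by its dominant (nearest-symbol) term, which is what removes the exponential and logarithmic operations.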
[35] The confidence scores as determined by the demapper 115 may be sent to the decoder 119. The decoder 119 may be configured to decode received symbols, and to determine the data sent by the transmitter 101, for example, in connection with the demapper 115 and using any type of soft-decision algorithm, such as maximum a posteriori (MAP) decoding of error correction codes, etc. Based on the confidence scores as determined by the demapper 115, the decoder 119 may decode the error correction code associated with the received symbols, and may determine the data sent by the transmitter 101 (e.g., data prior to being encoded by the encoder 107).
[36] As an example of a process that may be implemented according to features described herein, data (e.g., four bits 0110) may be sent to the encoder 107. The encoder 107 may convert the data (e.g., 0110) into another code. For example, the encoder 107 may use a repetition code to encode the data (e.g., 0110), and may generate 01100110. The output of the encoder 107 (e.g., 01100110) may be sent to the mapper 109. The mapper 109 may map 01100110 to one or more symbols, for example, based on a correspondence between bit sequences and symbols.
[37] FIG. 1B shows an example constellation diagram 150 and an example correspondence between bit sequences and symbols (e.g., implemented by the mapper 109). The constellation diagram 150 may comprise one or more axes (e.g., a horizontal real axis and a vertical imaginary axis). The constellation diagram 150 may be regarded as a complex plane. Each symbol (e.g., a complex number) on the constellation diagram 150 may correspond to a bit sequence. For example, a symbol 151 (e.g., -1+3j) may correspond to a bit sequence 0110. The mapper 109 may, for example, map the output of the encoder 107 (e.g., 01100110) to two symbols -1+3j, -1+3j. The symbols may be modulated (e.g., with an in-phase carrier signal and a quadrature carrier signal), and the modulated signal may be sent to the receiver 103 via the communication medium 105.
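The mapping step can be sketched with one common per-axis Gray assignment (00→−3, 01→−1, 11→1, 10→3), with the first two bits on the in-phase axis and the last two on the quadrature axis. This convention reproduces the one pair given in the text (0110 → −1+3j), but the full assignment of FIG. 1B is an assumption here:

```python
# Per-axis 2-bit Gray mapping; an illustrative convention, not FIG. 1B verbatim.
GRAY2 = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}

def map_16qam(bits):
    """Map a bit sequence (length divisible by 4) to 16-QAM symbols:
    bits[0:2] select the in-phase level, bits[2:4] the quadrature level."""
    symbols = []
    for k in range(0, len(bits), 4):
        i = GRAY2[tuple(bits[k:k + 2])]
        q = GRAY2[tuple(bits[k + 2:k + 4])]
        symbols.append(complex(i, q))
    return symbols
```

Under this convention the encoder output 01100110 maps to the two symbols −1+3j, −1+3j, as in the text.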
[38] The receiver 103 may demodulate the received signal, and may generate two demodulated symbols (e.g., -0.76+1.3j, -1.1+2.15j). The demodulated symbols may be different than the two symbols generated by the mapper 109 (e.g., because of channel noise). The demapper 115 may determine the demodulated symbols, and may determine confidence scores for the demodulated symbols. For example, the confidence scores for the four bits of the first demodulated symbol may be 0.7, -1.3, -0.1, 1.4, and the confidence scores for the four bits of the second demodulated symbol may be 3.1, -0.7, -1.1, 2.3. A higher confidence score corresponding to a bit may indicate the bit is more likely to be a zero (0). A lower confidence score corresponding to a bit may indicate the bit is more likely to be a one (1). A zero confidence score corresponding to a bit may indicate that the bit is equally likely to be a one (1) or a zero (0). The confidence scores as determined by the demapper 115 may be sent to the decoder 119. The decoder 119 may determine, based on the confidence scores, data (e.g., 0110) corresponding to the data that was input to the encoder 107 (e.g., 0110).
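The decoder step in this example can be sketched as soft combining of the repetition code: sum the confidence scores of each repeated bit, then decide 0 for a positive sum and 1 for a negative sum. This is a standard soft decision, assumed here because the disclosure leaves the decoder's exact algorithm open:

```python
def decode_repetition(scores, factor=2):
    """Soft-combine the confidence scores of a block-repetition code and
    hard-decide each bit (positive sum -> 0, negative sum -> 1)."""
    k = len(scores) // factor
    combined = [sum(scores[i + r * k] for r in range(factor)) for i in range(k)]
    return [0 if c > 0 else 1 for c in combined]
```

Applied to the scores in the example (0.7, −1.3, −0.1, 1.4 and 3.1, −0.7, −1.1, 2.3), the combined sums are 3.8, −2.0, −1.2, 3.7, giving the original data 0110.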
[39] The demapper 115 may perform its functions (e.g., converting received symbols to confidence scores) on every symbol received by the receiver 103. As the amount of information communicated between devices increases, the demapper 115 may contribute to higher consumption of resources (e.g., computing resources, silicon area, etc.) and/or power. Using neural networks to implement the demapper 115 may help alleviate the challenges discussed above. For example, neural networks may be used to reduce complexity of the demapper 115, and to reduce resource and/or power consumption of the demapper 115. In some examples, with a neural network implementation, the demapper 115 may be of a complexity order O(m). Additionally or alternatively, neural networks may learn directly from the equation (1) for determining log-likelihood ratios, and may produce robust imitated results for sending to a decoder under various degrees of transceiver and/or channel impairments.
[40] FIG. 2 is a schematic diagram showing an example neural network 200 with which features described herein may be implemented. For example, the neural network 200 may be used to implement the demapper 115. The neural network 200 may comprise a multilayer perceptron (MLP). The neural network 200 may include an input layer 201, one or more hidden layers (e.g., hidden layers 203A-203B), and an output layer 205. There may be additional or alternative hidden layers in the neural network 200. Each of the layers may include one or more nodes. The nodes in the input layer 201 may receive data from outside the neural network 200. The nodes in the output layer 205 may output data to outside the neural network 200.
[41] Data (e.g., value(s) of a symbol) received by the nodes in the input layer 201 may flow through the nodes in the hidden layers 203A-203B to the nodes in the output layer 205. Nodes in one layer (e.g., the input layer 201) may associate with nodes in a next layer (e.g., the hidden layer 203A) via one or more connections. Each of the connections may have a weight. The value of one node in the hidden layers 203A-203B or the output layer 205 may correspond to the result of applying an activation function to a sum of the weighted inputs to the one node (e.g., a sum of the value of each node in a previous layer multiplied by the weight of the connection between each such node and the one node). The activation function may be a linear or non-linear function. For example, the activation function may include a hyperbolic tangent function, a sigmoid function, a rectified linear unit (ReLU), a leaky rectified linear unit (Leaky ReLU), etc. As an example, if x represents the value of a first node in a layer, a_1, ..., a_n represent the values of the nodes, in a preceding layer, connected to the first node, w_1, ..., w_n represent the weights of the connections between those nodes and the first node, f represents an activation function, and b represents a bias of the activation function, the following equation may be used for the neural network 200: x = f(w_1·a_1 + w_2·a_2 + ··· + w_n·a_n + b)    (3)
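Equation (3), applied to every node of a layer at once, is a matrix-vector product followed by an elementwise activation. A minimal numpy sketch, using the hyperbolic tangent named above as the default activation:

```python
import numpy as np

def layer_forward(a, W, b, activation=np.tanh):
    """Equation (3) for a whole layer: x = f(W a + b), where row j of W holds
    the connection weights into node j and b[j] is that node's bias."""
    return activation(W @ a + b)
```

Chaining `layer_forward` calls (input layer → hidden layers 203A-203B → output layer 205) gives the full forward pass of the neural network 200.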
[42] The neural network 200 may be used, for example, to imitate demapping functions (e.g., the equations (1) and/or (2) for calculating log-likelihood ratios). The neural network 200 may receive a symbol via the nodes in the input layer 201 (e.g., the value of each node in the input layer 201 may correspond to a value of the symbol). The value(s) of the symbol may flow through the neural network 200, and the nodes in the output layer 205 may indicate confidence score(s).
[43] The connection weights and/or other parameters (e.g., the bias of the activation function) of the neural network 200 may initially be configured with random values. Based on the initial connection weights and/or other parameters, the neural network 200 may generate output values different from the values that may be generated by the demapping functions. To optimize its output, the neural network 200 may be trained by adjusting the weights and/or other parameters (e.g., using gradient descent and/or backpropagation). For example, the neural network 200 may process one or more symbols, and may generate one or more corresponding outputs. Loss values may be calculated based on the outputs and what soft demapping functions (e.g., the equations (1) and/or (2)) may generate based on the one or more symbols. The weights and/or other parameters of the neural network 200 may be adjusted starting from the output layer 205 to the input layer 201 to minimize the loss values. In some examples, the neural network 200 may be used to implement the demapper 115. For example, the nodes in the input layer 201 may receive one or more values of a symbol. The nodes in the output layer 205 may generate confidence scores to be used by the decoder 119. In some examples, the weights and/or other parameters of the neural network 200 may be determined as described herein.
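The training procedure described above (generate outputs, compare with what a soft demapping function would produce, adjust weights by gradient descent and backpropagation) can be sketched with a tiny 2-16-1 network. The target used here is a simple BPSK-style linear LLR, 2·Re(r)/σ², standing in for the soft demapping function; the architecture, learning rate, and target are illustrative assumptions, not the patent's exact training setup:

```python
import numpy as np

def train_demapper_nn(steps=5000, lr=0.01, sigma2=0.5, seed=0):
    """Train a 2-16-1 tanh MLP by SGD on a squared-error loss so that its
    output approaches an illustrative log-likelihood-ratio target."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (16, 2)); b1 = np.zeros(16)
    W2 = rng.normal(0.0, 0.5, (1, 16)); b2 = np.zeros(1)

    def forward(x):
        h = np.tanh(W1 @ x + b1)
        return float((W2 @ h + b2)[0]), h

    for _ in range(steps):
        x = rng.uniform(-2.0, 2.0, 2)        # random received (I, Q) values
        y, h = forward(x)
        t = 2.0 * x[0] / sigma2              # illustrative target score
        err = y - t                          # d/dy of the loss 0.5*(y - t)**2
        gz = (err * W2[0]) * (1.0 - h ** 2)  # backpropagate through tanh
        W2 -= lr * err * h[None, :]; b2 -= lr * err
        W1 -= lr * np.outer(gz, x);  b1 -= lr * gz
    return forward
```

After training, the network's output should carry the correct sign (positive for symbols whose in-phase value suggests a zero, negative otherwise), imitating the target demapping function.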
[44] FIG. 3 is a schematic diagram showing an example system for demapping based on machine learning. The system may comprise the decoder 119 and one or more neural networks (e.g., neural networks 301A-301D). The system may comprise additional and/or alternative neural networks. The neural networks 301A-301D may be used to implement the demapper 115. A neural network of the neural networks 301A-301D may comprise, for example, a multilayer perceptron (e.g., the neural network 200), a convolutional neural network, a recurrent neural network, a deep neural network, or any other type of neural network. A neural network of the neural networks 301A-301D may comprise an input layer, one or more hidden layers (e.g., one hidden layer or a plurality of hidden layers), and an output layer. Each of the one or more hidden layers may comprise one or more nodes.
[45] The input layer of a neural network of the neural networks 301A-301D may comprise one or more nodes. In some examples (e.g., in a PAM system), the input layer may comprise one node. In some examples (e.g., in a QAM system), the input layer may comprise two nodes. In some examples, the input layer may comprise three or more nodes. A symbol (e.g., received by the receiver 103) may comprise one or more values. In some examples (e.g., in a PAM system), the symbol may comprise one value. In some examples (e.g., in a QAM system), the symbol may comprise two values (e.g., values 311, 313). In some examples, the symbol may comprise three or more values. The one or more values of the symbol (e.g., the values 311, 313) may be sent to the one or more nodes in the input layer of each of the one or more neural networks (e.g., the neural networks 301A-301D). The values of the symbol may flow through one or more hidden layers of each of the one or more neural networks to the output layer of each of the one or more neural networks.
[46] The output layer of a neural network of the neural networks 301A-301D may comprise, for example, one node. The number of neural networks 301A-301D may correspond to the number of bits indicated by a symbol (e.g., received by the receiver 103). For example, if 16-QAM is used, a 16-QAM symbol (e.g., received by the receiver 103) may indicate four bits, and four neural networks 301A-301D may be used. The node in the output layer of a neural network of the neural networks 301A-301D may generate a confidence score corresponding to one particular bit. For example, the neural network 301A may generate a confidence score 321 (e.g., corresponding to a first bit), the neural network 301B may generate a confidence score 323 (e.g., corresponding to a second bit), the neural network 301C may generate a confidence score 325 (e.g., corresponding to a third bit), and the neural network 301D may generate a confidence score 327 (e.g., corresponding to a fourth bit). The confidence scores 321, 323, 325, 327 may be sent to the decoder 119 for further processing (e.g., for obtaining the data sent by a transmitter by decoding of error correction codes).
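The FIG. 3 arrangement, one small network per bit with shared (I, Q) inputs and one confidence-score output each, can be sketched structurally as follows. The weights here are random placeholders standing in for trained parameters, and the class name and hidden-layer size are illustrative:

```python
import numpy as np

class NeuralDemapper:
    """One small MLP per bit: each network receives the symbol's two values
    (in-phase and quadrature) and emits a single confidence score."""
    def __init__(self, bits_per_symbol, hidden=8, seed=0):
        rng = np.random.default_rng(seed)
        # One (W1, b1, W2, b2) parameter set per bit, i.e. per network.
        self.nets = [(rng.normal(0.0, 0.5, (hidden, 2)), np.zeros(hidden),
                      rng.normal(0.0, 0.5, (1, hidden)), np.zeros(1))
                     for _ in range(bits_per_symbol)]

    def demap(self, symbol):
        x = np.array([symbol.real, symbol.imag])    # the two input values
        scores = []
        for W1, b1, W2, b2 in self.nets:
            h = np.tanh(W1 @ x + b1)                # hidden layer
            scores.append(float((W2 @ h + b2)[0]))  # one score per bit
        return scores
```

For 16-QAM, `NeuralDemapper(4)` mirrors the four networks 301A-301D feeding the decoder 119.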
[47] As an example of a 64-QAM demapping system, symbols (e.g., received by the receiver 103) may be sent to six neural networks. Each of the 64-QAM symbols may correspond to six bits. Each of the symbols may comprise two values (e.g., a first value corresponding to an in-phase component, and a second value corresponding to a quadrature component). Each of the six neural networks may correspond to a separate bit of six bits indicated by a 64-QAM symbol, and may comprise an input layer (e.g., comprising two nodes for receiving the two values of each symbol), a hidden layer (e.g., comprising one or more nodes), and an output layer comprising one node (e.g., corresponding to a confidence score for a bit of the indicated six bits). For example, a first neural network corresponding to a first bit may comprise one hidden layer of two nodes, a second neural network corresponding to a second bit may comprise one hidden layer of five nodes, a third neural network corresponding to a third bit may comprise one hidden layer of five nodes, a fourth neural network corresponding to a fourth bit may comprise one hidden layer of two nodes, a fifth neural network corresponding to a fifth bit may comprise one hidden layer of five nodes, and a sixth neural network corresponding to a sixth bit may comprise one hidden layer of five nodes. A hyperbolic tangent function may, for example, be used as activation function(s) for each of the neural networks. For 64-QAM, the six neural networks may be trained to closely approximate demapping functions, such as the equations (1) or (2). Additionally, over various degrees of channel noise, the performance of a demapper comprising the six neural networks (e.g., as reflected by the bit error rate of the output of a decoder connected to the demapper) may be substantially similar to the performance of a demapper using soft demapping functions, such as the equations (1) or (2).
A demapper comprising the six neural networks may have a complexity of O(24), while a demapper using soft demapping functions may have a complexity of O(64). Complexity of a demapper may be reduced using neural networks.
[48] Additionally or alternatively, the number of nodes in a hidden layer of a neural network corresponding to a particular bit (e.g., of the six bits indicated by a 64-QAM symbol) may be determined and/or adjusted, for example, based on the degree of non-linearity of soft demapping function(s) (e.g., the equations (1) or (2)), corresponding to that particular bit, with respect to the in-phase value and/or the quadrature value of the symbol. For example, if a soft demapping function, corresponding to that particular bit, with respect to the in-phase value and/or the quadrature value comprises a monotonic function, the number of nodes in the hidden layer of the neural network may be set to a small number (e.g., two). If a soft demapping function, corresponding to that particular bit, with respect to the in-phase value and/or the quadrature value comprises, for example, a plurality of crests and/or troughs, the number of nodes in the hidden layer of the neural network may be set to a large number (e.g., five). Additionally or alternatively, if the soft demapping function comprises more crests and/or troughs, the number of nodes in the hidden layer of the neural network may be set to be larger.
[49] As an example of a 1024-QAM demapping system, symbols (e.g., received by the receiver 103) may be sent to ten neural networks. Each of the 1024-QAM symbols may correspond to ten bits. Each of the symbols may comprise two values (e.g., a first value corresponding to an in-phase component, and a second value corresponding to a quadrature component). Each of the ten neural networks may correspond to a separate bit of ten bits indicated by a 1024-QAM symbol, and may comprise an input layer (e.g., comprising two nodes for receiving the two values of each symbol), a hidden layer (e.g., comprising one or more nodes), and an output layer comprising one node (e.g., corresponding to a confidence score for a bit of the indicated ten bits). For example, a first neural network corresponding to a first bit may comprise one hidden layer of two nodes, a second neural network corresponding to a second bit may comprise one hidden layer of ten nodes, a third neural network corresponding to a third bit may comprise one hidden layer of ten nodes, a fourth neural network corresponding to a fourth bit may comprise one hidden layer of fifteen nodes, a fifth neural network corresponding to a fifth bit may comprise one hidden layer of twenty nodes, a sixth neural network corresponding to a sixth bit may comprise one hidden layer of two nodes, a seventh neural network corresponding to a seventh bit may comprise one hidden layer of ten nodes, an eighth neural network corresponding to an eighth bit may comprise one hidden layer of ten nodes, a ninth neural network corresponding to a ninth bit may comprise one hidden layer of fifteen nodes, and a tenth neural network corresponding to a tenth bit may comprise one hidden layer of twenty nodes. A hyperbolic tangent function may, for example, be used as activation function(s) for each of the neural networks. For 1024-QAM, the ten neural networks may be trained to closely approximate demapping functions, such as the equations (1) or (2). 
Additionally, over various degrees of channel noise, the performance of a demapper comprising the ten neural networks (e.g., as reflected by the bit error rate of the output of a decoder connected to the demapper) may be substantially similar to the performance of a demapper using soft demapping functions, such as the equations (1) or (2). A demapper comprising the ten neural networks may have a complexity of O(114), while a demapper using soft demapping functions may have a complexity of O(1024). Complexity of a demapper may be reduced using neural networks.
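The ten-network arrangement described above can be sketched in pure Python (a minimal illustration: the hidden-layer widths follow the example, a hyperbolic tangent is the activation function, and the random weights are untrained placeholders standing in for trained parameters):

```python
import math
import random

# Hidden-layer widths for the ten per-bit networks, as described above.
HIDDEN_SIZES_1024QAM = [2, 10, 10, 15, 20, 2, 10, 10, 15, 20]

def make_network(n_hidden, rng):
    """Create a 2 -> n_hidden -> 1 multilayer perceptron with random
    (untrained, placeholder) weights."""
    return {
        "w1": [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(n_hidden)],
        "b1": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "w2": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "b2": rng.uniform(-1, 1),
    }

def forward(net, i_value, q_value):
    """Forward pass: two input nodes (in-phase, quadrature), one hidden
    layer with tanh activation, one output node (confidence score)."""
    hidden = [
        math.tanh(w[0] * i_value + w[1] * q_value + b)
        for w, b in zip(net["w1"], net["b1"])
    ]
    return sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"]

def demap_1024qam(networks, i_value, q_value):
    """Demap one received symbol to ten confidence scores, one per bit."""
    return [forward(net, i_value, q_value) for net in networks]

rng = random.Random(0)
networks = [make_network(n, rng) for n in HIDDEN_SIZES_1024QAM]
scores = demap_1024qam(networks, 0.7, -0.3)  # ten scores, one per bit
```

The total hidden-node count across the ten networks is 2+10+10+15+20+2+10+10+15+20 = 114, consistent with the O(114) complexity stated above.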
[50] As an example of a 256-QAM demapping system, symbols (e.g., received by the receiver 103) may be sent to eight neural networks. Each of the 256-QAM symbols may correspond to eight bits. Each of the symbols may comprise two values (e.g., a first value corresponding to an in-phase component, and a second value corresponding to a quadrature component). Each of the eight neural networks may correspond to a separate bit of eight bits indicated by a 256-QAM symbol, and may comprise an input layer (e.g., comprising two nodes for receiving the two values of each symbol), a hidden layer (e.g., comprising one or more nodes), and an output layer comprising one node (e.g., corresponding to a confidence score for a bit of the indicated eight bits). For example, a first neural network corresponding to a first bit may comprise one hidden layer of two nodes, a second neural network corresponding to a second bit may comprise one hidden layer of two nodes, a third neural network corresponding to a third bit may comprise one hidden layer of six nodes, a fourth neural network corresponding to a fourth bit may comprise one hidden layer of six nodes, a fifth neural network corresponding to a fifth bit may comprise one hidden layer of six nodes, a sixth neural network corresponding to a sixth bit may comprise one hidden layer of six nodes, a seventh neural network corresponding to a seventh bit may comprise one hidden layer of ten nodes, and an eighth neural network corresponding to an eighth bit may comprise one hidden layer of ten nodes. A hyperbolic tangent function may, for example, be used as activation function(s) for each of the neural networks. For 256-QAM, the eight neural networks may be trained to closely approximate demapping functions, such as the equations (1) or (2).
Additionally, over various degrees of channel noise, the performance of a demapper comprising the eight neural networks (e.g., as reflected by the bit error rate of the output of a decoder connected to the demapper) may be substantially similar to the performance of a demapper using soft demapping functions, such as the equations (1) or (2). A demapper comprising the eight neural networks may have a complexity of O(48), while a demapper using soft demapping functions may have a complexity of O(256). Complexity of a demapper may be reduced using neural networks. In some examples, a demapper comprising neural networks may be implemented in a 5G receiver (e.g., in a physical downlink shared channel (PDSCH) of the 5G receiver), and the PDSCH throughput of a 5G link may be substantially similar to where demapping functions (e.g., the equations (1) or (2)) are used for demapping.
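The complexity figures quoted above can be checked by totaling the hidden-layer widths of the per-bit networks (treating the total hidden-node count as the cost measure is an assumption about how the O(114) and O(48) figures were derived):

```python
# Hidden-layer widths quoted above for the per-bit neural networks.
hidden_1024qam = [2, 10, 10, 15, 20, 2, 10, 10, 15, 20]
hidden_256qam = [2, 2, 6, 6, 6, 6, 10, 10]

# A demapper using soft demapping functions evaluates terms over every
# constellation point, so its cost scales with the constellation size,
# while the neural-network demapper's cost scales with its hidden nodes.
nn_cost_1024 = sum(hidden_1024qam)  # 114, versus 1024 constellation points
nn_cost_256 = sum(hidden_256qam)    # 48, versus 256 constellation points
```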
[51] FIG. 4 is a schematic diagram showing another example system for demapping based on machine learning. The system may comprise the decoder 119 and one or more neural networks (e.g., neural network 401). The neural network 401 may be used to implement the demapper 115. In some examples, the demapper 115 may be implemented by a single neural network 401. The neural network 401 may comprise, for example, a multilayer perceptron (e.g., the neural network 200), a convolutional neural network, a recurrent neural network, a deep neural network, or any other type of neural network. The neural network 401 may comprise an input layer, one or more hidden layers, and an output layer.
[52] The input layer of the neural networks 401 may comprise one or more nodes. In some examples (e.g., in a PAM system), the input layer may comprise one node. In some examples (e.g., in a QAM system), the input layer may comprise two nodes. In some examples, the input layer may comprise three or more nodes. A symbol (e.g., received by the receiver 103) may comprise one or more values. In some examples (e.g., in a PAM system), the symbol may comprise one value. In some examples (e.g., in a QAM system), the symbol may comprise two values (e.g., values 411, 413). In some examples, the symbol may comprise three or more values. The one or more values of the symbol (e.g., the values 411, 413) may be sent to the one or more nodes in the input layer of the neural network 401. The values of the symbol may flow through one or more hidden layers of the neural network 401 to the output layer of the neural network 401.
[53] The output layer of the neural network 401 may comprise one or more nodes (e.g., corresponding to the number of bits indicated by a symbol received by the receiver 103). For example, if 16-QAM is used, a symbol (e.g., received by the receiver 103) may indicate four bits, and the output layer of the neural network 401 may comprise four nodes. The one or more nodes in the output layer of the neural network 401 may generate one or more corresponding confidence scores (e.g., confidence scores 421, 423, 425, 427). The one or more confidence scores may be sent to the decoder 119 for further processing (e.g., for obtaining the data sent by a transmitter by decoding of error correcting codes). Additionally or alternatively, the demapper 115 may be implemented using a combination of the various types of neural networks as discussed in connection with FIGS. 3-4. For example, for some bits indicated by a symbol, one neural network may be used for each bit, and for other bits indicated by the symbol, one neural network may be used for some or all of the other bits.
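In contrast to the per-bit networks of FIG. 3, the single-network arrangement of FIG. 4 produces all confidence scores from one forward pass. A minimal sketch for 16-QAM follows (the hidden-layer width and the random placeholder weights are assumptions standing in for trained parameters):

```python
import math
import random

def forward_multi_output(net, i_value, q_value):
    """Single network: two input nodes -> one tanh hidden layer -> four
    output nodes, one confidence score per bit of a 16-QAM symbol."""
    hidden = [
        math.tanh(w[0] * i_value + w[1] * q_value + b)
        for w, b in zip(net["w1"], net["b1"])
    ]
    return [
        sum(w * h for w, h in zip(w_out, hidden)) + b_out
        for w_out, b_out in zip(net["w2"], net["b2"])
    ]

rng = random.Random(1)
n_hidden, n_bits = 8, 4  # hidden width is an assumed placeholder value
net = {
    "w1": [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n_hidden)],
    "b1": [rng.uniform(-1, 1) for _ in range(n_hidden)],
    "w2": [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_bits)],
    "b2": [rng.uniform(-1, 1) for _ in range(n_bits)],
}
scores = forward_multi_output(net, 0.7, -0.3)  # four scores for four bits
```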
[54] FIG. 5 is a schematic diagram showing an example process for training a neural network for demapping. The process may be implemented by one or more computing devices (e.g., the computing device as described in connection with FIG. 7). The process may be distributed across multiple computing devices, or may be performed by a single computing device. The process may use a neural network 501, a demapping function 503, and a comparison function 505. The neural network 501 may comprise any type of neural network, may be trained to perform demapping functions, and may be used to implement a demapper (e.g., the demapper 115).
[55] The demapping function 503 may comprise, for example, the equation (1) for calculating log-likelihood ratios, the approximation equation (2) for calculating log-likelihood ratios, or any other soft demapping function. The demapping function 503 may be used to produce desired results, which the neural network 501 may be trained to imitate (e.g., in a supervised learning setting).
[56] A symbol used for training the neural network 501 may be sent to the neural network 501 and the demapping function 503. The symbol may be determined, for example, from a modulation constellation for which the neural network 501 is trained. For example, the symbol may be a random point in the modulation constellation. The neural network 501 may comprise an output layer comprising one or more nodes. The symbol may flow through the neural network 501, and the one or more output nodes may produce one or more first confidence scores (e.g., corresponding to one or more of the bits indicated by the symbol). Additionally, the demapping function 503 may process the symbol, and may generate one or more second confidence scores (e.g., corresponding to the same one or more bits indicated by the symbol).
[57] The comparison function 505 may compare the one or more first confidence scores and the one or more second confidence scores. Based on the comparison, one or more corresponding loss values may be generated. The loss value(s) may indicate difference(s) between the first confidence score(s) and the corresponding second confidence score(s). The loss value(s) may be used to adjust the parameters of the neural network 501 (e.g., using backpropagation). For example, a loss value corresponding to an output node of the neural network 501 may indicate that a second confidence score (as determined by the demapping function 503) corresponding to the output node is higher than a first confidence score (as determined by the neural network 501) corresponding to the output node. The parameters, associated with the output node, of the neural network 501 may be adjusted in such a manner that the output node may produce a higher confidence score (e.g., approaching the second confidence score) based on the input symbol. [58] For example, the nodes, in the layer preceding the output layer, connected to the output node may be determined, and the weights of the connections that positively contributed to the value of the output node may be increased. The value of a node in the preceding layer may be increased (e.g., by adjusting the weights of connections contributing to the node in the preceding layer) if the connection corresponding to the node positively contributed to the value of the output node. The value of a node in the preceding layer may be decreased if the connection corresponding to the node negatively contributed to the value of the output node. The parameters associated with preceding layers of the neural network 501 may in turn be adjusted in a similar manner. Additionally or alternatively, a bias in an activation function of the neural network 501 may be adjusted (e.g., increased or decreased) so that the output node may produce a higher confidence score.
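As an illustration of this comparison-and-adjustment process, the following minimal pure-Python sketch trains a small two-input, one-hidden-layer network against a stand-in target function (the target function, hidden-layer width, and learning rate are assumptions for illustration; equations (1) and (2) are defined elsewhere in this document):

```python
import math
import random

def forward(net, symbol):
    """Forward pass of a 2 -> H -> 1 perceptron with tanh hidden units."""
    hidden = [math.tanh(w[0] * symbol[0] + w[1] * symbol[1] + b)
              for w, b in zip(net["w1"], net["b1"])]
    out = sum(v * h for v, h in zip(net["w2"], hidden)) + net["b2"]
    return hidden, out

def train_step(net, symbol, target, lr):
    """Compare the network's confidence score against the desired score
    from the demapping function, then backpropagate the difference."""
    hidden, out = forward(net, symbol)
    err = out - target  # signed difference between first and second scores
    for j, h in enumerate(hidden):
        grad_h = err * net["w2"][j] * (1.0 - h * h)  # through tanh
        net["w2"][j] -= lr * err * h
        net["w1"][j][0] -= lr * grad_h * symbol[0]
        net["w1"][j][1] -= lr * grad_h * symbol[1]
        net["b1"][j] -= lr * grad_h
    net["b2"] -= lr * err
    return 0.5 * err * err  # squared-error loss value

# Stand-in for the demapping function 503 (a smooth score function).
def desired_score(symbol):
    return math.tanh(2.0 * symbol[0])

rng = random.Random(0)
H = 5
net = {"w1": [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(H)],
       "b1": [rng.uniform(-1, 1) for _ in range(H)],
       "w2": [rng.uniform(-1, 1) for _ in range(H)],
       "b2": rng.uniform(-1, 1)}

training_symbols = [(rng.uniform(-1, 1), rng.uniform(-1, 1))
                    for _ in range(200)]
avg_losses = []
for epoch in range(100):
    total = sum(train_step(net, s, desired_score(s), lr=0.05)
                for s in training_symbols)
    avg_losses.append(total / len(training_symbols))
```

Each pass computes a loss value (the squared difference between the network's score and the desired score) and nudges the weights and biases in the directions described above.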
[59] FIGS. 6A-6B are a flowchart showing an example method for demapping based on machine learning. The method may be performed, for example, by one or more of the systems as discussed in connection with FIGS. 1A, 3-4, and/or using one or more of the processes as discussed in connection with FIG. 5. The steps of the method may be described as being performed by particular components and/or computing devices for the sake of simplicity, but the steps may be performed by any component and/or computing device. The steps of the method may be performed by a single computing device or by multiple computing devices. One or more steps of the method may be omitted, added, and/or rearranged as desired by a person of ordinary skill in the art.
[60] In step 601, a computing device (e.g., a computing device implementing the receiver 103 comprising the demapper 115) may determine whether a symbol (e.g., a PAM symbol, a QAM symbol, or any other type of constellation symbol) is received. For example, a transmitter (e.g., the transmitter 101) may send a signal to the computing device, and the computing device may demodulate the signal, and may generate a corresponding symbol. If the computing device receives a symbol (step 601: Y), the method may proceed to step 603.
[61] In step 603, the received symbol may be input to one or more neural networks. The received symbol may correspond to (e.g., indicate) one or more bits. For example, in 16-QAM, a symbol may correspond to four bits. In 1024-QAM, a symbol may correspond to ten bits. The one or more neural networks may be configured to demap the received symbol to confidence scores corresponding to one or more bits indicated by the received symbol. In some examples, the received symbol may comprise one value. In some examples, the received symbol may comprise two values. In some examples, the received symbol may comprise three or more values. The value(s) of the received symbol may be received by node(s) in an input layer of a neural network. In some examples, one single neural network may be used for demapping the received symbol. In some examples, one or more neural networks may be used for demapping the received symbol, where each neural network corresponds to one bit of the one or more bits indicated by the received symbol. In some examples, a plurality of neural networks may be used for demapping the received symbol, where at least one neural network of the plurality of neural networks corresponds to multiple bits indicated by the received symbol, and at least one neural network of the plurality of neural networks corresponds to a single bit indicated by the same received symbol.
[62] In step 605, the computing device may demap, using the neural network(s), the received symbol to generate one or more confidence scores corresponding to one or more bits indicated by the received symbol. For example, the value(s) of the received symbol may be received by the node(s) in an input layer of a neural network of the neural network(s), and may flow through one or more hidden layers of the neural network to an output layer of the neural network. The value(s) of the node(s) in the output layer may correspond to confidence score(s) associated with the received symbol. In some examples, one single neural network may be used for demapping the received symbol, and the nodes in the output layer of the one single neural network may generate the confidence scores corresponding to the bits indicated by the received symbol. In some examples, a plurality of neural networks may be used for demapping the received symbol, and each neural network may be responsible for generating one confidence score corresponding to one particular bit of the bits indicated by the received symbol. In some examples, a plurality of neural networks may be used for demapping the received symbol, where at least one neural network of the plurality of neural networks may be responsible for generating one confidence score corresponding to one particular bit of the bits indicated by the received symbol, and at least one neural network of the plurality of neural networks may be responsible for generating multiple confidence scores corresponding to multiple bits of the bits indicated by the same received symbol. [63] In step 607, the computing device may send, to a decoder (e.g., the decoder 119), one or more confidence scores (as determined in step 605) corresponding to one or more bits indicated by the received symbol.
The decoder may determine, based on the one or more confidence scores (and/or confidence score(s) associated with symbol(s) received previously or afterward), data that was sent to the computing device (e.g., data prior to being encoded by an encoder, such as the encoder 107).
[64] If the computing device does not receive a symbol (step 601: N), or following step 607, the method may proceed to step 651. In step 651 (FIG. 6B), the computing device may determine whether to update one or more neural networks used to implement functions of a demapper (e.g., the demapper 115). The computing device may make this determination in various manners. For example, the computing device may update the one or more neural networks periodically at an updating frequency (e.g., once every 5 seconds). The updating frequency may be, for example, specified by an administrator and/or user of the computing device. Additionally or alternatively, the updating frequency may be modified dynamically, for example, based on a type of the computing device and/or based on other factors. For example, the updating frequency for the computing device may be set to be higher if the computing device is a mobile device, and may be set to be lower if the computing device is a stationary device.
[65] Additionally or alternatively, the computing device may update the one or more neural networks if a degree of change in network conditions related to the computing device satisfies (e.g., meets or exceeds) a degree threshold (e.g., 5%). For example, a channel estimator (e.g., the channel estimator 117) implemented on the computing device may monitor the network conditions related to the computing device. The channel estimator may monitor, for example, the channel gain h, the variance of the channel AWGN σ², and/or other parameters indicating the channel condition. If a degree of the change of one or more of the monitored parameters (e.g., the channel gain h) from the last time the one or more neural networks were updated satisfies a degree threshold (e.g., 5%), the computing device may update the one or more neural networks, so that the accuracy of the one or more neural networks in demapping received symbols may be maintained. [66] If the computing device determines not to update the one or more neural networks used to implement functions of a demapper (step 651: N), the method may repeat step 601 (FIG. 6A). If the computing device determines to update the one or more neural networks used to implement functions of a demapper (step 651: Y), the method may proceed to step 653. In step 653, the computing device may determine a demapping function, and/or parameters associated with the demapping function, to be used for training the one or more neural networks. The demapping function may comprise, for example, the equation (1) for calculating log-likelihood ratios, the approximation equation (2) for calculating log-likelihood ratios, or any other demapping function. The demapping function may be used to produce desired results, which the one or more neural networks may be trained to imitate (e.g., in a supervised learning setting).
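The degree-threshold trigger described above (e.g., a 5% change in the estimated channel gain h since the last update) might be checked as follows (function and parameter names are illustrative):

```python
def should_update(h_last, h_current, threshold=0.05):
    """Decide whether to retrain the neural networks: has the monitored
    channel gain changed by at least the degree threshold (e.g., 5%)
    since the last time the networks were updated?"""
    if h_last == 0:
        return True  # no meaningful baseline: retrain to be safe
    change = abs(h_current - h_last) / abs(h_last)
    return change >= threshold

print(should_update(1.00, 1.02))  # False: 2% change, below threshold
print(should_update(1.00, 1.10))  # True: 10% change meets the threshold
```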
[67] The determination of a demapping function may be made based on, for example, the amount of available resources of the computing device. For example, if the computing device uses a substantial portion of its resources (e.g., computing resources, power, etc.) for processes other than the training of the one or more neural networks, the computing device may select a demapping function with lower complexity and/or resource consumption (e.g., selecting the approximation equation (2) instead of the equation (1)). The parameters associated with the determined demapping function may be determined. For example, the channel gain h and/or the variance of the channel AWGN σ² may be determined for using the equation (1) or the approximation equation (2). The parameters may be monitored and/or determined, for example, by a channel estimator (e.g., the channel estimator 117) implemented on the computing device.
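Equations (1) and (2) are not reproduced in this passage; as an illustration of the complexity trade-off described, the following sketch contrasts a log-sum LLR with its lower-complexity max-log approximation for one bit of a 4-PAM constellation (the constellation levels, the noise scaling, and the sign convention are assumptions, not the patent's exact equations):

```python
import math

# Sign bit of a 4-PAM constellation: levels mapped to bit value 0 or 1.
LEVELS_BIT0 = {0: [1.0, 3.0], 1: [-1.0, -3.0]}

def exact_llr(y, h, sigma2):
    """Log-sum LLR: sums likelihoods over every constellation level,
    so its cost grows with the constellation size."""
    num = sum(math.exp(-(y - h * s) ** 2 / sigma2) for s in LEVELS_BIT0[0])
    den = sum(math.exp(-(y - h * s) ** 2 / sigma2) for s in LEVELS_BIT0[1])
    return math.log(num / den)

def maxlog_llr(y, h, sigma2):
    """Max-log approximation: keeps only the nearest level per bit value,
    trading a small accuracy loss for lower complexity."""
    d0 = min((y - h * s) ** 2 for s in LEVELS_BIT0[0])
    d1 = min((y - h * s) ** 2 for s in LEVELS_BIT0[1])
    return (d1 - d0) / sigma2

llr_exact = exact_llr(2.0, 1.0, 0.5)
llr_approx = maxlog_llr(2.0, 1.0, 0.5)  # close to the exact value
```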
[68] In step 655, the computing device may determine a training set for training the one or more neural networks. The training set may comprise, for example, one or more symbols (e.g., -4-4j, -3.9-4j, -3.8-4j, etc.) selected from the constellation diagram of the modulation system used by the computing device (e.g., 16-QAM, 1024-QAM, etc.). For example, the one or more symbols may be evenly selected from the constellation diagram at a certain density (e.g., selecting a training symbol every 0.1 magnitude change along the real axis and/or the imaginary axis). Some training symbols (e.g., -3.9-4j, -4-3.9j, -3.9-3.9j, etc.) may be selected from locations near the intended symbol locations (e.g., -4-4j, etc.) in the constellation. Additionally or alternatively, the one or more symbols for the training may be randomly selected from the constellation diagram. Additionally or alternatively, the computing device may use received symbols (e.g., from a transmitter) for the training (e.g., as the symbols are being received by the computing device).
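The sampling strategies described above might be sketched as follows (the 0.1 spacing and the ±4 constellation extent follow the examples in the text; function names are illustrative):

```python
import random

def grid_training_set(step=0.1, extent=4.0):
    """Evenly sample training symbols from the constellation plane at a
    fixed density (here, every 0.1 along the real and imaginary axes)."""
    n = int(round(2 * extent / step)) + 1
    return [complex(-extent + i * step, -extent + j * step)
            for i in range(n) for j in range(n)]

def random_training_set(count, extent=4.0, rng=None):
    """Randomly sample training symbols from the constellation plane."""
    rng = rng or random.Random(0)
    return [complex(rng.uniform(-extent, extent),
                    rng.uniform(-extent, extent))
            for _ in range(count)]

grid = grid_training_set()        # 81 x 81 symbols, including -4-4j
rand = random_training_set(1000)  # randomly selected symbols
```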
[69] In step 657, the computing device may use the demapping function (as determined in step 653) to process the training set. For example, for each symbol in the training set, the computing device may use the demapping function to calculate one or more confidence scores (e.g., log-likelihood ratios) corresponding to one or more bits indicated by the symbol. In step 659, the computing device may use the one or more neural networks to process the training set. For example, for each symbol in the training set, the computing device may use the one or more neural networks to calculate one or more confidence scores corresponding to one or more bits indicated by the symbol.
[70] In step 661, the computing device may adjust the parameters of the one or more neural networks. For example, the computing device may determine one or more loss values corresponding to a particular bit of the one or more bits of a symbol. The loss values may indicate, for example, the differences between confidence scores, as determined by the demapping function, corresponding to that bit and confidence scores, as determined by a neural network of the one or more neural networks, corresponding to that bit. The computing device may adjust, based on the loss values, the parameters of the neural network configured to generate confidence scores for that bit. For example, the loss values may indicate whether the value of the output node corresponding to that bit should be increased or decreased. The weights and/or other parameters of the neural network may accordingly be adjusted from the output layer to the preceding layers (e.g., using backpropagation).
[71] In step 663, the computing device may determine whether to perform additional training. For example, the computing device may set an amount of time to be used for training the one or more neural networks, and if the time has expired, the computing device may determine not to perform additional training. Additionally or alternatively, the computing device may determine to perform additional training for a neural network if the loss value(s) associated with the neural network satisfies (e.g., meets or exceeds) a loss value threshold. For example, the computing device may determine not to perform additional training for a neural network if the loss value(s) associated with the neural network fall below the loss value threshold. Additionally or alternatively, the computing device may adjust the number of nodes in the hidden layer(s) of the one or more neural networks. For example, if the loss value(s) associated with a neural network exceed a loss value threshold after a predetermined number of training sessions of the neural network and/or after a predetermined amount of time has been used for training the neural network, the computing device may, for example, increase the number of nodes in the hidden layer(s) of the neural network, and may perform additional training on the neural network comprising the adjusted number of nodes in its hidden layer(s).
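The adjust-and-retrain loop described above can be sketched with a toy stand-in for the training routine (all names, thresholds, and the growth step are illustrative assumptions):

```python
def train_until(net_factory, train_fn, loss_threshold,
                max_sessions=5, initial_hidden=2, growth=2):
    """Train a network; if the loss still meets or exceeds the threshold
    after the allotted training sessions, add hidden nodes and retrain."""
    n_hidden = initial_hidden
    while True:
        net = net_factory(n_hidden)
        for _ in range(max_sessions):
            loss = train_fn(net)
            if loss < loss_threshold:
                return net, n_hidden, loss
        n_hidden += growth  # loss still too high: widen the hidden layer

# Toy stand-ins: pretend the achievable loss shrinks as the layer widens.
net, width, loss = train_until(
    net_factory=lambda h: {"hidden_nodes": h},
    train_fn=lambda net: 0.1 / net["hidden_nodes"],
    loss_threshold=0.02,
)
```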
[72] If the computing device determines to perform additional training (step 663: Y), the method may repeat step 655. In step 655, the computing device may determine a training set to be used for the additional training. The computing device may use the same training set used in the previous training, or may use a training set that is different from the training set used in the previous training. If the computing device determines not to perform additional training (step 663: N), the method may proceed to step 665. In step 665, the computing device may configure the one or more trained neural networks to process symbols received by the computing device. For example, the one or more trained neural networks may be configured to implement the demapper 115. Additionally or alternatively, during the training of the one or more neural networks, the computing device may use the one or more neural networks and/or demapping functions (e.g., the equations (1) or (2)) for demapping received symbols. Following step 665, step 601 may be repeated.
[73] In some examples, the one or more neural networks may be trained as symbols are received by the computing device (e.g., from the transmitter 101). The computing device may determine whether to train the one or more neural networks (e.g., in a similar manner as in step 663). For example, the computing device may determine whether differences between confidence scores, as determined by the one or more neural networks, corresponding to a received symbol, and confidence scores, as determined by demapping functions (such as the equations (1) or (2)), corresponding to the received symbol satisfy (e.g., meet or exceed) a difference threshold. If the differences satisfy the difference threshold, the computing device may use demapping functions (such as the equations (1) or (2)) for processing symbols as they are being received, and may use the results produced by the processing to train the one or more neural networks. If training of the one or more neural networks is completed (e.g., if the differences fall below the difference threshold), the computing device may configure the one or more neural networks to process future symbols, and/or may stop using the demapping functions for processing the symbols. If the computing device detects the one or more neural networks are to be updated (e.g., in step 651), the computing device may switch back to using the demapping functions for processing symbols (e.g., until the updating of the one or more neural networks is completed). In this manner, resources used for the training may also be used for the actual demapping of received symbols by the computing device.
[74] In some examples, a second computing device may train the one or more neural networks, and may send, to the computing device, the trained one or more neural networks. For example, if the computing device determines that the one or more neural networks are to be updated (e.g., in step 651), the computing device may send, to the second computing device, instructions to update the one or more neural networks. The instructions may indicate the demapping functions and/or associated parameters to be used for the training (e.g., the demapping functions and/or associated parameters may be determined in step 653). Based on receiving the instructions, the second computing device may perform the training of the one or more neural networks. If the training of the one or more neural networks is completed, the second computing device may send, to the computing device, the trained one or more neural networks. The computing device may configure the received one or more neural networks to process symbols received by the computing device. The second computing device may comprise, for example, a server, a datacenter, etc., and may be used to perform the training so that the computing device may avoid consuming resources for the training.
[75] FIG. 7 illustrates an example apparatus, in particular a computing device 712, that may be used in a communication system such as the one shown in FIG. 1A, to implement any or all of the transmitter 101, the receiver 103, the encoder 107, the mapper 109, the demapper 115, the channel estimator 117, the decoder 119, any or all of the example processes in FIG. 5, and/or other computing devices to perform the steps described above and in FIGS. 6A-6B. Computing device 712 may include a controller 725. The controller 725 may be connected to a user interface control 730, display 736 and/or other elements as shown. Controller 725 may include circuitry, such as for example one or more processors 728 and one or more memories 734 storing software 740 (e.g., computer executable instructions). The software 740 may comprise, for example, one or more of the following software options: user interface software, server software, etc., including the encoder 107, the mapper 109, the channel estimator 117, the demapper 115, the decoder 119, the constellation diagram 150, the neural networks 200, 301A-301D, 401, 501, the demapping function 503, the comparison function 505, etc.
[76] Device 712 may also include a battery 750 or other power supply device, speaker 753, and one or more antennae 754. Device 712 may include user interface circuitry, such as user interface control 730. User interface control 730 may include controllers or adapters, and other circuitry, configured to receive input from or provide output to a keypad, touch screen, voice interface - for example via microphone 756, function keys, joystick, data glove, mouse and the like. The user interface circuitry and user interface software may be configured to facilitate user control of at least some functions of device 712 through use of a display 736. Display 736 may be configured to display at least a portion of a user interface of device 712. Additionally, the display may be configured to facilitate user control of at least some functions of the device (for example, display 736 could be a touch screen).
[77] Software 740 may be stored within memory 734 to provide instructions to processor 728 such that when the instructions are executed, processor 728, device 712 and/or other components of device 712 are caused to perform various functions or methods such as those described herein (for example, as depicted in FIGS. 3-5, 6A-6B). The software may comprise machine executable instructions and data used by processor 728 and other components of computing device 712 and may be stored in a storage facility such as memory 734 and/or in hardware logic in an integrated circuit, ASIC, etc. Software may include both applications and operating system software, and may include code segments, instructions, applets, pre-compiled code, compiled code, computer programs, program modules, engines, program logic, and combinations thereof. [78] Memory 734 may include any of various types of tangible machine-readable storage medium, including one or more of the following types of storage devices: read only memory (ROM) modules, random access memory (RAM) modules, magnetic tape, magnetic discs (for example, a fixed hard disk drive or a removable floppy disk), optical disk (for example, a CD-ROM disc, a CD-RW disc, a DVD disc), flash memory, and EEPROM memory. As used herein (including the claims), a tangible or non-transitory machine-readable storage medium is a physical structure that may be touched by a human. A signal would not by itself constitute a tangible or non-transitory machine-readable storage medium, although other embodiments may include signals or ephemeral versions of instructions executable by one or more processors to carry out one or more of the operations described herein.
[79] As used herein, processor 728 (and any other processor or computer described herein) may include any of various types of processors whether used alone or in combination with executable instructions stored in a memory or other computer-readable storage medium. Processors should be understood to encompass any of various types of computing structures including, but not limited to, one or more microprocessors, special-purpose computer chips, field-programmable gate arrays (FPGAs), controllers, application-specific integrated circuits (ASICs), hardware accelerators, artificial intelligence (AI) accelerators, digital signal processors, software defined radio components, combinations of hardware/firmware/software, or other special or general-purpose processing circuitry.
[80] As used in this application, the term “circuitry” may refer to any of the following: (a) hardware-only circuit implementations (such as implementations in only analog and/or digital circuitry), (b) combinations of circuits and software (and/or firmware), such as (as applicable): (i) a combination of processor(s) or (ii) portions of processor(s)/software (including digital signal processor(s)), software, and memory(ies) that work together to cause an apparatus, such as a mobile phone, server, or other computing device, to perform various function(s), and (c) circuits, such as a microprocessor(s) or a portion of a microprocessor(s), that require software or firmware for operation, even if the software or firmware is not physically present.
[81] These examples of “circuitry” apply to all uses of this term in this application, including in any claims. As an example, as used in this application, the term “circuitry” would also cover an implementation of merely a processor (or multiple processors) or portion of a processor and its (or their) accompanying software and/or firmware. The term “circuitry” would also cover, for example, a radio frequency circuit, a baseband integrated circuit or applications processor integrated circuit for a mobile phone or a similar integrated circuit in a server, a cellular network device, or other network device.
[82] Device 712 or its various components may be mobile and be configured to receive, decode and process various types of transmissions including transmissions in Wi-Fi networks according to a wireless local area network standard (e.g., the IEEE 802.11 WLAN standards 802.11n, 802.11ac, etc.), short range wireless communication networks (e.g., near-field communication (NFC)), and/or wireless metro area network (WMAN) standards (e.g., 802.16), through one or more WLAN transceivers 743 and/or one or more WMAN transceivers 741. Additionally or alternatively, device 712 may be configured to receive, decode and process transmissions through various other transceivers, such as FM/AM radio transceiver 742 and telecommunications transceiver 744 (e.g., a cellular network transceiver such as CDMA, GSM, 4G LTE, 5G, etc.). A wired interface 745 (e.g., an Ethernet interface, a DOCSIS interface) may be configured to provide communication via a wired communication medium (e.g., fiber, cable, twisted pair, or other conductors).
[83] Although the above description of FIG. 7 generally relates to a mobile device, other devices or systems may include the same or similar components and perform the same or similar functions and methods. For example, a computer communicating over a wired network connection may include the components or a subset of the components described above, and may be configured to perform the same or similar functions as device 712 and its components. Further, computing devices as described herein may include the components, a subset of the components, or a multiple of the components (e.g., integrated in one or more servers) configured to perform the steps described herein.
[84] Although specific examples of carrying out the disclosure have been described, those skilled in the art will appreciate that there are numerous variations and permutations of the above-described systems and methods that are contained within the spirit and scope of the disclosure. Any and all permutations, combinations, and sub-combinations of features described herein, including but not limited to features specifically recited in the claims, are within the scope of the disclosure.

Claims

What is claimed is:
1. A method comprising:
receiving, by a computing device comprising one or more neural networks, a symbol comprising one or more values;
inputting, to the one or more neural networks, the one or more values;
demapping, using the one or more neural networks and based on the one or more values, the symbol to generate one or more confidence scores corresponding to one or more bits indicated by the symbol; and
sending, to a decoder, the one or more confidence scores.
2. The method of claim 1, wherein the one or more confidence scores approximate one or more log-likelihood ratios that the one or more bits correspond to one or more ones or one or more zeros.
3. The method of any of claims 1-2, wherein each of the one or more neural networks corresponds to a bit of the one or more bits.
4. The method of any of claims 1-3, wherein each of the one or more neural networks comprises:
an input layer comprising one or more first nodes corresponding to the one or more values;
a hidden layer comprising one or more second nodes; and
an output layer comprising a third node corresponding to a confidence score of the one or more confidence scores.
5. The method of any of claims 1-3, wherein each of the one or more neural networks comprises:
an input layer comprising one or more first nodes corresponding to the one or more values;
a plurality of hidden layers comprising a plurality of second nodes; and
an output layer comprising a third node corresponding to a confidence score of the one or more confidence scores.
6. The method of any of claims 1-2, wherein the demapping the symbol comprises demapping, using a single neural network, the symbol, and wherein the single neural network comprises one or more output nodes corresponding to the one or more confidence scores.
7. The method of any of claims 1-6, wherein the symbol is associated with a modulation constellation used by the computing device.
8. The method of any of claims 1-7, wherein each of the one or more neural networks comprises a plurality of parameters, the method further comprising:
training, by the computing device, based on additional symbols, and by adjusting the plurality of parameters, each of the one or more neural networks, such that confidence scores, generated by the one or more neural networks, corresponding to the additional symbols approach log-likelihood ratios corresponding to the additional symbols.
9. The method of any of claims 1-8, further comprising:
based on detecting a change in network conditions, updating the one or more neural networks.
10. The method of any of claims 1-8, further comprising:
periodically updating, based on an updating frequency, the one or more neural networks.
11. The method of any of claims 1-10, wherein the one or more values comprise two values, and wherein the one or more bits indicated by the symbol comprise at least four bits.
12. An apparatus comprising:
one or more processors; and
memory storing instructions that, when executed by the one or more processors, cause the apparatus to:
receive a symbol comprising one or more values;
input, to one or more neural networks, the one or more values;
demap, using the one or more neural networks and based on the one or more values, the symbol to generate one or more confidence scores corresponding to one or more bits indicated by the symbol; and
send, to a decoder, the one or more confidence scores.
13. The apparatus of claim 12, wherein the one or more confidence scores approximate one or more log-likelihood ratios that the one or more bits correspond to one or more ones or one or more zeros.
14. The apparatus of any of claims 12-13, wherein each of the one or more neural networks comprises:
an input layer comprising one or more first nodes corresponding to the one or more values;
a hidden layer comprising one or more second nodes; and
an output layer comprising a third node corresponding to a confidence score of the one or more confidence scores.
15. The apparatus of any of claims 12-13, wherein each of the one or more neural networks comprises:
an input layer comprising one or more first nodes corresponding to the one or more values;
a plurality of hidden layers comprising a plurality of second nodes; and
an output layer comprising a third node corresponding to a confidence score of the one or more confidence scores.
16. The apparatus of any of claims 12-15, wherein each of the one or more neural networks comprises a plurality of parameters, and wherein the instructions, when executed by the one or more processors, further cause the apparatus to:
train, based on additional symbols and by adjusting the plurality of parameters, each of the one or more neural networks, such that confidence scores, generated by the one or more neural networks, corresponding to the additional symbols approach log-likelihood ratios corresponding to the additional symbols.
17. The apparatus of any of claims 12-16, wherein the instructions, when executed by the one or more processors, further cause the apparatus to:
based on detecting a change in network conditions, update the one or more neural networks.
18. A computer-readable medium storing instructions that, when executed by a computing device, cause the computing device to:
receive a symbol comprising one or more values;
input, to one or more neural networks, the one or more values;
demap, using the one or more neural networks and based on the one or more values, the symbol to generate one or more confidence scores corresponding to one or more bits indicated by the symbol; and
send, to a decoder, the one or more confidence scores.
19. The computer-readable medium of claim 18, wherein the one or more confidence scores approximate one or more log-likelihood ratios that the one or more bits correspond to one or more ones or one or more zeros.
20. The computer-readable medium of any of claims 18-19, wherein each of the one or more neural networks comprises:
an input layer comprising one or more first nodes corresponding to the one or more values;
a hidden layer comprising one or more second nodes; and
an output layer comprising a third node corresponding to a confidence score of the one or more confidence scores.
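For illustration only (this sketch is not part of the claims or description), the per-bit architecture of claims 3–4 and the training objective of claims 2 and 8 can be written out in code. The 16-QAM constellation, Gray labelling, network sizes, and the closed-form least-squares fit of the output layer (used here as a simple stand-in for gradient-based training) are all assumptions of this sketch, not fixed by the application:

```python
import numpy as np

# Illustrative Gray-labelled 16-QAM: each received symbol carries two values
# (I, Q) and indicates four bits, as in claim 11.  The constellation and
# labelling below are assumptions of this sketch.
LEVELS = [(-3.0, (0, 0)), (-1.0, (0, 1)), (1.0, (1, 1)), (3.0, (1, 0))]
CONSTELLATION = [(complex(i, q), ib + qb)
                 for i, ib in LEVELS for q, qb in LEVELS]

def exact_llrs(y, sigma2=1.0):
    """Exact per-bit LLRs log P(b=0|y)/P(b=1|y) under AWGN -- the quantity
    the claimed confidence scores approximate, per claim 2."""
    out = []
    for k in range(4):
        num = sum(np.exp(-abs(y - p) ** 2 / sigma2)
                  for p, b in CONSTELLATION if b[k] == 0)
        den = sum(np.exp(-abs(y - p) ** 2 / sigma2)
                  for p, b in CONSTELLATION if b[k] == 1)
        out.append(float(np.log(num / den)))
    return out

class PerBitDemapper:
    """One small network per bit, as in claims 3-4: an input layer with two
    first nodes (the I and Q values), one hidden layer of second nodes, and
    a single third (output) node holding that bit's confidence score."""

    def __init__(self, n_bits=4, hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        self.nets = [{
            "W1": rng.normal(0.0, 1.0, size=(hidden, 2)),  # input -> hidden
            "b1": rng.normal(0.0, 1.0, size=hidden),
            "w2": np.zeros(hidden),                        # hidden -> output
            "b2": 0.0,
        } for _ in range(n_bits)]

    def demap(self, y):
        """Forward pass: one confidence score (soft bit) per bit."""
        x = np.array([y.real, y.imag])
        scores = []
        for net in self.nets:
            h = np.tanh(net["W1"] @ x + net["b1"])
            scores.append(float(net["w2"] @ h + net["b2"]))
        return scores

    def fit(self, samples, sigma2=1.0):
        """Adjust each network's output-layer parameters so its scores
        approach the exact LLRs of the training symbols (the objective of
        claim 8).  A least-squares readout fit is used here instead of
        gradient descent, purely to keep the sketch short and deterministic."""
        X = np.array([[y.real, y.imag] for y in samples])
        T = np.array([exact_llrs(y, sigma2) for y in samples])
        for k, net in enumerate(self.nets):
            H = np.tanh(X @ net["W1"].T + net["b1"])       # hidden activations
            Hb = np.hstack([H, np.ones((len(samples), 1))])
            sol, *_ = np.linalg.lstsq(Hb, T[:, k], rcond=None)
            net["w2"], net["b2"] = sol[:-1], float(sol[-1])

# Train on noisy symbols, then demap a received symbol; the resulting
# soft bits would be sent to the decoder (claim 1).
rng = np.random.default_rng(1)
train = [p + rng.normal(0, 0.5) + 1j * rng.normal(0, 0.5)
         for p, _ in CONSTELLATION for _ in range(50)]
demapper = PerBitDemapper()
demapper.fit(train)
scores = demapper.demap(-3.1 - 2.9j)
```

The single-network variant of claim 6 differs only in topology: one shared hidden stack feeding four output nodes, one per bit, rather than four independent networks.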

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/US2018/068123 WO2020142059A1 (en) 2018-12-31 2018-12-31 Demapping based on machine learning

Publications (1)

Publication Number Publication Date
WO2020142059A1 true WO2020142059A1 (en) 2020-07-09

Family

ID=65139283




Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150249554A1 (en) * 2013-06-21 2015-09-03 Dhadesugoor Vaman Adaptive demodulation method and apparatus using an artificial neural network to improve data recovery in high speed channels
US20170126360A1 (en) * 2015-11-04 2017-05-04 Mitsubishi Electric Research Laboratories, Inc. Fast Log-Likelihood Ratio (LLR) Computation for Decoding High-Order and High-Dimensional Modulation Schemes


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114615111A (en) * 2020-12-03 2022-06-10 诺基亚技术有限公司 Demapping received data
US11617183B2 (en) 2020-12-03 2023-03-28 Nokia Technologies Oy Demapping received data


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 18837142; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 18837142; Country of ref document: EP; Kind code of ref document: A1)