WO2019240638A1 - Machine learning prediction of decoder performance - Google Patents
Machine learning prediction of decoder performance
- Publication number: WO2019240638A1 (PCT/SE2018/050625)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- codeword
- prediction
- statistics
- radio node
- model
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L1/00—Arrangements for detecting or preventing errors in the information received
- H04L1/004—Arrangements for detecting or preventing errors in the information received by using forward error control
- H04L1/0045—Arrangements at the receiver end
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L1/00—Arrangements for detecting or preventing errors in the information received
- H04L1/12—Arrangements for detecting or preventing errors in the information received by using return channel
- H04L1/16—Arrangements for detecting or preventing errors in the information received by using return channel in which the return channel carries supervisory signals, e.g. repetition request signals
- H04L1/18—Automatic repetition systems, e.g. Van Duuren systems
- H04L1/1829—Arrangements specially adapted for the receiver end
- H04L1/1854—Scheduling and prioritising arrangements
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M13/00—Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
- H03M13/37—Decoding methods or techniques, not specific to the particular type of coding provided for in groups H03M13/03 - H03M13/35
- H03M13/3738—Decoding methods or techniques, not specific to the particular type of coding provided for in groups H03M13/03 - H03M13/35 with judging correct decoding
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M13/00—Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
- H03M13/63—Joint error correction and other techniques
- H03M13/6306—Error control coding in combination with Automatic Repeat reQuest [ARQ] and diversity transmission, e.g. coding schemes for the multiple transmission of the same information or the transmission of incremental redundancy
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04B—TRANSMISSION
- H04B17/00—Monitoring; Testing
- H04B17/30—Monitoring; Testing of propagation channels
- H04B17/309—Measuring or estimating channel quality parameters
Definitions
- the present disclosure relates to predicting a performance of a Forward Error Correction (FEC) decoder of a radio node and utilizing the prediction to perform one or more Hybrid Automatic Repeat Request (HARQ) related tasks.
- FEC Forward Error Correction
- HARQ Hybrid Automatic Repeat Request
- Hybrid Automatic Repeat Request is used to strike a balance between error-correcting performance and channel utilization.
- HARQ is a combination of high-rate FEC coding and Automatic Repeat Request (ARQ).
- ARQ Automatic Repeat Request
- CRC Cyclic Redundancy Check
- HARQ processing including FEC places a limit on the latencies that can be achieved, e.g., in 3GPP LTE and NR.
- the FEC decoder uses an iterative decoding process and must complete its decoding iterations before the CRC check can be performed. If the CRC check fails, then a HARQ retransmission is requested.
- the time needed to complete FEC decoding places a limit on the latencies that can be achieved.
- a method in a radio node in a cellular communications system comprises obtaining a plurality of soft metrics for at least a portion of a codeword, determining one or more statistics based on the plurality of soft metrics, and making a prediction as to whether the codeword will be successfully decoded by a FEC decoder of the radio node using a ML model.
- the one or more statistics are provided as inputs to the ML model in order to make the prediction.
- the method further comprises performing a HARQ related task based on the prediction.
- the one or more statistics and one or more additional parameters are provided as inputs to the ML model in order to make the prediction, the one or more additional parameters comprising one or more modulation and coding parameters and/or one or more channel parameters.
- the one or more additional parameters comprise the one or more modulation and coding parameters
- the one or more modulation and coding parameters comprise a code rate used for the codeword and/or a modulation index of a modulation and coding scheme used for the codeword.
- the one or more additional parameters comprise the one or more channel parameters
- the one or more channel parameters comprise a Signal to Noise Ratio (SNR) of a wireless channel on which the codeword is received, carrier frequency of the wireless channel on which the codeword is received, fading characteristics of the wireless channel on which the codeword is received, and/or a speed of the radio node and/or a speed of a transmitter from which the codeword is received.
- SNR Signal to Noise Ratio
- the one or more statistics comprise a mean of the plurality of soft metrics, a variance of the plurality of soft metrics, a skewness of the plurality of soft metrics, a kurtosis of the plurality of soft metrics, and/or one or more central moments of the plurality of soft metrics.
- the plurality of soft metrics is a plurality of Log Likelihood Ratio (LLR) values.
- LLR Log Likelihood Ratio
- the prediction is that the codeword will not be successfully decoded by the FEC decoder of the radio node
- performing the HARQ related task comprises sending, before decoding of the codeword by the FEC decoder is complete, a HARQ retransmission request to a transmit node that transmitted the codeword.
- the prediction is that the codeword will not be successfully decoded by the FEC decoder of the radio node
- performing the HARQ related task comprises sending, before decoding of the codeword by the FEC decoder is complete, a Negative Acknowledgement (NACK) to a transmit node that transmitted the codeword.
- NACK Negative Acknowledgement
- the prediction is that the codeword will be successfully decoded by the FEC decoder of the radio node
- performing the HARQ related task comprises waiting until decoding of the codeword by the FEC decoder is complete before sending an Acknowledgement (ACK) or NACK to a transmit node that transmitted the codeword.
- ACK Acknowledgement
- the prediction is that the codeword will be successfully decoded by the FEC decoder of the radio node, and performing the HARQ related task comprises sending, before decoding of the codeword by the FEC decoder is complete, an ACK to a transmit node that transmitted the codeword.
- the FEC decoder is a Turbo decoder, a Low Density Parity Check (LDPC) decoder, or a Polar decoder.
- LDPC Low Density Parity Check
- the method further comprises training the ML model based on a plurality of prior codewords that were received and decoded by the radio node prior to receiving the codeword.
- training the ML model comprises, for each prior codeword of the plurality of prior codewords, obtaining a plurality of soft metrics for at least a portion of the prior codeword, determining one or more statistics based on the plurality of soft metrics obtained for the at least a portion of the prior codeword, storing the one or more statistics for the prior codeword, decoding the prior codeword to obtain a decoding result, and storing the decoding result for the prior codeword.
- Training the ML model further comprises training the ML model based on the stored statistics and the stored decoding results for the plurality of prior codewords.
- training the ML model based on the stored statistics and the stored decoding results for the plurality of prior codewords comprises training the ML model based on the stored statistics and the stored decoding results for the plurality of prior codewords and one or more modulation and coding parameters and/or one or more channel parameters for the plurality of prior codewords, respectively.
- the plurality of prior codewords comprise two or more sets of prior codewords
- training the ML model comprises: (a) for each prior codeword in a first set of prior codewords, resetting an accumulated gradient to zero, computing one or more statistics based on a plurality of soft metrics obtained for at least a portion of the prior codeword, decoding the prior codeword to obtain a decoding result, computing a prediction of a plurality of parameters that define the ML model given the one or more statistics, computing an error gradient using the prediction of the plurality of parameters of the ML model and the decoding result, and updating the accumulated gradient based on the computed error gradient; (b) repeating the steps of resetting the accumulated gradient, computing the one or more statistics, decoding the prior codeword, computing the prediction, computing the error gradient, and updating the accumulated gradient for each other prior codeword in the first set; (c) updating the plurality of parameters that define the ML model based on the accumulated gradient using a backward pass;
- the radio node is a base station. In some other embodiments, the radio node is a wireless device.
- Embodiments of a radio node are also disclosed. In some embodiments, a radio node for a cellular communications system is adapted to obtain a plurality of soft metrics for at least a portion of a codeword, determine one or more statistics based on the plurality of soft metrics, and make a prediction as to whether the codeword will be successfully decoded by a FEC decoder of the radio node using a ML model.
- the one or more statistics are provided as inputs to the ML model in order to make the prediction.
- the radio node is further adapted to perform a HARQ related task based on the prediction.
- the radio node is a base station. In some other embodiments, the radio node is a wireless device.
- a radio node for a cellular communications system comprises circuitry operable to obtain a plurality of soft metrics for at least a portion of a codeword, determine one or more statistics based on the plurality of soft metrics, and make a prediction as to whether the codeword will be successfully decoded by a FEC decoder of the radio node using a ML model.
- the one or more statistics are provided as inputs to the ML model in order to make the prediction.
- the circuitry is further operable to perform a HARQ related task based on the prediction.
- Figure 1 illustrates one example of a cellular communications network according to some embodiments of the present disclosure
- Figure 2 illustrates an example of a demodulation and decoding system of a receiver of a radio node such as, for example, a wireless device or a base station in the cellular communications network of Figure 1, that predicts Forward Error Correction (FEC) decoder performance and utilizes the prediction to perform one or more Hybrid Automatic Repeat Request (HARQ) related tasks in accordance with embodiments of the present disclosure;
- Figure 3 is a flow chart that illustrates a process for predicting FEC decoder performance and utilizing the prediction to perform one or more HARQ related tasks in accordance with embodiments of the present disclosure
- Figure 4 shows a histogram of Log Likelihood Ratio (LLR) values for 16-symbol constellation Quadrature Amplitude Modulation (16-QAM);
- Figure 5 shows a histogram of the absolute values of the LLR values for 16-QAM
- Figure 6 illustrates a procedure for training a Machine-Learning (ML) model in accordance with some embodiments of the present disclosure
- Figure 7 illustrates storage for a fixed batch method
- Figure 8 illustrates storage and pointer operations for a circular buffer implementation of a sliding window batch processing method
- Figure 9 illustrates computation and storage requirements for an accumulated gradient method
- Figure 10 illustrates a process for training the ML model using the accumulated gradient scheme in accordance with some embodiments of the present disclosure
- Figure 11 is a flow chart that illustrates the use of the trained ML model to predict the performance of the FEC decoder during operation in accordance with some embodiments of the present disclosure
- Figure 12 is a schematic block diagram of a radio access node according to some embodiments of the present disclosure.
- Figure 13 is a schematic block diagram that illustrates a virtualized embodiment of the radio access node of Figure 12 according to some embodiments of the present disclosure;
- Figure 14 is a schematic block diagram of the radio access node of Figure 12 according to some other embodiments of the present disclosure.
- Figure 15 is a schematic block diagram of a User Equipment device (UE) according to some embodiments of the present disclosure
- Figure 16 is a schematic block diagram of the UE of Figure 15 according to some other embodiments of the present disclosure.
- Figure 17 illustrates a telecommunication network connected via an intermediate network to a host computer in accordance with some embodiments of the present disclosure
- Figure 18 is a generalized block diagram of a host computer communicating via a base station with a UE over a partially wireless connection in accordance with some embodiments of the present disclosure
- Figure 19 is a flowchart illustrating a method implemented in a communication system in accordance with one embodiment of the present disclosure
- Figure 20 is a flowchart illustrating a method implemented in a communication system in accordance with one embodiment of the present disclosure
- Figure 21 is a flowchart illustrating a method implemented in a communication system in accordance with one embodiment of the present disclosure.
- Figure 22 is a flowchart illustrating a method implemented in a communication system in accordance with one embodiment of the present disclosure.
- Radio Node As used herein, a "radio node" is either a radio access node or a wireless device.
- Radio Access Node As used herein, a "radio access node" or "radio network node" is any node in a radio access network of a cellular communications network that operates to wirelessly transmit and/or receive signals.
- Some examples of a radio access node include, but are not limited to, a base station (e.g., a New Radio (NR) base station (gNB) in a Third Generation Partnership Project (3GPP) Fifth Generation (5G) NR network or an enhanced or evolved Node B (eNB) in a 3GPP Long Term Evolution (LTE) network), a high-power or macro base station, a low-power base station (e.g., a micro base station, a pico base station, a home eNB, or the like), and a relay node.
- As used herein, a "core network node" is any type of node in a core network.
- Some examples of a core network node include, e.g., a Mobility Management Entity (MME), a Packet Data Network Gateway (P- GW), a Service Capability Exposure Function (SCEF), or the like.
- MME Mobility Management Entity
- P- GW Packet Data Network Gateway
- SCEF Service Capability Exposure Function
- As used herein, a "wireless device" is any type of device that has access to (i.e., is served by) a cellular communications network by wirelessly transmitting and/or receiving signals to a radio access node(s).
- Some examples of a wireless device include, but are not limited to, a User Equipment device (UE) in a 3GPP network and a Machine Type Communication (MTC) device.
- UE User Equipment device
- MTC Machine Type Communication
- Network Node As used herein, a "network node" is any node that is either part of the radio access network or the core network of a cellular communications network/system.
- ML Machine-Learning
- a HARQ retransmission may be preemptively requested, or a HARQ Negative Acknowledgement (NACK) may be sent, before FEC decoding is complete.
- the receiver may wait for FEC decoding to complete before sending a HARQ Acknowledgement (ACK) or NACK as appropriate, or may preemptively send a HARQ ACK before FEC decoding is complete. In this manner, latency can be improved.
- Additional parameters may be used as inputs to the ML model.
- additional parameters may include, for example, one or more modulation and coding parameters (e.g., Modulation and Coding Scheme (MCS) index, or any other modulation and coding parameter(s) that affect the shape of the soft metric distributions) and/or one or more channel parameters (e.g., Signal to Noise Ratio (SNR), carrier frequency, fading characteristics, speed of the terminal, etc.).
- MCS Modulation and Coding Scheme
- Embodiments for training the ML model are also disclosed.
- embodiments of the present disclosure provide numerous advantages
- one example advantage is that embodiments of the present disclosure reduce the amount of time between reception of a codeword and HARQ ACK/NACK transmission, which in turn reduces latency on the link level of the cellular communications system. This reduction in latency would especially benefit Ultra Reliable Low Latency Communication (URLLC).
- in 3GPP NR (i.e., 5G), Low Density Parity Check (LDPC) codes are, at least currently, not as optimized as the Turbo codes used in 3GPP LTE.
- the benefit of implementing embodiments of the present disclosure in a 3GPP NR network may therefore be substantial.
- Figure 1 illustrates one example of a cellular communications network 100 according to some embodiments of the present disclosure.
- the cellular communications network 100 is a 5G NR network.
- the cellular communications network 100 includes base stations 102-1 and 102-2, which in LTE are referred to as eNBs and in 5G NR are referred to as gNBs, controlling corresponding macro cells 104-1 and 104-2.
- the base stations 102-1 and 102-2 are generally referred to herein collectively as base stations 102 and individually as base station 102.
- the macro cells 104-1 and 104-2 are generally referred to herein collectively as macro cells 104 and individually as macro cell 104.
- the cellular communications network 100 may also include a number of low power nodes 106-1 through 106-4 controlling corresponding small cells 108-1 through 108-4.
- the low power nodes 106-1 through 106-4 can be small base stations (such as pico or femto base stations) or Remote Radio Heads (RRHs), or the like.
- RRHs Remote Radio Heads
- one or more of the small cells 108-1 through 108-4 may alternatively be provided by the base stations 102.
- the low power nodes 106-1 through 106-4 are generally referred to herein collectively as low power nodes 106 and individually as low power node 106.
- the small cells 108-1 through 108-4 are generally referred to herein collectively as small cells 108 and individually as small cell 108.
- the base stations 102 (and optionally the low power nodes 106) are connected to a core network 110.
- the base stations 102 and the low power nodes 106 provide service to wireless devices 112-1 through 112-5 in the corresponding cells 104 and 108.
- the wireless devices 112-1 through 112-5 are generally referred to herein collectively as wireless devices 112 and individually as wireless device 112.
- the wireless devices 112 are also sometimes referred to herein as UEs.
- Figure 2 illustrates one example of a demodulation and decoding system 200 that predicts FEC decoder performance using a ML model and utilizes the predicted FEC decoder performance to perform one or more HARQ related tasks in accordance with some embodiments of the present disclosure.
- the demodulation and decoding system 200 is implemented in a receiver chain of a radio node such as, e.g., a base station 102 or a wireless device 112. Note that optional blocks are indicated with dashed lines.
- the demodulation and decoding system 200 includes a demodulator 202, a soft metric computation function 204, a FEC decoder 206, a Cyclic Redundancy Check (CRC) check function 207, a statistics calculator 208, a ML predictor 210, and optionally a ML model training function 212.
- the various components of the demodulation and decoding system 200 (i.e., the demodulator 202, the soft metric computation function 204, the FEC decoder 206, the statistics calculator 208, the ML predictor 210, and optionally the ML model training function 212) are implemented in hardware (e.g., an Application Specific Integrated Circuit(s) (ASIC(s))) or a combination of hardware and software.
- ASIC Application Specific Integrated Circuit
- the demodulator 202 demodulates a received signal and outputs a complex-valued symbol, which is an estimate of the transmitted constellation symbol.
- the soft metric computation function 204 computes a soft metric value (e.g., a Log Likelihood Ratio (LLR) value) for each bit of the received symbol.
- LLR Log Likelihood Ratio
- the soft metrics are then fed into the FEC decoder 206.
- the FEC decoder 206 can be any type of FEC decoder. Some example FEC decoder types are a Turbo decoder, an LDPC decoder, and a Polar decoder. However, other classical types of FEC and corresponding decoders may be used. Some examples of classical types of FEC include, but are not limited to, convolutional codes and a range of block codes, e.g., Reed-Solomon, Bose-Chaudhuri-Hocquenghem (BCH), and Hamming codes.
- BCH Bose-Chaudhuri-Hocquenghem
- the bits of the received symbol are fed to the CRC check function 207 and then onwards in the receiver chain of the radio node, as will be appreciated by one of skill in the art.
- Modern FECs use iterative decoders which run for a number of iterations; the more iterations, the longer the delay.
- the statistics calculator 208 computes one or more statistics of the soft metrics (e.g., the LLR values) output by the soft metric computation function 204.
- the statistics include, e.g., mean value, variance, skewness, and/or kurtosis.
- the skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable about its mean.
- the skewness value can be positive or negative, or undefined.
- Kurtosis is a measure of the "tailedness" of the probability distribution of a real-valued random variable.
- the kurtosis of any univariate normal distribution is 3. Excess kurtosis is the kurtosis minus 3, i.e., the deviation from a normal distribution.
- the computed statistics of the soft metrics are used by the ML predictor 210 as inputs to a ML model to predict the performance of the FEC decoder 206.
- the computed statistics of the soft metrics are computed and stored while receiving each of a number of codewords.
- the stored statistics are used by the ML model training function 212 to train a ML model for predicting the performance of the FEC decoder 206.
- the soft metrics computed by the soft metric computation function 204 are used by the ML predictor 210 as inputs to the trained ML model to predict whether the FEC decoder 206 will successfully decode the codeword.
- the ML predictor 210 then performs or initiates performance of one or more HARQ related tasks based on the prediction.
- Figure 3 is a flow chart that illustrates a process (e.g., performed by the demodulation and decoding system 200 of the receive chain of a radio node) to predict the FEC decoder performance and utilize the prediction to perform one or more HARQ related tasks in accordance with some embodiments of the present disclosure.
- the radio node (e.g., via the soft metric computation function 204) obtains soft metrics for at least a portion (i.e., a received portion) of a codeword (step 300).
- the codeword is a codeword that is in the process of being received by the radio node.
- the soft metrics are obtained as the codeword is being received.
- the radio node determines one or more statistics for the obtained soft metrics (step 302).
- the radio node e.g., the ML predictor 210) makes a prediction as to whether the codeword will be successfully decoded by the FEC decoder 206 using a ML model (step 304).
- the one or more statistics computed for the obtained soft metrics are used as inputs to the ML model.
- one or more additional parameters may be used as additional inputs to the ML model.
- These one or more additional parameters may include, for example, one or more modulation and coding parameters (e.g., MCS index, or any other modulation and coding parameter(s) that affect the shape of the soft metric distributions) and/or one or more channel parameters (e.g., SNR, carrier frequency, fading characteristics, speed of the terminal, etc.).
- the radio node performs one or more HARQ related tasks based on the prediction (step 306). These HARQ related task(s) may be initiated by the ML predictor 210 or performed by some other component of the radio node based on the output of the ML predictor 210, which may be a value that indicates the prediction (i.e., a value that indicates successful decoding or a value that indicates unsuccessful decoding). For example, if the prediction is that the FEC decoder 206 will not successfully decode the codeword, the radio node may preemptively transmit a HARQ NACK (i.e., transmit a HARQ NACK before FEC decoding of the codeword is complete) or preemptively request a HARQ retransmission.
- conversely, if the prediction is that the FEC decoder 206 will successfully decode the codeword, the radio node may wait until FEC decoding of the codeword is complete to transmit a HARQ ACK or NACK depending on whether decoding is actually successful, or preemptively transmit a HARQ ACK (i.e., transmit a HARQ ACK before FEC decoding of the codeword is complete).
- the prediction is based on a ML model and the soft metrics are LLR values.
- Figure 4 shows a histogram of LLR values for 16-QAM.
- the histogram is an approximation of the LLR distribution.
- the histogram is made up of a number of super-positioned Gaussian-like distributions whose width depends on the channel SNR. When all constellation symbols are used equiprobably, the bits in the labels will be zero or one with equal probability. The mean and skewness values of the overall LLR distribution will thus be zero or close to zero regardless of the channel SNR.
- the mean, variance, skewness, and kurtosis of the LLR values for a codeword can be computed successively, and thus it is not necessary to wait for the entire codeword to be received before a prediction of the FEC decoder performance can be made.
- the statistics can be computed through the central moments as shown below. Note that, as will be understood by one of skill in the art of probability theory and statistics, a central moment is a moment of a probability distribution of a random variable about the random variable's mean. In other words, a central moment is the expected value of a specified integer power of the deviation of the random variable from the mean.
- the mean, variance, skewness, and kurtosis can be computed from the central moments as follows:
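The formulas themselves did not survive extraction. As a reconstruction using the standard central-moment relations (an assumption, but consistent with the moments m, M2, M3, and M4 used in the training procedure below, where n is the number of soft metric samples processed so far, m is the running mean, and M2, M3, and M4 are the running sums of the squared, cubed, and fourth-power deviations from the mean):

    mean     = m
    variance = M2 / (n - 1)
    skewness = sqrt(n) * M3 / M2^(3/2)
    kurtosis = n * M4 / M2^2        (subtract 3 for excess kurtosis)

The running-mean update referenced later as Equation (1) is, in the same standard form:

    m' = m + (x_n - m) / n        (1)

where x_n is the n-th sample; applying the same update to error gradients yields the accumulated (running mean) gradient used in the training method described further below.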
- Additional or alternative statistics of the LLR values for at least a portion of the codeword may be used.
- the statistics used may vary depending on the particular ML model used.
- the mean, variance, skewness, and kurtosis of the LLR values are computed in the example above and used as inputs for the ML model, the present disclosure is not limited thereto. Any one or any combination of two or more of the mean, variance, skewness, and kurtosis of the LLR values may be computed and used for the ML model, depending on the particular implementation.
- the statistical values computed for the LLR values and, optionally, additional parameters such as modulation and coding parameters and/or channel parameters are then fed into the ML model.
- the output of the FEC decoder 206 is used as the training target. This training can either be done before deployment or during normal operations.
- the output of the ML predictor 210 can be used to perform one or more HARQ related tasks such as, e.g., initiate a HARQ retransmission earlier when it is predicted that the FEC decoder will fail.
- Figure 6 illustrates a procedure for training the ML model in accordance with some embodiments of the present disclosure.
- a codeword is received by the radio node and processed by the demodulator 202 and the soft metric computation function 204 (step 600).
- soft metrics, which are LLR values in this example, are output by the soft metric computation function 204.
- the statistics are reset or initialized to some initial value (step 602).
- the statistics calculator 208 collects, or obtains, a LLR value and updates the statistics based on the LLR value, as described above (steps 604 and 606).
- the statistics calculator 208 uses the LLR value (referred to as the LLR value n in this context) to update the moments m, M2, M3, and M4 and updates the mean, variance, skewness, and kurtosis values based on the updated moments m, M2, M3, and M4.
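- As an illustrative sketch (not part of the patent text), the well-known one-pass update of these moments, applied per incoming LLR value, can be written in Python as follows; the class and method names are illustrative, and the update formulas are the standard online central-moment recurrences:

```python
import math

class RunningStatistics:
    """One-pass mean/variance/skewness/kurtosis over incoming soft
    metrics (e.g., LLR values), using the standard online updates of
    the central-moment sums m, M2, M3, and M4."""

    def __init__(self):
        self.n = 0      # number of samples processed so far
        self.m = 0.0    # running mean
        self.M2 = 0.0   # running sum of squared deviations
        self.M3 = 0.0   # running sum of cubed deviations
        self.M4 = 0.0   # running sum of fourth-power deviations

    def update(self, x: float) -> None:
        n1 = self.n
        self.n += 1
        delta = x - self.m
        delta_n = delta / self.n
        delta_n2 = delta_n * delta_n
        term1 = delta * delta_n * n1
        self.m += delta_n  # Equation (1): m' = m + (x_n - m) / n
        # M4 and M3 must be updated before M2 (they use the old M2/M3).
        self.M4 += (term1 * delta_n2 * (self.n * self.n - 3 * self.n + 3)
                    + 6 * delta_n2 * self.M2 - 4 * delta_n * self.M3)
        self.M3 += term1 * delta_n * (self.n - 2) - 3 * delta_n * self.M2
        self.M2 += term1

    def statistics(self) -> tuple:
        """Return (mean, variance, skewness, kurtosis); assumes n >= 2
        and M2 > 0."""
        variance = self.M2 / (self.n - 1)
        skewness = math.sqrt(self.n) * self.M3 / self.M2 ** 1.5
        kurtosis = self.n * self.M4 / (self.M2 * self.M2)  # minus 3 gives excess
        return self.m, variance, skewness, kurtosis
```

- Each update is O(1), so the four statistics are available as soon as the threshold number of LLR samples has been collected, without revisiting earlier samples.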
- the statistics calculator 208 determines whether the number of LLR values collected for the codeword has reached a predefined or preconfigured threshold number of samples (step 608).
- the threshold number of LLR samples is less than a total number of LLR values to be output by the soft metric computation function 204 for the codeword.
- the threshold number of samples may be 64 or 256. More generally, the threshold number of samples can be any number up to the maximum length of the codeword. Since the statistics are computed sequentially, the statistics become more accurate as the number of samples collected before the prediction increases. However, making the prediction more quickly increases the amount of latency reduction that can be achieved. Hence there is a trade-off between low latency and better statistics.
- the codewords can range from 40 to 6144 bits, so in some cases there are only 40 LLR values. In the case of 40 LLR values, in some embodiments, all 40 LLR values may be collected before making the prediction, and hence the latency reduction would "only" be in the time saved by not starting the FEC decoder.
- the complete set of LLR values for the codeword is run through the FEC decoder 206 (step 612).
- the FEC decoder 206 attempts to decode the codeword using the LLR values output by the soft metric computation function 204.
- the CRC is checked to determine whether decoding was successful or not (step 613).
- the result of the decoding of the codeword (i.e., a success or failure) is stored by the ML model training function 212.
- in some embodiments, one or more modulation and coding parameters (e.g., code rate, MCS index, or any other modulation and coding parameter(s) that affect the shape of the soft metric distributions) and/or one or more channel parameters (e.g., SNR, carrier frequency, fading characteristics, speed of the terminal, etc.) are also stored for the codeword.
- the ML model training function 212 determines whether a desired number of exemplars for training the ML model have been obtained (step 616). More specifically, in some embodiments, an exemplar is the decoding result, the LLR statistics, and optionally one or more modulation and coding parameters and/or one or more channel parameters for a codeword. In some other embodiments, exemplars are stored only for those codewords for which FEC decoding was successful. The number of exemplars to collect depends on the ML model and training method used. The desired number of exemplars may, e.g., be predefined or preconfigured. If the desired number of exemplars has not been obtained, the process returns to step 600 and is repeated for a next codeword.
- the ML model training function 212 trains the ML model using the obtained exemplars (step 618).
- the FEC decoding result is the target for the training.
- Any suitable ML training technique may be used.
- the ML model training function 212 determines whether additional training is needed (step 620). If not, the process is done. However, if additional training is needed, the process returns to step 600 and is repeated. This process continues until a stopping criterion is met (e.g., error gradient within a certain limit, weight updates sufficiently small, maximum number reached). Note that the ML model training may be performed partially or completely online (i.e., during the reception of actual data). The ML model is now ready for use.
- Figure 6 is one example process for training the ML model. Now, the discussion turns to another example process for training the ML model. As discussed above, the example training process described below reduces the amount of data that must be stored. However, before describing this training process, some background information regarding training of a ML model is beneficial.
- the term Qi(w) is typically associated with the i-th training example in the data set.
- the parameters w can be a vector (e.g., linear regressor), one or more matrices (e.g., neural networks), or set splitting thresholds (e.g., tree-based classifiers).
- Qi(w) is computed as a difference between the prediction produced by the ML model (e.g., neural network, tree, linear/logistic regressor) based on the present input features and the known desired output for each training example.
- the objective function varies depending on the problem. Common objective functions include mean square error and absolute error typically used in regression problems, and variations of cross-entropy for classification problems.
- the training of the ML model is done by updating the parameters w with respect to the objective function Q(w).
- a common method is gradient descent
- VQi(w) is the gradient of the i-th training example with respect to the parameters w.
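The equations referenced here did not survive extraction; in standard form (a reconstruction, with eta denoting the learning rate or step size):

    Q(w) = (1/n) * sum_{i=1..n} Q_i(w)
    w <- w - eta * grad Q(w) = w - (eta/n) * sum_{i=1..n} grad Q_i(w)

That is, the objective is the average per-example loss, and gradient descent moves the parameters w in the negative direction of its gradient.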
- the training iterates over a forward pass (also referred to as "forward propagation" or a "forward propagation pass") and a backward pass (also referred to as "backward propagation" or a "backward propagation pass").
- the algorithm is thus commonly referred to as "back-propagation" or "back-prop."
- data at the inputs of the neural network is propagated forward: for each layer, the inputs from the previous layer are multiplied by weights, summed, and passed through a non-linear function, commonly a sigmoid function, hyperbolic tangent, or rectified linear unit.
- the neural network produces an output, an estimate of what the target value should be given the current inputs.
- the difference/error is used to update the weights of the neural network. Each weight is updated depending on how much it contributed to the error. To do this, the error gradient/derivative is calculated with respect to the weight under consideration.
- the available set of exemplars is usually divided into a training set (70% - 85% of the exemplars) used for training (i.e., updating the parameters w), a validation set (~15%, if used) used to choose between ML model meta parameters and/or stopping of the training, and a test set (15% - 30%) used to assess the performance of the trained ML model.
- Training continues until some training goal has been achieved, e.g., sufficiently small updates of the parameters w, a sufficiently small gradient (equivalent to the first criterion up to a scaling factor), a maximum number of training iterations having been performed, or the performance on the validation set starting to decrease (indicating a risk of over-training).
- Over-training occurs when the model parameters w fit the training examples "too well" at the cost of bad generalization, i.e., at the cost of poor ability to predict the correct output for hitherto unseen inputs.
- Training methods can be categorized depending on how the training data, the set of training examples, are used to compute the gradient.
- Mini-batch gradient descent is a compromise where a random subset of the training set is used to compute the gradient and update the parameters w.
- the size of the mini-batch may be on the order of a few hundred to a few thousand.
- the mini-batch method injects enough noise to each gradient update to avoid local minima or saddle points, while achieving a relative speedy convergence.
- herein, SGD (Stochastic Gradient Descent) is used to mean mini-batch gradient descent.
- mini-batch corresponds to a smaller set, a few hundred to a few thousand training exemplars.
- the size is such that, by adding more exemplars to the mini-batch, the gradient may change in a non-negligible way.
- each training example consists of input features (LLR statistics, code and modulation parameters, and channel parameters) and the target value (i.e., the FEC decoding result).
- after collecting N samples, compute the gradient and update the parameters w. Since all the exemplars are stored, a new prediction and error gradient can be computed for each exemplar. Thus, the data can be utilized for multiple updates of the parameters w. When the desired number of parameter updates has been done, the matrix is emptied, and N new samples are collected. This is repeated until some training goal has been achieved. Note that this is somewhat different from a pure mini-batch method, since the standard mini-batch method picks a random mini-batch from a large fixed training set, whereas this method always gets new samples.
- Figure 7 illustrates storage for the fixed batch method.
- Figure 7 illustrates an example of a matrix storing N training exemplars.
- Each training exemplar consists of n input features and m output targets.
- m = 1 since we only use CRC success; however, m can, in general, be any number.
- the total storage is N x (n + m). This is the storage required for "fixed batch" and "sliding window batch."
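- For example (illustrative numbers only): with N = 1,000 exemplars, n = 4 LLR statistics (mean, variance, skewness, and kurtosis) as input features, and m = 1 CRC result as the output target, the batch matrix holds 1,000 x 5 = 5,000 values.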
- the first exemplar will only be used once, which is not an issue since in an operating system there will be an ample amount of training data.
- the first gradients will be computed on very few exemplars and will therefore be noisy. As the matrix fills up, the gradients will quickly become less noisy.
- the sliding window can be implemented as a circular buffer where the new exemplar overwrites the oldest to avoid unnecessary data read/write.
- Figure 8 illustrates storage and pointer operations for the circular buffer implementation of the sliding window batch processing.
- the new exemplar is written in the matrix at the address indicated by the write pointer.
- the pointer is incremented one step.
- when the pointer reaches the end of the matrix, it is reset to 1.
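- A minimal sketch of this circular buffer, assuming NumPy and hypothetical feature/target dimensions (zero-based indexing is used instead of the reset-to-1 described above):

```python
import numpy as np

class SlidingWindowBatch:
    """Circular buffer holding the N most recent training exemplars,
    each a row of n input features followed by m output targets."""

    def __init__(self, N: int, n_features: int, m_targets: int):
        self.rows = np.zeros((N, n_features + m_targets))
        self.write_ptr = 0  # next row to overwrite (oldest exemplar)
        self.count = 0      # rows filled so far (saturates at N)

    def add(self, features: np.ndarray, targets: np.ndarray) -> None:
        # The new exemplar overwrites the oldest one at the write pointer;
        # the pointer is then incremented, wrapping at the end of the matrix.
        self.rows[self.write_ptr] = np.concatenate([features, targets])
        self.write_ptr = (self.write_ptr + 1) % len(self.rows)
        self.count = min(self.count + 1, len(self.rows))

    def batch(self) -> np.ndarray:
        # Filled rows only; row order is irrelevant for gradient computation.
        return self.rows[:self.count]
```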
- in both batch methods above, N training examples must be stored in a matrix. If N and/or the number of features is large, the matrix requires substantial memory.
- a new ML model training method is described below that avoids storing the training examples.
- This training method is referred to herein as an "accumulated gradient" ML model training method, which uses an accumulated gradient approach to compute mini-batch gradients.
- Figure 9 illustrates computation and storage requirements for the accumulated gradient method described below with respect to Figure 10.
- the storage requirement is 1xn, which is substantially less than either the fixed batch or sliding window batch processes described above.
- n is the number of input features, where the input features are denoted as "i1" through "in" in Figure 9
- m is the number of output targets, where the output targets are denoted as "o1" through "om" in Figure 9.
- the computed gradients are denoted as "g1" through "gm".
- the storage requirement is 1xm, since the output data does not need to be stored.
- the n input features are passed through the ML model (e.g., neural network) and a prediction is generated (NN forward pass).
- the NN output is compared to the target value(s) o1-om, and the error gradient is computed.
- the gradient for the single training example is then added to the accumulated gradient mean.
- the reason that the topmost arrow is bidirectional is that the running, or accumulated, mean is used in the computation when a new training sample is added.
- Figure 10 illustrates a process for training the ML model using the accumulated gradient scheme in accordance with some embodiments of the present disclosure. This process is described with respect to the demodulation and decoding system 200 of Figure 2; however, the present disclosure is not limited thereto.
- the ML model training function 212 resets, or initializes, the accumulated gradient to zero (step 1000).
- a codeword is received by the radio node and processed by the demodulator 202 and the soft metric computation function 204 (step 1002).
- soft metrics, which are LLR values in this example, are output by the soft metric computation function 204.
- the statistics calculator 208 computes statistics for the LLR values of the codeword, as described above (step 1004).
- the statistics calculator 208 collects, or obtains, the LLR values and updates the statistics based on the LLR values, as described above. For example, using the process for updating the moments m, M2, M3, and M4 described above, the statistics calculator 208 uses each LLR value (referred to as the LLR value n in this context) to update the moments m, M2, M3, and M4 and updates the mean, variance, skewness, and kurtosis values based on the updated moments. Once the desired number of LLR values have been received and used to update the statistics, the statistics for the LLR values of the codeword are stored.
- the FEC decoder 206 attempts to decode the codeword using the complete set of LLR values for the codeword (step 1006).
- a CRC check is performed to determine whether decoding was successful or not (i.e., to determine the result of the FEC decoding).
- the result of the decoding of the codeword (i.e., a success or failure) is stored by the ML model training function 212.
- in some embodiments, one or more modulation and coding parameters (e.g., code rate, MCS index, or any other modulation and coding parameter(s) that affect the shape of the soft metric distributions) and/or one or more channel parameters (e.g., SNR, carrier frequency, fading characteristics, speed of the terminal, etc.) are also stored for the codeword.
- the ML model training function 212 computes a prediction (i.e., a forward pass for the training) for the parameters w of the ML model given the features (i.e., the inputs of the ML model, e.g., the LLR statistics, modulation and coding parameters, and channel parameters) corresponding to the codeword, as described above (step 1008).
- the ML model training function 212 then computes an error gradient using the prediction of the parameters w of the ML model, the target data (i.e., the result of the decoding of the codeword), and the formula (i.e., Equation (1)) given above for the successive computation of the mean (step 1010).
- in Equation (1), it is shown how to compute the average gradient given a number of samples.
- the formula in Equation (1 ) can also be used for computing running means m’ such that the training exemplars do not need to be stored.
- the cost function (a.k.a. objective function, error function) computes the difference between the prediction made by the ML model and the true target value.
- the prediction is a function of the current parameters of the ML model.
- the error gradient is computed by taking the derivative of the cost function with respect to the parameters, which are often referred to as "weights." Then, if we update the weights in the negative direction of this gradient, the error will diminish.
- the error gradient is computed for each training example.
- the ML model training function 212 then updates the accumulated gradient using the computed error gradient (step 1012). For example, the running mean can be computed.
- the ML model training function 212 determines whether a desired number of codewords have been received and processed for training the ML model (step 1014).
- the desired number of codewords is a number of codewords that corresponds to the size of a mini-batch. If not, the process returns to step 1002 and is repeated for the next codeword.
- the ML model training function 212 updates the parameters w of the ML model with a backward pass (step 1016).
- the details of the backward pass (also referred to as "back prop" or "backward propagation") are well known to those of ordinary skill in the art and, as such, are not repeated herein.
- the backward pass updates the weight of each layer in the ML model based on the error gradient for that specific layer, where the gradient is computed as the derivative of the cost function with respect to the weights for the layer under consideration. This makes use of the chain-rule for derivatives.
- the process then returns to step 1000 and is repeated, e.g., until some stopping criterion is met.
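- As a sketch of the loop of Figure 10, using a logistic regressor as a stand-in ML model (the disclosure leaves the model type open) and a hypothetical `receive_exemplar()` callback that yields the input features and the FEC/CRC result for one codeword:

```python
import numpy as np

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + np.exp(-z))

def train_accumulated_gradient(receive_exemplar, n_features: int,
                               batch_size: int = 256,
                               learning_rate: float = 0.01,
                               n_batches: int = 1000) -> np.ndarray:
    """Accumulated-gradient training: only the running mean of the
    per-exemplar error gradients is stored, never the exemplars.
    receive_exemplar() is a hypothetical callback returning (x, y),
    where x is an array of the LLR statistics plus any modulation/coding
    and channel parameters, and y is 1.0 if the CRC check passed, else 0.0."""
    w = np.zeros(n_features)
    for _ in range(n_batches):                 # until a stopping criterion is met
        grad_mean = np.zeros(n_features)       # step 1000: reset accumulated gradient
        for k in range(1, batch_size + 1):
            x, y = receive_exemplar()          # steps 1002-1006: statistics + decoding result
            p = sigmoid(w @ x)                 # step 1008: forward pass (prediction)
            g = (p - y) * x                    # step 1010: cross-entropy error gradient
            grad_mean += (g - grad_mean) / k   # step 1012: running mean per Equation (1)
        w -= learning_rate * grad_mean         # step 1016: backward pass / parameter update
    return w
```

- Between exemplars, only the 1 x n gradient vector (and the current parameters w) persists, matching the storage analysis of Figure 9.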
- One advantage of the accumulated gradient method is that the radio node only has to store the accumulated gradient.
- any suitable stopping criterion may be used for ML training.
- the stopping criterion is a maximum number of training epochs or when the training error falls below some threshold.
- validation may be used to determine a prediction error rate, where training stops based on the prediction error rate.
- the ML model is used to make predictions for a number of codewords, and the predictions are compared to the decoded codewords to determine a prediction error rate. The last one (or more) prediction error rates are stored. When the prediction error rate starts to increase, training is stopped, since an increase in the prediction error rate indicates that over-training of the ML model is starting. In some embodiments, the ML model weights from the last prediction error rate computation are restored when the prediction error rate starts to increase.
- Figure 11 is a flow chart that illustrates the use of the trained ML model to predict the performance of the FEC decoder 206 during operation in accordance with some embodiments of the present disclosure.
- the process of Figure 11 can be viewed as one example of an implementation of the process of Figure 3.
- the LLR statistics along with, in this example, the modulation and coding parameter(s) and the channel parameter(s) are fed into the ML model to make a prediction of the result of FEC decoding of the codeword. This prediction can be done even before the full codeword has been received (i.e., before all LLR values for the codeword are computed). Then, based on the prediction, one or more HARQ related tasks can be performed even before the FEC decoding begins.
- a codeword is received by the radio node and processed by the demodulator 202 and the soft metric computation function 204.
- soft metrics, which are LLR values in this example, are output by the soft metric computation function 204.
- the statistics are reset or initialized to some initial value (step 1100).
- the statistics calculator 208 collects, or obtains, a LLR value and updates the statistics based on the LLR value, as described above (steps 1102 and 1104).
- the statistics calculator 208 uses the LLR value (referred to as the LLR value n in this context) to update the moments m, M2, M3, and M4 and updates the mean, variance, skewness, and kurtosis values based on the updated moments m, M2, M3, and M4.
- the statistics calculator 208 determines whether the number of LLR values collected for the codeword has reached a predefined or preconfigured threshold number of samples (step 1106).
- the threshold number of LLR samples is less than a total number of LLR values to be output by the soft metric computation function 204 for the codeword. If the desired number of LLR samples has not yet been reached, the process returns to step 1102 and steps 1102 and 1104 are repeated for the next LLR value. Once the desired number of LLR samples have been processed, the statistics are stored.
- the ML predictor 210 inputs the computed LLR statistics for the codeword together with the modulation and coding scheme parameter(s) and the channel parameter(s) into the trained ML model to make a prediction as to whether the FEC decoder 206 will successfully decode the codeword (step 1108). Importantly, this prediction is made before FEC decoding is complete and may be made even before all LLR values for the codeword are computed and thus before FEC decoding begins.
- If the prediction is that FEC decoding will be successful (step 1110, YES), the radio node either sends a preemptive HARQ ACK or does nothing (i.e., waits until FEC decoding is complete to send either a HARQ ACK or HARQ NACK depending on the actual result of FEC decoding) (step 1112). If a HARQ ACK is preemptively sent, this HARQ ACK is sent before FEC decoding completes and may even be sent before FEC decoding begins. If the prediction is that FEC decoding will not be successful (step 1110, NO), the radio node either sends a preemptive NACK or preemptively requests retransmission (step 1114). This HARQ NACK or retransmission request is preemptively sent before FEC decoding completes and may even be sent before FEC decoding begins. After steps 1112 and 1114, the process returns to step 1100 and is repeated for the next codeword.
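- The decision logic of steps 1108-1114 might look like the following sketch; `ml_predict`, `send_ack`, and `send_nack` are hypothetical interfaces standing in for the radio node's actual trained ML model and HARQ signaling:

```python
from typing import Callable, Sequence

def harq_decision(llr_stats: Sequence[float],
                  mcs_params: Sequence[float],
                  channel_params: Sequence[float],
                  ml_predict: Callable[[list], bool],
                  send_ack: Callable[[], None],
                  send_nack: Callable[[], None],
                  preemptive_ack: bool = False) -> None:
    """Act on the ML prediction before FEC decoding completes
    (and possibly before it even begins)."""
    features = [*llr_stats, *mcs_params, *channel_params]
    if ml_predict(features):      # steps 1108/1110: decoding predicted to succeed
        if preemptive_ack:
            send_ack()            # step 1112: preemptive HARQ ACK
        # else: do nothing; wait for the actual FEC/CRC result
    else:
        send_nack()               # step 1114: preemptive NACK (or a
                                  # preemptive retransmission request)
```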
- At least some aspects of the present disclosure may be implemented in the cloud.
- for example, there are existing cloud implementations of a Turbo decoder.
- the ML predictor 210 and, optionally, the ML model training function 212 are implemented in the cloud.
- the training (i.e., the ML model training function 212) may be implemented in the cloud, e.g., to reduce the computational load on the unit containing the receiver.
- This requires an interface between the receiver and the ML trainer, where training data is fed from the receiver to the cloud-based trainer, and new prediction parameters are fed back from the ML trainer to the receiver.
- Such an interface/protocol needs to strike a balance between flexibility and efficiency. It must be flexible enough to allow description of various kinds of training data, either single examples or batches of training examples, and various kinds of ML models and corresponding parameters. However, data transferred over the training protocol competes for resources with the communication fronthaul/backhaul, and the amount must be reasonable.
- the ML predictor 210 is implemented in the receiver.
- Figure 12 is a schematic block diagram of a radio access node 1200 according to some embodiments of the present disclosure.
- the radio access node 1200 may be, for example, a base station 102 or 106.
- the radio access node 1200 includes a control system 1202 that includes one or more processors 1204 (e.g., Central Processing Units (CPUs), ASICs, Field Programmable Gate Arrays (FPGAs), and/or the like), memory 1206, and a network interface 1208.
- the one or more processors 1204 are also referred to herein as processing circuitry.
- the radio access node 1200 includes one or more radio units 1210 that each includes one or more transmitters 1212 and one or more receivers 1214 coupled to one or more antennas 1216.
- the radio units 1210 may be referred to or be part of radio interface circuitry.
- the radio unit(s) 1210 is external to the control system 1202 and connected to the control system 1202 via, e.g., a wired connection (e.g., an optical cable).
- the radio unit(s) 1210 and potentially the antenna(s) 1216 are integrated together with the control system 1202.
- the one or more processors 1204 operate to provide one or more functions of the radio access node 1200 as described herein.
- in some embodiments, the statistics calculator 208, the ML predictor 210, and, in some embodiments, the ML model training function 212 are implemented in hardware or a combination of hardware and software in the control system 1202 or distributed across the control system 1202 and the radio unit(s) 1210.
- at least some aspects of the statistics calculator 208, ML predictor 210 and, in some embodiments, the ML model training function 212 are implemented in software that is stored, e.g., in the memory 1206 and executed by the one or more processors 1204.
- Figure 13 is a schematic block diagram that illustrates a virtualized embodiment of the radio access node 1200 according to some embodiments of the present disclosure. This discussion is equally applicable to other types of network nodes. Further, other types of network nodes may have similar virtualized architectures.
- a "virtualized" radio access node is an implementation of the radio access node 1200 in which at least a portion of the functionality of the radio access node 1200 is implemented as a virtual component(s) (e.g., via a virtual machine(s) executing on a physical processing node(s) in a network(s)).
- the radio access node 1200 includes the control system 1202 that includes the one or more processors 1204 (e.g., CPUs, ASICs, FPGAs, and/or the like), the memory 1206, and the network interface 1208 and the one or more radio units 1210 that each includes the one or more transmitters 1212 and the one or more receivers 1214 coupled to the one or more antennas 1216, as described above.
- the control system 1202 is connected to the radio unit(s) 1210 via, for example, an optical cable or the like.
- the control system 1202 is connected to one or more processing nodes 1300 coupled to or included as part of a network(s) 1302 via the network interface 1208.
- Each processing node 1300 includes one or more processors 1304 (e.g., CPUs, ASICs, FPGAs, and/or the like), memory 1306, and a network interface 1308.
- functions 1310 of the radio access node 1200 described herein are implemented at the one or more processing nodes 1300 or distributed across the control system 1202 and the one or more processing nodes 1300 in any desired manner.
- in this example, at least some aspects of the statistics calculator 208, the ML predictor 210, and, in some embodiments, the ML model training function 212 described herein are implemented as virtual components executed by one or more virtual machines implemented in a virtual environment(s) hosted by the processing node(s) 1300.
- additional signaling or communication between the processing node(s) 1300 and the control system 1202 is used in order to carry out at least some of the desired functions 1310.
- the control system 1202 may not be included, in which case the radio unit(s) 1210 communicate directly with the processing node(s) 1300 via an appropriate network interface(s).
- a computer program including instructions which, when executed by at least one processor, causes the at least one processor to carry out the functionality of radio access node 1200 or a node (e.g., a processing node 1300) implementing one or more of the functions 1310 of the radio access node 1200 in a virtual environment according to any of the embodiments described herein is provided.
- a carrier comprising the aforementioned computer program product is provided. The carrier is one of an electronic signal, an optical signal, a radio signal, or a computer readable storage medium (e.g., a non-transitory computer readable medium such as memory).
- Figure 14 is a schematic block diagram of the radio access node 1200 according to some other embodiments of the present disclosure.
- the radio access node 1200 includes one or more modules 1400, each of which is implemented in software.
- the module(s) 1400 provide the functionality of the radio access node 1200 described herein. This discussion is equally applicable to the processing node 1300 of Figure 13 where the modules 1400 may be implemented at one of the processing nodes 1300 or distributed across multiple processing nodes 1300 and/or distributed across the processing node(s) 1300 and the control system 1202.
- the modules 1400 include an obtaining module operable to obtain soft metrics for at least a portion of a codeword, a determining module operable to determine one or more statistics based on the obtained soft metrics, a prediction module operable to make a prediction using a ML model as to whether the codeword will be successfully decoded by the FEC decoder, and a performing module operable to perform (e.g., initiate) one or more HARQ related tasks based on the prediction, as described above.
- the modules 1400 may include a ML model training module operable to train the ML model, as described above.
- Figure 15 is a schematic block diagram of a UE 1500 according to some embodiments of the present disclosure.
- the UE 1500 includes one or more processors 1502 (e.g., CPUs, ASICs, FPGAs, and/or the like), memory 1504, and one or more transceivers 1506 each including one or more transmitters 1508 and one or more receivers 1510 coupled to one or more antennas 1512.
- the processors 1502 are also referred to herein as processing circuitry.
- the transceivers 1506 are also referred to herein as radio circuitry.
- the statistics calculator 208, ML predictor 210 and, in some embodiments, the ML model training function 212 are implemented in hardware or a combination of hardware and software in the processor(s) 1502 or distributed across the processor(s) 1502 and the receiver(s) 1510.
- in some embodiments, at least a portion of this functionality, e.g., the ML model training function 212, is implemented in software that is, e.g., stored in the memory 1504 and executed by the processor(s) 1502.
- the UE 1500 may include additional components not illustrated in Figure 15 such as, e.g., one or more user interface components (e.g., a display, buttons, a touch screen, a microphone, a speaker(s), and/or the like), a power supply (e.g., a battery and associated power circuitry), etc.
- a computer program including instructions which, when executed by at least one processor, causes the at least one processor to carry out the functionality of the UE 1500 according to any of the embodiments described herein is provided.
- a carrier comprising the aforementioned computer program is provided.
- the carrier is one of an electronic signal, an optical signal, a radio signal, or a computer readable storage medium (e.g., a non-transitory computer readable medium such as memory).
- Figure 16 is a schematic block diagram of the UE 1500 according to some other embodiments of the present disclosure.
- the UE 1500 includes one or more modules 1600, each of which is implemented in software.
- the module(s) 1600 provide the functionality of the UE 1500 described herein.
- the modules 1600 include an obtaining module operable to obtain soft metrics for at least a portion of a codeword, a determining module operable to determine one or more statistics based on the obtained soft metrics, a prediction module operable to make a prediction using a ML model as to whether the codeword will be successfully decoded by the FEC decoder, and a performing module operable to perform (e.g., initiate) one or more HARQ related tasks based on the prediction, as described above.
- the modules 1600 may include a ML model training module operable to train the ML model, as described above.
- a communication system includes a telecommunication network 1700, such as a 3GPP-type cellular network, which comprises an access network 1702, such as a RAN, and a core network 1704.
- the access network 1702 comprises a plurality of base stations 1706A, 1706B, 1706C, such as NBs, eNBs, gNBs, or other types of wireless Access Points (APs), each defining a corresponding coverage area 1708A, 1708B, 1708C.
- Each base station 1706A, 1706B, 1706C is connectable to the core network 1704 over a wired or wireless connection 1710.
- a first UE 1712 located in coverage area 1708C is configured to wirelessly connect to, or be paged by, the corresponding base station 1706C.
- a second UE 1714 in coverage area 1708A is wirelessly connectable to the corresponding base station 1706A. While a plurality of UEs 1712, 1714 are illustrated in this example, the disclosed embodiments are equally applicable to a situation where a sole UE is in the coverage area or where a sole UE is connecting to the corresponding base station 1706.
- the telecommunication network 1700 is itself connected to a host computer 1716, which may be embodied in the hardware and/or software of a standalone server, a cloud-implemented server, a distributed server, or as processing resources in a server farm.
- the host computer 1716 may be under the ownership or control of a service provider, or may be operated by the service provider or on behalf of the service provider.
- Connections 1718 and 1720 between the telecommunication network 1700 and the host computer 1716 may extend directly from the core network 1704 to the host computer 1716 or may go via an optional intermediate network 1722.
- the intermediate network 1722 may be one of, or a combination of more than one of, a public, private, or hosted network; the intermediate network 1722, if any, may be a backbone network or the Internet; in particular, the intermediate network 1722 may comprise two or more sub-networks (not shown).
- the communication system of Figure 17 as a whole enables connectivity between the connected UEs 1712, 1714 and the host computer 1716.
- the connectivity may be described as an Over-the-Top (OTT) connection 1724.
- the host computer 1716 and the connected UEs 1712, 1714 are configured to communicate data and/or signaling via the OTT connection 1724, using the access network 1702, the core network 1704, any intermediate network 1722, and possible further infrastructure (not shown) as intermediaries.
- the OTT connection 1724 may be transparent in the sense that the participating communication devices through which the OTT connection 1724 passes are unaware of routing of uplink and downlink communications.
- the base station 1706 may not or need not be informed about the past routing of an incoming downlink communication with data originating from the host computer 1716 to be forwarded (e.g., handed over) to a connected UE 1712. Similarly, the base station 1706 need not be aware of the future routing of an outgoing uplink communication originating from the UE 1712 towards the host computer 1716.
- a host computer 1802 comprises hardware 1804 including a communication interface 1806 configured to set up and maintain a wired or wireless connection with an interface of a different communication device of the communication system 1800.
- the host computer 1802 further comprises processing circuitry 1808, which may have storage and/or processing capabilities.
- the processing circuitry 1808 may comprise one or more programmable processors, ASICs, FPGAs, or combinations of these (not shown) adapted to execute instructions.
- the host computer 1802 further comprises software 1810, which is stored in or accessible by the host computer 1802 and executable by the processing circuitry 1808.
- the software 1810 includes a host application 1812.
- the host application 1812 may be operable to provide a service to a remote user, such as a UE 1814 connecting via an OTT connection 1816 terminating at the UE 1814 and the host computer 1802.
- the host application 1812 may provide user data which is transmitted using the OTT connection 1816.
- the communication system 1800 further includes a base station 1818 provided in a telecommunication system and comprising hardware 1820 enabling it to communicate with the host computer 1802 and with the UE 1814.
- the hardware 1820 may include a communication interface 1822 for setting up and maintaining a wired or wireless connection with an interface of a different communication device of the communication system 1800, as well as a radio interface 1824 for setting up and maintaining at least a wireless connection 1826 with the UE 1814 located in a coverage area (not shown in Figure 18) served by the base station 1818.
- the communication interface 1822 may be configured to facilitate a connection 1828 to the host computer 1802.
- connection 1828 may be direct or it may pass through a core network (not shown in Figure 18) of the telecommunication system and/or through one or more intermediate networks outside the telecommunication system.
- the hardware 1820 of the base station 1818 further includes processing circuitry 1830, which may comprise one or more programmable processors, ASICs, FPGAs, or combinations of these (not shown) adapted to execute instructions.
- the base station 1818 further has software 1832 stored internally or accessible via an external connection.
- the communication system 1800 further includes the UE 1814 already referred to.
- the hardware 1834 of the UE 1814 may include a radio interface 1836 configured to set up and maintain a wireless connection 1826 with a base station serving a coverage area in which the UE 1814 is currently located.
- the hardware 1834 of the UE 1814 further includes processing circuitry 1838, which may comprise one or more programmable processors, ASICs, FPGAs, or combinations of these (not shown) adapted to execute instructions.
- the UE 1814 further comprises software 1840, which is stored in or accessible by the UE 1814 and executable by the processing circuitry 1838.
- the software 1840 includes a client application 1842.
- the client application 1842 may be operable to provide a service to a human or non-human user via the UE 1814, with the support of the host computer 1802.
- the executing host application 1812 may communicate with the executing client application 1842 via the OTT connection 1816 terminating at the UE 1814 and the host computer 1802.
- the client application 1842 may receive request data from the host application 1812 and provide user data in response to the request data.
- the OTT connection 1816 may transfer both the request data and the user data.
- the client application 1842 may interact with the user to generate the user data that it provides.
- the host computer 1802, the base station 1818, and the UE 1814 illustrated in Figure 18 may be similar or identical to the host computer 1716, one of the base stations 1706A, 1706B, 1706C, and one of the UEs 1712, 1714 of Figure 17, respectively.
- the inner workings of these entities may be as shown in Figure 18 and independently, the surrounding network topology may be that of Figure 17.
- the OTT connection 1816 has been drawn abstractly to illustrate the communication between the host computer 1802 and the UE 1814 via the base station 1818 without explicit reference to any intermediary devices and the precise routing of messages via these devices.
- the network infrastructure may determine the routing, which it may be configured to hide from the UE 1814 or from the service provider operating the host computer 1802, or both. While the OTT connection 1816 is active, the network infrastructure may further take decisions by which it dynamically changes the routing (e.g., on the basis of load balancing considerations or reconfiguration of the network).
- the wireless connection 1826 between the UE 1814 and the base station 1818 is in accordance with the teachings of the embodiments described throughout this disclosure.
- One or more of the various embodiments improve the performance of OTT services provided to the UE 1814 using the OTT connection 1816, in which the wireless connection 1826 forms the last segment. More precisely, the teachings of these embodiments may improve latency and thereby provide benefits such as, e.g., reduced user waiting time and better responsiveness.
- a measurement procedure may be provided for the purpose of monitoring data rate, latency, and other factors on which the one or more embodiments improve.
- the measurement procedure and/or the network functionality for reconfiguring the OTT connection 1816 may be implemented in the software 1810 and the hardware 1804 of the host computer 1802, in the software 1840 and the hardware 1834 of the UE 1814, or in both.
- sensors (not shown) may be deployed in or in association with communication devices through which the OTT connection 1816 passes; the sensors may participate in the measurement procedure by supplying values of the monitored quantities exemplified above, or supplying values of other physical quantities from which the software 1810, 1840 may compute or estimate the monitored quantities.
- the reconfiguring of the OTT connection 1816 may include changes in message format, retransmission settings, preferred routing, etc.; the reconfiguring need not affect the base station 1818, and it may be unknown or imperceptible to the base station 1818. Such procedures and functionalities may be known and practiced in the art.
- measurements may involve proprietary UE signaling facilitating the host computer 1802’s measurements of throughput, propagation times, latency, and the like. The measurements may be implemented in that the software 1810 and 1840 causes messages to be transmitted, in particular empty or ‘dummy’ messages, using the OTT connection 1816 while it monitors propagation times, errors, etc.
- Figure 19 is a flowchart illustrating a method implemented in a communication system, in accordance with one embodiment.
- the communication system includes a host computer, a base station, and a UE which may be those described with reference to Figures 17 and 18. For simplicity of the present disclosure, only drawing references to Figure 19 will be included in this section.
- in step 1900, the host computer provides user data.
- in sub-step 1902 (which may be optional) of step 1900, the host computer provides the user data by executing a host application.
- the host computer initiates a transmission carrying the user data to the UE.
- the base station transmits to the UE the user data which was carried in the transmission that the host computer initiated, in accordance with the teachings of the embodiments described throughout this disclosure.
- the UE executes a client application associated with the host application executed by the host computer.
- Figure 20 is a flowchart illustrating a method implemented in a communication system, in accordance with one embodiment.
- the communication system includes a host computer, a base station, and a UE which may be those described with reference to Figures 17 and 18. For simplicity of the present disclosure, only drawing references to Figure 20 will be included in this section.
- the host computer provides user data.
- the host computer provides the user data by executing a host application.
- the host computer initiates a transmission carrying the user data to the UE.
- the transmission may pass via the base station, in accordance with the teachings of the embodiments described throughout this disclosure.
- the UE receives the user data carried in the transmission.
- Figure 21 is a flowchart illustrating a method implemented in a communication system, in accordance with one embodiment.
- step 2100 the UE receives input data provided by the host computer. Additionally or alternatively, in step 2102, the UE provides user data. In sub-step 2104 (which may be optional) of step 2100, the UE provides the user data by executing a client application. In sub-step 2106 (which may be optional) of step 2102, the UE executes a client application which provides the user data in reaction to the received input data provided by the host computer. In providing the user data, the executed client application may further consider user input received from the user.
- the UE initiates, in sub-step 2108 (which may be optional), transmission of the user data to the host computer.
- the host computer receives the user data transmitted from the UE, in accordance with the teachings of the embodiments described throughout this disclosure.
- Figure 22 is a flowchart illustrating a method implemented in a communication system, in accordance with one embodiment.
- the communication system includes a host computer, a base station, and a UE which may be those described with reference to Figures 17 and 18. For simplicity of the present disclosure, only drawing references to Figure 22 will be included in this section.
- the base station receives user data from the UE.
- the base station initiates transmission of the received user data to the host computer.
- the host computer receives the user data carried in the transmission initiated by the base station.
Abstract
Systems and methods for performing a Hybrid Automatic Repeat Request (HARQ) related task based on a Machine-Learning (ML) based prediction of a performance of a Forward Error Correction (FEC) decoder of a radio node are disclosed. In some embodiments, a method in a radio node in a cellular communications system comprises obtaining a plurality of soft metrics for at least a portion of a codeword, determining one or more statistics based on the plurality of soft metrics, and making a prediction as to whether the codeword will be successfully decoded by a FEC decoder of the radio node using a ML model. The one or more statistics are provided as inputs to the ML model in order to make the prediction. The method further comprises performing a HARQ related task based on the prediction.
Description
MACHINE LEARNING PREDICTION OF DECODER PERFORMANCE
Technical Field
[0001] The present disclosure relates to predicting a performance of a Forward Error Correction (FEC) decoder of a radio node and utilizing the prediction to perform one or more Hybrid Automatic Repeat Request (HARQ) related tasks.
Background
[0002] When transmitting a message over a wireless communication channel, errors are introduced into the message due to the unreliability or noisiness of the wireless communication channel. Forward Error Correction (FEC) codes are used to enable the receiver to detect and correct such errors via the use of respective FEC decoders. FEC introduces redundancy in a controlled manner, where increasing the redundancy increases the
performance of FEC. However, as the amount of redundancy increases, code rate and thus channel utilization efficiency decreases.
[0003] Hybrid Automatic Repeat Request (HARQ) is used to strike a balance between error-correcting performance and channel utilization. HARQ is a combination of high-rate FEC coding and Automatic Repeat Request (ARQ). When using HARQ, a transmitter first transmits a comparably high-rate codeword. If decoding of the codeword at the receiver is not successful, then the transmitter transmits a retransmission of the codeword, where the retransmission provides additional redundancy. At the receiver, the
retransmission is combined with the initial transmission and decoding is restarted. A Cyclic Redundancy Check (CRC) code is used to determine whether the decoding succeeded or not.
[0004] In Third Generation Partnership Project (3GPP) Long Term Evolution (LTE), turbo codes are used for data channel FEC. In 3GPP New Radio (NR), Low Density Parity Check (LDPC) codes will be used for data channel FEC.
[0005] HARQ processing including FEC places a limit on the latencies that can be achieved, e.g., in 3GPP LTE and NR. In particular, for FEC using Turbo codes and LDPC codes, the FEC decoder uses an iterative decoding process and must complete its decoding iterations before the CRC check can be performed. If the CRC check fails, then a HARQ retransmission is requested. Thus, when using HARQ processing, the time needed to complete FEC decoding places a limit on the latencies that can be achieved.
Summary
[0006] Systems and methods for performing a Hybrid Automatic Repeat Request (HARQ) related task based on a Machine-Learning (ML) based prediction of a performance of a Forward Error Correction (FEC) decoder of a radio node are disclosed. In some embodiments, a method in a radio node in a cellular communications system comprises obtaining a plurality of soft metrics for at least a portion of a codeword, determining one or more statistics based on the plurality of soft metrics, and making a prediction as to whether the codeword will be successfully decoded by a FEC decoder of the radio node using a ML model. The one or more statistics are provided as inputs to the ML model in order to make the prediction. The method further comprises performing a HARQ related task based on the prediction.
[0007] In some embodiments, the one or more statistics and one or more additional parameters are provided as inputs to the ML model in order to make the prediction, the one or more additional parameters comprising one or more modulation and coding parameters and/or one or more channel parameters. Further, in some embodiments, the one or more additional parameters comprise the one or more modulation and coding parameters, and the one or more modulation and coding parameters comprise a code rate used for the codeword and/or a modulation index of a modulation and coding scheme used for the codeword. In some embodiments, the one or more additional parameters comprise the one or more channel parameters, and the one or more channel parameters comprise a Signal to Noise Ratio (SNR) of a wireless channel on
which the codeword is received, carrier frequency of the wireless channel on which the codeword is received, fading characteristics of the wireless channel on which the codeword is received, and/or a speed of the radio node and/or a speed of a transmitter from which the codeword is received.
[0008] In some embodiments, the one or more statistics comprise a mean of the plurality of soft metrics, a variance of the plurality of soft metrics, a skewness of the plurality of soft metrics, a kurtosis of the plurality of soft metrics, and/or one or more central moments of the plurality of soft metrics.
[0009] In some embodiments, the plurality of soft metrics is a plurality of Log Likelihood Ratio (LLR) values.
[0010] In some embodiments, the prediction is that the codeword will not be successfully decoded by the FEC decoder of the radio node, and performing the HARQ related task comprises sending, before decoding of the codeword by the FEC decoder is complete, a HARQ retransmission request to a transmit node that transmitted the codeword.
[0011] In some embodiments, the prediction is that the codeword will not be successfully decoded by the FEC decoder of the radio node, and performing the HARQ related task comprises sending, before decoding of the codeword by the FEC decoder is complete, a Negative Acknowledgement (NACK) to a transmit node that transmitted the codeword.
[0012] In some embodiments, the prediction is that the codeword will be successfully decoded by the FEC decoder of the radio node, and performing the HARQ related task comprises waiting until decoding of the codeword by the FEC decoder is complete before sending an Acknowledgement (ACK) or NACK to a transmit node that transmitted the codeword.
[0013] In some embodiments, the prediction is that the codeword will be successfully decoded by the FEC decoder of the radio node, and performing the HARQ related task comprises sending, before decoding of the codeword by the FEC decoder is complete, an ACK to a transmit node that transmitted the codeword.
[0014] In some embodiments, the FEC decoder is a Turbo decoder, a Low Density Parity Check (LDPC) decoder, or a Polar decoder.
[0015] In some embodiments, the method further comprises training the ML model based on a plurality of prior codewords that were received and decoded by the radio node prior to receiving the codeword. In some embodiments, training the ML model comprises, for each prior codeword of the plurality of prior codewords, obtaining a plurality of soft metrics for at least a portion of the prior codeword, determining one or more statistics based on the plurality of soft metrics obtained for the at least a portion of the prior codeword, storing the one or more statistics for the prior codeword, decoding the prior codeword to obtain a decoding result, and storing the decoding result for the prior codeword. Training the ML model further comprises training the ML model based on the stored statistics and the stored decoding results for the plurality of prior codewords.
[0016] In some embodiments, training the ML model based on the stored statistics and the stored decoding results for the plurality of prior codewords comprises training the ML model based on the stored statistics and the stored decoding results for the plurality of prior codewords and one or more modulation and coding parameters and/or one or more channel parameters for the plurality of prior codewords, respectively.
[0017] In some embodiments, the plurality of prior codewords comprise two or more sets of prior codewords, and training the ML model comprises: (a) for each prior codeword in a first set of prior codewords, resetting an accumulated gradient to zero, computing one or more statistics based on a plurality of soft metrics obtained for at least a portion of the prior codeword, decoding the prior codeword to obtain a decoding result, computing a prediction of a plurality of parameters that define the ML model given the one or more statistics, computing an error gradient using the prediction of the plurality of parameters of the ML model and the decoding result, and updating the accumulated gradient based on the computed error gradient; (b) repeating the steps of resetting the accumulated gradient, computing the one or more statistics, decoding the prior codeword, computing the prediction, computing the error gradient, and updating the
accumulated gradient for each other prior codeword in the first set; (c) updating the plurality of parameters that define the ML model based on the accumulated gradient using a backward pass; and (d) repeating steps (a) through (c) for each other set of prior codewords in the two or more sets of prior codewords.
[0018] In some embodiments, the radio node is a base station. In some other embodiments, the radio node is a wireless device.
[0019] Embodiments of a radio node are also disclosed. In some
embodiments, a radio node for a cellular communications system is disclosed, the radio node adapted to obtain a plurality of soft metrics for at least a portion of a codeword, determine one or more statistics based on the plurality of soft metrics, and make a prediction as to whether the codeword will be successfully decoded by a FEC decoder of the radio node using a ML model. The one or more statistics are provided as inputs to the ML model in order to make the prediction. The radio node is further adapted to perform a HARQ related task based on the prediction. In some embodiments, the radio node is a base station. In some other embodiments, the radio node is a wireless device.
[0020] In some embodiments, a radio node for a cellular communications system comprises circuitry operable to obtain a plurality of soft metrics for at least a portion of a codeword, determine one or more statistics based on the plurality of soft metrics, and make a prediction as to whether the codeword will be successfully decoded by a FEC decoder of the radio node using a ML model.
The one or more statistics are provided as inputs to the ML model in order to make the prediction. The circuitry is further operable to perform a HARQ related task based on the prediction.
Brief Description of the Drawings
[0021] The accompanying drawing figures incorporated in and forming a part of this specification illustrate several aspects of the disclosure, and together with the description serve to explain the principles of the disclosure.
[0022] Figure 1 illustrates one example of a cellular communications network according to some embodiments of the present disclosure;
[0023] Figure 2 illustrates an example of a demodulation and decoding system of a receiver of a radio node such as, for example, a wireless device or a base station in the cellular communications network of Figure 1, that predicts Forward Error Correction (FEC) decoder performance and utilizes the prediction to perform one or more Hybrid Automatic Repeat Request (HARQ) related tasks in accordance with embodiments of the present disclosure;
[0024] Figure 3 is a flow chart that illustrates a process for predicting FEC decoder performance and utilizing the prediction to perform one or more HARQ related tasks in accordance with embodiments of the present disclosure;
[0025] Figure 4 shows a histogram of Log Likelihood Ratio (LLR) values for 16 symbol constellation Quadrature Amplitude Modulation (16-QAM);
[0026] Figure 5 shows a histogram of absolute value LLR values for 16-QAM;
[0027] Figure 6 illustrates a procedure for training a Machine-Learning (ML) model in accordance with some embodiments of the present disclosure;
[0028] Figure 7 illustrates storage for a fixed batch method;
[0029] Figure 8 illustrates storage and pointer operations for a circular buffer implementation of a sliding window batch processing method;
[0030] Figure 9 illustrates computation and storage requirements for an accumulated gradient method;
[0031] Figure 10 illustrates a process for training the ML model using the accumulated gradient scheme in accordance with some embodiments of the present disclosure;
[0032] Figure 11 is a flow chart that illustrates the use of the trained ML model to predict the performance of the FEC decoder during operation in accordance with some embodiments of the present disclosure;
[0033] Figure 12 is a schematic block diagram of a radio access node according to some embodiments of the present disclosure;
[0034] Figure 13 is a schematic block diagram that illustrates a virtualized embodiment of the radio access node of Figure 12 according to some
embodiments of the present disclosure;
[0035] Figure 14 is a schematic block diagram of the radio access node of Figure 12 according to some other embodiments of the present disclosure;
[0036] Figure 15 is a schematic block diagram of a User Equipment device (UE) according to some embodiments of the present disclosure;
[0037] Figure 16 is a schematic block diagram of the UE of Figure 15 according to some other embodiments of the present disclosure;
[0038] Figure 17 illustrates a telecommunication network connected via an intermediate network to a host computer in accordance with some embodiments of the present disclosure;
[0039] Figure 18 is a generalized block diagram of a host computer communicating via a base station with a UE over a partially wireless connection in accordance with some embodiments of the present disclosure;
[0040] Figure 19 is a flowchart illustrating a method implemented in a communication system in accordance with one embodiment of the present disclosure;
[0041] Figure 20 is a flowchart illustrating a method implemented in a communication system in accordance with one embodiment of the present disclosure;
[0042] Figure 21 is a flowchart illustrating a method implemented in a communication system in accordance with one embodiment of the present disclosure; and
[0043] Figure 22 is a flowchart illustrating a method implemented in a communication system in accordance with one embodiment of the present disclosure.
Detailed Description
[0044] The embodiments set forth below represent information to enable those skilled in the art to practice the embodiments and illustrate the best mode of practicing the embodiments. Upon reading the following description in light of the accompanying drawing figures, those skilled in the art will understand the concepts of the disclosure and will recognize applications of these concepts not
particularly addressed herein. It should be understood that these concepts and applications fall within the scope of the disclosure.
[0045] Radio Node: As used herein, a “radio node” is either a radio access node or a wireless device.
[0046] Radio Access Node: As used herein, a “radio access node” or “radio network node” is any node in a radio access network of a cellular
communications network that operates to wirelessly transmit and/or receive signals. Some examples of a radio access node include, but are not limited to, a base station (e.g., a New Radio (NR) base station (gNB) in a Third Generation Partnership Project (3GPP) Fifth Generation (5G) NR network or an enhanced or evolved Node B (eNB) in a 3GPP Long Term Evolution (LTE) network), a high- power or macro base station, a low-power base station (e.g., a micro base station, a pico base station, a home eNB, or the like), and a relay node.
[0047] Core Network Node: As used herein, a “core network node” is any type of node in a core network. Some examples of a core network node include, e.g., a Mobility Management Entity (MME), a Packet Data Network Gateway (P-GW), a Service Capability Exposure Function (SCEF), or the like.
[0048] Wireless Device: As used herein, a “wireless device” is any type of device that has access to (i.e., is served by) a cellular communications network by wirelessly transmitting and/or receiving signals to a radio access node(s). Some examples of a wireless device include, but are not limited to, a User Equipment device (UE) in a 3GPP network and a Machine Type Communication (MTC) device.
[0049] Network Node: As used herein, a “network node” is any node that is either part of the radio access network or the core network of a cellular communications network/system.
[0050] Note that the description given herein focuses on a 3GPP cellular communications system and, as such, 3GPP terminology or terminology similar to 3GPP terminology is oftentimes used. However, the concepts disclosed herein are not limited to a 3GPP system.
[0051] Note that, in the description herein, reference may be made to the term “cell”; however, particularly with respect to 5G NR concepts, beams may be used instead of cells and, as such, it is important to note that the concepts described herein are equally applicable to both cells and beams.
[0052] Systems and methods are disclosed herein for predicting Forward Error Correction (FEC) decoder performance using a Machine-Learning (ML) model and utilizing the predicted FEC decoder performance for one or more Hybrid Automatic Repeat Request (HARQ) related tasks. In some embodiments, if FEC decoding is predicted to not succeed, HARQ retransmission may be preemptively requested or a HARQ Negative Acknowledgement (NACK) may be sent before FEC decoding is complete. If FEC decoding is predicted to succeed, the receiver may wait for FEC decoding to complete before sending a HARQ Acknowledgement (ACK) or NACK as appropriate, or may preemptively send a HARQ ACK before FEC decoding is complete. In this manner, latency can be improved.
[0053] Statistics of soft metrics are used as inputs to the ML model. Further, in some embodiments, additional parameters may be used as inputs to the ML model. These additional parameters may include, for example, one or more modulation and coding parameters (e.g., Modulation and Coding Scheme (MCS) index, or any other modulation and coding parameter(s) that affect the shape of the soft metric distributions) and/or one or more channel parameters (e.g., Signal to Noise Ratio (SNR), carrier frequency, fading characteristics, speed of the terminal, etc.). Embodiments for training the ML model are also disclosed.
[0054] As discussed below, the soft metric statistics are computed
successively as the soft metrics are output by the demodulator of the receiver. Hence, it is not necessary to wait for the whole codeword to be received before a prediction can be made and the appropriate HARQ task(s) is performed.
[0055] While embodiments of the present disclosure provide numerous advantages, one example advantage is that embodiments of the present disclosure reduce the amount of time between reception of a codeword and HARQ ACK/NACK transmission, which in turn reduces latency on the link level of
the cellular communications system. This reduction in latency would especially benefit Ultra Reliable Low Latency Communication (URLLC). Further, 3GPP NR (i.e., 5G) Low Density Parity Check (LDPC) codes are, at least currently, not as optimized as the current Turbo codes used in 3GPP LTE. Thus, the benefit of implementing embodiments of the present disclosure in a 3GPP NR network may be substantial.
[0056] In this regard, Figure 1 illustrates one example of a cellular
communications network 100 according to some embodiments of the present disclosure. In the embodiments described herein, the cellular communications network 100 is a 5G NR network. In this example, the cellular communications network 100 includes base stations 102-1 and 102-2, which in LTE are referred to as eNBs and in 5G NR are referred to as gNBs, controlling corresponding macro cells 104-1 and 104-2. The base stations 102-1 and 102-2 are generally referred to herein collectively as base stations 102 and individually as base station 102. Likewise, the macro cells 104-1 and 104-2 are generally referred to herein collectively as macro cells 104 and individually as macro cell 104. The cellular communications network 100 may also include a number of low power nodes 106-1 through 106-4 controlling corresponding small cells 108-1 through 108-4. The low power nodes 106-1 through 106-4 can be small base stations (such as pico or femto base stations) or Remote Radio Heads (RRHs), or the like. Notably, while not illustrated, one or more of the small cells 108-1 through 108-4 may alternatively be provided by the base stations 102. The low power nodes 106-1 through 106-4 are generally referred to herein collectively as low power nodes 106 and individually as low power node 106. Likewise, the small cells 108-1 through 108-4 are generally referred to herein collectively as small cells 108 and individually as small cell 108. The base stations 102 (and optionally the low power nodes 106) are connected to a core network 110.
[0057] The base stations 102 and the low power nodes 106 provide service to wireless devices 112-1 through 112-5 in the corresponding cells 104 and 108. The wireless devices 112-1 through 112-5 are generally referred to herein
collectively as wireless devices 112 and individually as wireless device 112. The wireless devices 112 are also sometimes referred to herein as UEs.
[0058] Figure 2 illustrates one example of a demodulation and decoding system 200 that predicts FEC decoder performance using a ML model and utilizes the predicted FEC decoder performance to perform one or more HARQ related tasks in accordance with some embodiments of the present disclosure. The demodulation and decoding system 200 is implemented in a receiver chain of a radio node such as, e.g., a base station 102 or a wireless device 112. Note that optional blocks are indicated with dashed lines.
[0059] As illustrated, the demodulation and decoding system 200 includes a demodulator 202, a soft metric computation function 204, a FEC decoder 206, a Cyclic Redundancy Check (CRC) check function 207, a statistics calculator 208, a ML predictor 210, and optionally a ML model training function 212. The various components of the demodulation and decoding system 200 (i.e., the demodulator 202, the soft metric computation function 204, the FEC decoder 206, the statistics calculator 208, the ML predictor 210, and optionally the ML model training function 212) are implemented in hardware (e.g., an Application Specific Integrated Circuit(s) (ASIC(s))) or a combination of hardware and software.
[0060] In operation, the demodulator 202 demodulates a received signal and outputs a complex-valued symbol, which is an estimate of the transmitted constellation symbol. For each received symbol, the soft metric computation function 204 computes a soft metric value (e.g., a Log Likelihood Ratio (LLR) value) for each bit of the received symbol. In other words, for each received symbol, the soft metric computation function 204 computes a real-valued estimate of the LLRs of the bits in the label of the symbol. For example, in the case of 16 symbol constellation Quadrature Amplitude Modulation (16-QAM), each of the 2^4 = 16 symbol constellation points has a 4-bit label, and thus the soft metric computation function 204 outputs four LLR values per received symbol. The soft metrics are then fed into the FEC decoder 206. The FEC decoder 206 can be any type of FEC decoder. Some example FEC decoder types are a Turbo decoder, a LDPC decoder, and a Polar decoder. However, other classical types
of FEC and corresponding decoders may be used. Some examples of classical FEC codes include, but are not limited to, convolutional codes and a range of block codes, e.g., Reed-Solomon, Bose-Chaudhuri-Hocquenghem (BCH), and Hamming codes. After the FEC decoding, the bits of the received symbol are fed to the CRC check function 207 and then onwards in the receiver chain of the radio node, as will be appreciated by one of skill in the art. Modern FECs use iterative decoders which run for a number of iterations; the more iterations, the longer the delay.
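By way of a non-limiting illustration, the following Python sketch shows one way such per-bit LLR values could be computed for a received 16-QAM symbol using the well-known max-log approximation. The Gray labeling, the unit-energy normalization, and the sign convention LLR = log(P(bit = 0)/P(bit = 1)) are assumptions made for the sketch, not mandated by this disclosure.

```python
import numpy as np

# Illustrative Gray-labelled 16-QAM constellation (an assumption): points[i]
# is the complex symbol whose 4-bit label is the binary expansion of i;
# average symbol energy is normalized to 1.
levels = np.array([-3.0, -1.0, 3.0, 1.0]) / np.sqrt(10.0)  # two-bit values 00,01,10,11
points = np.array([complex(levels[i >> 2], levels[i & 3]) for i in range(16)])

def max_log_llrs(y, noise_var):
    """Max-log LLRs for one received symbol y.

    LLR(b_k) ~ (min over symbols with b_k = 1 of |y - s|^2
                - min over symbols with b_k = 0 of |y - s|^2) / noise_var
    """
    d2 = np.abs(y - points) ** 2          # squared distance to each candidate symbol
    labels = np.arange(16)
    llrs = []
    for k in range(4):                    # one LLR per bit in the 4-bit label
        bit = (labels >> (3 - k)) & 1     # value of bit k for every candidate symbol
        llrs.append((d2[bit == 1].min() - d2[bit == 0].min()) / noise_var)
    return llrs

# Example: a noisy observation of the symbol with label 0b1010.
print(max_log_llrs(points[0b1010] + (0.05 + 0.02j), noise_var=0.1))
```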
[0061] In accordance with embodiments of the present disclosure, the statistics calculator 208 computes one or more statistics of the soft metrics (e.g., the LLR values) output by the soft metric computation function 204. The statistics include, e.g., mean value, variance, skewness, and/or kurtosis. The skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable about its mean. The skewness value can be positive, negative, or undefined. Kurtosis is a measure of the "tailedness" of the probability distribution of a real-valued random variable. The kurtosis of any univariate normal distribution is 3. Excess kurtosis is the kurtosis minus 3, i.e., the deviation from a normal distribution.
[0062] The computed statistics of the soft metrics are used by the ML predictor 210 as inputs to a ML model to predict the performance of the FEC decoder 206. In particular, during training, the statistics of the soft metrics are computed and stored while receiving each of a number of codewords. The stored statistics are used by the ML model training function 212 to train a ML model for predicting the performance of the FEC decoder 206. Subsequently, when receiving a new codeword, the statistics of the soft metrics output by the soft metric computation function 204 are used by the ML predictor 210 as inputs to the trained ML model to predict whether the FEC decoder 206 will successfully decode the codeword. The ML predictor 210 then performs or initiates performance of one or more HARQ related tasks based on the prediction.
[0063] Figure 3 is a flow chart that illustrates a process (e.g., performed by the demodulation and decoding system 200 of the receive chain of a radio node)
to predict the FEC decoder performance and utilize the prediction to perform one or more HARQ-related tasks in accordance with some embodiments of the present disclosure. As illustrated, the radio node (e.g., via the soft metric computation function 204) obtains soft metrics for at least a portion (i.e., a received portion) of a codeword (step 300). The codeword is a codeword that is in the process of being received by the radio node. The soft metrics are obtained as the codeword is being received. The radio node (e.g., the statistics calculator 208) determines one or more statistics for the obtained soft metrics (step 302). The radio node (e.g., the ML predictor 210) makes a prediction as to whether the codeword will be successfully decoded by the FEC decoder 206 using a ML model (step 304). The one or more statistics computed for the obtained soft metrics are used as inputs to the ML model. Optionally, one or more additional parameters may be used as additional inputs to the ML model. These one or more additional parameters may include, for example, one or more modulation and coding parameters (e.g., MCS index, or any other modulation and coding parameter(s) that affect the shape of the soft metric distributions) and/or one or more channel parameters (e.g., SNR, carrier frequency, fading characteristics, speed of the terminal, etc.).
[0064] The radio node performs one or more HARQ related tasks based on the prediction (step 306). These HARQ related task(s) may be initiated by the ML predictor 210 or performed by some other component of the radio node based on the output of the ML predictor 210, which may be a value that indicates the prediction (i.e., a value that indicates successful decoding or a value that indicates unsuccessful decoding). For example, if the prediction is that the FEC decoder 206 will not successfully decode the codeword, the radio node may preemptively transmit a HARQ NACK (i.e., transmit a HARQ NACK before FEC decoding of the codeword is complete) or preemptively request HARQ
retransmission (i.e., request HARQ retransmission before FEC decoding of the codeword is complete). As another example, if the prediction is that the FEC decoder 206 will successfully decode the codeword, the radio node may wait until FEC decoding of the codeword is complete to transmit a HARQ ACK or
NACK depending on whether decoding is actually successful or not, or preemptively transmit a HARQ ACK (i.e., transmit a HARQ ACK before FEC decoding of the codeword is complete).
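For illustration only, the HARQ task selection of step 306 may be summarized as the following decision rule, sketched in Python; the returned action descriptions and the preemptive-ACK option are illustrative assumptions.

```python
def harq_action(predicted_success: bool, allow_preemptive_ack: bool = False) -> str:
    """Return the HARQ-related action for step 306 (action names illustrative)."""
    if not predicted_success:
        # Predicted decoding failure: a NACK / retransmission request is sent
        # before FEC decoding of the codeword is complete.
        return "preemptive NACK / retransmission request"
    if allow_preemptive_ack:
        # Predicted success with preemptive acknowledgement enabled.
        return "preemptive ACK"
    # Predicted success: defer to the actual CRC check after FEC decoding.
    return "wait for FEC decoding and CRC, then ACK or NACK"
```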
[0065] Now some particular embodiments of the present disclosure will be discussed. In these embodiments, the prediction is based on a ML model and the soft metrics are LLR values. Regarding the computed statistics for the soft metrics, Figure 4 shows a histogram of LLR values for 16-QAM. The histogram is an approximation of the LLR distribution. The histogram is made up of a number of superposed Gaussian-like distributions whose widths depend on the channel SNR. When all constellation symbols are used equiprobably, the bits in the labels will be zero or one with equal probability. The mean and skewness values of the overall LLR distribution will thus be zero or close to zero regardless of the channel SNR.
[0066] To remedy this, the absolute value of the LLR values is also considered, as shown in the histogram of Figure 5. Here, no values will be below zero, and thus the mean and skewness values will be greater than zero. The variance of the abs(LLR) distribution will be lower than that of the LLR distribution.
[0067] The mean, variance, skewness, and kurtosis of the LLR values for a codeword can be computed successively, and thus it is not necessary to wait for the entire codeword to be received before a prediction of the FEC decoder performance can be made. These statistics can be computed through the central moments as shown below. Note that, as will be understood by one of skill in the art of probability theory and statistics, a central moment is a moment of a probability distribution of a random variable about the random variable's mean. In other words, a central moment is the expected value of a specified integer power of the deviation of the random variable from the mean.
1. Initialize the LLR sample counter n and the four moments m, M2, M3, and M4 to zero.
2. Obtain the next LLR sample x.
3. Update the LLR sample counter n as:
n = n + 1
4. Let d be the difference between the current LLR sample x and the current accumulated mean m:
d = x - m
5. Update the mean value:
m' = m + d/n
m = m'
6. Update the second moment, the third moment, and the fourth moment. Note that the values M2 and M3 on the right-hand sides are those from the previous iteration:
a. Update the second moment:
M2' = M2 + d^2 (n - 1)/n
b. Update the third moment:
M3' = M3 + d^3 (n - 1)(n - 2)/n^2 - 3 d M2/n
c. Update the fourth moment:
M4' = M4 + d^4 (n - 1)(n^2 - 3n + 3)/n^3 + 6 d^2 M2/n^2 - 4 d M3/n
Then set M2 = M2', M3 = M3', and M4 = M4'.
7. The mean, variance, skewness, and kurtosis can be computed from the central moments as follows:
mean = m
variance = M2/n
skewness = sqrt(n) * M3 / M2^(3/2)
kurtosis = n * M4 / M2^2
8. Repeat steps 2-7 for a desired number of LLR samples.
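By way of illustration only, steps 1-8 may be realized as the following Python sketch of the one-pass moment updates (a higher-order variant of Welford's online algorithm); the class and method names are illustrative.

```python
class OnlineLLRStats:
    """One-pass mean/variance/skewness/kurtosis of LLR samples (steps 1-8 above),
    so the statistics are available before the whole codeword is received."""

    def __init__(self):
        self.n = 0
        self.m = 0.0                        # running mean
        self.M2 = self.M3 = self.M4 = 0.0   # central-moment accumulators

    def update(self, x):
        """Steps 2-6: fold one LLR sample x into the accumulators."""
        n1 = self.n
        self.n += 1
        d = x - self.m                      # step 4
        dn = d / self.n
        term1 = d * dn * n1
        self.m += dn                        # step 5
        # Step 6: update the higher moments first, so that M2 and M3 on the
        # right-hand sides are still the values from the previous iteration.
        self.M4 += (term1 * dn * dn * (self.n * self.n - 3 * self.n + 3)
                    + 6 * dn * dn * self.M2 - 4 * dn * self.M3)
        self.M3 += term1 * dn * (self.n - 2) - 3 * dn * self.M2
        self.M2 += term1

    def stats(self):
        """Step 7 (requires n >= 2 and non-zero variance)."""
        var = self.M2 / self.n
        skew = (self.n ** 0.5) * self.M3 / self.M2 ** 1.5
        kurt = self.n * self.M4 / (self.M2 * self.M2)
        return self.m, var, skew, kurt
```

After, e.g., 64 or 256 calls to update() (in line with the example thresholds discussed below), stats() yields the mean, variance, skewness, and kurtosis that serve as inputs to the ML model.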
[0068] Since the mean, variance, skewness, and kurtosis of the LLR values are computed successively, early values may differ substantially from the final values computed over the whole codeword, so making predictions too early may lead to erroneous predictions. Thus, the number of LLR values processed before making the prediction may vary depending on the desired accuracy of the prediction. As an example, a prediction can be made after 64 LLR samples. As another example, a prediction can be made after 256 LLR samples.
[0069] Note that the statistics described above are only an example.
Additional or alternative statistics of the LLR values for at least a portion of the codeword may be used. The statistics used may vary depending on the particular ML model used. Further, while the mean, variance, skewness, and kurtosis of the LLR values are computed in the example above and used as inputs for the ML model, the present disclosure is not limited thereto. Any one or any combination of two or more of the mean, variance, skewness, and kurtosis of the LLR values may be computed and used for the ML model, depending on the particular implementation.
[0070] The statistical values computed for the LLR values and, optionally, additional parameters such as modulation and coding parameters and/or channel parameters are then fed into the ML model. During training of the ML model, the output of the FEC decoder 206 is used as the training target. This training can either be done before deployment or during normal operations.
[0071] Once the ML model is trained, the output of the ML predictor 210 can be used to perform one or more HARQ related tasks such as, e.g., initiating a HARQ retransmission earlier when it is predicted that the FEC decoder 206 will fail.
[0072] Figure 6 illustrates a procedure for training the ML model in
accordance with some embodiments of the present disclosure. For illustration purposes, this process is described as being performed by the demodulation and decoding system 200 of Figure 2; however, the present disclosure is not limited thereto. As illustrated, a codeword is received by the radio node and processed by the demodulator 202 and the soft metric computation function 204 (step 600).
As the codeword is being received, soft metrics, which are LLR values in this example, are output by the soft metric computation function 204. Initially, the statistics are reset or initialized to some initial value (step 602). As the codeword is received, the statistics calculator 208 collects, or obtains, a LLR value and updates the statistics based on the LLR value, as described above (steps 604 and 606). For example, using the process for updating the moments m, M2, M3, and M4 described above, the statistics calculator 208 uses the LLR value (referred to as the LLR sample x above) to update the moments m, M2, M3, and M4 and updates the mean, variance, skewness, and kurtosis values based on the updated moments m, M2, M3, and M4. The statistics calculator 208 determines whether the number of LLR values collected for the codeword has reached a predefined or preconfigured threshold number of samples (step 608). Preferably, the threshold number of LLR samples is less than the total number of LLR values to be output by the soft metric computation function 204 for the codeword. For example, the threshold number of samples may be 64 or 256. More generally, the threshold number of samples can be any number up to the maximum length of the codeword. Since the statistics are computed sequentially, the statistics become more accurate as the number of samples collected before the prediction increases. However, making the prediction more quickly increases the amount of latency reduction that can be achieved. Hence, there is a trade-off between low latency and better statistics. In LTE, the codewords can range from 40 to 6144 bits, so in some cases there are only 40 LLR values. In the case of 40 LLR values, in some embodiments, all 40 LLR values may be collected before making the prediction, and hence the latency reduction would “only” be in the time saved by not starting the FEC decoder. For very long codewords, it will take some time to receive all LLR values, and if the prediction is made before all LLR values have been received, then the latency reduction will be larger, since both waiting for the entire codeword to be received and starting the FEC decoder are avoided. The examples of 64 and 256 are example threshold values that are powers of 2 and are substantially shorter than the maximum codeword length. Again, looking at step 608, if the desired number of LLR samples has not yet
been reached, the process returns to step 604 and steps 604 and 606 are repeated for the next LLR value. Once the desired number of LLR samples have been processed, the statistics are stored (step 610).
[0073] In addition, the complete set of LLR values for the codeword are run through the FEC decoder 206 (step 612). In other words, the FEC decoder 206 attempts to decode the codeword using the LLR values output by the
demodulator 202 and soft metric computation function 204 for the codeword.
The CRC is checked to determine whether decoding was successful or not (step 613). The result of the decoding of the codeword (i.e., a success or failure) is stored (step 614). In this example, the result of the decoding of the received codeword is stored by the ML model training function 212. In addition, in some embodiments, one or more modulation and coding parameters (e.g., code rate, MCS index, or any other modulation and coding parameter(s) that affect the shape of the soft metric distributions) and/or one or more channel parameters (e.g., SNR, carrier frequency, fading characteristics, speed of the terminal, etc.) are stored in association with the decoding result and the LLR statistics.
[0074] The ML model training function 212 determines whether a desired number of exemplars for training the ML model have been obtained (step 616). More specifically, in some embodiments, an exemplar is the decoding result, the LLR statistics, and optionally one or more modulation and coding parameters and/or one or more channel parameters for a codeword. In some other embodiments, exemplars are stored only for those codewords for which FEC decoding was successful. The number of exemplars to collect depends on the ML model and training method used. The desired number of exemplars may, e.g., be predefined or preconfigured. If the desired number of exemplars have not been obtained, the process returns to step 600 and is repeated for a next codeword. Once the desired number of exemplars have been obtained, the ML model training function 212 trains the ML model using the obtained exemplars (step 618). The FEC decoding result is the target for the training. Any suitable ML training technique may be used.
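For illustration only, the exemplar collection of Figure 6 (steps 600 through 616) may be sketched as follows, reusing the OnlineLLRStats sketch given earlier; codeword_stream and decode_and_check_crc are hypothetical stand-ins for the actual receiver chain and for the FEC decoder 206 together with the CRC check function 207.

```python
def collect_exemplars(codeword_stream, decode_and_check_crc, n_exemplars, threshold=256):
    """Collect (LLR statistics, CRC result) training pairs per Figure 6.

    codeword_stream yields the LLR values of one codeword at a time;
    decode_and_check_crc(llrs) returns True on CRC success.
    """
    exemplars = []
    for llrs in codeword_stream:
        llrs = list(llrs)
        s = OnlineLLRStats()                     # step 602: reset the statistics
        for x in llrs[:threshold]:               # steps 604-608: update until threshold
            s.update(x)
        crc_ok = decode_and_check_crc(llrs)      # steps 612-613: full decode + CRC check
        exemplars.append((s.stats(), crc_ok))    # steps 610 and 614: store stats + result
        if len(exemplars) >= n_exemplars:        # step 616: enough exemplars collected?
            return exemplars
    return exemplars
```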
[0075] Optionally, in some embodiments, the ML model training function 212 determines whether additional training is needed (step 620). If not, the process is done. However, if additional training is needed, the process returns to step 600 and is repeated. This process continues until a stopping criterion is met (e.g., error gradient within a certain limit, weight updates sufficiently small, or a maximum number of iterations reached). Note that the ML model training may be performed partially or completely online (i.e., during the reception of actual data). The ML model is now ready for use.
[0076] Figure 6 is one example process for training the ML model. Now, the discussion turns to another example process for training the ML model. As discussed above, the example training process described below reduces the amount of data that must be stored. However, before describing this training process, some background information regarding training of a ML model is beneficial.
[0077] The training of many ML models can be formulated as the minimization of a cost function of the form

Q(w) = (1/N) ∑i=1..N Qi(w),

where training estimates parameters w that minimize the cost function Q(w).
The term Qi(w) is typically associated with the i-th training example in the data set. Depending on the ML model used, the parameters w can be a vector (e.g., linear regressor), one or more matrices (e.g., neural networks), or set splitting thresholds (e.g., tree-based classifiers). In supervised learning, Qi(w) is computed as a difference between the prediction produced by the ML model (e.g., neural network, tree, linear/logistic regressor) based on the present input features and the known desired output for each training example. The objective function varies depending on the problem. Common objective functions include mean square error and absolute error typically used in regression problems, and variations of cross-entropy for classification problems.
[0078] The training of the ML model is done by updating the parameters w with respect to the objective function Q(w). A common method is gradient descent:

w' = w - α∇Q(w) = w - (α/N) ∑i=1..N ∇Qi(w),

where α is the learning rate (step size) and ∇Qi(w) is the gradient of the i-th training example with respect to the parameters w.
[0079] The training iterates over a forward pass (also referred to as “forward propagation” or a “forward propagation pass”) and a backward pass (also referred to as “backward propagation” or a “backward propagation pass”). During the forward pass, a prediction is generated for the current parameter values w, and the cost function is computed. During the backward pass, the error gradient is propagated backwards, and the parameters w are updated. The algorithm is thus commonly referred to as “back-propagation” or “back-prop.” For example, when using a neural network, during the forward pass or forward propagation, data at the inputs of the neural network is propagated forward: for each layer, the inputs from the previous layer are multiplied by weights, summed, and passed through a non-linear function, commonly a sigmoid function, hyperbolic tangent, or rectified linear unit. At the output layer, the neural network produces an output, an opinion of what the target value should be given the current inputs. During training, the difference/error is used to update the weights of the neural network. Each weight is updated depending on how much it contributed to the error. To do this, the error gradient/derivative is calculated with respect to the weight under consideration. These gradients can be efficiently calculated from the output backwards towards the input, hence the name backward propagation, or back-prop for short.
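By way of illustration only, a single forward/backward iteration for a small one-hidden-layer network mapping the four LLR statistics to a decode-success probability may be sketched as follows; the layer sizes, initialization, and learning rate are arbitrary choices made for the sketch.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1, b1 = rng.normal(0.0, 0.1, (8, 4)), np.zeros(8)  # hidden layer: 4 statistics -> 8 units
W2, b2 = rng.normal(0.0, 0.1, (1, 8)), np.zeros(1)  # output layer: decode-success probability
alpha = 0.01                                        # learning rate (step size)

def train_step(x, t):
    """One forward pass and one backward (back-prop) pass for a single exemplar.

    x: the four LLR statistics; t: the CRC result (1 = success, 0 = failure).
    """
    global W1, b1, W2, b2
    # Forward pass: weight, sum, and squash at each layer.
    h = sigmoid(W1 @ x + b1)
    y = sigmoid(W2 @ h + b2)
    # Backward pass: with a cross-entropy cost and a sigmoid output, the
    # output error gradient is simply (y - t); it is propagated backwards.
    delta2 = y - t
    delta1 = (W2.T @ delta2) * h * (1.0 - h)
    # Each weight is updated in proportion to its contribution to the error.
    W2 -= alpha * np.outer(delta2, h); b2 -= alpha * delta2
    W1 -= alpha * np.outer(delta1, x); b1 -= alpha * delta1
    return float(y[0])
```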
[0080] The available set of exemplars is usually divided into a training set (70% - 85% of the exemplars) used for training (i.e., updating the parameters w), a validation set (~15%, if used) used to choose between ML model meta parameters and/or to stop the training, and a test set (15% - 30%) used to assess the performance of the trained ML model.
[0081] Training continues until some training goal has been achieved, e.g., small enough updates of the parameters w, a small enough gradient (equivalent to the first criterion up to a scaling factor), a maximum number of training iterations performed, or the performance on the validation set starting to decrease (indicating risk of over-training). Over-training occurs when the model parameters w fit the training examples “too well” at the cost of bad generalization, i.e., at the cost of poor ability to predict the correct output for hitherto unseen inputs.
[0082] Training methods can be categorized depending on how the training data, the set of training examples, are used to compute the gradient.
[0083] In full batch gradient descent (or batch gradient descent), the gradient is computed over the entire dataset of N samples, averaging over potentially a vast amount of information. This requires a large amount of memory to store the entire data set.
[0084] In pure Stochastic Gradient Descent (SGD), the gradient is computed on a single instance of the dataset, and the parameters are updated based on this. That is, instead of using the average ∇Q(w) in the equation above, we update on each ∇Q_i(w). Since no averaging takes place, SGD does not converge to a single point but jitters around a mean. If the learning rate is sufficiently low, or adaptively reduced, this jitter does not impact prediction performance. The jitter can even be beneficial for escaping from local minima or saddle points. However, pure SGD is also inefficient and may require many training examples (e.g., by looping over the training set multiple times) to find a good solution.
[0085] Mini-batch gradient descent is a compromise where a random subset of the training set is used to compute the gradient and update the parameters w. The size of the mini-batch may be on the order of a few hundred to a few thousand. The mini-batch method injects enough noise into each gradient update to avoid local minima or saddle points, while achieving relatively speedy convergence. Often, the term SGD is used to mean mini-batch gradient descent.
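For illustration, the three variants differ only in which examples each parameter update averages over; a minimal Python sketch, with assumed helper signatures, is:

```python
import numpy as np

def gradient_step(w, batch, grad_fn, alpha=0.01):
    # grad_fn(w, example) is assumed to return the per-example
    # gradient. Averaging over the given batch covers full batch
    # (the whole set), pure SGD (one example), and mini-batch
    # (a random subset) with the same update rule.
    g = np.mean([grad_fn(w, ex) for ex in batch], axis=0)
    return w - alpha * g

def sample_minibatch(dataset, rng, size=256):
    # Mini-batch: a random subset of a few hundred examples.
    idx = rng.choice(len(dataset), size=size, replace=False)
    return [dataset[i] for i in idx]
```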
[0086] In the context of the present disclosure, the entire data set would be all codewords ever transmitted. This is clearly not practically realizable since it
would require infinite storage and incur infinite delays. In this context, full batch means a set so large that adding further training exemplars only changes the average gradient in a negligible way.
[0087] Here, mini-batch corresponds to a smaller set, a few hundred to a few thousand training exemplars. The size is such that, by adding more exemplars to the mini-batch, the gradient may change in a non-negligible way.
[0088] Two example ways of performing mini-batch processing (or similar processing) are as follows:
1. Fixed batch. Store N training examples in a matrix. Each training example consists of the input features (LLR statistics, code and modulation parameters, channel parameters) and the target value (i.e., the FEC decoding result). After collecting N samples, compute the gradient and update the parameters w. Since all the exemplars are stored, a new prediction and error gradient can be computed for each exemplar. Thus, the data can be utilized for multiple updates of the parameters w. When the desired number of parameter updates has been done, the matrix is emptied, and N new samples are collected. This is repeated until some training goal has been achieved. Note that this is somewhat different from a pure mini-batch method, since the standard mini-batch method picks a random mini-batch from a large fixed training set, whereas this always uses new samples. Figure 7 illustrates storage for the fixed batch method: a matrix storing N training exemplars, where each training exemplar consists of n input features and m output targets. For this example, m = 1 since we only use CRC success; however, m can, in general, be any number. The total storage is N x (n + m). This is the storage required for both “fixed batch” and “sliding window batch.”
2. Sliding window batch. Again, collect N training examples in a matrix, compute the gradient, and update the parameters. However, instead of discarding the entire collection of N examples, only the oldest example is discarded and replaced by one new training example. Since all the input features and target values have been stored, the gradient can be recomputed for the updated parameters w after each replacement. Thus, in steady state, every training exemplar is used N times. In the method above, the number of times an exemplar is used is a parameter for tuning. During the initial fill of the empty matrix, processing can either wait N codewords until the matrix is filled, or compute the gradient and update the parameters w during the initial fill-up. In the first case, the first exemplar will only be used once, which is not an issue since in an operating system there will be an ample amount of training data. In the second case, the first gradients will be computed on very few exemplars and will be noisy. As the matrix fills up, the gradients will quickly become less noisy. The sliding window can be implemented as a circular buffer where the new exemplar overwrites the oldest, to avoid unnecessary data reads/writes. Figure 8 illustrates storage and pointer operations for the circular buffer implementation of the sliding window batch processing.
The new exemplar is written into the matrix at the address indicated by the write pointer. When the data has been written, the pointer is incremented one step. When the pointer reaches the end of the matrix, it is reset to 1. If 0, ..., N-1 numbering is used instead, this becomes a modulo-N increment, p := (p + 1) mod N. A sketch of such a circular buffer is given after this list.
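The following is a minimal Python sketch of such a circular buffer, assuming each exemplar is stored as a row of n input features and m targets; the class and method names are illustrative only.

```python
import numpy as np

class SlidingWindowBatch:
    # Circular buffer holding the N most recent training exemplars.
    def __init__(self, N, n, m):
        self.data = np.zeros((N, n + m))
        self.p = 0       # write pointer, 0 .. N-1
        self.count = 0   # rows filled so far (during initial fill)

    def add(self, features, targets):
        # The new exemplar overwrites the oldest one.
        self.data[self.p, :] = np.concatenate([features, targets])
        self.p = (self.p + 1) % len(self.data)  # modulo-N increment
        self.count = min(self.count + 1, len(self.data))

    def batch(self):
        # During the initial fill, only the first `count` rows are valid.
        return self.data[:self.count]
```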
[0089] In both fixed batch and sliding window batch processing described above, N training examples must be stored in a matrix. If N and/or the number of features is large, the matrix requires substantial memory. A new ML model training method is described below that avoids storing the training examples.
This training method is referred to herein as an “accumulated gradient” ML model training method, which uses an accumulated gradient approach to compute mini-batch gradients.
[0090] Figure 9 illustrates computation and storage requirements for the accumulated gradient method described below with respect to Figure 10. Here, the storage requirement is 1 x n, which is substantially less than for either the fixed batch or sliding window batch processes described above. Again, n is the number of input features, where the input features are denoted as “i1” through “in” in Figure 9, and m is the number of output targets, where the output targets are denoted as “o1” through “om” in Figure 9. The computed gradients are denoted as “g1” through “gm”; for these, the storage requirement is 1 x m, since the output data itself does not need to be stored. The n input features are passed through the ML model (e.g., neural network), and a prediction is generated (NN forward pass). The NN output is compared to the target value(s) o1 - om, and the error gradient is computed. The gradient for the single training example is then added to the accumulated gradient mean. The reason that the topmost arrow in Figure 9 is bidirectional is that the running, or accumulated, mean is used in the computation when a new training sample is added.
[0091] Figure 10 illustrates a process for training the ML model using the accumulated gradient scheme in accordance with some embodiments of the present disclosure. This process is described with respect to the demodulation and decoding system 200 of Figure 1; however, the present disclosure is not limited thereto. As illustrated, the ML model training function 212 resets, or initializes, the accumulated gradient to zero (step 1000). A codeword is received by the radio node and processed by the demodulator 202 and the soft metric computation function 204 (step 1002). As the codeword is being received, soft metrics, which are LLR values, are output by the soft metric computation function 204. Using the LLR values, the statistics calculator 208 computes statistics for the LLR values of the codeword, as described above (step 1004). More specifically, as the codeword is received, the statistics calculator 208 collects, or obtains, the LLR values and updates the statistics based on the LLR values, as described above. For example, using the process for updating the moments m, M2, M3, and M4 described above, the statistics calculator 208 uses each LLR value (referred to as the LLR value n in this context) to update the moments m, M2, M3, and M4 and updates the mean, variance, skewness, and kurtosis values based on the updated moments m, M2, M3, and M4. Once the desired number of LLR values have been received and used to update the statistics, the statistics for the LLR values of the codeword are stored.
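For illustration, such a streaming update of the four statistics can be realized with standard one-pass central-moment recursions (Welford/Pebay style); the exact update formulas used by the statistics calculator 208 are those given earlier in this disclosure, and the Python sketch below is only an assumed equivalent.

```python
import math

class LlrStatistics:
    # Online mean/variance/skewness/kurtosis from a stream of LLRs;
    # the LLR values themselves never need to be stored.
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.M2 = self.M3 = self.M4 = 0.0

    def update(self, x):
        n1 = self.n
        self.n += 1
        delta = x - self.mean
        d_n = delta / self.n
        term1 = delta * d_n * n1
        self.mean += d_n
        self.M4 += (term1 * d_n * d_n * (self.n * self.n - 3 * self.n + 3)
                    + 6 * d_n * d_n * self.M2 - 4 * d_n * self.M3)
        self.M3 += term1 * d_n * (self.n - 2) - 3 * d_n * self.M2
        self.M2 += term1

    def features(self):
        # Valid once a few samples have been collected (M2 > 0).
        var = self.M2 / self.n
        skew = math.sqrt(self.n) * self.M3 / self.M2 ** 1.5
        kurt = self.n * self.M4 / (self.M2 * self.M2)
        return [self.mean, var, skew, kurt]
```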
[0092] In addition, the FEC decoder 206 attempts to decode the codeword using the complete set of LLR values for the codeword (step 1006). A CRC is performed to determine whether decoding was successful or not (i.e., to determine the result of the FEC decoding). The result of the decoding of the codeword (i.e., a success or failure) is stored. In this example, the result of the decoding of the received codeword is stored by the ML model training function 212. In addition, in some embodiments, one or more modulation and coding parameters (e.g., code rate, MCS index, or any other modulation and coding parameter(s) that affect the shape of the soft metric distributions) and/or one or more channel parameters (e.g., SNR, carrier frequency, fading characteristics, speed of the terminal, etc.) are stored in association with the decoding result and the LLR statistics.
[0093] The ML model training function 212 computes a prediction (i.e., a forward pass for the training) using the current parameters w of the ML model, given the features (i.e., the inputs of the ML model, e.g., the LLR statistics, modulation and coding parameters, and channel parameters) corresponding to the codeword, as described above (step 1008). The ML model training function 212 then computes an error gradient using the prediction, the target data (i.e., the result of the decoding of the codeword), and the formula (i.e., Equation (1)) given above for the successive computation of the mean (step 1010). Equation (1) shows how to compute the average gradient given a number of samples. Here, the formula in Equation (1) can also be used for computing running means m' such that the training exemplars do not need to be stored. Regarding the computation of the error gradient, again, the cost function (a.k.a. objective function, error function) computes the difference between the prediction made by the ML model and the true target value. The prediction is a function of the current parameters of the ML model. The error gradient is computed by taking the derivative of the cost function with respect to the parameters, which are often referred to as “weights.” Then, if we update the
weights in the negative direction of this gradient, the error will diminish. The error gradient is computed for each training example. Looking at the equations above, the error gradient is ∇Q_i(w), where, by the chain rule,

∇Q_i(w) = ∂Q_i/∂w = (∂Q_i/∂a) · (∂a/∂w),

where Q_i is the cost function for the i-th training example given the current input parameters and a is the output of the ML model for the current input. The ML model training function 212 then updates the accumulated gradient using the computed error gradient (step 1012). For example, the running mean can be computed.
[0094] The ML model training function 212 determines whether a desired number of codewords have been received and processed for training the ML model (step 1014). The desired number of codewords is a number of codewords that corresponds to the size of a mini-batch. If not, the process returns to step 1002 and is repeated for the next codeword. Once the desired number of codewords have been received and processed for training the ML model, the ML model training function 212 updates the parameters w of the ML model with a backward pass (step 1016). The details of the backward pass (also referred to as “back prop” or “backward propagation”) are well known to those of ordinary skill in the art and, as such, are not repeated herein. In general, the backward pass updates the weights of each layer in the ML model based on the error gradient for that specific layer, where the gradient is computed as the derivative of the cost function with respect to the weights for the layer under consideration. This makes use of the chain rule for derivatives. The process then returns to step 1000 and is repeated, e.g., until some stopping criterion is met. One advantage of the accumulated gradient method is that the radio node only has to store the accumulated gradient.
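A compact Python sketch of steps 1000 through 1016 follows; `model` and `receive_codeword` are assumed interfaces (a per-example gradient computation and a stream of (features, decoding result) pairs) introduced only for this example.

```python
import numpy as np

def train_accumulated_gradient(model, receive_codeword,
                               batch_size=512, alpha=0.01, num_batches=100):
    for _ in range(num_batches):
        acc = np.zeros_like(model.w)        # step 1000: reset accumulated gradient
        for k in range(1, batch_size + 1):  # steps 1002-1014
            features, target = receive_codeword()
            g = model.gradient(features, target)  # forward pass + error gradient
            acc += (g - acc) / k            # running mean; no exemplar is stored
        model.w = model.w - alpha * acc     # step 1016: parameter update
```

Note that only `acc`, a vector the size of the parameters w, persists between codewords.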
[0095] Note that any suitable stopping criterion may be used for ML training. As one example, the stopping criterion is a maximum number of training epochs or the training error falling below some threshold. As another example, validation may be used to determine a prediction error rate, where training stops based on the prediction error rate. As one specific example, after a given number of training epochs, the ML model is used to make predictions for a number of codewords, and the predictions are compared to the decoded codewords to determine a prediction error rate. The last one (or more) prediction error rates are stored. When the prediction error rate starts to increase, training is stopped, since an increase in the prediction error rate indicates that over-training of the ML model is starting. In some embodiments, the ML model weights from the last prediction error rate computation are restored when the prediction error rate starts to increase.
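A minimal sketch of such a validation-based stopping rule, assuming a stored history of measured prediction error rates, might be:

```python
def should_stop(error_history, patience=1):
    # Stop once the validation prediction error rate has increased
    # for `patience` consecutive measurements (onset of over-training).
    if len(error_history) <= patience:
        return False
    recent = error_history[-(patience + 1):]
    return all(recent[i] < recent[i + 1] for i in range(patience))
```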
[0096] Note that since the error gradient depends on the training data and the parameters w, the accumulated gradient cannot be used once the parameters w have been updated. Thus, the accumulated gradient must be reset after each update of the parameters w. Consequently, each training example is only used once, and reuse, as in the mini-batch cases above, is not possible. However, in the context of a cellular communications system, training data is abundant, and this should not be an issue.
[0097] The discussion of Figures 6 through 10 relates to the training of the ML model. Figure 11 is a flow chart that illustrates the use of the trained ML model to predict the performance of the FEC decoder 206 during operation, in accordance with some embodiments of the present disclosure. The process of Figure 11 can be viewed as one example of an implementation of the process of Figure 3. In general, during reception of a codeword, when a sufficient number of LLR values have been collected, the LLR statistics, along with, in this example, the modulation and coding parameter(s) and the channel parameter(s), are fed into the ML model to make a prediction of the result of FEC decoding of the codeword. This prediction can be done even before the full codeword has been received (i.e., before all LLR values for the codeword are computed). Then, based on the prediction, one or more HARQ related tasks can be performed even before the FEC decoding begins.
[0098] In the process of Figure 11, a codeword is received by the radio node and processed by the demodulator 202 and the soft metric computation function 204. As the codeword is being received, soft metrics, which are LLR values, are output by the soft metric computation function 204. Initially, the statistics are reset or initialized to some initial value (step 1100). As the codeword is received, the statistics calculator 208 collects, or obtains, an LLR value and updates the statistics based on the LLR value, as described above (steps 1102 and 1104). For example, using the process for updating the moments m, M2, M3, and M4 described above, the statistics calculator 208 uses the LLR value (referred to as the LLR value n in this context) to update the moments m, M2, M3, and M4 and updates the mean, variance, skewness, and kurtosis values based on the updated moments m, M2, M3, and M4. The statistics calculator 208 determines whether the number of LLR values collected for the codeword has reached a predefined or preconfigured threshold number of samples (step 1106). Preferably, the threshold number of LLR samples is less than the total number of LLR values to be output by the soft metric computation function 204 for the codeword. If the desired number of LLR samples has not yet been reached, the process returns to step 1102, and steps 1102 and 1104 are repeated for the next LLR value. Once the desired number of LLR samples have been processed, the statistics are stored.
[0099] The ML predictor 210 inputs the computed LLR statistics for the codeword, together with the modulation and coding scheme parameter(s) and the channel parameter(s), into the trained ML model to make a prediction as to whether the FEC decoder 206 will successfully decode the codeword (step 1108). Importantly, this prediction is made before FEC decoding is complete and may be made even before all LLR values for the codeword are computed, and thus before FEC decoding begins.
[0100] If the prediction is that FEC decoding will be successful (step 1110, YES), the radio node either sends a preemptive HARQ ACK or does nothing (i.e., waits until FEC decoding is complete to send either a HARQ ACK or HARQ NACK depending on the actual result of FEC decoding) (step 1112). If a HARQ ACK is preemptively sent, this HARQ ACK is sent before FEC decoding completes and may even be sent before FEC decoding begins. If the prediction is that FEC decoding will not be successful (step 1110, NO), the radio node either sends a preemptive HARQ NACK or preemptively requests a retransmission (step 1114). This HARQ NACK or retransmission request is sent preemptively, before FEC decoding completes, and may even be sent before FEC decoding begins. After step 1112 or 1114, the process returns to step 1100 and is repeated for the next codeword.
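For illustration, steps 1108 through 1114 could be sketched as follows in Python; `predictor`, `send_harq_ack`, and `send_harq_nack` are assumed interfaces introduced for this example, not part of the disclosure.

```python
def harq_decision(llr_stats, mcs_params, channel_params,
                  predictor, send_harq_ack, send_harq_nack):
    # Step 1108: feed the partial-codeword LLR statistics plus the
    # modulation/coding and channel parameters to the trained model.
    features = list(llr_stats) + list(mcs_params) + list(channel_params)
    will_decode = predictor.predict(features)
    if will_decode:
        # Step 1112: preemptive HARQ ACK (or wait for the actual CRC).
        send_harq_ack()
    else:
        # Step 1114: preemptive HARQ NACK / retransmission request.
        send_harq_nack()
```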
[0101] Note that, in some embodiments, at least some aspects of the present disclosure may be implemented in the cloud. For example, there are example cloud implementations of a Turbo decoder. As such, in some embodiments, the ML predictor 210 and, optionally, the ML model training function 212 are implemented in the cloud.
[0102] In a scenario where ML training is performed while the system is in operation, it could be useful to implement the training (i.e., the ML model training function 212) in the cloud, e.g., to reduce the computational load on the unit containing the receiver. This requires an interface between the receiver and the ML trainer, where training data is fed from the receiver to the cloud-based trainer, and new prediction parameters are fed back from the ML trainer to the receiver. Such an interface/protocol needs to strike a balance between flexibility and efficiency. It must be flexible enough to allow description of various kinds of training data, either single examples or batches of training examples, and various kinds of ML models and corresponding parameters. However, data transferred over the training protocol competes for resources with the communication fronthaul/backhaul, so the amount must be kept reasonable. In this scenario, the ML predictor 210 is implemented in the receiver.
[0103] Figure 12 is a schematic block diagram of a radio access node 1200 according to some embodiments of the present disclosure. The radio access node 1200 may be, for example, a base station 102 or 106. As illustrated, the radio access node 1200 includes a control system 1202 that includes one or more processors 1204 (e.g., Central Processing Units (CPUs), ASICs, Field Programmable Gate Arrays (FPGAs), and/or the like), memory 1206, and a network interface 1208. The one or more processors 1204 are also referred to herein as processing circuitry. In addition, the radio access node 1200 includes one or more radio units 1210 that each includes one or more transmitters 1212 and one or more receivers 1214 coupled to one or more antennas 1216. The
radio units 1210 may be referred to or be part of radio interface circuitry. In some embodiments, the radio unit(s) 1210 is external to the control system 1202 and connected to the control system 1202 via, e.g., a wired connection (e.g., an optical cable). However, in some other embodiments, the radio unit(s) 1210 and potentially the antenna(s) 1216 are integrated together with the control system 1202. The one or more processors 1204 operate to provide one or more functions of a radio access node 1200 as described herein. In some
embodiments, the statistics calculator 208, the ML predictor 210 and, in some embodiments, the ML model training function 212 are implemented in hardware or a combination of hardware and software in the control system 1202 or distributed across the control system 1202 and the radio unit(s) 1210. As an example, at least some aspects of the statistics calculator 208, the ML predictor 210 and, in some embodiments, the ML model training function 212 are implemented in software that is stored, e.g., in the memory 1206 and executed by the one or more processors 1204.
[0104] Figure 13 is a schematic block diagram that illustrates a virtualized embodiment of the radio access node 1200 according to some embodiments of the present disclosure. This discussion is equally applicable to other types of network nodes. Further, other types of network nodes may have similar virtualized architectures.
[0105] As used herein, a “virtualized” radio access node is an implementation of the radio access node 1200 in which at least a portion of the functionality of the radio access node 1200 is implemented as a virtual component(s) (e.g., via a virtual machine(s) executing on a physical processing node(s) in a network(s)).
As illustrated, in this example, the radio access node 1200 includes the control system 1202 that includes the one or more processors 1204 (e.g., CPUs, ASICs, FPGAs, and/or the like), the memory 1206, and the network interface 1208 and the one or more radio units 1210 that each includes the one or more transmitters 1212 and the one or more receivers 1214 coupled to the one or more antennas 1216, as described above. The control system 1202 is connected to the radio unit(s) 1210 via, for example, an optical cable or the like. The control system 1202 is connected to one or more processing nodes 1300 coupled to or included
as part of a network(s) 1302 via the network interface 1208. Each processing node 1300 includes one or more processors 1304 (e.g., CPUs, ASICs, FPGAs, and/or the like), memory 1306, and a network interface 1308.
[0106] In this example, functions 1310 of the radio access node 1200 described herein are implemented at the one or more processing nodes 1300 or distributed across the control system 1202 and the one or more processing nodes 1300 in any desired manner. In some particular embodiments, at least some aspects of the statistics calculator 208, ML predictor 210 and, in some embodiments, the ML model training function 212 described herein are
implemented as virtual components executed by one or more virtual machines implemented in a virtual environment(s) hosted by the processing node(s) 1300. As will be appreciated by one of ordinary skill in the art, additional signaling or communication between the processing node(s) 1300 and the control system 1202 is used in order to carry out at least some of the desired functions 1310. Notably, in some embodiments, the control system 1202 may not be included, in which case the radio unit(s) 1210 communicate directly with the processing node(s) 1300 via an appropriate network interface(s).
[0107] In some embodiments, a computer program including instructions which, when executed by at least one processor, causes the at least one processor to carry out the functionality of radio access node 1200 or a node (e.g., a processing node 1300) implementing one or more of the functions 1310 of the radio access node 1200 in a virtual environment according to any of the embodiments described herein is provided. In some embodiments, a carrier comprising the aforementioned computer program product is provided. The carrier is one of an electronic signal, an optical signal, a radio signal, or a computer readable storage medium (e.g., a non-transitory computer readable medium such as memory).
[0108] Figure 14 is a schematic block diagram of the radio access node 1200 according to some other embodiments of the present disclosure. The radio access node 1200 includes one or more modules 1400, each of which is implemented in software. The module(s) 1400 provide the functionality of the
radio access node 1200 described herein. This discussion is equally applicable to the processing node 1300 of Figure 13, where the modules 1400 may be implemented at one of the processing nodes 1300 or distributed across multiple processing nodes 1300 and/or distributed across the processing node(s) 1300 and the control system 1202. In some embodiments, the modules 1400 include an obtaining module operable to obtain soft metrics for at least a portion of a codeword, a determining module operable to determine one or more statistics based on the obtained soft metrics, a prediction module operable to make a prediction using an ML model as to whether the codeword will be successfully decoded by the FEC decoder, and a performing module operable to perform (e.g., initiate) one or more HARQ related tasks based on the prediction, as described above. In addition, the modules 1400 may include an ML model training module operable to train the ML model, as described above.
[0109] Figure 15 is a schematic block diagram of a UE 1500 according to some embodiments of the present disclosure. As illustrated, the UE 1500 includes one or more processors 1502 (e.g., CPUs, ASICs, FPGAs, and/or the like), memory 1504, and one or more transceivers 1506 each including one or more transmitters 1508 and one or more receivers 1510 coupled to one or more antennas 1512. The processors 1502 are also referred to herein as processing circuitry. The transceivers 1506 are also referred to herein as radio circuitry. In some embodiments, the statistics calculator 208, the ML predictor 210 and, in some embodiments, the ML model training function 212 are implemented in hardware or a combination of hardware and software in the processor(s) 1502 or distributed across the processor(s) 1502 and the receiver(s) 1510. As an example, at least some aspects of the statistics calculator 208, the ML predictor 210 and, in some
embodiments, the ML model training function 212 are implemented in software that is, e.g., stored in the memory 1504 and executed by the processor(s) 1502. Note that the UE 1500 may include additional components not illustrated in Figure 15 such as, e.g., one or more user interface components (e.g., a display, buttons, a touch screen, a microphone, a speaker(s), and/or the like), a power supply (e.g., a battery and associated power circuitry), etc.
[0110] In some embodiments, a computer program including instructions which, when executed by at least one processor, causes the at least one processor to carry out the functionality of the UE 1500 according to any of the embodiments described herein is provided. In some embodiments, a carrier comprising the aforementioned computer program product is provided. The carrier is one of an electronic signal, an optical signal, a radio signal, or a computer readable storage medium (e.g., a non-transitory computer readable medium such as memory).
[0111] Figure 16 is a schematic block diagram of the UE 1500 according to some other embodiments of the present disclosure. The UE 1500 includes one or more modules 1600, each of which is implemented in software. The module(s) 1600 provide the functionality of the UE 1500 described herein. In some embodiments, the modules 1600 include an obtaining module operable to obtain soft metrics for at least a portion of a codeword, a determining module operable to determine one or more statistics based on the obtained soft metrics, a prediction module operable to make a prediction using an ML model as to whether the codeword will be successfully decoded by the FEC decoder, and a performing module operable to perform (e.g., initiate) one or more HARQ related tasks based on the prediction, as described above. In addition, the modules 1600 may include an ML model training module operable to train the ML model, as described above.
[0112] With reference to Figure 17, in accordance with an embodiment, a communication system includes a telecommunication network 1700, such as a 3GPP-type cellular network, which comprises an access network 1702, such as a RAN, and a core network 1704. The access network 1702 comprises a plurality of base stations 1706A, 1706B, 1706C, such as NBs, eNBs, gNBs, or other types of wireless Access Points (APs), each defining a corresponding coverage area 1708A, 1708B, 1708C. Each base station 1706A, 1706B, 1706C is connectable to the core network 1704 over a wired or wireless connection 1710. A first UE 1712 located in coverage area 1708C is configured to wirelessly connect to, or be paged by, the corresponding base station 1706C. A second UE 1714 in
coverage area 1708A is wirelessly connectable to the corresponding base station 1706A. While a plurality of UEs 1712, 1714 are illustrated in this example, the disclosed embodiments are equally applicable to a situation where a sole UE is in the coverage area or where a sole UE is connecting to the corresponding base station 1706.
[0113] The telecommunication network 1700 is itself connected to a host computer 1716, which may be embodied in the hardware and/or software of a standalone server, a cloud-implemented server, a distributed server, or as processing resources in a server farm. The host computer 1716 may be under the ownership or control of a service provider, or may be operated by the service provider or on behalf of the service provider. Connections 1718 and 1720 between the telecommunication network 1700 and the host computer 1716 may extend directly from the core network 1704 to the host computer 1716 or may go via an optional intermediate network 1722. The intermediate network 1722 may be one of, or a combination of more than one of, a public, private, or hosted network; the intermediate network 1722, if any, may be a backbone network or the Internet; in particular, the intermediate network 1722 may comprise two or more sub-networks (not shown).
[0114] The communication system of Figure 17 as a whole enables connectivity between the connected UEs 1712, 1714 and the host computer 1716. The connectivity may be described as an Over-the-Top (OTT) connection 1724. The host computer 1716 and the connected UEs 1712, 1714 are configured to communicate data and/or signaling via the OTT connection 1724, using the access network 1702, the core network 1704, any intermediate network 1722, and possible further infrastructure (not shown) as intermediaries. The OTT connection 1724 may be transparent in the sense that the participating communication devices through which the OTT connection 1724 passes are unaware of routing of uplink and downlink communications. For example, the base station 1706 may not or need not be informed about the past routing of an incoming downlink communication with data originating from the host computer 1716 to be forwarded (e.g., handed over) to a connected UE 1712. Similarly, the
base station 1706 need not be aware of the future routing of an outgoing uplink communication originating from the UE 1712 towards the host computer 1716.
[0115] Example implementations, in accordance with an embodiment, of the UE, base station, and host computer discussed in the preceding paragraphs will now be described with reference to Figure 18. In a communication system 1800, a host computer 1802 comprises hardware 1804 including a communication interface 1806 configured to set up and maintain a wired or wireless connection with an interface of a different communication device of the communication system 1800. The host computer 1802 further comprises processing circuitry 1808, which may have storage and/or processing capabilities. In particular, the processing circuitry 1808 may comprise one or more programmable processors, ASICs, FPGAs, or combinations of these (not shown) adapted to execute instructions. The host computer 1802 further comprises software 1810, which is stored in or accessible by the host computer 1802 and executable by the processing circuitry 1808. The software 1810 includes a host application 1812. The host application 1812 may be operable to provide a service to a remote user, such as a UE 1814 connecting via an OTT connection 1816 terminating at the UE 1814 and the host computer 1802. In providing the service to the remote user, the host application 1812 may provide user data which is transmitted using the OTT connection 1816.
[0116] The communication system 1800 further includes a base station 1818 provided in a telecommunication system and comprising hardware 1820 enabling it to communicate with the host computer 1802 and with the UE 1814. The hardware 1820 may include a communication interface 1822 for setting up and maintaining a wired or wireless connection with an interface of a different communication device of the communication system 1800, as well as a radio interface 1824 for setting up and maintaining at least a wireless connection 1826 with the UE 1814 located in a coverage area (not shown in Figure 18) served by the base station 1818. The communication interface 1822 may be configured to facilitate a connection 1828 to the host computer 1802. The connection 1828 may be direct or it may pass through a core network (not shown in Figure 18) of
the telecommunication system and/or through one or more intermediate networks outside the telecommunication system. In the embodiment shown, the hardware 1820 of the base station 1818 further includes processing circuitry 1830, which may comprise one or more programmable processors, ASICs, FPGAs, or combinations of these (not shown) adapted to execute instructions. The base station 1818 further has software 1832 stored internally or accessible via an external connection.
[0117] The communication system 1800 further includes the UE 1814 already referred to. The UE’s 1814 hardware 1834 may include a radio interface 1836 configured to set up and maintain a wireless connection 1826 with a base station serving a coverage area in which the UE 1814 is currently located. The hardware 1834 of the UE 1814 further includes processing circuitry 1838, which may comprise one or more programmable processors, ASICs, FPGAs, or combinations of these (not shown) adapted to execute instructions. The UE 1814 further comprises software 1840, which is stored in or accessible by the UE 1814 and executable by the processing circuitry 1838. The software 1840 includes a client application 1842. The client application 1842 may be operable to provide a service to a human or non-human user via the UE 1814, with the support of the host computer 1802. In the host computer 1802, the executing host application 1812 may communicate with the executing client application 1842 via the OTT connection 1816 terminating at the UE 1814 and the host computer 1802. In providing the service to the user, the client application 1842 may receive request data from the host application 1812 and provide user data in response to the request data. The OTT connection 1816 may transfer both the request data and the user data. The client application 1842 may interact with the user to generate the user data that it provides.
[0118] It is noted that the host computer 1802, the base station 1818, and the UE 1814 illustrated in Figure 18 may be similar or identical to the host computer 1716, one of the base stations 1706A, 1706B, 1706C, and one of the UEs 1712, 1714 of Figure 17, respectively. This is to say, the inner workings of these
entities may be as shown in Figure 18 and independently, the surrounding network topology may be that of Figure 17.
[0119] In Figure 18, the OTT connection 1816 has been drawn abstractly to illustrate the communication between the host computer 1802 and the UE 1814 via the base station 1818 without explicit reference to any intermediary devices and the precise routing of messages via these devices. The network
infrastructure may determine the routing, which may be configured to hide from the UE 1814 or from the service provider operating the host computer 1802, or both. While the OTT connection 1816 is active, the network infrastructure may further take decisions by which it dynamically changes the routing (e.g., on the basis of load balancing consideration or reconfiguration of the network).
[0120] The wireless connection 1826 between the UE 1814 and the base station 1818 is in accordance with the teachings of the embodiments described throughout this disclosure. One or more of the various embodiments improve the performance of OTT services provided to the UE 1814 using the OTT connection 1816, in which the wireless connection 1826 forms the last segment. More precisely, the teachings of these embodiments may improve latency and thereby provide benefits such as, e.g., reduced user waiting time and better
responsiveness.
[0121] A measurement procedure may be provided for the purpose of monitoring data rate, latency, and other factors on which the one or more embodiments improve. There may further be an optional network functionality for reconfiguring the OTT connection 1816 between the host computer 1802 and the UE 1814, in response to variations in the measurement results. The
measurement procedure and/or the network functionality for reconfiguring the OTT connection 1816 may be implemented in the software 1810 and the hardware 1804 of the host computer 1802 or in the software 1840 and the hardware 1834 of the UE 1814, or both. In some embodiments, sensors (not shown) may be deployed in or in association with communication devices through which the OTT connection 1816 passes; the sensors may participate in the measurement procedure by supplying values of the monitored quantities
exemplified above, or supplying values of other physical quantities from which the software 1810, 1840 may compute or estimate the monitored quantities. The reconfiguring of the OTT connection 1816 may include message format, retransmission settings, preferred routing, etc.; the reconfiguring need not affect the base station 1818, and it may be unknown or imperceptible to the base station 1818. Such procedures and functionalities may be known and practiced in the art. In certain embodiments, measurements may involve proprietary UE signaling facilitating the host computer 1802’s measurements of throughput, propagation times, latency, and the like. The measurements may be
implemented in that the software 1810 and 1840 causes messages to be transmitted, in particular empty or ‘dummy’ messages, using the OTT connection 1816 while it monitors propagation times, errors, etc.
[0122] Figure 19 is a flowchart illustrating a method implemented in a communication system, in accordance with one embodiment. The
communication system includes a host computer, a base station, and a UE which may be those described with reference to Figures 17 and 18. For simplicity of the present disclosure, only drawing references to Figure 19 will be included in this section. In step 1900, the host computer provides user data. In sub-step 1902 (which may be optional) of step 1900, the host computer provides the user data by executing a host application. In step 1904, the host computer initiates a transmission carrying the user data to the UE. In step 1906 (which may be optional), the base station transmits to the UE the user data which was carried in the transmission that the host computer initiated, in accordance with the teachings of the embodiments described throughout this disclosure. In step 1908 (which may also be optional), the UE executes a client application associated with the host application executed by the host computer.
[0123] Figure 20 is a flowchart illustrating a method implemented in a communication system, in accordance with one embodiment. The
communication system includes a host computer, a base station, and a UE which may be those described with reference to Figures 17 and 18. For simplicity of the present disclosure, only drawing references to Figure 20 will be included in
this section. In step 2000 of the method, the host computer provides user data.
In an optional sub-step (not shown) the host computer provides the user data by executing a host application. In step 2002, the host computer initiates a transmission carrying the user data to the UE. The transmission may pass via the base station, in accordance with the teachings of the embodiments described throughout this disclosure. In step 2004 (which may be optional), the UE receives the user data carried in the transmission.
[0124] Figure 21 is a flowchart illustrating a method implemented in a communication system, in accordance with one embodiment. The
communication system includes a host computer, a base station, and a UE which may be those described with reference to Figures 17 and 18. For simplicity of the present disclosure, only drawing references to Figure 21 will be included in this section. In step 2100 (which may be optional), the UE receives input data provided by the host computer. Additionally or alternatively, in step 2102, the UE provides user data. In sub-step 2104 (which may be optional) of step 2100, the UE provides the user data by executing a client application. In sub-step 2106 (which may be optional) of step 2102, the UE executes a client application which provides the user data in reaction to the received input data provided by the host computer. In providing the user data, the executed client application may further consider user input received from the user. Regardless of the specific manner in which the user data was provided, the UE initiates, in sub-step 2108 (which may be optional), transmission of the user data to the host computer. In step 2110 of the method, the host computer receives the user data transmitted from the UE, in accordance with the teachings of the embodiments described throughout this disclosure.
[0125] Figure 22 is a flowchart illustrating a method implemented in a communication system, in accordance with one embodiment. The
communication system includes a host computer, a base station, and a UE which may be those described with reference to Figures 17 and 18. For simplicity of the present disclosure, only drawing references to Figure 22 will be included in this section. In step 2200 (which may be optional), in accordance with the
teachings of the embodiments described throughout this disclosure, the base station receives user data from the UE. In step 2202 (which may be optional), the base station initiates transmission of the received user data to the host computer. In step 2204 (which may be optional), the host computer receives the user data carried in the transmission initiated by the base station.
[0126] While processes in the figures may show a particular order of operations performed by certain embodiments of the present disclosure, it should be understood that such order is exemplary (e.g., alternative embodiments may perform the operations in a different order, combine certain operations, overlap certain operations, etc.).
[0127] At least some of the following abbreviations may be used in this disclosure. If there is an inconsistency between an abbreviation explanation below and how the abbreviation is used above, preference should be given to the usage above. If an abbreviation is listed multiple times below, the first listing should be preferred over any subsequent listing(s).
• 16-QAM 16 symbol constellation Quadrature Amplitude Modulation
• 3GPP Third Generation Partnership Project
• 5G Fifth Generation
• ACK Acknowledgement
• AP Access Point
• ARQ Automatic Repeat Request
• ASIC Application Specific Integrated Circuit
• BCH Bose-Chaudhuri-Hocquenghem
• CPU Central Processing Unit
• CRC Cyclic Redundancy Check
• eNB Enhanced or Evolved Node B
• FEC Forward Error Correction
• FPGA Field Programmable Gate Array
• gNB New Radio Base Station
• HARQ Hybrid Automatic Repeat Request
• LDPC Low Density Parity Check
• LLR Log Likelihood Ratio
• LTE Long Term Evolution
• MCS Modulation and Coding Scheme
• ML Machine-Learning
• MME Mobility Management Entity
• MTC Machine Type Communication
• NACK Negative Acknowledgement
• NR New Radio
• OTT Over-the-Top
• P-GW Packet Data Network Gateway
• RRH Remote Radio Head
• SCEF Service Capability Exposure Function
• SGD Stochastic Gradient Descent
• SNR Signal to Noise Ratio
• UE User Equipment
• URLLC Ultra Reliable Low Latency Communication
[0128] Those skilled in the art will recognize improvements and modifications to the embodiments of the present disclosure. All such improvements and modifications are considered within the scope of the concepts disclosed herein.
Claims
1. A method in a radio node in a cellular communications system,
comprising:
obtaining (300; 1102 and 1106) a plurality of soft metrics for at least a portion of a codeword;
determining (302; 1104) one or more statistics based on the plurality of soft metrics;
making (304; 1108) a prediction as to whether the codeword will be successfully decoded by a Forward Error Correction, FEC, decoder (206) of the radio node using a Machine-Learning, ML, model, wherein the one or more statistics are provided as inputs to the ML model in order to make the prediction; and
performing (306; 1110-1114) a Hybrid Automatic Repeat Request, HARQ, related task based on the prediction.
2. The method of claim 1 wherein the one or more statistics and one or more additional parameters are provided as inputs to the ML model in order to make the prediction, the one or more additional parameters comprising one or more modulation and coding parameters and/or one or more channel parameters.
3. The method of claim 2 wherein the one or more additional parameters comprise the one or more modulation and coding parameters, and the one or more modulation and coding parameters comprise a code rate used for the codeword and/or a modulation index of a modulation and coding scheme used for the codeword.
4. The method of claim 2 or 3 wherein the one or more additional parameters comprise the one or more channel parameters, and the one or more channel parameters comprise a Signal to Noise Ratio, SNR, of a wireless channel on
which the codeword is received, carrier frequency of the wireless channel on which the codeword is received, fading characteristics of the wireless channel on which the codeword is received, and/or a speed of the radio node and/or a speed of a transmitter from which the codeword is received.
5. The method of any one of claims 1 to 4 wherein the one or more statistics comprise a mean of the plurality of soft metrics, a variance of the plurality of soft metrics, a skewness of the plurality of soft metrics, a kurtosis of the plurality of soft metrics, and/or one or more central moments of the plurality of soft metrics.
6. The method of any one of claims 1 to 5 wherein the plurality of soft metrics is a plurality of Log Likelihood Ratio, LLR, values.
7. The method of any one of claims 1 to 6 wherein the prediction is that the codeword will not be successfully decoded by the FEC decoder (206) of the radio node, and performing the HARQ related task comprises sending, before decoding of the codeword by the FEC decoder (206) is complete, a HARQ retransmission request to a transmit node that transmitted the codeword.
8. The method of any one of claims 1 to 6 wherein the prediction is that the codeword will not be successfully decoded by the FEC decoder (206) of the radio node, and performing the HARQ related task comprises sending, before decoding of the codeword by the FEC decoder (206) is complete, a Negative Acknowledgement, NACK, to a transmit node that transmitted the codeword.
9. The method of any one of claims 1 to 6 wherein the prediction is that the codeword will be successfully decoded by the FEC decoder (206) of the radio node, and performing the HARQ related task comprises waiting until decoding of the codeword by the FEC decoder (206) is complete before sending an
Acknowledgement, ACK, or Negative Acknowledgement, NACK, to a transmit node that transmitted the codeword.
10. The method of any one of claims 1 to 6 wherein the prediction is that the codeword will be successfully decoded by the FEC decoder (206) of the radio node, and performing the HARQ related task comprises sending, before decoding of the codeword by the FEC decoder (206) is complete, an
Acknowledgement, ACK, to a transmit node that transmitted the codeword.
11. The method of any one of claims 1 to 10 wherein the FEC decoder (206) is a Turbo decoder, a Low Density Parity Check, LDPC, decoder, or a Polar decoder.
12. The method of any one of claims 1 to 11 further comprising training the ML model based on a plurality of prior codewords that were received and decoded by the radio node prior to receiving the codeword.
13. The method of claim 12 wherein training the ML model comprises:
for each prior codeword of the plurality of prior codewords:
obtaining a plurality of soft metrics for at least a portion of the prior codeword;
determining one or more statistics based on the plurality of soft metrics obtained for the at least a portion of the prior codeword;
storing the one or more statistics for the prior codeword;
decoding the prior codeword to obtain a decoding result;
storing the decoding result for the prior codeword; and
training the ML model based on the stored statistics and the stored decoding results for the plurality of prior codewords.
14. The method of claim 13 wherein training the ML model based on the stored statistics and the stored decoding results for the plurality of prior codewords comprises training the ML model based on:
the stored statistics and the stored decoding results for the plurality of prior codewords; and
one or more modulation and coding parameters and/or one or more channel parameters for the plurality of prior codewords, respectively.
15. The method of claim 12 wherein the plurality of prior codewords comprise two or more sets of prior codewords, and training the ML model comprises:
a) for each prior codeword in a first set of prior codewords:
resetting (1000) an accumulated gradient to zero;
computing (1004) one or more statistics based on a plurality of soft metrics obtained for at least a portion of the prior codeword;
decoding (1006) the prior codeword to obtain a decoding result;
computing (1008) a prediction of a plurality of parameters that define the ML model given the one or more statistics;
computing (1010) an error gradient using the prediction of the plurality of parameters of the ML model and the decoding result; and
updating (1012) the accumulated gradient based on the computed error gradient;
b) repeating the steps of resetting the accumulated gradient, computing the one or more statistics, decoding the prior codeword, computing the prediction, computing the error gradient, and updating the accumulated gradient for each other prior codeword in the first set;
c) updating (1016) the plurality of parameters that define the ML model based on the accumulated gradient using a backward pass; and
d) repeating steps (a) through (c) for each other set of prior codewords in the two or more sets of prior codewords.
16. The method of any one of claims 1 to 15 wherein the radio node is a base station.
17. The method of any one of claims 1 to 15 wherein the radio node is a wireless device.
18. A radio node for a cellular communications system, the radio node adapted to:
obtain a plurality of soft metrics for at least a portion of a codeword;
determine one or more statistics based on the plurality of soft metrics; make a prediction as to whether the codeword will be successfully decoded by a Forward Error Correction, FEC, decoder (206) of the radio node using a Machine Learning, ML, model, wherein the one or more statistics are provided as inputs to the ML model in order to make the prediction; and
perform a Hybrid Automatic Repeat Request, HARQ, related task based on the prediction.
19. The radio node of claim 18 wherein the radio node is further adapted to perform the method of any one of claims 2 to 15.
20. The radio node of claim 18 or 19 wherein the radio node is a base station.
21. The radio node of claim 18 or 19 wherein the radio node is a wireless device.
22. A radio node for a cellular communications system, the radio node comprising:
circuitry (1214, 1204, 1510, 1502) operable to:
obtain a plurality of soft metrics for at least a portion of a codeword; determine one or more statistics based on the plurality of soft metrics;
make a prediction as to whether the codeword will be successfully decoded by a Forward Error Correction, FEC, decoder (206) of the radio node using a Machine Learning, ML, model, wherein the one or more
statistics are provided as inputs to the ML model in order to make the prediction; and
perform a Hybrid Automatic Repeat Request, HARQ, related task based on the prediction.
23. A base station configured to communicate with a User Equipment, UE, the base station comprising a radio interface and processing circuitry configured to perform the method of any one of claims 1 to 15.
24. A User Equipment, UE, configured to communicate with a base station, the UE comprising a radio interface and processing circuitry configured to perform the method of any one of claims 1 to 15.
25. A communication system including a host computer comprising:
processing circuitry configured to provide user data; and
a communication interface configured to forward user data to a cellular network for transmission to a User Equipment, UE;
wherein the UE comprises a radio interface and processing circuitry, the UE’s processing circuitry configured to perform the method of any one of claims 1 to 15.
26. The communication system of claim 25, further including the UE.
27. The communication system of claim 26, wherein the cellular network further includes a base station configured to communicate with the UE.
28. The communication system of claim 26 or 27, wherein:
the processing circuitry of the host computer is configured to execute a host application, thereby providing the user data; and
the UE’s processing circuitry is configured to execute a client application associated with the host application.
29. A method implemented in a communication system including a host computer, a base station and a User Equipment, UE, the method comprising:
at the host computer, providing user data; and
at the host computer, initiating a transmission carrying the user data to the UE via a cellular network comprising the base station, wherein the UE performs the method of any one of claims 1 to 15.
30. The method of claim 29, further comprising:
at the UE, receiving the user data from the base station.
31 . A communication system including a host computer comprising:
a communication interface configured to receive user data originating from a transmission from a User Equipment, UE, to a base station;
wherein the UE comprises a radio interface and processing circuitry, the UE’s processing circuitry configured to perform the method of any one of claims 1 to 15.
32. The communication system of claim 31, further including the UE.
33. The communication system of claim 32, further including the base station, wherein the base station comprises a radio interface configured to communicate with the UE and a communication interface configured to forward to the host computer the user data carried by a transmission from the UE to the base station.
34. The communication system of claim 32 or 33, wherein:
the processing circuitry of the host computer is configured to execute a host application; and
the UE’s processing circuitry is configured to execute a client application associated with the host application, thereby providing the user data.
35. The communication system of claim 32 or 33, wherein:
the processing circuitry of the host computer is configured to execute a host application, thereby providing request data; and
the UE’s processing circuitry is configured to execute a client application associated with the host application, thereby providing the user data in response to the request data.
36. A communication system including a host computer comprising a communication interface configured to receive user data originating from a transmission from a User Equipment, UE, to a base station, wherein the base station comprises a radio interface and processing circuitry, the base station’s processing circuitry configured to perform the method of any one of claims 1 to 15.
37. The communication system of claim 36, further including the base station.
38. The communication system of claim 37, further including the UE, wherein the UE is configured to communicate with the base station.
39. The communication system of claim 38, wherein:
the processing circuitry of the host computer is configured to execute a host application; and
the UE is configured to execute a client application associated with the host application, thereby providing the user data to be received by the host computer.
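The flow recited in the claims above can be illustrated with a minimal sketch: one or more statistics are computed over a received codeword, the statistics are provided as inputs to an ML model that predicts whether the FEC decoder will succeed, and a HARQ-related task is performed based on the prediction. The particular statistics (moments of soft-bit LLR magnitudes), the logistic-regression model, its weights, and the early-NACK threshold below are all illustrative assumptions; the claims do not prescribe any specific feature set or model.

```python
import numpy as np

# Hypothetical, pre-trained logistic-regression parameters for the ML model.
# In a real system these would be learned offline from logged decode attempts;
# the values here are placeholders for illustration only.
WEIGHTS = np.array([0.8, -1.2, 0.5])
BIAS = -0.1

def codeword_statistics(llrs: np.ndarray) -> np.ndarray:
    """Reduce a received codeword's soft bits (log-likelihood ratios) to a
    small feature vector. The choice of statistics is an assumption: mean
    and standard deviation of |LLR|, plus the fraction of low-confidence
    soft bits."""
    mag = np.abs(llrs)
    return np.array([mag.mean(), mag.std(), np.mean(mag < 1.0)])

def predict_decode_success(llrs: np.ndarray) -> float:
    """Feed the statistics to the ML model and return the predicted
    probability that the FEC decoder will successfully decode this
    codeword."""
    z = WEIGHTS @ codeword_statistics(llrs) + BIAS
    return 1.0 / (1.0 + np.exp(-z))  # logistic sigmoid

def harq_related_task(llrs: np.ndarray, threshold: float = 0.5) -> str:
    """One example of a HARQ-related task driven by the prediction:
    request a retransmission early (before full decoding finishes)
    when the model predicts a decoding failure."""
    if predict_decode_success(llrs) < threshold:
        return "send early HARQ NACK (request retransmission)"
    return "proceed with full decoding, then ACK/NACK as usual"

# Usage: noisy soft bits for a 128-bit codeword.
rng = np.random.default_rng(seed=0)
llrs = rng.normal(loc=2.0, scale=3.0, size=128)
print(harq_related_task(llrs))
```

A natural deployment, consistent with the claimed method, trains such a model offline on logged (statistics, decode outcome) pairs so that the HARQ decision can be taken before, or in parallel with, full decoding.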
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/SE2018/050625 WO2019240638A1 (en) | 2018-06-14 | 2018-06-14 | Machine learning prediction of decoder performance |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2019240638A1 (en) | 2019-12-19 |
Family
ID=68843328
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/SE2018/050625 WO2019240638A1 (en) | 2018-06-14 | 2018-06-14 | Machine learning prediction of decoder performance |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2019240638A1 (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150155978A1 (en) * | 2012-06-08 | 2015-06-04 | Telefonaktiebolaget L M Ericsson (Publ) | Methods and Arrangements for Supporting Retransmission |
WO2017071744A1 (en) * | 2015-10-28 | 2017-05-04 | Nokia Solutions And Networks Oy | Centralization of decoding |
US20180026755A1 (en) * | 2016-07-24 | 2018-01-25 | Htc Corporation | Method and apparatus for reducing harq feedback latency |
WO2018056339A1 (en) * | 2016-09-21 | 2018-03-29 | 株式会社Nttドコモ | User terminal and wireless communication method |
Non-Patent Citations (5)
Title |
---|
BARIS GOEKTEPE ET AL.: "Subcode-Based Early HARQ for 5G", 2018 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS WORKSHOPS (ICC WORKSHOPS), 20 May 2018 (2018-05-20), XP033369802 *
GILBERTO BERARDINELLI ET AL.: "On the benefits of early HARQ feedback with non-ideal prediction in 5G networks", 2016 INTERNATIONAL SYMPOSIUM ON WIRELESS COMMUNICATION SYSTEMS (ISWCS), 20 September 2016 (2016-09-20), XP032981725 *
JAVIER LORCA ET AL.: "Early prediction of packet errors in FEC-encoded systems with very few decoding iterations", 2017 10TH IFIP WIRELESS AND MOBILE NETWORKING CONFERENCE (WMNC), 25 September 2017 (2017-09-25), XP055667924 *
MICHAEL E. BUCKLEY ET AL.: "The Design and Performance of a Neural Network for Predicting Turbo Decoding Error with Application to Hybrid ARQ Protocols", IEEE TRANSACTIONS ON COMMUNICATIONS, vol. 48, no. 4, 1 April 2000 (2000-04-01), pages 566 - 576, XP011010819, DOI: 10.1109/26.843124 *
RON KOHAVI: "A study of cross-validation and bootstrap for accuracy estimation and model selection", PROCEEDINGS OF THE FOURTEENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, vol. 2, no. 12, 20 August 1995 (1995-08-20), San Mateo, CA, pages 1137 - 1143, XP055667923 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022039641A1 (en) * | 2020-08-21 | 2022-02-24 | Telefonaktiebolaget Lm Ericsson (Publ) | Training a machine learning model using transmissions between reserved harq resources in a communications network |
EP4336384A1 (en) * | 2022-09-08 | 2024-03-13 | Airbus Defence and Space Limited | Statistics calculator |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN110741553B (en) | Neural network for forward error correction decoding | |
US11824561B2 (en) | Concatenated polar code with adaptive error detection | |
US11568214B2 (en) | Neural networks for decoding | |
EP3245745B1 (en) | System and method for a message passing algorithm | |
CN108476091B (en) | Method, system and user equipment for determining transmission conditions of real-time media stream of wireless communication network | |
CN108886422B (en) | Method and apparatus for determining values of transmission parameters | |
JP2021521723A (en) | Methods and systems for retransmitting data using systematic polar coding | |
WO2019240638A1 (en) | Machine learning prediction of decoder performance | |
US11152959B2 (en) | Enhanced information sequences for polar codes | |
EP3659260B1 (en) | Enhanced information sequences for polar codes | |
US20220060239A1 (en) | Input data value and beam index filtering in lte and 5g base stations | |
US20200220560A1 (en) | Enhanced Information Sequences for Polar Codes | |
US20240113752A1 (en) | Precoded transmission of data | |
US20240056989A1 (en) | Precoding and power allocation for access points in a cell-free communication system | |
Paul et al. | Implementation and analysis of forward error correction decoding for cloud-RAN systems | |
EP4038784A1 (en) | Method to decode uplink control channel for ultra reliable low latency applications | |
EP4122126B1 (en) | Radio link adaptation at a transmit and receive point | |
US20240267152A1 (en) | Systems and methods for dynamic adjustment of a target block error rate | |
WO2023237913A1 (en) | Multiplexing of uci and data on small pusch allocations satisfying a code rate threshold | |
WO2024025444A1 (en) | Iterative learning with adapted transmission and reception | |
EP4285520A2 (en) | Method for transmitting link adaptation state information in telecommunication networks | |
WO2024025445A1 (en) | Transmitter node, receiver node and methods for adapating signal processing capability of a receiver |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 18922794; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 18922794; Country of ref document: EP; Kind code of ref document: A1 |