WO2021048594A1

WO2021048594A1 - Methods for block error rate target selection for a communication session and related apparatus

Info

Publication number: WO2021048594A1
Application number: PCT/IB2019/057618
Authority: WO
Inventors: Euhanna GHADIMI; Pablo SOLDATI
Original assignee: Telefonaktiebolaget Lm Ericsson (Publ)
Priority date: 2019-09-10
Filing date: 2019-09-10
Publication date: 2021-03-18
Also published as: EP4029171A1

Abstract

A method of operating a network node in a communication network may be provided. The network node may determine a probability density function of a set of block error rate targets for a communication session with a user device based on user device context information, network context information, and/or an exploration strategy. The network node may select a block error rate target from the set of block error rate targets based on the probability density function. The network node may configure radio transmission parameters for the communication session with the user device based on the selected block error rate target.

Description

METHODS FOR BLOCK ERROR RATE TARGET SELECTION FOR A COMMUNICATION SESSION AND RELATED APPARATUS

TECHNICAL FIELD

[0001] The present disclosure relates generally to configuring a block error rate (BLER) target for a communication session between a network node and a user device in a radio communication network.

BACKGROUND

[0002] Link adaptation, or adaptive modulation and coding (AMC), is a technique used in wireless communications systems, such as 3GPP High Speed Downlink Packet Access (HSDPA), Long Term Evolution (LTE) or New Radio (NR). Link adaptation, or AMC, may be used to dynamically adapt a transmission rate of a communication link to time- and frequency-varying channel conditions. Modulation and Coding Schemes (MCS) may effectively adapt the transmission rate by matching the modulation and coding parameters used for communication to the conditions of the radio link, such as propagation loss, channel strength, interference from other signals concurrently transmitted in the same radio resources, etc. Link adaptation is a dynamic process that may potentially act as frequently as each transmission time interval (e.g., on a millisecond time-scale in a 3GPP LTE system), where the communication link between a radio network node (also referred to herein as a network node) and a user device is scheduled for transmission.

[0003] Thus, link adaptation algorithms may require some form of channel state information at the transmitter to improve the rate of transmission and/or reduce bit error rates. Channel state information can be acquired in the form of channel quality reports from a user device, such as the channel quality indicator (CQI) of a 3GPP LTE and/or NR system. In such a case, a receiver of the transmission may perform measurements of the channel state and feed back a measurement report to the transmitter. In a time division duplex (TDD) system, it may often be reasonable to assume channel reciprocity (that the downlink channel from the transmitter (the network node) to the receiver (the user device) is approximately the same as the uplink channel from the receiver to the transmitter). Therefore, the network node can use estimates of the uplink channel state derived from uplink sounding reference signals as a measure of the downlink channel state to perform a link adaptation process for the downlink communication.

[0004] In either case, the estimate of the channel state available at the transmitter may not be very accurate or may degrade over time. For example, channel state information derived from sounding reference signals may be used by the transmitter until a new sounding reference signal is received. This may imply that the latest channel state estimate becomes less and less reliable over time, which may be referred to as channel aging. In a 3GPP LTE system, for example, a user device can be configured to transmit sounding reference signals as frequently as every 2 ms (e.g., every other radio subframe) or as infrequently as every 160 ms (e.g., every 16 radio frames). When the user device estimates the channel state information for the network node, the channel state report may be received with a certain delay, and the channel state measurements may have been prefiltered by the user device over time, over frequency, or over the spatial domain (e.g., over different transmission beams in a multi-antenna system).

[0005] A mismatch between the channel state estimate available at the transmitter and the effective instantaneous channel state between the transmitter and the receiver may introduce uncertainty in the selection of the transmission rate which may result in suboptimal performance. Such a mismatch may become severe in some scenarios. For example, a mismatch may become severe in scenarios with rapidly varying channel conditions due to certain radio environment conditions, such as fast-moving user devices, sudden changes in traffic in neighboring cells, rapidly varying inter-cell interference, etc. Thus, link adaptation algorithms may need to account for inaccurate channel state information to achieve high spectral efficiency in the data transmission.

SUMMARY

[0006] According to some embodiments of inventive concepts, a method performed by a network node may be provided. The network node may determine a probability density function of a set of block error rate targets for a communication session with a user device based on at least one or more of user device context information, network context information, and an exploration strategy. The network node may further select a block error rate target from the set of block error rate targets based on the probability density function. The network node may further configure radio transmission parameters for the communication session with the user device based on the selected block error rate target.

[0007] According to some other embodiments of inventive concepts, a network node may be provided. The network node may include at least one processor, and at least one memory connected to the at least one processor to perform operations. The operations may include determining a probability density function of a set of block error rate targets for a communication session with a user device based on at least one or more of user device context information, network context information, and an exploration strategy. The operations may further include selecting a block error rate target from the set of block error rate targets based on the probability density function. The operations may further include configuring radio transmission parameters for the communication session with the user device based on the selected block error rate target.

[0008] According to some embodiments, a computer program may be provided that includes instructions which, when executed on at least one processor, cause the at least one processor to carry out methods performed by the network node.

[0009] According to some embodiments, a computer program product may be provided that includes a non-transitory computer readable medium storing instructions that, when executed on at least one processor, cause the at least one processor to carry out methods performed by the network node.

[0010] Other systems, computer program products, and methods according to embodiments will be or become apparent to one with skill in the art upon review of the following drawings and detailed description. It is intended that all such additional systems, computer program products, and methods be included within this description and protected by the accompanying claims.

[0011] Operational advantages that may be provided by one or more embodiments may include enabling an improved or optimized BLER target selection that may achieve higher efficiency in communication compared to some approaches having a fixed BLER target. A further advantage may provide tailored BLER target selection for each individual communication session to match channel conditions of the communication session and to meet certain performance requirements of the communication session, such as spectral efficiency, user rate (throughput), latency, etc. A further advantage may provide that by tailoring the BLER target selection to specific needs and radio conditions of an individual communication session, the number of failed transmissions per communication session may be reduced. Reducing the number of failed transmissions per communication session may lead to improved system performance and decreased latency.

BRIEF DESCRIPTION OF THE DRAWINGS

[0012] The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this application, illustrate certain non-limiting embodiments of inventive concepts. In the drawings:

[0013] Figure 1 illustrates a block diagram of operations that may be performed by a network node for determining and selecting a BLER target for a communication session between a network node and a user device in accordance with some embodiments of the present disclosure;

[0014] Figure 2 is a block diagram illustrating a user device UD (also referred to as a wireless device) according to some embodiments of inventive concepts;

[0015] Figure 3 is a block diagram illustrating a network node according to some embodiments of inventive concepts;

[0016] Figure 4 illustrates a block diagram of operations that may be performed by a network node for determining and selecting a BLER target for a communication session between a network node and a user device in accordance with some embodiments of the present disclosure;

[0017] Figure 5 illustrates a probability distribution function for a set of BLER targets available to a network node in accordance with some embodiments of the present disclosure;

[0018] Figure 6 illustrates a block diagram of operations that may be performed by a network node for determining and selecting a BLER target for a communication session between a network node and a user device in accordance with some embodiments of the present disclosure;

[0019] Figure 7 illustrates an exploitative-explorative probability density function for a set of BLER targets available at a network node in accordance with some embodiments of the present disclosure;

[0020] Figure 8 illustrates an exploitative-explorative probability density function for a set of BLER targets available at a network node in accordance with some embodiments of the present disclosure;

[0021] Figure 9 illustrates BLER target selection for a communication session with a user device based on a probability distribution function associated with a set of available BLER targets in accordance with some embodiments of the present disclosure; and

[0022] Figures 10-13 are flowcharts illustrating operations that may be performed by a network node in accordance with some embodiments of the present disclosure.

DETAILED DESCRIPTION

[0023] Various embodiments will be described more fully hereinafter with reference to the accompanying drawings. Other embodiments may take many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided by way of example to convey the scope of the subject matter to those skilled in the art. Like numbers refer to like elements throughout the detailed description.

[0024] Generally, all terms used herein are to be interpreted according to their ordinary meaning in the relevant technical field, unless a different meaning is clearly given and/or is implied from the context in which it is used. All references to a/an/the element, apparatus, component, means, step, etc. are to be interpreted openly as referring to at least one instance of the element, apparatus, component, means, step, etc., unless explicitly stated otherwise. The steps of any methods disclosed herein do not have to be performed in the exact order disclosed, unless a step is explicitly described as following or preceding another step and/or where it is implicit that a step must follow or precede another step. Any feature of any of the embodiments disclosed herein may be applied to any other embodiment, wherever appropriate. Likewise, any advantage of any of the embodiments may apply to any other embodiments, and vice versa.

Other objectives, features and advantages of the enclosed embodiments will be apparent from the following description.

[0025] Radio communication systems may rely on a discrete number of transmission rates, which may be mapped to different combinations of modulation and coding values, also referred to as modulation and coding scheme (MCS) values. Link adaptation algorithms may attempt to optimally adapt the transmission data rate chosen for a link to the current channel and interference conditions of the link by selecting the proper MCS value.

[0026] Some radio communication systems, such as 3GPP LTE and NR systems, may rely on link adaptation strategies that aim at controlling an error decoding rate for each communication session over a radio link, also referred to as a Block Error Rate (BLER) target. Some approaches may use a strategy to adapt an MCS selection, hence the data transmission rate, to maintain an average BLER for a communication session link below or equal to a certain value (referred to herein as a “BLER target” or a “block error rate target”).

[0027] A BLER target may provide a proxy parameter for controlling an average quality of a communication session. Depending on a channel state of the communication link, however, an incorrect setting of the BLER target may result in an excessive usage of radio resources, for example, if a too low BLER target is required from a communication link with poor channel quality, or in poor performance if a too high BLER target is configured for a link with very good channel performance.

[0028] In some approaches for 3GPP LTE and 5G NR systems, a fixed BLER target may be configured for all user devices with the same type of traffic. In some approaches, the same BLER target may be configured for all user devices in a coverage area of a radio network node regardless of their traffic type. Further, in some approaches, the BLER target may not be adapted over time but may be kept fixed for all links.

[0029] While such approaches may simplify some implementation aspects of a radio communication system, a potential disadvantage of approaches discussed above may be that such approaches may lead to suboptimal system performance due to non-stationary and rapidly varying channel conditions, both over time and frequency domain. Such a potential disadvantage may exist despite that a BLER target may be intended to provide a control parameter to adjust the configuration setting of a communication link to its channel state to adapt to deep channel fade or interference.

[0030] On one hand, configuring the same BLER target for all users within a coverage area of a radio network node (or potentially worse, within larger parts of the network) may be problematic as different users typically experience different channel states and interference. For example, it may be desirable to configure high BLER targets to try to make the communication link robust from rapidly varying interference, at the expense of higher usage of communication resources. However, user devices closer to the transmitter may be less affected by rapidly varying interference than user devices located further away from the transmitter. Therefore, setting a high BLER target for all users may potentially result in a system performance degradation as fewer radio resources may be made available for user devices with good channel conditions. A similar potential disadvantage may occur if the system configures a too low BLER target for all users. In such a case, the system may optimistically assume robust communication links for all users, which may be harmful for users affected by strong interference.

[0031] On the other hand, keeping a BLER target fixed within a communication session may only allow tracking of average channel behavior. Therefore, a fixed BLER target may not sufficiently exploit the potential of a communication link. Being able to adapt the BLER target within a communication session may be desirable to set configuration parameters for the communication link more opportunistically to increase overall spectral efficiency and quality of service. For example, a higher BLER target may be set when the communication link suffers from higher interference or bad channel quality, or a lower BLER target may be set when channel conditions are more favorable.

[0032] Adapting a BLER target of a communication link dynamically during a communication session, however, is not trivial as it may add an additional control loop within the link adaptation. On one hand, the BLER target may be used as input to an outer loop link adaptation (OLLA) algorithm, which may require hundreds of milliseconds to control the actual BLER toward a target BLER. Since the convergence time may not be constant, changing the BLER target should be done carefully to avoid instability issues in OLLA. On the other hand, channel quality reports from a user device may be the result of filtered measurements which hide the channel variations that the BLER adaptation algorithm should track and compensate for.

[0033] Certain aspects of the present disclosure and their embodiments may provide solutions to these and/or other challenges.
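By way of non-limiting illustration of the outer loop link adaptation referred to in paragraph [0032], the following Python sketch shows one commonly used form of OLLA, in which an SINR offset is lowered by a fixed step on each negative acknowledgement (NACK) and raised by a smaller step on each positive acknowledgement (ACK), with the ratio of the two steps set by the BLER target. The class name, the method names, and the 0.5 dB step are assumptions made for illustration only and are not taken from the present disclosure.

```python
class OuterLoopLinkAdaptation:
    """Illustrative OLLA sketch: an SINR offset (in dB) steered by ACK/NACK feedback."""

    def __init__(self, bler_target: float, step_down_db: float = 0.5):
        # step_down_db is a hypothetical value chosen only for this example.
        self.step_down_db = step_down_db
        self.offset_db = 0.0
        self.set_bler_target(bler_target)

    def set_bler_target(self, bler_target: float) -> None:
        """Change the BLER target; the accumulated offset is kept as-is."""
        if not 0.0 < bler_target < 1.0:
            raise ValueError("bler_target must lie strictly between 0 and 1")
        self.bler_target = bler_target
        # In steady state the expected drift is zero:
        #   (1 - BLER) * step_up == BLER * step_down
        self.step_up_db = self.step_down_db * bler_target / (1.0 - bler_target)

    def update(self, ack: bool) -> float:
        """Apply one ACK/NACK observation and return the new offset in dB."""
        if ack:
            self.offset_db += self.step_up_db    # be slightly more aggressive
        else:
            self.offset_db -= self.step_down_db  # back off after a failure
        return self.offset_db


if __name__ == "__main__":
    olla = OuterLoopLinkAdaptation(bler_target=0.1)
    # Nine ACKs followed by one NACK correspond to a 10% block error rate,
    # so the offset returns (up to rounding) to its starting value.
    for ack in [True] * 9 + [False]:
        olla.update(ack)
    print(f"offset after 10 transmissions: {olla.offset_db:+.6f} dB")
```

Because the upward step is small for low BLER targets, many feedback events may be needed before the offset settles, which is consistent with the convergence concern noted in paragraph [0032] when the BLER target is changed during a session.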

[0034] In various embodiments, a method is provided for improving or optimizing a target BLER for a communication session between a network node and a user device, either for downlink or for uplink communication. The method is performed by a network node for selecting the BLER target for a communication session. Operations performed by the network node include determining a probability density function (PDF) for a set of (two or more) potential BLER targets for the communication session. The operations further include choosing a BLER target for the communication session with the user device based on the PDF. The operations further include configuring radio transmission parameters for communicating with the user device based on the selected BLER target.

[0035] The target BLER for a communication link may be selected based on the user (or link) context information. User context information may include, for example, current channel state of the communication link and/or network context information (e.g., interference measured from a neighboring cell) and features characterizing the network deployment surrounding the cell wherein the user device is located.

[0036] There are proposed herein various embodiments which may address one or more of the issues described herein. Certain embodiments may provide one or more of the following technical advantages. Some embodiments may provide an improved or optimized BLER target selection mechanism for each individual communication session (e.g., for each communication link, type of traffic or user device) to match the channel conditions of the communication session. Some embodiments may also provide improved or optimized BLER target selection to meet certain performance requirements of the communication session, such as spectral efficiency, user rate (throughput), latency, etc. Thereby the methods of various embodiments may provide higher efficiency in communication compared to some approaches in which a fixed BLER target, such as a 10% BLER target in the LTE/LTE-A system, is applied to all communication sessions.

[0037] Some embodiments may also provide, in a wireless communication system, better usage of the radio resources granted for individual communication sessions, and may thereby increase overall spectral efficiency of the system. In addition, some embodiments may provide, by tailoring the BLER target selection to specific needs and radio conditions of an individual communication session, a reduction of the number of failed transmissions per communication session. Reducing the number of failed transmissions per communication session may lead to improved system performance (e.g., as more radio resources are made available for other communication sessions or user devices in the system) and decreased latency (e.g., as fewer retransmissions are needed to successfully communicate over a radio link).

[0038] According to various embodiments, a communication system with at least a radio network node and at least one user device that communicate over a radio link may be provided. The radio network node may configure at least one communication session with a user device to maintain a certain BLER target. However, unlike the 3GPP LTE/LTE-A and 5G NR systems where the radio network node uses the same BLER target for all communication sessions within a radio cell, various embodiments herein include a system wherein a radio network node may improve or optimize and configure each communication session with a dedicated BLER target. The BLER target may be chosen among a set of N > 1 BLER targets labelled by integers n = 1, ..., N. More generally, in various embodiments, the BLER target associated with a user device, or a communication link, or a type of traffic used during a communication session (such as a radio bearer) in the system may be improved or optimized. As used herein, the terms communication session, communication link, and radio bearer are used interchangeably.

[0039] In various embodiments, the BLER target for a communication session may be statically optimized at the beginning of the communication session or dynamically adapted during the communication session. The PDF may be a function describing the probabilities p_n of the set of N BLER targets that may be used in a communication session between the network node and a user device.

[0040] Exemplary embodiments related to computation of a PDF for a set of BLER targets will now be described.

[0041] In some embodiments, the probability p_n for each of the N BLER targets, and hence the PDF for the set of BLER targets, is determined based on one or more of:

User device context information

Network context information

Exploration strategy information

[0042] Thus, the BLER target for a communication session may be optimized at the beginning of the communication session or dynamically adapted during the communication session based on user device specific context information and/or network specific context information.

[0043] Figure 1 illustrates a block diagram of operations that may be performed by a network node for determining and selecting a BLER target for a communication session between a network node and a user device (e.g., user device (UD) 200 of Figure 2) in accordance with some embodiments of the present disclosure. Network node 115 may be implemented using structure of network node 300 from Figure 3 (as described further below). Referring to Figure 1, network node 115 determines 107 a PDF 109 for a set of BLER targets (e.g., the probabilities p_n for the set of N BLER targets available at the network node 115) based on all three types of information (user device context information 101, network context information 103, and exploration strategy information 105). One of ordinary skill in the art will understand that alternative implementations of such an embodiment may use a subset of the available information, such as only user device context information 101 and exploration strategy information 105. In another exemplary embodiment, network node 115 uses network context information 103 and exploration strategy information 105.
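By way of non-limiting illustration of the determining 107 of Figure 1, the following Python sketch forms a PDF over N BLER targets by mixing a context-driven (exploitative) distribution with a uniform (explorative) distribution. The function name, the per-target scores, the softmax mapping, and the epsilon mixing weight are all assumptions made for illustration; the passage above does not prescribe a particular mapping from context information and exploration strategy information to the probabilities p_n.

```python
import math
from typing import List, Sequence


def determine_pdf(scores: Sequence[float],
                  epsilon: float = 0.1,
                  temperature: float = 1.0) -> List[float]:
    """Return probabilities p_n, n = 1..N, for a set of N BLER targets.

    scores      -- hypothetical per-target performance estimates (for example,
                   predicted throughput) derived from user device context
                   information and/or network context information.
    epsilon     -- hypothetical exploration weight in [0, 1]; 0 means purely
                   exploitative, 1 means uniform exploration.
    temperature -- hypothetical softmax temperature (> 0).
    """
    n = len(scores)
    if n < 2:
        raise ValueError("at least two BLER targets are required")
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must be in [0, 1]")
    if temperature <= 0.0:
        raise ValueError("temperature must be positive")

    # Exploitative part: softmax of the scores (shifted by the maximum for
    # numerical stability), so better-scoring targets are more probable.
    top = max(scores)
    weights = [math.exp((s - top) / temperature) for s in scores]
    total = sum(weights)
    exploit = [w / total for w in weights]

    # Explorative part: uniform mass epsilon / N on every target, so each
    # target keeps a non-zero chance of being tried when epsilon > 0.
    return [(1.0 - epsilon) * p + epsilon / n for p in exploit]


if __name__ == "__main__":
    pdf = determine_pdf(scores=[1.0, 2.0, 3.0], epsilon=0.2)
    print([round(p, 4) for p in pdf], "sum =", round(sum(pdf), 6))
```

Under these assumptions, the resulting probabilities are non-negative and sum to one, and the exploration weight controls how much probability is reserved for BLER targets that the context information alone would not favor.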

[0044] Figure 2 is a block diagram illustrating a user device UD (also referred to as a wireless device) according to some embodiments of inventive concepts. User device (UD) may be implemented using structure of UD 200 from Figure 2 with instructions stored in device readable medium (also referred to as memory) 205 of UD 200 so that when instructions of memory 205 of UD 200 are executed by at least one processor (also referred to as processing circuitry) 203 of UD 200, at least one processor 203 of UD 200 performs respective operations discussed below. Processing circuitry 203 of UD 200 may thus transmit and/or receive communications to/from one or more other UDs and/or network nodes/entities/servers of a radio communication network through antenna 207 of UD 200. In addition, processing circuitry 203 of UD 200 may transmit and/or receive communications to/from one or more UDs and/or network nodes/entities/servers of a radio communication network through transceiver 201 of UD 200.

[0045] As used herein, UD refers to a device capable, configured, arranged and/or operable to communicate wirelessly with network nodes and/or other wireless devices. Unless otherwise noted, the term UD may be used interchangeably herein with user equipment (UE). Communicating wirelessly may involve transmitting and/or receiving wireless signals using electromagnetic waves, radio waves, infrared waves, and/or other types of signals suitable for conveying information through air. In some embodiments, a UD may be configured to transmit and/or receive information without direct human interaction. For instance, a UD may be designed to transmit information to a network on a predetermined schedule, when triggered by an internal or external event, or in response to requests from the radio communication network. Examples of a UD include, but are not limited to, a smart phone, a mobile phone, a cell phone, a voice over IP (VoIP) phone, a wireless local loop phone, a desktop computer, a personal digital assistant (PDA), a wireless camera, a gaming console or device, a music storage device, a playback appliance, a wearable terminal device, a wireless endpoint, a mobile station, a tablet, a laptop, a laptop-embedded equipment (LEE), a laptop-mounted equipment (LME), a smart device, a wireless customer-premise equipment (CPE), a vehicle-mounted wireless terminal device, etc. A UD may support device-to-device (D2D) communication, for example by implementing a 3GPP standard for sidelink communication, and may in this case be referred to as a D2D communication device. As yet another specific example, in an Internet of Things (IoT) scenario, a UD may represent a machine or other device that performs monitoring and/or measurements, and transmits the results of such monitoring and/or measurements to another UD and/or a network node. The UD may in this case be a machine-to-machine (M2M) device, which may in a 3GPP context be referred to as a machine-type communication (MTC) device. 
As one particular example, the UD may be a UE implementing the 3GPP narrow band internet of things (NB-IoT) standard. Particular examples of such machines or devices are sensors, metering devices such as power meters, industrial machinery, home or personal appliances (e.g., refrigerators, televisions, etc.), or personal wearables (e.g., watches, fitness trackers, etc.). In other scenarios, a UD may represent a vehicle or other equipment that is capable of monitoring and/or reporting on its operational status or other functions associated with its operation. A UD as described above may represent the endpoint of a wireless connection, in which case the device may be referred to as a wireless terminal. Furthermore, a UD as described above may be mobile, in which case it may also be referred to as a mobile device or a mobile terminal.

[0046] As used herein, network node refers to equipment capable, configured, arranged and/or operable to communicate directly or indirectly with a user device and/or with other network nodes or equipment in the radio communication network to enable and/or provide wireless access to the user device and/or to perform other functions (e.g., administration) in the radio communication network. Examples of network nodes include, but are not limited to, access points (APs) (e.g., radio access points), base stations (BSs) (e.g., radio base stations, Node Bs, evolved Node Bs (eNBs), gNode Bs, etc.). Base stations may be categorized based on the amount of coverage they provide (or, stated differently, their transmit power level) and may then also be referred to as femto base stations, pico base stations, micro base stations, or macro base stations. A base station may be a relay node or a relay donor node controlling a relay. A network node may also include one or more (or all) parts of a distributed radio base station such as centralized digital units and/or remote radio units (RRUs), sometimes referred to as Remote Radio Heads (RRHs). Such remote radio units may or may not be integrated with an antenna as an antenna integrated radio. Parts of a distributed radio base station may also be referred to as nodes in a distributed antenna system (DAS). Yet further examples of network nodes include multi-standard radio (MSR) equipment such as MSR BSs, network controllers such as radio network controllers (RNCs) or base station controllers (BSCs), base transceiver stations (BTSs), transmission points, transmission nodes, multi-cell/multicast coordination entities (MCEs), core network nodes (e.g., MSCs, MMEs), O&M nodes, OSS nodes, SON nodes, positioning nodes (e.g., E-SMLCs), and/or MDTs. As another example, a network node may be a virtual network node. 
More generally, however, network nodes may represent any suitable device (or group of devices) capable, configured, arranged, and/or operable to enable and/or provide a user device with access to the radio communication network or to provide some service to a user device that has accessed the radio communication network.

[0047] Figure 3 is a block diagram illustrating a network node according to some embodiments of inventive concepts. Network node 115 may be implemented using structure of network node 300 from Figure 3 with instructions stored in device readable medium (also referred to as memory) 305 of network node 300 so that when instructions of memory 305 of network node 300 are executed by at least one processor (also referred to as processing circuitry) 303 of network node 300, at least one processor 303 of network node 300 performs respective operations discussed below with respect to Figures 1 and 4-13. Processing circuitry 303 of network node 300 may thus transmit and/or receive communications to/from one or more other network nodes/entities/servers of a radio communication network through network interface 307 of network node 300. In addition, processing circuitry 303 of network node 300 may transmit and/or receive communications to/from one or more wireless devices (e.g., user device (UD) 200) through interface 301 of network node 300 (e.g., using transceiver 301).

[0048] Referring to Figure 1, network node 115 determines 107 a PDF 109 and selects 111 a BLER target 113 in a set of BLER targets available at network node 115. Network node 115 uses user context information 101, network context information 103, and/or exploration strategy information 105 to determine 107 a probability p_n for each of the N BLER targets available at network node 115 (PDF 109). Network node 115 selects 111 a BLER target, BLER_k 113, from the set of N BLER targets based on the PDF 109.
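By way of non-limiting illustration of the selecting 111 of Figure 1, the following Python sketch draws a BLER target BLER_k from a set of N BLER targets at random according to the probabilities p_n. Random sampling is only one possible way of selecting based on the PDF; the function name, the candidate BLER target values, and the example probabilities are assumptions made for illustration and are not taken from the present disclosure.

```python
import random
from typing import Optional, Sequence


def select_bler_target(bler_targets: Sequence[float],
                       pdf: Sequence[float],
                       rng: Optional[random.Random] = None) -> float:
    """Sample one BLER target BLER_k according to the probabilities p_n."""
    if len(bler_targets) != len(pdf):
        raise ValueError("one probability is required per BLER target")
    if len(bler_targets) < 2:
        raise ValueError("at least two BLER targets are required")
    if any(p < 0.0 for p in pdf) or abs(sum(pdf) - 1.0) > 1e-9:
        raise ValueError("pdf must be non-negative and sum to 1")

    rng = rng or random.Random()
    # random.Random.choices draws with replacement using relative weights.
    return rng.choices(list(bler_targets), weights=list(pdf), k=1)[0]


if __name__ == "__main__":
    # Hypothetical set of N = 4 candidate BLER targets and an example PDF.
    candidates = [0.01, 0.05, 0.10, 0.30]
    probabilities = [0.05, 0.15, 0.70, 0.10]
    chosen = select_bler_target(candidates, probabilities, random.Random(0))
    print("selected BLER target:", chosen)
```

Under these assumptions, the selected value could then serve as the set point used when configuring radio transmission parameters for the communication session.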

[0049] In various embodiments, user context information 101 includes, but is not limited to, one or more of:

One or more network key performance indicators (KPIs) associated with a communication session between the user device and a network node. Relevant KPIs include throughput, spectral efficiency, latency, packet loss rate, call drop rate, etc. The KPIs may be measured or estimated by one or more network nodes, in association with one or more radio cells. Each KPI may be represented by a single value, such as an instantaneous measurement, an average over a time window, or a maximum or minimum value achieved over a time window, or in statistical terms, for instance using first and second statistical moments or a probability distribution function.

Channel quality indicator (CQI) measurements of the communication link between the user device and the network node, such as wideband CQI or narrow band CQI measurements.

Reference Signal Received Power (RSRP) measurements for downlink or uplink reference signals, such as channel state information reference signals (CSI-RS), channel sounding reference signals (SRS), cell-specific reference signals (CRS), synchronization reference signals, such as primary and secondary synchronization reference signals (PSS, SSS, respectively) or the Synchronization Signals and PBCH Blocks (SSB or SS/PBCH block) defined by the 3GPP NR system.

Quality of service (QoS) requirement for the communication session between the user device and the network node, such as minimum or maximum rate requirements, minimum or maximum tolerable latency, minimum or maximum number of communication resources.

Signal attenuation measurements between the user device and one or more network nodes. This includes measurements of pathloss, fading, shadowing over one or multiple communication frequencies that can be used by the user device and the network node. Such measurements can be either wideband, i.e., one measurement for entire bandwidth of interest in a communication frequency, or narrow-band, e.g., multiple measurements are made in different parts of the bandwidth of interest in a communication frequency.

Timing advance measurement associated to the user device. In LTE and 5G systems this can be derived by the network node based on uplink measurements of random access preamble signals during the random access procedure.

Measurements of time the signal takes to reach the network node from a user device, or to reach from the network node to user device and from user device to network node. For example, timing advance measurements or round-trip time measurements can be used for such purpose.

Interference measurement, either in uplink or in downlink, for the communication link between the user device and the network node, such as wideband or narrow-band interference measurements.

Type of user device, such as model, vendor, type of receiver, type of transmitter, etc.

[0050] User context information 101 may be either deduced or determined directly by network node 115 from uplink signals, such as a reference signal received from the user device, or interfering signals received from other user devices or other network nodes. User context information 101 may also be measured by the user device and transmitted from the user device to network node 115. Thus, in some embodiments, network node 115 further receives one or more messages including user context information 101 from the user device.

[0051] In various embodiments, network context information 103 includes, but is not limited to, one or more of:

One or more network key performance indicator (KPI) associated to one or more cells of the radio network. Relevant KPI include throughput, spectral efficiency, latency, packet loss rate, call drop rate, etc. The KPI are measured or estimated by one or more network nodes, in association to one or more radio cells. Each KPI includes a single value, such as an instantaneous measurement, an average over a time window, a maximum or minimum value achieved over a time window, etc. or in statistical terms, for instance using first and second statistical moments, or a probability distribution function.

Traffic load and/or radio resource utilization in one or more radio cell of the network node serving the user device.

Traffic load and/or radio resource utilization in one or more radio cell of one or more neighboring node (e.g., interfering network node or radio cell).

Inter-site distance (ISD) information between serving network node and neighboring network nodes. This includes, for example:

o An estimate of the effective ISD with neighboring network nodes;

o Statistical information of ISD with neighboring network nodes (e.g., first and second order statistical moments such as average ISD and standard deviation);

o Statistical information of ISD between network nodes in the region of the network wherein the network node is deployed (e.g., first and second order statistical moments such as average ISD and standard deviation);

o Topological information of the network deployment, such as coordinates of the location of network nodes and orientation of the respective radio cells.

Propagation loss measurements of radio signals from/to the serving network node to/from the user device.

Propagation loss measurements of radio signals from/to one or more interfering network node to/from the user device.

Interference measurements of radio signals from/to one or more interfering network node to/from the user device.

Interference measurements of radio signals from/to the serving network node and one or more interfering user devices.

Number of neighboring cells or radio network nodes that can interfere with the user device. In one example, a cell or radio network node is considered to be interfering with the user device if the received strength (power) of reference signals transmitted by such cell or radio network node exceeds a certain threshold.

Type of neighboring cells or radio network nodes. For instance, distinguishing between different generations of broadband communication systems (2G, 3G, 4G, 5G, etc.) such as UMTS, HSPA, LTE, LTE-A, 5G-NR, etc. and/or different releases of communication systems.

Power setting parameters, such as available power budget for downlink communication, average power transmitted per physical resource block.

Type of traffic and/or distribution of traffic in neighboring cells or radio network nodes.

Mobility settings parameters, such as mobility offset setting for user device handover.

[0052] In various embodiments, a PDF 109 for a set of potential BLER targets for a communication session is computed based on an exploitative model (e.g., exploitative model 107a described below with reference to Figure 4) available at network node 115. In some embodiments, the exploitative model includes one or more of: a feedforward neural network; a recurrent neural network; a convolutional neural network; an ensemble of neural networks, such as feedforward neural networks, recurrent neural networks, convolutional neural networks or a combination thereof; a decision tree; a random decision forest; a linear regression model; and a nonlinear regression model.

[0053] As used herein, the term “exploitative” refers to the BLER target selection based solely on knowledge of the exploitative model 107a and the context information (101 and/or 103) for the user device and/or the network. Output of the exploitative model 107a may then be combined with parameters of exploration strategy 105 to determine the PDF 109 for the set of potential BLER targets for a communication session (as described further below with reference to Figure 4).

[0054] Examples of output of an exploitative model 107a will now be described.

[0055] In some embodiments, an exploitative model 107a takes the contextual information (user context information 101 and/or network context information 103) as input and returns a BLER target chosen within the set of available BLER targets for the communication session. As used herein, such a BLER target is referred to as an exploitative BLER target (e.g., because it is selected based solely on knowledge of the exploitative model and the context information). Use of this output to generate a PDF of the available BLER targets is described further below with reference to Figure 4.

[0056] In some embodiments, an exploitative model 107a takes the contextual information (user context information 101 and/or network context information 103) as input and returns a set of score values with each score value v_n associated to one of the available BLER targets BLER_n. In some embodiments, the score values v_n are positive quantities that sum up to 1, thereby representing a probability distribution function of the available BLER targets. As used herein, such a PDF may be referred to as an exploitative PDF. Use of this output to generate a PDF of the available BLER targets is described further below with reference to Figure 4.

[0057] Exploration strategies 105 will now be described.

[0058] In various embodiments, network node 115 uses an exploration strategy 105 to explore, based on a statistical behavior, different BLER targets from what it could have determined based on a sole exploitative model 107a. The statistical behavior is determined by the specific method used to explore and the associated parameters.

[0059] In some embodiments, network node 115 uses an epsilon-greedy (also referred to herein as ε-greedy) exploration strategy 105. For an epsilon-greedy exploration strategy 105, network node 115 explores with probability ε and exploits with probability 1 - ε, where ε is a parameter ranging from zero to one associated to this type of exploration strategy 105. In one embodiment, network node 115 chooses a BLER target at random with probability ε or chooses a BLER target according to the output of the exploitative model with probability 1 - ε. Other embodiments for network node 115 to determine a PDF 109 of the available BLER targets based on an ε-greedy exploration strategy 105 and the outcome of the exploitative model are described further below.

[0060] In some embodiments, network node 115 uses a t-first exploration strategy 105 characterized by a parameter t taking integer values greater than or equal to one. With a t-first exploration strategy 105, network node 115 explores different BLER targets uniformly at random for a fixed number t of times and selects a BLER target by exploiting the model afterwards.
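The two exploration strategies just described can be illustrated with a minimal Python sketch. The function names, the 0-based target index, and the `selections_so_far` counter are assumptions introduced here for illustration only; the disclosure does not prescribe an implementation.

```python
import random


def epsilon_greedy_select(exploit_index, num_targets, epsilon, rng=random):
    """Return a 0-based BLER target index.

    With probability epsilon a target is drawn uniformly at random (explore);
    otherwise the target proposed by the exploitative model is kept (exploit).
    """
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must lie in [0, 1]")
    if rng.random() < epsilon:
        return rng.randrange(num_targets)  # explore: uniform over all targets
    return exploit_index  # exploit: follow the exploitative model


def t_first_select(exploit_index, num_targets, t, selections_so_far, rng=random):
    """Explore uniformly for the first t selections, then exploit the model.

    selections_so_far is a hypothetical counter of earlier selections.
    """
    if selections_so_far < t:
        return rng.randrange(num_targets)
    return exploit_index
```

Note that the uniform draw in the explore branch may also return the exploitative target, so under this sketch the exploitative target is chosen with overall probability 1 - ε + ε/N.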

[0061 ] In some embodiments, network node 115 uses an ensemble strategy 105 characterized by a parameter K taking an integer value greater than or equal to one. With an ensemble strategy 105, network node 115 uses an ensemble of (e.g., a number of) K exploitative models each creating a potentially different exploitative BLER target, PDFs, or BLER scores. In one embodiment, the exploration strategy 105 selects an exploitative model from the ensemble uniformly at random and selects BLER targets accordingly. For instance, if the ensemble of exploitative models is such that each model selects a BLER target based on given contexts (see e.g., Figure 4 described further below) then the exploration strategy 105 first picks an exploitative model uniformly at random and then the selected exploitative model 107a selects the ultimate BLER target.
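As a rough illustration of the ensemble strategy, the sketch below picks one of K models uniformly at random and lets it choose the target. The models are represented as hypothetical callables mapping a context to a 0-based target index; this interface is an assumption, not something the disclosure specifies.

```python
import random


def ensemble_select(models, context, rng=random):
    """Pick one exploitative model uniformly at random and return its choice."""
    model = rng.choice(models)
    return model(context)


def ensemble_pdf(model_choices, num_targets):
    """Selection probabilities induced by uniform model picking.

    model_choices holds the 0-based target index chosen by each of the K models;
    each model contributes probability mass 1/K to the target it selects.
    """
    k = len(model_choices)
    pdf = [0.0] * num_targets
    for choice in model_choices:
        pdf[choice] += 1.0 / k
    return pdf
```

For example, if two of three models select the second target and one selects the fourth, the second target is chosen with probability 2/3 and the fourth with probability 1/3.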

[0062] In another embodiment, other exploration strategies 105 (e.g., epsilon-greedy) are combined with the ensemble of exploitative models. Such an exploration method 105 may be referred to as an epsilon ensemble, in which the probability of choosing each BLER target is given as

$$p_n = \frac{\varepsilon}{N} + \frac{1 - \varepsilon}{K} \sum_{k=1}^{K} f_k(n)$$

where K is the number of exploitative models 107a in the ensemble and ƒ_k(n) denotes the output of the k-th exploitative model in the ensemble. The PDF {p_n} of the available BLER targets in this embodiment is computed based on an epsilon-greedy strategy with probability ε and an average of exploitative actions (denoted by ƒ_k(n)) produced by different models in the ensemble with probability 1 - ε. For example, if the ensemble of exploitative models is such that each model selects a BLER target based on given contexts (see e.g., Figure 4), then ƒ_k(n) takes value 1 if the k-th model selects BLER target BLER_n and value 0 otherwise.

[0063] PDF 109 generation from an exploitative BLER target combined with an exploration strategy 105 will now be described.

[0064] Figure 4 illustrates a block diagram of operations that are performed by a network node for determining and selecting a BLER target for a communication session between a network node and a user device in accordance with some embodiments of the present disclosure. Figure 4 illustrates some embodiments where the PDF 109 of the available BLER targets is computed 107c by combining a BLER target selected based on context information (user context information 101 and/or network context information 103) (indicated as BLER_i 107b) with parameters of exploration strategy 105.

[0065] Still referring to Figure 4, in some embodiments, network node 115 selects an exploitative BLER target BLER_i within the set of available BLER targets based on the exploitative model 107a and context information (e.g., user device context information 101 and/or network context information 103). Index i takes a value in the range {1, ..., N} and uniquely identifies one of the available BLER targets. Subsequently, network node 115 computes the PDF {p_n} 109 for the set of potential BLER targets by combining or computing 107c the exploitative BLER target BLER_i 107b with parameters of the exploration strategy 105. Examples of exploration strategies 105 and associated parameters are described above.

[0066] In another embodiment, network node 115 uses an ε-greedy exploration strategy 105 with associated parameter ε taking a value in the range [0, 1]. Given the index i of the BLER target chosen according to the exploitative model 107a and the parameter ε of the exploration strategy 105, network node 115 determines the probability density function associated to the available BLER targets by computing a probability p_n for each of the available BLER targets as

$$p_n = \begin{cases} 1 - \varepsilon + \dfrac{\varepsilon}{N}, & n = i \\ \dfrac{\varepsilon}{N}, & n \neq i \end{cases} \tag{1}$$

[0067] In other words, the probability of selecting each of the N BLER targets available at network node 115 is:

1 - ε + ε/N for the BLER target BLER_i 107b chosen based on the context information (101 and/or 103) only;

ε/N for all the remaining BLER targets, i.e., for all indices n in the range {1, ..., N} different from i.

Therefore, the probability distribution function 109 is slightly skewed toward the BLER target BLER_i 107b temporarily chosen based only on the context information. Finally, network node 115 determines a BLER target to be used for the communication session based on the PDF 109 defined by the individual BLER target probabilities (as described in more detail below).

[0068] Figure 5 illustrates a probability distribution function for a set of BLER targets available to a network node in accordance with some embodiments of the present disclosure. Figure 5 shows an example of a probability distribution function 109 computed based on Equation (1) above. In this example, the number of BLER targets available to network node 115 is assumed to be N = 5, e.g., {BLER₁ = 10%, BLER₂ = 25%, BLER₃ = 50%, BLER₄ = 75%, BLER₅ = 90%}. In the example of Figure 5, network node 115 selected BLER₂ = 25% based on exploitative model 107a (where the index associated to the exploitative BLER target is i = 2) and uses an ε-greedy exploration strategy with parameter ε = 0.6.
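The Figure 5 example can be checked numerically with a short Python sketch. It assumes the ε-greedy PDF assigns ε/N to every target and an additional 1 - ε to the exploitative target, which agrees with the probabilities listed for Figure 9 later in this description; the function name and the 0-based indexing are illustrative choices.

```python
def epsilon_greedy_pdf(exploit_index, num_targets, epsilon):
    """Build the PDF over BLER targets from one exploitative target index (0-based)."""
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must lie in [0, 1]")
    pdf = [epsilon / num_targets] * num_targets  # exploration mass, spread uniformly
    pdf[exploit_index] += 1.0 - epsilon          # exploitation mass on the model's choice
    return pdf


bler_targets = [0.10, 0.25, 0.50, 0.75, 0.90]
# Exploitative target BLER_2 = 25% corresponds to 0-based index 1.
pdf = epsilon_greedy_pdf(exploit_index=1, num_targets=len(bler_targets), epsilon=0.6)
```

With N = 5 and ε = 0.6, the exploitative target receives approximately 0.52 and every other target approximately 0.12, up to floating-point rounding.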

[0069] PDF generation from exploitative score values combined with an exploration strategy 105 will now be described.

[0070] Figure 6 illustrates a block diagram of operations that are performed by a network node for determining and selecting a BLER target for a communication session between a network node and a user device in accordance with some embodiments of the present disclosure. Figure 6 illustrates embodiments where the PDF 109 of the available BLER targets is computed 107c by combining the probability of each BLER target 107c computed based on context information (101 and/or 103) with parameters of an exploration strategy 105.

[0071] Still referring to Figure 6, in some embodiments, the PDF {p_n} 109 for a set of available BLER targets for a communication session is computed based on:

An exploitative model 107a available at network node 115 that takes as input user specific and/or network specific context information (101 and/or 103) and returns a set of exploitative score values {v_n}, with each score value v_n associated to one of the available BLER targets BLER_n; and

Parameters of the exploration strategy 105.

[0072] In other words, exploitative model 107a takes user context information 101 and/or network context information 103 and returns a set of scores {v_n}. The score values v_n represent the confidence that the exploitative model 107a assigns to each of the available BLER targets that can be configured for a communication session with a user device. The PDF {p_n} 109 of the target BLER values is computed by combining the exploitative score values {v_n} with parameters of the exploration strategy 105.

[0073] In one embodiment, if network node 115 uses an ε-greedy exploration strategy with associated parameter ε taking a value in the range [0, 1], the PDF 109 associated to the set of available BLER targets can be computed by determining, for each BLER target BLER_n, a probability function p_n defined as

$$p_n = (1 - \varepsilon)\, \mathbb{1}\{ n = \arg\max_{m} v_m \} + \frac{\varepsilon}{N} \tag{2}$$

where $\mathbb{1}\{\cdot\}$ is an indicator function that takes value 1 if the exploitative model 107a assigns the highest score to BLER target n (compared to the rest of the BLER targets) and takes value 0 otherwise.

[0074] In another embodiment, if network node 115 uses a softMax exploration strategy 105 with associated parameter t (as described above), the PDF 109 associated to the set of available BLER targets can be computed 107c by determining, for each BLER target BLER_n, a probability function p_n 109 defined as

$$p_n = \frac{\exp(v_n / t)}{\sum_{m=1}^{N} \exp(v_m / t)} \tag{3}$$

[0075] The set of probabilities {p_n} represents a PDF 109 of the available BLER targets, as each p_n takes values in the continuous interval [0, 1] (i.e., in mathematical notation, p_n ∈ [0, 1]) and the sum of all p_n values is one (i.e., in mathematical notation, Σ_n p_n = 1). The effect of equation (2) is to obtain a PDF similar to the PDF obtained through equation (1). A main difference in this embodiment is that the exploitative model 107a returns a set of score values, while in other embodiments described herein the exploitative model 107a directly returns a BLER target that is configured for the communication session with the user device if no exploration strategy 105 is used.
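The two score-based computations can be sketched in Python as follows. Two assumptions are made here that the text does not spell out: the ε-greedy variant places mass 1 - ε on the highest-scoring target and ε/N on every target, and the softMax parameter t acts as a temperature that divides the scores before exponentiation. Function names and the example scores are hypothetical.

```python
import math


def epsilon_greedy_pdf_from_scores(scores, epsilon):
    """PDF that favors the highest-scoring target (assumed form of the score-based rule)."""
    n = len(scores)
    best = max(range(n), key=lambda j: scores[j])  # first index with the highest score
    return [(1.0 - epsilon) * (1.0 if j == best else 0.0) + epsilon / n
            for j in range(n)]


def softmax_pdf(scores, temperature):
    """Softmax over scores; larger temperature gives a flatter PDF (assumed role of t)."""
    if temperature <= 0.0:
        raise ValueError("temperature must be positive")
    peak = max(scores)  # subtracting the maximum avoids overflow in exp
    weights = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


example_scores = [0.1, 0.5, 0.2, 0.1, 0.1]  # hypothetical model output
greedy = epsilon_greedy_pdf_from_scores(example_scores, epsilon=0.6)
soft = softmax_pdf(example_scores, temperature=1.0)
```

Under these assumptions the ε-greedy variant reproduces the same kind of PDF as the target-based computation, while the softmax variant keeps the relative ordering of all scores rather than only the best one.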

[0076] In one embodiment, the exploitative model 107a available at network node 115 takes as input user specific and/or network specific context information (101 and/or 103) and returns as an output a set of exploitative scores {v_n} such that each v_n takes values in the continuous interval [0, 1] (i.e., in mathematical notation, v_n ∈ [0, 1]) and the sum of all v_n is one (i.e., in mathematical notation, Σ_n v_n = 1). In other words, the exploitative model 107a takes user context information 101 and/or network context information 103 as input and returns a probability distribution function (or just probabilities) 109 associated to the set of available BLER targets. The output of such an embodiment may be referred to herein as an exploitative PDF since it only depends on the exploitative model 107a (i.e., it is independent of the parameters of the exploration strategy 105).

[0077] Still referring to Figure 6, network node 115 computes a PDF {p_n} 109 for the set of available BLER targets by combining or computing 107c the exploitative PDF 107a (e.g., represented by the score values {v_n}) with parameters of the exploration strategy 105. In some embodiments, if network node 115 uses an ε-greedy exploration strategy with associated parameter ε taking a value in the range [0, 1], the PDF 109 associated to the set of available BLER targets can be computed 107c by determining, for each BLER target BLER_n, a probability function p_n defined as

$$p_n = (1 - \varepsilon)\, v_n + \frac{\varepsilon}{N} \tag{4}$$

[0078] Figures 7 and 8 illustrate example embodiments of a probability distribution function 109 computed based on Equation (4).

[0079] Figure 7 illustrates an exploitative-explorative probability density function for a set of BLER targets available at a network node in accordance with some embodiments of the present disclosure. Figure 7 provides a table illustrating exploitative-explorative probability density function 109 for the set of N BLER targets available at network node 115 based on Equation (4) and a given explorative PDF. In this example embodiment, it is assumed that there are N = 5 BLER targets that network node 115 can configure, namely {BLER₁ = 10%, BLER₂ = 25%, BLER₃ = 50%, BLER₄ = 75%, BLER₅ = 90%}; that network node 115 has computed an exploitative PDF 107b defined by the set of score values {v_n} in the first row of the table of Figure 7; and that network node 115 uses an ε-greedy exploration strategy with parameter ε = 0.6.

[0080] Figure 8 illustrates an exploitative-explorative probability density function for a set of BLER targets available at a network node in accordance with some embodiments of the present disclosure. Figure 8 illustrates an exploitative-explorative probability density function 109 for the set of N BLER targets available at network node 115 based on Equation (4) and a given explorative PDF according to Figure 7. In this example embodiment, it is assumed that there are N = 5 BLER targets that network node 115 can configure, namely {BLER₁ = 10%, BLER₂ = 25%, BLER₃ = 50%, BLER₄ = 75%, BLER₅ = 90%}; that network node 115 has computed an exploitative PDF 107a defined by the set of score values in the first row of the table of Figure 7; and that network node 115 uses an ε-greedy exploration strategy with parameter ε = 0.6.

[0081] In this example embodiment, the number of BLER targets available to network node 115 is assumed to be N = 5 BLER targets, namely {BLER₁ = 10%, BLER₂ = 25%, BLER₃ = 50%, BLER₄ = 75%, BLER₅ = 90%}. It is further assumed that network node 115 computed an exploitative PDF based on the exploitative model 107a characterized by the score values defined in Figure 7 (first row of table), and that network node 115 uses an ε-greedy exploration strategy 105 with parameter ε = 0.6. The second row of the table of Figure 7 shows an example of a probability distribution function 109 for N = 5 BLER targets available to network node 115 and ε = 0.6.

[0082] In the exemplary embodiment of Figures 7 and 8, an exploitative PDF 107a as illustrated in Figure 5 was assumed.
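One plausible reading of combining an exploitative PDF with an ε-greedy strategy is a mixture with the uniform distribution, p_n = (1 - ε) v_n + ε/N. The sketch below illustrates that reading only. The score values are hypothetical, since the table of Figure 7 is not reproduced in this text, and the function name is an assumption.

```python
def mix_with_uniform(exploit_pdf, epsilon):
    """Blend an exploitative PDF with a uniform PDF using weight epsilon."""
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must lie in [0, 1]")
    n = len(exploit_pdf)
    return [(1.0 - epsilon) * v + epsilon / n for v in exploit_pdf]


# Hypothetical exploitative PDF over N = 5 targets (not the Figure 7 values).
exploit_pdf = [0.05, 0.60, 0.20, 0.10, 0.05]
mixed = mix_with_uniform(exploit_pdf, epsilon=0.6)
```

Because both inputs sum to one, the mixture also sums to one, and every target keeps a probability of at least ε/N regardless of its score.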

[0083] Exemplary embodiments related to selecting a BLER target based on a PDF will now be described.

[0084] In some embodiments, network node 115 selects the BLER target to be configured for a communication session with a user device by drawing a BLER target at random from the set of available BLER targets according to the probability distribution function 109 generated for the set of available BLER targets.

[0085] In one exemplary embodiment, given a PDF {p_n} 109 for a set of available BLER targets, network node 115 chooses the BLER target by:

Computing a cumulative distribution function (CDF) {q_n} associated to the PDF 109. For example, the values q_n are determined as the cumulative sum of the probabilities p_i for indexes i = 1, ..., n, i.e., q_n = p_1 + ... + p_n. Thus, each q_n takes a value between 0 and 1, with q_n ≤ q_{n+1} for all indices n.

Choosing a random number x between 0 and 1 using a uniform probability distribution.

Finding the smallest value of the index n such that the relation q_{n-1} ≤ x ≤ q_n holds (with q_0 = 0), where the symbol ≤ indicates “less than or equal to”.

Choosing the BLER target to be configured for the communication session as the BLER target corresponding to the value of index n found in the previous step.

[0086] Figure 9 illustrates BLER target selection for a communication session with a user device based on a probability distribution function associated to a set of available BLER targets in accordance with some embodiments of the present disclosure. Figure 9 illustrates an example of BLER target selection for a communication session with a user device based on a probability distribution function associated to the set of available BLER targets.

[0087] Referring to Figure 9, a BLER target is selected to be configured for a communication session with a user device based on a probability distribution function 109 associated to the set of available BLER targets. The example of Figure 9 uses the PDF 109 illustrated in Figure 5, wherein the set of available BLER targets is {BLER₁ = 10%, BLER₂ = 25%, BLER₃ = 50%, BLER₄ = 75%, BLER₅ = 90%}, with BLER targets indexed by n = 1, 2, ..., 5, and the corresponding PDF 109 is {p₁ = 0.12, p₂ = 0.52, p₃ = 0.12, p₄ = 0.12, p₅ = 0.12}. The associated CDF is {q₁ = 0.12, q₂ = 0.64, q₃ = 0.76, q₄ = 0.88, q₅ = 1.0}.

[0088] The size of the vertical beams (that is, the vertical lines in Figure 9) associated to each BLER target determines the likelihood of selecting each BLER target by choosing a random number with uniform distribution. Therefore, in this example, if a random number x = 0.74 is chosen from a uniform distribution, BLER₃ = 50% would be selected for the communication session with the user device.

[0089] It is noted that this is different from choosing the BLER target with the highest probability, i.e., BLER₂ = 25% in this example. The method of this exemplary embodiment thus allows network node 115 to choose a BLER target that has a lower probability, according to the distribution given by the PDF 109.
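The CDF-based selection and the Figure 9 example can be sketched in Python using the standard library. The function name, the optional `x` argument (which lets the example fix the random draw), and the 0-based index are assumptions made for illustration.

```python
import bisect
import itertools
import random


def select_index(pdf, x=None, rng=random):
    """Return the smallest 0-based index n whose cumulative probability is >= x."""
    cdf = list(itertools.accumulate(pdf))  # running sums q_1, ..., q_N
    if x is None:
        x = rng.random()  # uniform draw in [0, 1)
    n = bisect.bisect_left(cdf, x)  # first position with cdf[n] >= x
    return min(n, len(cdf) - 1)  # guard against rounding in the last CDF entry


bler_targets = [0.10, 0.25, 0.50, 0.75, 0.90]
pdf = [0.12, 0.52, 0.12, 0.12, 0.12]
chosen = bler_targets[select_index(pdf, x=0.74)]  # 0.74 lies between 0.64 and 0.76
```

With x = 0.74 the sketch returns the third target (50%), whereas always taking the most probable entry would return the second target (25%).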

[0090] Exemplary embodiments related to data collection and training will now be described.

[0091] Referring first to data collection, in some embodiments, network node 115 further:

Configures the user device to measure one or more key performance indicator(s) associated to the communication session configured based on the selected BLER target; and/or

Receives a measurement report message from the user device comprising one or more measurements of key performance indicator associated to the communication session that has been configured based on the selected BLER target.

[0092] Network node 115 further determines a data tuple associated to the communication session with the user device including, for example:

A representation of a state x associated to the task of selecting the BLER target for a user device;

The BLER target selected for a communication session with the user device, e.g., BLER_i:

One or more KPI r associated to the communication session configured with the selected BLER target; and

One or more parameters characterizing the exploration strategy 105, as described above with reference to the discussion of exploration strategies 105 for selecting a BLER target for a communication session with the user device.

[0093] The representation of the state x associated to the task of selecting the BLER target for a user device may include information associated to one or more cell controlled by network node 115, but it may also include information associated to the state of neighboring network nodes in the network. The state representation may further include information associated to the state prior to selecting a BLER target to be configured for a communication session with a user device. In other words, the state representation may be the information used by the network node to determine the BLER target to be configured for the communication session with the user device. As such, the state representation may comprise a set of features x including network context information 103 and/or user specific context information 101 as described above.

[0094] In some embodiments, information characterizing the exploration strategy 105 for BLER target selection of a session is the probability p_n with which the BLER target n was configured by network node 115 for a communication session with a user device.

[0095] Data tuples associated to the communication session with the user devices can be stored by network node 115 and used to train and improve the exploitative model 107a.
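A data tuple of the kind described above could be represented as in the following Python sketch. The class names, field names, KPI keys, and feature values are all hypothetical; the disclosure only lists the kinds of information a tuple may contain (state x, selected BLER target, KPI r, and exploration information such as p_n).

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class BlerSample:
    """One data tuple collected for a communication session."""
    state: Tuple[float, ...]       # feature vector x built from context information
    bler_index: int                # 0-based index of the configured BLER target
    kpis: Dict[str, float]         # measured KPI values r, keyed by KPI name
    selection_probability: float   # probability p_n with which the target was chosen


class SampleStore:
    """Minimal in-memory store of samples kept for later training."""

    def __init__(self) -> None:
        self._samples: List[BlerSample] = []

    def add(self, sample: BlerSample) -> None:
        if not 0.0 <= sample.selection_probability <= 1.0:
            raise ValueError("selection_probability must lie in [0, 1]")
        self._samples.append(sample)

    def for_target(self, bler_index: int) -> List[BlerSample]:
        """Return all stored samples that used the given BLER target."""
        return [s for s in self._samples if s.bler_index == bler_index]

    def __len__(self) -> int:
        return len(self._samples)


store = SampleStore()
store.add(BlerSample(state=(0.3, 12.0), bler_index=2,
                     kpis={"throughput_mbps": 41.5}, selection_probability=0.12))
```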

[0096] An advantage of such an embodiment may be that empirical data can be collected directly from network measurements in relation to a BLER target configured for the communication session with a user device. Unlike supervised learning techniques, which would require knowing which BLER target was optimal for a given state representation, various embodiments described herein do not require prior knowledge of the optimal BLER target associated to a state representation. It may be noted that in some systems, it may not be possible to determine a priori the optimal BLER target associated to be configured for a communication session as the BLER target is computed prior to starting a communication session from which KPI measurements can be determined.

[0097] A further advantage of this embodiment may be that it can solve the issue of training and improving the exploitative model 107a based on data samples correctly representing the environment wherein the exploitative model 107a is deployed. As such, the exploitative model may be able to adapt to real-time changes of the network reflected in user context information 101 and/or network context information 103. A severe performance loss can otherwise occur when the exploitative model 107a is trained based on a data set that does not correctly represent the environment where exploitative model 107a is used to select BLER targets for communication sessions. This issue could occur, for instance, if the exploitative model 107a is trained based on synthetic data generated in a simulation environment, or when the exploitative model 107a is trained based on empirical data collected in a limited part of the system (e.g., one or more radio cells) which is not representative of the deployment environment (other radio cells of the system) where the exploitative model 107a is used. In other words, any difference between training and deployment environments such as network configurations (e.g., inter site distance and number of adjacent cells), traffic situations (e.g., mobile broad-band, IoT, video or voice, etc.), environmental differences that might affect radio signal propagation, among other things may have significant impact to the accuracy of the supervised model to the point that it may result in performance degradation.

[0098] Various embodiments of the present disclosure may resolve the above-referenced shortcomings by allowing training of a model based on samples of experience derived from interactive network and user measurements.

[0099] In one embodiment, network node 115 further transmits data tuples associated to the communication session with the user device(s) to a second network node (e.g., 300). The second network node includes a storage unit (e.g., memory 305) and a control unit (e.g., at least one processor 303). The storage unit is used to store data tuples associated to the communication session with different user devices. The control unit retrieves historical data from the storage unit for training/updating the exploitative model 107a.

[00100] Training will now be described.

[00101 ] In one embodiment, network node 115 further trains/updates the exploitative model 107a for selecting the BLER target for communication sessions with user device(s) based on historical data. Each data sample of the historical data is associated to a BLER target selected for a communication session with a user device.

[00102] Historical data samples collected upon selection of BLER targets for the communication session with different user devices are, therefore, used to train the exploitative model 107a. The data samples include data samples collected from one or more radio cells controlled by network node 115.

[00103] In some embodiments, network node 115 further

Transmits a request for historical data associated to the task of BLER target selection to the second network node (e.g., network node 300);

Receives from the second network node (e.g., network node 300) a set of historical data associated to the task of BLER target selection; and/or

Trains/updates the exploitative model 107a based on the set of historical data and/or data stored at network node 115.

[00104] In such embodiments, network node 115 requests data samples from a second network node (e.g., network node 300). The second network node includes a storage unit (e.g., memory 305) for data collected by network node 115. In addition, the second network node further stores data samples collected by other network nodes (e.g., network nodes 300). In this case, network node 115 therefore trains the exploitative model 107a with data collected by other network nodes, e.g., in other radio cells not controlled by network node 115. This may allow increasing the diversity of the data samples and therefore may improve the generalization capacity of the exploitative model 107a.

[00105] In one embodiment, network node 115 further receives from the second network node (e.g., network node 300) one or more updated exploitative models 107a for the task of selecting BLER targets for communication sessions with user devices.

[00106] In such an embodiment, the second network node trains/determines/updates the exploitative model 107a using historical data associated to the task of BLER target selection. The discussion of exploitative models described above provides examples of suitable models that can be trained for selecting a BLER target. In the event that network node 115 uses an ensemble of more than one model (see e.g., the discussion of exploration strategies 105 described above), the second network node transmits one or more exploitative models 107a to network node 115.

[00107] In another embodiment, network node 115 further requests from the second network node a more up-to-date exploitative model 107a for the task of selecting BLER targets for communication sessions with user devices.

[00108] Exemplary embodiments for training will now be described.

[00109] In some embodiments, parameters of an exploitative model 107a (e.g., weights of an artificial neural network, support vector machine, linear or non-linear regression model) are calculated or updated using suitable optimization techniques. For example, progressively updating model parameters is performed via training. While data collection described above describes training data collected from history of a user device BLER selection session, the training described above describes network node(s) where training takes place.

[00110] Two example embodiments in which exploitative models 107a are utilized in accordance with inventive concepts are described below.

[00111] In one embodiment, network node 115 determines an exploitative BLER target based on an exploitative model 107a, as shown for example in Figure 4. As such, the computation for the BLER target at the exploitative model 107a, parameterized by q, is of the following form

$$BLER_t = \arg\max_{BLER_n} f(x, BLER_n, q)$$

in which the prediction function f and its associated parameters q are calculated/updated in the training process. Input to the exploitative model 107a is a set represented by x which contains network and user context information (101 and/or 103) associated with a communication session. Output of the exploitative model 107a is BLER_t, the BLER target value corresponding to the maximum predicted value for the given input x.
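By way of illustration only, the selection of an exploitative BLER target as the target with the maximum predicted value may be sketched in Python as follows. The names `exploitative_bler_target` and `toy_predict_kpi` are hypothetical, and the toy predictor merely stands in for a trained prediction function f(x, BLER_n, q); neither is part of the disclosure.

```python
from typing import Callable, Sequence


def exploitative_bler_target(
    context: Sequence[float],
    bler_targets: Sequence[float],
    predict_kpi: Callable[[Sequence[float], float], float],
) -> float:
    """Return the BLER target whose predicted KPI is largest for this context."""
    return max(bler_targets, key=lambda bler: predict_kpi(context, bler))


def toy_predict_kpi(context: Sequence[float], bler: float) -> float:
    # Hypothetical stand-in for a trained model: the predicted KPI peaks
    # when the BLER target equals 0.1 * context[0].
    return -(bler - 0.1 * context[0]) ** 2


targets = [0.01, 0.1, 0.3, 0.5]
best = exploitative_bler_target([3.0], targets, toy_predict_kpi)
```

In this toy setting the predictor peaks near 0.3, so that target is returned; with a real model the predictor would be the trained exploitative model 107a evaluated once per available BLER target.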

[00112] In another embodiment, network node 115 determines an exploitative PDF of the available BLER targets (see e.g., Figure 6). As such, the computation for the BLER target at the exploitative model 107a, parameterized by q, can be of the following form

$$p = \mathrm{SoftMax}_t\big(f(x, BLER_1, q), \ldots, f(x, BLER_N, q)\big)$$

in which the prediction function f and its associated parameters q are calculated/updated in the training process. Input to the exploitative model 107a is a set represented by x which contains network and user context information (101 and/or 103) associated with a communication session. The values f(x, BLER_n, q), n = 1, ..., N, correspond to the predicted KPI (reward) values for different BLER targets potentially achievable for a communication session characterized by user context information 101 and network context information 103, x. Output of the exploitative model 107a is a SoftMax operator with a suitably selected scalar t. It may be noted that the SoftMax operator takes the functional values as input and returns a point p = (p_1, ..., p_N) on the probability simplex; e.g.,

$$p_n \ge 0, \qquad \sum_{n=1}^{N} p_n = 1.$$
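As a non-limiting sketch, a SoftMax operator that maps predicted KPI values to a point on the probability simplex may be written in Python as follows. The function name `softmax_pdf` is hypothetical, and the choice to divide the scores by the scalar t (the common "temperature" convention) is an assumption of this sketch rather than something stated in the text.

```python
import math
from typing import List, Sequence


def softmax_pdf(scores: Sequence[float], t: float = 1.0) -> List[float]:
    """Map predicted KPI values to probabilities that are non-negative and sum to 1."""
    if t <= 0:
        raise ValueError("t must be positive")
    scaled = [s / t for s in scores]  # assumption: scores are divided by t
    m = max(scaled)                   # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


pdf = softmax_pdf([1.0, 2.0, 3.0], t=1.0)
flatter_pdf = softmax_pdf([1.0, 2.0, 3.0], t=10.0)
```

Under this convention a larger t flattens the distribution, so less-favored BLER targets are explored more often, while a small t concentrates probability on the highest-scoring target.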

[00113] Algorithmic details of exemplary embodiments that are used for training the exploitative model will now be described. The following description is in the context of the exploitative model exemplary embodiments described above. Training of an exploitative model 107a is formulated as a mathematical optimization of the following form

$$\min_{q} \; \sum_{t=1}^{T} w_t \, L\big(r_t, f(x_t, BLER_t, q)\big) + R(q)$$

[00114] where t = 1, ..., T indexes the training samples, T being the number of training samples (the number of BLER selection communication sessions that network node 115 has in its training data); r_t is the measured KPI (or a function of it) collected after setting a BLER target for the communication session t. The term L(·) represents the loss function in the optimization and takes various forms, for example, a squared loss and/or a hinge loss as used in support vector machines, etc.

[00115] The term f(x_t, BLER_t, q) is a prediction function in which x_t denotes contextual information (101 and/or 103) associated with session t; BLER_t is the BLER target selected by network node 115 at session t; and q represents the exploitative model 107a parameters. The parameter w_t is a positive scalar value which represents a weight on individual samples (training information corresponding to communication session t). In one embodiment, w_t is proportional to the inverse of the probability p_n with which BLER target BLER_n was selected by network node 115 at session t, that is, $w_t \propto 1/p_n$. The function f(x, BLER_n, q) estimates the network KPI, or reward in machine learning terminology, associated to any given context x and BLER target BLER_n. The parameters q are trained to best fit the sum of KPIs (rewards) over all BLER selection sessions available in the training data.

[00116] A regularization term R(q) may sometimes be added to the optimization to introduce certain properties to the optimization or to the structure of model parameters. For example, an ℓ2-norm regularization term parametrized by a scalar λ > 0,

$$R(q) = \lambda \lVert q \rVert_2^2,$$

introduces smoothness properties that may lead to improved convergence of numerical algorithms that solve the optimization for training. In another embodiment, an ℓ1-norm regularization term, $R(q) = \lambda \lVert q \rVert_1$, may favor sparse solutions of model parameters q and thereby may reduce the risk of overfitting.
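For illustration, training with sample weights inversely proportional to the selection probability may be sketched in Python as follows. This is a minimal sketch under stated assumptions: a linear prediction function, a squared loss, a squared ℓ2 regularizer, plain gradient descent, and a hypothetical feature map `features`; the sample values are invented purely to exercise the code and none of these choices is prescribed by the text.

```python
from typing import List, Tuple

# (context x_t, selected BLER_t, measured KPI r_t, selection probability p_t)
Sample = Tuple[List[float], float, float, float]


def features(x: List[float], bler: float) -> List[float]:
    # Hypothetical feature map: bias term, context values, BLER target.
    return [1.0, *x, bler]


def predict(q: List[float], x: List[float], bler: float) -> float:
    # Linear stand-in for the prediction function f(x, BLER, q).
    return sum(qi * fi for qi, fi in zip(q, features(x, bler)))


def objective(q: List[float], samples: List[Sample], lam: float) -> float:
    # Weighted squared loss with w_t = 1 / p_t, plus lam * ||q||_2^2.
    loss = sum((1.0 / p) * (r - predict(q, x, bler)) ** 2 for x, bler, r, p in samples)
    return loss + lam * sum(qi * qi for qi in q)


def train(samples: List[Sample], dim: int, lam: float = 0.01,
          lr: float = 0.01, steps: int = 500) -> List[float]:
    q = [0.0] * dim
    for _ in range(steps):
        grad = [2.0 * lam * qi for qi in q]  # gradient of the regularizer
        for x, bler, r, p in samples:
            phi = features(x, bler)
            err = predict(q, x, bler) - r
            for j, f in enumerate(phi):
                grad[j] += 2.0 * (1.0 / p) * err * f
        q = [qi - lr * g for qi, g in zip(q, grad)]  # gradient descent step
    return q


# Invented toy samples, for demonstration only.
samples: List[Sample] = [
    ([0.2], 0.1, 1.0, 0.5),
    ([0.8], 0.3, 0.5, 0.25),
    ([0.5], 0.1, 0.9, 0.5),
    ([0.9], 0.3, 0.4, 0.25),
]
q_trained = train(samples, dim=3)
```

Weighting each sample by 1 / p_t gives rarely selected BLER targets a larger influence, which compensates for the selection policy having favored other targets when the data were collected.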

[00117] The optimization for training, e.g., minimizing the loss function with respect to exploitative model parameters q, is solved using suitable numerical optimization algorithms including variants of gradient descent, BFGS, or higher order methods such as Newton's method.

[00118] These and other related operations will now be described in the context of the operational flowcharts of Figures 10-13 of operations that are performed by a network node (e.g., network node 300) according to some embodiments of inventive concepts.

[00119] Referring initially to Figure 10, operations can be performed by a network node (e.g., 300 in Fig. 3) in a radio communication network. Network node 300 determines (107, 1001) a probability density function 109 of a set of block error rate targets for a communication session with a user device 200 based on at least one or more of user device context information 101, network context information 103, and an exploration strategy 105. Network node 300 further performs operations for selecting (111, 1003) a block error rate target from the set of block error rate targets based on the probability density function 109. Network node 300 further performs operations for configuring 1005 radio transmission parameters for the communication session with the user device 200 based on the selected block error rate target 113.

[00120] In various embodiments, the user context information 101 includes one or more of: at least one key performance indicator associated with the communication session between the user device and the network node; at least one channel quality indicator measurement of the communication session between the user device and the network node; at least one reference signal received power measurement for a downlink or an uplink reference signal; a quality of service requirement for the communication session between the user device and the network node; at least one signal attenuation measurement between the user device and the network node; at least one timing advance measurement associated to the user device; at least one measurement of time a signal takes to reach the network node from the user device, or to reach from the network node to the user device and from the user device to the network node; at least one interference measurement for the communication session between the user device and the network node; and a type of the user device. 
[00121 ] In various embodiments, the network context information 103 includes one or more of: at least one key performance indicator associated with one or more cells of the radio communication network; a traffic load and/or radio resource utilization value associated to at least one of a serving cell, an interfering cell, and a network node; at least one inter-site distance information between a serving network node and neighboring network nodes for the user device; at least one propagation loss measurement of radio signals from/to the serving network node to/from the user device; at least one propagation of radio signals from/to an interfering network node from/to the user device; at least one interference measurement of radio signals from/to an interfering network node to/from the user device; at least one interference measurement of radio signals from/to the serving network node to/from other interfering user devices; a number of neighboring cells or radio network nodes that can interfere with the user device; a type of neighboring cells or radio network nodes; at least one power setting parameter for the serving network node or at least an interfering network node; a type of traffic and/or traffic distribution in the serving radio cell or serving network node; a type of traffic and/or traffic distribution in neighboring radio cells or network nodes; and at least one mobility settings parameter in the serving radio cell or neighboring radio cells.

[00122] In various embodiments, the exploration strategy 105 includes one or more of an epsilon-greedy exploration strategy; a t-first exploration strategy; a SoftMax exploration strategy; and an ensemble exploration strategy comprising a parameter K having an integer value greater than or equal to 1.

[00123] In various embodiments, the operations performed by network node 300 for configuring 1005 radio transmission parameters for the communication session with the user device 200 based on the selected block error rate target include adapting a modulation and coding scheme, for the communication session with the user device 200, based on the selected block error rate target.

[00124] In various embodiments, the probability density function 109 includes a probability for each block error rate target in the set of block error rate targets available at the network node 300.

[00125] In some embodiments, the operations performed by network node 300 for selecting 1003 a block error rate target from the set of block error rate targets based on the probability density function 109 include obtaining a block error rate target at random from the set of block error rate targets according to the probability density function 109.

[00126] In some embodiments, the operations performed by network node 300 for determining 1001 a probability density function 109 include inputting the user device context information 101 and/or the network context information 103 into an exploitative model 107a available at the network node 300 to generate at least one block error rate target parameter 107b. The operations further include computing 107c the probability density function 109 of the set of block error rate targets using the at least one block error rate target parameter and the exploration strategy 105.

[00127] In some embodiments, the operations performed by network node 300 for selecting 1003 the block error rate target from the set of block error rate targets based on the probability density function 109 include selecting the block error rate target at random from the set of BLER targets, wherein each block error rate target has a probability of being selected that is based on its respective probability in the probability density function 109.

[00128] In some embodiments, the operations performed by network node 300 for selecting 1003 the block error rate target from the set of block error rate targets based on the probability density function 109 include computing a cumulative distribution function associated with the probability density function for indexes i = 1 to N, wherein the number of indexes N is equal to the number of block error rate targets in the set of block error rate targets and the index i identifies one of the block error rate targets in the set of block error rate targets. The operations further include determining a random number x from a range between 0 and 1 using a uniform probability distribution. The operations further include finding a smallest value of the index n such that F(n-1) ≤ x ≤ F(n), where F(n) denotes the cumulative distribution function evaluated at index n; and selecting the block error rate target corresponding to the smallest value of the index n.
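The cumulative-distribution-based selection described above may be sketched in Python as follows, for illustration only. The function name `sample_bler_target` and the example values are hypothetical; indexes are 0-based here, whereas the text uses 1-based indexes.

```python
import bisect
import itertools
import random
from typing import Optional, Sequence


def sample_bler_target(
    bler_targets: Sequence[float],
    pdf: Sequence[float],
    rng: Optional[random.Random] = None,
) -> float:
    """Draw one BLER target according to pdf using its cumulative distribution."""
    rng = rng or random.Random()
    cdf = list(itertools.accumulate(pdf))  # cdf[n] = pdf[0] + ... + pdf[n]
    x = rng.random()                       # uniform random number in [0, 1)
    n = bisect.bisect_left(cdf, x)         # smallest n with x <= cdf[n]
    n = min(n, len(bler_targets) - 1)      # guard against round-off in cdf[-1]
    return bler_targets[n]


targets = [0.01, 0.1, 0.3]
pdf = [0.2, 0.5, 0.3]
chosen = sample_bler_target(targets, pdf, random.Random(0))
```

Because `bisect_left` returns the first position whose cumulative value is not smaller than x, each target is drawn with a frequency that approaches its probability in the PDF over many draws.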

[00129] In some embodiments, the at least one block error rate target parameter 107b includes an exploitative block error rate target from the set of block error rate targets. The operations performed by network node 300 for computing 107c the probability density function 109 for the set of block error rate targets using the at least one block error rate target parameter and the exploration strategy 105 include combining the exploitative block error rate target with at least one parameter of the exploration strategy 105.

[00130] In some embodiments, the exploration strategy 105 is an epsilon-greedy strategy. The at least one parameter of the exploration strategy 105 is epsilon (ε) having a value in the range [0, 1], and computing the probability density function 109 by combining the exploitative block error rate target 107b with the at least one parameter of the exploration strategy includes computing 107c a probability p_n for each block error rate target in the set of block error rate targets according to

$$p_n = \begin{cases} 1 - \varepsilon + \dfrac{\varepsilon}{N}, & n = i \\ \dfrac{\varepsilon}{N}, & n \ne i \end{cases}$$

wherein $1 - \varepsilon + \varepsilon/N$ is used for the exploitative block error rate target chosen based on the user device context information 101 and/or the network context information 103, wherein $\varepsilon/N$ is used for the remainder of the block error rate targets in the set of block error rate targets, wherein n is the index of block error rate targets in the set of block error rate targets defined as n = 1, ..., N, and wherein i is the index of the exploitative block error rate target.
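As a hedged illustration, an epsilon-greedy probability density function over the BLER targets may be computed in Python as follows. This sketch assumes the common variant in which the exploration mass epsilon is spread uniformly over all N targets and the remaining 1 - epsilon is added to the exploitative target; the function name `epsilon_greedy_pdf` is hypothetical, and indexes are 0-based.

```python
from typing import List


def epsilon_greedy_pdf(num_targets: int, exploit_index: int, epsilon: float) -> List[float]:
    """Return one probability per BLER target under an epsilon-greedy strategy."""
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must lie in [0, 1]")
    if not 0 <= exploit_index < num_targets:
        raise ValueError("exploit_index out of range")
    # Assumption: every target receives epsilon / N ...
    pdf = [epsilon / num_targets] * num_targets
    # ... and the exploitative target additionally receives 1 - epsilon.
    pdf[exploit_index] += 1.0 - epsilon
    return pdf


pdf = epsilon_greedy_pdf(num_targets=4, exploit_index=2, epsilon=0.2)
```

With epsilon equal to 0 the strategy is purely exploitative, and with epsilon equal to 1 every BLER target is equally likely to be selected.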

[00131 ] In some embodiments, the exploitative model includes one or more of: a feedforward neural network; a recurrent neural network; a convolutional neural network; an ensemble of neural networks; a decision tree; a random decision forest; a linear regression or classification model; and a non-linear regression or classification model.

[00132] Referring to Figure 11, further operations that can be performed by network node 300 in a radio communication network include configuring 1101 the user device 200 to measure one or more key performance indicators associated with the communication session configured based on the selected block error rate target. The operations further include receiving 1103 a measurement report message from the user device 200 including one or more measurements of the one or more key performance indicators.

[00133] Referring to Figure 12, further operations that can be performed by network node 300 in a radio communication network includes determining 1201 a data tuple associated with the communication session with the user device 200. The data tuple includes one or more of: a representation of a state associated with the communication session with the user device 200 prior to selecting a block error rate target from the set of block error rate targets based on the probability density function 109; a representation of a state associated with the communication session with the user device 200 after selecting and configuring a block error rate target from the set of block error rate targets based on the probability density function 109; the selected block error rate target; at least one or more key performance indicators associated with the communication session configured upon selecting the selected block error rate target; and at least one or more parameters of the exploration strategy 105.
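One possible in-memory representation of such a data tuple is sketched below in Python, for illustration only. The class name `BlerSessionTuple` and all field names and example values are hypothetical; the text requires only that the tuple contain one or more of the listed elements, which is why most fields are optional here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class BlerSessionTuple:
    """One training sample collected for a communication session."""
    state_before: List[float]                  # state prior to selecting the BLER target
    selected_bler_target: float                # the BLER target that was selected
    state_after: Optional[List[float]] = None  # state after the target was configured
    kpis: Dict[str, float] = field(default_factory=dict)                # measured KPIs
    exploration_params: Dict[str, float] = field(default_factory=dict)  # e.g. epsilon


sample = BlerSessionTuple(
    state_before=[0.4, 12.0],
    selected_bler_target=0.1,
    kpis={"throughput_mbps": 42.0},
    exploration_params={"epsilon": 0.1},
)
```

Storing the exploration parameters alongside the selected target allows the selection probability to be recovered later, for example when weighting samples during training.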

[00134] Referring to Figure 13, further operations that can be performed by network node 300 in a radio communication network include receiving 1301, from a second network node 300, historical data associated with selecting a block error rate target for other user devices from the set of block error rate targets based on the probability density function 109. The operations further include performing 1303 training of the exploitative model 107a for selecting a block error rate target from the set of block error rate targets based on the probability density function 109 using the historical data. Each data sample of the historical data is associated with a block error rate target selected for the communication session between the network node 300 and the user device 200. At least one parameter of the exploitative model 107a is determined by progressively updating the at least one parameter during the training based on the historical data. For example, the at least one parameter of the exploitative model 107a (e.g., a feedforward neural network) is user context information 101 and/or the network context information 103.

[00135] Referring to Figures 10-13, in various embodiments, the at least one block error rate parameter 107b includes a set of exploitative score values indexed by an integer i = 1 to n, where each exploitative score value in the set of exploitative score values is associated with a block error rate target in the set of block error rate targets. The computing 107c the probability density function 109 for the set of block error rate targets using the at least one block error rate target parameter and the exploration strategy 105 includes combining the set of exploitative score values with at least one parameter of an exploration strategy 105.

[00136] Referring to Figures 10-13, in various embodiments, the exploration strategy 105 is an epsilon-greedy strategy. The at least one parameter of the exploration strategy is epsilon (ε) having a value in the range [0, 1]. Computing the probability density function 109 by combining the set of exploitative score values 107b with the at least one parameter of the exploration strategy includes computing 107c a probability p_n for each block error rate target in the set of block error rate targets according to

$$p_n = (1 - \varepsilon)\,\mathbb{1}_n + \frac{\varepsilon}{N}$$

wherein $\mathbb{1}_n$ is an indicator function having a value of 1 when the exploitative model 107a assigns a highest score to a block error rate target n compared to the remainder of the block error rate targets in the set of block error rate targets and has a value of 0 otherwise, and wherein n is the index of block error rate targets in the set of block error rate targets defined as n = 1, ..., N.

[00137] Referring to Figures 10-13, in various embodiments, the exploration strategy 105 is a SoftMax exploration strategy. The at least one parameter of the exploration strategy 105 is t having a value of greater than or equal to 1. Computing the probability density function 109 by combining the exploitative block error rate target 107b with the at least one parameter of the exploration strategy 105 includes computing a probability p_n for each block error rate target in the set of block error rate targets according to

$$p_n = \frac{\exp\big(f(x, BLER_n, q)/t\big)}{\sum_{j=1}^{N} \exp\big(f(x, BLER_j, q)/t\big)}$$

wherein n is the index of block error rate targets in the set of block error rate targets defined as n = 1, ..., N.

[00138] Aspects of the present disclosure have been described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable instruction execution apparatus, create a mechanism for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

[00139] These computer program instructions may also be stored in a computer readable medium that when executed can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions when stored in the computer readable medium produce an article of manufacture including instructions which when executed, cause a computer to implement the function/act specified in the flowchart and/or block diagram block or blocks. The computer program instructions may also be loaded onto a computer, other programmable instruction execution apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatuses or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

[00140] It is to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of this specification and the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.

[00141] The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various aspects of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

[00142] The terminology used herein is for the purpose of describing particular aspects only and is not intended to be limiting of the disclosure. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items. Like reference numbers signify like elements throughout the description of the figures.

[00143] The corresponding structures, materials, acts, and equivalents of any means or step plus function elements in the claims below are intended to include any disclosed structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of the present disclosure has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the disclosure in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the disclosure. The aspects of the disclosure herein were chosen and described in order to best explain the principles of the disclosure and the practical application, and to enable others of ordinary skill in the art to understand the disclosure with various modifications as are suited to the particular use contemplated.

Claims

CLAIMS:

1. A method performed by a network node (300) in a radio communication network, the method comprising: determining (107, 1001) a probability density function (109) of a set of block error rate targets for a communication session with a user device (400) based on at least one or more of user device context information (101), network context information (103), and an exploration strategy (105); selecting (111, 1003) a block error rate target from the set of block error rate targets based on the probability density function; and configuring (1005) radio transmission parameters for the communication session with the user device based on the selected block error rate target (113).

2. The method of Claim 1 , wherein the user context information comprises one or more of: at least one key performance indicator associated with the communication session between the user device and the network node; at least one channel quality indicator measurement of the communication session between the user device and the network node; at least one reference signal received power measurement for a downlink or an uplink reference signal; a quality of service requirement for the communication session between the user device and the network node; at least one signal attenuation measurement between the user device and the network node; at least one timing advance measurement associated to the user device; at least one measurement of time a signal takes to reach the network node from the user device, or to reach from the network node to the user device and from the user device to the network node; at least one interference measurement for the communication session between the user device and the network node; and a type of the user device.

3. The method of any of Claims 1 to 2, wherein the network context information comprises one or more of: at least one key performance indicator associated with one or more cells of the radio communication network; a traffic load and/or radio resource utilization value associated to at least one of a serving cell, an interfering cell, and a network node; at least one inter-site distance information between a serving network node and neighboring network nodes for the user device; at least one propagation loss measurement of radio signals from/to the serving network node to/from the user device; at least one propagation of radio signals from/to an interfering network node from/to the user device; at least one interference measurement of radio signals from/to an interfering network node to/from the user device; at least one interference measurement of radio signals from/to the serving network node to/from other interfering user devices; a number of neighboring cells or radio network nodes that can interfere with the user device; a type of neighboring cells or radio network nodes; at least one power setting parameter for the serving network node or at least an interfering network node; a type of traffic and/or traffic distribution in the serving radio cell or serving network node; a type of traffic and/or traffic distribution in neighboring radio cells or network nodes; and at least one mobility settings parameter in the serving radio cell or neighboring radio cells.

4. The method of any of Claims 1 to 3, wherein the exploration strategy comprises one or more of: an epsilon-greedy exploration strategy; a t-first exploration strategy; a SoftMax exploration strategy; and an ensemble exploration strategy comprising a parameter K having an integer value greater than or equal to 1.

5. The method of any of Claims 1 to 4, wherein the configuring radio transmission parameters for the communication session with the user device based on the selected block error rate target comprises adapting a modulation and coding scheme, for the communication session with the user device, based on the selected block error rate target.

6. The method of any of Claims 1 to 5, wherein the probability density function comprises a probability for each block error rate target in the set of block error rate targets available at the network node.

7. The method of any of the Claims 1 to 6, wherein the selecting a block error rate target from the set of block error rate targets based on the probability density function comprises: obtaining a block error rate target at random from the set of block error rate targets according to the probability density function.

8. The method of any of Claims 1 to 7, wherein the determining a probability density function comprises: inputting the user device context information (101) and/or the network context information (103) into an exploitative model (107a) available at the network node to generate at least one block error rate target parameter (107b); and computing (107c) the probability density function of the set of block error rate targets using the at least one block error rate target parameter and the exploration strategy.

9. The method of any of Claims 1 to 8, wherein the selecting the block error rate target from the set of block error rate targets based on the probability density function comprises: selecting the block error rate target at random from the set of BLER targets, wherein each block error rate target has a probability of being selected that is based on its respective probability in the probability density function.

10. The method of Claim 9, wherein the selecting a block error rate target from the set of block error rate targets based on the probability density function comprises: computing a cumulative distribution function associated with the probability density function for indexes i = 1 to N, wherein the number of indexes N is equal to the number of block error rate targets in the set of block error rate targets and the index i identifies one of the block error rate targets in the set of block error rate targets; determining a random number x from a range between 0 and 1 using a uniform probability distribution; finding a smallest value of the index n such that F(n-1) ≤ x ≤ F(n), where F(n) denotes the cumulative distribution function evaluated at index n; and selecting the block error rate target corresponding to the smallest value of the index n.

11. The method of any of Claims 8 to 10, wherein the at least one block error rate target parameter (107b) comprises an exploitative block error rate target from the set of block error rate targets, and wherein the computing (107c) the probability density function for the set of block error rate targets using the at least one block error rate target parameter and the exploration strategy comprises combining the exploitative block error rate target with at least one parameter of the exploration strategy.

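12. The method of Claim 11, wherein the exploration strategy (105) is an epsilon-greedy strategy, wherein the at least one parameter of the exploration strategy is epsilon (ε) having a value in the range [0, 1], and wherein computing the probability density function by combining the exploitative block error rate target (107b) with the at least one parameter of the exploration strategy comprises computing (107c) a probability p_n for each block error rate target in the set of block error rate targets according to

$$p_n = \begin{cases} 1 - \varepsilon + \dfrac{\varepsilon}{N}, & n = i \\ \dfrac{\varepsilon}{N}, & n \ne i \end{cases}$$

wherein $1 - \varepsilon + \varepsilon/N$ is used for the exploitative block error rate target chosen based on the user device context information and/or the network context information, wherein $\varepsilon/N$ is used for the remainder of the block error rate targets in the set of block error rate targets, wherein n is the index of block error rate targets in the set of block error rate targets defined as n = 1, ..., N, and wherein i is the index of the exploitative block error rate target.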

13. The method of Claim 8, wherein the exploitative model comprises one or more of: a feedforward neural network; a recurrent neural network; a convolutional neural network; an ensemble of neural networks; a decision tree; a random decision forest; a linear regression or classification model; and a non-linear regression or classification model.

14. The method of any of Claims 1 to 13, further comprising: configuring (1101) the user device to measure one or more key performance indicators associated with the communication session configured based on the selected block error rate target; and receiving (1103) a measurement report message from the user device comprising one or more measurements of the one or more key performance indicators.

15. The method of any of Claims 1 to 14, further comprising: determining (1201) a data tuple associated with the communication session with the user device, wherein the data tuple comprises one or more of: a representation of a state associated with the communication session with the user device prior to selecting a block error rate target from the set of block error rate targets based on the probability density function; a representation of a state associated with the communication session with the user device after selecting and configuring a block error rate target from the set of block error rate targets based on the probability density function; the selected block error rate target; one or more key performance indicators associated with the communication session configured upon selecting the selected block error rate target; and one or more parameters of the exploration strategy.
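By way of non-limiting illustration, the data tuple of Claim 15 may be represented as a simple record. The class name `BlerSelectionSample`, the field names, and the example values below are hypothetical; every field is optional in this sketch because the claim recites "one or more of" the listed elements.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence


@dataclass(frozen=True)
class BlerSelectionSample:
    """One data tuple for a communication session (illustrative sketch)."""

    # State representation before the BLER target was selected.
    state_before: Optional[Sequence[float]] = None
    # State representation after the BLER target was selected and configured.
    state_after: Optional[Sequence[float]] = None
    # The BLER target that was selected.
    selected_bler_target: Optional[float] = None
    # Key performance indicators measured for the configured session.
    kpis: Optional[Dict[str, float]] = None
    # Parameters of the exploration strategy, e.g. {"epsilon": 0.1}.
    exploration_params: Optional[Dict[str, float]] = None


sample = BlerSelectionSample(
    state_before=[0.4, 12.0],
    selected_bler_target=0.1,
    kpis={"throughput_mbps": 35.2},
    exploration_params={"epsilon": 0.1},
)
print(sample.selected_bler_target)
```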

16. The method of any of Claims 1 to 15, further comprising: receiving (1301), from a second network node, historical data associated with selecting a block error rate target for other user devices from the set of block error rate targets based on the probability density function; performing training of the exploitative model (1303) for selecting a block error rate target from the set of block error rate targets based on the probability density function using the historical data, wherein each data sample of the historical data is associated with a block error rate target selected for the communication session between the network node and the user device, and wherein at least one parameter of the exploitative model is determined by progressively updating the at least one parameter during the training based on the historical data.

17. The method of any of Claims 1 to 16, wherein the at least one block error rate parameter (107b) comprises a set of exploitative score values indexed by an integer i = 1 to n, where each exploitative score value in the set of exploitative score values is associated with a block error rate target in the set of block error rate targets, and wherein the computing (107c) the probability density function for the set of block error rate targets using the at least one block error rate target parameter and the exploration strategy comprises combining the set of exploitative score values with at least one parameter of an exploration strategy.

18. The method of Claim 17, wherein the exploration strategy (105) is an epsilon-greedy strategy, wherein the at least one parameter of the exploration strategy is epsilon having a value in the range [0, 1], and wherein computing the probability density function by combining the set of exploitative score values (107b) with the at least one parameter of the exploration strategy comprises computing (107c) a probability p_n for each block error rate target in the set of block error rate targets according to

[equation not reproduced in the source text]

wherein [symbol not reproduced in the source text] is an indicator function having a value of 1 when the exploitative model assigns a highest score to a block error rate target n compared to the remainder of the block error rate targets in the set of block error rate targets and has a value of 0 otherwise, and wherein n is the index of block error rate targets in the set of block error rate targets defined as n = 1, ..., N.
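By way of non-limiting illustration, an epsilon-greedy probability computation of the kind recited in Claims 12 and 18 may be sketched as follows. The equation of the claim does not appear in this text, so the sketch assumes one commonly used form, p_n = (1 - epsilon) * indicator(n) + epsilon / N, which may differ from the claimed expression. The function name `epsilon_greedy_pdf` and the example scores are hypothetical.

```python
def epsilon_greedy_pdf(scores, epsilon):
    """Return a probability for each BLER target (illustrative sketch).

    scores  -- exploitative score value per BLER target, index n = 1..N
    epsilon -- exploration parameter in the range [0, 1]

    Assumed form (not taken from the claim text):
        p_n = (1 - epsilon) * indicator(n) + epsilon / N
    where indicator(n) is 1 for the highest-scoring target and 0 otherwise.
    """
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must lie in [0, 1]")
    if not scores:
        raise ValueError("scores must not be empty")

    n_targets = len(scores)
    # Index of the highest score; ties resolve to the first such index.
    best = max(range(n_targets), key=lambda i: scores[i])

    return [
        (1.0 - epsilon) * (1.0 if i == best else 0.0) + epsilon / n_targets
        for i in range(n_targets)
    ]


print(epsilon_greedy_pdf([0.1, 0.9, 0.3], epsilon=0.3))
```

Under this assumed form, epsilon = 0 places all probability on the highest-scoring block error rate target, and epsilon = 1 yields a uniform distribution over the set.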

19. The method of Claim 11, wherein the exploration strategy is a SoftMax exploration strategy, wherein the at least one parameter of the exploration strategy is t having a value greater than or equal to 1, and wherein computing the probability density function by combining the exploitative block error rate target (107b) with the at least one parameter of the exploration strategy comprises computing a probability p_n for each block error rate target in the set of block error rate targets according to [equation not reproduced in the source text].
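By way of non-limiting illustration, a SoftMax exploration computation may be sketched as follows. The expression of Claim 19 does not appear in this text, so the sketch assumes the widely used Boltzmann form p_n = exp(s_n / t) / sum_k exp(s_k / t) applied to a hypothetical per-target score s_n; the claimed expression, which combines an exploitative block error rate target with t, may differ. The function name `softmax_pdf` and the example scores are hypothetical.

```python
import math


def softmax_pdf(scores, temperature):
    """Return a SoftMax probability for each BLER target (illustrative sketch).

    scores      -- hypothetical score s_n per BLER target
    temperature -- parameter t, taken here to be >= 1 as in the claim

    Assumed form (not taken from the claim text):
        p_n = exp(s_n / t) / sum_k exp(s_k / t)
    """
    if temperature < 1.0:
        raise ValueError("temperature must be >= 1")
    if not scores:
        raise ValueError("scores must not be empty")

    # Subtract the maximum score for numerical stability; the result is unchanged.
    top = max(scores)
    weights = [math.exp((s - top) / temperature) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


print(softmax_pdf([1.0, 2.0, 3.0], temperature=1.0))
```

Under this assumed form, a larger value of t flattens the distribution toward uniform and therefore increases exploration.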

20. A computer program comprising instructions which, when executed on at least one processor (303), cause the at least one processor to carry out a method according to any one of Claims 1 to 19.

21. A computer program product comprising: a non-transitory computer readable medium storing instructions that, when executed on at least one processor (303), cause the at least one processor to carry out a method according to any one of Claims 1 to 19.

22. A network node (300) configured to operate in a radio communication network, the network node comprising: at least one processor (303); and a memory (305) coupled with the at least one processor, wherein the memory includes instructions that, when executed by the at least one processor, cause the network node to perform the operations according to any of Claims 1 to 19.