WO2023202768A1 - Methods, apparatus and machine-readable media relating to machine-learning in a communication network - Google Patents

Methods, apparatus and machine-readable media relating to machine-learning in a communication network

Info

Publication number
WO2023202768A1
Authority
WO
WIPO (PCT)
Prior art keywords
network
machine
entity
entities
optimization
Application number
PCT/EP2022/060458
Other languages
English (en)
Inventor
Martin Isaksson
Rickard CÖSTER
Original Assignee
Telefonaktiebolaget Lm Ericsson (Publ)
Application filed by Telefonaktiebolaget Lm Ericsson (Publ) filed Critical Telefonaktiebolaget Lm Ericsson (Publ)
Priority to PCT/EP2022/060458 priority Critical patent/WO2023202768A1/fr
Publication of WO2023202768A1 publication Critical patent/WO2023202768A1/fr


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08: Configuration management of networks or network elements
    • H04L41/0803: Configuration setting
    • H04L41/0813: Configuration setting characterised by the conditions triggering a change of settings
    • H04L41/082: Configuration setting characterised by the conditions triggering a change of settings, the condition being updates or upgrades of network functionality
    • H04L41/0823: Configuration setting characterised by the purposes of a change of settings, e.g. optimising configuration for enhancing reliability
    • H04L41/085: Retrieval of network configuration; tracking network configuration history
    • H04L41/0853: Retrieval of network configuration; tracking network configuration history by actively collecting configuration information or by backing up configuration information
    • H04L41/12: Discovery or management of network topologies
    • H04L41/122: Discovery or management of network topologies of virtualised topologies, e.g. software-defined networks [SDN] or network function virtualisation [NFV]
    • H04L41/14: Network analysis or design
    • H04L41/145: Network analysis or design involving simulating, designing, planning or modelling of a network
    • H04L41/147: Network analysis or design for predicting network behaviour
    • H04L43/00: Arrangements for monitoring or testing data switching networks
    • H04L43/08: Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0805: Monitoring or testing based on specific metrics, by checking availability
    • H04L43/0811: Monitoring or testing based on specific metrics, by checking connectivity
    • H04L43/0852: Monitoring or testing based on specific metrics: delays
    • H04L43/0876: Network utilisation, e.g. volume of load or congestion level
    • H04L43/0888: Throughput
    • H04L43/10: Active monitoring, e.g. heartbeat, ping or trace-route

Definitions

  • Embodiments of the disclosure relate to machine-learning, and particularly to methods, apparatus and machine-readable media relating to machine-learning in a communication network.
  • In a communication network, wireless devices are connected to a core network via a radio access network.
  • The core network operates according to a Service Based Architecture (SBA), in which services are provided by network functions via defined application programming interfaces (APIs).
  • Network functions in the core network use a common protocol framework based on Hypertext Transfer Protocol 2 (HTTP/2).
  • a network function can also invoke services in other network functions through these APIs.
  • Examples of core network functions in the 5G architecture include the Access and mobility Management Function (AMF), Authentication Server Function (AUSF), Session Management Function (SMF), Policy Charging Function (PCF), Unified Data Management (UDM) and Operations, Administration and Management (OAM).
  • For example, an AMF may request subscriber authentication data from an AUSF by calling an API provided by the AUSF.
  • The 5G core network also comprises a Network Data Analytics Function (NWDAF).
  • FIG. 1 shows an NWDAF 102 connected to a network function (NF) 104.
  • the network function 104 may be any suitable network function (e.g. an AMF, an AUSF or any other network function).
  • The NWDAF 102 connects to an Event Exposure Function at the network function over an Nnf reference point (as detailed in the 3GPP documents TS 23.502 v16.0.2 and TS 23.288 v16.0.0).
  • the NWDAF 102 can then receive data from the network function over the Nnf reference point by subscribing to reports from the network function or by requesting data from the network function.
  • the timing of any reports may be determined by timeouts (e.g. expiry of a timer) or may be triggered by events (e.g. receipt of a request).
  • the types of data that can be requested by the NWDAF 102 from the network function may be standardised.
  • For the network function 104 to be discoverable by the NWDAF 102 (or any other service consumer such as, for example, another network function), the network function 104 registers with a Network function Repository Function (NRF).
  • FIG. 2 shows an illustration of an NRF 208 connected to three network functions, NF A 202, NF B 204 and NF C 206 that are registered at the NRF 208.
  • the NRF 208 may be preconfigured with information about the network functions 202-206, or each of the network functions 202-206 may have performed a network registration procedure with the NRF 208 to register at the NRF 208.
  • Once a network function is registered at the NRF 208, another entity in the network may discover the network function by calling a discovery function at the NRF 208.
  • NF B 204 may discover NF A 202 and NF C 206 by calling a discovery function at the NRF 208.
  • Machine-learning in the context of 5G networks is typically large-scale and may be executed in a cloud (virtualised) environment where performance and security are prioritised.
  • this means that the data available for training models using machine-learning may be distributed across many entities in the network, and that data should ideally be collated at one network entity to be used for developing models using machine-learning. Collating these datasets at a single network entity can be slow and resource intensive, which is problematic for time-critical applications.
  • some applications require the use of data sets comprising sensitive or private data, and collating these data at a single network entity may have security implications.
  • One such collaborative machine-learning technique is federated machine-learning.
  • a plurality of training agents each train respective local copies of a machine-learning model using respective local training data sets.
  • Local copies of the machine-learning model are transmitted to an aggregation entity, where they are combined to generate an updated global copy of the machine-learning model.
  • the updated global copy of the machine-learning model is then made available to the training agents (for further training), or to other entities for implementation.
  • the process may be repeated multiple times, with each iteration termed a “training round” herein.
  • the machine-learning model may be trained based on a large, diverse and scalable data set, without transmitting that data over a network.
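  • By way of illustration, the following Python sketch shows one training round of such a scheme. All names are hypothetical, and the "gradient" is a stand-in for a real gradient computed on each agent's local training data; this is a sketch of the general technique, not the method defined in this disclosure.

        import numpy as np

        def local_training(global_weights, local_data, lr=0.01):
            """Hypothetical training agent: returns an updated local copy of the
            model. The fake gradient below stands in for a gradient computed
            from the agent's local training dataset."""
            fake_gradient = np.asarray(local_data)
            return global_weights - lr * fake_gradient

        def training_round(global_weights, agent_datasets):
            """One training round: each agent trains its local copy, and the
            aggregation entity combines the copies into an updated global copy
            (here, by simple averaging)."""
            local_copies = [local_training(global_weights, d) for d in agent_datasets]
            return np.mean(local_copies, axis=0)

        global_w = np.zeros(4)                              # initial global copy
        datasets = [np.random.randn(4) for _ in range(3)]   # one dataset per agent
        for _ in range(5):                                  # five training rounds
            global_w = training_round(global_w, datasets)
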
  • a paper by Isaksson and Norrman discloses one approach to federated learning in 5G mobile networks.
  • One problem associated with federated learning and other collaborative machine-learning techniques is the selection of clients or agents to perform training within each training round.
  • a communication system may comprise many hundreds or thousands of network entities that are capable of training a local copy of a machine-learning model for later aggregation as part of a collaborative machine-learning process.
  • While federated learning overcomes the security problems of transmitting potentially sensitive data over the network, significant communication costs are nonetheless associated with transmitting local updates to the machine-learning model over the communication system. It is likely to be prohibitively expensive in terms of system resources to use all available network entities in any given training round. In practice, selection of a subset of the available network entities is required to limit the amount of data transmitted over the system.
  • each network entity will be associated with a different set of metrics, in respect of its local copy of the machine-learning model (which may perform well or poorly), the training data set which is available to it (which may be diverse from or similar to the training data sets of other entities), and its geographical location in the communication system. If entities are selected poorly, the promised benefits of collaborative machine-learning — reduced communication cost, diverse training data, improved performance — will not be realized.
  • Embodiments of the disclosure address these and other problems.
  • the disclosure provides a method performed by a first network entity in a communications network.
  • the method comprises: transmitting one or more query messages.
  • the one or more query messages comprises an indication of one or more attributes to be fulfilled by second network entities for participation in a collaborative machine-learning process to update a global copy of a machine-learning model.
  • the one or more attributes comprise an attribute relating to underfitting or overfitting of respective local copies of the machine-learning model maintained by the second network entities.
  • the method further comprises: receiving information identifying a first plurality of second network entities fulfilling the one or more attributes; and selecting, from the first plurality of second network entities, one or more second network entities for participation in the collaborative machine-learning process.
  • the disclosure provides a first network entity to perform the method recited above.
  • a further aspect provides a computer program for performing the method recited above.
  • a computer program product, comprising the computer program, is also provided.
  • the disclosure provides a method performed by a second network entity in a communications network.
  • the second network entity maintains a local copy of a machine-learning model.
  • the method comprises: registering, with a network function repository entity, a profile of the second network entity.
  • the profile comprises an indication of an ability of the second network entity to participate in a collaborative machine-learning process to train the machine-learning model.
  • the method further comprises: training the local copy of the machine-learning model using training data accessible to the second network entity; and, responsive to a determination that the local copy of the machine-learning model is underfitted or has become overfitted to training data accessible to the second network entity: transmitting a message comprising an indication that the local copy of the machine-learning model is underfitted or has become overfitted.
  • the disclosure provides a second network entity to perform the method recited above.
  • a further aspect provides a computer program for performing the method recited above.
  • a computer program product, comprising the computer program, is also provided.
  • the first network entity comprises processing circuitry and a non-transitory machine-readable medium storing instructions which, when executed by the processing circuitry, cause the first network entity to: transmit one or more query messages, the one or more query messages comprising an indication of one or more attributes to be fulfilled by second network entities for participation in a collaborative machine-learning process to update a global copy of a machine-learning model, wherein the one or more attributes comprises an attribute relating to underfitting or overfitting of respective local copies of the machine-learning model maintained by the second network entities; receive information identifying a first plurality of second network entities fulfilling the one or more attributes; and select, from the first plurality of second network entities, one or more second network entities for participation in the collaborative machine-learning process.
  • the second network entity maintains a local copy of a machine-learning model.
  • the second network entity comprises processing circuitry and a non-transitory machine-readable medium storing instructions which, when executed by the processing circuitry, cause the second network entity to: register, with a network function repository entity, a profile of the second network entity, the profile comprising an indication of an ability of the second network entity to participate in a collaborative machine-learning process to train the machine-learning model; train the local copy of the machine-learning model using training data accessible to the second network entity; and responsive to a determination that the local copy of the machine-learning model is underfitted or has become overfitted to training data accessible to the second network entity: transmit a message comprising an indication that the local copy of the machine-learning model is underfitted or has become overfitted.
  • Figure 1 shows a network data analytics function connected to a network function
  • Figure 2 shows a network function repository function connected to three network functions
  • Figure 3 shows a system according to embodiments of the disclosure
  • Figure 4 is a schematic illustration of a method according to embodiments of the disclosure.
  • Figure 5 shows a network connectivity diagram according to embodiments of the disclosure
  • Figure 6 shows a network connectivity diagram according to further embodiments of the disclosure.
  • Figures 7 and 8 are schematic illustrations of underfitting or overfitting of a machine-learning model during training.
  • Figure 9 is a signalling diagram of a method of client selection according to embodiments of the disclosure.
  • Figure 10 is a flowchart of a method according to embodiments of the disclosure.
  • Figure 11 is a flowchart of a method according to further embodiments of the disclosure.
  • Figures 12 and 13 are schematic diagrams of an apparatus according to embodiments of the disclosure.
  • FIG. 3 shows a system 300 in a communication network according to embodiments of the present disclosure.
  • One or more entities of the system may, for example, form part of a core network in the communication network.
  • the core network may be a Fifth Generation (5G) Core Network (5GCN).
  • the communication network may implement any suitable communications protocol or technology, such as Global System for Mobile communication (GSM), Wideband Code-Division Multiple Access (WCDMA), Long Term Evolution (LTE), New Radio (NR), WiFi, WiMAX, or Bluetooth wireless technologies.
  • The network forms part of a cellular telecommunications network, such as the type developed by the 3rd Generation Partnership Project (3GPP).
  • the system 300 comprises at least two network entities or network functions (NFs).
  • three network entities, NF A 302, NF B 304 and NF C 306, are shown, although the skilled person will appreciate that the system 300 may comprise many more network entities than shown.
  • the network entities 302-306 are configured to provide one or more services.
  • the network entities may be any type or combination of types of network entities or network functions.
  • one or more of the network entities 302-306 may comprise core network entities or functions such as an access and mobility management function (AMF), an authentication server function (AUSF), a session management function (SMF), a policy control function (PCF), and/or a unified data management (UDM) function.
  • one or more of the network entities 302-306 may be implemented within entities outside the core network, such as radio access network nodes (e.g., base stations such as gNBs, eNBs etc or parts thereof, such as central units or distributed units).
  • the network entities 302-306 may be implemented in hardware, software, or a combination of hardware and software.
  • Each of the network entities 302-306 is registered at a network registration entity 310 that also forms part of the system 300.
  • the network registration entity is a Network function Repository Function (NRF) 310.
  • the NRF 310 may thus store information for each of the network entities 302-306 registered there.
  • the stored information may include one or more of: a type of each of the network entities 302-306; a network address (e.g., IP address) of the network entities; services provided by the network entities; and capabilities of the network entities.
  • the network entities 302-306 are discoverable by other entities in the network.
  • the system 300 further comprises a Network Data Analytics Function (NWDAF) 308.
  • NWDAF 308 is configured to collect network data from one or more network entities, and to provide network data analytics information to network entities which request or subscribe to receive it.
  • NWDAF may provide information relating to network traffic or usage (e.g. predicted load information or statistics relating to historical load information).
  • the network data analytics information provided by the NWDAF may, for example, be specific to the whole network, or to part of the network such as a network entity or a network slice.
  • a network slice may be a logical partition (e.g. a virtual network) in the communications network.
  • the network slice may be dedicated to a particular use-case or end-user.
  • the network slice may comprise one or more network functions for a particular use-case or end-user.
  • a network slice may be defined in hardware.
  • the network slice may comprise a set of servers dedicated for a particular use-case or enduser.
  • the network slice may be isolated from or independent of other parts of the communication network. For example, the transfer of information between a network slice and other parts of the communication network (e.g. other network slices) may be prohibited or restricted.
  • the network data analytics information provided by the NWDAF 308 may comprise forecasting data (e.g. an indication of a predicted load for a network function) and/or historical data (e.g. an average number of wireless devices in a cell in the communication network).
  • the network data analytics information provided by the NWDAF may include, for example, performance information (e.g. a ratio of successful handovers to failed handovers, ratio of successful setups of Protocol Data Unit (PDU) Sessions to failed setups, a number of wireless devices in an area, an indication of resource usage etc.).
  • communication networks are becoming increasingly automated, with network designers seeking to minimise the level of human intervention required during operation.
  • One way of achieving this is to use the data collected in communication networks to train models using machine-learning, and to use those models in the control of the communication network.
  • the models can be updated and adapted to suit the needs of the network.
  • conventional methods for implementing machine-learning in communication networks require collating data for training models at one network entity. Collating these data at a single network entity, such as the NWDAF 308, can be slow and resource intensive and may be problematic if the data is sensitive in nature.
  • a collaborative (e.g. federated) learning process is used to train a model using machine-learning.
  • instances of the model are trained locally at multiple network functions to obtain local updates to parameters of the model at each network entity.
  • the local model updates are collated at an aggregator network entity (such as the NWDAF) and combined to obtain a combined model update.
  • the NWDAF 308 initiates training of a model using machine-learning at each of the network functions, NF A 302, NF B 304 and NF C 306.
  • the NWDAF 308 may transmit a message to each of the network functions 302-306 instructing the network function to train a model using machine-learning.
  • the message may comprise a copy of the model (e.g. a global copy that is common to each of the network functions 302-306), or each of the network functions 302-306 may be preconfigured with a copy of the model. In the latter case, the message may comprise an indicator of which model is to be trained.
  • the message may specify a type of machine-learning algorithm to be used by the network entities.
  • the network entities 302-306 may be preconfigured with the type of machine-learning algorithm to be used for a model.
  • each network entity 302-306 trains the model by inputting training data into the machine-learning algorithm to obtain a local model update to values of one or more parameters of the model.
  • the training data may be data that is unique to the network entity.
  • the training data may comprise data obtained from measurements performed by the network function and/or data collected by the network function from other network entities (e.g. data obtained from measurements performed by one or more other network entities).
  • the local model update may comprise updated values of the parameters of the model or the local model update may comprise an indication of a change in the values of the parameters of the model, e.g., differences between previous values for the parameters and updated values for the parameters.
  • Transmissions between the network entities 302-306 and the NWDAF 308 may be direct (e.g. the NWDAF 308 transmits directly to a network entity) or the transmissions may be via an intermediate network entity.
  • the transmission between the network functions 302-306 and the NWDAF 308 may be via an Operation, Administration and Management function (OAM) 312.
  • the NWDAF 308 thus receives the local model updates from each of the network entities 302-306.
  • the NWDAF 308 combines the model updates received from the network entities 302-306 to obtain a combined model update.
  • the NWDAF 308 may use any suitable operation for combining the model updates.
  • the NWDAF 308 may average the received local model updates to obtain an average model update.
  • the average may be a weighted average, with updates from different network entities being assigned different weights.
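  • A minimal sketch of this combination step follows. The function names and the choice of weighting each entity by its training-set size are assumptions for illustration only.

        import numpy as np

        def combine_updates(local_updates, weights=None):
            """Combine local model updates (vectors of parameter values or
            parameter changes) into a single combined model update."""
            updates = np.stack(local_updates)        # shape: (n_entities, n_params)
            if weights is None:
                return updates.mean(axis=0)          # plain average
            w = np.asarray(weights, dtype=float)
            w = w / w.sum()                          # normalise weights to sum to 1
            return (w[:, None] * updates).sum(axis=0)

        deltas = [np.array([0.1, -0.2]), np.array([0.3, 0.0])]
        combined = combine_updates(deltas, weights=[100, 400])  # weighted average
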
  • the NWDAF 308 transmits the combined model update to one or more network entities in the network.
  • the NWDAF 308 may send the combined model update to each of the network entities 302-306.
  • the combined model update may be transmitted to one or more further network entities in addition to the network entities 302-306 used to train the model.
  • This process may be repeated one or more times. For example, the process may be repeated until the local model updates received from each of the network entities 302-306 are consistent with each other to within a predetermined degree of tolerance. In another example, the process may be repeated until the combined model updates converge, i.e. a combined model update is consistent with a previous combined model update to within a predetermined degree of tolerance.
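  • Such a stopping rule might be implemented as follows (a sketch; the tolerance value and the use of the Euclidean norm are assumptions):

        import numpy as np

        def has_converged(current_update, previous_update, tolerance=1e-3):
            """True once a combined model update is consistent with the previous
            combined update to within the given tolerance."""
            if previous_update is None:
                return False                     # no previous round to compare against
            return np.linalg.norm(current_update - previous_update) < tolerance
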
  • Collaborative (e.g. federated) learning may thus be applied to communication networks (and in particular, to a core network in a communication network) to reduce latency, minimise resource overhead and reduce the risk of security problems.
  • a network entity initiates training of a machine-learning model at a plurality of other network entities in the communication network.
  • communication networks often comprise large numbers of network entities and only a fraction of these network entities may be configured to support collaborative learning.
  • participation in training of a particular model with a particular machine-learning algorithm may be associated with specific hardware or software requirements that only a subset of network entities satisfy.
  • Embodiments of the disclosure provide methods, apparatus and machine-readable media for selecting network entities for performing collaborative learning.
  • network entities which are capable of involvement in a collaborative learning process, and thus maintain and train local copies of a machine-learning model, register a profile with a network function repository entity (e.g., an NRF).
  • the profile comprises an indication of an ability of the network entity to participate in a collaborative learning process, such as federated learning, to train the machine-learning model.
  • the network entity trains the local copy of the model, and determines whether the local copy is underfitted or has become overfitted to training data which is accessible to the network entity.
  • the network entity transmits a message comprising an indication that the local copy of the machine-learning model is underfitted or has become overfitted.
  • the message may be transmitted to the network function repository entity, and used to update the profile for the network entity; or the message may be transmitted to another network entity which is selecting entities for involvement in a collaborative learning process.
  • the indication that the local copy of the model is overfitted or underfitted may be useful to network entities involved in collaborative learning processes such as federated learning.
  • a network entity which is selecting clients for involvement in a training round may select clients based, at least in part, on the indication as to whether local copies of the model maintained by those clients are overfitted or underfitted.
  • clients whose local copies are underfitted or overfitted may not be selected for involvement in a training round of a collaborative learning process.
  • the client selection may further be based on the indication of the time at which the underfitting or overfitting was determined.
  • clients whose local copies were determined to be underfitted or overfitted more than a threshold period of time before a current time may nonetheless be selected for involvement in a training round of a collaborative learning process.
  • the client selection process for a training round of a collaborative learning process may additionally or alternatively be based on network topology information.
  • the network topology information may comprise performance metrics for neighbour relations between network entities in the communications network (e.g., relating to handover between network entities).
  • the network topology information may comprise the network connectivity of network entities in the communications network.
  • clients or network entities may be selected so as to limit a communication cost of acquiring data (e.g., model updates) from the selected entities.
  • the network topology information may comprise information identifying a similarity of training data accessible to the network entities for training local copies of the machine-learning model.
  • network entities may be selected based on the similarity of training data so as to diversify the training data on which the collaborative machine-learning process is based (e.g., a first network entity having access to a first training dataset may not be selected for a particular training round if a second network entity is already selected having access to a second training dataset which is substantially similar to the first training dataset), as sketched below.
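  • The following sketch combines these criteria (underfitting/overfitting flag, time of flagging, communication cost, and training-data similarity) into a simple greedy selection. The field names, threshold values and the greedy policy itself are illustrative assumptions, not taken from the source.

        import time

        def select_clients(candidates, max_clients,
                           flag_window_s=3600.0, similarity_threshold=0.9):
            """Greedy client selection: skip entities recently flagged as
            underfitted/overfitted, prefer low communication cost, and skip
            entities whose training data closely resembles that of an
            already-selected entity."""
            now = time.time()
            eligible = [c for c in candidates
                        if not c["fit_flag"] or (now - c["flagged_at"]) > flag_window_s]
            eligible.sort(key=lambda c: c["comm_cost"])      # cheapest entities first
            selected = []
            for c in eligible:
                too_similar = any(
                    c["similarity"].get(s["nf_id"], 0.0) > similarity_threshold
                    for s in selected)
                if not too_similar:
                    selected.append(c)
                if len(selected) == max_clients:
                    break
            return selected

        candidates = [
            {"nf_id": "NF-A", "fit_flag": False, "flagged_at": 0.0,
             "comm_cost": 3, "similarity": {"NF-B": 0.95}},
            {"nf_id": "NF-B", "fit_flag": True, "flagged_at": time.time(),
             "comm_cost": 1, "similarity": {"NF-A": 0.95}},
            {"nf_id": "NF-C", "fit_flag": False, "flagged_at": 0.0,
             "comm_cost": 5, "similarity": {}},
        ]
        print([c["nf_id"] for c in select_clients(candidates, max_clients=2)])
        # -> ['NF-A', 'NF-C']: NF-B was flagged as overfitted too recently
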
  • the embodiments described herein thus provide an efficient method for selecting network entities to perform a collaborative learning process.
  • the embodiments described herein provide an efficient and reliable method for identifying candidates that are capable of performing a collaborative learning process without using overfitted or underfitted models.
  • Embodiments of the disclosure thus provide a more reliable method of selecting network entities for the performance of collaborative learning in a communications network.
  • Figure 4 is a diagram illustrating an overall method according to embodiments of the disclosure.
  • the signalling involves a co-ordination network entity 400, a network registration entity 404 and a network entity or function 402.
  • the co-ordination network entity 400 may comprise an NWDAF, such as the NWDAF 308 described above with respect to Figure 3.
  • the network registration entity 404 may comprise an NRF, such as the NRF 310 described above with respect to Figure 3.
  • the NF 402 may comprise any network function or entity which is capable of performing machine-learning.
  • the NF 402 may be a dedicated machine-learning entity (e.g., a server or other computing device) or an entity of the communications network which is capable of performing other functions and tasks in addition to machine-learning.
  • the NF 402 may utilize or implement the machine-learning model generated by the collaborative learning process to perform one or more tasks as will be described in greater detail herein.
  • the signalling also involves an OAM 406, which may correspond to the OAM 312 described above with respect to Figure 3.
  • other embodiments may not involve the OAM.
  • the signalling shown in Figure 4 permits the co-ordination network entity to select one or more network entities to participate in a collaborative learning process such as federated learning.
  • one or more network entities register with the network registration entity 404.
  • In registering with the network registration entity 404, a network entity provides information relating to the services provided by the network entity (such as the type of network entity or the function performed thereby), and/or the capability of the network entity. Such information may be stored by the network registration entity 404, and associated with addressing information enabling the network entity to be identified and addressed.
  • the NF 402 registers with the network registration entity 404, and particularly provides the network registration entity 404 with a profile comprising information relating to the services provided by the network entity (such as the type of network entity or the function performed thereby), and/or the capability of the network entity.
  • Such information may be stored by the network registration entity 404, and associated with an identifier (e.g., a unique number within the network) or some other means allowing the network entity to be identified and addressed.
  • the NF 402 is capable of participation in a collaborative machine-learning process to train local copies of a machine-learning model.
  • the co-ordination network entity 400 registers with the network registration entity 404, and particularly provides the network registration entity 404 with a profile comprising information relating to the services provided by the co-ordination network entity 400 (such as the type of network entity or the function performed thereby), and/or the capability of the network entity. Again, this information may be stored by the network registration entity 404, and associated with an identifier (e.g., a unique number within the network) or some other means allowing the co-ordination network entity 400 to be identified and addressed.
  • the NF 402 subscribes to a model update service with the coordination network entity 400. That is, the NF 402 trains (and potentially implements) a machine-learning model. In order to receive updates to that model, and benefit from training performed by other network entities in the communications network, the NF 402 subscribes to receive those updates from the co-ordination network entity 400.
  • the NF 402 transmits a subscription request message to the co-ordination network entity 400.
  • the subscription request message may comprise an indication of the machine-learning model to which the subscription relates (in case the co-ordination network entity maintains multiple different models).
  • the NF 402 receives a set of model weights for the machine-learning model from the co-ordination network entity 400.
  • the model weights may be nominal values if the machine-learning model is yet to be trained; alternatively, where the model has already undergone one or more rounds of training (e.g., using other network entities in the communications network), the model weights may reflect the results of those rounds of training.
  • the NF 402 is provided with initial weights for a local copy of the machine-learning model.
  • the NF 402 may thereafter begin to train that local copy of the machine-learning model based on training data which is local to the NF 402 and/or accessible to the NF 402.
  • the training data may comprise data relating to services provided by the NF 402, or by other network entities to which the NF 402 is connected.
  • the NF 402 maintains a local copy of the machine-learning model.
  • the local copy is subject to repeated and/or continuous training and thus may comprise weights which differ from the initial weights provided to the NF in step 416.
  • In a third phase, network topology information is collated in step 418. This phase may involve communication between the co-ordination network entity 400 and the NF 402.
  • network topology information is obtained by executing one or more traceroute or other route-finding procedures between the co-ordination network entity 400 and each of the NFs 402 that subscribed to the model update service in the second phase of the method described above.
  • Such procedures operate by transmitting one or more packets (e.g., IP packets) from a sending entity (e.g., the co-ordination network entity 400) to a receiving entity (e.g., the NF 402), and recording the route that each packet follows.
  • the output of a route-finding procedure is thus one or more connections between a sending entity and a receiving entity, and a record of the different network nodes (e.g., routers) along those connections.
  • Figure 5 shows network connectivity information according to this aspect: a series of connections from the co-ordination network entity 400 to a plurality of NFs 402, as well as the route that each connection takes. Note that a connection between the coordination network entity 400 and a first NF 402 may be at least partially co-routed with a connection between the co-ordination network entity 400 and a second NF 402, i.e., the connections follow the same path for at least some of their routes.
  • the communication cost of each connection may be calculated using any suitable metric.
  • One simple approach is to utilize the number of hops as the communication cost (e.g., where each hop costs ‘1’). More advanced approaches may take into account metrics such as: the available bandwidth of each hop (i.e., the communication cost of a hop with relatively little available bandwidth may be higher than the communication cost of a hop with relatively large amounts of available bandwidth); and the number of connections using the same link between network nodes (where a higher cost is associated with a link used by a higher number of connections, and vice versa).
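  • A sketch of such a hop-based cost calculation from recorded routes follows. The shared-link penalty below simply multiplies a hop's cost by the number of connections using that link; this particular rule is an assumption for illustration.

        from collections import Counter

        def hop_costs(routes):
            """routes maps each NF to the ordered list of nodes on its connection
            from the co-ordination entity. Base cost per hop is 1, scaled by the
            number of connections sharing that link."""
            link_usage = Counter()
            for route in routes.values():
                for link in zip(route, route[1:]):
                    link_usage[link] += 1
            return {nf: sum(link_usage[link] for link in zip(route, route[1:]))
                    for nf, route in routes.items()}

        routes = {"NF-A": ["coord", "r1", "r2", "nf_a"],
                  "NF-B": ["coord", "r1", "r3", "nf_b"]}  # shares coord->r1 with NF-A
        print(hop_costs(routes))   # shared links cost more than exclusive ones
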
  • the network connectivity information may be obtained by determining one or more metrics for the connections between a sending entity (e.g., the co-ordination network entity 400) and a receiving entity (e.g., the NF 402).
  • metrics may include one or more of: the delay or latency of the connection between the sending and receiving entities; and the throughput of the connection between the sending and receiving entities.
  • Figure 6 shows network connectivity information according to this aspect: a series of connections from the co-ordination network entity 400 to a plurality of NFs 402.
  • the connections in this aspect are modelled directly between the co-ordination network entity 400 and the respective NFs 402; no route is recorded.
  • the communication cost of each connection may correspond to the metric for the connection as determined above, e.g., the latency or throughput.
  • the communication cost may be derived from one or more of these metrics; for example, the communication cost may correspond to a sum of the latency and throughput, or some other function of the latency and throughput.
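  • For example (the particular functional form and coefficients below are assumptions; the text only requires some function of these metrics):

        def edge_cost(latency_ms, throughput_mbps, alpha=1.0, beta=100.0):
            """Assumed edge cost: increases with latency, decreases with
            throughput. Coefficients alpha and beta are illustrative."""
            return alpha * latency_ms + beta / max(throughput_mbps, 1e-9)
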
  • One output from step 418 is thus a map of respective connections (or ‘edges’) between the co-ordination network entity 400 and each of the subscribed NFs 402 or a subset of those subscribed NFs 402.
  • the weight attached to each edge may correspond to the communication cost of selecting that NF for involvement in a training round of a collaborative machine-learning process.
  • the network topology information may comprise multiple maps, with the weights for each map corresponding to a different metric (e.g., hop count, latency, throughput, etc).
  • the network topology information may comprise an indication of the similarity between training datasets available to the NFs 402.
  • Similarity may be calculated in several different ways:
  • Using proxy data for each training dataset, such as one or more of: metadata, configuration parameters, geographical information, etc.
  • proxy data may be collected by the co-ordination network entity 400 or the OAM 406, for example. In the latter case, the co-ordination network entity 400 may obtain the proxy data from the OAM 406.
  • This information may be shared with the co-ordination network entity 400 directly by the NFs 402, or via an intermediary node such as the OAM 406.
  • Using measurements taken from the local copies of the machine-learning model themselves. For example, measurements of the weights of the local copies of the machine-learning model, such as the gradients specified by each local copy, the smoothness of the weights, etc., may be compared between local copies as a measure of the similarity between the training datasets used to train those local copies of the model.
  • the model weights for the local copies may be shared with the co-ordination network entity 400 directly by the NFs 402, or via an intermediary node such as the network registration entity 404.
  • NFs which are geographically close to each other may have similar training datasets. This assumption will not always hold, but if there is a large number of NFs to select from, for involvement in collaborative machine-learning process, then discarding NFs having similar latencies and/or geographic locations may improve diversity without significantly affecting the availability of NFs for involvement in the collaborative machine-learning process.
  • Values for the similarity may include one or more of: Earth Mover's Distance (EMD), Kullback-Leibler divergence (KL), Jensen-Shannon divergence (JS), cosine similarity, Euclidean norm, etc.
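  • A sketch computing these similarity values for two proxy distributions (e.g., histograms of a configuration parameter, one per training dataset), using standard SciPy/NumPy routines:

        import numpy as np
        from scipy.spatial.distance import cosine, jensenshannon
        from scipy.stats import entropy, wasserstein_distance

        def dataset_similarity(p, q):
            """p and q are histograms of the same proxy feature over the same
            bins, one per training dataset."""
            p = np.asarray(p, float) / np.sum(p)
            q = np.asarray(q, float) / np.sum(q)
            bins = np.arange(p.size)
            return {
                "kl_divergence": entropy(p, q),       # Kullback-Leibler divergence
                "js_distance": jensenshannon(p, q),   # sqrt of the JS divergence
                "emd": wasserstein_distance(bins, bins, u_weights=p, v_weights=q),
                "cosine_similarity": 1.0 - cosine(p, q),
                "euclidean_norm": float(np.linalg.norm(p - q)),
            }

        print(dataset_similarity([5, 3, 2], [4, 4, 2]))
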
  • the network topology information may comprise neighbour relations between NFs 402 and/or metrics associated with those neighbour relations.
  • Neighbour relations may be established using automatic neighbour relations (ANR) procedures, an NF being in this context a radio access network entity. For example, an NF instructs its served UEs to report the identities of neighbouring NFs detected, for example, through system information broadcast.
  • the NFs may then report these neighbour relations to the co-ordination network entity 400.
  • Metrics may then be associated with each neighbouring NF, such as one or more of: a number of attempted handovers between NFs (e.g., in a given time window); a number of successful handovers between NFs (e.g., in a given time window); and a handover success rate between NFs.
  • the NFs may then report these metrics to the co-ordination network entity 400.
  • the co-ordination network entity 400 may obtain the information from the OAM 406.
  • Network topology information may thus be found for each of the NFs 402 that subscribed to the model update service in the second phase of the method described above.
  • a subset of the subscribed NFs may be chosen from which to obtain network topology information.
  • the subset of NFs may be chosen randomly from the full set of subscribed NFs.
  • the fourth phase represents one training loop of a collaborative machine-learning process (e.g., federated learning).
  • In step 420, the co-ordination network entity 400 transmits or broadcasts updated model parameters for the machine-learning model to the NF 402 and all NFs that have subscribed to the model update service (e.g., in the second phase described above).
  • Step 420 may correspond substantially to step 416, described above, but where an update to the machine-learning model has been generated.
  • In step 422, the co-ordination network entity 400, in communication with the NF 402, the registration network entity 404 and potentially the OAM 406, selects clients for participation in the training round. Further detail regarding this process is set out below with respect to Figure 9.
  • In step 424, the co-ordination network entity 400 subscribes to receive model updates from those clients that were selected in the preceding step.
  • this includes NF 402, and thus step 424 comprises the co-ordination network entity 400 transmitting a subscription request message to the NF 402.
  • the NF 402 transmits model updates to the coordination network entity in step 426.
  • the model updates may comprise new values or weights for the machine-learning model, or changes (gradients) to the values or weights for the machine-learning model.
  • the co-ordination network entity 400 combines or aggregates the model updates received from the selected clients (including NF 402) to generate an update to a global copy of the machine-learning model, and outputs the global copy of the machine-learning model to the network functions that are registered with the model update service.
  • the co-ordination network entity 400 may use any suitable operation for combining the model updates.
  • the co-ordination network entity 400 may average the received local model updates to obtain an average model update.
  • the average may be a weighted average, with updates from different network entities being assigned different weights.
  • the weights assigned to each update may be calculated as a function of performance metrics associated with neighbours of those network entities.
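  • For instance, the weights might be derived from handover success rates across each entity's neighbour relations. This particular scheme is an assumption; the source only states that the weights are a function of neighbour performance metrics.

        import numpy as np

        def kpi_weights(neighbour_stats):
            """One weight per selected entity, proportional to the handover
            success rate across that entity's neighbour relations."""
            rates = np.array([s["successful"] / max(s["attempted"], 1)
                              for s in neighbour_stats])
            return rates / rates.sum()           # weights sum to 1

        stats = [{"attempted": 200, "successful": 190},
                 {"attempted": 150, "successful": 90}]
        weights = kpi_weights(stats)   # feed into the aggregation step above
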
  • the fourth phase may be repeated for subsequent training rounds to further train and develop the global copy of the machine-learning model.
  • the third phase may also be repeated, such that the network topology information is updated for the client selection process in each training round.
  • Embodiments of the disclosure relate to the concept of underfitting or overfitting a machine-learning model to a training dataset.
  • Training of a machine-learning model is based on a training dataset.
  • the machine-learning model is applied to the training dataset, and the weights or parameters of the machine-learning model are updated based on an evaluation of its performance on that dataset.
  • the training dataset comprises a set of inputs and corresponding target outputs (e.g., labels) that the machine-learning model should learn to output based on the inputs.
  • Unsupervised learning processes, such as reinforcement learning, use rewards to provide positive feedback in the case of desired performance.
  • Where a reinforcement machine-learning model is required to select from a plurality of possible actions acting on a system (which is defined in the training dataset), different rewards may be associated with each action based on the outcome of performing that action. Actions leading to a more desirable state for the system may be associated with higher rewards.
  • the performance of the machine-learning model with respect to the training dataset is used to update the machine-learning model itself.
  • training datasets are finite in size and it cannot be guaranteed that a machine-learning model will perform adequately when applied to other sets of data than the training dataset, e.g., when implemented.
  • the machine-learning model may not generalize well to other data than the training dataset.
  • a portion of the data available to a training network entity is typically reserved as a test dataset.
  • the performance of the machine-learning model more generally is evaluated by applying the machine-learning model to the test dataset and measuring its accuracy (or other performance parameters).
  • Figure 7 is a schematic graph showing underfitting and overfitting of a machine-learning model undergoing training.
  • As time (t) increases, the machine-learning model undergoes training and its performance with respect to a training set (solid line) and a test set (dashed line) changes.
  • Initially, the performance of the machine-learning model improves with respect to both the training set and the test set.
  • ‘Loss’, a conceptual term covering loss of information or other inaccuracies as a result of the machine-learning model, reduces as the performance of the machine-learning model improves.
  • At this stage, the machine-learning model is underfitted; performance of the machine-learning model with respect to both the training dataset and the test dataset can be improved by further training.
  • the performance of the machine-learning model with respect to the test dataset reaches a turning point; further training of the machine-learning model with respect to the training dataset improves performance of the model with respect to the training dataset, but worsens the performance of the model with respect to the test dataset.
  • the model is less able to generalize, and performs less well on datasets different from the training dataset.
  • Between these two regimes, the performance of the machine-learning model is good, and may in some senses be considered optimized.
  • A further conceptual illustration of this phenomenon is shown in Figure 8.
  • Figure 8 shows input data (a training dataset); the function of the machine-learning model in this instance is to categorize or classify the input data into two classifications.
  • the machine-learning model is underfitted; the model does not adequately distinguish between the two classifications in the input data set and performs poorly by incorrectly classifying several inputs.
  • the machine-learning model is overfitted; the model performs exceptionally well with respect to the input training data and correctly classifies all input data; however, test data (not illustrated) which is similar to the input data but not exactly the same is likely to be incorrectly classified.
  • the machine-learning model classifies the input training data well (albeit not perfectly), but may also perform well with respect to test data which is similar to the training dataset.
  • a machine-learning model may be underfitted or overfitted.
  • a model which is underfitted at a first time t1 may perform well or even become overfitted at a later time t2, after further training.
  • a model which is overfitted at a first time t1 may become underfitted or perform well at a later time t2, when additional data is added to the training dataset.
  • Figure 9 is a schematic signalling diagram showing a client selection process according to embodiments of the disclosure. The process may be utilized in step 422 of the method described above with respect to Figure 4, and may involve communication between the co-ordination network entity 400, the NF 402, the network registration entity 404, and the OAM 406.
  • the signalling begins with a discovery process, in which the co-ordination network entity 400 discovers network entities which meet certain criteria, such as having the capability of participating in a collaborative machine-learning process.
  • the criteria may relate to “static” information, i.e., information which does not change substantively over time.
  • the co-ordination network entity 400 transmits a discovery request message to the network registration entity 404.
  • the discovery request message comprises a filter defining the criteria according to which the network registration entity is to identify network entities.
  • the network registration entity 404 stores profiles of network entities in the communications network, indicating their functions, capabilities and addressing information.
  • the discovery request message may comprise one or more of: a target network function type; a requesting network function type; a service name (e.g., identifying the purpose of the machine-learning model); an indication of hardware the network entities are to possess (e.g., a GPU).
  • the network registration entity responds with a list of network entities that fulfil the criteria in the discovery request message and, optionally, the addressing information to reach those network entities.
  • the discovery process enables the co-ordination network entity to discover network entities that are capable of participating in a collaborative machine-learning process, and optionally which meet certain static criteria around network function type and hardware.
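  • A sketch of such a discovery request over HTTP follows. The host is a placeholder; the 'target-nf-type', 'requester-nf-type' and 'service-names' query parameters follow 3GPP Nnrf_NFDiscovery conventions, but the service name used to express the machine-learning capability is an assumption.

        import requests

        NRF_URL = "https://nrf.example.invalid/nnrf-disc/v1/nf-instances"

        def discover_training_candidates():
            """Ask the network registration entity for network functions matching
            'static' criteria such as NF type and a machine-learning service name."""
            params = {
                "requester-nf-type": "NWDAF",
                "target-nf-type": "AMF",                # example target type
                "service-names": "nml-model-training",  # assumed service name
            }
            resp = requests.get(NRF_URL, params=params, timeout=5.0)
            resp.raise_for_status()
            return resp.json().get("nfInstances", [])
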
  • the co-ordination network entity searches, within those discovered network entities, based on dynamic criteria to find those network entities having suitable metrics that will improve the overall performance of the machine-learning model.
  • the signalling may follow one or both of two options.
  • the co-ordination network entity 400 obtains information relating to the dynamic performance of the network entities via the OAM 406.
  • the co-ordination network entity 400 transmits a data collection query to the OAM 406 for each of the network entities identified by the network registration entity in step 902.
  • the OAM 406 stores the required information to answer the data collection query for each of the network entities (e.g. in a cache). For example, the OAM 406 may store network traffic load information for each of the network entities. Thus, if the at least one query specifies that the network entities must have a network traffic load that is less than 50%, then the OAM 406 may determine which network entities satisfy this requirement and send an indication of which network entities satisfy this requirement to the co-ordination network entity 400.
  • the OAM 406 may transmit data collection queries to each of the network entities in step 906 (including the NF 402), and receive data from those network entities (or a subset of those entities) in step 908. This data is then forwarded to the co-ordination network entity 400 in step 910.
  • the co-ordination network entity 400 obtains information relating to the dynamic performance of the network entities through direct communication with those entities.
  • the co-ordination network entity transmits query messages directly to the network entities identified by the network registration entity 404 in step 902, and receives information back from the network entities (or a subset of those entities) in step 914.
  • the signalling may request data from the identified network entities, or define performance attributes that the network entities are to meet.
  • the co-ordination network entity 400 may receive data from all entities, and then make a selection of clients for involvement in a training round based on that data.
  • the co-ordination network entity 400 may receive positive indications from only the subset of network entities that meet the specified performance attributes. Network entities that do not meet the specified performance attributes may respond negatively or not at all.
  • Both first and second options may be used, for example, where certain data is obtainable only via one of the options: direct signalling, or signalling via the OAM 406.
  • the performance attributes may relate to key performance indicators (KPIs) and machine-learning model performance metrics.
  • one of the attributes relates to underfitting or overfitting of the machine-learning models maintained by the network entities.
  • the attribute may define that the machine-learning model shall not be underfitted or shall not be overfitted.
  • the attribute may define that the underfitting or overfitting of the machine-learning model is not particularly recent, such that performance of the machine-learning model may have had sufficient time to improve.
  • the attribute may define that the machine-learning model shall not be underfitted or shall not be overfitted within a time window preceding transmission of the query messages in steps 904, 906 or 912, for example.
  • the network entities may evaluate their local copies of the machine-learning model to determine whether they are overfitted or underfitted to their respective training datasets. For example, network entities may evaluate performance of the machine-learning model after successive iterations of its training. Where the performance from one iteration of the model to the next improves with respect to the training dataset, but worsens with respect to the test dataset, the network entity may determine that the machine-learning model is overfitted. Where the performance from one iteration of the model to the next improves with respect to the training and test datasets, the network entity may determine that the machine-learning model is underfitted.
  • the difference in performance from one iteration to the next may be subject to a threshold, such that overfitting is determined only where the performance with respect to the test dataset worsens by at least a threshold amount, for example; or that underfitting is determined only where the performance with respect to the training and test datasets improves by at least a threshold amount.
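  • The iteration-to-iteration check described in the two preceding points can be sketched as follows; the function name, score convention and threshold value are assumptions, not part of the disclosure.

```python
def fit_status(train_prev, train_curr, test_prev, test_curr, threshold=0.01):
    """Classify the local model after one training iteration.

    All arguments are accuracy-like scores (higher is better) for the
    previous and current iterations on the training and test datasets.
    """
    train_gain = train_curr - train_prev
    test_gain = test_curr - test_prev

    # Training performance improves while test performance worsens by
    # at least the threshold: the model is deemed overfitted.
    if train_gain > 0 and test_gain <= -threshold:
        return "overfitted"

    # Performance on both datasets still improves by at least the
    # threshold: the model is deemed underfitted (training not converged).
    if train_gain >= threshold and test_gain >= threshold:
        return "underfitted"

    return "neither"

# Example: training accuracy rose 0.80 -> 0.85 while test accuracy
# fell 0.78 -> 0.74, so fit_status(0.80, 0.85, 0.78, 0.74) == "overfitted".
```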
  • Additional performance attributes may relate to one or more of: a size of training dataset available to the network entity; an accuracy of the local copy of the machine-learning model; a current load of the network entity (e.g., such that network entities overloaded with other tasks may not be selected for participation in the collaborative machine learning).
  • a network entity may send a profile update message to the network registration entity comprising an indication that the machine-learning model is underfitted or overfitted (as appropriate) and, optionally, an indication of the time when the model was determined to be underfitted or overfitted.
  • the indication of the time may comprise an indication of a version or iteration of the model that was determined to be overfitted or underfitted.
  • the network registration entity 404 can then update the profile for that network entity to indicate that the machine-learning model has been underfitted or overfitted and the time at which that was determined.
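  • Such a profile update message might look as follows; the field names are hypothetical, and the time indication is expressed here as a model version, as described above.

```python
# Hypothetical profile update sent to the network registration entity;
# all field names are assumptions for illustration only.
profile_update = {
    "nf_instance_id": "nf-002",
    "ml_model_status": "overfitted",          # or "underfitted"
    "model_version": 17,                      # iteration found to be overfitted
    "determined_at": "2022-04-20T10:15:00Z",  # time of determination
}
```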
  • the network entity may unsubscribe from the model update service upon determining that its model is underfitted or overfitted. In this way, the network entity may not be selected in future for participation in the collaborative machine-learning process, at least until its local copy of the machine-learning model is no longer underfitted or overfitted, and unnecessary downlink data transfer is avoided.
  • the network entity may resubscribe to the model update service at a later time, e.g., to probe for updates to the global copy of the machine-learning model.
  • the co-ordination network entity selects clients from those that possessed the attributes identified in steps 904 to 914. In some embodiments, all of the clients identified in steps 904 to 914 may be selected for participation in the collaborative machine-learning process.
  • the co-ordination network entity may be configured with a maximum number of network entities that can be involved in each training round. If the number of network entities identified in steps 904 to 914 falls below that maximum number, then all of the entities may be selected. Otherwise, the co-ordination network entity may select a subset of the entities for involvement in the collaborative machine-learning process based on the network topology information obtained in step 418.
  • the co-ordination network entity may utilize one or more (or all) of the following when selecting a subset of entities for involvement in the collaborative machine-learning process: network connectivity information between the co-ordination network entity and the entities (e.g., as shown in Figures 5 and/or 6); performance metrics between neighbouring entities (e.g., relating to handover); and data similarity.
  • the entities may be ranked based on some or all of these different network topology information parameters. For example, the entities may be ranked according to the communication cost of selecting the entities for involvement in the training round of the collaborative machine-learning process (with lower cost ranked higher). A number of entities having the lowest communication cost may then be selected, and the similarity of their training datasets compared as described above. If the value of the similarity does not meet the threshold (i.e., the training datasets are not sufficiently different), then one or more of the entities may be replaced with other entities and the similarity re-calculated. For example, where two or more entities have very similar training datasets, one or more of those entities may be replaced. This process may be repeated until the threshold is satisfied.
  • the co-ordination network entity may then consider the values of performance metrics for neighbour relations of the NFs.
  • the consideration of such performance metrics may depend on the functionality (e.g., purpose) of the machine-learning model. For example, where the functionality of the machine-learning model (i.e., the output or purpose of the machine-learning model) relates to handover, it may be beneficial to select NFs associated with particularly high numbers of successful handovers.
  • alternatively, it may be beneficial to select NFs associated with particularly high numbers of handover attempts but a relatively low handover success ratio; e.g., the accuracy of the machine-learning model may be improved most by sampling at an increased level from such NFs.
  • the NFs may first be ranked according to their neighbour performance metrics, before communication cost and/or data similarity are taken into account.
  • Figure 10 is a flowchart of a method performed by a first network entity in a communications network according to embodiments of the disclosure.
  • the network entity may be, for example, a co-ordination network entity.
  • the network entity may be, for example, a NWDAF, such as the NWDAF 308 or the co-ordination network entity 400 described above.
  • the first network entity identifies a plurality of second network entities that are capable of participation in a collaborative machine-learning process.
  • the first network entity may undertake a discovery process in communication with a network registration entity as described above with respect to steps 900 and 902.
  • the first network entity transmits one or more query messages towards the plurality of second network entities.
  • the one or more query messages may be transmitted directly to the second network entities (as in step 912) or indirectly towards the second network entities via the OAM (as in step 904).
  • the one or more query messages comprise an indication of one or more attributes to be fulfilled by second network entities for participation in a collaborative machine-learning process to update a global copy of a machine-learning model. At least one of the one or more attributes relates to underfitting or overfitting of respective local copies of the machine-learning model maintained by the second network entities.
  • the attribute relating to underfitting or overfitting may define that the local copy of the machine-learning model shall not be underfitted or shall not be overfitted, or that the local copy of the machine-learning model shall not be underfitted or overfitted within a time window preceding transmission of the one or more query messages.
  • the first network entity receives information identifying a subset of the plurality of second network entities that fulfil the one or more attributes defined in the query message(s) transmitted in step 1002.
  • the information may be received directly from the second network entities in response to the query messages transmitted in step 1002.
  • the information may be received from an OAM entity in communication with the plurality of second network entities.
  • the information may be received via communication with the network registration entity (e.g., the NRF).
  • in step 1006, the first network entity obtains network topology information. Note that this step in particular may be performed in a different order to that which is illustrated in Figure 10. For example, the network topology information may be obtained prior to any of steps 1000, 1002 or 1004 without departing from the scope of the invention as defined in the claims appended hereto.
  • the first network entity selects second network entities from those identified in step 1004 for participation in a collaborative machine-learning process. Where network topology information has been obtained in step 1006, the second network entities may be selected based on the network topology information.
  • the network topology information may comprise performance metrics for neighbour relations between network entities in the communications network, such as one or more of: a number of attempted handovers between network entities; a number of successful handovers between network entities; and a handover success rate between network entities.
  • one or more second network entities may be selected for participation in the collaborative machine-learning process that are associated with the highest values of the performance metrics for neighbour relations.
  • the network topology information may comprise network connectivity of the network entities in the communications network.
  • one or more second network entities may be selected for participation in the collaborative machine-learning process based on the network connectivity so as to limit a communication cost of acquiring data from the one or more second network entities.
  • the network topology information may comprise information identifying a similarity of training data accessible to the network entities for training local copies of the machine-learning model.
  • one or more second network entities may be selected for participation in the collaborative machine-learning process based on the similarity of training data so as to diversify training data on which the collaborative machine-learning process is based.
  • Figure 11 is a flowchart of a method performed by a second network entity in a communications network according to embodiments of the disclosure.
  • the second network entity belongs to a plurality of second network entities capable of participation in a collaborative machine-learning process to train a machine-learning model.
  • the second network entity maintains a local copy of the machine-learning model.
  • the plurality of network entities may comprise any combination of suitable network entities, including for example, an Access and Mobility Management Function (AMF), Authentication Server Function (AUSF), Session Management Function (SMF), Policy Control Function (PCF), Unified Data Management (UDM), Operations, Administration and Management (OAM), an evolved NodeB (eNB) and a next generation NodeB (gNB).
  • the second network entity registers a profile with a network function repository entity.
  • the profile comprises an indication of an ability of the second network entity to participate in a collaborative machine-learning process to train the machine-learning model. See step 410 described above for further detail regarding this step.
  • the second network entity trains a local copy of the machine-learning model using training data that is accessible (e.g., local) to the second network entity.
  • the second network entity determines whether the local copy of the machine-learning model is underfitted or has become overfitted to the training dataset.
  • the local copy of the machine-learning model may be determined to be overfitted responsive to a determination that performance of a first iteration of the local copy with respect to training data has improved relative to a second, previous iteration of the local copy, whereas performance of the first iteration of the local copy with respect to test data has worsened relative to the second iteration of the local copy.
  • the local copy of the machine-learning model may be determined to be underfitted responsive to a determination that performance of the first iteration of the local copy with respect to both training data and test data has improved relative to the second, previous iteration of the local copy.
  • if the local copy of the model is determined not to be overfitted or underfitted, the method flows back to step 1104 and the second network entity continues to train the local copy of the machine-learning model. Alternatively, the method may end. If the local copy of the model is determined to be overfitted or underfitted, the method proceeds to step 1106 and the second network entity transmits a message comprising an indication that the local copy of the machine-learning model is overfitted or underfitted (as appropriate).
  • the message may be transmitted to the network function repository entity, enabling the network function repository entity to update the profile of the second network entity with the indication that the local copy of the machine-learning model is underfitted or has been overfitted.
  • the message may be transmitted in response to a query message received from a first network entity (e.g., the co-ordination network entity 400 described above).
  • the query message may comprise an indication of one or more attributes to be fulfilled by the second network entity for participation in a collaborative machine-learning process to update a global copy of the machine-learning model, with at least one of the one or more attributes relating to underfitting or overfitting of the local copy of the machine-learning model maintained by the second network entity.
  • Figure 12 is a schematic diagram of an apparatus 1200 for a communication network (for example, the system 300 shown in Figure 3) according to embodiments of the disclosure.
  • the apparatus 1200 may be implemented in a first network function or entity (such as, for example, the NWDAF 308 described above in respect of Figure 3).
  • Apparatus 1200 is operable to carry out the example method described with reference to Figure 10 and possibly any other processes or methods disclosed herein, such as the signalling and operations of the co-ordination network entities in Figures 4, 5 and 9. It is also to be understood that the method of Figure 10 may not necessarily be carried out solely by apparatus 1200. At least some operations of the method can be performed by one or more other entities.
  • the apparatus 1200 comprises processing circuitry 1202 (such as one or more processors, digital signal processors, general-purpose processing units, etc.), a machine-readable medium 1204 (e.g., memory such as read-only memory (ROM), random-access memory, cache memory, flash memory devices, optical storage devices, etc.) and one or more interfaces 1206 (e.g., Ethernet interfaces, etc.).
  • the machine-readable medium 1204 stores instructions which, when executed by the processing circuitry 1202, cause the apparatus 1200 to: transmit one or more query messages, the one or more query messages comprising an indication of one or more attributes to be fulfilled by second network entities for participation in a collaborative machine-learning process to update a global copy of a machine-learning model.
  • the one or more attributes comprise an attribute relating to underfitting or overfitting of respective local copies of the machine-learning model maintained by the second network entities.
  • the apparatus 1200 is further caused to: receive information identifying a first plurality of second network entities fulfilling the one or more attributes; and select, from the first plurality of second network entities, one or more second network entities for participation in the collaborative machine-learning process.
  • the processing circuitry 1202 may be configured to directly perform the method, or to cause the apparatus 1200 to perform the method, without executing instructions stored in the non-transitory machine-readable medium 1204, e.g., through suitably configured dedicated circuitry.
  • the one or more interfaces 1206 may comprise hardware and/or software suitable for communicating with other nodes of the wireless communication network using any suitable communication medium.
  • the interfaces 1206 may comprise one or more wired interfaces, using optical or electrical transmission media. Such interfaces may therefore utilize optical or electrical transmitters and receivers, as well as the necessary software to encode and decode signals transmitted via the interface.
  • the interfaces 1206 may comprise one or more wireless interfaces. Such interfaces may therefore utilize one or more antennas, baseband circuitry, etc.
  • the components are illustrated coupled together in series; however, those skilled in the art will appreciate that the components may be coupled together in any suitable manner (e.g., via a system bus or suchlike).
  • the apparatus 1200 may comprise power circuitry (not illustrated).
  • the power circuitry may comprise, or be coupled to, power management circuitry and is configured to supply the components of apparatus 1200 with power for performing the functionality described herein.
  • Power circuitry may receive power from a power source.
  • the power source and/or power circuitry may be configured to provide power to the various components of apparatus 1200 in a form suitable for the respective components (e.g., at a voltage and current level needed for each respective component).
  • the power source may either be included in, or external to, the power circuitry and/or the apparatus 1200.
  • the apparatus 1200 may be connectable to an external power source (e.g., an electricity outlet) via an input circuitry or interface such as an electrical cable, whereby the external power source supplies power to the power circuitry.
  • the power source may comprise a source of power in the form of a battery or battery pack which is connected to, or integrated in, the power circuitry.
  • the battery may provide backup power should the external power source fail.
  • Other types of power sources such as photovoltaic devices, may also be used.
  • Figure 13 is a schematic diagram of an apparatus 1300 for a communication network (for example, the system 300 shown in Figure 3) according to embodiments of the disclosure.
  • the apparatus 1300 may be implemented in a network function or entity (such as one of the network functions 302, 304, 306 described above with respect to Figure 3).
  • Apparatus 1300 is operable to carry out the example method described with reference to Figure 11 and possibly any other processes or methods disclosed herein, such as the signalling and operations of the network functions in Figures 4, 5 and 9. It is also to be understood that the method of Figure 11 may not necessarily be carried out solely by apparatus 1300. At least some operations of the method can be performed by one or more other entities.
  • the apparatus 1300 may belong to a plurality of entities configured to participate in collaborative learning.
  • the apparatus 1300 may belong to a plurality of entities configured to perform collaborative learning to develop a model.
  • Each entity of the plurality of entities may be configured to maintain a local copy of the model.
  • the apparatus 1300 comprises processing circuitry 1302 (such as one or more processors, digital signal processors, general-purpose processing units, etc.), a machine-readable medium 1304 (e.g., memory such as read-only memory (ROM), random-access memory, cache memory, flash memory devices, optical storage devices, etc.) and one or more interfaces 1306 (e.g., Ethernet interfaces, etc.).
  • the machine-readable medium 1304 stores instructions which, when executed by the processing circuitry 1302, cause the apparatus 1300 to: register, with a network function repository entity, a profile of the second network entity.
  • the profile comprises an indication of an ability of the second network entity to participate in a collaborative machine-learning process to train the machine-learning model.
  • the apparatus 1300 is further caused to: train the local copy of the machine-learning model using training data accessible to the second network entity; and, responsive to a determination that the local copy of the machine-learning model is underfitted or has become overfitted to training data accessible to the second network entity: transmit a message comprising an indication that the local copy of the machine-learning model is underfitted or has become overfitted.
  • the processing circuitry 1302 may be configured to directly perform the method, or to cause the apparatus 1300 to perform the method, without executing instructions stored in the non-transitory machine-readable medium 1304, e.g., through suitably configured dedicated circuitry.
  • the one or more interfaces 1306 may comprise hardware and/or software suitable for communicating with other nodes of the wireless communication network using any suitable communication medium.
  • the interfaces 1306 may comprise one or more wired interfaces, using optical or electrical transmission media. Such interfaces may therefore utilize optical or electrical transmitters and receivers, as well as the necessary software to encode and decode signals transmitted via the interface.
  • the interfaces 1306 may comprise one or more wireless interfaces. Such interfaces may therefore utilize one or more antennas, baseband circuitry, etc.
  • the components are illustrated coupled together in series; however, those skilled in the art will appreciate that the components may be coupled together in any suitable manner (e.g., via a system bus or suchlike).
  • the apparatus 1300 may comprise power circuitry (not illustrated).
  • the power circuitry may comprise, or be coupled to, power management circuitry and is configured to supply the components of apparatus 1300 with power for performing the functionality described herein.
  • Power circuitry may receive power from a power source.
  • the power source and/or power circuitry may be configured to provide power to the various components of apparatus 1300 in a form suitable for the respective components (e.g., at a voltage and current level needed for each respective component).
  • the power source may either be included in, or external to, the power circuitry and/or the apparatus 1300.
  • the apparatus 1300 may be connectable to an external power source (e.g., an electricity outlet) via an input circuitry or interface such as an electrical cable, whereby the external power source supplies power to the power circuitry.
  • the power source may comprise a source of power in the form of a battery or battery pack which is connected to, or integrated in, the power circuitry.
  • the battery may provide backup power should the external power source fail.
  • Other types of power sources such as photovoltaic devices, may also be used.
  • embodiments of the disclosure are applicable to all methods of collaborative machine-learning, regardless of the function of the machine-learning model that is being trained. Rather than being tied to any particular model function, embodiments of the disclosure provide methods, apparatus and machine-readable mediums for implementation in a communications network that enable collaborative machine-learning to happen more efficiently and more effectively through new client selection strategies. Nonetheless, the machine-learning models discussed herein may be employed for a variety of different functions, as set out in more detail below.
  • the machine-learning model generates an output used to control an operation of the communications network or one or more network entities belonging to or connected to the communication network.
  • controlling an operation of the communication network or one or more network entities of the communication network may comprise one or more of: controlling inter-frequency handover of a mobile device between cells; optimization of a value of a network congestion parameter; optimization of a value of a network quality parameter; optimization of a current network resource allocation; optimization of a network resource configuration; optimization of a network usage parameter; optimization of a network parameter of a neighbour communication network cell; optimization of a value of a network signal interference parameter; optimization of a value of a Reference Signal Received Power (RSRP) parameter; optimization of a value of a Reference Signal Received Quality (RSRQ) parameter; optimization of a value of a network signal to interference plus noise ratio, SINR, parameter; optimization of a value of a network power parameter; optimization of a network frequency band; optimization of a current network antenna down-tilt angle; optimization of a network antenna vertical beamwidth; optimization of a network antenna horizontal beamwidth; optimization of a network antenna height; optimization of a
  • the one or more network entities belonging to the communication network may comprise one or more of: one or more wireless devices; one or more core network functions; one or more computing devices comprising a wireless device; one or more control devices within a smart factory; and one or more control devices in an autonomous or semi- autonomous vehicle.
  • the machine-learning model may be used to predict whether a wireless device (i.e., a UE) has coverage on a secondary carrier frequency without measuring on the secondary frequency. In other words, whether the wireless device has an opportunity to perform inter-frequency handover to a less-loaded cell on another carrier. Using secondary carrier prediction, the wireless device may not need to perform inter-frequency measurements, leading to energy savings at the wireless device. However, frequent signaling of source carrier information that enables prediction of the secondary frequency can lead to an additional overhead and should thus be minimized.
  • a machine-learning model that is configured to predict whether a wireless device has coverage on a secondary carrier therefore leads to significantly more efficient use of network resources, by enabling the wireless device to handover when it is beneficial to do so and reducing the signaling of source carrier information.
  • the machine-learning model may utilize reference signal received power (RSRP) values from neighbouring cells on the source frequency, as well as cell-specific features, as its input features.
  • the output of the machine-learning model may include predictions for cells on the target frequency.
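  • A sketch of one possible realisation of such a predictor is shown below; the feature counts, layer sizes and use of per-cell coverage probabilities are illustrative assumptions, not details taken from the disclosure.

```python
import torch
import torch.nn as nn

n_neighbour_rsrp = 8   # RSRP values from neighbouring cells (source frequency)
n_cell_features = 4    # cell-specific features
n_target_cells = 6     # candidate cells on the target frequency

# Small feed-forward network emitting a coverage probability per
# candidate cell on the target frequency.
model = nn.Sequential(
    nn.Linear(n_neighbour_rsrp + n_cell_features, 32),
    nn.ReLU(),
    nn.Linear(32, n_target_cells),
    nn.Sigmoid(),
)

features = torch.randn(1, n_neighbour_rsrp + n_cell_features)
coverage_probs = model(features)   # shape (1, n_target_cells)
```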
  • a subset of that plurality of network entities may be further selected based on the performance metrics of neighbour relations.
  • one goal of the machine-learning model is to reduce or minimize unsuccessful handovers (on the basis that a ‘correct’ secondary carrier prediction will result in a successful handover to that secondary carrier). Therefore, the number of attempted handovers and the number of successful handovers are important metrics to consider.
  • Cell performance metrics and key performance metrics may also be considered, e.g., so as to reduce the drop rate and increase cell throughput.
  • the subset of network entities may be selected for involvement in the collaborative machine-learning process based primarily on the performance metrics of their neighbour relations.
  • the network entities may be ranked according to one or more of those performance metrics (or a combination of those metrics), and a certain number of the highest ranked network entities selected.
  • Communication cost and/or data similarity may be taken into consideration subsequently, to ensure that the communication cost of using the selected subset of network entities for collaborative machine-learning is not too high, and/or that the training datasets available to the selected subset of network entities are sufficiently diverse.
  • the performance metrics may subsequently be used to determine weights to be applied to the model updates (gradients) provided by those network entities when generating an update to the global copy of the machine-learning model. For example, the weights may be based on a single performance metric (e.g., handover success ratio).
  • alternatively, the weights may be determined by combining multiple performance metrics in some function. In either case, the weights should be normalized amongst those network entities.
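  • A minimal sketch of such metric-weighted aggregation, assuming a single non-negative metric per entity (e.g., handover success ratio) and a simple linear combination of the updates:

```python
import numpy as np

def aggregate(updates, metrics):
    """Weighted average of client model updates (gradients).

    updates: list of np.ndarray, one per selected network entity
    metrics: list of non-negative floats, e.g. handover success ratios
    """
    weights = np.asarray(metrics, dtype=float)
    weights /= weights.sum()   # normalize the weights amongst the entities
    return sum(w * u for w, u in zip(weights, updates))
```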
  • aspects of the present disclosure therefore allow for reduced latency, minimised resource overhead and a reduced risk of security problems when implementing machine-learning in communication networks.
  • the embodiments described herein provide an efficient method for selecting network entities to perform a collaborative learning process, thereby providing a more reliable method for performing collaborative learning in a communications network.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The present disclosure relates to a method performed by a first network entity in a communications network. The method comprises: transmitting one or more query messages, the one or more query messages comprising an indication of one or more attributes to be fulfilled by second network entities for participation in a collaborative machine-learning process to update a global copy of a machine-learning model, the one or more attributes comprising an attribute relating to underfitting or overfitting of respective local copies of the machine-learning model maintained by the second network entities; receiving information identifying a first plurality of second network entities fulfilling the one or more attributes; and selecting, from the first plurality of second network entities, one or more second network entities for participation in the collaborative machine-learning process.
PCT/EP2022/060458 2022-04-20 2022-04-20 Procédés, appareil et supports lisibles par machine se rapportant à un apprentissage automatique dans un réseau de communication WO2023202768A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/EP2022/060458 WO2023202768A1 (fr) 2022-04-20 2022-04-20 Procédés, appareil et supports lisibles par machine se rapportant à un apprentissage automatique dans un réseau de communication

Publications (1)

Publication Number Publication Date
WO2023202768A1 true WO2023202768A1 (fr) 2023-10-26

Family

ID=81748413

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2022/060458 WO2023202768A1 (fr) 2022-04-20 2022-04-20 Procédés, appareil et supports lisibles par machine se rapportant à un apprentissage automatique dans un réseau de communication

Country Status (1)

Country Link
WO (1) WO2023202768A1 (fr)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190138934A1 (en) * 2018-09-07 2019-05-09 Saurav Prakash Technologies for distributing gradient descent computation in a heterogeneous multi-access edge computing (mec) networks
WO2021032495A1 (fr) * 2019-08-16 2021-02-25 Telefonaktiebolaget Lm Ericsson (Publ) Procédés, appareils et supports lisibles par machine se rapportant à l'apprentissage machine dans un réseau de communication
WO2021032498A1 (fr) * 2019-08-16 2021-02-25 Telefonaktiebolaget Lm Ericsson (Publ) Procédés, appareil et supports lisibles par machine se rapportant à un apprentissage automatique dans un réseau de communication

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
3GPP TS 23.288
3GPP TS 23.502
ANONYMOUS (3GPP): "3rd Generation Partnership Project; Technical Specification Group Services and System Aspects; Study on enablers for network automation for the 5G System (5GS); Phase 2 (Release 17)", 3GPP TR 23.700-91, 17 December 2020, XP055897177. Retrieved from the Internet <URL:https://www.3gpp.org/ftp/Specs/archive/23_series/23.700-91/> [retrieved on 2022-03-03] *

Similar Documents

Publication Publication Date Title
US20220321423A1 (en) Methods, apparatus and machine-readable media relating to machine-learning in a communication network
EP4099635A1 (fr) Procédé et dispositif de sélection de service dans un système de communication sans fil
US20220294706A1 (en) Methods, apparatus and machine-readable media relating to machine-learning in a communication network
CN112512058A (zh) 网络优化方法、服务器、客户端设备、网络设备和介质
US20170019495A1 (en) Distribution of popular content between user nodes of a social network community via direct proximity-based communication
US11290915B2 (en) Systems and methods for granular beamforming across multiple portions of a radio access network based on user equipment information
WO2022069054A1 (fr) Gestion adaptative de faisceaux dans un réseau de télécommunications
US20220408293A1 (en) Method and device for providing network analysis information for rfsp index selection in mobile communication network
US11843516B2 (en) Federated learning in telecom communication system
US20210377788A1 (en) Configuring telecommunications network using data obtained from user equipment
US11330494B2 (en) Methods, apparatuses, computer programs and computer program products for load balancing
US11929938B2 (en) Evaluating overall network resource congestion before scaling a network slice
US20230196111A1 (en) Dynamic Labeling For Machine Learning Models for Use in Dynamic Radio Environments of a Communications Network
US11805022B2 (en) Method and device for providing network analytics information in wireless communication network
WO2023202768A1 (fr) Procédés, appareil et supports lisibles par machine se rapportant à un apprentissage automatique dans un réseau de communication
AU2022208075A1 (en) Management and control method for data analysis apparatus, and communication apparatus
US20220377652A1 (en) Resource availability check
US20230164629A1 (en) Managing a node in a communication network
CN106688269B (zh) 用于确定无线设备是否是由于负载平衡原因而切换到目标小区的合适候选者的无线电网络节点和方法
US20230262498A1 (en) Network data analytics function accuracy enhancement
US20230318794A1 (en) Optimizing physical cell id assignment in a wireless communication network
WO2024067248A1 (fr) Procédé et appareil d'acquisition d'ensemble de données d'entraînement
US20220232368A1 (en) Clustering of user entities in a cellular network
WO2023274526A1 (fr) Appareil, procédé et programme informatique
WO2023213405A1 (fr) Gestion de sommeil de cellule de réseau d'accès radio (ran)

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22724042

Country of ref document: EP

Kind code of ref document: A1