WO2022153324A1 - First node, second node, third node, communications system, and methods performed thereby to facilitate prediction of an event - Google Patents


Info

Publication number
WO2022153324A1
WO2022153324A1 (application PCT/IN2021/050037)
Authority
WO
WIPO (PCT)
Prior art keywords
node
local
communications system
global
context information
Prior art date
Application number
PCT/IN2021/050037
Other languages
French (fr)
Inventor
Bisht ASHUTOSH
Satheesh Kumar PEREPU
Original Assignee
Telefonaktiebolaget Lm Ericsson (Publ)
Application filed by Telefonaktiebolaget Lm Ericsson (Publ) filed Critical Telefonaktiebolaget Lm Ericsson (Publ)
Priority to PCT/IN2021/050037 priority Critical patent/WO2022153324A1/en
Publication of WO2022153324A1 publication Critical patent/WO2022153324A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/048 Activation functions
    • G06N 3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063 Physical realisation using electronic means
    • G06N 3/08 Learning methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L 41/08 Configuration management of networks or network elements
    • H04L 41/0895 Configuration of virtualised networks or elements, e.g. virtualised network function or OpenFlow elements
    • H04L 41/14 Network analysis or design
    • H04L 41/142 Network analysis or design using statistical or mathematical methods
    • H04L 41/147 Network analysis or design for predicting network behaviour
    • H04L 41/16 Arrangements for maintenance, administration or management using machine learning or artificial intelligence
    • H04L 41/40 Arrangements for maintenance, administration or management using virtualisation of network functions or resources, e.g. SDN or NFV entities
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • H04L 67/12 Protocols specially adapted for proprietary or special-purpose networking environments, e.g. medical networks, sensor networks, networks in vehicles or remote metering networks

Abstract

A computer-implemented method performed by a first node (111). The first node (111) operates in a communications system (10). The method is to facilitate prediction of an event in the communications system (10). The first node (111) determines (302) whether or not an accuracy of a local update to a global machine-learning model to predict an event in the communications system (10) exceeds a threshold. The first node (111) also sends (304) the local update to another node (114) operating in the communications system (10) based on a result of the determination. With the proviso that the accuracy exceeds the threshold, the first node (111) proceeds with the sending of the local update. With the proviso that the accuracy does not exceed the threshold, the first node (111) refrains from sending the local update.

Description

FIRST NODE, SECOND NODE, THIRD NODE, COMMUNICATIONS SYSTEM, AND METHODS PERFORMED THEREBY TO FACILITATE PREDICTION OF AN EVENT
TECHNICAL FIELD
The present disclosure relates generally to a first node and methods performed thereby, to facilitate prediction of an event. The present disclosure also relates generally to a second node, and methods performed thereby, to facilitate prediction of an event. The present disclosure further relates generally to a third node, and methods performed thereby, to facilitate prediction of an event. The present disclosure also relates generally to a communications system, and methods performed thereby, to facilitate prediction of an event.
BACKGROUND
Computer systems in a communications network may comprise one or more network nodes, which may also be referred to simply as nodes. A node may comprise one or more processors which, together with computer program code, may perform different functions and actions, a memory, a receiving port and a sending port. A node may be, for example, a server. Nodes may perform their functions entirely on the cloud.
NR
The standardization organization 3rd Generation Partnership Project (3GPP) is currently in the process of specifying a New Radio Interface called New Radio (NR) or 5G-Universal Terrestrial Radio Access (UTRA), as well as a Fifth Generation (5G) Packet Core Network, which may be referred to as Next Generation (NG) Core Network, abbreviated as NG-CN, NGC or 5G CN.
In the current concept, gNB denotes an NR BS, where one NR BS may correspond to one or more transmission and/or reception points.
One of the main goals of NR is to provide more capacity for operators to serve ever-increasing traffic demands and a variety of applications. Because of this, NR will be able to operate on high frequencies, such as frequencies over 6 GHz, up to 60 or even 100 GHz.
Operation in higher frequencies makes it possible to use smaller antenna elements, which enables antenna arrays with many antenna elements. Such antenna arrays facilitate beamforming, where multiple antenna elements may be used to form narrow beams and thereby compensate for the challenging propagation properties.
Internet of Things (IoT)
The Internet of Things (IoT) may be understood as an internetworking of devices, e.g., physical devices, vehicles, which may also be referred to as "connected devices" and "smart devices", buildings and other items embedded with electronics, software, sensors, actuators, and network connectivity that may enable these objects to collect and exchange data. The IoT may allow objects to be sensed and/or controlled remotely across an existing network infrastructure.
"Things," in the IoT sense, may refer to a wide variety of devices such as heart monitoring implants, biochip transponders on farm animals, electric clams in coastal waters, automobiles with built-in sensors, DNA analysis devices for environmental/food/pathogen monitoring, or field operation devices that may assist firefighters in search and rescue operations; home automation devices, such as those for the control and automation of lighting via e.g., cameras and light monitors, heating, e.g., a "smart" thermostat, ventilation and air conditioning; and appliances such as washers, dryers, ovens, refrigerators or freezers that may use telecommunications for remote monitoring. These devices may collect data with the help of various existing technologies and then autonomously flow the data to other devices.
Machine Type Communication (MTC)
Machine Type Communication (MTC) has in recent years, especially in the context of the Internet of Things (IoT), shown itself to be a growing segment for cellular technologies. An MTC device may be a communication device, typically a wireless communication device or simply a user equipment, that is a self- and/or automatically-controlled unattended machine and that is typically not associated with an active human user generating data traffic. An MTC device may typically be simpler, and typically associated with a more specific application or purpose, than a conventional mobile phone or smart phone. MTC involves communication in a wireless communication network to and/or from MTC devices, which communication typically may be of a quite different nature and have other requirements than communication associated with, e.g., conventional mobile phones and smart phones. In the context of the growth of the IoT, it is evident that MTC traffic will increase and thus needs to be increasingly supported in wireless communication systems.
Network Function Virtualization
Network Functions Virtualization (NFV) may be understood as a network architecture originally conceived by service providers who wanted to apply Information Technology (IT) virtualization technology to classes of network nodes in order to make them faster and more agile. The European Telecommunications Standards Institute (ETSI) was the first major organization to release a Network Functions Virtualization standard, in October 2013, and has subsequently released multiple new specifications on Network Functions Virtualization and its components.
Both ETSI and the Linux Foundation are actively developing and nurturing the reference architecture and standards for the NFV framework.
Virtual Network Functions
Virtualized Network Functions (VNFs) may be understood to commonly refer to the software form of network appliances such as a router, firewall, load balancer, etc. VNFs may be understood to be mostly deployed as virtual machines (VMs) on Linux Kernel-based Virtual Machine (KVM) or VMware vSphere hypervisors on Commercial Off-The-Shelf (COTS) hardware. In contrast to Virtualized Network Functions, a Physical Network Function (PNF) may be understood to refer to a legacy network appliance on proprietary hardware.
NFV architecture
Figure 1 is a schematic diagram representing an ETSI NFV reference architecture providing an illustration of the relationship between Virtualized Network Functions and Network Functions Virtualization.
Within this ETSI NFV Framework, an NFV Orchestrator 11, one or more VNF Manager(s) 12 and a Virtual Infrastructure Manager (VIM) 13 may deliver the primary NFV Management and Orchestration (MANO) functionality. NFV MANO 14 may be understood to be responsible for interacting with Operations and Business Support Systems (OSS/BSS) 15 to deliver business benefits to service providers, such as rapid service innovation, flexible network function deployment, improved resource usage, and reduced Capex and Opex costs. The OSS/BSS 15 may be understood to also be responsible for orchestrating VNFs 16 into network services (NS), deploying and operating the VNF and NS instances on the virtualized resources, and managing the lifecycle of VNF and NS instances to fulfill the business benefits for service providers. In Figure 1, and for illustration purposes only, there are three VNFs 16: VNF1, VNF2 and VNF3. The OSS/BSS 15 may be further understood to be responsible for interacting with Element Management (EM) 17 to manage the logical function and assure service levels of the VNFs, spanning the management of VNF Fault, Configuration, Accounting, Performance and Security (FCAPS). In Figure 1, and for illustration purposes only, there are three EMs 17: EM1, EM2 and EM3. Lastly, the OSS/BSS 15 may be understood to also be responsible for interacting with a Network Function Virtualization Infrastructure (NFVI) 18 to allocate, manage and orchestrate the virtualized resources, including Virtual Computing 19, Virtual Storage 20 and Virtual Network 21, where VNFs may be deployed. These resources are virtualized from Hardware Resources 22, comprising Computing Hardware 23, Storage Hardware 24 and Network Hardware 25, via a Virtualization Layer 26.
The Os-Ma interface 27 between the OSS/BSS 15 and the NFV MANO 14, the Ve-Vnfm interface 28 between the EM 17 or the VNFs 16 and the VNF Manager 12, the Nf-Vi interface 29 between the NFVI 18 and the NFV MANO 14, the Vn-Nf interface 30 between the VNFs 16 and the NFVI 18, and the Vi-Ha interface 31 between the Virtualization Layer 26 and the Hardware Resources 22 are also depicted. At present, the NFV framework is undergoing rapid development because of 5G business opportunities, and its ecosystem is growing with strong support from service operators and all varieties of solution providers.
Monitoring using Artificial Intelligence (Al)/Machine Learning (ML) techniques
There are new emerging AI/ML techniques for the management of complex networks such as NFV-based deployments. Many of these techniques may provide network monitoring functionality by creating a baseline of normal working conditions, and then reporting deviations from the baseline.
The use of AI/ML techniques, however, may require access to data to train the AI/ML models that may be built as a part of the application of these AI/ML techniques. Furthermore, data may be required to retrain the model on a regular basis so that the ML models may be able to adapt to changing baselines, that is, to changing operating environments, such as, for example, changes in workloads due to a change in policy. Consider, for example, a physical server where the typical CPU utilization is 40%; due to a change in policy, more workload is deployed on the server, and the typical CPU utilization increases to 60%. An ML model that is tracking deviation from the baseline may initially consider 40% CPU utilization to be the normal baseline value. After the new policy, the ML model may be understood to need to be retrained to consider 60% CPU utilization to be the new normal baseline value.
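The retraining need in the CPU-utilization example above can be sketched as a simple persistence check on baseline deviations. The function name, the tolerance, and the persistence fraction below are illustrative assumptions, not part of the disclosure:

```python
def needs_retraining(baseline, recent_samples, tolerance=0.10, persist_fraction=0.8):
    """Return True when most recent samples deviate from the learned
    baseline by more than the (fractional) tolerance, indicating the
    baseline has shifted and the ML model should be retrained."""
    deviating = [abs(x - baseline) / baseline > tolerance for x in recent_samples]
    return sum(deviating) / len(deviating) >= persist_fraction


# After the policy change, CPU utilization around 60% persistently deviates
# from the old 40% baseline, so retraining is flagged:
print(needs_retraining(40.0, [58, 61, 59, 62, 60]))  # True
print(needs_retraining(40.0, [41, 39, 40, 42, 38]))  # False: ordinary noise
```

Requiring the deviation to persist across most samples avoids triggering retraining on transient spikes.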
Hence, while the main problem with the traditional approach to monitoring the performance of a network is the difficulty in manually managing large and complex networks, the use of AI/ML techniques, although they may be understood to reduce the manual effort compared to traditional monitoring techniques, may be understood to require data on a regular basis in order to train and re-train the models that may be built as a part of the application of these AI/ML techniques.
SUMMARY
It is an object of embodiments herein to facilitate prediction of an event in the communications system.
According to a first aspect of embodiments herein, the object is achieved by a computer-implemented method performed by a first node operating in a communications system. The method is to facilitate prediction of an event in the communications system. The first node determines whether or not an accuracy of a local update to a global machine-learning model to predict an event in the communications system exceeds a threshold. The first node then sends the local update to another node operating in the communications system based on a result of the determination. One of the following applies. With the proviso that the accuracy exceeds the threshold, the first node proceeds with the sending of the local update. With the proviso that the accuracy does not exceed the threshold, the first node refrains from sending the local update.
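The decision rule of the first aspect can be sketched as follows; the function and parameter names are hypothetical, chosen only for illustration:

```python
def handle_local_update(local_update, accuracy, threshold, send_fn):
    """Forward the local model update to the aggregating node only when
    its accuracy exceeds the threshold; otherwise withhold it so that it
    cannot degrade the global machine-learning model."""
    if accuracy > threshold:
        send_fn(local_update)  # proceed with sending the local update
        return True
    return False  # refrain from sending; global model stays unaffected


sent = []
handle_local_update({"weights": [0.1]}, accuracy=0.95, threshold=0.9, send_fn=sent.append)
handle_local_update({"weights": [0.2]}, accuracy=0.80, threshold=0.9, send_fn=sent.append)
print(len(sent))  # 1: only the sufficiently accurate update was forwarded
```

The gate lives on the first node, so inaccurate updates are filtered before they consume management-plane bandwidth.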
According to a second aspect of embodiments herein, the object is achieved by a computer-implemented method performed by a second node. The second node operates in the communications system. The method is to facilitate prediction of the event in the communications system. The second node obtains the global context information indicating which one or more virtual network functions the second node is to collect data from to train the local machine-learning model to predict the event. The second node then sends the local update of a second local machine-learning model to predict the event in the communications system 10, to the first node operating in the communications system. The local update is based on the obtained global context information.
According to a third aspect of embodiments herein, the object is achieved by a computer- implemented method performed by a third node. The third node operates in the communications system. The method is to facilitate prediction of the event in the communications system. The third node sends, to the second node operating in the communications system, the global context information. The global context information indicates which one or more virtual network functions the second node is to collect data from to train the local machine-learning model to predict the event.
According to a fourth aspect of embodiments herein, the object is achieved by a computer-implemented method to facilitate prediction of the event in the communications system. The method comprises sending, by the third node operating in the communications system, the global context information to the second node operating in the communications system. The global context information indicates which one or more virtual network functions the second node is to collect data from to train the local machine-learning model to predict the event. The method also comprises obtaining, by the second node, the global context information. The method then comprises sending, by the second node, the local update of the second local machine-learning model to predict the event in the communications system 10, to the first node operating in the communications system. The local update is based on the obtained global context information. The method further comprises determining, by the first node, whether or not the accuracy of the local update exceeds the threshold. Finally, the method comprises sending, by the first node, the local update to another node operating in the communications system based on a result of the determination. One of the following applies. With the proviso that the accuracy exceeds the threshold, the first node proceeds with the sending of the local update. With the proviso that the accuracy does not exceed the threshold, the first node refrains from sending the local update.
According to a fifth aspect of embodiments herein, the object is achieved by the first node. The first node is configured to operate in the communications system. The first node is configured to facilitate prediction of the event in the communications system. The first node is configured to determine whether or not the accuracy of the local update to the global machine-learning model to predict the event in the communications system exceeds the threshold.
The first node is also configured to send the local update to the another node configured to operate in the communications system based on the result of the determination. One of the following applies. With the proviso that the accuracy exceeds the threshold, the first node is further configured to proceed with the sending of the local update. With the proviso that the accuracy does not exceed the threshold, the first node is further configured to refrain from sending the local update.
According to a sixth aspect of embodiments herein, the object is achieved by the second node. The second node is configured to operate in the communications system. The second node is further configured to facilitate prediction of the event in the communications system. The second node is further configured to obtain the global context information configured to indicate which one or more virtual network functions the second node is to collect data from to train the local machine-learning model to predict the event. The second node is also configured to send the local update of the second local machine-learning model to predict the event in the communications system 10, to the first node configured to operate in the communications system. The local update is configured to be based on the global context information configured to be obtained.
According to a seventh aspect of embodiments herein, the object is achieved by the third node. The third node is configured to operate in the communications system. The third node is further configured to facilitate prediction of the event in the communications system. The third node is further configured to send, to the second node configured to operate in the communications system, the global context information. The global context information is configured to indicate which one or more virtual network functions the second node is to collect data from to train the local machine-learning model to predict the event.
According to an eighth aspect of embodiments herein, the object is achieved by the communications system. The communications system is configured to facilitate prediction of the event. The communications system is further configured to send, by the third node configured to operate in the communications system, the global context information to the second node configured to operate in the communications system. The global context information is configured to indicate which one or more virtual network functions the second node is to collect data from to train the local machine-learning model to predict the event. The communications system is further configured to obtain, by the second node, the global context information. The communications system is further configured to send, by the second node, the local update of the second local machine-learning model to predict the event in the communications system 10, to the first node configured to operate in the communications system. The local update is configured to be based on the global context information configured to be obtained. The communications system is further configured to determine, by the first node, whether or not the accuracy of the local update exceeds the threshold. The communications system is further configured to send, by the first node, the local update to another node configured to operate in the communications system based on the result of the determination. One of the following applies. With the proviso that the accuracy exceeds the threshold, the first node is further configured to proceed with the sending of the local update. With the proviso that the accuracy does not exceed the threshold, the first node is further configured to refrain from sending the local update.
By the first node determining whether or not an accuracy of the local update exceeds the threshold, the first node may be enabled to filter out anomalous updates that could potentially diminish the accuracy of the global ML model if they were used to update it, and only select the local updates being sufficiently accurate to update the global ML model, thereby enabling to obtain a global ML model capable of predicting the event with higher accuracy. The filtering is enabled by sending the local update to the another node based on the result of the determination.
By sending the global context information to the second node, the third node enables the second node to know from which virtual network functions to collect the data to train the local ML model, and to normalize the data collected by virtue of the dynamic changes in e.g., the number of virtual network functions instantiated. This in turn enables the second node to determine a more accurate local ML model to predict the event, which may then be used by the another node to determine a more accurate global ML model to predict the event.
By the second node updating the second local machine-learning model with the normalized collected data, the second node is also enabled to determine a more accurate local ML model to predict the event, which may then be used, by the another node, to also determine a more accurate global ML model to predict the event.
One further advantage of embodiments herein may be understood to be the reduced bandwidth load on the management plane towards a central server, owing to the smaller-sized ML model parameters that may be sent to a central location: the first node only sends the update, rather than voluminous metric data, in order to train and update the global ML model to predict the event.
BRIEF DESCRIPTION OF THE DRAWINGS
Examples of embodiments herein are described in more detail with reference to the accompanying drawings, and according to the following description.
Figure 1 is a schematic diagram illustrating an example of an ETSI NFV reference architecture, according to existing methods.
Figure 2 is a schematic diagram illustrating two non-limiting embodiments, in panel a) and panel b), of a communications system, according to embodiments herein.
Figure 3 is a flowchart depicting a method in a first node, according to embodiments herein.
Figure 4 is a flowchart depicting a method in a second node, according to embodiments herein.
Figure 5 is a flowchart depicting a method in a third node, according to embodiments herein.
Figure 6 is a flowchart depicting a method in a communications system, according to embodiments herein.
Figure 7 is a schematic diagram illustrating a non-limiting example of a method in a communications system, according to embodiments herein.
Figure 8 is a schematic diagram of a machine-learning model, according to embodiments herein.
Figure 9 is a schematic diagram illustrating another non-limiting example of a method in a communications system, according to embodiments herein.
Figure 10 is a schematic diagram illustrating yet another non-limiting example of a method in a communications system, according to embodiments herein.
Figure 11 is a schematic block diagram illustrating two non-limiting examples, a) and b), of a first node, according to embodiments herein.
Figure 12 is a schematic block diagram illustrating two non-limiting examples, a) and b), of a second node, according to embodiments herein.
Figure 13 is a schematic block diagram illustrating two non-limiting examples, a) and b), of a third node, according to embodiments herein.
Figure 14 is a schematic block diagram illustrating two non-limiting examples, a) and b), of a communications system, according to embodiments herein.
DETAILED DESCRIPTION
Certain aspects of the present disclosure and their embodiments may provide solutions to the challenges discussed in the background section. There are, proposed herein, various embodiments which address one or more of the issues disclosed herein. As stated in the Background section, a problem with the use of AI/ML techniques to monitor network performance may be understood to be their need of data on a regular basis to train and re-train a given model.
In case of managing NFV-based deployments hosting VNF, such data may be understood to correspond to various metrics collected via the virtual ports of these VNFs.
As the number of VNFs and their instances increases, transfer of such metrics, e.g., on a periodic basis, may be understood to result in a non-trivial load on the management plane of a communications network.
In order to reduce the transfer of such metric data, a Federated Learning (FL) approach may be used. FL may be understood to attempt to solve the problems that traditional ML leaves on the table. The training of algorithms may be understood to move to the edge of the network, so that data may never leave the devices, whether they may be a mobile phone or a Fog node. Once a model may “learn” from the data, the results may be uploaded and aggregated with updates from all the other devices on the network. The improved model may then be shared with the entire network. FL may be understood as a variation of traditional ML, in which powerful computers may run algorithms that identify patterns in data and apply what they learn to make predictions. The systems may be trained by being fed vast quantities of information. The garbage in, garbage out principle may be understood to apply: Higher quality data may be understood to yield better predictions. FL may also be understood to address a common problem with cloud-based AI: communication delays, or latency, between remote devices and the central ML system. Reducing latency may be understood to be critical for AI-powered IoT devices such as industrial equipment, where even brief delays between identifying a problem and responding to it may lead to significant damage. Since the intelligence in the device may be understood to be held locally, manufacturers may use federated learning to bring AI to environments with limited or non-existent network connections. FL may be understood to empower sectors where data cannot be transferred to third parties for confidentiality reasons, such as, e.g., the health sector, banks, insurance companies, smart city security, etc.
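The aggregation step described above, where local results are uploaded and combined with updates from all the other devices, is commonly realized as a weighted element-wise average of the clients' model parameters (a FedAvg-style rule). The following is a minimal sketch with illustrative names, not an implementation from the disclosure:

```python
def federated_average(client_updates, client_weights=None):
    """Combine per-client parameter vectors into a new global parameter
    vector by weighted element-wise averaging (FedAvg-style). Only model
    parameters are exchanged; the raw training data never leaves a client."""
    n = len(client_updates)
    if client_weights is None:
        client_weights = [1.0] * n  # unweighted average by default
    total = sum(client_weights)
    dim = len(client_updates[0])
    return [
        sum(w * u[i] for w, u in zip(client_weights, client_updates)) / total
        for i in range(dim)
    ]


# Two clients each contribute a two-parameter update; the global model
# becomes their element-wise mean:
print(federated_average([[1.0, 2.0], [3.0, 4.0]]))  # [2.0, 3.0]
```

In practice the weights are often proportional to each client's local sample count, so clients with more data influence the global model more.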
Unlike consumer/subscriber-based FL scenarios, where user behavior may be understood to be stable, in cloud environments the VNF instances may be regularly scaled up and/or down. This may be understood to be performed in order to handle changing traffic load conditions, compute server resource usage etc., conditions which may be more or less stable in consumer-based scenarios, but which may be more dynamic for data, e.g., managed by a VNF. Furthermore, a single compute server may be hosting multiple VNFs with different operating characteristics. For example, some VNFs may perform processing that may require a lot of CPU resources, whereas other VNFs may require large memory, e.g., RAM. In normal FL, model input data may be used without any global context. The global context may be understood as information indicating the one or more circumstances that may define how data may have been generated in the respective VNFs. When FL is used naively, that is, without global operational context, in such cloud environments, FL may be understood to result in inefficient training of ML models, because the dynamics of the data may change with the addition and/or removal of VNFs.
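One way such a global context could be applied is to normalize the locally collected metrics by the number of VNF instances reported in the context, so that scale-out and scale-in events do not shift the distribution of the training data. The sketch below is illustrative only; the context key and function names are assumptions, not taken from the disclosure:

```python
def normalize_metrics(raw_metrics, global_context):
    """Scale per-server aggregate metrics to per-VNF-instance values using
    the instance count carried in the global context, so that scaling the
    VNF up or down does not change the model's input distribution."""
    instances = global_context["vnf_instances"]  # assumed context field
    return {name: value / instances for name, value in raw_metrics.items()}


# The same 80% aggregate CPU load means 20% per instance with 4 instances,
# but 40% per instance with only 2; without the context these would look
# like different operating conditions to the local ML model:
print(normalize_metrics({"cpu": 80.0, "mem": 40.0}, {"vnf_instances": 4}))
print(normalize_metrics({"cpu": 80.0, "mem": 40.0}, {"vnf_instances": 2}))
```

Normalizing before training keeps local updates from different servers comparable even when their instance counts differ.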
Also, there may be a situation where ML model updates coming from distributed compute servers may be anomalous. Anomalous may be understood to mean having characteristics differing from those of the majority of the processed data. In this case, if such anomalous data is used, it may cause issues with the global model accuracy.
Hence, updates from those anomalous VNFs may be understood to need to be avoided. A significant challenge here may be understood to be to identify an anomalous VNF without relying on the usage data of the VNF, since the data is not shared in FL.
Embodiments herein may be understood to address some of these challenges by making use of distributed learning in a cloud environment.
Embodiments herein may be understood to allow AI/ML techniques to be used for network monitoring with reduced load on the management plane.
Embodiments herein may be considered to be based on emerging FL techniques for ML model training.
FL, a.k.a. collaborative learning, may be understood to be an ML technique that may train an algorithm across multiple decentralized edge devices or servers holding local data samples, without exchanging their data samples. This approach may be understood to stand in contrast to traditional centralized ML techniques, where all data samples may be uploaded to one server, as well as to more classical decentralized approaches, which may assume that local data samples may be identically distributed.
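By way of non-limiting illustration, the decentralized training just described may be sketched as follows, e.g., in Python with NumPy. The function name, the number of nodes, the sample counts, and the weight shapes are illustrative assumptions and not part of the embodiments herein; only the locally trained weights, never the local data samples, enter the aggregation.

```python
import numpy as np

def federated_average(local_weight_sets, sample_counts):
    """Aggregate local model weights into a global model without any
    local training data leaving the nodes (federated-averaging style)."""
    total = sum(sample_counts)
    # Weight each node's parameters by its share of the total samples.
    return [
        sum(n / total * w[layer] for w, n in zip(local_weight_sets, sample_counts))
        for layer in range(len(local_weight_sets[0]))
    ]

# Three nodes, each holding a 2-layer model (illustrative shapes).
rng = np.random.default_rng(0)
locals_ = [[rng.normal(size=(4, 2)), rng.normal(size=2)] for _ in range(3)]
counts = [100, 300, 600]
global_model = federated_average(locals_, counts)
print([w.shape for w in global_model])  # [(4, 2), (2,)]
```

The aggregation only ever sees the per-node weight arrays, in contrast to the centralized approach in which the samples themselves would be uploaded.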
Although FL may be typically used in a consumer/subscriber context with focus on privacy, the same technique may be used in a cloud infrastructure context with focus on reduced data transfer.
Some of the embodiments contemplated will now be described more fully hereinafter with reference to the accompanying drawings, in which examples are shown. In this section, the embodiments herein will be illustrated in more detail by a number of exemplary embodiments. Other embodiments, however, are contained within the scope of the subject matter disclosed herein. The disclosed subject matter should not be construed as limited to only the embodiments set forth herein; rather, these embodiments are provided by way of example to convey the scope of the subject matter to those skilled in the art. It should be noted that the exemplary embodiments herein are not mutually exclusive. Components from one embodiment may be tacitly assumed to be present in another embodiment and it will be obvious to a person skilled in the art how those components may be used in the other exemplary embodiments.
Note that although terminology from LTE/5G has been used in this disclosure to exemplify the embodiments herein, this should not be seen as limiting the scope of the embodiments herein to only the aforementioned system. Other wireless systems with similar features, may also benefit from exploiting the ideas covered within this disclosure.
Figure 2 depicts two non-limiting examples, in panel a) and panel b), respectively, of a communications system 10, in which embodiments herein may be implemented. In some example implementations, such as that depicted in the non-limiting example of Figure 2, the communications system 10 may be a computer network. In other example implementations, the communications system 10 may be implemented in a telecommunications network 100, sometimes also referred to as a cellular radio system, cellular network or wireless communications system. In some examples, the telecommunications network 100 may comprise network nodes which may serve receiving nodes, such as wireless devices, with serving beams.
In some examples, the telecommunications network 100 may for example be a network such as a 5G system, or Next Gen network or an Internet service provider (ISP)-oriented network. The telecommunications network 100 may also support other technologies, such as a Long-Term Evolution (LTE) network, e.g. LTE Frequency Division Duplex (FDD), LTE Time Division Duplex (TDD), LTE Half-Duplex Frequency Division Duplex (HD-FDD), LTE operating in an unlicensed band, Wideband Code Division Multiple Access (WCDMA), Universal Terrestrial Radio Access (UTRA) TDD, GSM/Enhanced Data Rate for GSM Evolution (EDGE) Radio Access Network (GERAN) network, Ultra-Mobile Broadband (UMB), EDGE network, a network comprising any combination of Radio Access Technologies (RATs) such as e.g. Multi-Standard Radio (MSR) base stations, multi-RAT base stations etc., any 3rd Generation Partnership Project (3GPP) cellular network, Wireless Local Area Network/s (WLAN) or WiFi network/s, Worldwide Interoperability for Microwave Access (WiMax), IEEE 802.15.4-based low-power short-range networks such as IPv6 over Low-Power Wireless Personal Area Networks (6LowPAN), Zigbee, Z-Wave, Bluetooth Low Energy (BLE), or any cellular network or system.
Although terminology from Long Term Evolution (LTE)/5G has been used in this disclosure to exemplify the embodiments herein, this should not be seen as limiting the scope of the embodiments herein to only the aforementioned system. Other wireless systems supporting similar or equivalent functionality may also benefit from exploiting the ideas covered within this disclosure. In future radio access, e.g., in the sixth generation (6G), the terms used herein may need to be reinterpreted in view of possible terminology changes in future radio access technologies.
A plurality of nodes may be comprised in the communications system 10, whereof a first node 111 , a second node 112, a third node 113, and another node 114, also referred to herein as a fourth node 114, are depicted in Figure 2. The second node 112 may be one of a plurality of other nodes similar to it, comprised in the communications system 10, which may be referred to collectively as second nodes 112.
Each of the first node 111 , the second node 112, the third node 113 and the another node 114 may be understood, respectively, as a first computer system or server, a second computer system or server, a third computer system or server, and a fourth computer system or server. Any of the first node 111 , the second node 112, the third node 113 and the another node 114, may be implemented as a standalone server in e.g., a host computer in the cloud 120. In other examples, any of the first node 111 , the second node 112, the third node 113 and the another node 114 may be a distributed node or distributed server, such as a virtual node in the cloud 120, and may perform some of its respective functions locally, e.g., by a client manager, and some of its functions in the cloud 120, by e.g., a server manager. In other examples, any of the first node 111 , the second node 112, the third node 113 and the another node 114, may perform its functions entirely on the cloud 120, or partially, in collaboration or collocated with a radio network node. Yet in other examples, any of the first node 111 , the second node 112, the third node 113 and the another node 114, may also be implemented as processing resource in a server farm. Any of the first node 111 , the second node 112, the third node 113 and the another node 114, may be under the ownership or control of a service provider, or may be operated by the service provider or on behalf of the service provider.
Any of the first node 111 , the second node 112, the third node 113 and the another node 114, may be a core network node in a core network, which may be e.g., a 3GPP SBA based 5GC core network. Any of the first node 111 , the second node 112 and the another node 114, may have the capability to determine, e.g., derive or calculate, one or more respective machine-learning models, respectively, which may be stored, in a respective database or memory.
In fact, in some examples, any of the first node 111, the third node 113, and the another node 114 may be co-located, or be the same node. In typical embodiments, any of the first node 111, the second node 112, the third node 113, and the another node 114 may be located in the cloud 120, as depicted in the examples of Figure 2, and the first node 111 and the second node 112 may be located in a geographically separate location from the other nodes.
It may be understood that the communications system 10 may comprise additional nodes. The first node 111 may be referred to herein as a “Global Moderator”. The first node 111 may be understood to have a capability to moderate local updates coming from various compute servers such as the second node 112, before sending them to the another node 114, which may be understood as a global model trainer. An update may be understood as an adjustment of one or more parameters or weights of an ML model, to fit to data.
The second node 112 may be understood as a compute server; the second node 112 may have a capability to collect local data from one or more ports and run, locally, a local copy of a global ML model.
The third node 113 may be understood as a Cloud Orchestrator. The another node 114 may be understood as a Global Model Trainer; that is a node having a capability to train and run global ML model to predict an event.
The capabilities and functions of each of these nodes will be described later, along with the description of the methods performed by each one of them.
The communications network 100 may comprise one or more radio network nodes, whereof a radio network node 130 is depicted in Figure 2b. The radio network node 130 may typically be a base station or Transmission Point (TP), or any other network unit capable to serve a wireless device or a machine type node in the communications network 100. The radio network node 130 may be e.g., a 5G gNB, a 4G eNB, or a radio network node in an alternative 5G radio access technology, e.g., fixed or WiFi. The radio network node 130 may be e.g., a Wide Area Base Station, Medium Range Base Station, Local Area Base Station and Home Base Station, based on transmission power and thereby also coverage size. The radio network node 130 may be a stationary relay node or a mobile relay node. The radio network node 130 may support one or several communication technologies, and its name may depend on the technology and terminology used. The radio network node 130 may be directly connected to one or more networks and/or one or more core networks.
The communications network 100 covers a geographical area which may be divided into cell areas, wherein each cell area may be served by a radio network node, although, one radio network node may serve one or several cells.
The communications network 100 comprises a device 140. The device 140 may be also known as e.g., user equipment (UE), a wireless device, mobile terminal, wireless terminal and/or mobile station, mobile telephone, cellular telephone, or laptop with wireless capability, or a Customer Premises Equipment (CPE), just to mention some further examples. The device 140 in the present context may be, for example, portable, pocket-storable, hand-held, computer-comprised, or a vehicle-mounted mobile device, enabled to communicate voice and/or data, via a RAN, with another entity, such as a server, a laptop, a Personal Digital Assistant (PDA), or a tablet computer, sometimes referred to as a tablet with wireless capability, or simply tablet, a Machine-to-Machine (M2M) device, a device equipped with a wireless interface, such as a printer or a file storage device, modem, sensor, Laptop Embedded Equipped (LEE), Laptop Mounted Equipment (LME), USB dongles, or any other radio network unit capable of communicating over a radio link in the communications network 100. The device 140 may be wireless, i.e., it may be enabled to communicate wirelessly in the communications network 100 and, in some particular examples, may be able to support beamforming transmission. The communication may be performed e.g., between two devices, between a device and a radio network node, and/or between a device and a server. The communication may be performed e.g., via a RAN and possibly one or more core networks, comprised, respectively, within the communications network 100. In some particular embodiments, the device 140 may be an IoT device, e.g., an NB-IoT device.
The first node 111 may communicate with the second node 112 over a first link 151. The first node 111 may communicate with the another node 114 over a second link 152. The another node 114 may communicate with the third node 113 over a third link 153. The third node 113 may communicate with the second node 112 over a fourth link 154. The second node 112 may communicate with the device 140 over a fifth link 155. The second node 112 may communicate with the radio network node 130 over a sixth link 156. The radio network node 130 may communicate with the device 140 over a seventh link 157.
Any of the first link 151, the second link 152, the third link 153, the fourth link 154, the fifth link 155, the sixth link 156 and the seventh link 157 just described may be e.g., a radio link, an infrared link, or a wired link.
Any of the links described may be a direct link or may be comprised of a plurality of individual links, wherein it may go via one or more computer systems or one or more core networks, which are not depicted in Figure 2, or it may go via an optional intermediate network. The intermediate network may be one of, or a combination of more than one of, a public, private or hosted network; the intermediate network, if any, may be a backbone network or the Internet; in particular, the intermediate network may comprise two or more sub-networks, which is not shown in Figure 2.
In general, the usage of “first”, “second”, “third”, “fourth”, “fifth”, “sixth” and/or “seventh”, herein may be understood to be an arbitrary way to denote different elements or entities, and may be understood to not confer a cumulative or chronological character to the nouns they modify.
Generally, all terms used herein are to be interpreted according to their ordinary meaning in the relevant technical field, unless a different meaning is clearly given and/or is implied from the context in which it is used. All references to a/an/the element, apparatus, component, means, step, etc. are to be interpreted openly as referring to at least one instance of the element, apparatus, component, means, step, etc., unless explicitly stated otherwise. The steps of any methods disclosed herein do not have to be performed in the exact order disclosed, unless a step is explicitly described as following or preceding another step and/or where it is implicit that a step must follow or precede another step. Any feature of any of the embodiments disclosed herein may be applied to any other embodiment, wherever appropriate. Likewise, any advantage of any of the embodiments may apply to any other embodiments, and vice versa. Other objectives, features and advantages of the enclosed embodiments will be apparent from the following description.
Several embodiments are comprised herein. It should be noted that the examples herein are not mutually exclusive. Components from one embodiment may be tacitly assumed to be present in another embodiment and it will be obvious to a person skilled in the art how those components may be used in the other exemplary embodiments.
Embodiments of a computer-implemented method, performed by the first node 111 operating in the communications system 10, will now be described with reference to the flowchart depicted in Figure 3. The method may be understood to be to facilitate prediction of an event in the communications system 10.
The method may comprise the actions described below. In some embodiments some of the actions may be performed. In some embodiments all the actions may be performed. In Figure 3, optional actions are indicated with dashed boxes. One or more embodiments may be combined, where applicable. All possible combinations are not described to simplify the description. It should be noted that the examples herein are not mutually exclusive. Components from one example may be tacitly assumed to be present in another example and it will be obvious to a person skilled in the art how those components may be used in the other examples.
Action 301
In the course of operations in the communications system 10, data indicating some aspect of the performance of the communications system 10 may be generated, e.g., by the device 140, and collected from one or more ports by the second node 112. The other second nodes 112 may collect respective data about other devices in the communications system 10. Each of the second nodes 112 may be a compute server respectively hosting one or more VNFs mapped to respective one or more ports.
These data, which may be referred to as metrics data, may include metrics such as volume of incoming traffic on a local virtual port. The data may be collected locally by each of the second nodes 112 and may enable prediction of an event in the communications system 10 by being used in training and generating a global machine-learning (ML) model by the another node 114. As an example, the ML model may be a Deep Neural Network model that may use virtual port metrics to predict traffic volume in the next 5 minutes.
In embodiments herein, instead of transferring the metric data from the second nodes 112 hosting VNFs to a common and/or central server for training and retraining the ML model, initially, a global ML model may be sent, that is, synchronized, by the another node 114, to each, or a selected number of, the second nodes 112 hosting VNFs. Multiple ML models may be locally retrained, respectively, on the second nodes 112, e.g., compute servers, using the respective metric data available locally on the respective second nodes 112. Once the local copy of the global ML model may have been trained locally, each of the second nodes 112 may determine a local update of the global ML model. The local update may be understood to be “local” as it may be understood to be obtained by the second node 112 locally training its own copy of the global ML model, with its respective local data. The update may be understood as an adjustment of one or more of the parameters or weights of the global ML model, to fit to the data locally obtained by the second node 112. Each of the second nodes 112 may then provide its respective local update to the first node 111.
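By way of non-limiting illustration, the local training step just described may be sketched as follows, e.g., in Python. A simple linear model trained by gradient descent stands in for the global ML model; the function names, the learning rate, and the data shapes are illustrative assumptions, not part of the embodiments herein.

```python
import numpy as np

def local_update(global_weights, local_data, local_targets, lr=0.01, epochs=5):
    """Retrain a local copy of the global model (here: a simple linear
    regressor fitted by gradient descent) and return only the adjusted
    weights -- the raw metric data never leaves the node."""
    w = global_weights.copy()
    for _ in range(epochs):
        pred = local_data @ w
        grad = local_data.T @ (pred - local_targets) / len(local_targets)
        w -= lr * grad
    return w  # the "local update" provided to the first node 111

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))            # locally collected port metrics
y = X @ np.array([1.0, -2.0, 0.5])      # e.g., traffic volume to predict
w_global = np.zeros(3)                  # synchronized global ML model
w_local = local_update(w_global, X, y)  # only these weights are sent
```

Only `w_local` is transmitted onward, which is the source of the bandwidth saving discussed below.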
According to the foregoing, the first node 111, in this Action 301, may obtain a local update from the second node 112 operating in the communications system 10.
Obtaining may be understood as e.g., receiving, and may be performed via the first link 151 .
The local update may be understood to be one of a plurality of local updates to be used to determine, that is to further update, the global machine-learning model to predict an event in the communications system 10. The plurality of local updates may be understood to be provided, respectively, by each of the second nodes 112.
No metric data may be sent from the second nodes 112 to the first node 111 or a central server. Only the parameters, or weights, of the updated ML model may be obtained in this Action 301. The main benefit from this may be understood to be that the bandwidth required to transfer the parameters, or weights, of updated models may be enabled to be substantially smaller than the bandwidth that would otherwise be required to transfer metric data, e.g., periodically. The first node 111 may aggregate the parameters, or weights, coming from the various second nodes 112, and may then process them to enable to update the global ML model accordingly, as will be described in the next actions.
Apart from the local data, a global context may also have been provided to the second nodes 112 for normalizing the local training data. As stated earlier, the global context may be understood as information indicating the one or more circumstances that may define how data may have been generated in the respective VNFs. Two global contexts may be identified: a global configuration context, and a global operational context. The global configuration context, which may provide configuration state information, may provide the mapping of various virtual port names, from which metric data may be collected, to different types of VNFs for collecting local training data. In a cloud environment, a single physical server, also referred to herein as a “compute server”, that is, the second node 112, may host different VNFs. A VNF may be understood as a logical entity, such as a firewall. A VNF may have multiple ‘instances’ to provide redundancy, scalability etc. A single VNF instance may have multiple Virtual Machines (VMs). A single VM may run on a compute server. It may typically not be shared across multiple compute servers. Each VM may have multiple ports. Using a global configuration context, virtual port names associated with a specific VNF may be identified, for example, different VNFs such as firewall, CDN server etc. Local training data may then be collected for VNF specific ports. Use of VNF specific training data may be understood to allow for training of VNF specific models. These models may be expected to be more accurate than models that may have trained using data from all virtual ports on the physical server, irrespective of the type of VNF. The global operational context may provide information about operational status, such as number of instances of VNF which may be running etc., to normalize the local training data.
In a cloud environment, the number of instances of a VNF may be changed, scale-in or scale-out, based on a global dynamic policy. This change in the number of VNF instances may impact some metric data that may be collected on the virtual ports of the respective second nodes 112, e.g., the respective local compute server. The use of the global operational status may allow scaling such metrics so that the training data may be normalized for the dynamic policy actions. For example, if there are three VNFs, the data may be understood to be shared between the three VNFs. If there are five VNFs, then the data may be understood to be shared across the five VNFs. To capture the dynamics of the VNFs accurately, the data may need to be normalized with the number of VNFs that may be used at a particular time. If the dynamic number of VNFs changes, then the data may need to be normalized to understand the dynamics completely. This number-of-VNFs information may be available in the first node 111, e.g., a global orchestrator, such as a cloud orchestrator, and may be used as global context in embodiments herein.
The global context may be regularly updated on local nodes so that training data may be filtered and adapted to dynamically changing operating conditions.
According to the foregoing, the local update may be based on global context information. In some embodiments, that the local update may be based on the global context information may comprise that the data collected by the second node 112 may be normalized with respect to the global context information.
The global context information may comprise at least one of: i) configuration state information of one or more virtual network functions (VNFs), and ii) operational state information of the virtual network functions. The configuration state information may be understood to indicate the mapping of the one or more VNFs to one or more virtual ports. The operational state information of the VNFs may be understood to indicate the number of VNF instances in running state.
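By way of non-limiting illustration, the use of the two kinds of global context information just described may be sketched as follows, e.g., in Python. The dictionary keys, the VNF type names, and the metric values are illustrative assumptions, not part of the embodiments herein: the configuration state selects VNF-specific virtual ports, and the operational state normalizes each metric by the number of running VNF instances.

```python
def normalize_training_data(port_metrics, config_context, operational_context,
                            vnf_type="firewall"):
    """Filter locally collected port metrics to one VNF type using the
    global configuration context (port -> VNF mapping), then normalize
    each sample by the number of running instances of that VNF taken
    from the global operational context."""
    vnf_ports = {p for p, t in config_context["port_to_vnf"].items()
                 if t == vnf_type}
    n_instances = operational_context["running_instances"][vnf_type]
    return {
        port: [value / n_instances for value in samples]
        for port, samples in port_metrics.items()
        if port in vnf_ports
    }

# Illustrative global contexts (key names are assumptions, not from the source).
config = {"port_to_vnf": {"vp0": "firewall", "vp1": "cdn", "vp2": "firewall"}}
ops = {"running_instances": {"firewall": 3, "cdn": 5}}
metrics = {"vp0": [300.0, 600.0], "vp1": [50.0], "vp2": [900.0]}
print(normalize_training_data(metrics, config, ops))
# {'vp0': [100.0, 200.0], 'vp2': [300.0]}
```

If the policy later scales the firewall VNF out to five instances, the same samples would be divided by five instead, keeping the local training data comparable across scale-in/scale-out events.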
Embodiments herein may be understood to provide a technique to identify the global context-based updating in order to accommodate the elastic nature of the nodes in FL.
By obtaining the local update in this Action 301 , the first node 1 11 may be enabled to then check if the local update obtained is sufficiently reliable to be used to update the global ML model, as will be explained in the next Action 302.
Action 302
Another challenge of embodiments herein may be to detect anomalous VNFs. For example, if one of the VNFs comprised in the network is anomalous, it may result in poor updating of the global model. Hence, these instances may be understood to need to be taken care of separately. Embodiments herein may be understood to provide an approach to handle anomalous VNF instances with poor quality data in an FL framework. The first node 111 may be understood as a moderator module that may lie between the another node 114 running the global model, and the second nodes 112 running local models. The purpose of the moderator may be understood to be to monitor the incoming model updates from the second nodes 112 and check the authenticity of the data.
According to embodiments herein, a VNF may be determined as anomalous or not, solely based on model update data that may be being obtained from the second nodes 112, and without usage of data associated with the VNF.
According to the foregoing, and in order to check if the local update obtained may be sufficiently accurate to be used to update the global ML model, in this Action 302, the first node 111 determines whether or not an accuracy of the local update to the global machine-learning model to predict the event in the communications system 10 exceeds a threshold.
Determining may be understood as calculating, deriving, or similar.
The accuracy may be measured in terms of how the performance of the global model changed when compared with before and after the local update. How the accuracy may be measured may be understood to depend on the type of problem being studied. For example, in the case of classification, the accuracy may be measured as percentage of the correctly classified values and incorrectly classified values. As another example, in the case of regression, the accuracy may be measured as percentage of the difference between predicted output and actual output. The local update may be understood to be one of a plurality of local updates to be used to determine the global machine-learning model. The plurality of local updates may be understood to have a respective accuracy exceeding the threshold. That is, only the local updates exceeding the threshold may be used to update the global ML model. The other local updates may be discarded.
The global machine-learning model may have a first accuracy. In a first group of embodiments, the determining in this Action 302 may further comprise a) executing the global machine-learning model with the local update, b) calculating a second accuracy of the updated global machine-learning model, when using the update obtained from the second node 112, c) comparing a change in accuracy from the first accuracy to the second accuracy with respect to the threshold, with a corresponding change observed with respective local updates obtained from a plurality of other local nodes, and d) labelling the local update as anomalous with the proviso that the second accuracy does not exceed the threshold.
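By way of non-limiting illustration, the moderation step of the first group of embodiments may be sketched as follows, e.g., in Python. The `evaluate` callable, the stub accuracies, and the threshold value are illustrative assumptions, not part of the embodiments herein; the point shown is only the before/after accuracy comparison against the threshold.

```python
def moderate_update(evaluate, global_weights, local_update_weights, threshold):
    """Decide whether a local update is anomalous by comparing global-model
    accuracy before and after applying it; `evaluate(weights) -> accuracy`
    is assumed to be provided (e.g., validation on held-out data)."""
    first_accuracy = evaluate(global_weights)
    second_accuracy = evaluate(local_update_weights)
    change = second_accuracy - first_accuracy
    anomalous = change < -threshold  # accuracy dropped by more than T
    return anomalous, change

# Illustrative stub evaluator keyed on the weights' identity.
accuracies = {"global": 0.90, "good_update": 0.92, "bad_update": 0.70}
anomalous, delta = moderate_update(accuracies.get, "global", "bad_update", 0.05)
print(anomalous)  # True: accuracy fell from 0.90 to 0.70
```

An update labelled anomalous in this way would simply not be forwarded to the another node 114, per Action 304 below.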
In a second group of embodiments, the first node 111 may perform Action 302 by first performing Action 303, as will be described next.
By determining whether or not an accuracy of the local update exceeds the threshold, the first node 111 may be enabled to filter out anomalous updates that could potentially diminish the accuracy of the global ML model if they were used to update it, and only select the local updates being sufficiently accurate to update the global ML model, thereby enabling to obtain a global ML model capable of predicting the event with higher accuracy. In short, embodiments herein may enable to detect anomalous VNF behavior without the usage data of VNFs. Instead of normal averaging, accuracy-based averaging may be used to prevent a drop in accuracy of the global model. The accuracy-based averaging may be understood to mean that the local updates may be weighed by the accuracies of the respective local models. This may be useful in the case where some of the local models may perform well, and some other local models may perform poorly.
Action 303
In a second group of embodiments, a model which may be enabled to predict the anomalous VNF instances ahead in time may be created, so that use of the moderator may not be required in the future, and the local model may be directly used to update the global model. This model may provide assistance in understanding anomalous VNFs, and this may be continuously retrained whenever global model accuracy may go below some threshold.
According to the foregoing, the first node 111 may, in time, instead of performing Action 302 local update by local update, first perform this Action 303. In this Action 303, the first node 111 may generate a first local machine-learning model. The first local machine-learning model may be to predict local updates, of the global machine-learning model to predict the event in the communications system 10, having a respective accuracy not exceeding the threshold.
That is, after having performed Action 302 a sufficient amount of times, the first node 111 may be able to build a model to predict when a particular update may be anomalous, that is, may have the respective accuracy not exceeding the threshold. The first node 111 may then perform the determining of Action 302, using the generated first local machine-learning model.
When the decrease in accuracy may be greater than a threshold T, then the update may be labelled as anomalous. Further, such instances may be collected by the first node 111, which may label them. Finally, the first node 111 may be replaced with the first local ML model, which may predict the anomalous VNF instances ahead of time and, further, not use them in the global model update.
To illustrate how this may be implemented with an example, it may be assumed that the data is as shown in Table 1.

Table 1. [Table content not recoverable from the source extraction.]
The first local ML model may be trained, which may predict the anomalous instance of a VNF with the iteration so that, in the future, it may predict which instance of a VNF may be anomalous and ensure that the global ML model is not updated with this anomalous instance. In this way, the first node 111 may be eventually removed and replaced with the first local ML model. The node type may also be included in the first local ML model to ensure heterogeneous node capabilities, and to use the first local ML model so that it may have improved capabilities of anomalous VNF detection. Also, different features may be included which may represent the features of this VNF and may be used as features to the first local ML model. For example, if there is an increase in the CPU usage when compared with previous iterations, the node may be suspected of being anomalous.
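By way of non-limiting illustration, such a first local ML model may be sketched as follows, e.g., as a small logistic-regression classifier in Python. The feature chosen (CPU-usage increase versus previous iterations), the labelled history, and all names are illustrative assumptions, not part of the embodiments herein; in practice the labels would come from the moderation outcomes collected in Action 302.

```python
import numpy as np

def train_anomaly_predictor(features, labels, lr=0.1, epochs=5000):
    """Fit a logistic-regression model on features of past local updates
    (e.g., CPU-usage increase) and their anomalous/normal labels, so that
    future anomalous updates may be predicted ahead of time."""
    X = np.hstack([features, np.ones((len(features), 1))])  # add bias term
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))          # predicted probabilities
        w -= lr * X.T @ (p - labels) / len(labels)  # gradient step
    return w

def predict_anomalous(w, feature_row):
    x = np.append(feature_row, 1.0)
    return 1.0 / (1.0 + np.exp(-x @ w)) > 0.5

# Illustrative history: [CPU-usage increase %] -> anomalous label.
X_hist = np.array([[1.0], [2.0], [30.0], [40.0], [3.0], [35.0]])
y_hist = np.array([0, 0, 1, 1, 0, 1])
w = train_anomaly_predictor(X_hist, y_hist)
print(predict_anomalous(w, np.array([50.0])))   # large CPU jump -> True
print(predict_anomalous(w, np.array([2.0])))    # small jump -> False
```

An update predicted as anomalous in this way would be excluded from the global model update without the moderator having to re-evaluate the global model each time.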
If the global ML model accuracy does not decrease below the threshold, the first node 111 may be employed to update the global ML model with the update and use the updated data to retrain the first local ML model. In this way, retraining may also be handled.
The performance of Action 303 may be understood to enable to expedite the filtering of the local updates, since the first local machine-learning model may enable the first node 111 to refrain from executing the global machine-learning model with every local update, calculating the second accuracy of the updated global machine-learning model, and comparing the change in accuracy from the first accuracy to the second accuracy with respect to the threshold, with the corresponding change observed with the respective local updates obtained from the plurality of other local nodes.
Action 304
In this Action 304, the first node 111 sends the local update to the another node 114 operating in the communications system 10 based on a result of the determination performed in Action 302, optionally with the help of the first local machine-learning model generated in Action 303. With the proviso that the accuracy exceeds the threshold, the first node 111 proceeds with the sending of the local update. With the proviso that the accuracy does not exceed the threshold, the first node 111 refrains from sending the local update. In other words, the first node 111 may filter out the local update if it may have determined that it is anomalous, or it may send it to the another node 114 if it has a sufficient accuracy, so that the another node 114 may update the global ML model with it.
Sending may be understood as e.g., transmitting, and may be performed via the second link 152.
With the proviso that the accuracy exceeds the threshold, the global ML model may be updated by averaging the local updates weighted with the features of the local VNFs, such as CPU usage, number of applications running etc., instead of the general averaging in federated learning, as indicated below:
(New_averaging = % of CPU usage_1 * update_1 + % of CPU usage_2 * update_2 + ...)
This may be one of the simple new averaging methods and any averaging may be used based on available features. By sending the local update to the another node 114 based on the result of the determination, the first node 111 may be enabled to filter out anomalous updates that could potentially diminish the accuracy of the global ML model if they were used to update it, and only select the local updates being sufficiently accurate to update the global ML model, thereby enabling to obtain a global ML model capable of predicting the event with higher accuracy.
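A minimal sketch of such a feature-weighted averaging, assuming the per-node CPU usage percentages are available as weights; the values and parameter shapes below are illustrative only:

```python
def weighted_average(updates, cpu_usages):
    """Average parameter vectors weighted by each node's share of CPU
    usage, i.e. New_averaging = w_1 * update_1 + w_2 * update_2 + ..."""
    total = sum(cpu_usages)
    weights = [usage / total for usage in cpu_usages]
    return [sum(w * update[i] for w, update in zip(weights, updates))
            for i in range(len(updates[0]))]

# Two local updates, with the second node using three times the CPU.
new_params = weighted_average([[1.0, 2.0], [3.0, 4.0]], [25.0, 75.0])
# new_params == [2.5, 3.5]
```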
Embodiments of a method performed by the second node 112, will now be described with reference to the flowchart depicted in Figure 4. The second node 112 operates in the communications system 10. The method is to facilitate prediction of an event in the communications system 10.
The method may comprise the following actions. Several embodiments are comprised herein. In some embodiments, some actions may be performed, in other embodiments, all actions may be performed. One or more embodiments may be combined, where applicable. All possible combinations are not described to simplify the description. It should be noted that the examples herein are not mutually exclusive. Components from one example may be tacitly assumed to be present in another example and it will be obvious to a person skilled in the art how those components may be used in the other examples. In Figure 4, optional actions are represented in boxes with dashed lines.
The detailed description of some of the following corresponds to the same references provided above, in relation to the actions described for the first node 111 and will thus not be repeated here to simplify the description. For example, the second node 112 may be a compute server.
Action 401
In this Action 401, the second node 112 obtains the global context information indicating which one or more virtual network functions the second node 112 is to collect data from, to train a local machine-learning model to predict the event.
In some embodiments, the obtaining in this Action 401 of the global context information may comprise one of the following two options. In a first option, the obtaining of the global context information may comprise receiving the global context information from the third node 113 operating in the communications system 10, e.g., a global model orchestrator. In a second option, the obtaining of the global context information may comprise determining a local machine-learning model, referred to herein as a “third” local machine-learning model, to predict the global context information by training the third machine-learning model with observed global context information received from the third node 113. Determining may be understood as calculating, deriving, or similar.
Obtaining may be understood as e.g., receiving, and may be performed via the third link 153.
This third local ML model may be used to determine e.g., the number of active instances of the firewall VNF based on traffic volume. This third local ML model may be trained in a supervised manner where the output label, that is, the number of active instances, may be provided by the another node 114. Once the third local ML model performance may be acceptable, the second node 112 may inform the another node 114 to stop sending the operational data. Use of such an ML model may be understood to reduce the data exchanged between the another node 114 and the second node 112. If this third local ML model performance deteriorates, the third local ML model may be re-trained. For re-training, the second node 112 may inform the another node 114 to start sending the operational data.
In another set of embodiments, the third local ML model may be used to determine the type of VNF based on traffic volume on a virtual port. This third local ML model may be trained in a supervised manner where the output label, that is, the type of VNF for a virtual port, may be provided by the third node 113. Once the third local ML model performance may be acceptable, the third node 113 may be informed to stop sending the list of relevant ports.
Use of such a third local ML model may be understood to reduce the data exchanged between the third node 113 and the second node 112.
If the performance of this third local ML model deteriorates, the third local ML model may be re-trained. For re-training, the second node 112 may inform the third node 113 to start sending the list of relevant ports and associated VNF.
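As a non-limiting sketch of such a third local ML model, a simple supervised regressor may be fitted to predict the number of active instances from traffic volume; the linear model form and the sample training data below are assumptions for illustration only:

```python
import numpy as np

# Hypothetical supervised training data: traffic volume (GB/s) paired
# with instance-count labels supplied by the orchestrating node.
traffic = np.array([4.0, 8.0, 12.0, 16.0])
instances = np.array([1.0, 2.0, 3.0, 4.0])

# Fit instances ~= a * traffic + b by least squares.
A = np.vstack([traffic, np.ones_like(traffic)]).T
a, b = np.linalg.lstsq(A, instances, rcond=None)[0]

def predict_instances(volume_gbps):
    """Once this predictor is acceptable, the operational data need no
    longer be sent; the count is predicted from local traffic instead."""
    return int(round(a * volume_gbps + b))
```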
As described earlier, the global context information may comprise one of: i) the configuration state information of the one or more virtual network functions, and ii) the operational state information of the one or more virtual network functions.
By obtaining the global context information in this Action 401, the second node 112 is then enabled to know from which virtual network functions to collect the data to train the local ML model, and to normalize the data collected by virtue of the dynamic changes in e.g., the number of virtual network functions instantiated, as will be explained later, in Action 404. This in turn enables the second node 112 to determine a more accurate local ML model, referred to herein as the second local ML model, to predict the event, which may then be used to also determine a more accurate global ML model to predict the event.
Action 402
For training the global ML model to predict the event, the another node 114 may request that the local copy of the global ML model be trained for a specific VNF, such as a firewall VNF. To achieve such a purpose, the another node 114 may query the third node 113 to identify the second nodes 112 on which the distributed ML model training may be performed. The third node 113 may also send the name of relevant virtual ports and associated VNF to the second node 112.
Accordingly, in some embodiments, the global context information may comprise configuration state information of the one or more virtual network functions enabling mapping information of virtual port names to types of virtual network functions.
In this Action 402, the second node 112 may collect the data of the one or more virtual network functions, based on the obtained configuration state information.
In some examples, this Action 402 may comprise collecting training data from various virtual ports and mapping them to various VNFs, as may have been specified by the third node 113. In some examples, this Action 402 may be performed by a VNF data mapper sub-module managed by the second node 112.
As and when a new instance of a specific VNF may be created, the third node 113 may send the updated port list to the second node 112.
In case the VNF instance is migrated to other hosts, or the VNF instance is shut down, the virtual port may no longer exist on the second node 112. In such cases, no further virtual port data may be available for training, and the second local ML model may be trained only with the data that may have been collected until that time.
Action 403
During the course of model training, the number of VNF instances may be increased or decreased, based on some dynamic policy decisions.
In other embodiments, the global context information may comprise operational state information of the one or more virtual network functions enabling normalization of the data based on a number of the one or more virtual network functions instantiated at the time of the updating 404.
In this Action 403, the second node 112 may normalize the data collected by the second node 112 with respect to the obtained operational state information of the one or more virtual network functions.
For example, consider a case where a total of 12GB/s traffic is coming to a datacenter which is being distributed equally to two instances of firewall VNF. The volume of traffic being received by each firewall VNF is ~ 6GB/s. If the number of active VNF instances is increased to three instances due to a dynamic policy decision, the share of traffic coming to each firewall VNF may decrease to 4GB/s. In the absence of data normalization, based on total active instances, the model may try to find a pattern for this traffic decrease (from 6GB/s to 4GB/s). This may cause the model to learn a spurious pattern since the volume change was due to a decision by a dynamic operational policy.
The second node 112 may overcome this problem, e.g., via an operational state mapper module managed by the second node 112. In the above case, initially, the volume metric may be multiplied by 2, corresponding to the number of active instances, to get an overall estimate of the total traffic coming to the datacenter, that is, 6GB/s * 2 = 12GB/s. Once the number of instances is increased, the volume metric may be multiplied by 3, that is, the number of active instances, to get volume estimate, namely 4GB/s * 3 =12GB/s. Due to this normalization, the training data may accurately represent that there is no change in the underlying data generating process, that is, the subscriber traffic coming to datacenter.
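The normalization described above may be sketched as follows; the traffic figures are those of the example, and the function name is illustrative:

```python
def normalize_volume(per_instance_gbps, active_instances):
    """Scale the per-instance volume metric by the number of active VNF
    instances to estimate the total traffic entering the datacenter."""
    return per_instance_gbps * active_instances

# Before the scale-out: two instances sharing 12 GB/s equally.
before = normalize_volume(6.0, 2)   # 12.0 GB/s
# After a third instance is added by a dynamic policy decision.
after = normalize_volume(4.0, 3)    # 12.0 GB/s
# The normalized training data shows no change in the underlying
# data-generating process, that is, the subscriber traffic.
```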
Action 404
In this Action 404, the second node 112 may update a local machine-learning model, referred to herein as a “second” local machine-learning model to predict the event in the communications system 10, with the normalized collected data to obtain the local update.
The updating in this Action 404 may comprise training a local copy of the global machine-learning model with the normalized collected data, to obtain a new version of the model, that tries to fit the newly collected data. That is, the properly mapped data after normalization may be used for training the second local ML model.
By updating the second local machine-learning model with the normalized collected data in this Action 404, the second node 112 is then enabled to determine a more accurate local ML model to predict the event, which may then be used to also determine a more accurate global ML model to predict the event.
Action 405
In this Action 405, the second node 112 sends the local update of the second local machine-learning model to the first node 111 operating in the communications system 10. The local update is based on the obtained global context information.
That is, once the training may be complete, the updated ML model parameters may be sent to the first node 111, which may obtain the local update in Action 301, as described earlier.

Embodiments of a method, performed by the third node 113, will now be described with reference to the flowchart depicted in Figure 5. The third node 113 operates in the communications system 10. The method is to facilitate prediction of the event in the communications system 10.
The method comprises the following action. Several embodiments are comprised herein. One or more embodiments may be combined, where applicable. All possible combinations are not described to simplify the description. It should be noted that the examples herein are not mutually exclusive. Components from one example may be tacitly assumed to be present in another example and it will be obvious to a person skilled in the art how those components may be used in the other examples.
The detailed description of some of the following corresponds to the same references provided above, in relation to the actions described for the first node 111 and will thus not be repeated here to simplify the description. For example, the second node 112 may be a compute server.
Action 501
In some embodiments, as described earlier, the third node 113 sends, to the second node 112 operating in the communications system 10, the global context information. The global context information indicates which one or more virtual network functions the second node 112 is to collect data from to train a local machine-learning model to predict the event.
The sending in this Action 501 , may be understood as e.g., transmitting, and may be performed via the fourth link 154.
As stated earlier, the global context information may comprise one of: i) the configuration state information of the one or more virtual network functions, and ii) the operational state information of the one or more virtual network functions.
By sending the global context information to the second node 112 in this Action 501, the third node 113 enables the second node 112 to know from which virtual network functions to collect the data to train the local ML model, and to normalize the data collected by virtue of the dynamic changes in e.g., the number of virtual network functions instantiated, as explained earlier. This in turn enables the second node 112 to determine a more accurate local ML model to predict the event, which may then be used by the another node 114 to determine a more accurate global ML model to predict the event.
Embodiments of a computer-implemented method, performed by the communications system 10 comprising the plurality of nodes 110, will now be described with reference to the flowchart depicted in Figure 6. The method may be understood to be to facilitate prediction of the event in the communications system 10.
The method may comprise the actions described below. In some embodiments some of the actions may be performed. In some embodiments all the actions may be performed. In Figure 6, optional actions are indicated with a dashed box. One or more embodiments may be combined, where applicable. All possible combinations are not described to simplify the description. It should be noted that the examples herein are not mutually exclusive. Components from one example may be tacitly assumed to be present in another example and it will be obvious to a person skilled in the art how those components may be used in the other examples.
The detailed description of the Actions depicted in Figure 6 may be understood to correspond to that already provided when describing the actions performed by each of the first node 111 , the second node 112, and the third node 113, and will therefore not be repeated here.
Action 601
This Action 601 , which corresponds to Action 501 , comprises, sending 601 , by the third node 113 operating in the communications system 10, the global context information to the second node 112 operating in the communications system 10. The global context information indicates which one or more virtual network functions the second node 112 is to collect data from to train the local machine-learning model, e.g., the second local machine-learning model, to predict the event.
The global context information may comprise at least one of: i) the configuration state information of the one or more virtual network functions, and ii) the operational state information of the virtual network functions.
Action 602
This Action 602, which corresponds to Action 401 , comprises, obtaining, by the second node 112, the global context information.
The obtaining in this Action 602, 401 of the global context information may comprise one of: a) receiving the global context information from the third node 113 operating in the communications system 10, and b) determining the third local machine-learning model to predict the global context information by training the third machine-learning model with the received global context information. The global context information may comprise the configuration state information of the one or more virtual network functions enabling mapping information of virtual port names to types of virtual network functions.
In some embodiments, the global context information may comprise the operational state information of the one or more virtual network functions enabling normalization of the data based on the number of the one or more virtual network functions instantiated at the time of the updating of Action 605, 404.
In some embodiments, the second node 112 may be a compute server.
Action 603
In some embodiments, this Action 603, which corresponds to Action 402, comprises collecting, by the second node 112, the data based on the obtained configuration state information of the one or more virtual network functions.
Action 604
In some embodiments, this Action 604, which corresponds to Action 403, comprises normalizing, by the second node 112, the data collected by the second node 112 with respect to the obtained operational state information of the one or more virtual network functions.
Action 605
In some embodiments, this Action 605, which corresponds to Action 404, comprises updating, by the second node 112, the second local machine-learning model with the normalized collected data to obtain the local update.
Action 606
This Action 606, which corresponds to Action 405, comprises, sending, by the second node 112, the local update of the second local machine-learning model to predict the event in the communications system 10, to the first node 111 operating in the communications system 10. The local update is based on the obtained global context information.
That the local update is based on the global context information may comprise normalizing the data collected by the second node 112 with respect to the global context information.
Action 607
In some embodiments, the method may comprise, in this Action 607, which corresponds to Action 301 , obtaining, by the first node 111 , the local update from the second node 112 operating in the communications system 10. The local update may be based on the global context information.
Action 608
This Action 608, which corresponds to Action 302, comprises, determining 608, 302, by the first node 111 , whether or not an accuracy of the local update exceeds a threshold.
In some embodiments, the global machine-learning model may have the first accuracy. The determining in this Action 608 may further comprise: a) executing the global machine-learning model with the local update, b) calculating a second accuracy of the updated global machine-learning model, c) comparing a change in accuracy from the first accuracy to the second accuracy with respect to the threshold, with a corresponding change observed with respective local updates obtained from a plurality of other local nodes, and d) labelling the local update as anomalous with the proviso that the second accuracy does not exceed the threshold.
Action 609
In some embodiments, the method may comprise, in this Action 609, which corresponds to Action 303, generating, by the first node 111 , the first local machine-learning model to predict the local updates, of the global machine-learning model to predict the event in the communications system 10, having the respective accuracy not exceeding the threshold.
Action 610
This Action 610, which corresponds to Action 304, comprises sending, by the first node 111, the local update to the another node 114 operating in the communications system 10 based on the result of the determination. One of the following applies: i) with the proviso that the accuracy exceeds the threshold, the first node 111 proceeds with the sending of the local update; and ii) with the proviso that the accuracy does not exceed the threshold, the first node 111 refrains from sending the local update.
The local update may be one of the plurality of local updates used by the another node 114 to determine the global machine-learning model. The plurality of local updates may have the respective accuracy exceeding the threshold.
The interplay between the actions performed by each of the first node 111, the second node 112, the third node 113, and the another node 114 will now be illustrated with a few non-limiting examples in the next Figures. Figure 7 is a schematic diagram illustrating a non-limiting example of embodiments herein, according to which the communications system 10 may comprise the following components: the first node 111 may be a Global Moderator, the second node 112 may be a Compute Server 1, the third node 113 may be a Cloud Orchestrator, and the another node 114 may be a Global Model Trainer. Further details on these nodes are provided next.
Cloud Orchestrator
The third node 113 may be understood to maintain the logical view of the VNFs and their physical realization on the second nodes 112, which in the particular example of Figure 7 comprise a Compute server 1 , and a Compute Server 2. The third node 113 may also maintain the operational state of the various VNFs, such as the number of active instances of VNFs that may be running.
In an NFV based deployment, the third node 113 may be realized by the NFV Orchestrator.
The third node 113 may update the second nodes 112 with a global configuration and an operational context for the training data to be appropriately extracted and normalized.
Global ML model trainer
When global re-training of the global ML model may be required, the another node 114 may synchronize, that is, send, the global ML model to the various second nodes 112.
The another node 114 may receive the updated ML model parameters from the first node 111, which in turn may get the updated ML model parameters from multiple second nodes 112. The another node 114 may aggregate these parameters to update the global ML model.
Local ML model trainer
One instance of the local ML model trainer module 70 may reside on each second node 112. This module may have two sub-modules: a Configuration Data mapper 71, or VNF Data mapper 71, and an Operational state mapper 72.
Configuration Data mapper 71
This sub-module may be understood to be responsible for mapping the virtual ports, from which the data may be collected, to different VNF.
To illustrate the functionality of this module, a scenario may be considered where a single compute server may host two VNF, e.g., a firewall (fw) and a router (rtr). The virtual ports on the server may be fw-port-in, fw-port-out, rtr-port-in, rtr-port-out.
The module may then map these ports to VNFs, that is: fw-port-in and fw-port-out may map to the firewall VNF, and rtr-port-in and rtr-port-out may map to the router VNF.
In one implementation, this information, that is, the mapping of virtual ports to VNFs, may be derived based on provisioning information available from the third node 113. In an alternative implementation, this information may be derived based on the traffic characteristics of the virtual ports themselves.
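As a non-limiting sketch, the first implementation may be approximated by grouping ports per VNF using a provisioning table; the port-name prefix convention below is a hypothetical stand-in for the information available from the third node 113:

```python
def map_ports_to_vnfs(ports, prefix_to_vnf):
    """Group virtual ports per VNF based on a provisioning mapping of
    port-name prefixes to VNF types."""
    mapping = {}
    for port in ports:
        vnf = prefix_to_vnf.get(port.split("-")[0])
        if vnf is not None:
            mapping.setdefault(vnf, []).append(port)
    return mapping

# The scenario above: one compute server hosting a firewall and a router.
ports = ["fw-port-in", "fw-port-out", "rtr-port-in", "rtr-port-out"]
mapping = map_ports_to_vnfs(ports, {"fw": "firewall", "rtr": "router"})
# mapping == {"firewall": ["fw-port-in", "fw-port-out"],
#             "router": ["rtr-port-in", "rtr-port-out"]}
```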
Operational state mapper 72
This module may be understood to be responsible for ascertaining the global operational state of various VNF instances. To illustrate the functionality of this module, a scenario may be considered where the cloud operations may manage two VNFs: a firewall VNF and a router VNF. The module may determine the number of instances of firewall instances that may be active and the number of instances of router VNF that may be active.
In one implementation, this information may be determined based on operational information available from the third node 113.
In an alternative implementation, this information may be determined based on traffic characteristics.
Global moderator
During the initial phases, the first node 111 may be understood to accept all the updates sent from the second nodes 112 and use them to update the cached version of the global model present in its memory. Further, it may randomly select second nodes 112 and send the cached global model back to a specific set of second nodes 112.
The performance report of the local copy of the global ML model 73 may then be sent from the second nodes 112 to the first node 111 on how the local copy of the global ML model 73 performed with the local data. Depending on the reduction, or increase, in accuracy, a VNF instance may be labelled as anomalous, or as normal, as described earlier.
Example Operation - Training the ML model
Figure 7 also illustrates a particular example of how a method according to embodiments herein may be implemented for a ML model training.
Step1: The another node 114 may start by querying the third node 113 to identify the second nodes 112 on which the distributed ML model training may be performed.
Step2: The third node 113 may reply with the list of second nodes 112. The selection of second nodes 112 may be based on a policy, such as second nodes 112 that may have enough free local storage to store the training data.
Step3: The global ML model may be synchronized, that is, sent, to the various second nodes 112 by the another node 114.
Step4: The VNF data mapper sub-module 71 of the local-trainer-module 70 may collect, according to Actions 402 and 603, the training data from various virtual ports and may map them to various VNFs. A VNF instance may have multiple Virtual Machines (VMs). VM1 and VM2 correspond to VMs of VNF instances. The data switch 76 may be understood as a software-based switch that may allow VMs to send and receive traffic. The traffic may be sent to other VMs on the same compute server, or outside of the compute server.
Step5: The Operational state mapper sub-module 72 of the local-trainer module 70 may determine the global operational state. It may use this information to normalize the training data, according to Actions 403 and 604.
Step6: The properly mapped data after normalization may be used for training the ML model 73, according to Actions 404 and 605. Once the training may be complete, the updated ML model parameters may be sent to the first node 111 , according to Actions 405 and 606.
Step7: The first node 111 may filter out the updates from anomalous second nodes 112, according to Actions 302 and 608. The filtered updates may then be sent to the another node 114, according to Actions 304 and 610.
Step8: The another node 114 may aggregate the updates coming from the first node 111 to update the global ML model.
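A minimal sketch of the aggregation in Step8, assuming standard federated averaging of the filtered parameter updates; the update shapes and values are illustrative only:

```python
def aggregate(filtered_updates):
    """Element-wise average of the filtered parameter updates, as in
    plain federated averaging, to produce new global ML model
    parameters."""
    n = len(filtered_updates)
    return [sum(values) / n for values in zip(*filtered_updates)]

# Two filtered updates arriving from the first node 111.
global_params = aggregate([[1.0, 2.0], [3.0, 4.0]])
# global_params == [2.0, 3.0]
```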
Figure 8 is a schematic diagram illustrating a non-limiting example of a sample ML model. In this example, the ML model may be a Deep Neural Network model that may use virtual port metrics to predict traffic volume in the next 5 minutes, as shown in Figure 8 to be output by the ML model. As input to the input layer of the ML model, there may be data from a first port, port metric-1 , and a moving average of metric-1 , data from a second port, port metric- 2, and a moving average of metric-2, and data from a third port, port metric-3. The ML model may comprise hidden layers, and an output layer. In a neural network, an activation function may be understood to be responsible for transforming the summed weighted input from the node into the activation of the node or output for that input. The rectified linear activation function (ReLU) may be understood as a piecewise linear function that may output the input directly if it is positive, and otherwise, it may output zero.
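A minimal numpy sketch of such a network, with one ReLU hidden layer taking the five port-metric inputs of Figure 8 and a linear output for the predicted volume; the layer sizes and the untrained random weights are illustrative only:

```python
import numpy as np

def relu(x):
    # Output the input directly if positive; otherwise output zero.
    return np.maximum(0.0, x)

def forward(x, weights, biases):
    hidden = x
    for w, b in zip(weights[:-1], biases[:-1]):
        hidden = relu(hidden @ w + b)         # ReLU hidden layers
    return hidden @ weights[-1] + biases[-1]  # linear output layer

# Illustrative, untrained parameters: 5 inputs -> 8 hidden -> 1 output.
rng = np.random.default_rng(0)
weights = [rng.standard_normal((5, 8)), rng.standard_normal((8, 1))]
biases = [np.zeros(8), np.zeros(1)]

# Inputs: port metric-1, its moving average, port metric-2, its moving
# average, and port metric-3.
x = np.array([1.0, 0.8, 0.5, 0.4, 0.2])
prediction = forward(x, weights, biases)  # traffic volume, next 5 minutes
```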
Example Operation - Training the ML model during a scale-in and scale-out procedure
Figure 9 illustrates a particular non-limiting example of how a method according to embodiments herein may be implemented for training the ML model during a scale-in and scale-out procedure.
During the course of model training, the number of VNF instances may be increased or decreased, based on some dynamic policy decisions. To illustrate this, a scenario may be considered where an ML model is being trained to predict the traffic volume being received by a virtual firewall VNF. In a typical deployment, all the traffic destined to the firewall VNF will be distributed equally among the various firewall instances. In such conditions, according to an example of embodiments herein, the following steps may be performed:
Step1: Whenever new firewall VNF instances may be instantiated, the third node 113 may send the count of the number of active firewall VNF instances to the operational state mapper module 72 on relevant second nodes 112, where the local copy of the global ML model 73 may be being trained.
Step2: Based on the timestamp of the new VNF instantiation and the new count of firewall VNF instances, the operational state mapper 72 may normalize the training data, according to Actions 403 and 604.
Step3: As an alternative, the operational state mapper module 72 may host an additional ML model, the third local machine-learning model. This third local machine-learning model may be used to determine the number of active instances of firewall VNF based on traffic volume. This third local machine-learning model may be trained, according to Actions 401 and 602, in a supervised manner where the output label, that is, the number of active instances, may be provided by the third node 113, since the number of instances may keep changing. Once the third local machine-learning model performance is acceptable, the third node 113 may be informed to stop sending the operational data, e.g., the number of active instances of VNF.
Use of such a third local machine-learning model may reduce the data exchanged between the third node 113 and the operational mapper module 72. This may be understood to be because, initially, the third node 113 may send the operational state data. Later, when the third local machine-learning model may have been trained, and may be able to predict the operation state data from the incoming local data itself, the third node 113 may no longer need to send the operational state data.
If this third local machine-learning model performance deteriorates, the third local machine-learning model may be re-trained. For re-training, the third node 113 may be informed to start sending the operational data.
Step4: The normalization procedure by the operational mapper module 72 may be performed during the main model prediction phase as well. The procedure remains the same as described above.
Example Operation - Training a VNF specific ML model
For the model training, the another node 114 may request the model to be trained for a specific VNF, such as a firewall VNF. In such cases, a particular example of embodiments herein may be as follows:
Step1: Similar to what has been described in the Section entitled “training the ML model”, the another node 114 may query the third node 113 to identify the second nodes 112 on which the distributed ML model training may be performed. However, in this case, the another node 114 may also pass the relevant VNF identification, e.g., the firewall VNF.
Step2: Similar to the Section entitled “training the ML model”, the third node 113 may reply with the list of second nodes 112. The selection of second nodes 112 may be based on a policy, such as second nodes 112 that may have enough free local storage to store the training data.
Step3: Similar to the Section entitled “training the ML model”, the global ML model may be synchronized, that is, sent, to various second nodes 112.
Step4: Unlike the step noted in the Section entitled “training the ML model”, in this scenario, the third node 113 may also send the name of relevant virtual ports and associated VNF to the Configuration Data mapper 71.
Step5: The VNF data mapper sub-module 71 of the local-trainer-module 70 may collect, according to Actions 402 and 603, the training data from the various virtual ports as specified by the third node 113.
Step6: As and when a new instance of a specific VNF may be created, the third node 113 may send the updated port list to the VNF data mapper module 71.
Step7: In case the VNF instance is migrated to other hosts, or the VNF instance is shut down, the virtual port may no longer exist on the local second node 112. In such cases, no further virtual port data may be available for training, and the second local machine-learning model may be trained only with the data that may have been collected until that time.
Step8: The rest of the steps are the same as noted in the Section entitled “training the ML model”, starting with Step5, the Operational state mapper sub-module 72.
As an alternative, the VNF data mapper module 71 may host an additional ML model, the third local machine-learning model. This third local machine-learning model may be used to determine the type of VNF based on traffic volume on a virtual port. This model may be trained in a supervised manner where the output label, that is, the type of VNF for a virtual port, may be provided by the third node 113. Once the ML model performance is acceptable, the third node 113 may be informed to stop sending the list of relevant ports.
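As a non-limiting illustration of such a third local machine-learning model, the sketch below uses a minimal nearest-centroid classifier to predict the type of VNF from traffic-volume features of a virtual port, trained in a supervised manner with labels that the third node 113 would provide; the class name and feature layout are assumptions for the example:

```python
# Minimal nearest-centroid classifier standing in for the third local
# machine-learning model: VNF type is predicted from traffic-volume
# features; labels come from the third node during supervised training.
class VnfTypeClassifier:
    def __init__(self):
        self.centroids = {}  # vnf_type -> mean feature vector

    def fit(self, features, labels):
        sums, counts = {}, {}
        for x, y in zip(features, labels):
            acc = sums.setdefault(y, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
            counts[y] = counts.get(y, 0) + 1
        self.centroids = {
            y: [v / counts[y] for v in acc] for y, acc in sums.items()
        }

    def predict(self, x):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda y: sq_dist(self.centroids[y]))
```

Once such a classifier performs acceptably, the third node 113 could stop sending labelled port lists, as described above.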
Use of such a third local machine-learning model may be understood to reduce the data exchanged between the third node 113 and the VNF data mapper module 71. If the performance of this third local machine-learning model deteriorates, the third local machine-learning model may be re-trained. For re-training, the third node 113 may be informed to start sending the list of relevant ports and associated VNF.
Example Operation - Filtering anomalous updates
In case of anomalous data coming from VNF instances, the anomalous data may spoil the global ML model. In this case, these anomalous instances may need to be identified before the global ML model may be updated. Since the raw VNF data is not available, it may not be possible to detect the anomalous instances from the data itself.
Figure 9 is a schematic diagram illustrating a particular example of how, in order to detect anomalous VNFs without depending on VNF data, the first node 111 may be created according to embodiments herein. The first node 111 may be understood to lie between the local-model-trainer 70 and the another node 114, that is, the global-model-trainer. The first node 111 may have the version of the global updated model. The first node 111 may take the local updates obtained according to Action 301 from the second nodes 112, one at a time, and compute the global updated model. Other second nodes 112 may then be asked to report the accuracy obtained with data from VNFs other than the VNF whose model may have been used to update. Based on the decrease in accuracies, the local updates may be labelled as anomalous according to Actions 302 and 608, and the anomalous local updates may then not be used to update the global model.
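The filtering of Figure 9 may, purely for illustration, be sketched as follows: each local update is applied to a copy of the global model parameters, the resulting accuracy is evaluated, and updates whose accuracy drop exceeds a threshold are labelled anomalous and excluded. The evaluation callback, the additive parameter layout, and the threshold value are assumptions for the example:

```python
# Illustrative filter: apply each local update to the global parameters,
# measure the accuracy change, and keep only updates whose accuracy drop
# stays within an acceptable bound; the rest are labelled anomalous.
def filter_anomalous_updates(global_params, local_updates, evaluate, max_drop=0.05):
    baseline = evaluate(global_params)
    accepted = []
    for update in local_updates:
        candidate = [g + u for g, u in zip(global_params, update)]
        drop = baseline - evaluate(candidate)
        if drop <= max_drop:          # accuracy did not degrade too much
            accepted.append(update)   # keep for the global aggregation
        # otherwise: the update is anomalous and is not used
    return accepted
```

In the embodiments, `evaluate` would correspond to the accuracies reported by the other second nodes 112.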
As a summarized overview, embodiments herein may be understood to relate to the following. Newer ML model training techniques are emerging to train ML models in a distributed manner. One such technique is federated learning. Federated learning, a.k.a. collaborative learning, may be understood as a machine learning technique that may train an algorithm across multiple decentralized edge devices or servers holding local data samples, without exchanging their data samples. Embodiments herein may be understood to use similar techniques to train a global ML model based on VNF traffic related data.
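By way of a non-limiting illustration of the federated learning technique just described, the sketch below shows second nodes 112 each performing local training on their own data and returning only model parameters, which the central trainer averages; the 1-D linear model, learning rate, and data layout are assumptions made for the example:

```python
# Minimal federated-averaging sketch: parameters, not raw data, travel.
def local_step(weights, data, lr=0.1):
    """One pass of gradient steps for a 1-D linear model y = w*x."""
    w = weights
    for x, y in data:
        w -= lr * 2 * (w * x - y) * x  # derivative of (w*x - y)**2
    return w

def federated_round(global_w, node_datasets):
    """Each node starts from the global weights; the server averages."""
    local = [local_step(global_w, d) for d in node_datasets]
    return sum(local) / len(local)
```

Repeating `federated_round` converges the shared weight toward the relation present in the decentralized datasets, without any node exchanging its samples.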
The global context may be understood to provide the mapping of various virtual ports, of which metric data may be collected, to different types of VNFs.
The global context may be understood to provide information about operational status, such as the number of instances of a VNF which may be running, etc., which may then be used to normalize the training data.
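For illustration only, such normalization may be sketched as dividing raw per-VNF metrics by the number of instances running at collection time, so that second nodes 112 hosting more instances do not skew the training data; the sample and count layouts are assumptions for the example:

```python
# Illustrative normalization using operational state: raw metrics are
# scaled by the number of running instances of each VNF type.
def normalize(samples, instance_counts):
    """samples: [(vnf_type, raw_metric)]; instance_counts: vnf_type -> count."""
    normalized = []
    for vnf_type, raw in samples:
        count = instance_counts.get(vnf_type, 1)  # default: one instance
        normalized.append((vnf_type, raw / count))
    return normalized
```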
Furthermore, embodiments herein may enable detecting anomalous VNFs in a Federated Learning context by using the ML models, without using the VNF data. One advantage of embodiments herein may be understood to be the reduced bandwidth load on the management plane due to the smaller sized ML model parameters that may be sent to a central location rather than the voluminous metric data.
One further advantage of embodiments herein may be understood to be the better model accuracy due to the improved normalization of the training data using operational status and avoiding parameters for aggregation that may originate from second nodes 112 where VNF behavior may be anomalous.
Another advantage of embodiments herein may be understood to be the faster model convergence due to the better labelling and/or categorization of training data by associating virtual ports of different VNF types.
Figure 11 depicts two different examples in panels a) and b), respectively, of the arrangement that the first node 111 may comprise to perform the method described in Figure 3. In some embodiments, the first node 111 may comprise the following arrangement depicted in Figure 11a. The first node 111 may be understood to be configured to operate in the communications system 10. The first node 111 may be further configured to facilitate prediction of the event in the communications system 10.
Several embodiments are comprised herein. It should be noted that the examples herein are not mutually exclusive. One or more embodiments may be combined, where applicable. All possible combinations are not described to simplify the description. Components from one embodiment may be tacitly assumed to be present in another embodiment and it will be obvious to a person skilled in the art how those components may be used in the other exemplary embodiments. In Figure 11 , optional units are indicated with dashed boxes.
The detailed description of some of the following corresponds to the same references provided above, in relation to the actions described for the first node 111 and will thus not be repeated here. For example, in some examples, each of the second nodes 112 may be configured to be a compute server.
The first node 111 is configured to, e.g., by means of a determining unit 1101 within the first node 111 , configured to determine whether or not the accuracy of the local update to the global machine-learning model to predict the event in the communications system 10 exceeds the threshold.
The first node 111 is further configured to, e.g., by means of a sending unit 1102 within the first node 111 , configured to send the local update to the another node 114 configured to operate in the communications system 10 based on the result of the determination. One of the following options may apply: I) with the proviso that the accuracy exceeds the threshold, the first node 111 is further configured to proceed with the sending of the local update, and II) with the proviso that the accuracy does not exceed the threshold, the first node 111 is further configured to refrain from sending the local update.
In some embodiments, the local update may be configured to be one of the plurality of local updates to be used to determine the global machine-learning model. The plurality of local updates may be configured to have the respective accuracy exceeding the threshold.
The global machine-learning model may be configured to have the first accuracy. To determine may be further configured to comprise a) executing the global machine-learning model with the local update, b) calculating the second accuracy of the global machine-learning model configured to be updated, c) comparing the change in accuracy from the first accuracy to the second accuracy with respect to the threshold, with the corresponding change configured to be observed with respective local updates configured to be obtained from the plurality of other local nodes, and d) labelling the local update as anomalous with the proviso that the second accuracy does not exceed the threshold.
In some embodiments, the first node 111 may be further configured to, e.g., by means of a generating unit 1103 within the first node 111 , configured to generate the first local machine-learning model to predict the local updates, of the global machine-learning model to predict the event in the communications system 10, having the respective accuracy not exceeding the threshold.
In some embodiments, the first node 111 may be further configured to, e.g., by means of an obtaining unit 1104 within the first node 111 , configured to obtain the local update from the second node 112 configured to operate in the communications system 10. The local update may be configured to be based on global context information.
In some embodiments, the global context information may be configured to comprise at least one of: i) configuration state information of the one or more virtual network functions, and ii) operational state information of the virtual network functions.
That the local update is configured to be based on the global context information may comprise that the data configured to be collected by the second node 112 may be configured to be normalized with respect to the global context information.
The embodiments herein in the first node 111 may be implemented through one or more processors, such as a processor 1105 in the first node 111 depicted in Figure 11a, together with computer program code for performing the functions and actions of the embodiments herein. A processor, as used herein, may be understood to be a hardware component. The program code mentioned above may also be provided as a computer program product, for instance in the form of a data carrier carrying computer program code for performing the embodiments herein when being loaded into the first node 111. One such carrier may be in the form of a CD ROM disc. It is however feasible with other data carriers such as a memory stick. The computer program code may furthermore be provided as pure program code on a server and downloaded to the first node 111.
The first node 111 may further comprise a memory 1106 comprising one or more memory units. The memory 1106 is arranged to be used to store obtained information, store data, configurations, schedulings, and applications etc. to perform the methods herein when being executed in the first node 111.
In some embodiments, the first node 111 may receive information from, e.g., any of the second nodes 112, the third node 113, and/or the another node 114 through a receiving port 1107. In some embodiments, the receiving port 1107 may be, for example, connected to one or more antennas in the first node 111. In other embodiments, the first node 111 may receive information from another structure in the communications system 10 through the receiving port 1107. Since the receiving port 1107 may be in communication with the processor 1105, the receiving port 1107 may then send the received information to the processor 1105. The receiving port 1107 may also be configured to receive other information.
The processor 1105 in the first node 111 may be further configured to transmit or send information to e.g., any of the second nodes 112, the third node 113, another node 114, and/or another structure in the communications system 10, through a sending port 1108, which may be in communication with the processor 1105, and the memory 1106.
Those skilled in the art will also appreciate that the units 1101-1104 described above may refer to a combination of analog and digital circuits, and/or one or more processors configured with software and/or firmware, e.g., stored in memory, that, when executed by the one or more processors such as the processor 1105, perform as described above. One or more of these processors, as well as the other digital hardware, may be included in a single Application-Specific Integrated Circuit (ASIC), or several processors and various digital hardware may be distributed among several separate components, whether individually packaged or assembled into a System-on-a-Chip (SoC).
Also, in some embodiments, the different units 1101 -1104 described above may be implemented as one or more applications running on one or more processors such as the processor 1105.
Thus, the methods according to the embodiments described herein for the first node 111 may be respectively implemented by means of a computer program 1109 product, comprising instructions, i.e., software code portions, which, when executed on at least one processor 1105, cause the at least one processor 1105 to carry out the actions described herein, as performed by the first node 111. The computer program 1109 product may be stored on a computer-readable storage medium 1110. The computer-readable storage medium 1110, having stored thereon the computer program 1109, may comprise instructions which, when executed on at least one processor 1105, cause the at least one processor 1105 to carry out the actions described herein, as performed by the first node 111. In some embodiments, the computer-readable storage medium 1110 may be a non-transitory computer-readable storage medium, such as a CD ROM disc, or a memory stick. In other embodiments, the computer program 1109 product may be stored on a carrier containing the computer program 1109 just described, wherein the carrier is one of an electronic signal, optical signal, radio signal, or the computer-readable storage medium 1110, as described above.
The first node 111 may comprise a communication interface configured to facilitate, or an interface unit to facilitate, communications between the first node 111 and other nodes or devices, e.g., any of the second nodes 112, the third node 113, another node 114, and/or another structure in the communications system 10. The interface may, for example, include a transceiver configured to transmit and receive radio signals over an air interface in accordance with a suitable standard.
In other embodiments, the first node 111 may comprise the following arrangement depicted in Figure 11b. The first node 111 may comprise a processing circuitry 1105, e.g., one or more processors such as the processor 1105, in the first node 111 and the memory 1106. The first node 111 may also comprise a radio circuitry 1111 , which may comprise e.g., the receiving port 1107 and the sending port 1108. The processing circuitry 1105 may be configured to, or operable to, perform the method actions according to Figure 3, Figure 6, Figure 7, Figure 9 and/or Figure 10, in a similar manner as that described in relation to Figure 11 a. The radio circuitry 1111 may be configured to set up and maintain at least a wireless connection with the any of the second nodes 112, the third node 113, another node 114, and/or another structure in the communications system 10. Circuitry may be understood herein as a hardware component.
Hence, embodiments herein also relate to the first node 111 operative to operate in the communications system 10. The first node 111 may comprise the processing circuitry 1105 and the memory 1106, said memory 1106 containing instructions executable by said processing circuitry 1105, whereby the first node 111 is further operative to perform the actions described herein in relation to the first node 111 , e.g., in Figure 3, Figure 6, Figure 7, Figure 9 and/or Figure 10.
Figure 12 depicts two different examples in panels a) and b), respectively, of the arrangement that the second node 112, may comprise to perform the method described in Figure 4. In some embodiments, the second node 112 may comprise the following arrangement depicted in Figure 12a. The second node 112 is configured to operate in the communications system 10. The second node 112 is further configured to facilitate prediction of the event in the communications system 10.
Several embodiments are comprised herein. It should be noted that the examples herein are not mutually exclusive. One or more embodiments may be combined, where applicable. All possible combinations are not described to simplify the description. Components from one embodiment may be tacitly assumed to be present in another embodiment and it will be obvious to a person skilled in the art how those components may be used in the other exemplary embodiments. In Figure 12, optional units are indicated with dashed boxes.
The detailed description of some of the following corresponds to the same references provided above, in relation to the actions described for the second node 112 and will thus not be repeated here. For example, in some examples, each of the second nodes 112 may be configured to be a compute server.
The second node 112 is configured to, e.g. by means of an obtaining unit 1201 within the second node 112, configured to, obtain the global context information configured to indicate which one or more virtual network functions the second node 112 is to collect data from to train the local machine-learning model to predict the event.
The second node 112 is configured to, e.g., by means of a sending unit 1202 within the second node 112, configured to, send the local update of a second local machine-learning model to predict the event in the communications system 10, to the first node 111 operating in the communications system 10. The local update is configured to be based on the global context information configured to be obtained.
In some embodiments, to obtain the global context information may be configured to comprise one of: a) receiving the global context information from the third node 113 configured to operate in the communications system 10, and b) determining the third local machine-learning model to predict the global context information by training the third local machine-learning model with observed global context information configured to be received from the third node 113.
The global context information may be configured to comprise one of: i) the configuration state information of the one or more virtual network functions, and ii) the operational state information of the one or more virtual network functions.
In some embodiments, the second node 112 may be further configured to, e.g., by means of a collecting unit 1203 within the second node 112, configured to, collect the data of the one or more virtual network functions, based on the configuration state information configured to be obtained.
In some of such embodiments, the second node 112 may be further configured to, e.g., by means of a normalizing unit 1204 configured to, normalize the data configured to be collected by the second node 112 with respect to the operational state information, configured to be obtained, of the one or more virtual network functions.
In some embodiments, the second node 112 may be further configured to, e.g., by means of an updating unit 1205 within the second node 112, configured to, update the second local machine-learning model with the data configured to be collected and normalized to obtain the local update.
In some embodiments, the second node 112 may be configured to be a compute server. In some of such embodiments, the global context information may be configured to comprise the configuration state information of the one or more virtual network functions enabling mapping information of virtual port names to types of virtual network functions.
In some embodiments, the second node 112 may be configured to be a compute server. In some of such embodiments, the global context information may be configured to comprise the operational state information of the one or more virtual network functions enabling normalization of the data based on the number of the one or more virtual network functions configured to be instantiated at the time of the updating.
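For illustration only, the global context information described in the two preceding embodiments may be pictured as a structure carrying both the configuration state (mapping virtual port names to VNF types) and the operational state (instance counts used for normalization); all field names, port names, and VNF types below are assumptions for the example, not a normative message format:

```python
# Illustrative global context information as the second node might hold it.
global_context = {
    "configuration_state": {   # virtual port name -> VNF type
        "tap-fw-01": "firewall",
        "tap-fw-02": "firewall",
        "tap-dns-01": "dns",
    },
    "operational_state": {     # VNF type -> running instance count
        "firewall": 2,
        "dns": 1,
    },
}

def vnf_type_for_port(context, port):
    """Map a virtual port name to its VNF type via the configuration state."""
    return context["configuration_state"].get(port)

def instance_count(context, vnf_type):
    """Instance count used to normalize collected data; defaults to one."""
    return context["operational_state"].get(vnf_type, 1)
```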
The embodiments herein in the second node 112 may be implemented through one or more processors, such as a processor 1206 in the second node 112 depicted in Figure 12a, together with computer program code for performing the functions and actions of the embodiments herein. A processor, as used herein, may be understood to be a hardware component. The program code mentioned above may also be provided as a computer program product, for instance in the form of a data carrier carrying computer program code for performing the embodiments herein when being loaded into the second node 112. One such carrier may be in the form of a CD ROM disc. It is however feasible with other data carriers such as a memory stick. The computer program code may furthermore be provided as pure program code on a server and downloaded to the second node 112.
The second node 112 may further comprise a memory 1207 comprising one or more memory units. The memory 1207 is arranged to be used to store obtained information, store data, configurations, schedulings, and applications etc. to perform the methods herein when being executed in the second node 112.
In some embodiments, the second node 112 may receive information from, e.g., the first node 111, the other second nodes 112, the third node 113, and/or the another node 114 through a receiving port 1208. In some embodiments, the receiving port 1208 may be, for example, connected to one or more antennas in the second node 112. In other embodiments, the second node 112 may receive information from another structure in the communications system 10 through the receiving port 1208. Since the receiving port 1208 may be in communication with the processor 1206, the receiving port 1208 may then send the received information to the processor 1206. The receiving port 1208 may also be configured to receive other information.

The processor 1206 in the second node 112 may be further configured to transmit or send information to e.g., the first node 111, the other second nodes 112, the third node 113, the another node 114 and/or another structure in the communications system 10, through a sending port 1209, which may be in communication with the processor 1206, and the memory 1207.
Those skilled in the art will also appreciate that the units 1201-1205 described above may refer to a combination of analog and digital circuits, and/or one or more processors configured with software and/or firmware, e.g., stored in memory, that, when executed by the one or more processors such as the processor 1206, perform as described above. One or more of these processors, as well as the other digital hardware, may be included in a single Application-Specific Integrated Circuit (ASIC), or several processors and various digital hardware may be distributed among several separate components, whether individually packaged or assembled into a System-on-a-Chip (SoC).
Also, in some embodiments, the different units 1201 -1205 described above may be implemented as one or more applications running on one or more processors such as the processor 1206.
Thus, the methods according to the embodiments described herein for the second node 112 may be respectively implemented by means of a computer program 1210 product, comprising instructions, i.e., software code portions, which, when executed on at least one processor 1206, cause the at least one processor 1206 to carry out the actions described herein, as performed by the second node 112. The computer program 1210 product may be stored on a computer-readable storage medium 1211 . The computer-readable storage medium 1211 , having stored thereon the computer program 1210, may comprise instructions which, when executed on at least one processor 1206, cause the at least one processor 1206 to carry out the actions described herein, as performed by the second node 112. In some embodiments, the computer-readable storage medium 1211 may be a non-transitory computer-readable storage medium, such as a CD ROM disc, or a memory stick. In other embodiments, the computer program 1210 product may be stored on a carrier containing the computer program 1210 just described, wherein the carrier is one of an electronic signal, optical signal, radio signal, or the computer-readable storage medium 1211 , as described above.
The second node 112 may comprise a communication interface configured to facilitate, or an interface unit to facilitate, communications between the second node 112 and other nodes or devices, e.g., the first node 111, the other second nodes 112, the third node 113, the another node 114 and/or another structure in the communications system 10. The interface may, for example, include a transceiver configured to transmit and receive radio signals over an air interface in accordance with a suitable standard.
In other embodiments, the second node 112 may comprise the following arrangement depicted in Figure 12b. The second node 112 may comprise a processing circuitry 1206, e.g., one or more processors such as the processor 1206, in the second node 112 and the memory 1207. The second node 112 may also comprise a radio circuitry 1212, which may comprise e.g., the receiving port 1208 and the sending port 1209. The processing circuitry 1206 may be configured to, or operable to, perform the method actions according to Figure 4, Figure 6, Figure 7, Figure 9 and/or Figure 10, in a similar manner as that described in relation to Figure 12a. The radio circuitry 1212 may be configured to set up and maintain at least a wireless connection with the first node 111, the other second nodes 112, the third node 113, the another node 114 and/or another structure in the communications system 10. Circuitry may be understood herein as a hardware component.
Hence, embodiments herein also relate to the second node 112 operative to operate in the communications system 10. The second node 112 may comprise the processing circuitry 1206 and the memory 1207, said memory 1207 containing instructions executable by said processing circuitry 1206, whereby the second node 112 is further operative to perform the actions described herein in relation to the second node 112, e.g., in Figure 4, Figure 6, Figure 7, Figure 9 and/or Figure 10.
Figure 13 depicts two different examples in panels a) and b), respectively, of the arrangement that the third node 113, may comprise to perform the method described in Figure 5. In some embodiments, the third node 113 may comprise the following arrangement depicted in Figure 13a. The third node 113 is configured to operate in the communications system 10. The third node 113 is further configured to facilitate prediction of the event in the communications system 10.
Several embodiments are comprised herein. It should be noted that the examples herein are not mutually exclusive. One or more embodiments may be combined, where applicable. All possible combinations are not described to simplify the description. Components from one embodiment may be tacitly assumed to be present in another embodiment and it will be obvious to a person skilled in the art how those components may be used in the other exemplary embodiments. In Figure 13, optional units are indicated with dashed boxes.
The detailed description of some of the following corresponds to the same references provided above, in relation to the actions described for the third node 113 and will thus not be repeated here. For example, in some examples, each of the second nodes 112 may be configured to be a compute server.
The third node 113 is configured to, e.g., by means of a sending unit 1301 within the third node 113, configured to, send, to the second node 112 configured to operate in the communications system 10, the global context information. The global context information is configured to indicate which one or more virtual network functions the second node 112 is to collect data from to train the local machine-learning model to predict the event.
In some embodiments, the global context information may be configured to comprise one of: i) the configuration state information of the one or more virtual network functions, and ii) the operational state information of the one or more virtual network functions.
The embodiments herein in the third node 113 may be implemented through one or more processors, such as a processor 1302 in the third node 113 depicted in Figure 13a, together with computer program code for performing the functions and actions of the embodiments herein. A processor, as used herein, may be understood to be a hardware component. The program code mentioned above may also be provided as a computer program product, for instance in the form of a data carrier carrying computer program code for performing the embodiments herein when being loaded into the third node 113. One such carrier may be in the form of a CD ROM disc. It is however feasible with other data carriers such as a memory stick. The computer program code may furthermore be provided as pure program code on a server and downloaded to the third node 113.
The third node 113 may further comprise a memory 1303 comprising one or more memory units. The memory 1303 is arranged to be used to store obtained information, store data, configurations, schedulings, and applications etc. to perform the methods herein when being executed in the third node 113.
In some embodiments, the third node 113 may receive information from, e.g., the first node 111, the second nodes 112, and/or the another node 114 through a receiving port 1304. In some embodiments, the receiving port 1304 may be, for example, connected to one or more antennas in the third node 113. In other embodiments, the third node 113 may receive information from another structure in the communications system 10 through the receiving port 1304. Since the receiving port 1304 may be in communication with the processor 1302, the receiving port 1304 may then send the received information to the processor 1302. The receiving port 1304 may also be configured to receive other information.
The processor 1302 in the third node 113 may be further configured to transmit or send information to e.g., the first node 111, the second nodes 112, the another node 114 and/or another structure in the communications system 10, through a sending port 1305, which may be in communication with the processor 1302, and the memory 1303.

Those skilled in the art will also appreciate that the unit 1301 described above may refer to a combination of analog and digital circuits, and/or one or more processors configured with software and/or firmware, e.g., stored in memory, that, when executed by the one or more processors such as the processor 1302, perform as described above. One or more of these processors, as well as the other digital hardware, may be included in a single Application-Specific Integrated Circuit (ASIC), or several processors and various digital hardware may be distributed among several separate components, whether individually packaged or assembled into a System-on-a-Chip (SoC).
Also, in some embodiments, the different unit 1301 described above may be implemented as one or more applications running on one or more processors such as the processor 1302.
Thus, the methods according to the embodiments described herein for the third node 113 may be respectively implemented by means of a computer program 1306 product, comprising instructions, i.e., software code portions, which, when executed on at least one processor 1302, cause the at least one processor 1302 to carry out the actions described herein, as performed by the third node 113. The computer program 1306 product may be stored on a computer-readable storage medium 1307. The computer-readable storage medium 1307, having stored thereon the computer program 1306, may comprise instructions which, when executed on at least one processor 1302, cause the at least one processor 1302 to carry out the actions described herein, as performed by the third node 113. In some embodiments, the computer-readable storage medium 1307 may be a non-transitory computer-readable storage medium, such as a CD ROM disc, or a memory stick. In other embodiments, the computer program 1306 product may be stored on a carrier containing the computer program 1306 just described, wherein the carrier is one of an electronic signal, optical signal, radio signal, or the computer-readable storage medium 1307, as described above.
The third node 113 may comprise a communication interface configured to facilitate, or an interface unit to facilitate, communications between the third node 113 and other nodes or devices, e.g., the first node 111, the second nodes 112, the another node 114 and/or another structure in the communications system 10. The interface may, for example, include a transceiver configured to transmit and receive radio signals over an air interface in accordance with a suitable standard.
In other embodiments, the third node 113 may comprise the following arrangement depicted in Figure 13b. The third node 113 may comprise a processing circuitry 1302, e.g., one or more processors such as the processor 1302, in the third node 113 and the memory 1303. The third node 113 may also comprise a radio circuitry 1308, which may comprise e.g., the receiving port 1304 and the sending port 1305. The processing circuitry 1302 may be configured to, or operable to, perform the method actions according to Figure 5, Figure 6, Figure 7, Figure 9 and/or Figure 10, in a similar manner as that described in relation to Figure 13a. The radio circuitry 1308 may be configured to set up and maintain at least a wireless connection with the first node 111, the second nodes 112, the another node 114 and/or another structure in the communications system 10. Circuitry may be understood herein as a hardware component.
Hence, embodiments herein also relate to the third node 113 operative to operate in the communications system 10. The third node 113 may comprise the processing circuitry 1302 and the memory 1303, said memory 1303 containing instructions executable by said processing circuitry 1302, whereby the third node 113 is further operative to perform the actions described herein in relation to the third node 113, e.g., in Figure 5, Figure 6, Figure 7, Figure 9 and/or Figure 10.
Figure 14 depicts an example of the arrangement that the communications system 10 may comprise to perform the method described in Figure 6. In embodiments herein the communications system 10 is configured to facilitate prediction of the event in the communications system 10.
The communications system 10 is configured to, e.g., by means of a sending unit 1301 within the third node 113, configured to, send by the third node 113 configured to operate in the communications system 10, the global context information to the second node 112 configured to operate in the communications system 10. The global context information is configured to indicate which one or more virtual network functions the second node 112 is to collect data from to train the local machine-learning model to predict the event.
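The global context information described above may be illustrated with a short sketch. This is not the claimed message format: the dictionary layout, field names and the `build_global_context` helper are illustrative assumptions, showing only how configuration state (e.g., virtual port to VNF-type mapping) and operational state (e.g., instantiated VNF counts) could be gathered per compute server.

```python
# Illustrative sketch only: field names and structure are assumptions,
# not the claimed format of the global context information.

def build_global_context(vnf_inventory):
    """Build, per compute server, which virtual network functions (VNFs)
    to collect data from, with configuration and operational state."""
    context = {}
    for vnf in vnf_inventory:
        server = context.setdefault(vnf["host"], {
            "configuration_state": {},  # virtual port name -> VNF type
            "operational_state": {},    # VNF type -> instantiated count
        })
        server["configuration_state"][vnf["port"]] = vnf["type"]
        ops = server["operational_state"]
        ops[vnf["type"]] = ops.get(vnf["type"], 0) + 1
    return context

# Hypothetical inventory of instantiated VNFs on one compute server.
inventory = [
    {"host": "compute-1", "port": "tap0", "type": "firewall"},
    {"host": "compute-1", "port": "tap1", "type": "firewall"},
    {"host": "compute-1", "port": "tap2", "type": "router"},
]
ctx = build_global_context(inventory)
```

A second node receiving such context would know both which ports to monitor and how many instances of each VNF type were running when the data was collected.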
The communications system 10 is configured to, e.g. by means of an obtaining unit 1201 within the second node 112, configured to, obtain, by the second node 112, the global context information.
The communications system 10 is configured to, e.g., by means of a sending unit 1202 within the second node 112, configured to, send, by the second node 112, the local update of the second local machine-learning model to predict the event in the communications system 10, to the first node 111 operating in the communications system 10. The local update is configured to be based on the global context information configured to be obtained.
The communications system 10 is configured to, e.g., by means of a determining unit 1101 within the first node 111, configured to determine, by the first node 111, whether or not the accuracy of the local update exceeds the threshold. The communications system 10 is configured to, e.g., by means of a sending unit 1102 within the first node 111, configured to send, by the first node 111, the local update to the another node 114 configured to operate in the communications system 10 based on the result of the determination. One of the following options may apply: I) with the proviso that the accuracy exceeds the threshold, the first node 111 is further configured to proceed with the sending of the local update, and II) with the proviso that the accuracy does not exceed the threshold, the first node 111 is further configured to refrain from sending the local update.
In some embodiments, the local update may be configured to be one of the plurality of local updates configured to be used by the another node 114 to determine the global machine-learning model. The plurality of local updates may be configured to have the respective accuracy exceeding the threshold.
The global machine-learning model may be configured to have the first accuracy. To determine may be further configured to comprise a) executing the global machine-learning model with the local update, b) calculating the second accuracy of the global machine-learning model configured to be updated, c) comparing the change in accuracy from the first accuracy to the second accuracy with respect to the threshold, with the corresponding change configured to be observed with respective local updates configured to be obtained from the plurality of other local nodes, and d) labelling the local update as anomalous with the proviso that the second accuracy does not exceed the threshold.
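Steps a) to d) above can be sketched as follows. This is a hedged illustration, not the claimed implementation: the model is abstracted to a callable `apply_and_score` that returns the validation accuracy of the global model after applying a candidate update, and all names are assumptions.

```python
# Sketch of the accuracy-based screening of local updates (steps a-d).
# Names and the apply_and_score abstraction are illustrative assumptions.

def screen_local_update(apply_and_score, first_accuracy, local_update,
                        peer_updates, threshold):
    """Decide whether to forward a local update and whether it is anomalous."""
    # a) execute the global model with the candidate local update
    second_accuracy = apply_and_score(local_update)
    # b) / c) change in accuracy for the candidate, compared with the
    # corresponding change observed for peer local updates
    change = second_accuracy - first_accuracy
    peer_changes = [apply_and_score(u) - first_accuracy for u in peer_updates]
    # d) label as anomalous if the updated accuracy does not exceed the threshold
    anomalous = second_accuracy <= threshold
    return {"send": not anomalous, "anomalous": anomalous,
            "change": change, "peer_changes": peer_changes}

# Toy usage: the "update" is taken to be the resulting accuracy itself.
result = screen_local_update(lambda u: u, 0.80, 0.85, [0.82, 0.84], 0.80)
```

In this toy run the updated accuracy (0.85) exceeds the threshold (0.80), so the update would be sent onward rather than labelled anomalous.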
In some embodiments, the communications system 10 may be further configured to, e.g., by means of a generating unit 1103 within the first node 111, configured to generate, by the first node 111, the first local machine-learning model to predict the local updates, of the global machine-learning model to predict the event in the communications system 10, having the respective accuracy not exceeding the threshold.
In some embodiments, the communications system 10 may be further configured to, e.g., by means of an obtaining unit 1104 within the first node 111, configured to obtain, by the first node 111, the local update from the second node 112 configured to operate in the communications system 10. The local update may be configured to be based on global context information.
In some embodiments, the global context information may be configured to comprise at least one of: i) the configuration state information of the one or more virtual network functions, and ii) the operational state information of the virtual network functions.
That the local update is configured to be based on the global context information may be configured to comprise normalizing the data configured to be collected by the second node 112 with respect to the global context information. In some embodiments, to obtain the global context information may be configured to comprise one of: a) receiving the global context information from the third node 113 configured to operate in the communications system 10, and b) determining the third local machine-learning model to predict the global context information by training the third machine-learning model with observed global context information configured to be received from the third node 113.
In some embodiments, the communications system 10 may be further configured to, e.g., by means of a collecting unit 1203 within the second node 112, configured to, collect, by the second node 112, the data of the one or more virtual network functions, based on the configuration state information configured to be obtained.
In some of such embodiments, the communications system 10 may be further configured to, e.g., by means of a normalizing unit 1204 configured to, normalize, by the second node 112, the data configured to be collected by the second node 112 with respect to the operational state information, configured to be obtained, of the one or more virtual network functions.
In some embodiments, the communications system 10 may be further configured to, e.g., by means of an updating unit 1205 within the second node 112, configured to, update, by the second node 112, the second local machine-learning model with the data configured to be collected and normalized to obtain the local update.
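The collect, normalize and update actions of the second node can be sketched as below. This is an illustrative reduction, not the claimed implementation: per-VNF-type measurements are divided by the number of instantiated VNFs of that type (from the operational state information), so that compute servers hosting different numbers of instances yield comparable training data, and the "local model update" is reduced to a running mean for illustration.

```python
# Illustrative sketch: normalization of collected data by instantiated
# VNF count, followed by a toy local model update (a running mean).

def normalize(samples, operational_state):
    """samples: list of (vnf_type, metric_value) pairs; operational_state
    maps vnf_type -> number of instantiated VNFs of that type."""
    return [(t, v / operational_state[t]) for t, v in samples]

def local_update(model_mean, n_seen, normalized):
    """Fold normalized samples into a toy 'model': an incremental mean."""
    for _, v in normalized:
        n_seen += 1
        model_mean += (v - model_mean) / n_seen
    return model_mean, n_seen

# Hypothetical collected data: two firewall instances, one router instance.
samples = [("firewall", 4.0), ("firewall", 6.0), ("router", 3.0)]
ops = {"firewall": 2, "router": 1}
norm = normalize(samples, ops)  # firewall metrics halved, router unchanged
```

Dividing by the instance count is the key step: without it, a server running two firewalls would report roughly double the per-type load of a server running one, skewing the federated training data.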
In some embodiments, the second node 112 may be configured to be a compute server. In some of such embodiments, the global context information may be configured to comprise the configuration state information of the one or more virtual network functions enabling mapping information of virtual port names to types of virtual network functions.
In some embodiments, the second node 112 may be configured to be a compute server. In some of such embodiments, the global context information may be configured to comprise the operational state information of the one or more virtual network functions enabling normalization of the data based on the number of the one or more virtual network functions configured to be instantiated at the time of the updating.
The remaining configurations described for the first node 111, the second node 112 and the third node 113 in relation to Figure 6, may be understood to correspond to those described in Figure 11, Figure 12, and Figure 13, respectively, and to be performed, e.g., by means of the corresponding units and arrangements described in Figure 11, Figure 12, and Figure 13, which will not be repeated here.
When using the word "comprise" or "comprising", it shall be interpreted as non-limiting, i.e., meaning "consist at least of". The embodiments herein are not limited to the above-described preferred embodiments. Various alternatives, modifications and equivalents may be used. Therefore, the above embodiments should not be taken as limiting the scope of the invention.
Generally, all terms used herein are to be interpreted according to their ordinary meaning in the relevant technical field, unless a different meaning is clearly given and/or is implied from the context in which it is used. All references to a/an/the element, apparatus, component, means, step, etc. are to be interpreted openly as referring to at least one instance of the element, apparatus, component, means, step, etc., unless explicitly stated otherwise. The steps of any methods disclosed herein do not have to be performed in the exact order disclosed, unless a step is explicitly described as following or preceding another step and/or where it is implicit that a step must follow or precede another step. Any feature of any of the embodiments disclosed herein may be applied to any other embodiment, wherever appropriate. Likewise, any advantage of any of the embodiments may apply to any other embodiments, and vice versa. Other objectives, features and advantages of the enclosed embodiments will be apparent from the following description.
As used herein, the expression “at least one of:” followed by a list of alternatives separated by commas, and wherein the last alternative is preceded by the “and” term, may be understood to mean that only one of the list of alternatives may apply, more than one of the list of alternatives may apply or all of the list of alternatives may apply. This expression may be understood to be equivalent to the expression “at least one of:” followed by a list of alternatives separated by commas, and wherein the last alternative is preceded by the “or” term.
Any of the terms processor and circuitry may be understood herein as a hardware component.
As used herein, the expression “in some embodiments” has been used to indicate that the features of the embodiment described may be combined with any other embodiment or example disclosed herein.
As used herein, the expression “in some examples” has been used to indicate that the features of the example described may be combined with any other embodiment or example disclosed herein.
REFERENCES
1. Predicting network communication performance using federated learning, WO2020115273A1, https://patents.google.com/patent/WO2020115273A1/en

CLAIMS:
1. A computer-implemented method, performed by a first node (111) operating in a communications system (10), the method being to facilitate prediction of an event in the communications system (10), the method comprising:
- determining (302) whether or not an accuracy of a local update to a global machine-learning model to predict an event in the communications system (10) exceeds a threshold, and
- sending (304) the local update to another node (114) operating in the communications system (10) based on a result of the determination, wherein one of: a) with the proviso that the accuracy exceeds the threshold, proceeding with the sending of the local update, and b) with the proviso that the accuracy does not exceed the threshold, refraining from sending the local update.
2. The method according to claim 1, wherein the local update is one of a plurality of local updates to be used to determine the global machine-learning model, the plurality of local updates having a respective accuracy exceeding the threshold.
3. The method according to any of claims 1-2, wherein the global machine-learning model has a first accuracy, and wherein the determining (302) further comprises a) executing the global machine-learning model with the local update, b) calculating a second accuracy of the updated global machine-learning model, c) comparing a change in accuracy from the first accuracy to the second accuracy with respect to the threshold, with a corresponding change observed with respective local updates obtained from a plurality of other local nodes, and d) labelling the local update as anomalous with the proviso that the second accuracy does not exceed the threshold.
4. The method according to any of claims 1-3, wherein the method further comprises:
- generating (303) a first local machine-learning model to predict local updates of the global machine-learning model to predict the event in the communications system (10) having a respective accuracy not exceeding the threshold.
5. The method according to any of claims 1-4, further comprising:
- obtaining (301) the local update from a second node (112) operating in the communications system (10), wherein the local update is based on global context information.
6. The method according to claim 5, wherein the global context information comprises at least one of: i. configuration state information of the one or more virtual network functions, and ii. operational state information of the virtual network functions.
7. The method according to any of claims 5-6, wherein that the local update is based on the global context information comprises that the data collected by the second node (112) are normalized with respect to the global context information.
8. A computer-implemented method, performed by a second node (112) operating in a communications system (10), the method being to facilitate prediction of an event in the communications system (10), the method comprising:
- obtaining (401) global context information indicating which one or more virtual network functions the second node (112) is to collect data from to train a local machine-learning model to predict the event, and
- sending (404) a local update of a second local machine-learning model to predict the event in the communications system (10), to a first node (111) operating in the communications system (10), wherein the local update is based on the obtained global context information.
9. The method according to claim 8, wherein the obtaining (401) of the global context information comprises one of: a. receiving the global context information from a third node (113) operating in the communications system (10), and b. determining a third local machine-learning model to predict the global context information by training the third machine-learning model with observed global context information received from the third node (113).
10. The method according to any of claims 8-9, wherein the global context information comprises one of: i. configuration state information of the one or more virtual network functions, and ii. operational state information of the one or more virtual network functions.
11. The method according to claim 10, further comprising:
- collecting (402) the data of the one or more virtual network functions, based on the obtained configuration state information, and
- normalizing (403) the data collected by the second node (112) with respect to the obtained operational state information of the one or more virtual network functions, and
- updating (404) the second local machine-learning model with the normalized collected data to obtain the local update.
12. The method according to any of claims 8-11, wherein the second node (112) is a compute server, and the global context information comprises configuration state information of the one or more virtual network functions enabling mapping information of virtual port names to types of virtual network functions.
13. The method according to any of claims 8-12, wherein the second node (112) is a compute server, and the global context information comprises operational state information of the one or more virtual network functions enabling normalization of the data based on a number of the one or more virtual network functions instantiated at the time of the updating (503).
14. A computer-implemented method, performed by a third node (113) operating in a communications system (10), the method being to facilitate prediction of an event in the communications system (10), the method comprising:
- sending (501), to a second node (112) operating in the communications system (10), global context information, the global context information indicating which one or more virtual network functions the second node (112) is to collect data from to train a local machine-learning model to predict the event.
15. The method according to claim 14, the global context information comprising one of: i. configuration state information of the one or more virtual network functions, and ii. operational state information of the one or more virtual network functions.
16. A computer-implemented method to facilitate prediction of an event in the communications system (10), the method comprising:
- sending (601, 501), by a third node (113) operating in the communications system (10), global context information to a second node (112) operating in the communications system (10), the global context information indicating which one or more virtual network functions the second node (112) is to collect data from to train a local machine-learning model to predict the event,
- obtaining (602, 401), by the second node (112), the global context information,
- sending (606, 404), by the second node (112), a local update of a second local machine-learning model to predict the event in the communications system (10), to a first node (111) operating in the communications system (10), wherein the local update is based on the obtained global context information,
- determining (608, 302), by the first node (111), whether or not an accuracy of the local update exceeds a threshold, and
- sending (610, 304), by the first node (111), the local update to another node (114) operating in the communications system (10) based on a result of the determination, wherein one of: a) with the proviso that the accuracy exceeds the threshold, proceeding with the sending of the local update, and b) with the proviso that the accuracy does not exceed the threshold, refraining from sending the local update.
17. The method according to claim 16, wherein the local update is one of a plurality of local updates used by the another node (114) to determine the global machine-learning model, the plurality of local updates having a respective accuracy exceeding the threshold.
18. The method according to any of claims 16-17, wherein the global machine-learning model has a first accuracy, and wherein the determining (608, 302) further comprises a) executing the global machine-learning model with the local update, b) calculating a second accuracy of the updated global machine-learning model, c) comparing a change in accuracy from the first accuracy to the second accuracy with respect to the threshold, with a corresponding change observed with respective local updates obtained from a plurality of other local nodes, and d) labelling the local update as anomalous with the proviso that the second accuracy does not exceed the threshold.
19. The method according to any of claims 16-18, wherein the method further comprises:
- generating (609, 303), by the first node (111), a first local machine-learning model to predict local updates, of the global machine-learning model to predict the event in the communications system (10), having a respective accuracy not exceeding the threshold.
20. The method according to any of claims 16-19, further comprising:
- obtaining (607, 301), by the first node (111), the local update from a second node (112) operating in the communications system (10), wherein the local update is based on the global context information.
21. The method according to any of claims 16-20, wherein the global context information comprises at least one of: i. configuration state information of the one or more virtual network functions, and ii. operational state information of the virtual network functions.
22. The method according to any of claims 16-21, wherein that the local update is based on the global context information comprises normalizing data collected by the second node (112) with respect to the global context information.
23. The method according to any of claims 16-22, wherein the obtaining (401) of the global context information comprises one of: a. receiving the global context information from a third node (113) operating in the communications system (10), and b. determining a third local machine-learning model to predict the global context information by training the third machine-learning model with the received global context information.
24. The method according to any of claims 16-23, further comprising the second node (112):
- collecting (603, 402) the data based on the obtained configuration state information of the one or more virtual network functions, and
- normalizing (604, 403) the data collected by the second node (112) with respect to the obtained operational state information of the one or more virtual network functions, and
- updating (605, 404) the second local machine-learning model with the normalized collected data to obtain the local update.
25. The method according to any of claims 16-24, wherein the second node (112) is a compute server, and the global context information comprises configuration state information of the one or more virtual network functions enabling mapping information of virtual port names to types of virtual network functions.
26. The method according to any of claims 16-25, wherein the second node (112) is a compute server, and the global context information comprises operational state information of the one or more virtual network functions enabling normalization of the data based on a number of the one or more virtual network functions instantiated at the time of the updating (605, 404).
27. A first node (111), configured to operate in a communications system (10), the first node (111) being further configured to facilitate prediction of an event in the communications system (10), the first node (111) being further configured to:
- determine whether or not an accuracy of a local update to a global machinelearning model to predict an event in the communications system (10) exceeds a threshold, and
- send the local update to another node (114) configured to operate in the communications system (10) based on a result of the determination, wherein one of: a) with the proviso that the accuracy exceeds the threshold, the first node (111) is further configured to proceed with the sending of the local update, and b) with the proviso that the accuracy does not exceed the threshold, the first node (111) is further configured to refrain from sending the local update.
28. The first node (111) according to claim 27, wherein the local update is configured to be one of a plurality of local updates to be used to determine the global machine-learning model, the plurality of local updates being configured to have a respective accuracy exceeding the threshold.
29. The first node (111) according to any of claims 27-28, wherein the global machine-learning model is configured to have a first accuracy, and wherein to determine is further configured to comprise a) executing the global machine-learning model with the local update, b) calculating a second accuracy of the global machine-learning model configured to be updated, c) comparing a change in accuracy from the first accuracy to the second accuracy with respect to the threshold, with a corresponding change configured to be observed with respective local updates configured to be obtained from a plurality of other local nodes, and d) labelling the local update as anomalous with the proviso that the second accuracy does not exceed the threshold.
30. The first node (111) according to any of claims 27-29, wherein the first node (111) is further configured to:
- generate a first local machine-learning model to predict local updates, of the global machine-learning model to predict the event in the communications system (10), having a respective accuracy not exceeding the threshold.
31. The first node (111) according to any of claims 27-30, being further configured to:
- obtain the local update from a second node (112) configured to operate in the communications system (10), wherein the local update is configured to be based on global context information.
32. The first node (111) according to claim 31, wherein the global context information is configured to comprise at least one of: i. configuration state information of the one or more virtual network functions, and ii. operational state information of the virtual network functions.
33. The first node (111) according to any of claims 31-32, wherein that the local update is configured to be based on the global context information comprises that the data configured to be collected by the second node (112) are configured to be normalized with respect to the global context information.
34. A second node (112), configured to operate in a communications system (10), the second node (112) being configured to facilitate prediction of an event in the communications system (10), the second node (112) being further configured to:
- obtain global context information configured to indicate which one or more virtual network functions the second node (112) is to collect data from to train a local machine-learning model to predict the event, and
- send a local update of a second local machine-learning model to predict the event in the communications system (10), to a first node (111) operating in the communications system (10), wherein the local update is configured to be based on the global context information configured to be obtained.
35. The second node (112) according to claim 34, wherein to obtain the global context information is configured to comprise one of: a. receiving the global context information from a third node (113) configured to operate in the communications system (10), and b. determining a third local machine-learning model to predict the global context information by training the third machine-learning model with observed global context information configured to be received from the third node (113).
36. The second node (112) according to any of claims 34-35, wherein the global context information is configured to comprise one of: i. configuration state information of the one or more virtual network functions, and ii. operational state information of the one or more virtual network functions.
37. The second node (112) according to claim 36, being further configured to:
- collect the data of the one or more virtual network functions, based on the configuration state information configured to be obtained, and
- normalize the data configured to be collected by the second node (112) with respect to the operational state information, configured to be obtained, of the one or more virtual network functions, and
- update the second local machine-learning model with the data configured to be collected and normalized to obtain the local update.
38. The second node (112) according to any of claims 34-37, wherein the second node (112) is configured to be a compute server, and the global context information is configured to comprise configuration state information of the one or more virtual network functions enabling mapping information of virtual port names to types of virtual network functions.
39. The second node (112) according to any of claims 34-38, wherein the second node (112) is configured to be a compute server, and the global context information is configured to comprise operational state information of the one or more virtual network functions enabling normalization of the data based on a number of the one or more virtual network functions configured to be instantiated at the time of the updating (503).
40. A third node (113), configured to operate in a communications system (10), the third node (113) being further configured to facilitate prediction of an event in the communications system (10), the third node (113) being further configured to:
- send, to a second node (112) configured to operate in the communications system (10), global context information, the global context information being configured to indicate which one or more virtual network functions the second node (112) is to collect data from to train a local machine-learning model to predict the event.
41. The third node (113) according to claim 40, the global context information being configured to comprise one of: i. configuration state information of the one or more virtual network functions, and ii. operational state information of the one or more virtual network functions.
42. A communications system (10) configured to facilitate prediction of an event in the communications system (10), the communications system (10) being configured to:
- send, by a third node (113) configured to operate in the communications system (10), global context information to a second node (112) configured to operate in the communications system (10), the global context information being configured to indicate which one or more virtual network functions the second node (112) is to collect data from to train a local machine-learning model to predict the event,
- obtain, by the second node (112), the global context information,
- send, by the second node (112), a local update of a second local machine-learning model to predict the event in the communications system (10), to a first node (111) configured to operate in the communications system (10), wherein the local update is configured to be based on the global context information configured to be obtained,
- determine, by the first node (111), whether or not an accuracy of the local update exceeds a threshold, and
- send, by the first node (111), the local update to another node (114) configured to operate in the communications system (10) based on a result of the determination, wherein one of: a) with the proviso that the accuracy exceeds the threshold, the first node (111) is further configured to proceed with the sending of the local update, and b) with the proviso that the accuracy does not exceed the threshold, the first node (111) is further configured to refrain from sending the local update.
43. The communications system (10) according to claim 42, wherein the local update is configured to be one of a plurality of local updates configured to be used by the another node (114) to determine the global machine-learning model, the plurality of local updates being configured to have a respective accuracy exceeding the threshold.
44. The communications system (10) according to any of claims 42-43, wherein the global machine-learning model is configured to have a first accuracy, and wherein to determine (608, 302) further comprises a) executing the global machine-learning model with the local update, b) calculating a second accuracy of the updated global machine-learning model, c) comparing a change in accuracy from the first accuracy to the second accuracy with respect to the threshold, with a corresponding change configured to be observed with respective local updates configured to be obtained from a plurality of other local nodes, and d) labelling the local update as anomalous with the proviso that the second accuracy does not exceed the threshold.
45. The communications system (10) according to any of claims 43-44, being further configured to:
- generate, by the first node (111), a first local machine-learning model to predict local updates, of the global machine-learning model to predict the event in the communications system (10), having a respective accuracy not exceeding the threshold.
46. The communications system (10) according to any of claims 42-45, being further configured to:
- obtain, by the first node (111), the local update from a second node (112) configured to operate in the communications system (10), wherein the local update is configured to be based on the global context information.
47. The communications system (10) according to any of claims 42-46, wherein the global context information is configured to comprise at least one of:
i. configuration state information of the one or more virtual network functions, and ii. operational state information of the virtual network functions.
48. The communications system (10) according to any of claims 42-47, wherein that the local update is configured to be based on the global context information is configured to comprise normalizing data configured to be collected by the second node (112) with respect to the global context information.
49. The communications system (10) according to any of claims 42-48, wherein to obtain the global context information is configured to comprise one of: a. receiving the global context information from a third node (113) configured to operate in the communications system (10), and b. determining a third local machine-learning model to predict the global context information by training the third local machine-learning model with the global context information configured to be received from the third node (113).
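Alternative b above — training a local model so the node can predict the global context information instead of always receiving it from the third node — is left unspecified in the application. A deliberately trivial stand-in, using the mean of the last few received values as the "model" (name and window size are invented for illustration):

```python
def predict_global_context(received_history):
    """Predict the next global context value from values previously
    received from the third node. A moving average over the last three
    samples serves as a toy stand-in for the trained third local
    machine-learning model of alternative b."""
    window = received_history[-3:]  # most recent samples (fewer if short)
    return sum(window) / len(window)
```

For instance, after receiving the values 2.0, 4.0, and 6.0 from the third node, `predict_global_context([2.0, 4.0, 6.0])` returns 4.0.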
50. The communications system (10) according to any of claims 42-49, wherein the second node (112) is further configured to:
- collect the data of the one or more virtual network functions, based on the configuration state information configured to be obtained, and
- normalize the data configured to be collected by the second node (112) with respect to the operational state information, configured to be obtained, of the one or more virtual network functions, and
- update the second local machine-learning model with the data configured to be collected and normalized to obtain the local update.
51. The communications system (10) according to any of claims 42-50, wherein the second node (112) is configured to be a compute server, and the global context information is configured to comprise configuration state information of the one or more virtual network functions enabling mapping information of virtual port names to types of virtual network functions.

52. The communications system (10) according to any of claims 42-51, wherein the second node (112) is configured to be a compute server, and the global context information is configured to comprise operational state information of the one or more virtual network functions enabling normalization of the data based on a number of the one or more virtual network functions configured to be instantiated at the time of the updating.
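The normalization that these claims describe — mapping virtual port names to VNF types via the configuration state information, then dividing raw counters by the number of instances instantiated at collection time per the operational state information — can be sketched as below. The function name, dictionary shapes, and sample values are all illustrative assumptions, not taken from the application.

```python
def normalize_collected_data(collected, port_to_vnf_type, instances_per_type):
    """Normalize raw per-port counters before updating the local model.

    collected: virtual port name -> raw counter collected on the
        compute server.
    port_to_vnf_type: virtual port name -> VNF type (the mapping that
        the configuration state information is said to enable).
    instances_per_type: VNF type -> number of instances instantiated at
        collection time (from the operational state information).
    """
    normalized = {}
    for port, value in collected.items():
        vnf_type = port_to_vnf_type[port]
        count = instances_per_type[vnf_type]
        # Divide by the instance count so that compute servers hosting
        # different numbers of VNFs produce comparable local updates.
        normalized[port] = value / count
    return normalized
```

For example, a counter of 120.0 on a port belonging to a VNF type with four instances normalizes to 30.0 per instance.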
PCT/IN2021/050037 2021-01-15 2021-01-15 First node, second node, third node, communications system, and methods performed thereby to facilitate prediction of an event WO2022153324A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/IN2021/050037 WO2022153324A1 (en) 2021-01-15 2021-01-15 First node, second node, third node, communications system, and methods performed thereby to facilitate prediction of an event

Publications (1)

Publication Number Publication Date
WO2022153324A1 true WO2022153324A1 (en) 2022-07-21

Family

ID=82448027

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IN2021/050037 WO2022153324A1 (en) 2021-01-15 2021-01-15 First node, second node, third node, communications system, and methods performed thereby to facilitate prediction of an event

Country Status (1)

Country Link
WO (1) WO2022153324A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10489457B1 (en) * 2018-05-24 2019-11-26 People.ai, Inc. Systems and methods for detecting events based on updates to node profiles from electronic activities
WO2020114592A1 (en) * 2018-12-05 2020-06-11 Telefonaktiebolaget Lm Ericsson (Publ) First node, second node, third node and methods performed thereby for handling roaming information

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117062132A (en) * 2023-10-12 2023-11-14 北京信息科技大学 CF-UAV intelligent transmission signaling interaction method considering time delay and energy consumption
CN117062132B (en) * 2023-10-12 2024-01-09 北京信息科技大学 CF-UAV intelligent transmission signaling interaction method considering time delay and energy consumption

Similar Documents

Publication Publication Date Title
US11963051B2 (en) Context aware handovers
US20220124543A1 (en) Graph neural network and reinforcement learning techniques for connection management
US20220014963A1 (en) Reinforcement learning for multi-access traffic management
NL2033617B1 (en) Resilient radio resource provisioning for network slicing
Brik et al. Deep learning for B5G open radio access network: Evolution, survey, case studies, and challenges
US20230072769A1 (en) Multi-radio access technology traffic management
US11617094B2 (en) Machine learning in radio access networks
Marzouk et al. On energy efficient resource allocation in shared RANs: Survey and qualitative analysis
WO2019125255A1 (en) Method and arrangement for beam assignment support
Thembelihle et al. Softwarization of mobile network functions towards agile and energy efficient 5G architectures: a survey
US20230308199A1 (en) Link performance prediction using spatial link performance mapping
Foukas et al. Iris: Deep reinforcement learning driven shared spectrum access architecture for indoor neutral-host small cells
US20220322226A1 (en) System and method for ran power and environmental orchestration
US11470560B2 (en) Determining power optimization for multiple radios based on historical power usage in advanced networks
US20220326757A1 (en) Multi-timescale power control technologies
Koudouridis et al. An architecture and performance evaluation framework for artificial intelligence solutions in beyond 5G radio access networks
US20240086787A1 (en) Method and system to predict network performance using a hybrid model incorporating multiple sub-models
WO2022153324A1 (en) First node, second node, third node, communications system, and methods performed thereby to facilitate prediction of an event
Tseng et al. Micro operator design pattern in 5G SDN/NFV network
US11051226B1 (en) Facilitating enablement of intelligent service aware access utilizing multiaccess edge computing in advanced networks
Zou et al. Resource multi-objective mapping algorithm based on virtualized network functions: RMMA
US11622322B1 (en) Systems and methods for providing satellite backhaul management over terrestrial fiber
JP6945089B1 (en) Network slice configuration
Ericson et al. Setting 6g architecture in motion–the hexa-x approach
US20230096832A1 (en) First node, second node, third node, fourth node, fifth node and methods performed thereby for handling firmware

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21919236

Country of ref document: EP

Kind code of ref document: A1