WO2020261285A1 - First node, and method performed thereby, for handling a problem in a communications network - Google Patents
- Publication number
- WO2020261285A1 (PCT/IN2019/050474)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- node
- neural network
- artificial neural
- activation function
- linear activation
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0706—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
- G06F11/0709—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a distributed system consisting of a plurality of standalone computer nodes, e.g. clusters, client-server systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0793—Remedial or corrective actions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/14—Network analysis or design
- H04L41/145—Network analysis or design involving simulating, designing, planning or modelling of a network
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/14—Network analysis or design
- H04L41/147—Network analysis or design for predicting network behaviour
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/14—Network analysis or design
- H04L41/149—Network analysis or design for prediction of maintenance
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/16—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks using machine learning or artificial intelligence
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/40—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks using virtualisation of network functions or resources, e.g. SDN or NFV entities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/10—Interfaces, programming languages or software development kits, e.g. for simulating neural networks
- G06N3/105—Shells for specifying net layout
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/01—Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
Definitions
- the present disclosure relates generally to a node and methods performed thereby for handling a problem in a communications network.
- the present disclosure further relates generally to a computer program product, comprising instructions to carry out the actions described herein, as performed by the node.
- the computer program product may be stored on a computer-readable storage medium.
- Computer systems in a communications network may comprise one or more network nodes, which may also be referred to simply as nodes.
- a node may comprise one or more processors which, together with computer program code may perform different functions and actions, a memory, a receiving and a sending port.
- a node may be, for example, a server.
- a mission-critical application may be understood as an application which may require actions to be taken in a very minimal time, e.g., a driverless cars application, and it may require a low latency network.
- fog computing may be understood to reside in between cloud and IoT devices.
- IoT devices may be connected to fog devices.
- a fog device may be understood as a low computation device which may act as an intermediate device between edge and cloud.
- These fog devices may be located in close proximity to users and may be responsible for intermediate computation and storage.
- Fog computing research is still in its infancy, and taxonomy-based investigation into the requirements of fog infrastructure, platform, and applications mapped to current research may be still required. Taxonomy-based investigation may be understood as a study of the dynamic nature of fog devices and a design of a network based on that.
- the fog computing paradigm may be understood as a highly distributed
- the fog may be understood to comprise multiple devices, given that fog devices may only be connected in a decentralized manner, and the management of fog devices may typically not be central
- the devices may fail for many reasons, such as hardware failure, software failure, or because of user activity. Besides these problems, some other reasons may include connectivity, mobility, and power source, which may also play a big role.
- Most of the devices in a fog environment may be connected via wireless connections, and wireless connections may not always be reliable.
- Most of the devices that are connected via wireless are mobile, so these devices may change location to different clusters frequently.
- Fog devices may be understood to not keep a fixed structure. The structure may be understood to keep on changing and be dynamic in nature. Hence, the cluster of fog nodes may constantly change.
- One other characteristic of these devices may be that they may typically be battery powered and may fail anytime. Hence, dealing with the complex nature of failure may be very difficult.
- Fault tolerance may be considered one requirement of fog infrastructure. Fault tolerance may be understood to allow a system to keep performing even when a part of the system has failed. This failure may be a software failure, a hardware failure, or a network failure. The solution for fault tolerance may result in a fully operational system, where the system may continue its operation with a lower capability, instead of shutting down totally.
- fault tolerance has been mostly studied in the cloud computing paradigm [1, 2].
- reactive fault tolerance may be used to reduce the impact of failures on a system when the failures have occurred.
- Techniques based on this policy may be job migration, checkpoint and/or restart, replication, rollback and recovery, task resubmission, user-defined exception handling, and workflow rescue.
- Proactive fault tolerance may be understood to predict the faults pro-actively and replace the suspected components with other working components; thus, avoiding recovery from faults and errors.
- Proactive fault tolerance may use self-adaptation, pre-emptive migration, and software rejuvenation, which may be understood to be a few of the proactive fault tolerance techniques.
- Proactive fault tolerance may predict the faults proactively and may replace the suspected components by other working components, thus avoiding recovery from faults and errors.
- Fault tolerance is mostly investigated in the cloud. However, it may be necessary to investigate fault tolerance in the fog as well. Although many research efforts have addressed the need to explore fault tolerance problems [3], [4], [5] in fog computing, none have investigated the issue based on trying to identify the root sources of the problems.
- the main challenge of a 5G deployment may be considered to be automation on all levels of the 5G eco-system. Without it, the network may simply not work, neither at the anticipated scale nor with the desired functional complexity. In order to introduce automation on all levels in a 5G system, self-adaptation may be considered the most prominent feature that may need to be added. Setting up fault tolerance in all the important fog nodes may be understood to generally increase the overhead and lead to latency problems in 5G communication services. 5G is envisioned to support unprecedentedly diverse applications and services with extremely heterogeneous performance requirements, such as mission-critical IoT communication, massive machine-type communication, and big data management in mobile connectivity.
- the operators may collaborate with application and/or service providers to provide better quality of IoT services by providing self-adaptation for a few latency-sensitive mission-critical applications.
- the importance of initiating self-adaptation in fog nodes to address latency critical 5G applications will be discussed.
- Proactive Fault Tolerance using self-adaptation is a way of controlling the failure of an instance of an application running on multiple virtual machines automatically.
- self-adaptation features may be implemented to mimic the animal self-adaptation process to bring automation to the system.
- Self-adaptation may be introduced for lifelong learning to update the features to perform self-adaptation on a continuous basis.
- Plastic neural networks have been introduced, the aim of which may be understood to be to autonomously design and create learning systems. They may also introduce lifelong learning to existing systems by bootstrapping learning from scratch, recovering performance in unseen conditions, testing the computational advantages of neural components, and deriving hypotheses on the emergence of biological learning.
- Plastic Neural networks may not be directly used for fog networking due to the presence of the Hebbian term in the plastic network.
- the Hebbian term is designed for biological network space and may not be used for engineering applications directly.
- Sustainability in fog computing may be understood to optimize its economic and environmental interest to a great extent.
- the overall sustainable architecture of fog computing is subject to many issues such as assurance of Quality of Service (QoS), service reusability, energy efficient resource management etc.
- reliability in fog computing may be discussed in terms of consistency of fog nodes, availability of high performance services, secured interactions, fault tolerance etc. In the existing literature, a very narrow discussion towards sustainable and reliable fog computing has been provided.
- LTE-A LTE Advanced
- the main idea of utilizing a cellular infrastructure for fog computing is that it explores the use of hierarchical architecture, e.g., where the cloud may be connected to fog devices in a tree like structure, and where fog devices may also be connected to another set of fog devices in a tree structure.
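The hierarchical arrangement described above, with the cloud at the root and fog devices connected to further fog devices, can be pictured as an ordinary tree. The following is only an illustrative sketch: the `Node` class, the `path_to` helper, and the device names are hypothetical and do not come from the disclosure.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A cloud or fog device in a tree-like hierarchy (hypothetical model)."""
    name: str
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        """Attach a child device below this one and return the child."""
        self.children.append(child)
        return child


def path_to(root: Node, name: str) -> Optional[List[str]]:
    """Return the names on the path from root to the named device, or None."""
    if root.name == name:
        return [root.name]
    for child in root.children:
        sub_path = path_to(child, name)
        if sub_path is not None:
            return [root.name] + sub_path
    return None


def build_example() -> Node:
    """Cloud at the root, two fog devices below it, one fog device below fog-1."""
    cloud = Node("cloud")
    fog_1 = cloud.add(Node("fog-1"))
    cloud.add(Node("fog-2"))
    fog_1.add(Node("fog-1a"))
    return cloud
```

In this sketch, `path_to(build_example(), "fog-1a")` walks from the cloud through `fog-1` down to `fog-1a`, mirroring how a request might traverse the hierarchy.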
- the LTE-A network, for the purpose of high-speed communication, may be used for signal processing activities through fog Radio Access Networks (RANs). Peng et al. [6] have proposed a RAN architecture for 5G systems based on fog computing, which is an effective extension of a cloud-based RAN. It may be used to reduce the fronthaul load and delay with the help of virtualized baseband processing units.
- the edge processing and virtualization are the most efficient aspects in the context of 5G networks. Recently, fog based catching at the edge devices in radio access network has been explored and it may be used to identify the optimal catching along with front haul and edge transmission policies. Catching may be understood as a mechanism in which the edge devices may be connected to a fog network.
- 5G systems may be understood to be more latency-sensitive than 4G systems.
- Fog computing is being applied in 5G systems to minimize delay, which includes both communication delay and computing delay.
- Fog computing may be able to provide low-latency interactions in machine-to-machine communications.
- the 5G based cellular system and the fog computing framework are very much related to each other in terms of compatibility compared to cloud computing.
- the object is achieved by a method, performed by a first node.
- the method is for handling a problem in a communications network.
- the first node manages an artificial neural network.
- the first node determines, in a set of data collected from the communications network, a set of one or more features.
- the set of one or more features defines a problem in an operation of the communications network.
- the set of one or more features is previously undetected in the communications network by the first node.
- the set of one or more features is also lacking a corresponding set of one or more solutions.
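- the first node then trains the artificial neural network to find a solution to the problem defined by the set of one or more features. This is performed by training the artificial neural network with a modified pre-existing linear activation function.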
- the modified pre-existing linear activation function adjusts a pre-existing linear activation function by adding a factor.
- the added factor adjusts a respective weight, respectively, of a subset of neurons comprised in the artificial neural network.
- the subset of neurons comprises one or more neurons providing the highest output with the pre-existing linear activation function, according to a first threshold. This is when the pre-existing linear activation function is used to train the artificial neural network with the set of data.
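The selection and adjustment steps summarized above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the text here does not give the exact form of the added factor, so the sketch assumes it is simply added to every weight of each selected neuron. The names `layer_outputs` and `select_top_neurons`, the threshold, and the factor values are all hypothetical.

```python
from typing import Collection, List, Sequence


def layer_outputs(
    layer: Sequence[Sequence[float]],
    inputs: Sequence[float],
    factor: float = 0.0,
    selected: Collection[int] = (),
) -> List[float]:
    """Linear activation per neuron: sum((w + delta) * x).

    With factor == 0.0 this is the pre-existing linear activation.
    ASSUMPTION: for neurons listed in `selected`, the added factor is
    applied to every weight of that neuron; other neurons are untouched.
    """
    outputs = []
    for index, weights in enumerate(layer):
        delta = factor if index in selected else 0.0
        outputs.append(sum((w + delta) * x for w, x in zip(weights, inputs)))
    return outputs


def select_top_neurons(outputs: Sequence[float], threshold: float) -> List[int]:
    """Indices of neurons whose pre-existing linear output reaches the threshold."""
    return [i for i, out in enumerate(outputs) if out >= threshold]


if __name__ == "__main__":
    layer = [[1.0, 0.0], [0.5, 0.5], [0.0, 0.1]]  # one weight row per neuron
    inputs = [2.0, 4.0]
    base = layer_outputs(layer, inputs)               # pre-existing activation
    top = select_top_neurons(base, threshold=2.5)     # highest-output subset
    print(top, layer_outputs(layer, inputs, factor=0.1, selected=top))
```

Only the neurons in the selected subset see their effective weights change; the rest of the network keeps its pre-existing behaviour, which is the property the passage relies on.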
- the object is achieved by the first node.
- the first node may be considered to be for handling the problem in the communications network.
- the first node is configured to manage the artificial neural network.
- the first node is further configured to determine, in the set of data configured to be collected from the communications network, the set of one or more features configured to define the problem in the operation of the communications network.
- the set of one or more features is configured to be: a) previously undetected in the communications network by the first node, and b) lacking a corresponding set of one or more solutions.
- the first node is also configured to train the artificial neural network to find the solution to the problem configured to be defined by the set of one or more features. This is configured to be performed by training the artificial neural network with the modified pre-existing linear activation function.
- the modified pre-existing linear activation function is configured to adjust the pre-existing linear activation function by adding the factor.
- the added factor is configured to adjust the respective weight, respectively, of the subset of neurons configured to be comprised in the artificial neural network.
- the subset of neurons is configured to comprise the one or more neurons providing the highest output with the pre-existing linear activation function, according to the first threshold, when the pre-existing linear activation function is used to train the artificial neural network with the set of data.
- the object is achieved by a computer program, comprising instructions which, when executed on at least one processor, cause the at least one processor to carry out the method performed by the first node.
- the object is achieved by a computer-readable storage medium, having stored thereon the computer program, comprising instructions which, when executed on at least one processor, cause the at least one processor to carry out the method performed by the first node.
- By the first node determining the previously undetected set of one or more features lacking a corresponding set of one or more solutions, and then training the artificial neural network to find the solution to the problem with the modified pre-existing linear activation function, the first node is enabled to automate its own self-adaptation and manage any failures that it may not have previously encountered. Furthermore, the first node may be enabled to address failure management with reduced latency of applications. This may be particularly useful in fog nodes, improving the functioning of the communications network and its cost. In some aspects, embodiments herein may be understood to introduce proactive fault tolerance measures capable of being deployed within fog networking by running a modified plastic neural network which may introduce lifelong learning in performing self-adaptation to enhance and achieve expected reach in mission critical 5G applications.
- Figure 1 is a schematic diagram illustrating two non-limiting examples of a
- Figure 2 is a flowchart depicting a method in a first node, according to embodiments herein.
- Figure 3 is a schematic diagram depicting aspects of the method performed by the first node, according to embodiments herein.
- Figure 4 is a schematic diagram depicting other aspects of the method performed by the first node, according to embodiments herein.
- Figure 5 is a schematic diagram of an example of the first node, according to
- Figure 6 is a schematic diagram depicting aspects of the method performed by the first node, according to embodiments herein.
- Figure 7 is a schematic diagram depicting aspects of the method performed by the first node, according to embodiments herein.
- Figure 8 is a schematic block diagram illustrating embodiments of a first node, according to embodiments herein.
- any fog self-adaptation problem may need to handle two cases: (i) first, it may need to be able to adapt to existing problems, for which the model may already be trained, and (ii) second, it may need to be able to adapt to new problems.
- the first problem is widely discussed in the literature, and numerous methods are available for it. However, the second problem seems much more complex, and there are no existing methods for it.
- the systems may need to behave like human beings, where they may need to try to identify the solutions for the new problems. This may be also known as lifelong learning since fog nodes may need to understand the different sets of problems they are likely to face over the entire life.
- Neural networks may be understood to recognize problems as a set of features.
- a problem in this context may be, for example, a classification problem, where the set of features may need to be categorized based on a pattern of the features.
- a feature may be understood as a variable or parameter, such as, e.g., sensor data.
- any neural network may perform methodically for the data with similar features on which the network may have been trained. However, when new features are encountered, the network may fail and tend to give wrong results. A result may be considered a wrong result when the identified category is different from that of the pattern observed in the features.
- a plastic neural network has been recently proposed [7], which may be considered to have a similar function to the functioning of the plasticity in the human brain.
- the problem with the method proposed in [7] is that it is computationally intractable, as the computational cost may be understood to increase exponentially with the number of features.
- the method proposed in [7] also requires knowledge of the domain in which the self-adaptation is performed, that is, knowledge to understand the Hebbian term.
- the network may classify a new feature as one of the old features it may have encountered in the past, absent a more likely alternative. This will generate many false positives in the classification problem. However, in the case of self-adaptation, the number of false positives is desired and expected to be as low as possible. Otherwise, the system will loop endlessly and may require human intervention to solve the problems it may be facing.
- the system may need to try ten different solutions, that is, ten different times, before getting an actual solution. This is time-consuming in the self-healing scenario, since results are expected to be delivered in the order of milliseconds. Hence, reducing the number of false positives may be understood to play a relevant role in designing any self-healing system.
- embodiments herein may be understood to be drawn to introducing a new method to decrease false positives in a neural network.
- embodiments herein may be understood to be drawn to a modified method for self-adaptation.
- embodiments herein may be understood to be drawn to a new method in a computer system managing an artificial neural network to understand the features of learned problems, which may be considered to belong to a first category out of the categories in which the set of patterns may be divided, where the first category here is the true category of the problem, and to then use them to find similarity with a new problem.
- a new solution may be proposed by taking all the solutions of identified similar problems.
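One way to picture the idea of finding learned problems similar to a new problem, and then gathering their solutions, is sketched below. The similarity measure is an assumption: the text does not specify one, so cosine similarity over feature vectors is swapped in purely for illustration. The function names, the `min_similarity` cut-off, and the example solution labels are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def propose_solutions(
    new_features: Sequence[float],
    known_problems: Sequence[Tuple[Sequence[float], str]],
    min_similarity: float = 0.9,
) -> List[str]:
    """Collect the solutions of all sufficiently similar learned problems.

    `known_problems` holds (features, solution) pairs; solutions are returned
    most-similar first, without duplicates.
    """
    ranked = sorted(
        ((cosine_similarity(new_features, features), solution)
         for features, solution in known_problems),
        reverse=True,
    )
    proposals: List[str] = []
    for similarity, solution in ranked:
        if similarity >= min_similarity and solution not in proposals:
            proposals.append(solution)
    return proposals
```

A caller would pass the feature vector of the new problem together with the learned (features, solution) pairs; how the collected solutions are then merged into a single new solution is left open here, as it is in the passage above.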
- embodiments herein may be understood to be drawn to a new loop inside a fog node to monitor the environment in 5G applications.
- Embodiments herein may be understood to be aimed to design a proactive fault tolerance methodology, such as a self-adaptive methodology, within fog nodes, for mission critical 5G applications.
- the failure handling method may be proactive, which may be understood to mean that it may always monitor the host and may continuously try to predict the chances of failure. If the prediction becomes true, the system may look up other available resources and then migration may be performed to avoid the problem from having more damaging consequences for the functioning of the network.
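The monitor, predict, and migrate cycle described above might look roughly like the following. This is a sketch under stated assumptions: the `Host` type, the telemetry keys `battery` and `packet_loss`, the toy risk formula, and the `risk_threshold` value are all hypothetical stand-ins, and a trained predictive model would take the place of `predict_failure_risk` in practice.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass
class Host:
    """A fog host with hypothetical telemetry values in the range 0.0 to 1.0."""
    name: str
    metrics: Dict[str, float]


def predict_failure_risk(host: Host) -> float:
    """Toy placeholder for a failure predictor (ASSUMPTION, not a real model).

    Risk is the larger of the battery depletion and the packet loss rate.
    """
    battery = host.metrics.get("battery", 1.0)
    packet_loss = host.metrics.get("packet_loss", 0.0)
    return max(1.0 - battery, packet_loss)


def proactive_step(
    current: Host,
    candidates: Sequence[Host],
    risk_threshold: float = 0.7,
) -> Host:
    """One monitoring cycle: keep the current host, or pick a migration target.

    If the predicted risk of the current host reaches the threshold, look up
    other available hosts below the threshold and return the least risky one.
    If none is available, the workload stays where it is.
    """
    if predict_failure_risk(current) < risk_threshold:
        return current
    alternatives = [
        host for host in candidates
        if host is not current and predict_failure_risk(host) < risk_threshold
    ]
    if not alternatives:
        return current
    return min(alternatives, key=predict_failure_risk)
```

Calling `proactive_step` repeatedly on fresh telemetry approximates the continuous monitoring the passage describes; the migration itself is outside the scope of this sketch.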
- To employ such a technique in fog computing further investigation may be needed because the types of device in the fog may be diverse. Because of the unstable nature of failures and heterogeneous characteristics, a hybrid failure handling method combining several methods may be more appropriate for the fog computing environment.
- embodiments herein may be understood to introduce proactive fault tolerance measures within fog networking by building a cognitive fog framework and running a modified plastic neural network which may introduce lifelong learning in performing self-adaptation to enhance and achieve expected reach in mission critical 5G applications. Further particularly, embodiments herein may be understood to propose a new system to introduce a proactive fault tolerance method within a cognitive fog framework by running a plastic neural network inside fog node. The method may be understood to introduce lifelong self-adaptation in fog nodes, and thereby enhance the performance in mission critical latency-aware 5G applications.
- embodiments herein may be understood to comprise two steps.
- a new cognitive fog framework layer and relevant taxonomy in fog Networking may be introduced.
- the taxonomy discussed in [1] may be used to understand different hierarchies in a fog network and their connections, since the fog network may be continuously changing. This understanding may assist the system of embodiments herein to introduce peer-to-peer associations to solve a problem.
- the cognitive fog framework layer may be built inside the fog node with the application of both Machine Learning (ML) and neural learning, to introduce lifelong learning, such that the proposed system may address fault tolerance automatically for any new faulty conditions.
- a ML method may be trained to understand the problems and perform fault tolerance.
- the fog node may be enabled to self-adapt for any failures based on lifelong learning.
- the lifelong learning may be implemented with the application of a plastic neural network, which may continuously learn and establish self-adaptation for any new faulty situations. This may be achieved by changing weights of some of the neurons in an artificial neural network managed by the node, rather than the whole neuronal network, in order to decrease the number of false positives, and to maintain the stability of the network.
- particular embodiments herein may be understood to relate to a system and method to empower cognitive fog networking for proactive fault tolerance in 5G applications.
- FIG. 1 depicts two non-limiting examples, in panels "a" and "b", respectively, of a communications network 10, in which embodiments herein may be implemented.
- the communications network 10 may be a computer network.
- the communications network 10 may be implemented in a telecommunications network 100, sometimes also referred to as a cellular radio system, cellular network or wireless communications system.
- the telecommunications network 100 may comprise network nodes which may serve receiving nodes, such as wireless devices, with serving beams.
- the telecommunications network 100 may, for example, be a network such as a 5G system, a Next Gen network, or an Internet service provider (ISP)-oriented network that may support an SCEF.
- the telecommunications network 100 may also support other technologies, such as a Long-Term Evolution (LTE) network, e.g., LTE Frequency Division Duplex (FDD), LTE Time Division Duplex (TDD), LTE Half-Duplex Frequency Division Duplex (HD-FDD), LTE operating in an unlicensed band, Wideband Code Division Multiple Access (WCDMA), Universal Terrestrial Radio Access (UTRA) TDD, GSM/Enhanced Data Rate for GSM Evolution (EDGE) Radio Access Network (GERAN) network, Ultra-Mobile Broadband (UMB), EDGE network, a network comprising any combination of Radio Access Technologies (RATs) such as, e.g., Multi-Standard Radio (MSR) base stations, multi-RAT base stations, etc., any 3rd Generation Partnership Project (3GPP) cellular network, Wireless Local Area Network/s (WLAN) or WiFi network/s, Worldwide Interoperability for Microwave Access (WiMax), IEEE 802.15.4-based low-power short-range networks such as IPv6 over Low-Power Wireless Personal Area Networks (6LowPAN), Zigbee, Z-Wave, Bluetooth Low Energy (BLE), or any cellular network or system.
- the communications network 10 comprises a plurality of nodes, whereof a first node 101 and a second node 102 are depicted in Figure 1.
- the first node 101 and the second node 102 may be understood, respectively, as a first computer system or server and a second computer system or server. Any of the first node 101 and the second node 102 may be implemented as a standalone server in e.g., a host computer in the cloud
- any of the first node 101 and the second node 102 may be a distributed node or distributed server, such as a virtual node in the cloud 110, and may perform some of its respective functions locally, e.g., by a client manager, and some of its functions in the cloud 110, by e.g., a server manager.
- any of the first node 101 and the second node 102 may perform its functions entirely on the cloud 110, or partially, in collaboration or collocated with a radio network node.
- any of the first node 101 and the second node 102 may also be implemented as processing resource in a server farm. Any of the first node 101 and the second node 102 may be under the ownership or control of a service provider, or may be operated by the service provider or on behalf of the service provider.
- any of the first node 101 and the second node 102 may be a core network node, such as, e.g., a Serving General Packet Radio Service Support Node (SGSN), a Mobility Management Entity (MME), a positioning node, a coordinating node, a Self-Optimizing/Organizing Network (SON) node, a Minimization of Drive Test (MDT) node, etc.
- any of the first node 101 and the second node 102 may be located in the OSS (Operations Support Systems).
- the first node 101 may be understood to have the capability to perform machine-implemented learning procedures, which may be also referred to as "machine learning".
- the model used for prediction may be understood as a predictive model, e.g., a predictive regression model such as Random Forest.
- the first node 101 may be understood to have a capability to manage an artificial neural network 105.
- the artificial neural network 105 may be understood as a machine learning framework, which may comprise a collection of connected nodes, where each node, or perceptron, may be an elementary decision unit. Each such node may have one or more inputs and an output. The input to a node may be from the output of another node or from a data source.
- Each of the nodes and connections may have certain weights or parameters associated with it. In order to solve a decision task, the weights may be learnt or optimized over a data set which may be representative of the decision task.
- the most commonly used node may have each input separately weighted, and the sum may be passed through a non-linear function which may be known as an activation function.
- the nature of the connections and the node may determine the type of the neural network, for example a feedforward network, recurrent neural network etc.
- To have a capability to manage an artificial neural network 105 may be understood herein as having the capability to store the training data set and the models that may result from the machine learning, to train a new model, and once the model may have been trained, to use this model for prediction.
- Because the system used for training the model may require more computational resources than the one using the trained model to make predictions, the first node 101 may, for example, support running Python/Java with TensorFlow, PyTorch, Theano, etc.
- the node 101 may also have GPU capabilities.
- the first node 101 may be comprised in a fog node in the communications network 10.
- the first node 101 may have access to a memory or a database 130, depicted on Figure 1 b), which may comprise pre-existing predictive models 131 of problems, e.g., of the communications network 10, or of another network.
- the memory or database 130 may alternatively be comprised in the first node 101 itself.
- the second network node 112 may be another core network node, as depicted in the non-limiting example of Figure 1a), a radio network node, such as the radio network node 150 described below, or a user equipment, such as the communication device 140 described below.
- the first network node 111 and the second network node 112 may be co-located, or be a same node.
- the communications network 10 may comprise a plurality of communication devices, whereof a communication device 140 is depicted in the non-limiting example scenario of Figure 1.
- the communications network 10 may also comprise other communication devices.
- the communication device 140 may be a UE or a Customer Premises
- CPE Central Processing Entity
- M2M Machine-to-Machine
- the communication device 140 may be also e.g., a mobile terminal, wireless device, wireless terminal and/or mobile station, mobile telephone, cellular telephone, or laptop, just to mention some further examples.
- the communication device 140 may be, for example, portable, pocket-storable, hand-held, computer-comprised, a sensor, camera, or a vehicle-mounted mobile device.
- the communication device 140 may be enabled to communicate wirelessly in the communications network 10. The communication may be performed e.g., via a RAN and possibly one or more core networks, comprised within the communications network 10.
- the communications network 10 may comprise a plurality of radio network nodes, whereof a radio network node 150, e.g., an access node, is depicted in Figure 1b).
- the telecommunications network 100 may cover a geographical area, which in some embodiments may be divided into cell areas, wherein each cell area may be served by a radio network node, although, one radio network node may serve one or several cells.
- the radio network node 150 may be, e.g., a gNodeB, or a transmission point such as a radio base station, for example an eNodeB, a Home Node B, a Home eNode B, or any other network node capable of serving a wireless device, such as the communication device 140 in the communications network 10.
- the radio network node 150 may be of different classes, such as, e.g., macro eNodeB, home eNodeB or pico base station, based on transmission power and thereby also cell size. In some examples, the radio network node may serve receiving nodes with serving beams.
- the radio network node 150 may support one or several communication technologies, and its name may depend on the technology and terminology used.
- the radio network node 150 may be directly connected to one or more core networks in the telecommunications network 100.
- the first node 101 is configured to communicate within the communications network 10 with the second node 102 over a first link 161 , e.g., a radio link, an infrared link, or a wired link.
- the first node 101 is configured to communicate within the communications network 10 with the second node 102 over a second link 162 with the radio network node 150, e.g., a radio link, an infrared link, or a wired link, which in turn may be configured to communicate within the communications network 10 with the communication device 140 over a third link 163, e.g., a radio link, an infrared link, or a wired link.
- the first node 101 may be configured to communicate with the database 130 within the communications network 10 over a fourth link 164, e.g., a radio link, an infrared link, or a wired link.
- first link 161 , the second link 162, the third link 163 and the fourth link 164 may be a direct link or may be comprised of a plurality of individual links, wherein it may go via one or more computer systems or one or more core networks in the
- the intermediate network may be one of, or a combination of more than one of, a public, private or hosted network; the intermediate network, if any, may be a backbone network or the Internet; in particular, the intermediate network may comprise two or more sub-networks, which is not shown in Figure 1.
- Embodiments of a method, performed by the first node 101 will now be described with reference to the flowchart depicted in Figure 2.
- the method is for handling a problem in the communications network 10.
- the first node 101 manages an artificial neural network 105.
- a new problem may be understood as a problem that may not have been encountered by the first node 101 before, and may therefore not be available within historical records available to the first node 101 , e.g., in the database 130.
- the first node 101 determines, in a set of data collected from the communications network 10, a set of one or more features.
- the set of one or more features define a problem in an operation of the communications network 10.
- the set of one or more features are previously undetected in the communications network 10 by the first node 101 . That is, they are new.
- the fact that they are new may be understood to mean, not necessarily that each of the features in the set is new, but that the combination of features is new.
- At least one of the one or more features in the set may also be new, in some examples, but this is not necessary.
- the set of one or more features are also lacking a corresponding set of one or more solutions.
- the determining in this Action 201 may be understood as calculating, deriving or detecting.
- the set of data collected from the communications network 10 may be, for example, the history of an application, which may be e.g., stored in the database 130.
- the database 130 may also be referred to herein as a knowledge data store.
- the history of the application may comprise log files that may be application logs, counter files, alarm files or any other file specific to the application.
- the set of one or more features may be a set of one or more variables or parameters, such as, for example, sensor values of temperature and/or pressure, along with the time instants at which those values may have been obtained.
- the first node 101 may perform the determining in this Action 201 by using different components to explore the scenario based on the root sources of the problem, as explained next.
- Each of the different components may be, for example, implemented in a distributed environment, e.g., by a plurality of processors, e.g., in the cloud 110.
- This component, the monitor component, may be understood to be responsible for monitoring the history of an application to uncover any problems in the application system.
- the first node 101 may detect relevant log files.
- the monitor component may, for example, monitor all sensors attached to the first node 101 .
- the input to this component may be one single configuration file for all devices that may be connected to each fog node.
- a configuration file may be understood to comprise several details, such as device identifier (ID), type, value, and the time instant the value may have been recorded. It may also comprise information such as tree structure etc.
- the contents of the configuration file may include the path of the log files for each application, the pattern to search for in each log file for each application, the iteration cycle, etc.
- the monitor component may use the configured files to search in the log files to detect problems in the system.
- the first node 101 may compare the set of data collected with the configured pattern, to detect anomalies or indicators of problems in the operation of the communications network 10.
- the output of this component may be an XML/JSON data object that may comprise all the relevant information surrounding the log, e.g., timestamp, service, counter, occurrence etc., and which may be passed to the next component of the first node 101, a correlation component.
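The monitor behaviour described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `CONFIG` layout, the `billing` application name, the log line format and the `scan_log` helper are all assumptions introduced for the example, and log lines are passed in directly rather than read from a configured path.

```python
import json
import re

# Hypothetical configuration: one entry per application, holding the
# pattern to search for in that application's log lines.
CONFIG = {
    "billing": {"pattern": r"ERROR|timeout", "service": "billing"},
}

def scan_log(app, lines, config=CONFIG):
    """Return a JSON string describing every log line that matches the configured pattern."""
    entry = config[app]
    pattern = re.compile(entry["pattern"])
    hits = []
    for line in lines:
        if pattern.search(line):
            # Assumed line layout: "<timestamp> <message>".
            timestamp, _, message = line.partition(" ")
            hits.append({"timestamp": timestamp, "message": message})
    return json.dumps(
        {"service": entry["service"], "occurrence": len(hits), "events": hits}
    )

sample = [
    "2019-06-26T10:00:00 INFO started",
    "2019-06-26T10:00:05 ERROR db timeout",
]
print(scan_log("billing", sample))
```

The resulting JSON object is the kind of data object that could then be handed to the correlation component.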
- This component, the correlation component, may be understood to be responsible for constructing the Sphere of Interest (SoI) Data/Indicators using the input received from the monitor component.
- the SoI may be understood to be a template that represents the application in the form of spheres and/or layers, which may help in tagging a problem with a particular domain of an application. While the configured pattern in the monitor component may only identify the problem, in this component, solutions may be associated with each and every problem. A problem that may have occurred may be seen as a different pattern in the features. To solve the problem, the first node 101 may need to identify the pattern, and then use the corresponding solution to repair it. The solution may be domain specific or it may span various domains.
- the correlator may also update the knowledge data store, that is, the database 130, with the information.
- the SoI may have all the indicators associated with the problem monitored. It may be understood to be a stateless component, that is, it may be understood to not require any time information.
- the input to this component may be a JSON/XML object received from the monitor component.
- the output of this component may be the updated XML/JSON object that may then be passed to an analyzer component. Also, the SoI may be available with run time information.
- This component, the analyzer component, may be understood to be responsible for analyzing all the SoI indicators and arriving at the potential problem(s), that is, the set of one or more features. For that to happen, the analyzer may maintain threshold values for each SoI indicator and may also maintain a table of information to map indicators to a particular problem. An indicator may be understood as an anomalous pattern in the data which may indicate the presence of an issue or problem.
- This table may be either constructed at compile time, in real time, or constructed by learning, offline, or a mixture of the above variants.
- the initial table may be constructed with the information relevant to problems already known to the application. The table may then be updated with the information learnt by the self-adaptation system.
- the component may notify the presence of the problem to the second node 102, e.g., a UX client.
- the analyzer may also update the knowledge data store with the information.
- This component is stateless.
- the input to this component may be a JSON/XML object received from the correlator component.
- the output of this component may be a list of problems, e.g., problem ids, identified.
- the output may be the updated XML/JSON object passed to an executor component.
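As a rough sketch of the analyzer described above, the following keeps a threshold per indicator and a table mapping sets of exceeded indicators to problem identifiers. The indicator names, threshold values, table entries and the exact-match lookup are assumptions made for illustration only.

```python
# Hypothetical thresholds per indicator.
THRESHOLDS = {"cpu_load": 0.9, "error_count": 5}

# Hypothetical table mapping a set of exceeded indicators to a problem id.
PROBLEM_TABLE = {
    frozenset({"cpu_load"}): "P1",
    frozenset({"cpu_load", "error_count"}): "P2",
}

def analyze(indicators):
    """Return the list of problem ids whose indicator set equals the exceeded indicators."""
    exceeded = frozenset(
        name
        for name, value in indicators.items()
        if name in THRESHOLDS and value > THRESHOLDS[name]
    )
    problem = PROBLEM_TABLE.get(exceeded)
    return [problem] if problem else []

print(analyze({"cpu_load": 0.95, "error_count": 7}))
```

In this sketch, an empty list means either that no threshold was exceeded or that the exceeded combination has no table entry; a fuller version would distinguish the two so that an unmapped combination can be reported as a new problem.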
- the SoI may be available with run time information, that is, the solution may also have some features such as time taken, etc.
- This component, the executor component, may be understood to be responsible for fetching the solutions mapped to the Problem IDs that may have been reported by the analyzer.
- the order in which the solution may have to be applied depends on the characteristics of each solution and the rank of the solution.
- Example for characteristics of a solution may include time at which the solution may need to be applied, the range of Central
- CPU Processing Unit
- the ranking of the solution may be assigned based on the success ratio of the problem resolution when the solution may have been applied previously. Based on these two factors, the characteristics and the rank, the solution may be chosen and applied. When a solution is applied for a problem, no other solution may be applied for another occurrence of the same problem until its cooling period has elapsed. This may be done to keep the system lightweight, without adding unnecessary complexity. Due to this, the executor may be understood to be a stateful component, that is, it may be understood to require time information. The executor may also update the knowledge data store with the information. As was described for the analyzer component, if there is no known solution for the reported problem identifiers, the first node 101 may just notify the presence of the problem using the UX client.
- the input to this component may be a set of problem identifiers received from the analyzer component.
- the output of this component may be either based on an applied solution waiting in its cooling period, or a notification towards the UX interfaces.
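The executor logic above (rank by success ratio, respect a cooling period, notify when no solution is known) might be sketched as below. The `Solution` and `Executor` classes, their field names and the string return values are hypothetical; the sketch ranks on success ratio only and ignores the other solution characteristics mentioned in the text.

```python
from dataclasses import dataclass

@dataclass
class Solution:
    name: str
    successes: int
    attempts: int
    cooling_period_s: float

    @property
    def success_ratio(self):
        # Ratio of successful resolutions to previous applications.
        return self.successes / self.attempts if self.attempts else 0.0

class Executor:
    def __init__(self, solutions_by_problem):
        self.solutions_by_problem = solutions_by_problem
        self.cooling_until = {}  # problem id -> time at which cooling ends

    def handle(self, problem_id, now):
        """Return the applied solution name, 'cooling', or 'notify' if none is known."""
        if now < self.cooling_until.get(problem_id, float("-inf")):
            return "cooling"
        candidates = self.solutions_by_problem.get(problem_id, [])
        if not candidates:
            return "notify"  # e.g. towards the UX client
        best = max(candidates, key=lambda s: s.success_ratio)
        self.cooling_until[problem_id] = now + best.cooling_period_s
        return best.name

executor = Executor(
    {"P1": [Solution("restart", 8, 10, 60.0), Solution("scale_out", 3, 10, 30.0)]}
)
print(executor.handle("P1", now=0.0))
```

Keeping the cooling deadline per problem identifier is what makes this component stateful, in line with the description.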
- the first node 101 in this Action 201 e.g., via the executor component, may conclude that the set of one or more features lack a corresponding set of one or more solutions.
- the first node 101 trains the artificial neural network 105 to find a solution to the problem defined by the set of one or more features. This is performed by training the artificial neural network 105 with a modified pre-existing linear activation function.
- the pre-existing linear activation function may be understood as a function which may introduce non-linearity in the relation between the input and the output of the neural network 105. This may be understood to ensure that the relation may be defined properly.
- the training in this Action 202 may be performed with another component of the first node 101 , an offline monitor.
- This component may be understood to be responsible for training the self-adaptation system and introducing a learning component into the self-adaptation system.
- the offline monitor component may, using machine learning, analyse the current trend in traffic, a pattern of events, learn about the system utilization during different hours of a day, different day of a week and so on.
- the offline monitor component may use the database 130, that is, the knowledge data store, to learn the system.
- Embodiments herein may make use of a modified plastic neural network to find the optimal solution for the new problem, that is, the set of the one or more features. This practice may be understood to instill lifelong learning in the system.
- the input may be the historically stored data objects by other components in the first node 101 .
- the output may be to perform, with and/or without manual approval, one of the following actions for new learnings: a) update the analyzer table with a new entry mapping identified problems and their indicators, or modify existing entries, and b) update the executor table with a new entry mapping identified solutions and their indicators, or modify existing entries.
- the offline monitor may run the modified plastic neural network model to find a solution for the problem as a next step.
- the modified pre-existing linear activation function adjusts a pre-existing linear activation function by adding a factor.
- the added factor adjusts a respective weight, respectively, of a subset of neurons comprised in the artificial neural network 105.
- the artificial neural network 105 may be understood to comprise a group of so-called “neurons”, which may have so-called “connections” among them.
- a neuron herein may be understood as a unit of computation.
- a neuron may have one or more inputs and an output.
- Each input to the neuron may be associated with a weight or connection.
- the sum of the products of the inputs with their corresponding weights may be passed through an activation function such as Rectified Linear Unit (ReLU), tanh, Leaky ReLU, softmax, etc.
- These activation functions may be understood to add non-linearity to the network.
- a connection may determine which input may be fed to which neuron.
- the weight associated with the connection may determine how important that input is for the computation of the connected neuron.
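The neuron computation described above, a weighted sum of inputs passed through an activation function, can be illustrated with a short sketch. The function names are chosen for the example and no bias term is modelled, since the text does not mention one.

```python
import math

def relu(z):
    """Rectified Linear Unit: passes positive values, clamps negatives to zero."""
    return max(0.0, z)

def neuron_output(inputs, weights, activation=relu):
    """Weighted sum of the inputs passed through an activation function."""
    if len(inputs) != len(weights):
        raise ValueError("each input needs exactly one weight")
    z = sum(x * w for x, w in zip(inputs, weights))
    return activation(z)

# Weighted sum is 1.0*0.5 + 2.0*(-1.0) = -1.5, which ReLU clamps to 0.0.
print(neuron_output([1.0, 2.0], [0.5, -1.0]))
# Weighted sum is 1.0, passed through tanh instead of ReLU.
print(neuron_output([1.0, 2.0], [0.5, 0.25], math.tanh))
```

A larger weight makes the corresponding input count more in the sum, which is the sense in which the weight determines the importance of that input.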
- Any artificial neural network may compute an output vector y as a multiplication of weights of
- the artificial neural network may learn the weights such that the predicted output may match with the actual output.
- the weights may be trained only on given features and output.
- the first node 101 may rely on the assistance of the trained model to help.
- the above learning pattern may be modified as:
- U may be considered as the weights of the artificial neural network 105 which may help in addressing the new set of problems.
- the artificial neural network 105 may be understood to learn new things from the present incidents. These are referred to herein as stochastic weights. However, the choice of stochastic weights may depend on the training the artificial neural network 105 may require from the earlier patterns identification.
- weights in a matrix U may be initialized to zero. Further, for any new problem that may come, the features of the problem may be checked. This will now be illustrated with an example.
- Assume that the artificial neural network 105 is trained to handle problems p1, p2 and p3, and that the specific combinations of features depicted in Table 1 create the problems previously detected in the communications network 10 by the first node 101, which are given in the corresponding rows of Table 1.
- a problem may be understood to be identified by the features.
- the new features of the problem are f2, f5 and f6, all individually known, but in a combination previously undetected in the communications network 10 by the first node 101 and lacking a corresponding set of one or more solutions.
- the current features have not matched with any of the known problems.
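The check that a feature combination matches none of the known problems can be sketched as follows. Table 1 is not reproduced in this text, so the feature sets assigned to p1, p2 and p3 below are purely hypothetical placeholders; only the new combination f2, f5 and f6 comes from the description.

```python
# Hypothetical stand-in for Table 1 (the real rows are not reproduced here).
KNOWN_PROBLEMS = {
    "p1": frozenset({"f1", "f2", "f3"}),
    "p2": frozenset({"f2", "f4", "f5"}),
    "p3": frozenset({"f3", "f5", "f6"}),
}

def match_problem(features):
    """Return the id of the known problem with exactly these features, else None."""
    observed = frozenset(features)
    for problem_id, known in KNOWN_PROBLEMS.items():
        if known == observed:
            return problem_id
    return None

# Every feature is individually known, but the combination is new.
print(match_problem({"f2", "f5", "f6"}))
```

An exact-match lookup like this returns `None` for the new combination, whereas a classifier trained only on the known rows would be forced to pick the closest existing problem.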
- an artificial neural network may identify the new problem as either p2 or p3. This is wrong, and may lead to a waste of resources and an increase in false positives, that is, instances in which the true category is false and the prediction system has nevertheless returned it as true.
- Embodiments herein provide a new method to solve this problem. This may comprise the generation of a new solution for set of new features. The following procedure does this.
- the artificial neural network 105 may have initially been constructed in this case to map the features and solutions. Use of a multi-layer perceptron may be assumed in this case.
- the artificial neural network 105 constructed may have three hidden layers and a sigmoid activation function.
- the artificial neural network 105 initially constructed may have the architecture shown in Figure 3.
- the number of input nodes in the artificial neural network 105 may need to match the dimension of the feature vector.
- the number of output nodes may need to match the number of solutions in the data. In brief, this may be considered a categorization problem.
- this information may be passed to the input of the artificial neural network 105.
- the output of the artificial neural network 105 may depend on the similarity of the features present in the dataset. However, this may be understood to not be a simple classification problem, as the feature f7 may be totally different from the existing features.
- the artificial neural network 105 may be retrained using the procedure discussed in the plasticity network.
- the stochastic weights discussed in the plastic network may be calculated using the random adjustment.
- the random adjustment may be done only on a particular set of weights so as to ensure stability.
- If all the weights were adjusted, the artificial neural network 105 may become unstable and yield poor results. Hence, the adjustment may be performed only on specific weights to ensure stability.
- the set of weights may be selected using the below procedure.
- the trained model may be used on the existing set of features.
- the first node 101 may check which neurons in the artificial neural network 105 may be firing other neurons continuously for all the features. That is, the first node 101 may select the subset of neurons which may be continuously firing other neurons. In this way, the most dominant neurons may be identified. From another perspective, it may be seen that these neurons participate effectively in the classification.
- the subset of neurons comprises one or more neurons providing the highest output with the pre-existing linear activation function, according to a threshold, which is referred to herein as a first threshold, when the pre-existing linear activation function is used to train the artificial neural network 105 with the set of data.
- the number of neurons is not restricted and may depend on the user. However, it may be remembered that selecting a high number of neurons may destabilize the system, while with a low number of selected neurons the method may not converge. Hence, the following procedure may be used to select the subset.
- the subset of neurons may be identified based on a back-propagation process in the training of the artificial neural network 105.
- the errors are computed and may be back-propagated towards the initial inputs to calculate the weights.
- some of the neurons may fire other neurons. This process may be repeated until all the data for training may have been used. In this way, it may be observed which neurons may fire others, that is, which neurons may be sending output to other neurons throughout the entire training process.
- the first node 101 may select the top 10 neurons which fire other neurons frequently. Hence, in some embodiments, the subset of neurons may comprise 10 neurons. In this way, the computation complexity of the method may be reduced. Overall, the first node 101 may then have 10 weights to play with using the proposed method. The mathematical details behind this are discussed below.
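Selecting the most frequently firing neurons can be sketched as below. The text does not define "firing" numerically, so this example assumes a neuron fires for a training sample when its output exceeds a threshold; the per-sample dictionaries of neuron outputs and the helper names are likewise assumptions.

```python
from collections import Counter

def count_firings(activations_per_sample, threshold=0.0):
    """Count, per neuron, the number of samples in which its output exceeds the threshold."""
    counts = Counter()
    for activations in activations_per_sample:
        for neuron_id, value in activations.items():
            if value > threshold:
                counts[neuron_id] += 1
    return counts

def top_neurons(counts, k=10):
    """Return the ids of the k most frequently firing neurons."""
    return [neuron_id for neuron_id, _ in counts.most_common(k)]

# Hypothetical recorded outputs for three training samples.
recorded = [
    {"n1": 0.9, "n2": 0.3, "n3": 0.4},
    {"n1": 0.7, "n2": 0.2, "n3": 0.0},
    {"n1": 0.5, "n2": 0.0, "n3": 0.0},
]
print(top_neurons(count_firings(recorded), k=2))
```

With `k=10`, as in the embodiment above, only the weights of the ten selected neurons would be candidates for the random adjustment.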
- the artificial neural network 105 has pre-trained weights W.
- the artificial neural network 105 may be trained for learned features and solutions. However, when a new feature is encountered, it will introduce false positive for the data even though it is closer to an existing feature. Then in this case, the first node 101 may need to retrain the artificial neural network 105 in total. This may be understood to require adding the new feature and a corresponding solution. This may be considered to be difficult and an automatic way of arriving at the solution may be required.
- the weights of the artificial neural network 105 may be retuned by adding some random numbers. For example, according to this, the new weights may be:
- some elements are non-zero in matrix U.
- this may be understood to mean that only some of the weights of the artificial neural network 105 may be changed.
- the artificial neural network 105 may perform well and the change of weights does not destabilize the artificial neural network 105.
- the predictions may still be reasonable, since only some weights may be understood to be adjusted.
- a decrease in the number of false positives may be observed when compared to the original network with weights W.
- the second case of weight selection may be understood to be chosen as the objective here may be understood to be to decrease the number of false positives.
- embodiments herein may be understood to provide a solution for the problem based on lifelong learning approach.
- the component which may actually apply the corrected solution, the perform component, may need to be executed as a final step in the FTC loop to illustrate the self-adaptation to any new problem relevant to the domain.
- the first threshold may be based on a ranking of the neurons in the artificial neural network 105 according to their output with the pre-existing linear activation function.
- the above procedure may be understood to imply that the weight matrix W corresponds to existing weights of the artificial neural network 105, and matrix U may be understood to correspond to stochastic weights for the artificial neural network 105.
- the first node 101 may not change all the values of U matrix, that is, all the values except some finite values will be zero.
- the artificial neural network 105 may predict with the new weights of (W+U). This may be understood to make the artificial neural network 105 learn the new features without training, which may then enable the reduction in the number of false positives.
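The relation between the existing weights W and the mostly-zero stochastic matrix U can be illustrated with plain nested lists. This is only a sketch: the helper names, the uniform distribution and the `scale` parameter are assumptions, since the text does not specify how the random values are drawn.

```python
import random

def build_sparse_update(shape, selected, scale=0.01, rng=None):
    """Build U: zero everywhere except small random values at the selected (row, col) cells."""
    rng = rng or random.Random()
    rows, cols = shape
    u = [[0.0] * cols for _ in range(rows)]
    for r, c in selected:
        u[r][c] = rng.uniform(-scale, scale)
    return u

def add_matrices(w, u):
    """Element-wise sum W + U."""
    return [[wv + uv for wv, uv in zip(w_row, u_row)] for w_row, u_row in zip(w, u)]

w = [[0.2, -0.4], [0.1, 0.3]]            # hypothetical pre-trained weights W
u = build_sparse_update((2, 2), [(0, 1)], rng=random.Random(0))
new_w = add_matrices(w, u)               # weights used for prediction: W + U
print(new_w)
```

Because U is zero outside the selected cells, every other weight in W + U is identical to the pre-trained value, which is the property the description relies on for stability.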
- the set of neurons may be selected, and their weights may be adjusted at random with small change in weights. After every adjustment, a new set of solutions may be obtained, and each and every solution may be tested.
- the first node 101 may determine whether or not the solution to the problem is found based on the modified pre-existing linear activation function.
- If the problem is solved, that is, if it subsides, then it may be considered that the random weights were initialized correctly.
- To initialize may be understood as to assign certain weights at the beginning of training to the connections.
- the first node 101 may perform Action 204.
- the first node 101 may adjust, based on a result of the determination in Action 203, the respective weight of at least a first neuron in the subset of neurons.
- the adjustment may be by a random amount.
- the first node 101 may go back to change the weights by another random quantity, and repeat this process until the problem is solved. In this way, the entire solution may be performed autonomously, without the need for any human to solve the problem.
- the first node 101 may use the artificial neural network 105 with the modified pre-existing linear activation function using the adjusted respective weight to find the solution to the problem defined by the set of one or more features.
- the set of one or more features may comprise a previously unidentified feature by the pre-existing linear activation function.
- the modified pre-existing linear activation function may be used to identify future problems in the communications network 10, based on the previously unidentified feature and the found solution.
- the first node 101 may iterate the determining of Action 203 of whether or not the solution to the problem has been found, the adjusting of Action 204 and the using of Action 205 until one of: the solution is found or a number of iterations has exceeded another threshold, referred to herein as a second threshold, in the absence of having found a solution.
- the above procedure may be repeated for a specific set of iterations, for example.
- the second threshold may be 10. If the problem is not solved within 10 iterations, then this may be understood as an unresolved new problem, and a human may be asked to solve it. Most of the problems may, however, be expected to be solved within 10 iterations, so that human intervention may be needed only in very rare cases.
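The iteration over Actions 203 to 205, bounded by the second threshold, might look like the sketch below. The `problem_solved` callback stands in for applying the candidate weights and checking whether the problem subsides; it, the `scale` parameter and the choice to perturb the original selected weights afresh in every iteration (rather than accumulating changes) are assumptions of this example.

```python
import random

def self_adapt(selected_weights, problem_solved, max_iterations=10, scale=0.05, rng=None):
    """Randomly perturb the selected weights until problem_solved(weights) is true.

    Returns (weights, iterations) on success, or (None, max_iterations) when the
    iteration limit is reached and the problem should be escalated to a human.
    """
    rng = rng or random.Random()
    for iteration in range(1, max_iterations + 1):
        candidate = [w + rng.uniform(-scale, scale) for w in selected_weights]
        if problem_solved(candidate):
            return candidate, iteration
    return None, max_iterations

# Hypothetical check: the problem counts as solved once the first weight exceeds 0.1.
weights, used = self_adapt([0.1, 0.2], lambda w: w[0] > 0.1, rng=random.Random(1))
print(weights, used)
```

A `None` result corresponds to the case in which the first node 101 sends an indication to the second node 102 so that a human can provide a solution.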
- the weight matrix learned may be assumed to be W .
- This matrix may be understood to contain columns which are shown below
- weights may be selected from the above weights. Let them be W_s. Now, after a first iteration, the weights may be updated as:
- R_s is the random change, which may be considered to be the input to the matrix of the weights. Now, this may be replaced in the matrix W and a classification may be performed, that is, the neural network 105 may be executed to obtain the solution. If a new category is predicted, it may be used to solve the problem. If the solution works, then the new model may also be used for prediction the next time. If the problem is not solved, another random set of weights may be chosen as:
- the first node 101 may determine that the number of iterations has exceeded the second threshold in the absence of having found a solution.
- the first node 101 may send, based on the determination that the number of iterations has exceeded the second threshold, an indication to a second node 102 in the communications network 10. For example, if the problem is not solved within 10 iterations, then the first node 101 may send this output to a human, e.g., via the second node 102, for updating with a new solution.
- the new solution may be added to the artificial neural network 105 and it may be retrained. In this way, the first node 101 may perform self-adaptation to most of the new problems.
- Figure 3 is a schematic illustration depicting a first non-limiting example of the architecture the artificial neural network 105 may have.
- the artificial neural network constructed has three inputs, x_0, x_1 and x_d, and two outputs, y_1 and y_2.
- the output of the neural network is computed as shown in the Figure 3.
- the first node 101 may use it as a new feature to train the artificial neural network 105 again, which may be considered the optimal tuning.
- the first node 101 may repeat steps 3 and 4 until the problem is solved.
- embodiments herein may be understood to introduce a new method, e.g., via an offline monitor, to solve a new set of problems which are not trained by the first node 101 earlier, using a self-adaptation principle.
- embodiments herein may modify the plastic artificial neural network 105 by tuning the weights of the neurons in such a way that the plastic artificial neural network 105 may address the features of the new problem, one or more of which features may be new.
- the new method of embodiments herein may be understood to introduce a lifelong learning for self-adaptation principle as an additional contribution to the 5G applications.
- particular embodiments herein may relate to an adaptive method of inculcating fault tolerance on the fog nodes. As discussed, this method may comprise three sequential parts.
- a fault in the fog node may be detected.
- the root source that is, the reasons, from where the fault is propagated may be identified.
- the first node 101 may come up with a new method to solve a problem which it may not have faced earlier.
- Embodiments herein may be understood to be particularly drawn to a new system and methods to address the last step of the approach. Embodiments herein may be understood to provide a solution to self-adapt, within fog nodes, to any new problem, that is, a problem which may not be available within the history.
- a modified plastic neural network mechanism has been disclosed herein that enables a fog node to understand the new problem and perform fault tolerance on its own.
- the method may depend on a two step approach. In the first step, a cognitive Framework may be constructed within each fog node, which may enable a Fault Tolerance Control (FTC) loop to interpret a new set of problems.
- a framework inside the fog computing environment may be created, according to embodiments herein, as schematically depicted in the illustration of Figure 4.
- An arrangement according to existing methods is shown on the left, in panel a), where a fog node may collect data from IoT sensors and send it to the cloud.
- An arrangement according to a non-limiting example of embodiments herein is shown on the right, in panel b), where the first node 101 implements an FTC loop within a fog node to interpret a new set of problems detected in the data collected from IoT sensors and send it to the cloud.
- the FTC loop may introduce a cognitive modelling framework inside the fog node, as depicted in the schematic illustration of Figure 5.
- the first node 101 senses the devices in the environment such as sensors, by collecting sensor data etc.
- the first node 101 plans actions based on the identified sensor data. It may be noted that the actions are planned for an existing set of features, e.g., from the history for which the first node 101 is trained. If not, the first node 101 tries to generate a new set of actions based on the predictions made. Further, at 503, the first node 101 reasons about the planned actions. This may include why these actions are taken and what the outcome may be if the actions are implemented. Finally, at 504, the first node 101 performs this set of actions to see the result.
- the first node 101 uses self-healing techniques to mitigate the effect of actions and also to increase the efficiency of the planned set of actions.
- the FTC loop may be activated for the fault tolerance cycle based on state changes, from normal at 506, abnormal at 507, to dead at 508.
- the FTC loop may be performed before the first node 101 may move to a next state.
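The plan, reason and act stages of the FTC loop, together with the normal, abnormal and dead states, may be sketched as follows. This is a minimal illustration only: the names `State`, `ftc_cycle`, `next_state` and `predict_action`, and the specific transition rule between states, are assumptions introduced here and are not taken from the disclosure.

```python
from enum import Enum


class State(Enum):
    NORMAL = "normal"      # state at 506
    ABNORMAL = "abnormal"  # state at 507
    DEAD = "dead"          # state at 508


def ftc_cycle(readings, known_actions, predict_action):
    """Run one plan -> reason -> act pass over the sensed readings.

    readings:       dict of feature name -> sensed value
    known_actions:  dict of feature name -> action learned from history
    predict_action: hypothetical fallback for features not seen before
    """
    trace = []
    for feature, value in readings.items():
        # Plan: reuse a trained action if one exists, otherwise predict one.
        if feature in known_actions:
            action, source = known_actions[feature], "history"
        else:
            action, source = predict_action(feature, value), "prediction"
        # Reason (503): record why the action was chosen.
        reason = f"{feature}={value} handled from {source}"
        # Act (504): in this sketch, acting only records the decision.
        trace.append((feature, action, reason))
    return trace


def next_state(state, healthy):
    """Assumed transition rule: one step towards DEAD per unhealthy cycle."""
    if state is State.DEAD:
        return State.DEAD
    if healthy:
        return State.NORMAL
    return State.ABNORMAL if state is State.NORMAL else State.DEAD
```

In this sketch, `ftc_cycle` would be called before `next_state`, mirroring the statement that the FTC loop may be performed before the node moves to a next state.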
- Figure 6 is a schematic illustration depicting a non-limiting example of the different components that may be built and used according to particular examples of embodiments herein, to explore the scenario based on the root sources of the problem.
- the first node 101 may detect an anomaly using steps 1 and 2 described earlier, and may find the attributes, that is, the root sources, behind the anomaly. The plan-action part of the FTC loop may then be immediately triggered.
- the first node 101 may then execute the cycle as depicted in Figure 6, with the use of the different components depicted, to find more reasons towards a solution.
- the Monitor component monitors the history of an application to uncover any problems in the application system, using the input from the configuration file.
- the correlator component generates the Sol 604.
- the correlator 603 may also update the knowledge data store with the information at 605.
- the correlator passes the Sol to the analyzer component 607.
- This component may be understood to be responsible for analyzing all the Sol indicators and arrive at the set of one or more features.
- the analyzer may maintain the problem identification table 608, as described earlier.
- the analyzer may also update the knowledge data store with the information at 609. If the component is unable to derive the problem, it may notify the presence of the problem to the UX client at 610.
- the output of this component may be a list of problems identified with the Sol indicators, passed to the executor component 611. This component may read the table and, at 612, fetch the solutions mapped for the problem IDs that may have been reported by the analyzer.
- the executor 611 may also update the knowledge data store with the information at 613. As was described for the analyzer component, if there is no known solution for the reported issue identifiers, this component may just notify the presence of the problem using the UX client at 614.
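The hand-off from the analyzer to the executor described above amounts to two table lookups with a notification fallback. A minimal sketch follows, assuming the problem identification table and the solution mapping can be represented as dictionaries; the function names, the `notify` callback standing in for the UX client, and all table contents are hypothetical.

```python
def analyze(indicators, problem_table):
    """Map indicators to problem IDs; return (problem_ids, unidentified)."""
    problem_ids, unidentified = [], []
    for indicator in indicators:
        if indicator in problem_table:
            problem_ids.append(problem_table[indicator])
        else:
            # The analyzer could not derive a problem for this indicator.
            unidentified.append(indicator)
    return problem_ids, unidentified


def execute(problem_ids, solution_table, notify):
    """Fetch the solution mapped to each problem ID, notifying if none is known."""
    solutions = []
    for pid in problem_ids:
        if pid in solution_table:
            solutions.append(solution_table[pid])
        else:
            # No known solution: only report the presence of the problem.
            notify(f"no known solution for problem {pid}")
    return solutions
```

A caller would pass the unidentified indicators and any notified problem IDs on to whatever learning component handles new problems.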
- the offline monitor component 615 may be understood to be responsible for training the self-adaptation system and introducing a learning component of the self-adaptation system.
- the offline monitor component may use the data store or database 130 at 616 to learn the system.
- one example implementation is in a green energy building
- another example implementation is in incident management in telecom networks
- yet another example implementation is in a fog RAN scenario.
- the method is implemented in a green energy building.
- the main objective is to control the air conditioning system inside the building.
- the first node 101 is a fog node designed to take actions to start and/or stop an air conditioning system to control the air temperature inside a room.
- the air conditioning system will not work efficiently, as there is an anomaly in the temperature. That is, the air conditioning system will be able to be turned on or off based on a configured temperature to be maintained, but it may be used unnecessarily, wasting energy, if the cause of the temperature anomaly is elsewhere.
- the air conditioning system may be able to estimate the underlying true cause of the temperature anomaly, which will help the air conditioning system to work efficiently. If the fault is identified, the first node 101 may be able to compute the underlying non-faulty data, that is, the data that may have been observed if there were no faults in the data.
- the multi-variable data may comprise all the data obtained from other devices attached to the first node 101, which may be referred to as fogs, and nearby fogs. If the fault is identified in one variable of the fog, the first node 101 may use the process of embodiments herein to obtain the source of the anomaly. In addition, the underlying non-faulty data may be estimated. Once the source of the anomaly is identified, the first node 101 may take the necessary actions to repair the concerned fog node. Also, the first node 101 may use the estimated fog-node data to take the necessary actions. Since the non-faulty data has been estimated, the actions taken may be understood to be proper and accurate.
- the first node 101 is a fog node to monitor the green building.
- the objective is to utilize the resources such that the energy used is minimum. It may be assumed that the variables measured are temperature inside the room, light intensity inside the room, outside temperature, outside light intensity etc.
- the air conditioning system will increase its blower and also light intensity may be controlled such that the inside temperature and light intensity will decrease, although there is no true increase in temperature.
- This is inefficient operation as it will waste both energy sources.
- One may identify the root source of the anomaly and identify that the anomaly in the temperature is caused by an anomaly in the light intensity. In this way, the proposed method may identify the root source of the anomaly faster. This will save energy, as only closing windows may solve the problem, as opposed to controlling two sources in the previous scenario.
- the features of the variables may be used as inputs.
- the features may be outside light intensity values and temperature inside the room. These features may be mapped into problems such as light intensity problems, and the solution may be to shut the windows and doors to prevent light from entering inside.
- the solution may be to increase or decrease the air conditioning system.
- carbon dioxide levels in the room may be measured. For illustrative purposes, it may be assumed that the model is trained for handling this problem.
- the artificial neural network 105 may be assumed to be trained on this data.
- the number of input nodes in the artificial neural network 105 in this example may be 100. That is, the artificial neural network 105 may have 100 past light intensity readings collected over 100 minutes. Similarly, 100 temperature readings may be collected over 100 minutes and used to train the model. The input may be collected at different time instants to reduce the bias in training the artificial neural network 105.
- the artificial neural network 105 may be chosen to consist of three hidden layers with fully connected nodes. The number of nodes in each hidden layer may be chosen to be 10. Overall, in this way the first node 101 may have a total of 1302 weights to learn from the training data. For training, regular back propagation may be used with a gradient descent approach.
- the artificial neural network 105 may be trained for 20 epochs and training accuracy may be obtained as 85%. For testing, the first node 101 may choose 100 new readings of variables and a validation accuracy may be obtained of 78%.
- the proposed method was tested for incident management in a telecommunication system.
- the set of data collected from the communications network 10 comprised all the incidents that happened in cells of a location in India.
- only the incidents which had minimum and maximum impact were collected, which in total was 100 different incidents in the communications network 10.
- the artificial neural network 105 was trained with 75 incidents and corresponding solutions. According to the method disclosed herein, the remaining incidents were collected and predicted using the disclosed method. Out of 25 new incidents, 20 incidents were solved automatically using the disclosed method. Out of these 20 incidents, 10 were solved within the next two iterations of changing weights. The remaining incidents were solved within six to eight iterations. Only 5 incidents were reported back to an engineer via the second node 102 to solve the problem. In this way, the human effort in solving the problems was reduced, and the disclosed method may provide a better solution to the problem.
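The counts reported for the incident management example can be cross-checked with a few lines of arithmetic. The sketch below only restates the figures given above; the variable names are chosen here for illustration.

```python
total_incidents = 100
trained_incidents = 75
new_incidents = total_incidents - trained_incidents   # 25 held-out incidents

solved_automatically = 20
solved_within_two_iterations = 10
# The remaining automatically solved incidents needed six to eight iterations.
solved_within_six_to_eight = solved_automatically - solved_within_two_iterations

# Incidents that still had to be reported to an engineer.
escalated = new_incidents - solved_automatically

automatic_rate = solved_automatically / new_incidents  # share solved without an engineer
escalation_rate = escalated / new_incidents            # share reported to an engineer
```

On these figures, four out of five new incidents were resolved without engineer involvement.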
- embodiments herein may be applied in a fog RAN scenario, to decrease the latency of 5G applications.
- the first node 101 is a fog RAN node in this example. It may be assumed for the purposes of this example that the fog RAN node is taking actions based on the variable values. If there is an anomaly in a sensor reading, the fog node will take bad decisions because of the anomaly and it will reduce the efficiency of the communications network 10. If the data is identified as an anomaly, the first node 101 may need to wait for a next data sample to arrive or use a past sample. Both of these options are inefficient and decrease the efficiency of the process.
- the first node 101 may enable efficient actions to be taken with ultra-low latency. This may be considered best suited for 5G networking and applications, which may, e.g., have ultra-low latency requirements.
- Figure 7 is a schematic illustration depicting a non-limiting example of a method performed by the first node 101, according to Example 1.
- the diagram shows that there may be four steps the cognitive system of the first node 101 may perform to arrive at the solution.
- the monitor component 601 monitors all the log information of the data.
- the data collected comprises three sets of log files, where one log file corresponds to temperature data, another log file corresponds to light intensity data and another for carbon dioxide levels.
- the monitor component 601 sends the log data to the correlator component 603.
- the correlator component 603 correlates all the log files and merges them into a single file.
- the analyzer component 607 monitors the correlated log file and identifies the problems in it. For example, in this case the analyzer component 607 identifies a temperature increase, a carbon dioxide level increase and no problem with light intensity. In this way, the analyzer component 607 may arrive at the problem and send it to the executor component 611. Fourth, the executor component 611 may try to find a solution for the problem according to Actions 202-206.
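The correlation and analysis steps of this example may be sketched as merging the per-variable logs on their timestamps and then flagging variables whose readings rose. The log layout, the function names and the rise thresholds below are all assumptions made for illustration; the disclosure does not specify how an increase is detected.

```python
def correlate(logs):
    """Merge logs of the form {name: {timestamp: value}} into one record per timestamp."""
    timestamps = sorted(set().union(*(log.keys() for log in logs.values())))
    return [
        {"t": t, **{name: log.get(t) for name, log in logs.items()}}
        for t in timestamps
    ]


def find_increases(records, min_rise):
    """Return the variables whose last reading exceeds the first by min_rise[name]."""
    problems = []
    for name, rise in min_rise.items():
        # Skip timestamps at which this variable has no reading.
        values = [r[name] for r in records if r.get(name) is not None]
        if len(values) >= 2 and values[-1] - values[0] >= rise:
            problems.append(name)
    return problems
```

With hypothetical logs in which temperature and carbon dioxide rise while light intensity stays flat, `find_increases` would report the first two and omit the third, matching the outcome described for the analyzer.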
- the features for the problem are an increase in carbon dioxide levels and an increase in temperature levels.
- the artificial neural network 105 is trained only for temperature-related problems. If existing methods were to be followed, that is, following the existing fog scenario, this problem may be treated as a temperature increase problem. In this case, the artificial neural network 105 may classify it as a temperature-related problem, and give the corresponding solution. In contrast, according to embodiments herein, the first node 101 may be able to use the trained artificial neural network 105 to self-heal the problem. Now, the real problem may come when a new feature is input to the system. For example, the feature carbon dioxide content may be input to the artificial neural network 105.
- Since the artificial neural network 105 is not trained for it, it may try to give a mathematical solution for it, that is, either temperature or light, which may be understood to mean that false positives may be obtained. In this case, the new features which may be available to classify the new problem may be passed to the offline monitor for finding a new solution. In the current case, existing methods could return the problem as an issue with light. However, with the help of the method disclosed herein, the weights of the artificial neural network 105 may be retuned to suit the problem. The first node 101 may select the top 10 weights to tune.
- the artificial neural network 105 may correctly predict the problem as a temperature-related problem by first retuning itself, and execute a solution for it, even though the artificial neural network 105 may not have been previously trained for it.
- the weight values before the retuning and after the retuning in this particular non-limiting example are shown in the vectors below.
- the artificial neural network 105 may be understood to be self-sufficient and may be used for self-adaptation. It may be observed that sometimes a small change in tuning weights may destabilize the artificial neural network 105. However, the probability is very low, as the number of weights retuned may be considered to be very low when compared with the actual number of weights in the artificial neural network 105. Also, the change may be considered to be very small, e.g., in the decimal places. Another aspect to be noted is that the change in weights may allow the weights of the artificial neural network 105 to exceed 1. In this case, it may be necessary to restrict the weight value to 1, since a larger value may affect the stability of the artificial neural network 105.
- the artificial neural network 105 may be trained with existing features and solutions. For this, a multilayer perceptron with one hidden layer may be used. The artificial neural network 105 may be trained with a back propagation method using a gradient descent method. In the current example implementation, 10 new features were tested. These 10 new features were similar to existing features. When the trained artificial neural network 105 was used, it resulted in 4 false positives. In this case, the top 10 weights of the artificial neural network 105 were retuned, and this resulted in 1 false positive. Hence, it was observed that the retuning resulted in a lower number of false positives.
- only a few weights in the artificial neural network 105 may be changed, rather than changing all the weights. This may be understood to be done to ensure the stability of the artificial neural network 105. To verify this statement, all the weights of the artificial neural network 105 were tuned. In this case, the predictions went awry and it resulted in 10 false positives. That is, all the features were wrongly identified, and it resulted in a poor network. Hence, it may be concluded that the disclosed method to retune the weights reduces the number of false positives and may be used efficiently in any application.
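The retuning described above, in which only the top few weights are changed by a small amount and any weight that would exceed 1 is restricted to 1, may be sketched as follows. Ranking the weights by magnitude, the size of `max_delta` and the function name are assumptions made for this sketch; only the choice of 10 weights and the cap of 1 come from the text.

```python
import random


def retune_top_weights(weights, k=10, max_delta=0.01, cap=1.0, rng=None):
    """Return a copy of `weights` with the k largest-magnitude entries nudged.

    Ranking by magnitude and the value of `max_delta` are assumptions;
    the cap of 1 follows the stability remark in the text.
    """
    rng = rng or random.Random()
    # Indices of the k weights with the largest absolute value.
    top = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)[:k]
    tuned = list(weights)
    for i in top:
        nudged = tuned[i] + rng.uniform(-max_delta, max_delta)
        tuned[i] = min(nudged, cap)  # keep the weight from exceeding the cap
    return tuned
```

Leaving every weight outside the selected subset untouched is what keeps the change small relative to the whole network, which is the stability argument made above.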
- Mapping the solution to the new problem with existing features may be, in some examples, considered to be the work of the offline monitor component 615, referred to in Figure 7 as the rendering engine.
- the offline monitor component 615 may be understood to be equivalent to the monitor component 601, working offline. Since the increase in carbon dioxide maps to the increase in temperature, the solution of the temperature-related problem may be employed. In this case, the network correctly predicted the problem and correctly gave the solution.
- the rendering engine may give the solution and may pass it on to the executor, which may then perform the solution, and it may solve the problem in this case.
- the first node 101 may, at 703, send the information to the rendering engine component 615.
- the weights may be redefined, and the rendering engine component 615 may try to identify another solution.
- the new solution may be sent to the executor component 611 and it may be tested again. If the problem is not solved after the given number of iterations, for example, after 10 iterations, the first node 101, e.g., via the executor component 611, may, at 705, ask the monitor for more information/data from the system.
- Embodiments herein may provide one or more of the following technical advantage(s).
- Embodiments herein may be used effectively in mission-critical applications of 5G, which may automate their own self-adaptation and manage any failures.
- Embodiments herein may be understood to be able to reduce the latency of applications by addressing failure management, which may be understood as the most important factor in 5G applications.
- embodiments herein may facilitate building a self-adaptation mechanism for the applications, where it may automatically monitor and control the situation.
- embodiments herein may be deployed in an existing fog network setup to avoid the failure of fog nodes and optimize the environment and cost.
- the growth of IoT in the present industrial setup is explosive, inspiring, and untenable under current architectural approaches due to the introduction of a large number of 3PP devices and communication protocols.
- Fog computing adds a hierarchy of elements that distributes resources and services of computing, storage, control, and networking anywhere along the continuum from Cloud of things to meet these challenges in a high performance, open and interoperable way. It will also support multiple industry verticals and application domains, delivering intelligence and services to users and business. To enhance the performance of fog system, it is very important for a system or an application to recover from the problems by itself without manual intervention. This would help in avoiding the turnaround time to fix the problem without manual intervention. Based on the history for dynamic problems, it is highly important to possess the property of fault tolerance in fog computing.
- Figure 8 depicts two different examples in panels a) and b), respectively, of the arrangement that the first node 101 may comprise to perform the method actions described above in relation to Figure 2.
- the first node 101 may comprise the following arrangement depicted in Figure 8a.
- the first node 101 is configured to manage the artificial neural network 105.
- the first node 101 may be understood to be for handling a problem in the communications network 10.
- the first node 101 may be configured to be comprised in a fog node in the communications network 10.
- optional modules are indicated with dashed boxes.
- the first node 101 is configured to, e.g. by means of a determining unit 801 within the first node 101 configured to, determine, in the set of data configured to be collected from the communications network 10, the set of one or more features configured to define the problem in the operation of the communications network 10.
- the set of one or more features is configured to be: a) previously undetected in the communications network 10 by the first node 101, and b) lacking a corresponding set of one or more solutions.
- the first node 101 is configured to, e.g. by means of a training unit 802 within the first node 101 configured to, train the artificial neural network 105 to find the solution to the problem configured to be defined by the set of one or more features. This is configured to be performed by training the artificial neural network 105 with a modified pre-existing linear activation function.
- the modified pre-existing linear activation function is configured to adjust a pre-existing linear activation function by adding the factor.
- the added factor is configured to adjust the respective weight, respectively, of the subset of neurons configured to be comprised in the artificial neural network 105.
- the subset of neurons is configured to comprise the one or more neurons providing the highest output with the pre-existing linear activation function, according to the first threshold, when the pre-existing linear activation function is used to train the artificial neural network 105 with the set of data.
- the first node 101 may be further configured to, e.g. by means of the determining unit 801 within the first node 101 configured to, determine whether or not the solution to the problem is found based on the modified pre-existing linear activation function.
- the first node 101 may be further configured to, e.g. by means of an adjusting unit 803 within the first node 101 configured to, adjust, based on the result of the determination, the respective weight of at least the first neuron in the subset of neurons.
- the adjustment is configured to be by a random amount.
- the first node 101 may be further configured to, e.g. by means of a using unit 804 within the first node 101 configured to, use the artificial neural network 105 with the modified pre-existing linear activation function using the adjusted respective weight to find the solution to the problem defined by the set of one or more features.
- the first node 101 may be configured to, e.g. by means of an iterating unit 805 within the first node 101 configured to, iterate the determining of whether or not the solution to the problem has been found, the adjusting and the using, until one of: the solution is found or a number of iterations has exceeded the second threshold in the absence of having found a solution.
- the first node 101 may be further configured to, e.g. by means of the determining unit 801 within the first node 101 configured to, determine that the number of iterations has exceeded the second threshold in the absence of having found a solution.
- the first node 101 may be configured to, e.g. by means of a sending unit 806 within the first node 101 configured to, send, based on the determination that the number of iterations has exceeded the second threshold, the indication to the second node 102 in the communications network 10 based on the result of the determination.
- the second threshold may be configured to be 10.
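The determine, adjust and use cycle, bounded by the second threshold, may be sketched as the loop below. The callback `solves_problem`, the callback `notify_second_node`, the value of `max_delta` and the return format are hypothetical; the random adjustment and the limit of 10 iterations follow the configuration described above.

```python
import random


def search_for_solution(weights, subset, solves_problem, max_iterations=10,
                        max_delta=0.01, rng=None, notify_second_node=print):
    """Randomly adjust the weights in `subset` until the problem is solved.

    Returns (weights, iterations_used, solved). If the iteration limit is
    reached without a solution, an indication is sent to the second node.
    """
    rng = rng or random.Random()
    weights = list(weights)
    iterations = 0
    # Determine whether the current weights already yield a solution.
    while not solves_problem(weights):
        if iterations >= max_iterations:
            notify_second_node("no solution found within the iteration limit")
            return weights, iterations, False
        # Adjust each selected weight by a random amount, then try again.
        for i in subset:
            weights[i] += rng.uniform(-max_delta, max_delta)
        iterations += 1
    return weights, iterations, True
```

The escalation branch corresponds to sending the indication to the second node 102 when no solution has been found within the allowed number of iterations.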
- the subset of neurons may be configured to be identified based on the back-propagation process in the training of the artificial neural network 105.
- the subset of neurons may be configured to comprise 10 neurons.
- the first threshold may be configured to be based on the ranking of the neurons in the artificial neural network 105 according to their output with the pre-existing linear activation function.
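Selecting the subset of neurons by ranking their outputs may be sketched as below. The sketch assumes that the pre-existing linear activation is the identity applied to the weighted sum of the inputs plus a bias, and that the first threshold is expressed as a count `k` of top-ranked neurons; both are assumptions, as are the function names.

```python
def linear_activation(weights, inputs, bias=0.0):
    """Assumed pre-existing linear activation: weighted sum of inputs plus a bias."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def top_neurons(layer_weights, inputs, k=10):
    """Return the indices of the k neurons with the highest linear-activation output."""
    outputs = [linear_activation(w, inputs) for w in layer_weights]
    # Rank neuron indices from highest to lowest output.
    ranked = sorted(range(len(outputs)), key=lambda i: outputs[i], reverse=True)
    return ranked[:k]
```

The indices returned by `top_neurons` would identify the neurons whose weights are candidates for adjustment by the added factor.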
- the set of one or more features may comprise a previously unidentified feature by the pre-existing linear activation function, and after the solution to the previously unidentified feature is found by the artificial neural network 105 using the modified pre-existing linear activation function, the modified pre-existing linear activation function may be configured to be used to identify future problems in the communications network 10, based on the previously unidentified feature and the found solution.
- Other modules may be comprised in the first node 101.
- the embodiments herein in the first node 101 may be implemented through one or more processors, such as a processor 807 in the first node 101 depicted in Figure 8a, together with computer program code for performing the functions and actions of the embodiments herein.
- a processor as used herein, may be understood to be a hardware component.
- the program code mentioned above may also be provided as a computer program product, for instance in the form of a data carrier carrying computer program code for performing the embodiments herein when being loaded into the first node 101.
- One such carrier may be in the form of a CD ROM disc. Other data carriers, such as a memory stick, are however also feasible.
- the computer program code may furthermore be provided as pure program code on a server and downloaded to the first node 101.
- the first node 101 may further comprise a memory 808 comprising one or more memory units.
- the memory 808 is arranged to be used to store obtained information, store data, configurations, schedulings, and applications etc. to perform the methods herein when being executed in the first node 101.
- the first node 101 may receive information from, e.g., the set of IoT sensors, the database 130 and/or the second node 102, through a receiving port 809.
- the receiving port 809 may be, for example, connected to one or more antennas in the first node 101.
- the first node 101 may receive information from another structure in the communications network 10 through the receiving port 809. Since the receiving port 809 may be in communication with the processor 807, the receiving port 809 may then send the received information to the processor 807.
- the receiving port 809 may also be configured to receive other information.
- the processor 807 in the first node 101 may be further configured to transmit or send information to, e.g., the set of IoT sensors, the database 130, the second node 102, and/or another structure in the communications network 10, through a sending port 810, which may be in communication with the processor 807 and the memory 808.
- the units 801-806 may refer to a combination of analog and digital modules, and/or one or more processors configured with software and/or firmware, e.g., stored in memory, that, when executed by the one or more processors such as the processor 807, perform as described above.
- processors as well as the other digital hardware, may be included in a single Application-Specific Integrated Circuit (ASIC), or several processors and various digital hardware may be distributed among several separate components, whether individually packaged or assembled into a System-on-a-Chip (SoC).
- any of the units 801-806 described above may be respectively implemented as the processor 807 of the first node 101, or an application running on such processor.
- the methods according to the embodiments described herein for the first node 101 may be respectively implemented by means of a computer program 811 product, comprising instructions, i.e., software code portions, which, when executed on at least one processor 807, cause the at least one processor 807 to carry out the actions described herein, as performed by the first node 101.
- the computer program 811 product may be stored on a computer-readable storage medium 812.
- the computer-readable storage medium 812, having stored thereon the computer program 811, may comprise instructions which, when executed on at least one processor 807, cause the at least one processor 807 to carry out the actions described herein, as performed by the first node 101.
- the computer-readable storage medium 812 may be a non-transitory computer-readable storage medium, such as a CD ROM disc, or a memory stick.
- the computer program 811 product may be stored on a carrier containing the computer program 811 just described, wherein the carrier is one of an electronic signal, optical signal, radio signal, or the computer-readable storage medium 812, as described above.
- the first node 101 may comprise an interface unit to facilitate communications between the first node 101 and other nodes or devices, e.g., the second node 102, or any of the other nodes.
- the interface may, for example, include a transceiver configured to transmit and receive radio signals over an air interface in accordance with a suitable standard.
- the first node 101 may comprise the following arrangement depicted in Figure 8b.
- the first node 101 may comprise a processing circuitry 807, e.g., one or more processors such as the processor 807, in the first node 101 and the memory 808.
- the first node 101 may also comprise a radio circuitry 813, which may comprise e.g., the receiving port 809 and the sending port 810.
- the processing circuitry 807 may be configured to, or operable to, perform the method actions according to Figure 2, in a similar manner as that described in relation to Figure 8a.
- the radio circuitry 813 may be configured to set up and maintain at least a wireless connection with any of the other nodes in the communications network 10. Circuitry may be understood herein as a hardware component.
- embodiments herein also relate to the first node 101 operative to manage the artificial neural network 105.
- the first node 101 may be operative to operate in the communications network 10.
- the first node 101 may comprise the processing circuitry 807 and the memory 808, said memory 808 containing instructions executable by said processing circuitry 807, whereby the first node 101 is further operative to perform the actions described herein in relation to the first node 101, e.g., in Figure 2.
- any advantage of any of the embodiments may apply to any other embodiments, and vice versa.
- the expression “at least one of:” followed by a list of alternatives separated by commas, and wherein the last alternative is preceded by the “and” term, may be understood to mean that only one of the list of alternatives may apply, more than one of the list of alternatives may apply or all of the list of alternatives may apply.
- This expression may be understood to be equivalent to the expression “at least one of:” followed by a list of alternatives separated by commas, and wherein the last alternative is preceded by the “or” term.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Computer Networks & Wireless Communication (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Software Systems (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Computing Systems (AREA)
- General Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Molecular Biology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Quality & Reliability (AREA)
- Computer Hardware Design (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Telephonic Communication Services (AREA)
Abstract
A method performed by a first node (101) for handling a problem in a communications network (10). The first node (101) manages an artificial neural network (105). The first node (101) determines (201), in a set of data, a set of features defining a problem in an operation, which is previously undetected, and lacks a corresponding set of solutions. The first node (101) also trains (202) the artificial neural network (105) to find a solution to the problem, by training the artificial neural network (105) with a modified pre-existing linear activation function. The modified function adjusts a pre-existing function by adding a factor. The added factor adjusts a respective weight, respectively, of a subset of neurons comprised in the artificial neural network (105). The subset of neurons comprises neurons providing the highest output with the pre-existing function, when the pre-existing function is used to train the artificial neural network (105) with the set of data.
Description
FIRST NODE, AND METHOD PERFORMED THEREBY, FOR HANDLING A PROBLEM
IN A COMMUNICATIONS NETWORK
TECHNICAL FIELD
The present disclosure relates generally to a node and methods performed thereby for handling a problem in a communications network. The present disclosure further relates generally to a computer program product, comprising instructions to carry out the actions described herein, as performed by the node. The computer program product may be stored on a computer-readable storage medium.
BACKGROUND
Computer systems in a communications network may comprise one or more nodes, which may also be referred to simply as nodes. A node may comprise one or more processors which, together with computer program code may perform different functions and actions, a memory, a receiving and a sending port. A node may be, for example, a server.
Emerging technologies like the Internet of Things (IoT) may require latency-aware computation for real-time application processing. Such emerging technologies may also need to address self-adaptation, that is, self-healing, of a few of the problems that may be encountered in the course of operations. Data generated from IoT devices may generally be processed in a cloud infrastructure because of the on-demand services and scalability features of the cloud computing paradigm. However, processing IoT application requests exclusively on the cloud may not be considered an efficient solution for some IoT applications, especially time-sensitive mission-critical ones. A mission-critical application may be understood as an application which may require actions to be taken in a very minimal time, e.g., a driverless cars application, and it may require a low latency network. To address this issue, fog computing, which may be understood to reside in between cloud and IoT devices, was proposed. In general, in a fog computing environment, IoT devices may be connected to fog devices. A fog device may be understood as a low computation device which may act as an intermediate device between edge and cloud. These fog devices may be located in close proximity to users and may be responsible for intermediate computation and storage. Fog computing research is still in its infancy, and taxonomy-based investigation into the requirements of fog infrastructure, platform, and applications mapped to current research may be still required. Taxonomy-based investigation may be understood as a study of the dynamic nature of fog devices and a design of a network based on that.
The fog computing paradigm may be understood as a highly distributed heterogeneous platform where the probability of device failure is very high in comparison to the cloud. Fog device failure probability is always high because the fog may be understood to comprise multiple devices, given that fog devices may only be connected in a decentralized manner, and the management of fog devices may typically not be central. The devices may fail for many reasons, such as hardware failure, software failure, or because of user activity. Besides these problems, other factors such as connectivity, mobility, and power source may also play a big role. Most of the devices in a fog environment may be connected via wireless connections, and wireless connections may not always be reliable. Most of the devices that are connected via wireless are mobile, so these devices may change location to different clusters frequently. Fog devices may be understood to not keep a fixed structure. The structure may be understood to keep on changing and be dynamic in nature. Hence, the cluster of fog nodes may constantly change. One other characteristic of these devices may be that they may typically be battery powered and may fail anytime. Hence, dealing with the complex nature of failure may be very difficult.
Fault tolerance
Fault tolerance may be considered one requirement of fog infrastructure. Fault tolerance may be understood to allow a system to keep performing even when a part of the system has failed. This failure may be a software failure, a hardware failure, or a network failure. The solution for fault tolerance may result in a fully operational system, where the system may continue its operation with a lower capability, instead of shutting down totally.
Since the fog is evolving, no study has yet been done on fault tolerance in fog computing. However, fault tolerance has been mostly studied in the cloud computing paradigm [1, 2]. In the cloud environment, faults may typically be handled based on two different techniques: proactive fault tolerance and reactive fault tolerance, at either the workflow level or task level. Reactive fault tolerance techniques may be used to reduce the impact of failures on a system when the failures have occurred. Techniques based on this policy may be job migration, checkpoint and/or restart, replication, rollback and recovery, task resubmission, user-defined exception handling, and workflow rescue. Proactive fault tolerance may be understood to predict the faults proactively and replace the suspected components with other working components, thus avoiding recovery from faults and errors. Proactive fault tolerance may use self-adaptation, pre-emptive migration, and software rejuvenation, which may be understood to be a few of the proactive fault tolerance techniques.
Proactive fault tolerance may predict the faults proactively and may replace the suspected components by other working components, thus avoiding recovery from faults and errors.
Fault tolerance is mostly investigated in the cloud. However, it may be necessary to investigate fault tolerance in the fog as well. Although many research efforts have addressed the need to explore fault tolerance problems [3], [4], [5] in fog computing, none have investigated the issue based on trying to identify the root sources of the problems.
The main challenge of a 5G deployment may be considered to be automation on all levels of the 5G eco-system. Without it, the network may simply not work; neither for the anticipated scale nor for the desired functional complexity. In order to introduce automation on all levels in a 5G system, self-adaptation may be considered the most prominent feature that may need to be added. Considering the requirement of setting up fault tolerance in all the important fog nodes may be understood to generally increase the overhead, and lead into latency problems in 5G communication services. 5G is envisioned to support unprecedented diverse applications and services with extremely heterogeneous performance requirements, such as mission critical IoT communication, massive machine-type communication and Big data management in mobile connectivity.
In the 5G communication era, the operators may collaborate with application and/or service providers to provide better quality of IoT services by providing self-adaptation for a few latency mission critical applications. First, the importance of initiating self-adaptation in fog nodes to address latency critical 5G applications will be discussed.
Self-adaptation
Proactive Fault Tolerance using self-adaptation is a way of controlling the failure of an instance of an application running on multiple virtual machines automatically. But generally, self-adaptation features may be implemented to mimic the animal self-adaptation process to bring automation to the system. Self-adaptation may be introduced for lifelong learning to update the features to perform self-adaptation on a continuous basis. Recently, a new technique known as Plastic Neural Networks has been introduced, the aim of which may be understood to be to autonomously design and create learning systems. It may also introduce lifelong learning to the existing systems by bootstrapping learning from scratch, recovering performance in unseen conditions, testing the computational advantages of neural components, and deriving hypotheses on the emergence of biological learning. But Plastic Neural Networks may not be directly used for fog networking due to the presence of the Hebbian term in the plastic network. The Hebbian term is designed for biological network space and may not be used for engineering applications directly.
Sustainability in fog computing may be understood to optimize its economic and environmental interest to a great extent. However, the overall sustainable architecture of fog computing is subject to many issues such as assurance of Quality of Service (QoS), service reusability, energy efficient resource management, etc. On the other hand, reliability in fog computing may be discussed in terms of consistency of fog nodes, availability of high performance services, secured interactions, fault tolerance, etc. In the existing literature, only a very narrow discussion of sustainable and reliable fog computing has been provided.
It may be understood that there may be a need for a fog Computing Framework for 5G Networking. Many research studies recently explored cellular infrastructure, especially 5G networks, that is LTE Advanced (LTE-A) for developing fog computing in different applications. The main idea of utilizing a cellular infrastructure for fog computing is that it explores the use of hierarchical architecture, e.g., where the cloud may be connected to fog devices in a tree like structure, and where fog devices may also be connected to another set of fog devices in a tree structure. One of the primary uses of an LTE-A network for the purpose of high speed communication is for the signal processing activities through fog Radio Access Networks. Peng et al. [6] have proposed a RAN architecture for 5G systems based on fog computing which is an effective extension of a cloud based RAN. It may be used to reduce the front haul load and delay with the help of using virtualized baseband processing units. The edge processing and virtualization are the most efficient aspects in the context of 5G networks. Recently, fog based catching at the edge devices in radio access network has been explored and it may be used to identify the optimal catching along with front haul and edge transmission policies. Catching may be understood as a mechanism in which the edge devices may be connected to a fog network. Moreover, 5G systems may be understood to need more latency-sensitivity than the 4G systems. Fog computing is being applied in 5G systems to minimize the delay which includes communication and computing delay. Another issue to be addressed in using fog computing for 5G applications is the load balancing. Fog computing may be able to provide low latency interactions between machine to machine communications. Hence, it may be noted that the 5G based cellular system and the fog computing framework are very much related to each other in terms of compatibility compared to cloud computing.
However, given the high probability of device failure in fog computing, its sustainability and reliability pose challenges for its use in 5G networks.
SUMMARY
It is an object of embodiments herein to improve the handling of a problem in a communications network. It is a further object of embodiments herein to improve the handling of a problem in fog nodes in a communications network.
According to a first aspect of embodiments herein, the object is achieved by a method, performed by a first node. The method is for handling a problem in a communications network. The first node manages an artificial neural network. The first node determines, in a set of data collected from the communications network, a set of one or more features. The set of one or more features defines a problem in an operation of the communications network. The set of one or more features is previously undetected in the communications network by the first node. The set of one or more features is also lacking a corresponding set of one or more solutions. The first node then trains the artificial neural network to find a solution to the problem defined by the set of one or more features. This is done by training the artificial neural network with a modified pre-existing linear activation function. The modified pre-existing linear activation function adjusts a pre-existing linear activation function by adding a factor. The added factor adjusts a respective weight, respectively, of a subset of neurons comprised in the artificial neural network. The subset of neurons comprises one or more neurons providing the highest output with the pre-existing linear activation function, according to a first threshold. This is when the pre-existing linear activation function is used to train the artificial neural network with the set of data.
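As a non-limiting illustration of the first aspect, the following minimal Python sketch shows one way the selection of the subset of neurons and the added factor could be arranged. It is a sketch under stated assumptions and not the claimed implementation: the pre-existing linear activation is taken to be the identity, the "highest output" is taken as the mean output of each neuron over the set of data, and the factor is assumed to be added to each weight of the selected neurons only. The names `neuron_outputs`, `select_top_neurons` and `modified_linear` are hypothetical.

```python
from typing import List, Sequence, Set


def linear(z: float) -> float:
    """Pre-existing linear activation, assumed here to be the identity."""
    return z


def neuron_outputs(weights: Sequence[Sequence[float]],
                   samples: Sequence[Sequence[float]]) -> List[float]:
    """Mean output of each neuron over the data, using the linear activation."""
    outputs = []
    for w in weights:
        total = 0.0
        for x in samples:
            total += linear(sum(wi * xi for wi, xi in zip(w, x)))
        outputs.append(total / len(samples))
    return outputs


def select_top_neurons(outputs: Sequence[float],
                       first_threshold: float) -> Set[int]:
    """Indices of the neurons whose output reaches the first threshold."""
    return {i for i, o in enumerate(outputs) if o >= first_threshold}


def modified_linear(weights: Sequence[Sequence[float]],
                    x: Sequence[float],
                    subset: Set[int],
                    factor: float) -> List[float]:
    """Linear activation with a factor added to the weights of the subset only.

    Assumption: the factor is added to every weight of a selected neuron;
    the weights of all other neurons are left untouched.
    """
    result = []
    for i, w in enumerate(weights):
        shift = factor if i in subset else 0.0
        result.append(linear(sum((wi + shift) * xi for wi, xi in zip(w, x))))
    return result


weights = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
samples = [[2.0, 0.0], [4.0, 0.0]]
outputs = neuron_outputs(weights, samples)
subset = select_top_neurons(outputs, first_threshold=2.0)
adjusted = modified_linear(weights, [2.0, 0.0], subset, factor=0.5)
```

In this toy example only the first neuron reaches the threshold, so only its weights are shifted, and the remaining neurons behave exactly as with the unmodified function.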
According to a second aspect of embodiments herein, the object is achieved by the first node. The first node may be considered to be for handling the problem in the communications network. The first node is configured to manage the artificial neural network. The first node is further configured to determine, in the set of data configured to be collected from the communications network, the set of one or more features configured to define the problem in the operation of the communications network. The set of one or more features is configured to be: a) previously undetected in the communications network by the first node, and b) lacking a corresponding set of one or more solutions.
The first node is also configured to train the artificial neural network to find the solution to the problem configured to be defined by the set of one or more features. This is configured to be performed by training the artificial neural network with the modified pre-existing linear activation function. The modified pre-existing linear activation function is configured to adjust the pre-existing linear activation function by adding the factor. The added factor is configured to adjust the respective weight, respectively, of the subset of neurons configured to be comprised in the artificial neural network. The subset of neurons is configured to comprise the one or more neurons providing the highest output with the pre-existing linear activation function, according to the first threshold, when the pre-existing linear activation function is used to train the artificial neural network with the set of data.
According to a third aspect of embodiments herein, the object is achieved by a computer program, comprising instructions which, when executed on at least one processor, cause the at least one processor to carry out the method performed by the first node.
According to a fourth aspect of embodiments herein, the object is achieved by a computer-readable storage medium, having stored thereon the computer program, comprising instructions which, when executed on at least one processor, cause the at least one processor to carry out the method performed by the first node.
By the first node determining the previously undetected set of one or more features lacking a corresponding set of one or more solutions, and then training the artificial neural network to find the solution to the problem with the modified pre-existing linear activation function, the first node is enabled to automate its own self-adaptation and manage any failures that it may not have previously encountered. Furthermore, the first node may be enabled to address failure management with reduced latency of applications. This may be particularly useful in fog nodes, improving the functioning of the communications network and its cost. In some aspects, embodiments herein may be understood to introduce proactive fault tolerance measures capable of being deployed within fog networking by running a modified plastic neural network which may introduce lifelong learning in performing self-adaptation to enhance and achieve expected reach in mission critical 5G applications.
BRIEF DESCRIPTION OF THE DRAWINGS
Examples of embodiments herein are described in more detail with reference to the accompanying drawings, according to the following description.
Figure 1 is a schematic diagram illustrating two non-limiting examples of a communications network, according to embodiments herein.
Figure 2 is a flowchart depicting a method in a first node, according to embodiments herein.
Figure 3 is a schematic diagram depicting aspects of the method performed by the first node, according to embodiments herein.
Figure 4 is a schematic diagram depicting other aspects of the method performed by the first node, according to embodiments herein.
Figure 5 is a schematic diagram of an example of the first node, according to embodiments herein.
Figure 6 is a schematic diagram depicting aspects of the method performed by the first node, according to embodiments herein.
Figure 7 is a schematic diagram depicting aspects of the method performed by the first node, according to embodiments herein.
Figure 8 is a schematic block diagram illustrating embodiments of a first node, according to embodiments herein.
DETAILED DESCRIPTION
As part of the development of embodiments herein, one or more problems with the existing technology will first be identified and discussed.
Further research in the area of sustainable and reliable fog computing is highly recommended for the desired performance of fog computing.
In general, any fog self-adaptation problem may need to handle two cases: (i) first, it may need to be able to adapt to existing problems, for which the model may already be trained, and (ii) second, it may need to be able to adapt to new problems. The first problem is very much discussed in literature and numerous methods are available for the same. However, the second problem seems much more complex and, there are no such existing methods for the same. To be able to adapt to new problems, the systems may need to behave like human beings, where they may need to try to identify the solutions for the new problems. This may be also known as lifelong learning since fog nodes may need to understand the different sets of problems they are likely to face over the entire life.
Neural networks may be understood to recognize problems as a set of features. A problem in this context may be, for example, a classification problem, where the set of features may need to be categorized based on a pattern of the features. A feature may be understood as a variable or parameter, such as, e.g., sensor data. In general, any neural network may perform methodically for the data with similar features on which the network may have been trained. However, when new features are encountered, the network may fail and tend to give wrong results. A result may be considered a wrong result when the identified category is different from that of the pattern observed in the features. A plastic neural network has been recently proposed [7], which may be considered to have a similar function to the functioning of the plasticity in the human brain. However, the problem with the method proposed in [7] is that it is computationally intractable, as the computational cost may be understood to increase exponentially with the number of features. Moreover, the method proposed in [7] requires knowledge of the domain in which the self-adaptation is performed, that is, knowledge to understand the Hebbian term.
In some cases, if the network is trained on the existing problems, and then a new feature is encountered, the network may classify it as one of the old features it may have encountered in the past, absent a more likely alternative. This will generate many false positives in the classification problem. However, in the case of self-adaptation, the number of false positives may be understood to be desired and expected to be as low as possible. Otherwise, the system may loop without end and may require human intervention to solve the problems it may be facing.
For example, if there are ten false positives in the output of a classifier designed for self-healing, the system may need to try ten different solutions, that is, ten different times, before getting an actual solution. This is time consuming considering the self-healing scenario, since results are expected to be delivered in the order of milliseconds. Hence, reducing the number of false positives may be understood to play a relevant role in designing any self-healing system.
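The cost of false positives described above may be made concrete with a small arithmetic sketch in Python. It assumes, purely for illustration, that every false positive is tried once before the actual solution, and that each attempt takes the same hypothetical time `attempt_ms`; neither value comes from the disclosure.

```python
def worst_case_recovery_ms(false_positives: int, attempt_ms: float) -> float:
    """Worst-case time to reach the actual solution.

    Assumption: each false positive is tried once and fails, after which
    one further attempt applies the actual solution.
    """
    if false_positives < 0 or attempt_ms < 0:
        raise ValueError("false_positives and attempt_ms must be non-negative")
    return (false_positives + 1) * attempt_ms


# Ten false positives at a hypothetical 5 ms per attempt.
with_ten = worst_case_recovery_ms(10, 5.0)
# No false positives: only the actual solution is applied.
with_none = worst_case_recovery_ms(0, 5.0)
```

Under these assumptions the recovery time grows linearly with the number of false positives, which is why a budget in the order of milliseconds leaves little room for wrong candidates.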
Certain aspects of the present disclosure and their embodiments address the one or more of the issues with the existing methods and provide solutions to the challenges discussed earlier. In general terms, embodiments herein may be understood to be drawn to introducing a new method to decrease false positives in a neural network. In particular, and owing to all the reasons discussed in the preceding paragraphs, embodiments herein may be understood to be drawn to a modified method for self-adaptation.
More particularly, embodiments herein may be understood to be drawn to a new method in a computer system managing an artificial neural network to understand the features of learned problems, which may be considered to belong to a first category out of the categories in which the set of patterns may be divided, where the first category here is the true category of the problem, and to then use them to find similarity with a new problem. Once the similarity is computed, a new solution may be proposed by taking all the solutions of identified similar problems.
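The idea of finding similarity with learned problems and pooling their solutions may be sketched as follows. This is an illustrative Python sketch only: the disclosure does not name a similarity measure in this passage, so cosine similarity, the threshold `min_similarity`, and the example solution names are all assumptions.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def propose_solutions(new_features: Sequence[float],
                      learned: Sequence[Tuple[Sequence[float], Sequence[str]]],
                      min_similarity: float) -> List[str]:
    """Collect, without duplicates, the solutions of all similar learned problems."""
    proposed: List[str] = []
    for features, solutions in learned:
        if cosine_similarity(new_features, features) >= min_similarity:
            for solution in solutions:
                if solution not in proposed:
                    proposed.append(solution)
    return proposed


# Hypothetical learned problems: (feature vector, known solutions).
learned_problems = [
    ([1.0, 0.0], ["restart_service"]),
    ([0.0, 1.0], ["migrate_task"]),
    ([1.0, 1.0], ["restart_service", "reroute_traffic"]),
]
candidates = propose_solutions([1.0, 0.9], learned_problems, min_similarity=0.8)
```

Here only the third learned problem is close enough to the new feature vector, so its two solutions form the proposed candidate set.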
In some further particular aspects, embodiments herein may be understood to be drawn to a new loop inside a fog node to monitor the environment in 5G applications. Embodiments herein may be understood to be aimed to design a proactive fault tolerance methodology, such as a self-adaptive methodology, within fog nodes, for mission critical 5G applications. In order to handle faults in fog computing, it may be understood to be relevant to consider a fault at every step, not only for processing, but also for the transmit-and-receive process. The failure handling method may be proactive, which may be understood to mean that it may always monitor the host and may continuously try to predict the chances of failure. If the prediction becomes true, the system may look up other available resources and then migration may be performed to prevent the problem from having more damaging consequences for the functioning of the network. To employ such a technique in fog computing, further investigation may be needed because the types of device in the fog may be diverse. Because of the unstable nature of failures and heterogeneous characteristics, a hybrid failure handling method combining several methods may be more appropriate for the fog computing environment.
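One iteration of such a proactive loop may be sketched in Python as below. The sketch assumes that a failure probability for the host is already available from some predictor, that a fixed `threshold` decides when the prediction counts as true, and that resources are described by their spare capacity; these names and the selection rule are hypothetical and not taken from the disclosure.

```python
from typing import Dict, Optional


def pick_target(resources: Dict[str, float], needed: float) -> Optional[str]:
    """Name of the resource with the most spare capacity that can fit the load."""
    candidates = {name: cap for name, cap in resources.items() if cap >= needed}
    if not candidates:
        return None
    return max(candidates, key=lambda name: candidates[name])


def proactive_step(failure_probability: float,
                   threshold: float,
                   resources: Dict[str, float],
                   needed: float) -> Optional[str]:
    """One monitoring iteration.

    Returns the migration target when a failure is predicted and a suitable
    resource exists, otherwise None (no migration is performed).
    """
    if failure_probability < threshold:
        return None  # Host looks healthy; keep monitoring.
    return pick_target(resources, needed)


# Hypothetical spare capacities of neighbouring fog devices.
spare = {"fog-a": 0.2, "fog-b": 0.6}
target = proactive_step(0.9, threshold=0.7, resources=spare, needed=0.3)
```

A hybrid method, as suggested above, could combine this step with a reactive fallback for the case where no target is found.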
In some aspects, embodiments herein may be understood to introduce proactive fault tolerance measures within fog networking by building a cognitive fog framework and running a modified plastic neural network which may introduce lifelong learning in performing self-adaptation to enhance and achieve expected reach in mission critical 5G applications. Further particularly, embodiments herein may be understood to propose a new system to introduce a proactive fault tolerance method within a cognitive fog framework by running a plastic neural network inside a fog node. The method may be understood to introduce lifelong self-adaptation in fog nodes, and thereby enhance the performance in mission critical latency-aware 5G applications.
In some aspects, embodiments herein may be understood to comprise two steps. In a first step, a new cognitive fog framework layer and relevant taxonomy in fog Networking may be introduced. The taxonomy discussed in [1] may be used to understand different hierarchies in a fog network and their connections, since the fog network may be continuously changing. This understanding may assist the system of embodiments herein to introduce peer-to-peer associations to solve a problem. In the first stage, the cognitive fog framework layer may be built inside the fog node with the application of both Machine Learning (ML) and neural learning, to introduce lifelong learning, such that the proposed system may address fault tolerance automatically for any new faulty conditions. First, a ML method may be trained to understand the problems and perform fault tolerance. In a second step, the fog node may be enabled to self-adapt for any failures based on lifelong learning. The lifelong learning may be implemented with the application of a plastic neural network, which may continuously learn and establish self-adaptation for any new faulty situations. This may be achieved by changing weights of some of the neurons in an artificial neural network managed by the node, rather than the whole neuronal network, in order to decrease the number of false positives, and to maintain the stability of the network.
According to the foregoing, particular embodiments herein may be understood to relate to a system and method to empower cognitive fog networking for proactive fault tolerance in 5G applications.
Several embodiments and examples are comprised herein. It should be noted that the embodiments and/or examples herein are not mutually exclusive. Components from one embodiment or example may be tacitly assumed to be present in another embodiment or example and it will be obvious to a person skilled in the art how those components may be used in the other exemplary embodiments and/or examples.
Figure 1 depicts two non-limiting examples, in panels “a” and “b”, respectively, of a communications network 10, in which embodiments herein may be implemented. In some example implementations, such as that depicted in the non-limiting example of Figure 1a), the communications network 10 may be a computer network. In other example implementations, such as that depicted in the non-limiting example of Figure 1b), the communications network 10 may be implemented in a telecommunications network 100, sometimes also referred to as a cellular radio system, cellular network or wireless communications system. In some examples, the telecommunications network 100 may comprise network nodes which may serve receiving nodes, such as wireless devices, with serving beams.
In some examples, the telecommunications network 100 may for example be a network such as a 5G system, or Next Gen network or an Internet service provider (ISP)-oriented network that may support an SCEF. The telecommunications network 100 may also support other technologies, such as a Long-Term Evolution (LTE) network, e.g. LTE Frequency Division Duplex (FDD), LTE Time Division Duplex (TDD), LTE Half-Duplex Frequency Division Duplex (HD-FDD), LTE operating in an unlicensed band, Wideband Code Division Multiple Access (WCDMA), Universal Terrestrial Radio Access (UTRA) TDD, GSM/Enhanced Data Rate for GSM Evolution (EDGE) Radio Access Network (GERAN) network, Ultra-Mobile Broadband (UMB), EDGE network, network comprising of any combination of Radio Access Technologies (RATs) such as e.g. Multi-Standard Radio (MSR) base stations, multi-RAT base stations etc., any 3rd Generation Partnership Project (3GPP) cellular network, Wireless Local Area Network/s (WLAN) or WiFi network/s, Worldwide Interoperability for Microwave Access (WiMax), IEEE 802.15.4-based low-power short-range networks such as IPv6 over Low-Power Wireless Personal Area Networks (6LowPAN), Zigbee, Z-Wave, Bluetooth Low Energy (BLE), or any cellular network or system.
The communications network 10 comprises a plurality of nodes, whereof a first node 101 and a second node 102 are depicted in Figure 1. The first node 101 and the second node 102 may be understood, respectively, as a first computer system or server and a second computer system or server. Any of the first node 101 and the second node 102 may be implemented as a standalone server in e.g., a host computer in the cloud 110, as depicted in the non-limiting example of Figure 1b). In other examples, any of the first node 101 and the second node 102 may be a distributed node or distributed server, such as a virtual node in the cloud 110, and may perform some of its respective functions locally, e.g., by a client manager, and some of its functions in the cloud 110, by e.g., a server manager. In other examples, any of the first node 101 and the second node 102 may perform its functions entirely on the cloud 110, or partially, in collaboration or collocated with a radio network node. Yet in other examples, any of the first node 101 and the second node 102 may also be implemented as a processing resource in a server farm. Any of the first node 101 and the second node 102 may be under the ownership or control of a service provider, or may be operated by the service provider or on behalf of the service provider.
Any of the first node 101 and the second node 102 may be a core network node, such as, e.g., a Serving General Packet Radio Service Support Node (SGSN), a Mobility Management Entity (MME), a positioning node, a coordinating node, a Self-Optimizing/Organizing Network (SON) node, a Minimization of Drive Test (MDT) node, etc. In 5G, for example, any of the first node 101 and the second node 102 may be located in the OSS (Operations Support Systems).
The first node 101 may be understood to have the capability to perform machine-implemented learning procedures, which may be also referred to as “machine learning”. The model used for prediction may be understood as a predictive model, e.g., a predictive regression model such as Random Forest. In some embodiments, the system that may be used for training the model and the one used for prediction may be different. The system used for training the model may require more computational resources than the one to use the built/trained model to make predictions.
Particularly, the first node 101 may be understood to have a capability to manage an artificial neural network 105. The artificial neural network 105 may be understood as a machine learning framework, which may comprise a collection of connected nodes, where each node, or perceptron, may be an elementary decision unit. Each such node may have one or more inputs and an output. The input to a node may be from the output of another node or from a data source. Each of the nodes and connections may have certain weights or parameters associated with it. In order to solve a decision task, the weights may be learnt or optimized over a data set which may be representative of the decision task. The most commonly used node may have each input separately weighted, and the sum may be passed through a non-linear function which may be known as an activation function. The nature of the connections and the node may determine the type of the neural network, for example a feedforward network, recurrent neural network, etc. To have a capability to manage an artificial neural network 105 may be understood herein as having the capability to store the training data set and the models that may result from the machine learning, to train a new model, and once the model may have been trained, to use this model for prediction. In some embodiments, the system that may be used for training the model and the one used for prediction may be different. The system used for training the model may require more computational resources than the one to use the built/trained model to make predictions. Therefore, the first node 101 may, for example, support running Python/Java with TensorFlow, PyTorch, Theano, etc. The first node 101 may also have GPU capabilities.
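The commonly used node described above, in which each input is separately weighted and the sum is passed through an activation function, may be sketched in plain Python as follows. The bias term, the choice of a sigmoid as the non-linear function, and the function names are assumptions made for illustration; the sketch deliberately avoids any particular machine learning library.

```python
import math
from typing import Callable, List, Sequence


def sigmoid(z: float) -> float:
    """A common non-linear activation function (an illustrative choice)."""
    return 1.0 / (1.0 + math.exp(-z))


def node_output(inputs: Sequence[float],
                weights: Sequence[float],
                bias: float,
                activation: Callable[[float], float]) -> float:
    """Weight each input separately, sum, and pass the sum through the activation."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(weighted_sum)


def layer_output(inputs: Sequence[float],
                 layer_weights: Sequence[Sequence[float]],
                 biases: Sequence[float],
                 activation: Callable[[float], float]) -> List[float]:
    """Feedforward layer: every node sees the same inputs with its own weights."""
    return [node_output(inputs, w, b, activation)
            for w, b in zip(layer_weights, biases)]


single = node_output([1.0, 2.0], [0.5, -0.25], 0.0, sigmoid)
layer = layer_output([1.0, 2.0], [[0.5, -0.25], [0.0, 0.0]], [0.0, 0.0], sigmoid)
```

Chaining several such layers, with the output of one layer used as the input of the next, gives the feedforward arrangement mentioned in the text.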
In some particular embodiments, the first node 101 may be comprised in a fog node in the communications network 10.
The first node 101 may have access to a memory or a database 130, depicted on Figure 1b), which may comprise pre-existing predictive models 131 of problems, e.g., of the communications network 10, or of another network. The memory or database 130 may alternatively be comprised in the first node 101 itself.
The second network node 112 may be another core network node, as depicted in the non-limiting example of Figure 1a), a radio network node, such as the radio network node 150 described below, or a user equipment, such as the communication device 140 described below.
In some examples of the communications network 10, which are not depicted in Figure 1, the first network node 111 and the second network node 112 may be co-located, or be a same node.
The communications network 10 may comprise a plurality of communication devices, whereof a communication device 140 is depicted in the non-limiting example scenario of Figure 1. The communications network 10 may also comprise other communication devices. The communication device 140 may be a UE or a Customer Premises Equipment (CPE) which may be understood to be enabled to communicate data with another entity, such as a server, a laptop, a Machine-to-Machine (M2M) device, a device equipped with a wireless interface, or any other radio network unit capable of communicating over a wired or radio link in a communications system such as the communications network 10. The communication device 140 may also be, e.g., a mobile terminal, wireless device, wireless terminal and/or mobile station, mobile telephone, cellular telephone, or laptop, just to mention some further examples. The communication device 140 may be, for example, portable, pocket-storable, hand-held, computer-comprised, a sensor, camera, or a vehicle-mounted mobile device, enabled to communicate voice and/or data, via a RAN, with another entity, such as a server, a laptop, a Personal Digital Assistant (PDA), or a tablet computer, sometimes referred to as a tablet with wireless capability, or simply tablet, a Machine-to-Machine (M2M) device, a device equipped with a wireless interface, such as a printer or a file storage device, modem, Laptop Embedded Equipped (LEE), Laptop Mounted Equipment (LME), USB dongles or any other radio network unit capable of communicating over a wired or radio link in the communications network 10. The communication device 140 may be enabled to communicate wirelessly in the communications network 10. The communication may be performed, e.g., via a RAN and possibly one or more core networks, comprised within the communications network 10.
The communications network 10 may comprise a plurality of radio network nodes, whereof a radio network node 150, e.g., an access node, is depicted in Figure 1 b). The communications network 10 may cover a geographical area, which in some embodiments may be divided into cell areas, wherein each cell area may be served by a radio network node, although one radio network node may serve one or several cells. The radio network node 150 may be, e.g., a gNodeB, that is, a transmission point such as a radio base station, for example an eNodeB, a Home Node B, a Home eNode B, or any other network node capable of serving a wireless device, such as the communication device 140 in the communications network 10. The radio network node 150 may be of different classes, such as, e.g., macro eNodeB, home eNodeB or pico base station, based on transmission power and thereby also cell size. In some examples, the radio network node 150 may serve receiving nodes with serving beams. The radio network node 150 may support one or several communication technologies, and its name may depend on the technology and terminology used. The radio network node 150 may be directly connected to one or more core networks in the communications network 10.
The first node 101 is configured to communicate within the communications network 10 with the second node 102 over a first link 161, e.g., a radio link, an infrared link, or a wired link. In the particular example of Figure 1 b), the first node 101 is configured to communicate within the communications network 10 with the second node 102 over a second link 162 with the radio network node 150, e.g., a radio link, an infrared link, or a wired link, which in turn may be configured to communicate within the communications network 10 with the communication device 140 over a third link 163, e.g., a radio link, an infrared link, or a wired link. The first node 101 may be configured to communicate with the database 130 within the communications network 10 over a fourth link 164, e.g., a radio link, an infrared link, or a wired link.
Any of the first link 161, the second link 162, the third link 163 and the fourth link 164 may be a direct link or may be comprised of a plurality of individual links, wherein it may go via one or more computer systems or one or more core networks in the communications network 10, which are not depicted in Figure 1, or it may go via an optional intermediate network. The intermediate network may be one of, or a combination of more than one of, a public, private or hosted network; the intermediate network, if any, may be a backbone network or the Internet; in particular, the intermediate network may comprise two or more sub-networks, which is not shown in Figure 1.
In general, the usage of “first”, “second”, “third”, “fourth”, etc. herein may be understood to be an arbitrary way to denote different elements or entities, and may be understood to not confer a cumulative or chronological character to the nouns they modify.
Some of the embodiments contemplated herein will now be described more fully with reference to the accompanying drawings. Other embodiments, however, are contained within the scope of the subject matter disclosed herein. The disclosed subject matter should not be construed as limited to only the embodiments set forth herein; rather, these embodiments are provided by way of example to convey the scope of the subject matter to those skilled in the art.
Embodiments of a method, performed by the first node 101, will now be described with reference to the flowchart depicted in Figure 2. The method is for handling a problem in the communications network 10. The first node 101 manages an artificial neural network 105.
Several embodiments are comprised herein. In some embodiments all the actions may be performed. In some embodiments, one or more actions may be optional. In Figure 2, optional actions are indicated with dashed lines. It should be noted that the examples herein are not mutually exclusive. Components from one embodiment may be tacitly assumed to be present in another embodiment and it will be obvious to a person skilled in the art how those components may be used in the other exemplary embodiments. One or more embodiments may be combined, where applicable. All possible combinations are not described to simplify the description. Some actions may be performed in a different order than that shown in Figure 2.
Action 201
In order to enable the first node 101 to solve a problem which it may not have faced earlier, that is, in order to eventually self-adapt for any new problem, embodiments herein may be understood to provide a modified plastic neural network mechanism to understand the new problem and perform fault tolerance on its own. A new problem may be understood as a problem that may not have been encountered by the first node 101 before, and may therefore not be available within historical records available to the first node 101, e.g., in the database 130. In order to solve this new problem, in this Action 201, the first node 101 determines, in a set of data collected from the communications network 10, a set of one or more features. The set of one or more features defines a problem in an operation of the communications network 10. The set of one or more features is previously undetected in the communications network 10 by the first node 101. That is, the features are new. The fact that they are new may be understood to mean, not necessarily that each of the features in the set is new, but that the combination of features is new. At least one of the one or more features in the set may also be new, in some examples, but this is not necessary. The set of one or more features also lacks a corresponding set of one or more solutions.
The determining in this Action 201 may be understood as calculating, deriving or detecting.
The set of data collected from the communications network 10 may be, for example, the history of an application, which may be e.g., stored in the database 130. The database 130 may also be referred to herein as a knowledge data store. The history of the application may comprise log files that may be application logs, counter files, alarm files or any other file specific to the application.
The set of one or more features may be a set of one or more variables or parameters, such as, for example, sensor values of temperature and/or pressure, along with the time instants in which those values may have been obtained.
The first node 101 may perform the determining in this Action 201 by using different components to explore the scenario based on the root sources of the problem, as explained next. Each of the different components may be, for example, implemented in a distributed environment, e.g., by a plurality of processors, e.g., in the cloud 110.
a) Monitor
This component may be understood to be responsible for monitoring the history of an application to uncover any problems in the application system.
Through this component, the first node 101 may detect relevant log files. The monitor component may, for example, monitor all sensors attached to the first node 101. The input to this component may be one single configuration file for all devices that may be connected to each fog node. A configuration file may be understood to comprise several details, such as device identifier (ID), type, value, and the time instant the value may have been recorded. It may also comprise information such as tree structure, etc. The contents of the configuration file may have a path of the log files for each application, the pattern to search for in each log file for each application, iteration cycle, etc. The monitor component may use the configured files to search in the log files to detect problems in the system. That is, the first node 101 may compare the set of data collected with the configured pattern, to detect anomalies or indicators of problems in the operation of the communications network 10.
The output of this component may be an XML/JSON data object that may comprise all the relevant information surrounding the log, e.g., timestamp, service, counter, occurrence, etc., and which may be passed to the next component of the first node 101, a correlation component.
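As a minimal sketch of how the monitor component described above may be organized, the following Python fragment searches log lines for configured patterns and emits a JSON object. The configuration layout, the field names `service` and `pattern`, and the function name `scan_logs` are hypothetical and are not taken from the embodiments herein; log files are replaced by in-memory lists for brevity.

```python
import json
import re

# Hypothetical configuration: one entry per application, with the pattern to search for.
CONFIG = [
    {"service": "app-a", "pattern": r"ERROR|timeout"},
    {"service": "app-b", "pattern": r"link down"},
]

def scan_logs(config, logs):
    """Return one finding per service whose log lines match the configured pattern.

    `logs` maps a service name to a list of (timestamp, line) pairs.
    """
    findings = []
    for entry in config:
        regex = re.compile(entry["pattern"])
        hits = [(ts, line) for ts, line in logs.get(entry["service"], [])
                if regex.search(line)]
        if hits:
            findings.append({
                "service": entry["service"],
                "occurrence": len(hits),
                "timestamp": hits[-1][0],  # time of the most recent match
            })
    return findings

logs = {
    "app-a": [(1, "started"), (2, "ERROR disk full"), (3, "request timeout")],
    "app-b": [(1, "link up")],
}
# JSON object that may be handed to the correlation component.
report = json.dumps(scan_logs(CONFIG, logs))
```

In this sketch only `app-a` produces a finding, since no line of `app-b` matches its configured pattern.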
b) Correlator
This component may be understood to be responsible for constructing the Sphere of Interest (SoI) Data/Indicators using the input received from the monitor component. The SoI may be understood to be a template that represents the application in the form of spheres and/or layers which may help in tagging a problem with a particular domain of an application. While the configured pattern in the Monitor component may only identify the problem, in this component, solutions may be associated with each and every problem. The problem that may have occurred may be seen as a different pattern in the features. To solve the problem, the first node 101 may need to identify the pattern, and then use the corresponding solution to repair it. The solution may be domain specific or it may span various domains. The correlator may also update the knowledge data store, that is, the database 130, with the information. The SoI may have all the indicators associated with the problem monitored. It may be understood to be a stateless component, that is, it may be understood to not require any time information.
The input to this component may be a JSON/XML object received from the monitor component.
The output of this component may be the updated XML/JSON object that may then be passed to an analyzer component. Also, the SoI may be available with run time information.
c) Analyzer
This component may be understood to be responsible for analyzing all the SoI indicators and arriving at the potential problem(s), that is, the set of one or more features. For that to happen, the analyzer may maintain threshold values of each SoI indicator and may also maintain a table of information to map indicators to a particular problem. An indicator may be understood as an anomalous pattern in the data which may indicate the presence of an issue or problem. This table may be constructed at compile time, in real time, by learning, offline, or by a mixture of the above variants. The initial table may be constructed with the information relevant to problems already known to the application. The table may then be updated with the information learnt by the self-adaptation system. If the component is unable to derive the problem, it may notify the second node 102, e.g., a UX client, of the presence of the problem. The analyzer may also update the knowledge data store with the information. This component is stateless.
The input to this component may be a JSON/XML object received from the correlator component. The output of this component may be a list of the problems identified, e.g., problem IDs. The output may be the updated XML/JSON object passed to an executor component. Also, the SoI may be available with run time information, that is, the solution may also have some features such as time taken, etc.
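The mapping performed by the analyzer may be illustrated with the following sketch, in which indicators exceeding their thresholds are looked up in a table of known problems. The indicator names, threshold values, problem IDs and the function name `analyze` are assumptions made for illustration only; an unknown combination of indicators returns no problem ID, which corresponds to the case in which a notification may be sent.

```python
# Hypothetical thresholds per indicator, and a table mapping sets of
# active indicators to problem ids already known to the application.
THRESHOLDS = {"cpu_load": 0.9, "temperature": 80.0, "error_count": 5}
PROBLEM_TABLE = {
    frozenset({"cpu_load", "error_count"}): "P1",
    frozenset({"temperature"}): "P2",
}

def analyze(indicators):
    """Return (problem_id, active_indicators); problem_id is None when unknown."""
    active = frozenset(
        name for name, value in indicators.items()
        if value > THRESHOLDS.get(name, float("inf"))
    )
    return PROBLEM_TABLE.get(active), active

# A known combination of indicators maps to a known problem.
known, _ = analyze({"cpu_load": 0.95, "temperature": 40.0, "error_count": 9})
# A combination never seen before yields no problem id.
unknown, active = analyze({"cpu_load": 0.95, "temperature": 85.0, "error_count": 1})
```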
d) Executor
This component may be understood to be responsible for fetching the solutions mapped for the Problem IDs that may have been reported by the analyzer. In general, there may be more than one known solution for the problem reported by the analyzer. The order in which the solutions may have to be applied depends on the characteristics of each solution and the rank of the solution. Examples of characteristics of a solution may include the time at which the solution may need to be applied, the range of Central Processing Unit (CPU) load at which the solution may need to be applied, the cooling period, whether the solution may need cloud approval, etc. The ranking of the solution may be assigned based on the success ratio of the problem resolution when the solution may have been applied previously. Based on these two factors, the solution may be chosen and applied. When a solution is applied for a problem, no other solution may be applied for another occurrence of the same problem until its cooling period has elapsed. This may be done to keep the system lightweight without adding unnecessary complexity. Due to this, the executor may be understood to be a stateful component, that is, it may be understood to require time information. The executor may also update the knowledge data store with the information. As was described for the analyzer component, if there is no known solution for the reported problem identifiers, the first node 101 may just notify the presence of the problem using the UX client.
In general terms, the input to this component may be a set of problem identifiers received from the analyzer component. The output of this component may be either based on an applied solution waiting in its cooling period, or a notification towards the UX interfaces. As stated earlier, in the embodiments herein, the first node 101 in this Action 201, e.g., via the executor component, may conclude that the set of one or more features lacks a corresponding set of one or more solutions.
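The stateful behaviour of the executor may be sketched as follows. This is a simplified illustration under stated assumptions: the class name `Executor`, the keys `name`, `success_ratio` and `cooling_period`, and the return values are hypothetical, and the selection considers only the rank (success ratio), leaving aside the other characteristics mentioned above, such as CPU load range or cloud approval.

```python
import time

class Executor:
    """Stateful sketch: remembers until when each problem is in its cooling period."""

    def __init__(self, solutions):
        # solutions: problem id -> list of dicts with the hypothetical keys
        # "name", "success_ratio" and "cooling_period" (in seconds).
        self.solutions = solutions
        self.cooling_until = {}  # problem id -> time before which nothing is applied

    def handle(self, problem_id, now=None):
        now = time.time() if now is None else now
        candidates = self.solutions.get(problem_id)
        if not candidates:
            return "notify"   # no known solution: report via the UX client
        if now < self.cooling_until.get(problem_id, 0.0):
            return "cooling"  # a previously applied solution is still cooling down
        # Pick the solution with the best historical success ratio.
        best = max(candidates, key=lambda s: s["success_ratio"])
        self.cooling_until[problem_id] = now + best["cooling_period"]
        return best["name"]

executor = Executor({"P1": [
    {"name": "restart", "success_ratio": 0.6, "cooling_period": 30},
    {"name": "scale_out", "success_ratio": 0.8, "cooling_period": 60},
]})
```

With this sketch, a second occurrence of problem `P1` reported 10 seconds after the first is not acted upon, since the 60 second cooling period of the applied solution has not yet elapsed.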
Action 202
Having concluded that the set of one or more features lack a corresponding set of one or more solutions, in this Action 202, the first node 101 trains the artificial neural network 105 to find a solution to the problem defined by the set of one or more features. This is performed by training the artificial neural network 105 with a modified pre-existing linear activation function.
The pre-existing linear activation function may be understood as a function which may introduce non-linearity into the relation between the input and the output of the artificial neural network 105. This may be understood to ensure that the relation may be defined properly.
The training in this Action 202 may be performed with another component of the first node 101 , an offline monitor.
e) Offline Monitor
This component may be understood to be responsible for training the self-adaptation system and introducing a learning component of the self-adaptation system. The offline monitor component may, using machine learning, analyse the current trend in traffic and the pattern of events, and learn about the system utilization during different hours of a day, different days of a week, and so on. The offline monitor component may use the database 130, that is, the knowledge data store, to learn the system. Embodiments herein may make use of a modified plastic neural network to find the optimal solution for the new problem, that is, the set of the one or more features. This practice may be understood to inculcate lifelong learning in the system. The input may be the data objects historically stored by other components in the first node 101. As output, the component may, with or without manual approval, perform one of the following actions for new learnings: a) update the analyzer table with a new entry mapping identified problems and their indicators, or modify existing entries, and b) update the executor table with a new entry mapping identified solutions and their indicators, or modify existing entries.
After detecting the specific knowledge of the problem and other parameters, the offline monitor may run the modified plastic neural network model to find a solution for the problem as a next step.
The modified pre-existing linear activation function adjusts a pre-existing linear activation function by adding a factor. The added factor adjusts a respective weight, respectively, of a subset of neurons comprised in the artificial neural network 105.
The artificial neural network 105 may be understood to comprise a group of so-called “neurons”, which may have so-called “connections” among them. A neuron herein may be understood as a unit of computation. A neuron may have one or more inputs and an output. Each input to the neuron may be associated with a weight or connection. The sum of the products of the inputs with their corresponding weights may be passed through an activation function such as Rectified Linear Unit (ReLU), tanh, Leaky ReLU, softmax, etc. These activation functions may be understood to add non-linearity to the network. At each layer, a connection may determine which input may be fed to which neuron. And the weight associated with the connection may determine how important that input is for the computation of the connected neuron.
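The computation performed by a single neuron, as just described, may be sketched in a few lines of Python. The function names `relu` and `neuron` and the numerical values are chosen for illustration only.

```python
import math

def relu(z):
    """Rectified Linear Unit: passes positive values, clips negative ones to zero."""
    return max(0.0, z)

def neuron(inputs, weights, activation=relu):
    """Weighted sum of the inputs passed through an activation function."""
    z = sum(w * x for w, x in zip(weights, inputs))
    return activation(z)

out_relu = neuron([1.0, 2.0], [0.5, -1.0])              # weighted sum is -1.5
out_tanh = neuron([1.0, 2.0], [0.5, 0.25], math.tanh)   # weighted sum is 1.0
```

In the first call the weighted sum is negative, so the ReLU output is zero; in the second, the output is tanh(1.0).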
Any artificial neural network may compute an output vector y as a multiplication of weights of a network W and input features x. In mathematical terms, this may be represented, assuming linear activation functions, as:
y = Wx
The artificial neural network may learn the weights such that the predicted output may match with the actual output. As mentioned, the weights may be trained only on given features and output. However, for the case of new problems, the first node 101 may rely on the assistance of the trained model to help. Hence, the above learning pattern may be modified as:
y = Wx + Ux
where U may be considered as the weights of the artificial neural network 105 which may help in addressing the new set of problems. Here, the artificial neural network 105 may be understood to learn new things from the present incidents. These are referred to herein as stochastic weights. However, the choice of stochastic weights may depend on the training the artificial neural network 105 may require from the earlier patterns identification.
According to the foregoing, in some embodiments, the pre-existing linear activation function may have the formula: y=Wx. In some of such embodiments, the modified pre-existing linear activation function may have the following formula: y=Wx+Ux, U being the added factor.
For the computation of U, the procedure below may be followed.
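The effect of the added factor may be illustrated numerically with the following sketch, which evaluates y=Wx+Ux for an all-zero U and for a U with a single non-zero entry. The matrices, the input vector and the helper names `matvec` and `predict` are illustrative assumptions; pure Python lists are used instead of a numerical library to keep the example self-contained.

```python
def matvec(m, x):
    """Multiply matrix m (given as a list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]

def predict(w, u, x):
    """Compute y = Wx + Ux, which is equal to (W + U)x."""
    return [a + b for a, b in zip(matvec(w, x), matvec(u, x))]

W = [[1.0, 2.0], [0.0, 1.0]]
U_zero = [[0.0, 0.0], [0.0, 0.0]]    # U all zeros: the original network is recovered
U_sparse = [[0.0, 0.5], [0.0, 0.0]]  # only one weight is adjusted
x = [1.0, 2.0]

y_base = predict(W, U_zero, x)    # identical to Wx
y_adj = predict(W, U_sparse, x)   # only the first output changes
```

Here `y_base` is `[5.0, 2.0]` and `y_adj` is `[6.0, 2.0]`: adjusting a single entry of U changes only the output connected to that weight.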
First, the weights in a matrix U may be initialized to zero. Further, for any new problem that may come, the features of the problem may be checked. This will now be illustrated with an example.
Example
In this example, it is assumed that the artificial neural network 105 is trained to handle problems p1, p2 and p3, and that the specific combinations of features depicted in Table 1 create the problems previously detected in the communications network 10 by the first node 101, which are given in the corresponding rows of Table 1.
Table 1.
A problem may be understood to be identified by the features. In this example, it may now be assumed that the first node 101 encounters a new problem, wherein the new features of the problem are f2, f5 and f6, all individually known, but in a combination previously undetected in the communications network 10 by the first node 101 and lacking a corresponding set of one or more solutions. In this case, the current features do not match any of the known problems. In existing methods, an artificial neural network may identify the new problem as either p2 or p3. This is wrong and may lead to a waste of resources and an increase in false positive scores, that is, instances when the true category is false and the system has nevertheless predicted it as true. As an alternative, it may be possible to execute all these solutions one by one. Sometimes this may also lead to a waste of time and resources.
Embodiments herein provide a new method to solve this problem. This may comprise the generation of a new solution for a set of new features. The following procedure does this.
First, all the features and solutions associated with the new problem may be collected. For purposes of this illustrative example, it is assumed that there are known features and that they have known solutions s1, s2, s3, s4, s5, s6, one for each feature. An important assumption here is that each feature has only one solution. The artificial neural network 105 may have initially been constructed in this case to map the features and solutions. Use of a multi-layer perceptron may be assumed in this case. The artificial neural network 105 constructed may have three hidden layers and a sigmoid activation function. The artificial neural network 105 initially constructed may have the architecture shown in Figure 3.
The number of input nodes in the artificial neural network 105 may need to map the dimension of the feature vector. The number of output nodes may need to map the number of solutions in the data. In brief, this may be considered a categorization problem.
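A forward pass through such a multi-layer perceptron may be sketched as follows. The sketch assumes six input features to match the six solutions s1 to s6, a hidden layer width of 8, randomly initialized (untrained) weights and no bias terms; none of these values is specified in the embodiments herein, and the names `make_layer` and `forward` are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def make_layer(n_in, n_out, rng):
    """Random weight matrix with n_out rows and n_in columns."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]

def forward(layers, x):
    """Feed x through every layer, applying the sigmoid after each one."""
    for layer in layers:
        x = [sigmoid(sum(w * v for w, v in zip(row, x))) for row in layer]
    return x

rng = random.Random(0)
n_features, n_solutions, hidden = 6, 6, 8   # the hidden width is an assumption
sizes = [n_features, hidden, hidden, hidden, n_solutions]
# Three hidden layers plus the output layer give four weight matrices.
layers = [make_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

scores = forward(layers, [0, 1, 0, 0, 1, 1])   # a feature-presence vector
predicted = scores.index(max(scores))          # index of the proposed solution
```

The input vector has one entry per feature and the output has one score per solution, which reflects the dimensioning rule stated above.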
It may now be assumed that a new feature f7 is determined to be present in the data in Action 201. In this case, this information may be passed to the input of the artificial neural network 105. Hence, the output of the artificial neural network 105 may depend on the similarity of the feature to those present in the dataset. However, this may be understood to not be a simple classification problem, as the feature f7 may be totally different from the existing features.
If existing methods were used, the solution of the known problem with the most similar features would have been provided. However, there would be a problem with this approach: too many false positives would be produced. In the case of self-adaptation, a very low, or nil, number of false positives may be understood to be required. Otherwise, instead of self-adapting to the problem, another set of problems may be created. Hence, according to embodiments herein, in this Action 202, the artificial neural network 105 may be retrained using the procedure discussed for the plastic network. The stochastic weights discussed for the plastic network may be calculated using a random adjustment. The random adjustment may be performed only on a particular set of weights so as to ensure stability.
If the first node 101 were to adjust all the weights in the artificial neural network 105, the artificial neural network 105 may become unstable and yield poor results. Hence, the adjustment may be performed only on specific weights to ensure stability. The set of weights may be selected using the procedure below.
First, the trained model may be used on the existing set of features. In this model, the first node 101 may check which neurons in the artificial neural network 105 may be firing other neurons continuously for all the features. That is, the first node 101 may select the subset of neurons which may be continuously firing other neurons. In this way, the most dominant neurons may be identified. From another perspective, it may be seen that these neurons participate effectively in the classification.
As mentioned earlier, the added factor adjusts a respective weight, respectively, of a subset of neurons comprised in the artificial neural network 105. The subset of neurons comprises one or more neurons providing the highest output with the pre-existing linear activation function, according to a threshold, which is referred to herein as a first threshold, when the pre-existing linear activation function is used to train the artificial neural network 105 with the set of data.
In practice, it may be understood that the number of neurons is not restricted and may depend on the user. However, it may be remembered that selecting a high number of neurons may destabilize the system, while with a low number of neurons selected the procedure may not converge. Hence, the following procedure may be used to select the subset.
In some embodiments, the subset of neurons may be identified based on a back-propagation process in the training of the artificial neural network 105.
Usually, in back-propagation, when new input data arrives, the errors are computed and back-propagated towards the initial inputs to calculate the weights. During this calculation, some of the neurons may fire other neurons. This process may be repeated until all the data for training may have been used. In this way, it may be observed which neurons may fire others, that is, which neurons may be sending output to other neurons throughout the entire training process. In some examples, the first node 101 may select the top 10 neurons which fire other neurons most frequently. Hence, in some embodiments, the subset of neurons may comprise 10 neurons. In this way, the computational complexity of the method may be reduced. Overall, the first node 101 may then have 10 weights to adjust using the proposed method. The mathematical details behind this are discussed below.
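The selection of the most frequently firing neurons may be sketched as a simple counting step. How a firing event is detected and recorded during training is not detailed above, so the `firing_log` below, a list with one neuron identifier per observed firing event, is a hypothetical representation, as is the function name `select_dominant`.

```python
from collections import Counter

def select_dominant(firing_log, k=10):
    """Return the ids of the k neurons that fired most often during training."""
    counts = Counter(firing_log)
    return [neuron_id for neuron_id, _ in counts.most_common(k)]

# Hypothetical record of which neuron fired at each observed event.
firing_log = ["n3", "n1", "n3", "n2", "n3", "n1"]
subset = select_dominant(firing_log, k=2)
```

In this toy record, `n3` fires three times and `n1` twice, so the subset for k=2 is `["n3", "n1"]`; with k=10 the same function would return the top 10 neurons mentioned above.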
For illustration purposes, it may be assumed that the artificial neural network 105 has pre-trained weights W. The artificial neural network 105 may be trained for learned features and solutions. However, when a new feature is encountered, it will introduce false positives for the data, even though it may be close to an existing feature. In this case, the first node 101 may need to retrain the artificial neural network 105 in total. This may be understood to require adding the new feature and a corresponding solution. This may be considered to be difficult, and an automatic way of arriving at the solution may be required. For this purpose, the weights of the artificial neural network 105 may be retuned by adding some random numbers. For example, according to this, the new weights may be:
W + U
where U refers to the stochastic weights.
Two cases may be considered here. In a first case, all elements in matrix U are non-zero. This means that all the weights of the artificial neural network would be changed. In this case, the artificial neural network would be destabilized, which would mean that the predictions would become unreliable. That is, the number of false positives would increase.
In a second case, only some elements in matrix U are non-zero. This may be understood to mean that only some of the weights of the artificial neural network 105 may be changed. In this case, the artificial neural network 105 may perform well and the change of weights does not destabilize the artificial neural network 105. The predictions may still be reasonable, since only some weights may be understood to be adjusted. In this case, a decrease in the number of false positives may be observed when compared to the original network W.
In embodiments herein, the second case of weight selection may be understood to be chosen as the objective here may be understood to be to decrease the number of false positives.
After reducing the false positives, embodiments herein may be understood to provide a solution for the problem based on lifelong learning approach.
Then, the component which may actually apply the corrected solution, perform, may need to be executed as a final step in the FTC loop to illustrate the self-adaptation to any new problem relevant to the domain.
The first threshold may be based on a ranking of the neurons in the artificial neural network 105 according to their output with the pre-existing linear activation function.
The above procedure may be understood to imply that the weight matrix W corresponds to existing weights of the artificial neural network 105, and matrix U may be understood to correspond to stochastic weights for the artificial neural network 105. As explained already, the first node 101 may not change all the values of the matrix U, that is, all the values except some finite number of values will be zero. In the case of non-zero values of U, the artificial neural network 105 may predict with the new weights of (W+U). This may be understood to make the artificial neural network 105 learn the new features without training, which may then enable the reduction in the number of false positives.
Action 203
The set of neurons may be selected, and their weights may be adjusted at random with small change in weights. After every adjustment, a new set of solutions may be obtained, and each and every solution may be tested. In this Action 203, the first node 101 may determine whether or not the solution to the problem is found based on the modified pre-existing linear activation function.
If the problem is solved, that is, if it subsides, then it may be considered that the random weights were initialized correctly and hence the problem may be solved. To initialize may be understood as to assign certain weights at the beginning of training to the connections. However, if the problem is not solved, the first node 101 may perform Action 204.
Action 204
If in Action 203 the first node 101 determines that the problem is not solved, in this Action 204, the first node 101 may adjust, based on a result of the determination in Action 203, the respective weight of at least a first neuron in the subset of neurons. The adjustment may be by a random amount.
That is, if the problem is not solved, the first node 101 may go back to change the weights with another random quantity, and repeat this process until the problem is solved. In this way, the entire solution may be performed autonomously, without the need of any human to solve the problem.
Action 205
In this Action 205, the first node 101 may use the artificial neural network 105 with the modified pre-existing linear activation function using the adjusted respective weight to find the solution to the problem defined by the set of one or more features.
In some particular embodiments, the set of one or more features may comprise a previously unidentified feature by the pre-existing linear activation function. In some of such embodiments, after the solution to the previously unidentified feature is found by the artificial neural network 105 using the modified pre-existing linear activation function, the modified pre-existing linear activation function may be used to identify future problems in the communications network 10, based on the previously unidentified feature and the found solution.
Action 206
In this Action 206, the first node 101 may iterate the determining of Action 203 of whether or not the solution to the problem has been found, the adjusting of Action 204 and the using of Action 205 until one of: the solution is found or a number of iterations has exceeded another threshold, referred to herein as a second threshold, in the absence of having found a solution.
The above procedure may be repeated for a specific set of iterations, for example.
In some embodiments, the second threshold may be 10. If the problem is not solved within 10 iterations, then this may be understood as an unresolved new problem, and a human may be asked to solve it. However, most of the problems may be expected to be solved within 10 iterations. Human intervention may be needed only in very rare cases.
Mathematically, this may be represented as follows.
The weight matrix learned may be assumed to be W. This matrix may be understood to contain the columns shown below:
W = [input node weights, first layer weights, ..., output layer weights]
As per the above procedure, a subset of weights may be selected from the above weights. Let it be Ws. Now, after a first iteration, the weights may be updated as:
Ws + Rs
where Rs is the random change, which may be considered to be an input to the matrix of the weights. Now, the updated subset may be replaced in the matrix W and a classification may be performed, that is, the neural network 105 may be executed to obtain the solution. If a new category is predicted, it may be used to solve the problem. If the solution works, then the new model may also be used for future predictions. If the problem is not solved, another random set of weights may be chosen as:
Ws + Rs1
The same procedure may be followed until the problem is solved. As mentioned earlier, the procedure may be repeated up to 10 times.
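As a non-limiting illustration, the update Ws + Rs may be sketched in Python as follows. The function name perturb_subset, the parameter scale, the uniform distribution of the random change and the use of flat indices to denote the selected subset are assumptions made only for this sketch and are not specified herein.

```python
import numpy as np


def perturb_subset(W, subset_idx, scale=0.05, rng=None):
    """Return a copy of W in which only the selected entries receive a random change.

    W          : weight matrix of any shape
    subset_idx : flat indices of the selected subset Ws (assumed to be given)
    scale      : half-width of the uniform random change Rs (assumed value)
    """
    rng = np.random.default_rng() if rng is None else rng
    # Work on a C-ordered copy so that the caller's matrix is left untouched.
    W_new = np.array(W, dtype=float, order="C")
    flat = W_new.reshape(-1)  # view onto the copy
    R_s = rng.uniform(-scale, scale, size=len(subset_idx))
    flat[subset_idx] += R_s   # Ws + Rs, written back into the full matrix
    return W_new
```

Calling the function again with a fresh random draw corresponds to choosing the next change Rs1 when the first attempt does not solve the problem.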
Action 207
In this Action 207, the first node 101 may determine that the number of iterations has exceeded the second threshold in the absence of having found a solution.
Action 208
In this Action 208, the first node 101 may send, based on the determination that the number of iterations has exceeded the second threshold, an indication to a second node 102 in the communications network 10 based on a result of the determination. For example, if the problem is not solved within 10 iterations, then the first node 101 may send this output to a human, e.g., via the second node 102, for providing a new solution. When a new solution is created, the new solution may be added to the artificial neural network 105 and it may be retrained. In this way, the first node 101 may perform self-adaptation to most of the new problems.
Figure 3 is a schematic illustration depicting a first non-limiting example of the architecture the artificial neural network 105 may have. The artificial neural network constructed has three inputs x0, x1, xd and two outputs y1, y2. The output of the neural network is computed as shown in Figure 3.
As a brief summary of the foregoing in other words, the proposed methodology has the following steps:
1. Train the model on a set of known features and classes.
2. Check the number of most dominating neurons by looking at which neurons fire more continuously than other neurons.
3. Add, according to Actions 202 and 204, a random change of weight to the selected subset of neurons.
4. Use, according to Action 205, the updated model to classify the new feature of the problem.
5. If the solution provided by the artificial neural network 105 helps in solving the problem, the first node 101 may use it as a new feature to train the artificial neural network 105 again, which may be considered the optimal tuning.
6. If not, according to Action 206, the first node 101 may repeat steps 3 and 4 until the problem is solved.
7. The number of repetitions may be restricted to 10, ensuring, according to Action 207, that there may be a limited time to solve the problem.
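As a non-limiting illustration, steps 3 to 7 above may be sketched in Python as a bounded retry loop. The function name self_adapt and the three callbacks retune, classify and is_solved are hypothetical names introduced only for this sketch; how each of them is realized is not prescribed by the sketch.

```python
def self_adapt(retune, classify, is_solved, max_iterations=10):
    """Retry with randomly retuned weights until solved or the limit is reached.

    retune()      : applies a fresh random change to the selected weights (hypothetical)
    classify()    : returns the solution proposed by the current model (hypothetical)
    is_solved(s)  : returns True if applying solution s resolved the problem (hypothetical)
    """
    for iteration in range(1, max_iterations + 1):
        retune()                      # step 3: random change of the selected weights
        solution = classify()         # step 4: classify the new feature
        if is_solved(solution):       # step 5: the solution helped
            return {"solved": True, "solution": solution, "iterations": iteration}
    # Step 7: the limit was reached without a solution, so the caller may escalate,
    # e.g., by sending an indication to the second node 102.
    return {"solved": False, "solution": None, "iterations": max_iterations}
```

A caller may inspect the "solved" entry of the returned dictionary to decide whether to retrain the model with the new feature or to request human intervention.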
As a summarized overview of the foregoing in other words, embodiments herein may be understood to introduce a new method, e.g., via an offline monitor, to solve a new set of problems for which the first node 101 has not been trained earlier, using a self-adaptation principle. In this regard, embodiments herein may modify the plastic artificial neural network 105 by tuning the weights of the neurons in such a way that the plastic artificial neural network 105 may address the features of the new problem, one or more of which features may be new. Moreover, the new method of embodiments herein may be understood to introduce a lifelong learning for self-adaptation principle as an additional contribution to the 5G applications.
As described earlier, particular embodiments herein may relate to an adaptive method of inculcating fault tolerance on the fog nodes. As discussed, this method may comprise three sequential parts.
1. A fault in the fog node may be detected.
2. The root source, that is, the reasons, from where the fault is propagated may be identified.
3. The first node 101 may come up with a novel method to solve a problem which it may not have faced earlier.
Embodiments herein may be understood to be particularly drawn to a new system and methods to address the last step of the approach. Embodiments herein may be understood to provide a solution to self-adapt, within fog nodes, to any new problem, that is, a problem which may not be available within the history. A modified plastic neural network mechanism has been disclosed herein that enables a fog node to understand the new problem and perform fault tolerance on its own. For these particular embodiments, the method may depend on a two-step approach. In the first step, a cognitive framework may be constructed within each fog node, which may enable a Fault Tolerance Control (FTC) loop to interpret a new set of problems. For this purpose, a framework inside the fog computing environment may be created, according to embodiments herein, as schematically depicted in the illustration of Figure 4. An arrangement according to existing methods is shown on the left, in panel a), where a fog node may collect data from IoT sensors and send it to the cloud. An arrangement according to a non-limiting example of embodiments herein is shown on the right, in panel b), where the first node 101 implements an FTC loop within a fog node to interpret a new set of problems detected in the data collected from IoT sensors and send it to the cloud.
The FTC loop may introduce a cognitive modelling framework inside the fog node, as depicted in the schematic illustration of Figure 5. At 501, the first node 101 senses the devices in the environment, such as sensors, by collecting sensor data etc. At 502, the first node 101 plans actions based on the identified sensor data. It may be noted that the planned actions may be for an existing set of features, e.g., from the history for which the first node 101 is trained. If not, the first node 101 tries to generate a new set of actions based on the predictions made. Further, at 503, the first node 101 reasons about the planned actions. This may include why these actions are taken and what the outcome may be if the actions are implemented. Finally, at 504, the first node 101 performs this set of actions to see the result. At 505, the first node 101 uses self-healing techniques to mitigate the effect of the actions and also to increase the efficiency of the planned set of actions. The FTC loop may be activated for the fault tolerance cycle based on state changes, from normal at 506, to abnormal at 507, to dead at 508. The FTC loop may be performed before the first node 101 moves to a next state.
Figure 6 is a schematic illustration depicting a non-limiting example of the different components that may be built and used according to particular examples of embodiments herein, to explore the scenario based on the root sources of the problem. Whenever the first node 101 detects an anomaly using steps 1 and 2 described earlier, and the first node 101 finds the attributes, that is, the root sources, behind the anomaly, the plan action part in the FTC loop will be immediately triggered. The first node 101 may then execute the cycle as depicted in Figure 6, with the use of the different components depicted, to find more reasons towards a solution. At 601, the monitor component monitors the history of an application to uncover any problems in the application system, using the input from the configuration file. At 603, the correlator component generates the Sol 604. The correlator 603 may also update the knowledge data store with the information at 605. At 606, the correlator passes the Sol to the analyzer component 607. This component may be understood to be responsible for analyzing all the Sol indicators and arriving at the set of one or more features. The analyzer may maintain the problem identification table 608, as described earlier. The analyzer may also update the knowledge data store with the information at 609. If the component is unable to derive the problem, it may notify the presence of the problem to the UX client at 610. The output of this component may be a list of problems identified with the Sol indicators, passed to the executor component 611. This component may read the table and, at 612, fetch the solutions mapped for the problem IDs that may have been reported by the analyzer. The executor 611 may also update the knowledge data store with the information at 613.
As was described for the analyzer component, if there is no known solution for the reported issue identifiers, this component may just notify the presence of the problem using the UX client at 614. The offline monitor component 615 may be understood to be responsible for training the self-adaptation system and introducing a learning component of the self-adaptation system.
The offline monitor component may use the data store or database 130 at 616 to learn the system.
Next, three different non-limiting examples of how a method according to embodiments herein may be implemented will be presented. One example implementation is in a green energy building, another example implementation is in incident management in telecom networks, and yet another example implementation is in a fog RAN scenario.
Example 1:
For the first example, the method is implemented in a green energy building. In this example, the main objective is to control the air conditioning system inside the building.
For this, it may be assumed the first node 101 is a fog node designed to take actions to start and/or stop an air conditioning system to control the air temperature inside a room.
At some point in the course of operation, there is an anomaly in the temperature. In this case, the air conditioning system will not work efficiently. That is, the air conditioning system will be able to be turned on or off based on a configured temperature to be maintained, but it may be used unnecessarily, wasting energy, if the cause of the temperature anomaly is elsewhere. By using a method according to embodiments herein, one may be able to estimate the underlying true cause of the temperature anomaly, which will help the air conditioning system to work efficiently. If the fault is identified, the first node 101 may be able to compute the underlying non-faulty data, that is, the data that may have been observed if there were no faults in the data. In this case, the multi-variable data may comprise all the data obtained from other devices attached to the first node 101, which may be referred to as fogs, and nearby fogs. If the fault is identified in one variable of the fog, the first node 101 may use the process of embodiments herein to obtain the source of the anomaly. In addition, the underlying non-faulty data may be estimated. Once the source of the anomaly is identified, the first node 101 may take the necessary actions to repair the concerned fog node. Also, the first node 101 may use the estimated fog-node data to take the necessary actions. Since the non-faulty data has been estimated, the actions taken may be understood to be proper and accurate.
In another example using the green energy building scenario, it may be assumed that the first node 101 is a fog node to monitor the green building. The objective is to utilize the resources such that the energy used is minimum. It may be assumed that the variables measured are temperature inside the room, light intensity inside the room, outside temperature, outside light intensity etc.
Then it may be assumed that there is an anomaly detected in the inside temperature, for example, a sudden increase, because of an increase in the outside light intensity, which means two variables have faults. In this case, the air conditioning system will increase its blower and the light intensity may also be controlled such that the inside temperature and light intensity will decrease, although there is no true increase in temperature. This is an inefficient operation, as it will waste both energy sources. One may identify the root source of the anomaly and identify that the anomaly in the temperature is caused by an anomaly in the light intensity. In this way, the proposed method may identify the root source of the anomaly faster. This will save energy, as only closing the windows may solve the problem, as opposed to controlling two sources in the previous scenario.
For the first node 101 to perform self-adaptation, the features of the variables may be used as inputs. In this case, the features may be outside light intensity values and temperature inside the room. These features may be mapped into problems such as light intensity problems, and the solution may be to shut the windows and doors to prevent light from entering inside. For temperature, the solution may be to increase or decrease the air conditioning system. Also, carbon dioxide levels in the room may be measured. For illustrative purposes, it may be assumed that the model is trained for handling this problem.
The artificial neural network 105 may be assumed to be trained on this data. The number of input nodes in the artificial neural network 105 in this example may be 100. That is, the artificial neural network 105 may have 100 past light intensity readings collected over 100 minutes. Similarly, 100 temperature readings may be collected over 100 minutes and used to train the model. The input may be collected at different time instants to reduce the bias in training the artificial neural network 105. The artificial neural network 105 may be chosen to consist of three hidden layers with fully connected nodes. The number of nodes in each hidden layer may be chosen to be 10. Overall, in this way the first node 101 may have a total of 1302 weights to learn from the training data. For training, regular back propagation may be used with a gradient descent approach. The artificial neural network 105 may be trained for 20 epochs and a training accuracy of 85% may be obtained. For testing, the first node 101 may choose 100 new readings of the variables and a validation accuracy of 78% may be obtained.
In this case, a new feature may be detected, such as an increase in the number of people inside the office, which may lead to an increase in carbon dioxide content and may also increase the temperature of the room.
Example 2:
For the second example, the proposed method was tested for incident management in a telecommunication system. In particular, the set of data collected from the communications network 10 comprised all the incidents that happened in cells of a location in Chennai, India. In particular, only the incidents which had minimum and maximum impact were collected, which in total amounted to 100 different incidents in the communications network 10.
First, the artificial neural network 105 was trained with 75 incidents and corresponding solutions. The remaining 25 incidents were then collected and predicted using the method disclosed herein. Out of these 25 new incidents, 20 incidents were solved automatically using the disclosed method. Out of these 20 incidents, 10 incidents were solved within two iterations of changing weights. The remaining 10 incidents were solved within six to eight iterations. Only 5 incidents were reported back to an engineer via the second node 102 to solve the problem. In this way, the human effort in solving the problems was reduced, and the disclosed method may be understood to provide a better solution to the problem.
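As a non-limiting illustration, the figures reported for this example may be tallied in Python as follows. All counts are taken from the description above; the variable names are chosen only for this sketch.

```python
# Counts reported for Example 2.
total_incidents = 100
training_incidents = 75
new_incidents = total_incidents - training_incidents           # 25 unseen incidents

solved_automatically = 20
solved_within_two_iterations = 10
solved_in_six_to_eight_iterations = solved_automatically - solved_within_two_iterations
escalated_to_engineer = new_incidents - solved_automatically   # reported via the second node 102

# Shares of the new incidents handled without and with human intervention.
automation_rate = solved_automatically / new_incidents
escalation_rate = escalated_to_engineer / new_incidents

print(f"solved automatically: {automation_rate:.0%}, escalated: {escalation_rate:.0%}")
```

In this example, four out of five previously unseen incidents were handled without human intervention.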
Example 3:
In the third example, embodiments herein may be applied in a fog RAN scenario, to decrease the latency of 5G applications. For this example implementation, it may be assumed that there is a variable which is being measured by the first node 101, which is a fog RAN node in this example. It may be assumed for the purposes of this example that the fog RAN node is taking actions based on the variable values. If there is an anomaly in a sensor reading, the fog node will take bad decisions because of the anomaly and this will reduce the efficiency of the communications network 10. If the data is identified as an anomaly, the first node 101 may need to wait for a next data sample to arrive or use a past sample. Both of these options are inefficient and decrease the efficiency of the process. By using embodiments herein, the first node 101 may be enabled to take efficient actions with ultra-low latency. This may be considered best suited for 5G networking and applications, which may, e.g., have ultra-low latency requirements.
Figure 7 is a schematic illustration depicting a non-limiting example of a method performed by the first node 101, according to Example 1. The diagram shows that there may be four steps the cognitive system of the first node 101 may perform to arrive at the solution. First, the monitor component 601 monitors all the log information of the data. In this example, the data collected comprises three sets of log files, where one log file corresponds to temperature data, another log file corresponds to light intensity data and another to carbon dioxide levels. At 701, the monitor component 601 sends the log data to the correlator component 603. Second, at 702, the correlator component 603 correlates all the log files and merges them into a single file. Since the files may come from different sensors and have different time stamps, this step may be understood to play a very important role, as the files need to be merged before the features can be extracted. Third, according to Action 201, the analyzer component 607 monitors the correlated log file and identifies the problems in it. For example, in this case the analyzer component 607 identifies a temperature increase, a carbon dioxide level increase and no problem with light intensity. In this way, the analyzer component 607 may arrive at the issue behind the problem, and send it to the executor component 611. Fourth, the executor component 611 may try to find a solution for the problem according to Actions 202-206. The features for the problem are an increase in carbon dioxide levels and an increase in temperature levels. Since the artificial neural network 105 is trained for only temperature related problems, if existing methods were to be followed, that is, following the existing fog scenario, this problem may be treated as a temperature increase problem. In this case, the artificial neural network 105 may classify it as a temperature related problem, and give the corresponding solution. In contrast, according to embodiments herein, the first node 101 may be able to use the trained artificial neural network 105 to self-heal the problem. Now, the real problem may come when a new feature is input to the system. For example, the feature carbon dioxide content may be input to the artificial neural network 105. Since the artificial neural network 105 is not trained for it, it may try to give a mathematical solution for it, that is, either temperature or light, which may be understood to mean that false positives may be obtained. In this case, the new features which may be available to classify the new problem may be passed to the offline monitor for finding a new solution. In the current case, existing methods could return the problem as an issue with light. However, with the help of the method disclosed herein, the weights of the artificial neural network 105 may be retuned to suit the problem. The first node 101 may select the top 10 weights to tune. With the new tuning of weights, the artificial neural network 105 may correctly predict the problem as a temperature related problem by first retuning itself, and execute a solution for it even though the artificial neural network 105 may not have been previously trained for it. The weight values before the retuning and after the retuning in this particular non-limiting example are shown in the vectors below.
Before retuning: [0.8 -0.9 0.7 0.54 0.78 -0.72 0.99 0.46 0.95 0.87]
After retuning: [0.83 -0.94 0.77 0.61 0.84 -0.73 0.96 0.52 1 0.81]
In this way, the artificial neural network 105 may be understood to be self-sufficient and may be used for self-adaptation. It may be observed that sometimes, a small change in tuning weights may destabilize the artificial neural network 105. However, the probability of this is very low, as the number of weights retuned may be considered to be very low when compared with the total number of weights in the artificial neural network 105. Also, the change may be considered to be very small, e.g., in the decimal places. Another aspect to be noted is that the change in weights may cause the weights of the artificial neural network 105 to exceed 1. In this case, it may be necessary to restrict the weight value to 1, since a larger value may affect the stability of the artificial neural network 105.
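As a non-limiting illustration, the restriction of retuned weights to a maximum value of 1 may be sketched in Python as follows, using the before and after vectors given above. The function name retune_with_cap is hypothetical, and applying only an upper cap, with no lower bound, is an assumption of this sketch.

```python
import numpy as np

# Weight vectors of the non-limiting example, before and after retuning.
before = np.array([0.8, -0.9, 0.7, 0.54, 0.78, -0.72, 0.99, 0.46, 0.95, 0.87])
after = np.array([0.83, -0.94, 0.77, 0.61, 0.84, -0.73, 0.96, 0.52, 1.0, 0.81])


def retune_with_cap(weights, delta, cap=1.0):
    """Add the change delta to the weights and restrict any result above cap to cap."""
    return np.minimum(weights + delta, cap)


# Change applied to each of the 10 retuned weights in the example.
delta = after - before
largest_change = np.max(np.abs(delta))

# A change that would push a weight above 1 is capped at 1.
capped = retune_with_cap(np.array([0.95]), np.array([0.10]))
```

In the example vectors, no individual weight changes by more than 0.07, which is consistent with the change being small relative to the weights themselves.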
To test the number of false positives, the artificial neural network 105 may be trained with existing features and solutions. For this, a multilayer perceptron with one hidden layer may be used. The artificial neural network 105 may be trained with a back propagation method using a gradient descent method. In the current example implementation, 10 new features were tested. These 10 new features were similar to existing features. When the trained artificial neural network 105 was used, it resulted in 4 false positives. In this case, the top 10 weights of the artificial neural network 105 were retuned, and this resulted in 1 false positive. Hence, it was observed that the retuning resulted in a lower number of false positives.
As already mentioned earlier, only a subset of weights in the artificial neural network 105 may be changed rather than changing all the weights. This may be understood to be done to ensure the stability of the artificial neural network 105. To verify this statement, all the weights of the artificial neural network 105 were tuned. In this case, the predictions became unreliable and this resulted in 10 false positives. That is, all the features were wrongly identified, and it resulted in a poor network. Hence, it may be concluded that the disclosed method of retuning only a subset of the weights reduces the number of false positives and may be used efficiently in any application.
Mapping the solution to the new problem with existing features may be, in some examples, considered to be the work of the offline monitor component 615, referred to in Figure 7 as the rendering engine. The offline monitor component 615 may be understood to be equivalent to the monitor component 601, working offline. Since the increase in carbon dioxide maps to the increase in temperature, the solution of the temperature related problem may be employed. In this case, the network correctly predicted the problem and correctly gave the solution. The rendering engine may give the solution and may pass it on to the executor, which may then perform the solution, and it may solve the problem in this case.
If the problem persists, the first node 101, e.g., via the executor component 611, may, at 703, send the information to the rendering engine component 615. In this case, the weights may be redefined, and the rendering engine component 615 may try to identify another solution. At 704, the new solution may be sent to the executor component 611 and it may be tested again. If the problem is not solved after the given number of iterations, for example, after 10 iterations, the first node 101, e.g., via the executor component 615, may, at 705, ask the monitor for more information/data from the system.
Certain embodiments may provide one or more of the following technical advantage(s). Embodiments herein may be used effectively in mission critical applications of 5G which automate their own self-adaptation and manage any failures. Embodiments herein may be understood to be able to reduce the latency of applications by addressing failure management, which may be understood as the most important factor in 5G applications. As a further advantage, embodiments herein may facilitate building a self-adaptation mechanism for the applications, where it may automatically monitor and control the situation. Furthermore, embodiments herein may be employed in an existing fog network setup to avoid the failure of fog nodes and optimize the environment and cost. The growth of IoT in the present industrial setup is explosive, inspiring, and untenable under current architectural approaches due to the introduction of a large number of 3PP devices and communication protocols. Many IoT deployments face challenges related to latency, network bandwidth, reliability, and security, which cannot be addressed in cloud-only models. Fog computing adds a hierarchy of elements that distributes resources and services of computing, storage, control, and networking anywhere along the continuum from Cloud to things to meet these challenges in a high performance, open and interoperable way. It will also support multiple industry verticals and application domains, delivering intelligence and services to users and business. To enhance the performance of a fog system, it is very important for a system or an application to recover from problems by itself without manual intervention. This would help in reducing the turnaround time to fix the problem, without manual intervention. Based on the history for dynamic problems, it is highly important to possess the property of fault tolerance in fog computing.
Figure 8 depicts two different examples in panels a) and b), respectively, of the arrangement that the first node 101 may comprise to perform the method actions described above in relation to Figure 2. In some embodiments, the first node 101 may comprise the following arrangement depicted in Figure 8a. The first node 101 is configured to manage the artificial neural network 105. The first node 101 may be understood to be for handling a problem in the communications network 10.
Several embodiments are comprised herein. Components from one embodiment may be tacitly assumed to be present in another embodiment and it will be obvious to a person skilled in the art how those components may be used in the other exemplary embodiments. The detailed description of some of the following corresponds to the same references provided above, in relation to the actions described for the first node 101, and will thus not be repeated here. For example, in some embodiments, the first node 101 may be configured to be comprised in a fog node in the communications network 10. In Figure 8, optional modules are indicated with dashed boxes.
The first node 101 is configured to, e.g. by means of a determining unit 801 within the first node 101 configured to, determine, in the set of data configured to be collected from the communications network 10, the set of one or more features configured to define the problem in the operation of the communications network 10. The set of one or more features are configured to be: a) previously undetected in the communications network 10 by the first node 101, and b) lacking a corresponding set of one or more solutions.
The first node 101 is configured to, e.g. by means of a training unit 802 within the first node 101 configured to, train the artificial neural network 105 to find the solution to the problem configured to be defined by the set of one or more features. This is configured to be performed by training the artificial neural network 105 with a modified pre-existing linear activation function. The modified pre-existing linear activation function is configured to adjust a pre-existing linear activation function by adding the factor. The added factor is configured to adjust the respective weight, respectively, of the subset of neurons configured to be comprised in the artificial neural network 105. The subset of neurons is configured to comprise the one or more neurons providing the highest output with the pre-existing linear activation function, according to the first threshold, when the pre-existing linear activation function is used to train the artificial neural network 105 with the set of data.
In some embodiments, the first node 101 may be further configured to, e.g. by means of the determining unit 801 within the first node 101 configured to, determine whether or not the solution to the problem is found based on the modified pre-existing linear activation function.
In some embodiments, the first node 101 may be further configured to, e.g. by means of an adjusting unit 803 within the first node 101 configured to, adjust, based on the result of the determination, the respective weight of at least the first neuron in the subset of neurons. The adjustment is configured to be by a random amount.
In some embodiments, the first node 101 may be further configured to, e.g. by means of a using unit 804 within the first node 101 configured to, use the artificial neural network 105 with the modified pre-existing linear activation function using the adjusted respective weight to find the solution to the problem defined by the set of one or more features.
In some embodiments, the first node 101 may be configured to, e.g. by means of an iterating unit 805 within the first node 101 configured to, iterate the determining of whether or not the solution to the problem has been found, the adjusting and the using, until one of: the solution is found or a number of iterations has exceeded the second threshold in the absence of having found a solution.
In some embodiments, the first node 101 may be further configured to, e.g. by means of the determining unit 801 within the first node 101 configured to, determine that the number of iterations has exceeded the second threshold in the absence of having found a solution.
In some of such embodiments, the first node 101 may be configured to, e.g. by means of a sending unit 806 within the first node 101 configured to, send, based on the determination that the number of iterations has exceeded the second threshold, the indication to the second node 102 in the communications network 10 based on the result of the determination.
In some embodiments, the second threshold may be configured to be 10.
In some embodiments, the subset of neurons may be configured to be identified based on the back-propagation process in the training of the artificial neural network 105.
In some embodiments, the subset of neurons may be configured to comprise 10 neurons.
In some embodiments, the first threshold may be configured to be based on the ranking of the neurons in the artificial neural network 105 according to their output with the pre-existing linear activation function.
In some embodiments, the pre-existing linear activation function may be configured to have the formula: y=Wx, and the modified pre-existing linear activation function may be configured to have the following formula: y=Wx+Ux, U being the added factor.
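As a non-limiting illustration, the two formulas and the selection of the subset of neurons by ranking their output may be sketched in Python as follows. The function names, the representation of the added factor U as a matrix that is non-zero only in the rows of the selected neurons, and the uniform random values with half-width scale are assumptions made only for this sketch.

```python
import numpy as np


def top_k_neurons(W, x, k):
    """Indices of the k neurons with the highest output under y = Wx."""
    y = W @ x
    return np.argsort(y)[-k:]


def make_added_factor(W, subset, scale=0.05, rng=None):
    """Build U with random entries in the rows of the selected neurons, zero elsewhere."""
    rng = np.random.default_rng() if rng is None else rng
    U = np.zeros_like(W, dtype=float)
    U[subset, :] = rng.uniform(-scale, scale, size=(len(subset), W.shape[1]))
    return U


def modified_linear(W, U, x):
    """Modified pre-existing linear activation function, y = Wx + Ux."""
    return W @ x + U @ x
```

Because y = Wx + Ux equals (W + U)x, adding the factor U in this way is equivalent to adjusting the weights of the selected neurons only, while the outputs of all other neurons remain unchanged.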
In some embodiments, the set of one or more features may comprise a previously unidentified feature by the pre-existing linear activation function, and after the solution to the previously unidentified feature is found by the artificial neural network 105 using the modified pre-existing linear activation function, the modified pre-existing linear activation function may be configured to be used to identify future problems in the communications network 10, based on the previously unidentified feature and the found solution.
Other modules may be comprised in the first node 101.
The embodiments herein in the first node 101 may be implemented through one or more processors, such as a processor 807 in the first node 101 depicted in Figure 8a, together with computer program code for performing the functions and actions of the embodiments herein. A processor, as used herein, may be understood to be a hardware component. The program code mentioned above may also be provided as a computer program product, for instance in the form of a data carrier carrying computer program code for performing the embodiments herein when being loaded into the first node 101. One such carrier may be in the form of a CD ROM disc. Other data carriers, such as a memory stick, are however also feasible. The computer program code may furthermore be provided as pure program code on a server and downloaded to the first node 101.
The first node 101 may further comprise a memory 808 comprising one or more memory units. The memory 808 is arranged to be used to store obtained information, store data, configurations, schedulings, and applications etc. to perform the methods herein when being executed in the first node 101.
In some embodiments, the first node 101 may receive information from, e.g., the set of IoT sensors, the database 130 and/or the second node 102, through a receiving port 809. In some embodiments, the receiving port 809 may be, for example, connected to one or more antennas in the first node 101. In other embodiments, the first node 101 may receive information from another structure in the communications network 10 through the receiving port 809. Since the receiving port 809 may be in communication with the processor 807, the receiving port 809 may then send the received information to the processor 807. The receiving port 809 may also be configured to receive other information.
The processor 807 in the first node 101 may be further configured to transmit or send information to, e.g., the set of IoT sensors, the database 130, the second node 102, and/or another structure in the communications network 10, through a sending port 810, which may be in communication with the processor 807 and the memory 808.
Those skilled in the art will also appreciate that the units 801-806 described above may refer to a combination of analog and digital modules, and/or one or more processors configured with software and/or firmware, e.g., stored in memory, that, when executed by the one or more processors such as the processor 807, perform as described above. One or more of these processors, as well as the other digital hardware, may be included in a single Application-Specific Integrated Circuit (ASIC), or several processors and various digital hardware may be distributed among several separate components, whether individually packaged or assembled into a System-on-a-Chip (SoC).
Also, any of the units 801-806 described above may be respectively implemented as the processor 807 of the first node 101, or an application running on such processor.
Thus, the methods according to the embodiments described herein for the first node 101 may be respectively implemented by means of a computer program 811 product, comprising instructions, i.e., software code portions, which, when executed on at least one processor 807, cause the at least one processor 807 to carry out the actions described herein, as performed by the first node 101. The computer program 811 product may be stored on a computer-readable storage medium 812. The computer-readable storage medium 812, having stored thereon the computer program 811, may comprise instructions which, when executed on at least one processor 807, cause the at least one processor 807 to carry out the actions described herein, as performed by the first node 101. In some embodiments, the computer-readable storage medium 812 may be a non-transitory computer-readable storage medium, such as a CD ROM disc, or a memory stick. In other embodiments, the computer program 811 product may be stored on a carrier containing the computer program 811 just described, wherein the carrier is one of an electronic signal, optical signal, radio signal, or the computer-readable storage medium 812, as described above.
The first node 101 may comprise an interface unit to facilitate communications between the first node 101 and other nodes or devices, e.g., the second node 102, or any of the other nodes. In some particular examples, the interface may, for example, include a transceiver configured to transmit and receive radio signals over an air interface in accordance with a suitable standard.
In other embodiments, the first node 101 may comprise the following arrangement depicted in Figure 8b. The first node 101 may comprise a processing circuitry 807, e.g., one or more processors such as the processor 807, in the first node 101 and the memory 808. The first node 101 may also comprise a radio circuitry 813, which may comprise e.g., the receiving port 809 and the sending port 810. The processing circuitry 807 may be configured to, or operable to, perform the method actions according to Figure 2, in a similar manner as that described in relation to Figure 8a. The radio circuitry 813 may be configured to set up and maintain at least a wireless connection with any of the other nodes in the communications network 10. Circuitry may be understood herein as a hardware component.
Hence, embodiments herein also relate to the first node 101 operative to manage the artificial neural network 105. The first node 101 may be operative to operate in the communications network 10. The first node 101 may comprise the processing circuitry 807 and the memory 808, said memory 808 containing instructions executable by said processing circuitry 807, whereby the first node 101 is further operative to perform the actions described herein in relation to the first node 101, e.g., in Figure 2.
Generally, all terms used herein are to be interpreted according to their ordinary meaning in the relevant technical field, unless a different meaning is clearly given and/or is implied from the context in which it is used. All references to a/an/the element, apparatus, component, means, step, etc. are to be interpreted openly as referring to at least one instance of the element, apparatus, component, means, step, etc., unless explicitly stated otherwise. The steps of any methods disclosed herein do not have to be performed in the exact order disclosed, unless a step is explicitly described as following or preceding another step and/or where it is implicit that a step must follow or precede another step. Any feature of any of the embodiments disclosed herein may be applied to any other embodiment, wherever appropriate. Likewise, any advantage of any of the embodiments may apply to any other embodiments, and vice versa. Other objectives, features and advantages of the enclosed embodiments will be apparent from the following description.
As used herein, the expression “at least one of:” followed by a list of alternatives separated by commas, and wherein the last alternative is preceded by the “and” term, may be understood to mean that only one of the list of alternatives may apply, more than one of the list of alternatives may apply or all of the list of alternatives may apply. This expression may be understood to be equivalent to the expression “at least one of:” followed by a list of alternatives separated by commas, and wherein the last alternative is preceded by the “or” term.
References:
[1] Z. Amin, H. Singh, and N. Sethi, “Review on fault tolerance techniques in cloud computing,” International Journal of Computer Applications, vol. 116, no. 18, 2015.
[2] A. M. Sampaio and J. G. Barbosa, “A comparative cost study of fault tolerant techniques for availability on the cloud,” in International Symposium on Ambient Intelligence. Springer, 2017, pp. 263-268.
[3] R. Mahmud and R. Buyya, “Fog computing: A taxonomy, survey and future directions,” Internet of Everything, Springer, 2017.
[4] E. Baccarelli, P. G. V. Naranjo, M. Scarpiniti, M. Shojafar, and J. H. Abawajy, “Fog of everything: Energy-efficient networked computing architectures, research challenges, and a case study,” IEEE Access, 2017.
[5] A. Munir, P. Kansakar, and S. U. Khan, “Ifciot: Integrated fog cloud iot: A novel architectural paradigm for the future internet of things,” IEEE Consumer Electronics Magazine, vol. 6, no. 3, pp. 74-82, 2017.
[6] M. Peng, Y. Li, Z. Zhao, and C. Wang, “System architecture and key technologies for 5G heterogeneous cloud radio access networks,” IEEE Network, vol. 29, no. 2, pp. 6-14, March-April 2015.
[7] T. Miconi, J. Clune, and K. O. Stanley, “Differentiable plasticity: training plastic neural networks with backpropagation,” in Proceedings of the 35th International Conference on Machine Learning (ICML 2018), Stockholm, Sweden, PMLR 80, 2018.
Claims
1. A method performed by a first node (101) for handling a problem in a communications network (10), the first node (101) managing an artificial neural network (105), the method comprising:
- determining (201), in a set of data collected from the communications network (10), a set of one or more features defining a problem in an operation of the communications network (10), the set of one or more features being: a) previously undetected in the communications network (10) by the first node (101), and b) lacking a corresponding set of one or more solutions, and
- training (202) the artificial neural network (105) to find a solution to the problem defined by the set of one or more features, by training the artificial neural network (105) with a modified pre-existing linear activation function, wherein the modified pre-existing linear activation function adjusts a pre-existing linear activation function by adding a factor, the added factor adjusting a respective weight, respectively, of a subset of neurons comprised in the artificial neural network (105), the subset of neurons comprising one or more neurons providing the highest output with the pre-existing linear activation function, according to a first threshold, when the pre-existing linear activation function is used to train the artificial neural network (105) with the set of data.
2. The method according to claim 1, further comprising:
- determining (203) whether or not the solution to the problem is found based on the modified pre-existing linear activation function,
- adjusting (204), based on a result of the determination (203), the respective weight of at least a first neuron in the subset of neurons, the adjustment being by a random amount, and
- using (205) the artificial neural network (105) with the modified pre-existing linear activation function using the adjusted respective weight to find the solution to the problem defined by the set of one or more features.
3. The method according to claim 2, wherein the method further comprises:
- iterating (206) the determining (203) of whether or not the solution to the problem has been found, the adjusting (204) and the using (205) until one of: the solution is found or a number of iterations has exceeded a second threshold in the absence of having found a solution.
4. The method according to claim 3, wherein the method further comprises:
- determining (207) that the number of iterations has exceeded the second threshold in the absence of having found a solution, and
- sending (208), based on the determination that the number of iterations has exceeded the second threshold, an indication to a second node (102) in the communications network (10) based on a result of the determination.
5. The method according to claim 4, wherein the second threshold is 10.
6. The method according to any of claims 1-5, wherein the subset of neurons is identified based on a back-propagation process in the training of the artificial neural network (105).
7. The method according to any of claims 1-6, wherein the subset of neurons comprises 10 neurons.
8. The method according to any of claims 1-7, wherein the first threshold is based on a ranking of the neurons in the artificial neural network (105) according to their output with the pre-existing linear activation function.
9. The method according to any of claims 1-8, wherein the pre-existing linear activation function has the formula: y=Wx, and wherein the modified pre-existing linear activation function has the following formula: y=Wx+Ux, U being the added factor.
10. The method according to any of claims 1-9, wherein the set of one or more features comprises a previously unidentified feature by the pre-existing linear activation function, and wherein, after the solution to the previously unidentified feature is found by the artificial neural network (105) using the modified pre-existing linear activation function, the modified pre-existing linear activation function is used to identify future problems in the communications network (10), based on the previously unidentified feature and the found solution.
11. The method according to any of claims 1-10, wherein the first node (101) is comprised in a fog node in the communications network (10).
12. A computer program (610), comprising instructions which, when executed on at least one processor (606), cause the at least one processor (606) to carry out the method according to any one of claims 1 to 11.
13. A computer-readable storage medium (611), having stored thereon a computer program (610), comprising instructions which, when executed on at least one processor (606), cause the at least one processor (606) to carry out the method according to any one of claims 1 to 11.
14. A first node (101) for handling a problem in a communications network (10), the first node (101) being configured to manage an artificial neural network (105), the first node (101) being further configured to:
- determine, in a set of data configured to be collected from the communications network (10), a set of one or more features configured to define a problem in an operation of the communications network (10), the set of one or more features being configured to be: a) previously undetected in the communications network (10) by the first node (101), and b) lacking a corresponding set of one or more solutions, and
- train the artificial neural network (105) to find a solution to the problem configured to be defined by the set of one or more features, by training the artificial neural network (105) with a modified pre-existing linear activation function, wherein the modified pre-existing linear activation function is configured to adjust a pre-existing linear activation function by adding a factor, the added factor being configured to adjust a respective weight, respectively, of a subset of neurons configured to be comprised in the artificial neural network (105), the subset of neurons being configured to comprise one or more neurons providing the highest output with the pre-existing linear activation function, according to a first threshold, when the pre-existing linear activation function is used to train the artificial neural network (105) with the set of data.
15. The first node (101) according to claim 14, being further configured to:
- determine whether or not the solution to the problem is found based on the modified pre-existing linear activation function,
- adjust, based on a result of the determination, the respective weight of at least a first neuron in the subset of neurons, the adjustment being configured to be by a random amount, and
- use the artificial neural network (105) with the modified pre-existing linear activation function using the adjusted respective weight to find the solution to the problem defined by the set of one or more features.
16. The first node (101) according to claim 15, wherein the first node (101) is further configured to:
- iterate the determining of whether or not the solution to the problem has been found, the adjusting and the using, until one of: the solution is found or a number of iterations has exceeded a second threshold in the absence of having found a solution.
17. The first node (101) according to claim 16, wherein the first node (101) is further configured to:
- determine that the number of iterations has exceeded the second threshold in the absence of having found a solution, and
- send, based on the determination that the number of iterations has exceeded the second threshold, an indication to a second node (102) in the communications network (10) based on a result of the determination.
18. The first node (101) according to claim 17, wherein the second threshold is configured to be 10.
19. The first node (101) according to any of claims 14-18, wherein the subset of neurons is configured to be identified based on a back-propagation process in the training of the artificial neural network (105).
20. The first node (101) according to any of claims 14-19, wherein the subset of neurons is configured to comprise 10 neurons.
21. The first node (101) according to any of claims 14-20, wherein the first threshold is configured to be based on a ranking of the neurons in the artificial neural network (105) according to their output with the pre-existing linear activation function.
22. The first node (101) according to any of claims 14-21, wherein the pre-existing linear activation function is configured to have the formula: y=Wx, and wherein the modified pre-existing linear activation function is configured to have the following formula: y=Wx+Ux, U being the added factor.
23. The first node (101) according to any of claims 14-22, wherein the set of one or more features comprises a previously unidentified feature by the pre-existing linear activation function, and wherein, after the solution to the previously unidentified feature is found by the artificial neural network (105) using the modified pre-existing linear activation function, the modified pre-existing linear activation function is configured to be used to identify future problems in the communications network (10), based on the previously unidentified feature and the found solution.
24. The first node (101) according to any of claims 14-23, wherein the first node (101) is configured to be comprised in a fog node in the communications network (10).
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/IN2019/050474 WO2020261285A1 (en) | 2019-06-24 | 2019-06-24 | First node, and method performed thereby, for handling a problem in a communications network |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020261285A1 (en) | 2020-12-30 |
Family
ID=74060039
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/IN2019/050474 WO2020261285A1 (en) | 2019-06-24 | 2019-06-24 | First node, and method performed thereby, for handling a problem in a communications network |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2020261285A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023072436A1 (en) * | 2021-10-29 | 2023-05-04 | Telefonaktiebolaget Lm Ericsson (Publ) | Managing operational temperature of base station |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0897566B1 (en) * | 1997-01-21 | 2003-08-27 | Cerebrus Solutions Limited | Monitoring and retraining neural network |
WO2018235448A1 (en) * | 2017-06-19 | 2018-12-27 | 株式会社デンソー | Multilayer neural network neuron output level adjustment method |
CN109376850A (en) * | 2018-11-29 | 2019-02-22 | 国网辽宁省电力有限公司抚顺供电公司 | A kind of detection method based on bad data in improved BP state estimation |
Legal Events
Code | Title | Description |
---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 19935264; Country of ref document: EP; Kind code of ref document: A1 |
NENP | Non-entry into the national phase | Ref country code: DE |
122 | Ep: pct application non-entry in european phase | Ref document number: 19935264; Country of ref document: EP; Kind code of ref document: A1 |