WO2023193891A1 - Apparatus and method for finite-state dynamic in-network learning - Google Patents

Apparatus and method for finite-state dynamic in-network learning

Info

Publication number
WO2023193891A1
Authority
WO
WIPO (PCT)
Prior art keywords: network, data, node, finite set, states
Application number
PCT/EP2022/059018
Other languages
French (fr)
Inventor
Abdellatif ZAIDI
Piotr KRASNOWSKI
Original Assignee
Huawei Technologies Co., Ltd.
Application filed by Huawei Technologies Co., Ltd.
Priority to PCT/EP2022/059018
Publication of WO2023193891A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent
    • G06N 3/09 Supervised learning
    • G06N 3/098 Distributed learning, e.g. federated learning
    • G06N 3/0985 Hyperparameter optimisation; Meta-learning; Learning-to-learn
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H 50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems

Definitions

  • This disclosure relates to the processing of data in a network, for example to distributed learning and inference in such a network.
  • a practical example includes multiple base stations (BSs) communicating over a wireless network to locate a user.
  • a source of data can be referred to as a “view” that represents some, possibly noisy, version of the target to be inferred.
  • An example of a view is an area observed by some surveillance camera, and a sample of this view is a picture taken by the camera.
  • each view may provide some unique piece of information about the target that is not available to other sensing nodes.
  • these views can be partially correlated.
  • the data/measurements cannot be transmitted directly to a remote central processing center due to limited bandwidth and/or privacy concerns.
  • parts of possibly correlated data may be processed locally by each node during both training and inference phases.
  • the sensing nodes can learn how to efficiently compress samples from their respective views by suppressing irrelevant and/or redundant information about the target.
  • the processed and optionally compressed measurements can be then transmitted over the network to a remote central center.
  • In some scenarios, a sensing node always observes similar types of features of the target, regardless of whether they are useful and/or redundant. In such cases, the sensing node may learn how to fine-tune the processing and compression process to better emphasize the useful information about the target.
  • the views of the target may change over time.
  • the sensing nodes may adapt their compression process depending on the level of illumination, even though the main objective (traffic monitoring) remains the same.
  • the simplest and the most popular approach is to train the sensing nodes to apply a “universal” compression that tries to fit all considered configurations of views.
  • This “fit for all” approach can be highly sub-optimal, because the sensing nodes may suppress information that in specific situations could be useful, and/or the nodes must send some redundant information in case it is needed. As a result, there may be a degradation in inference accuracy and/or an increase in the required power or bandwidth resources.
  • each sensing node could be equipped with several neural networks where each network is “tuned” to a specific type of view (e.g., one dedicated network for operation during a day and another for operation during a night).
  • when a sensing node detects a change in the view (e.g., illumination drops below a certain threshold), it switches between the neural networks.
  • This independent and autonomous adaptation at the sensing nodes offers some flexibility, but at the same time it may suffer from several problems.
  • a single node cannot reliably assess what is observed by the remaining sensing nodes. For example, if one of the surveillance cameras has a view obscured by fog, it does not necessarily mean that other views are also degraded by the fog. As a result, a single sensing node may adapt to a new view in a way that is sub-optimal with respect to other sensing nodes.
  • autonomous adaptation of sensing nodes may lead to conflicts between the nodes and to disruption of the whole system.
  • the above example emphasizes that the decision about the current configuration of views should preferably be orchestrated between the nodes in the distributed network, and that the decision should be based on the joint assessment of all views.
  • the natural technical solution would be to introduce a centralized unit that receives full (uncompressed) observations from all sensing nodes, performs a joint assessment of the samples, and broadcasts a decision to all nodes in the network. After receiving the decision, each node adjusts its operation accordingly.
  • a major drawback in this solution is that sending raw uncompressed observations to a central unit requires significant bandwidth and raises privacy concerns.
  • a device for a communication network comprising multiple input nodes each configured to process respective first data relating to an entity and output respective second data and being configured to operate according to a finite set of network states, the device being configured to: receive respective second data from at least one of the input nodes; in dependence on the or each respective second data from the at least one of the input nodes, select one of the finite set of network states; and send a signal indicating the selected one of the finite set of network states to the at least one of the input nodes.
  • the device may be configured to implement multiple neural networks, each neural network corresponding to one (i.e. a different one) of the finite set of network states.
  • the device may be configured to process the or each respective second data from the at least one of the input nodes using the neural network corresponding to the selected one of the finite set of network states. This may allow the device to process the second data from the input nodes using a neural network suitable for the conditions of the current network state.
  • the device may be configured to dynamically adjust the selected one of the finite set of network states in dependence on the or each respective second data from the at least one of the input nodes. This may allow for more accurate inference and more efficient usage of the available communication resources.
  • the device may be configured to receive the respective second data from at least one of the input nodes over a time period and select the one of the finite set of network states in dependence on the or each respective second data from the at least one of the input nodes received during the time period. This may allow the device to select the appropriate network state more accurately.
  • the device may be configured to: input the respective second data from the at least one of the input nodes into a trained analysis model implemented by the device; and select the one of the finite set of network states in dependence on an output of the trained analysis model.
  • the device may be further configured to receive a respective local assessment of the respective first data from the at least one of the input nodes and select the one of the finite set of network states in further dependence on the or each local assessment. This may allow the device to more accurately determine the current network state.
  • the device may be further configured to input the or each respective local assessment into the trained analysis model and select the one of the finite set of network states in dependence on the output of the trained analysis model. This may allow the device to more accurately detect a configuration of views (i.e. determine the current network state).
  • the respective second data may be a compressed representation of the respective first data.
  • the compressed representation may be output by the neural network implemented at the respective input node. This may be an efficient way of processing the data in the network, for example where there is limited bandwidth and/or privacy concerns.
  • the respective second data may comprise an output of a respective neural network implemented by the respective input node.
  • the use of a neural network may allow the respective first data to be processed at the respective input node and allow the output to be provided to the device.
  • the respective neural network may be one of multiple neural networks each corresponding to a network state of the finite set of network states. This may allow the input node to effectively process the first data under the current network state conditions.
  • the respective second data may comprise an activation vector of a last layer of the respective neural network implemented by the respective input node. This may allow the device to receive the second data from input nodes and allow data from multiple input nodes to be combined to form a combined input to the neural network implemented by the device.
  • the device may be configured to process respective second data received by the device from two or more of the input nodes, and the device may be configured to concatenate the respective second data from the two or more of the input nodes to form a combined input and select the one of the finite set of network states in dependence on the combined input. This may allow the device to select an appropriate network state in dependence on the data processed at multiple input nodes.
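  • The selection step described above can be sketched in a few lines. The following is a minimal, illustrative Python sketch and not the patented implementation: the scoring functions stand in for the trained analysis model, and all names (select_network_state, state_scorers) are hypothetical.

```python
# Hypothetical sketch: the fusion node concatenates the activation
# vectors (second data) received from the input nodes into a combined
# input, scores it against each of the M network states, and selects
# the best-scoring state. The scorers stand in for a trained analysis
# model.

def select_network_state(second_data, state_scorers):
    """Concatenate per-node second data and pick a network state index."""
    combined = [x for vec in second_data for x in vec]  # vertical concatenation
    scores = [scorer(combined) for scorer in state_scorers]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy "analysis model" for M = 2 states: state 1 favours large
# activations, state 2 favours small ones (purely illustrative).
scorers = [lambda v: sum(v), lambda v: -sum(v)]

node_outputs = [[0.9, 0.8], [0.7, 0.6]]  # second data from nodes 1..K
state = select_network_state(node_outputs, scorers)  # index of chosen state
```

In a real deployment a single trained classifier over the combined input would replace the per-state scorers; the argmax over scores is one simple way to realise the selection.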
  • the respective first data may represent a view of the entity and each network state of the finite set of network states may be associated with a specific configuration of views. This may allow a network state to be efficiently determined.
  • the device and each of the input nodes may be base stations. This may allow the device to be used in telecommunications networks.
  • the entity may be a vehicle or a mobile device. This may allow the device to determine a property of a variety of entities that is of use in real-world scenarios.
  • the device may be configured to determine a location of the entity. This may allow the device to be used in applications requiring accurate location determination.
  • the respective first data processed at each input node may comprise channel state information. This may allow the device to utilise CSI signals sent from the entity.
  • a method for implementation at a device in a communication network comprising multiple input nodes each configured to process respective first data relating to an entity and output respective second data and being configured to operate according to a finite set of network states, the method comprising: receiving respective second data from at least one of the input nodes; in dependence on the or each respective second data from the at least one of the input nodes, selecting one of the finite set of network states; and sending a signal indicating the selected one of the finite set of network states to the at least one of the input nodes.
  • This method may allow for effective inference of a property of an entity using a distributed learning network that can dynamically adjust to changes in observed views.
  • a network node configured to communicate with a device in a communications network and to process first data relating to an entity and output second data and being configured to operate according to a finite set of network states, the network node being configured to implement multiple neural networks for processing the first data, each neural network corresponding to one of the finite set of network states, the network node being configured to: receive a signal from the device indicating a selected one of the finite set of network states; and process the first data using the neural network corresponding to the indicated one of the finite set of network states.
  • the network node may be configured to send the second data output by the network node to the device.
  • the second data may comprise the output of the neural network implemented by the network node.
  • the second data may comprise an activation vector of a last layer of the neural network implemented by the network node. This may allow the device to use the second data to infer a property of the entity.
  • the second data may comprise the output of the neural network corresponding to the one of the finite set of network states. This may allow the network node to output data that has been processed accordingly in dependence on a current network state.
  • the network node may be configured to perform a local assessment of the first data and indicate an estimate of a current network state of the finite set of network states to the device. By providing the result of a local assessment of the first data to the device, this may allow the device to determine the current network state more accurately.
  • the first data may represent a view of the entity.
  • the view may be an image, sound, or other modality which can inform something about the presence/properties/state of the entity.
  • a view may represent some, possibly noisy, version of the target to be inferred.
  • An example of a view is an area observed by a surveillance camera, and a sample of this view is a picture taken by the camera.
  • the second data may be a compressed representation of the first data. This may allow for more efficient processing of the data from input nodes at the device, for example where there is limited bandwidth and/or privacy concerns.
  • the first data may comprise channel state information. This may allow the device to utilise CSI signals sent from the entity.
  • a method for implementation at a network node in a communications network the network node being configured to communicate with a device in the communications network and to process first data relating to an entity and output second data, the network node being configured to implement multiple neural networks for processing the first data, each neural network corresponding to one of a finite set of network states, the network node being configured to: receive a signal from the device indicating a selected one of the finite set of network states; and process the first data using the neural network corresponding to the indicated one of the finite set of network states.
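  • The network-node behaviour above (hold one NN per state, switch on the device's signal) can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the per-state "NNs" are stand-in callables, and the class and method names are hypothetical, not taken from the patent.

```python
# Illustrative sketch of an input (sensing) node that implements one NN
# per network state and switches between them when the device signals a
# newly selected state.

class SensingNode:
    def __init__(self, networks):
        self.networks = networks   # one NN per network state 1..M
        self.state = 0             # start in state 1 (index 0)

    def on_state_signal(self, selected_state):
        """Handle the device's signal indicating the selected network state."""
        self.state = selected_state

    def process(self, first_data):
        """Produce second data using the NN for the current network state."""
        return self.networks[self.state](first_data)

# Toy per-state NNs, e.g. a "day" compressor and a "night" compressor.
node = SensingNode([lambda x: [v * 0.5 for v in x],
                    lambda x: [v * 2.0 for v in x]])
day_out = node.process([1.0, 2.0])     # processed under state 1
node.on_state_signal(1)                # device signals state 2
night_out = node.process([1.0, 2.0])   # processed under state 2
```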
  • a computer readable storage medium having stored thereon computer readable instructions that, when executed at a computer system, cause the computer system to perform the method as set forth above.
  • the computer system may comprise one or more processors.
  • the computer readable storage medium may be a non-transitory computer readable storage medium.
  • Figures 1(a) and 1(b) schematically illustrate examples of an inference diagram for the architecture described herein in the case in which there are K nodes that observe measurements from an entity.
  • Figure 1(a) illustrates agreement between the views (case ‘1’) and the state of the system (state ‘1’).
  • Figure 1(b) illustrates disagreement between the views (case ‘M’) and the state of the system (state ‘1’).
  • Figure 2 shows a timeline illustrating the adjustment of the system to a new configuration of views and a simplified representation of signalling between the nodes.
  • Figure 3 schematically illustrates an exemplary diagram of a considered detection/ classification problem.
  • Figure 4 shows a timeline illustrating the adjustment of the system to a new configuration of views.
  • Figure 5 shows an example of a method for implementation at a device in a communication network in accordance with embodiments of the present invention.
  • Figure 6 shows an example of a method for implementation at a network node in a communication network in accordance with embodiments of the present invention.
  • Described herein is an architecture for distributed learning and inference that can allow for dynamic adjustment of the system to changes in the observed views.
  • INL: in-network learning
  • INL is a distributed learning and inference architecture in which an arbitrary number of nodes are involved during both the training phase and the inference phase. All nodes that are involved during the training phase are also active during the inference phase. The nodes operate simultaneously, not sequentially. Specifically, during the training phase, every node uses its own neural network (NN) to perform a forward pass on its data, possibly also using all received information from previous nodes in the network as part of the input to its NN.
  • some nodes may not always have reliable data. This issue can appear due to a fault in the data acquisition process, interference, or simply the absence of data to be measured by the node.
  • In the INL system, if a node is not considered to collect reliable data, or has no available data, the node uses only any incoming information as input to its NN. If the node has no parents in the network, it uses only its available data as input. In all cases, the available/acquired information is concatenated vertically into a vector of inputs prior to using it as the input of the node’s NN. Then, the node sends the vector output of the last layer of its NN (called the activation vector) to the next nodes to which it is connected in the graph network.
  • the propagation of the forward pass continues until it reaches the end node, at which the decision is inferred. The end node then computes a backward pass on its local NN. The output of the first layer of its NN during the backward step is first split vertically and then sent back to the parent nodes. Each of those parents first computes the sum of all vectors it receives and then continues the backward pass. The process continues until convergence.
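  • The forward/backward pattern described above can be illustrated numerically. This is a much-simplified sketch using single linear "layers" so each pass fits in a few lines; the topology (one sensing node feeding one end node) and all variable names are illustrative assumptions, not the patented training procedure.

```python
# Forward pass at a sensing node: incoming information and local data
# are concatenated vertically into one input vector for its NN.
incoming, local_data = [0.5], [1.0, 2.0]
inputs = incoming + local_data
w_node = [1.0, 1.0, 1.0]
activation = [w * x for w, x in zip(w_node, inputs)]  # sent to the next node

# Forward pass at the end node, where the decision is inferred.
w_end = [0.2, 0.2, 0.2]
decision = sum(w * a for w, a in zip(w_end, activation))

# Backward step: the end node's first-layer output gradient is split
# vertically, one part per parent node.
grad = list(w_end)             # d(decision)/d(activation) for a linear layer
split = (grad[:1], grad[1:])   # parts sent back to the parents

# A parent first sums all gradient vectors it receives (e.g. from two
# children), then continues its own backward pass.
received = [[1.0, 2.0], [3.0, 4.0]]
summed = [a + b for a, b in zip(*received)]
```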
  • Embodiments of the present invention target a technical problem of distributed learning and inference when the available views of the target change over time due to some repetitive and/or predictable phenomena, for example day and night illumination conditions, or fog partially occluding a view.
  • the main objective of the inference (for example, target detection, classification, localization etc.) remains the same. This may be of particular use in situations with constrained transmission resources and/or privacy concerns related to sharing raw data collected by sensing nodes in a network.
  • a network comprises devices including multiple input nodes and a device for processing data received from one or more of the input nodes.
  • the latter device is herein referred to as a fusion node or fusion centre.
  • the device and each of the input nodes may comprise a processor and a memory.
  • the memory stores in a non-transient way code that is executable by the processor to implement the device or respective node in the manner described herein.
  • the device and each of the input nodes also comprises a transceiver for transmitting and receiving data to and from the device.
  • the network may be a wireless network or a wired network.
  • the inference stage can infer a property of the entity, such as its location.
  • Figures 1(a) and 1(b) present diagrams of exemplary architectures for a simple, multi-access distributed network with K sensing nodes and a fusion node. This example can be naturally extended to more complex, multi-hop networks.
  • Figure 1(a) illustrates agreement between the views (case ‘1’) and the state of the system (state ‘1’).
  • Figure 1(b) illustrates disagreement between the views (case ‘M’) and the state of the system (state ‘1’).
  • node 1 is shown at 101 and node K at 102.
  • the input nodes 101 , 102 are each configured to process data relating to an entity, herein referred to as respective first data.
  • respective first data is received from the entity by the respective node.
  • the first data may be transmitted from the entity to the nodes via the internet.
  • the first data is a view of the entity.
  • This view could be an image, sound, or other modality which can inform something about the presence/properties/state of the entity.
  • a view may represent some, possibly noisy, version of the target to be inferred.
  • An example of a view is an area observed by a surveillance camera, and a sample of this view is a picture taken by the camera.
  • Other possible examples of views could be the access to the CSI between a mobile user and a base station in order to estimate the location of a user, access to medical databases in order to detect a potential disease of a patient, or access to a search history of a user to be used in some recommendation system.
  • Let Y 103 denote the target for an entity to be inferred at the fusion node, and let X1 104, ..., XK 105 be the views of Y for respective sensing nodes 101 and 102. These views may change over time, and the number of considered configurations of views is finite and known in advance.
  • the input nodes 101 , 102 each receive a data signal from the entity comprising the respective first data.
  • the input nodes may make measurements of a property or characteristic of the entity and optionally record the first data measurements at the respective input node.
  • an input node may receive a data signal from a camera sensor that records images, or video, of the entity.
  • the first data therefore relates to the entity.
  • the input nodes are each configured to process their respective first data.
  • the network therefore comprises multiple input nodes each configured to process respective first data relating to the entity.
  • the first data may be sent to the input node from the entity.
  • each of the input nodes observes a different measurement relating to the entity.
  • one sensor providing first data to be processed at an input node may be an accelerometer located at the entity (for example, inside a mobile phone) and another may record a GPS location of the entity.
  • one sensor providing first data to be processed at an input node may record video and another sensor may be an accelerometer located at the entity (for example, inside a mobile phone).
  • base stations acting as input nodes may measure signal strength or channel state estimation and process this data.
  • each node 101 , 102 is configured to implement multiple NNs to process its respective first data.
  • the nodes each implement one NN at any one time.
  • the NNs that can be implemented by an input node each correspond to a different network state 1 to M.
  • Node 1 101 is configured to implement NNs 1 to M.
  • NN 1 is indicated at 106 and NN M at 107.
  • Node K 102 is configured to implement NNs 1 to M.
  • NN 1 is indicated at 108 and NN M at 109.
  • the nodes 101 and 102 are currently implementing NNs 106 and 108 respectively, which each correspond to network state ‘1’.
  • the NNs implemented at the input nodes compress the first data to form the second data.
  • the input nodes can learn how to efficiently compress samples from their respective views by suppressing irrelevant and/or redundant information about the target and this can be done in dependence on the network state (i.e. the configuration of views). For example, during daylight hours, a camera may observe some different features of passing cars than during night. Thus, the sensing nodes can adapt their compression process depending on the level of illumination, even though the main objective (traffic monitoring) remains the same. The processed and compressed measurements can be then transmitted over the network to a fusion center. Therefore, the implementation of different NNs by the input nodes, in dependence on the current network state, can help the system to process and compress the data more appropriately.
  • the NNs implemented by the input nodes output respective second data.
  • the first data preferably has a different data format to the second data.
  • the output of an input node comprises an activation vector of the last layer (i.e. the output layer) of the respective neural network implemented by that input node.
  • the input nodes are each configured to output their respective second data.
  • the respective second data output by each of the input nodes can be sent to fusion node 110 for the inference.
  • Fusion node 110 is also configured to implement multiple NNs to process second data received from one or more of the input nodes.
  • the fusion node implements one NN at any one time.
  • the NNs that can be implemented by the fusion node each correspond to a different network state 1 to M.
  • NN 1 is indicated at 111 and NN M at 112.
  • not all of the input nodes send second data to the fusion node.
  • Some of the input nodes may process data that may be considered to be more reliable than the data processed by other input nodes.
  • the sources of information measured by each input node can be placed in order of reliability. This can be done prior to the training of the NNs implemented by the input nodes and the fusion centre.
  • node 1 101 is the most reliable node, i.e. node 1 processes the most reliable data from the entity.
  • Node K 102 is the least reliable node.
  • the primary node is a predetermined node which is considered to process the most reliable data of the data processed by the input nodes.
  • the other, less reliable, nodes in the network can be referred to as secondary nodes.
  • node 101 may be the spatially closest node to the entity and may therefore be considered to receive the most reliable data from the entity because the distance over which data signals are wirelessly transmitted is the smallest.
  • the least reliable input node may be located furthest from the entity (i.e. data may be more unreliable in proportion to the distance of the input node from the entity).
  • one input node may process data from a sensor that is more expensive and/or sophisticated than another and as a consequence may be more reliable.
  • one sensor may collect data of a type that is easier to process and therefore may be more reliable.
  • the algorithm used for location based on GPS data can be more accurate than localization based on images, for example from video.
  • some sensors may be easier to access and maintain and thus would be easier to fix and keep working and therefore may be more reliable.
  • the most reliable node and the order of reliability of the secondary nodes is preferably predetermined before training the neural networks implemented by the device.
  • Data may also be unreliable due to a fault in the data acquisition process, interference or the absence of data to be measured or received by the input node (i.e. an absence of respective first data at a respective input node).
  • the input node may determine whether its data is reliable, which can allow the input nodes to be ranked in order of reliability, for example, ranked from the primary node (node 1 101 in Figure 1 (a)) to the least reliable secondary node (node K 102 in Figure 1 (a)).
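  • The ranking step can be sketched as follows. This is a hypothetical Python sketch: the QI values and the function name are illustrative assumptions, standing in for whatever reliability assessment is performed before training.

```python
# Hypothetical sketch of ranking the input nodes by a quality indicator
# (QI) so that the highest-QI node can be designated the primary node
# and the remaining nodes ordered as secondary nodes.

def rank_by_reliability(qis):
    """Return node indices ordered from most to least reliable."""
    return sorted(range(len(qis)), key=lambda i: qis[i], reverse=True)

qis = [0.9, 0.4, 0.7]          # QI for nodes 1..K (index 0 = node 1)
order = rank_by_reliability(qis)
primary, secondaries = order[0], order[1:]
```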
  • the input nodes 101 , 102 and the fusion node 110 are each configured to implement multiple neural networks. At least some of the multiple NNs are different. Two NNs may be considered to be different if they have the same architecture, but different weights.
  • the input nodes that are not the primary node can assess the quality of the data they have received, measured or processed relating to the entity.
  • the assessment of the quality of the first and/or second data at an input node is performed by determining a Quality Indicator (QI).
  • the QI may be received at the input node alongside the first data.
  • the QI may be (or may be determined in dependence on) a received signal strength indicator (RSSI).
  • the NN implemented by a respective input node may output a confidence level on its prediction and the QI may be determined in dependence on the confidence level.
  • the QI may be measured directly from the first data, for example based on the signal-to-noise ratio (SNR) of the first data.
  • Data at an input node may be considered to be reliable if the determined QI for that input node is above a threshold.
  • the threshold may be predetermined.
  • the threshold may be a chosen value above which the signal is still considered to be useful or reliable.
  • Data at an input node may be considered to be unreliable if the QI for that input node is not above the threshold.
  • the threshold is greater than zero. If the input node has not received any data from the entity (i.e. the first data comprises no data), the QI may, for example, be zero for that input node (i.e. below the threshold).
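  • The QI threshold check described above can be sketched in a few lines. This is a minimal, illustrative sketch: the QI formula and the threshold value are assumptions, standing in for a QI derived from, e.g., SNR or RSSI.

```python
# Minimal sketch: data counts as reliable only when its quality
# indicator exceeds a positive threshold, and an absent observation
# (no first data received) yields a QI of zero.

QI_THRESHOLD = 0.2  # a chosen value above which the signal is still useful

def quality_indicator(first_data):
    """QI stand-in, e.g. derived from SNR or RSSI; zero when no data."""
    if not first_data:
        return 0.0
    return sum(abs(x) for x in first_data) / len(first_data)

def is_reliable(first_data):
    return quality_indicator(first_data) > QI_THRESHOLD

reliable = is_reliable([0.4, 0.6])  # QI = 0.5, above the threshold
unreliable = is_reliable([])        # no data received, QI = 0.0
```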
  • the primary node may have the highest QI of the nodes in the network, as determined prior to training of the NNs. This node can then be selected as the primary node for the subsequent training of the NNs and the inference.
  • the NN implemented by a respective input node may output both a quality assessment (for example, a QI) and the second data.
  • the NN may output a quality assessment and some additional data that together form the second data sent to the fusion node.
  • the second data itself (i.e. the data sent to the fusion node if a node is considered to be reliable) may also be used to assess the quality of data at a respective input node. This may allow for detection of a problem that has occurred during processing of first data by the NN implemented by a particular input node that may result in the second data output by the input node being unreliable. This can allow the fusion node to not use the second data output by that input node during inference.
  • an input node may assess the quality of both its respective first data and second data.
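The QI-based reliability test described in the bullets above amounts to a threshold comparison. The sketch below is illustrative only: the SNR-based formula and the names `quality_indicator` and `is_reliable` are assumptions, not part of the disclosure.

```python
import math

def quality_indicator(signal_power: float, noise_power: float) -> float:
    """Illustrative QI derived from the signal-to-noise ratio, in dB.

    Returns 0.0 when no data was received (zero signal power), so a node
    with absent first data always falls below a positive threshold.
    """
    if signal_power <= 0.0:
        return 0.0
    return 10.0 * math.log10(signal_power / noise_power)

def is_reliable(qi: float, threshold: float = 3.0) -> bool:
    """A node's data is considered reliable only if its QI is above the threshold."""
    return qi > threshold
```

In this sketch, only the second data of nodes whose QI passes `is_reliable` would be forwarded to the fusion node for inference.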
  • the respective second data output from the respective neural network implemented at that input node is input to a neural network implemented by the fusion node.
  • node K 102 is also considered to process reliable data, and so its respective second data is also sent to the fusion node to be used for the inference.
  • the second data output from the primary node 101 is used by the fusion node for the inference, along with respective second data output from any other input nodes which are considered to be reliable.
  • the device therefore receives processed data (i.e. respective second data) from the input node designated as the primary node.
  • the device can receive second data from other input nodes that are considered to process reliable data (for example, if the QI for that node is above a threshold).
  • the combined input to the fusion node may be formed by vertically concatenating activation vectors of the last layers (i.e. output layers) of the NNs implemented by the primary node and one or more secondary nodes that have reliable data.
  • the size of the input layer of the NN implemented by the fusion node is equal to the sum of sizes of output layers of the NNs implemented by the primary node and the one or more secondary nodes (the secondary nodes which have reliable data).
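The concatenation rule in the two bullets above can be sketched in a few lines; the helper name `fusion_input` and the example vector sizes are illustrative assumptions.

```python
def fusion_input(second_data):
    """Concatenate the output-layer activation vectors of the reliable
    input nodes to form the combined input of the fusion node's NN."""
    combined = []
    for activations in second_data:
        combined.extend(activations)
    return combined

# The fusion NN's input layer size equals the sum of the senders' output sizes.
primary = [0.1, 0.9, 0.3]      # output layer of the primary node's NN
secondary = [0.7, 0.2]         # output layer of one reliable secondary node
x = fusion_input([primary, secondary])
assert len(x) == len(primary) + len(secondary)
```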
  • fusion node 110 is configured to implement multiple NNs to process second data received from one or more of the input nodes.
  • the inferred output Y of NN 111, which is currently being implemented by the fusion node 110 in Figures 1(a) and 1(b), is shown at 116.
  • the output 116 of the inference may be a property of the entity, such as its location.
  • the number of considered configurations of views (i.e. the number of network states), M, is finite and known in advance.
  • the architecture can be thought of as a finite-state system that dynamically changes its internal state depending on the input data and some decision rule.
  • Each input node is equipped with as many neural networks as a considered number of states, M, where each state is associated with a specific configuration of views.
  • the sensing nodes can also be equipped with local analysis components that are trained to make local assessments of a view. These local assessments can be sent as short messages (beliefs) to the fusion center 110 to help the analysis component 113 in making final decisions as to the current state of the network.
  • the local assessments of the views are shown at 114 and 115 for nodes 101 and 102 respectively.
  • the local assessments sent to the fusion center are different from the quality assessments (e.g. QIs) determined locally at the input nodes to assess the quality of their first or second data in some embodiments.
  • Switching between the neural networks is dictated by feedback signalling from the fusion node 110.
  • the decision about switching the neural networks is in this example performed at a trained analysis component 113 at the fusion center 110 which observes the second data from sensing nodes and (optionally) local assessments of the views performed independently by sensing nodes.
  • the analysis component 113 can be trained off-line to detect a configuration of views based on the distribution of the second data output by the encoding nodes 101, 102.
  • the analysis component may not implement a trained model and instead may use some alternative decision rule.
  • each sensing node 101, 102 and the fusion node 110 can be equipped with M neural networks and a switch.
  • Each NN in a node is associated with a respective configuration of views, and can be trained off-line using appropriate training data.
  • the network can operate in one of M possible states, where each state is associated with a specific configuration of views. The state of the network should be preferably in agreement with the current configuration of views, as shown in Figure 1(a).
  • the case analysis component 113 at the fusion center 110 can select a new state and broadcast the decision to all nodes in the network.
  • the decision made by the analysis component 113 is global and broadcast to all nodes in the network. This can minimize the risk of a conflict between the nodes. Furthermore, since the data received by the fusion center 110 are preferably already compressed, the solution enjoys low bandwidth requirements and alleviates privacy concerns.
  • the state of the network is in agreement with the current configuration of views when the index of the current state matches the index of the current configuration of views. Conversely, when the indices do not match, the state of the network is in disagreement with the current configuration of views.
  • the detection of the current configuration of views is performed on-line, during standard operation of the system.
  • the case analysis component 113 analyses signals received by the fusion center 110 (in Figure 2, these signals are denoted as standard INL signals).
  • the analysis component 113 collects more data and tries to assess whether the current state of the system is in agreement with the current configuration of views.
  • the analysis component may also take into account local beliefs optionally sent by the sensing nodes, indicated at 202.
  • when the analysis component 113 gathers enough evidence indicating a disagreement between the current configuration of views and the state of the system, it selects a new state for the system and broadcasts the decision to all nodes in the network, as shown at 203.
  • the state change is shown at 204 in Figure 2.
  • the fusion node can receive the respective second data, and optionally the local assessments, from the input nodes over a time period and select one of the finite set of network states in dependence on the second data and optional local assessments received during that time period.
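The evidence-gathering behaviour of the analysis component 113 can be sketched as follows. A simple majority vote over a sliding window stands in for the trained analysis model, and the class name, window length and vote threshold are all assumptions for illustration.

```python
from collections import Counter, deque
from typing import Optional

class CaseAnalysis:
    """Sketch of the fusion-center analysis component: it accumulates
    per-observation state estimates over a time window and switches the
    global state only once enough evidence of a mismatch has gathered."""

    def __init__(self, num_states: int, window: int = 5, min_votes: int = 3):
        self.num_states = num_states
        self.current_state = 0
        self.estimates = deque(maxlen=window)   # recent state estimates
        self.min_votes = min_votes

    def observe(self, estimated_state: int) -> Optional[int]:
        """Record one state estimate derived from received second data;
        return a new state to broadcast to all nodes, or None to keep
        the current state."""
        self.estimates.append(estimated_state)
        best, votes = Counter(self.estimates).most_common(1)[0]
        if best != self.current_state and votes >= self.min_votes:
            self.current_state = best
            return best   # this decision is broadcast to every node
        return None
```

With `min_votes = 3`, a single anomalous observation does not trigger a state change; the component waits until the mismatch is seen repeatedly within the window.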
  • the case analysis component processes second data, which may be a compressed representation of the full observations (i.e. the first data), from the sensing nodes.
  • some of the sensing nodes observe some different features of the target.
  • these differences are generally pronounced in the compressed data as well as in the uncompressed data, thus allowing the analysis component to reliably detect the current configuration of views (e.g., presence of a fog) from the second data from the input nodes.
  • the optional local beliefs from the sensing nodes may be useful for the main case analysis component. Nonetheless, the case analysis component makes the global decisions because it can collect data from all sensing nodes.
  • the size of the signals sent by a sensing node may change depending on the state of the network. This may be justifiable in situations when a sensing node observes many or few target features depending on the current configuration of views.
  • the classification of an object (for example, a car, pedestrian, etc.) is based on images taken by three remote surveillance cameras.
  • the classification (i.e. the inference) is performed at a remote fusion center.
  • the assumed model architecture comprises five nodes 1-5, indicated at 301-305 respectively.
  • a target 306 is observed by three cameras (nodes 1, 2, and 3, shown at 301, 302 and 303 respectively), and the final classification is performed at the remote fusion center (node 5, 305).
  • Node 4, 304 is an intermediate node which receives the output of input nodes 302 and 303.
  • This intermediate node can perform some further processing and/or compression on the second data output by nodes 302, 303 before it is provided to the fusion node 305. Therefore, one or more of the nodes in the network may send data derived from the respective second data output by one or more input nodes in the network to the fusion node 305.
  • the output of any intermediate nodes can be considered to comprise second data from the input nodes from which an intermediate node receives data.
  • the fusion node may therefore receive respective second data from at least one of the input nodes directly or indirectly (i.e. via an intermediate node).
  • the view of the area observed by the cameras can change over time.
  • the view of the target seen by nodes 301 and 302 depends on the level of illumination (high/low, e.g., day/night).
  • node 301 can be additionally affected by a fog (present/absent).
  • the four considered configurations of views are listed in Table 1.
  • Table 1. The four considered configurations of views (network states) for the example shown in Figure 3.
  • the sizes of input/output layers of these NNs may vary between the nodes and between different states.
  • the in-network learning framework imposes that the combined size of signals output by the sending nodes must match the input of the following node.
  • NN5,m at node 5 has an input layer of size (K1,5,m + K4,5,m) and an output layer of size dim(Y).
  • the NNs are trained using a distributed backpropagation technique.
  • the NNs in all nodes that are associated with the same state are trained jointly using dedicated training datasets.
  • the NNs associated with the same state are trained independently of NNs associated with other states.
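The signal flow through the Figure-3 topology for one state can be sketched with stand-in linear maps. The layer sizes below are illustrative assumptions, since the K values are not given numerically in this excerpt; only the size-matching constraint between sending and receiving nodes is taken from the text.

```python
import random

random.seed(0)

def linear(x, out_dim):
    """Stand-in for a node's neural network: a fixed random linear map.
    Only the input/output dimensions matter for this sketch."""
    w = [[random.uniform(-1, 1) for _ in x] for _ in range(out_dim)]
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

# Illustrative measurement and layer sizes for one network state.
x1, x2, x3 = [0.5] * 8, [0.2] * 8, [0.9] * 8   # camera measurements
z1 = linear(x1, 4)        # node 1 sends directly to the fusion center
z2 = linear(x2, 4)        # node 2 sends to intermediate node 4
z3 = linear(x3, 4)        # node 3 sends to intermediate node 4
z4 = linear(z2 + z3, 6)   # node 4 concatenates both inputs and re-encodes
y = linear(z1 + z4, 3)    # node 5: input size 4 + 6, output size dim(Y) = 3
assert len(y) == 3
```

The concatenations `z2 + z3` and `z1 + z4` mirror the constraint that the combined size of the signals output by the sending nodes must match the input layer of the following node.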
  • each of the nodes 301-305 is currently implementing the NN corresponding to state 1, shown at 310-314 respectively.
  • Each of the nodes 301-305 may also implement NNs corresponding to states 2-4.
  • the NNs corresponding to state 4 are indicated at 315-319 for nodes 301-305 respectively.
  • the surveillance cameras at nodes 301, 302, and 303 take a measurement (307, 308, 309 respectively) of the observed area repeatedly over time (for example, every minute).
  • the measurements are compressed by the NNs being currently implemented at the sensing nodes 301, 302, 303 to form respective second data.
  • the compressed measurements from node 301 are sent directly to the fusion center at node 305.
  • Nodes 302 and 303 send their compressed measurements to node 304.
  • Node 304 concatenates both inputs and processes them using its own neural network.
  • node 304 sends the output of the neural network to the fusion center at Node 305.
  • the output of intermediate node 304 can be considered to comprise second data from input nodes 302 and 303.
  • the signals from node 301 and node 304 are concatenated and processed by the final neural network.
  • the inferred output Y of NN 314, which is being currently implemented by the fusion node 305 in Figure 3 (corresponding to state 1), is shown at 320.
  • the case analysis component, which in this example comprises a trained model, is shown at 321.
  • Figure 4 depicts an example of a timeline illustrating transitions of the view seen by the nodes of Figure 3 and of the system’s state. Selected stages 1-6 of the considered example will now be described.
  • Stage 1. Nodes 1, 2, and 3 observe the area that is well illuminated and without fog.
  • the case analysis component analyses received signals and it does not detect any mismatch between the configuration of views and the current state of the system.
  • Stage 2. Illumination fades, as indicated at 401. As the illumination level goes down, node 1 and node 2 do not observe the same features of the target as before. The case analysis component detects that the distribution of received signals from node 1 and node 4 changes. In addition, node 1 and node 2 may occasionally send short messages (local beliefs) indicating that the illumination level of their views dropped significantly.
  • Stage 4. Fog appears, as indicated at 402.
  • Node 1 observes very few details of the target variable.
  • the case analysis component observes that the distribution of received signals from node 1 changes again.
  • node 1 may occasionally send short messages (local beliefs) indicating that its view is dark and possibly obscured by the fog.
  • Stage 6. The system operates in a state that is again in agreement with the current configuration of views.
  • the adjustment procedure can repeat each time the case analysis component detects a change of the views. Therefore, the system for distributed inference described herein is a finite-state system that can dynamically adjust to the current configuration of views.
  • the decision about switching to a new state is made at the fusion center based on signalling and optional local assessments from sensing nodes.
  • the nodes are equipped with several neural networks that can be switched depending on the selected network state.
  • the fusion node can broadcast a short, preferably identical message to all nodes in the network each time it infers that the configuration of views seen by the sensing nodes changes.
  • the broadcast message is always the same whenever a particular configuration of views appears.
  • the dimensionality and/or the distribution of standard INL signals changes immediately after receiving the feedback signalling from the fusion center.
  • the sensing nodes can optionally send local assessments as additional short messages to the fusion center. These messages are preferably different whenever a view of a sensing node changes.
  • Figure 5 shows an example of a method 500 for implementation at a device in a communication network in accordance with embodiments of the present invention.
  • the communication network comprises multiple input nodes each configured to process respective first data relating to an entity and output respective second data and being configured to operate according to a finite set of network states.
  • the method comprises receiving respective second data from at least one of the input nodes.
  • the method comprises, in dependence on the or each respective second data from the at least one of the input nodes, selecting one of the finite set of network states.
  • the method comprises sending a signal indicating the selected one of the finite set of network states to the at least one of the input nodes.
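The three steps of method 500 can be sketched end to end. The transport and the decision rule are abstracted as callables and are purely illustrative; the function name and the toy mean-based rule are assumptions.

```python
def method_500(receive_second_data, select_state, broadcast):
    """Sketch of method 500 at the fusion device: receive second data
    from the input nodes, select one of the finite set of network
    states in dependence on it, and signal the selection back."""
    second_data = receive_second_data()      # step 501: receive
    selected = select_state(second_data)     # step 502: select a state
    broadcast(selected)                      # step 503: send the signal
    return selected

# Toy decision rule: pick state 1 when the mean received signal is high.
sent = []
state = method_500(
    receive_second_data=lambda: [0.9, 0.8, 0.95],
    select_state=lambda data: 1 if sum(data) / len(data) > 0.5 else 0,
    broadcast=sent.append,
)
assert state == 1 and sent == [1]
```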
  • Figure 6 shows an example of a method 600 for implementation at a network node in a communications network in accordance with embodiments of the present invention.
  • the network node is configured to communicate with a device in the communications network and to process first data relating to an entity and output second data, the network node being configured to implement multiple neural networks for processing the first data, each neural network corresponding to one of a finite set of network states.
  • the method comprises receiving a signal from the device indicating a selected one of the finite set of network states.
  • the method comprises processing the first data using the neural network corresponding to the indicated one of the finite set of network states.
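Method 600, seen from the node side, reduces to indexing into a bank of per-state encoders. The class and the toy encoders below are assumptions for illustration; they are not the disclosed neural networks.

```python
class InputNode:
    """Sketch of method 600: the node holds one encoder per network
    state and processes first data with the encoder matching the
    state last signalled by the fusion device."""

    def __init__(self, encoders):
        self.encoders = encoders   # one per state, index = state id
        self.state = 0             # initial network state

    def on_state_signal(self, selected_state: int) -> None:
        """Handle the feedback signal indicating the selected state."""
        self.state = selected_state

    def process(self, first_data):
        """Produce second data using the currently selected encoder."""
        return self.encoders[self.state](first_data)

node = InputNode(encoders=[lambda x: [v * 2 for v in x],   # state 0 (e.g. day)
                           lambda x: [v + 1 for v in x]])  # state 1 (e.g. night)
assert node.process([1.0, 2.0]) == [2.0, 4.0]
node.on_state_signal(1)                      # feedback from the device
assert node.process([1.0, 2.0]) == [2.0, 3.0]
```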
  • Embodiments of the distributed learning and inference architecture described herein can adjust dynamically to changes in the characteristics of sensed or collected data during the inference phase among a finite set of possible states.
  • the finite set of states may comprise two or more states.
  • the solutions described above are applicable to a network having an arbitrary number of sensing nodes, optionally an arbitrary number of intermediary nodes, and a fusion node (a fusion center). All nodes in the network are active during the training phase and during the inference phase. Furthermore, every node is equipped with as many neural networks as a considered number of network states. Switching between the neural networks in every node is dictated by feedback signalling from the fusion center, which selects a new state for all nodes in the network. The decision about selecting a new state is determined by the fusion center by analysing its received signals.
  • the distributed learning and inference architecture is applicable to any machine learning scenario in which the inference phase requires two or more sources of input data which may change over time to one of a finite number of possibilities.
  • the solution can be extended to an arbitrary number of input nodes working together to infer a location or position, or other property, of an entity.
  • One use of the network described above may be, for example, to form an estimated diagnosis of a disease or condition in a patient.
  • the multiple input nodes may each measure a property of the patient (for example, blood pressure, breathing rate, heart rate etc) and process the data using their respective neural networks.
  • the inference at the fusion node device may form an estimated diagnosis for the patient.
  • multiple sensors may observe some area (mountains, forest, a factory) in order to detect some dangerous events (for example, a fire, an accident, etc.) or phenomena.
  • the entity may be a person, a vehicle or a mobile device.
  • the communication network is preferably a wireless network.
  • the fusion node and each of the input nodes may be base stations.
  • the network may alternatively be a wired network.
  • a network may comprise several cameras with computing resources (sensing nodes) which are connected over wire to some remote computing server (acting as the fusion node).

Abstract

A device (110, 305) for a communication network, the network comprising multiple input nodes (101, 102, 301, 302, 303) each configured to process respective first data (104, 105, 307, 308, 309) relating to an entity and output respective second data and configured to operate according to a finite set of network states, configured to: receive (501) respective second data from at least one of the input nodes (101, 102, 301, 302, 303); in dependence on the or each respective second data, select (502) one of the network states; send (503) a signal indicating the selected one of the network states to the at least one of the input nodes (101, 102, 301, 302, 303). A corresponding network node (101, 102, 301, 302, 303) is also described. This may allow for a device that can perform inference using a distributed learning network that can dynamically adjust to changes in network state.

Description

APPARATUS AND METHOD FOR FINITE-STATE DYNAMIC IN-NETWORK LEARNING
FIELD OF THE INVENTION
This disclosure relates to the processing of data in a network, for example to distributed learning and inference in such a network.
BACKGROUND
There are many techniques where multiple nodes work together to perform distributed learning. A practical example includes multiple base stations (BSs) communicating over a wireless network to locate a user.
In many situations, parts of relevant data are obtained or measured at multiple sites and only part of the information available at each of the distributed sources of input data may be useful for a given task. A source of data can be referred to as a “view” that represents some, possibly noisy, version of the target to be inferred. An example of a view is an area observed by some surveillance camera, and a sample of this view is a picture taken by the camera.
If the task of a system is to detect and classify a vehicle that is present within an observed area, then only some small portion of the information in the picture may be useful for the system. Other possible examples of views could be the access to the channel state information (CSI) between a mobile user and a base station in order to estimate the location of a user, access to medical databases in order to detect a potential disease of a patient, or access to a search history of a user to be used in some recommendation system.
In a distributed setting (for example, with multiple cameras), each view may provide some unique piece of information about the target that is not available to other sensing nodes. On the other hand, these views can be partially correlated.
Sometimes, the data/measurements cannot be transmitted directly to a remote central processing center due to limited bandwidth and/or privacy concerns. Thus, parts of possibly correlated data may be processed locally by each node during both training and inference phases.
During the training phase for such a system, the sensing nodes can learn how to efficiently compress samples from their respective views by suppressing irrelevant and/or redundant information about the target. The processed and optionally compressed measurements can be then transmitted over the network to a remote central center.
Good results can be achieved when the available views of the target are approximately static over time, i.e., a sensing node always observes similar types of features of the target, regardless of whether they are useful and/or redundant. In such cases, the sensing node may learn how to fine-tune the processing and compression process to better emphasize the useful information about the target.
However, in many real-world applications, the views of the target may change over time. In a simple example, consider a distributed network of surveillance cameras for traffic monitoring. During the daylight hours, the cameras may observe some different features of passing vehicles than during night. Thus, the sensing nodes may adapt their compression process depending on the level of illumination, even though the main objective (traffic monitoring) remains the same.
There exist two major challenges associated with this problem. Firstly, detecting the change of views. Secondly, adjusting the operation of all nodes in the system to a new detected configuration of views.
The simplest and the most popular approach is to train the sensing nodes to apply a “universal” compression that tries to fit all considered configurations of views. This “fit for all” approach can be highly sub-optimal, because the sensing nodes may suppress the information that in specific situations could be useful and/or the nodes must send some redundant information in case it would be needed. As a result, there may be a degradation in accuracy of inference and/or an increase of required power or bandwidth resources.
Another approach is to let each sensing node detect the change of its own view and to autonomously adapt to this new view. For example, each sensing node could be equipped with several neural networks where each network is “tuned” to a specific type of view (e.g., one dedicated network for operation during a day and another for operation during a night). Next, whenever a sensing node detects the change in the view (e.g., illumination drops below a certain threshold), it switches between the neural networks.
This independent and autonomous adaptation at the sensing nodes offers some flexibility, but at the same time it may suffer from several problems. Firstly, in many situations, a single node cannot reliably assess what is observed by the remaining sensing nodes. For example, if one of the surveillance cameras has a view obscured by fog, it does not necessarily mean that other views are also degraded by the fog. As a result, a single sensing node may adapt to a new view in a way that is sub-optimal with respect to other sensing nodes. Secondly, autonomous adaptation of sensing nodes may lead to conflicts between the nodes and to disruption of the whole system.
The above example emphasizes that the decision about the current configuration of views should preferably be orchestrated between the nodes in the distributed network, and that the decision should be based on the joint assessment of all views. The natural technical solution would be to introduce a centralized unit that receives full (uncompressed) observations from all sensing nodes, performs a joint assessment of the samples, and broadcasts a decision to all nodes in the network. After receiving the decision, each node adjusts its operation accordingly. Unfortunately, a major drawback in this solution is that sending raw uncompressed observations to a central unit requires significant bandwidth and raises privacy concerns.
It is desirable to develop a method which overcomes such problems.
SUMMARY
According to one aspect, there is provided a device for a communication network, the communication network comprising multiple input nodes each configured to process respective first data relating to an entity and output respective second data and being configured to operate according to a finite set of network states, the device being configured to: receive respective second data from at least one of the input nodes; in dependence on the or each respective second data from the at least one of the input nodes, select one of the finite set of network states; and send a signal indicating the selected one of the finite set of network states to the at least one of the input nodes.
This may allow for a device that can effectively infer a property of an entity using a distributed learning network that can dynamically adjust to changes in observed views.
The device may be configured to implement multiple neural networks, each neural network corresponding to one (i.e. a different one) of the finite set of network states. The device may be configured to process the or each respective second data from the at least one of the input nodes using the neural network corresponding to the selected one of the finite set of network states. This may allow the device to process the second data from the input nodes using a neural network suitable for the conditions of the current network state. The device may be configured to dynamically adjust the selected one of the finite set of network states in dependence on the or each respective second data from the at least one of the input nodes. This may allow for more accurate inference and more efficient usage of the available communication resources.
The device may be configured to receive the respective second data from at least one of the input nodes over a time period and select the one of the finite set of network states in dependence on the or each respective second data from the at least one of the input nodes received during the time period. This may allow the device to select the appropriate network state more accurately.
The device may be configured to: input the respective second data from the at least one of the input nodes into a trained analysis model implemented by the device; and select the one of the finite set of network states in dependence on an output of the trained analysis model.
The device may be further configured to receive a respective local assessment of the respective first data from the at least one of the input nodes and select the one of the finite set of network states in further dependence on the or each local assessment. This may allow the device to more accurately determine the current network state.
The device may be further configured to input the or each respective local assessment into the trained analysis model and select the one of the finite set of network states in dependence on the output of the trained analysis model. This may allow the device to more accurately detect a configuration of views (i.e. determine the current network state).
The respective second data may be a compressed representation of the respective first data. The compressed representation may be output by the neural network implemented at the respective input node. This may be an efficient way of processing the data in the network, for example where there is limited bandwidth and/or privacy concerns.
The respective second data may comprise an output of a respective neural network implemented by the respective input node. The use of a neural network may allow the respective first data to be processed at the respective input node and allow the output to be provided to the device. The respective neural network may be one of multiple neural networks each corresponding to a network state of the finite set of network states. This may allow the input node to effectively process the first data under the current network state conditions.
The respective second data may comprise an activation vector of a last layer of the respective neural network implemented by the respective input node. This may allow the device to receive the second data from input nodes and allow data from multiple input nodes to be combined to form a combined input to the neural network implemented by the device.
The device may be configured to process respective second data received by the device from two or more of the input nodes, and the device may be configured to concatenate the respective second data from the two or more of the input nodes to form a combined input and select the one of the finite set of network states in dependence on the combined input. This may allow the device to select an appropriate network state in dependence on the data processed at multiple input nodes.
The respective first data may represent a view of the entity and each network state of the finite set of network states may be associated with a specific configuration of views. This may allow a network state to be efficiently determined.
The device and each of the input nodes may be base stations. This may allow the device to be used in telecommunications networks.
The entity may be a vehicle or a mobile device. This may allow the device to determine a property of a variety of entities that is of use in real-world scenarios.
The device may be configured to determine a location of the entity. This may allow the device to be used in applications requiring accurate location determination.
The respective first data processed at each input node may comprise channel state information. This may allow the device to utilise CSI signals sent from the entity.
According to a second aspect, there is provided a method for implementation at a device in a communication network, the communication network comprising multiple input nodes each configured to process respective first data relating to an entity and output respective second data and being configured to operate according to a finite set of network states, the method comprising: receiving respective second data from at least one of the input nodes; in dependence on the or each respective second data from the at least one of the input nodes, selecting one of the finite set of network states; and sending a signal indicating the selected one of the finite set of network states to the at least one of the input nodes.
This method may allow for effective inference of a property of an entity using a distributed learning network that can dynamically adjust to changes in observed views.
According to a third aspect, there is provided a network node configured to communicate with a device in a communications network and to process first data relating to an entity and output second data and being configured to operate according to a finite set of network states, the network node being configured to implement multiple neural networks for processing the first data, each neural network corresponding to one of the finite set of network states, the network node being configured to: receive a signal from the device indicating a selected one of the finite set of network states; and process the first data using the neural network corresponding to the indicated one of the finite set of network states.
This may allow for dynamic adjustment in response to changes in observed views at a network node processing sensed data that can be used to infer a property of an entity.
The network node may be configured to send the second data output by the network node to the device. The second data may comprise the output of the neural network implemented by the network node. The second data may comprise an activation vector of a last layer of the neural network implemented by the network node. This may allow the device to use the second data to infer a property of the entity.
The second data may comprise the output of the neural network corresponding to the one of the finite set of network states. This may allow the network node to output data that has been processed accordingly in dependence on a current network state.
The second data may comprise an activation vector of a last layer of the neural network implemented by the network node. The use of a neural network may allow the respective first data to be processed at the respective input node and allow the output to be provided to the device.
The network node may be configured to perform a local assessment of the first data and indicate an estimate of a current network state of the finite set of network states to the device. By providing the result of a local assessment of the first data to the device, this may allow the device to determine the current network state more accurately.
The first data may represent a view of the entity. The view may be an image, sound, or other modality which can inform something about the presence/properties/state of the entity. A view may represent some, possibly noisy, version of the target to be inferred. An example of a view is an area observed by a surveillance camera, and a sample of this view is a picture taken by the camera.
The second data may be a compressed representation of the first data. This may allow for more efficient processing of the data from input nodes at the device, for example where there is limited bandwidth and/or privacy concerns.
The first data may comprise channel state information. This may allow the device to utilise CSI signals sent from the entity.
According to a further aspect, there is provided a method for implementation at a network node in a communications network, the network node being configured to communicate with a device in the communications network and to process first data relating to an entity and output second data, the network node being configured to implement multiple neural networks for processing the first data, each neural network corresponding to one of a finite set of network states, the network node being configured to: receive a signal from the device indicating a selected one of the finite set of network states; and process the first data using the neural network corresponding to the indicated one of the finite set of network states.
According to a further aspect, there is provided a computer readable storage medium having stored thereon computer readable instructions that, when executed at a computer system, cause the computer system to perform the method as set forth above. The computer system may comprise one or more processors. The computer readable storage medium may be a non-transitory computer readable storage medium.
BRIEF DESCRIPTION OF THE FIGURES
The present disclosure will now be described by way of example with reference to the accompanying drawings. In the drawings:
Figures 1(a) and 1(b) schematically illustrate examples of an inference diagram for the architecture described herein in the case in which there are K nodes that observe measurements from an entity. Figure 1(a) illustrates agreement between the views (case '1') and the state of the system (state '1'). Figure 1(b) illustrates disagreement between the views (case 'M') and the state of the system (state '1').
Figure 2 shows a timeline illustrating the adjustment of the system to a new configuration of views and a simplified representation of signalling between the nodes.
Figure 3 schematically illustrates an exemplary diagram of a considered detection/ classification problem.
Figure 4 shows a timeline illustrating the adjustment of the system to a new configuration of views.
Figure 5 shows an example of a method for implementation at a device in a communication network in accordance with embodiments of the present invention.
Figure 6 shows an example of a method for implementation at a network node in a communication network in accordance with embodiments of the present invention.
DETAILED DESCRIPTION
Described herein is an architecture for distributed learning and inference that can allow for dynamic adjustment of the system to changes in the observed views.
One possible and efficient solution to distributed learning that can be used as a general framework for embodiments of the present invention is “In-network Learning” (INL). INL is a distributed learning and inference architecture in which an arbitrary number of nodes are involved during both the training phase and the inference phase. All nodes that are involved during the training phase are also active during the inference phase. The nodes operate simultaneously, not sequentially. Specifically, during the training phase, every node uses its own neural network (NN) to perform a forward pass on its data, possibly also using all received information from previous nodes in the network as part of the NN.
In such a system, some nodes may not always have reliable data. This issue can arise due to a fault in the data acquisition process, interference, or simply the absence of data to be measured by the node. In the INL system, if a node is not considered to collect reliable data, or has no available data, the node only uses any incoming information as input to its NN. If the node has no parents in the network, it only uses its available data as input. In all cases, the available/acquired information is concatenated vertically in a vector of inputs prior to using it as input of the node's NN. Then, the node sends the vector output of the last layer (called the activation vector) of its NN to the next nodes to which it is connected in the graph network. The propagation of the forward pass continues until it reaches the end node, at which the decision is inferred. This end node completes the forward pass and then computes a backward pass on its local NN. The output of the first layer of its NN during the backward step is first split vertically and then sent back to the parent nodes. Each of those parent nodes first computes the sum of all vectors it receives and then continues its own backward pass. The process continues until convergence.
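The vertical concatenation on the forward pass, and the matching vertical split and summation on the backward pass, can be sketched as follows (an illustrative toy fragment, not the actual INL implementation; the vector sizes and gradient values are arbitrary assumptions):

```python
import numpy as np

# Toy fragment: two parent nodes send 3- and 2-dimensional activation
# vectors to a child node (all values are arbitrary illustrative numbers).
a1 = np.array([0.1, 0.2, 0.3])   # activation vector from parent node 1
a2 = np.array([0.4, 0.5])        # activation vector from parent node 2

# Forward pass: the child concatenates the incoming vectors vertically
# before feeding them to its own NN.
child_input = np.concatenate([a1, a2])

# Backward pass: the gradient at the child's input layer is split
# vertically and each part is sent back to the corresponding parent.
grad_at_child_input = np.array([1.0, 1.0, 1.0, 2.0, 2.0])
grad_to_parent1, grad_to_parent2 = np.split(grad_at_child_input, [len(a1)])

# A parent with several children sums the vectors it receives before
# continuing its own backward pass.
grad_from_other_child = np.array([0.5, 0.5, 0.5])
summed_grad = grad_to_parent1 + grad_from_other_child
```

The split point is fixed by the sizes of the parents' activation vectors, so each parent receives a gradient of exactly the size it sent.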
Embodiments of the present invention target a technical problem of distributed learning and inference when the available views of the target change over time due to some repetitive and/or predictable phenomena. For example, day and night illumination conditions, a fog partially occluding a view, etc. The main objective of the inference (for example, target detection, classification, localization etc.) remains the same. This may be of particular use in situations with constrained transmission resources and/or privacy concerns related to sharing raw data collected by sensing nodes in a network.
In embodiments of the present invention, a network comprises devices including multiple input nodes and a device for processing data received from one or more of the input nodes. The latter device is herein referred to as a fusion node or fusion centre. The device and each of the input nodes may comprise a processor and a memory. The memory stores in a non-transient way code that is executable by the processor to implement the device or respective node in the manner described herein. The device and each of the input nodes also comprise a transceiver for transmitting and receiving data. The network may be a wireless network or a wired network.
There can be an arbitrary number of information sources that provide information about an entity that are observed or measured or received each at a distinct input node in the network. These input nodes can encode their information and send it to the fusion node, which performs the inference. The inference stage can infer a property of the entity, such as its location.
Figures 1(a) and 1(b) present diagrams of exemplary architectures for a simple, multi-access distributed network with K sensing nodes and a fusion node. This example can be naturally extended to more complex, multi-hop networks. Figure 1(a) illustrates agreement between the views (case '1') and the state of the system (state '1'). Figure 1(b) illustrates disagreement between the views (case 'M') and the state of the system (state '1').
In Figure 1(a), node 1 is shown at 101 and node K at 102. The input nodes 101, 102 are each configured to process data relating to an entity, herein referred to as respective first data. In this example, respective first data is received from the entity by the respective node. The first data may be transmitted from the entity to the nodes via the internet.
In the preferred implementation, the first data is a view of the entity. This view could be an image, sound, or other modality which can inform something about the presence/properties/state of the entity. A view may represent some, possibly noisy, version of the target to be inferred. An example of a view is an area observed by a surveillance camera, and a sample of this view is a picture taken by the camera. Other possible examples of views could be the access to the CSI between a mobile user and a base station in order to estimate the location of a user, access to medical databases in order to detect a potential disease of a patient, or access to a search history of a user to be used in some recommendation system.
Let Y 103 denote the target for an entity to be inferred at the fusion node, and X1 104, XK 105 be the views of Y for respective sensing nodes 101 and 102. These views may change over time, and the number of considered configurations of views is finite and known in advance.
In this example, the input nodes 101, 102 each receive a data signal from the entity comprising the respective first data. Alternatively, the input nodes may make measurements of a property or characteristic of the entity and optionally record the first data measurements at the respective input node. In a further example, an input node may receive a data signal from a camera sensor that records images, or video, of the entity. In the above examples, the first data therefore relates to the entity. The input nodes are each configured to process their respective first data.
The network therefore comprises multiple input nodes each configured to process respective first data relating to the entity. The first data may be sent to the input node from the entity. Preferably, each of the input nodes observes a different measurement relating to the entity. For example, for a problem of localization, to determine the location of an object, one sensor providing first data to be processed at an input node may be an accelerometer located at the entity (for example, inside a mobile phone) and another may record a GPS location of the entity. In another example, one sensor providing first data to be processed at an input node may record video and another sensor may be an accelerometer located at the entity (for example, inside a mobile phone). In another example, base stations acting as input nodes may measure signal strength or channel state estimation and process this data.
As illustrated in Figure 1(a), each node 101, 102 is configured to implement multiple NNs to process its respective first data. In the preferred implementation, the nodes each implement one NN at any one time. The NNs that can be implemented by an input node each correspond to a different network state 1 to M. Node 1 101 is configured to implement NNs 1 to M. NN 1 is indicated at 106 and NN M at 107. Node K 102 is configured to implement NNs 1 to M. NN 1 is indicated at 108 and NN M at 109. In the situation illustrated in Figure 1(a), the nodes 101 and 102 are currently implementing NNs 106 and 108 respectively, which each correspond to network state '1'.
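The per-state selection of a neural network at a sensing node can be sketched as follows (a hypothetical minimal model; the class name, the two states, and the stand-in "NNs" that merely rescale their input are illustrative assumptions, not part of the disclosure):

```python
class SensingNode:
    """Hypothetical node holding one model per network state; a switch
    selects which model is active at any one time."""

    def __init__(self, models):
        self.models = models   # state index m -> callable standing in for NN m
        self.state = 1         # current network state

    def on_state_signal(self, m):
        # A feedback signal from the fusion node selects the active NN.
        self.state = m

    def encode(self, first_data):
        # Process the first data with the NN of the current state.
        return self.models[self.state](first_data)

# Two states (e.g. day/night) with stand-in "NNs" that merely rescale:
node = SensingNode({1: lambda x: [v * 0.5 for v in x],
                    2: lambda x: [v * 2.0 for v in x]})
day = node.encode([1.0, 2.0])     # state 1 active by default
node.on_state_signal(2)           # fusion node broadcasts a state change
night = node.encode([1.0, 2.0])   # state 2's NN is now used
```

The same first data is thus encoded differently depending on the current network state.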
In a preferred embodiment, the NNs implemented at the input nodes compress the first data to form the second data.
During the training stage of the system, the input nodes can learn how to efficiently compress samples from their respective views by suppressing irrelevant and/or redundant information about the target, and this can be done in dependence on the network state (i.e. the configuration of views). For example, during daylight hours, a camera may observe some different features of passing cars than during the night. Thus, the sensing nodes can adapt their compression process depending on the level of illumination, even though the main objective (traffic monitoring) remains the same. The processed and compressed measurements can then be transmitted over the network to a fusion center. Therefore, the implementation of different NNs by the input nodes, in dependence on the current network state, can help the system to process and compress the data more appropriately.
The NNs implemented by the input nodes output respective second data. The first data preferably has a different data format to the second data.
In the preferred implementation, the output of an input node comprises an activation vector of the last layer (i.e. the output layer) of the respective neural network implemented by that input node. The input nodes are each configured to output their respective second data.
The respective second data output by each of the input nodes can be sent to fusion node 110 for the inference. Fusion node 110 is also configured to implement multiple NNs to process second data received from one or more of the input nodes. In the preferred implementation, the fusion node implements one NN at any one time. The NNs that can be implemented by the fusion node each correspond to a different network state 1 to M. NN 1 is indicated at 111 and NN M at 112.
In some embodiments, not all of the input nodes send second data to the fusion node. Some of the input nodes may process data that may be considered to be more reliable than the data processed by other input nodes. In a preferred example, the sources of information measured by each input node can be placed in order of reliability. This can be done prior to the training of the NNs implemented by the input nodes and the fusion centre. In this example, node 1 101 is the most reliable node, i.e. node 1 processes the most reliable data from the entity. Node K 102 is the least reliable node.
This node 101 is referred to herein as the primary node. The primary node is a predetermined node which is considered to process the most reliable data of the data processed by the input nodes. The other, less reliable, nodes in the network can be referred to as secondary nodes.
For example, node 101 may be the spatially closest node to the entity and may therefore be considered to receive the most reliable data from the entity because the distance over which data signals are wirelessly transmitted is the smallest. The least reliable input node may be located furthest from the entity (i.e. data may be more unreliable in proportion to the distance of the input node from the entity). In another example, one input node may process data from a sensor that is more expensive and/or sophisticated than another and as a consequence may be more reliable. In a further example, one sensor may collect data of a type that is easier to process and therefore may be more reliable. For example, the algorithm used for location based on GPS data can be more accurate than localization based on images, for example from video. In another example, some sensors may be easier to access and maintain and thus would be easier to fix and keep working and therefore may be more reliable. The most reliable node and the order of reliability of the secondary nodes is preferably predetermined before training the neural networks implemented by the device.
Data may also be unreliable due to a fault in the data acquisition process, interference or the absence of data to be measured or received by the input node (i.e. an absence of respective first data at a respective input node).
The input node may determine whether its data is reliable, which can allow the input nodes to be ranked in order of reliability, for example, ranked from the primary node (node 1 101 in Figure 1(a)) to the least reliable secondary node (node K 102 in Figure 1(a)). As shown in Figure 1(a), the input nodes 101, 102 and the fusion node 110 are each configured to implement multiple neural networks. At least some of the multiple NNs are different. Two NNs may be considered to be different even if they have the same architecture, provided that they have different weights.
In some embodiments, during the inference phase, the input nodes that are not the primary node (the primary node being considered to be the most reliable node) can assess the quality of the data they have received, measured or processed relating to the entity. In a preferred example, the assessment of the quality of the first and/or second data at an input node is performed by determining a Quality Indicator (QI). The QI may be received at the input node alongside the first data. For example, the QI may be (or may be determined in dependence on) a received signal strength indicator (RSSI). Alternatively, the NN implemented by a respective input node may output a confidence level on its prediction and the QI may be determined in dependence on the confidence level. In another example, the QI may be measured directly from the first data, for example based on the signal-to-noise ratio (SNR) of the first data.
Data at an input node may be considered to be reliable if the determined QI for that input node is above a threshold. The threshold may be predetermined. The threshold may be a chosen value above which the signal is still considered to be useful or reliable. Data at an input node may be considered to be unreliable if the QI for that input node is not above the threshold. The threshold is greater than zero. If the input node has not received any data from the entity (i.e. the first data comprises no data), the QI may, for example, be zero for that input node (i.e. below the threshold).
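A minimal sketch of this thresholding rule (the node names, QI values and the threshold value are illustrative assumptions):

```python
def is_reliable(qi, threshold=0.3):
    """Data counts as reliable only if its quality indicator exceeds a
    predetermined, greater-than-zero threshold."""
    return qi > threshold

# Hypothetical QIs: a strong primary node, a usable secondary node, and a
# node that received no data at all (QI of zero, hence below threshold).
qis = {"node1": 0.9, "node2": 0.5, "nodeK": 0.0}
reliable_nodes = [name for name, qi in qis.items() if is_reliable(qi)]
```

Only the second data of the nodes in `reliable_nodes` would then be forwarded to the fusion node for inference.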
The primary node may have the highest QI of the nodes in the network, as determined prior to training of the NNs. This node can then be selected as the primary node for the subsequent training of the NNs and the inference.
The NN implemented by a respective input node may output both a quality assessment (for example, a QI) and the second data. In another example, the NN may output a quality assessment and some additional data that together form the second data sent to the fusion node. The second data itself (i.e. the data sent to the fusion node if a node is considered to be reliable) may also be used to assess the quality of data at a respective input node. This may allow for detection of a problem that has occurred during processing of first data by the NN implemented by a particular input node that may result in the second data output by the input node being unreliable. This can allow the fusion node to not use the second data output by that input node during inference. In some examples, an input node may assess the quality of both its respective first data and second data.
If the first data processed at a secondary input node and/or the second data output at the secondary node is considered to be reliable, the respective second data output from the respective neural network implemented at that input node is input to a neural network implemented by the fusion node.
As the primary node 101 is considered to process reliable data, its second data (the output of its NN) is sent to the fusion node during inference. In the example shown in Figure 1(a), node K 102 is also considered to process reliable data, and so its respective second data is also sent to the fusion node to be used for the inference.
Therefore, the second data output from the primary node 101 is used by the fusion node for the inference, along with respective second data output from any other input nodes which are considered to be reliable.
The device therefore receives processed data (i.e. respective second data) from the input node designated as the primary node. In addition, the device can receive second data from other input nodes that are considered to process reliable data (for example, if the QI for that node is above a threshold).
The combined input to the fusion node may be formed by vertically concatenating activation vectors of the last layers (i.e. output layers) of the NNs implemented by the primary node and one or more secondary nodes that have reliable data. In this case, the size of the input layer of the NN implemented by the fusion node is equal to the sum of sizes of output layers of the NNs implemented by the primary node and the one or more secondary nodes (the secondary nodes which have reliable data).
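This sizing relation can be expressed as a short check (the output-layer sizes below are hypothetical values):

```python
# Hypothetical output-layer sizes of the NNs at the primary node and at
# two secondary nodes whose data is currently considered reliable.
output_sizes = {"primary": 16, "secondary_a": 8, "secondary_b": 8}

# The fusion node's NN takes the vertical concatenation of these
# activation vectors, so its input layer size is their sum.
fusion_input_size = sum(output_sizes.values())
```

If a secondary node's data becomes unreliable, its entry drops out of the sum and a fusion NN with a correspondingly smaller input layer is needed.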
As discussed above, fusion node 110 is configured to implement multiple NNs to process second data received from one or more of the input nodes. The NNs that can be implemented by the fusion node, shown at 111 and 112 in Figure 1(a), each correspond to a different network state 1 to M.
The inferred output Ŷ of NN 111, which is currently being implemented by the fusion node 110 in Figures 1(a) and 1(b), is shown at 116. The output 116 of the inference may be a property of the entity, such as its location. The number of considered configurations of views (i.e. the number of network states), M, is finite and known in advance. Thus, the architecture can be thought of as a finite-state system that dynamically changes its internal state depending on the input data and some decision rule.
Each input node is equipped with as many neural networks as a considered number of states, M, where each state is associated with a specific configuration of views.
The sensing nodes can also be equipped with local analysis components that are trained to make local assessments of a view. These local assessments can be sent as short messages (beliefs) to the fusion center 110 to help the analysis component 113 in making final decisions as to the current state of the network. The local assessments of the views are shown at 114 and 115 for nodes 101 and 102 respectively. In the preferred implementation, the local assessments sent to the fusion centre are different to the quality assessments (e.g. QIs) determined locally at the input nodes to assess the quality of their first or second data in some embodiments.
Switching between the neural networks is dictated by feedback signalling from the fusion node 110. The decision about switching the neural networks is in this example performed at a trained analysis component 113 at the fusion center 110 which observes the second data from sensing nodes and (optionally) local assessments of the views performed independently by sensing nodes.
The analysis component 113 can be trained off-line to detect a configuration of views based on the distribution of the second data output by the encoding nodes 101, 102.
In other examples, the analysis component may not implement a trained model and instead may use some alternative decision rule.
Therefore, assuming M different view configurations (indexed by m=1,...,M), each sensing node 101, 102 and the fusion node 110 can be equipped with M neural networks and a switch. Each NN in a node is associated with a respective configuration of views, and can be trained off-line using appropriate training data. The network can operate in one of M possible states, where each state is associated with a specific configuration of views. The state of the network should preferably be in agreement with the current configuration of views, as shown in Figure 1(a).
In the case of disagreement, as shown in Figure 1(b), where the network is now in state M, the case analysis component 113 at the fusion center 110 can select a new state and broadcast the decision to all nodes in the network.
In the preferred implementation, the decision made by the analysis component 113 is global and broadcast to all nodes in the network. This can minimize the risk of a conflict between the nodes. Furthermore, since the data received by the fusion center 110 are preferably already compressed, the solution enjoys low bandwidth requirements and alleviates privacy concerns.
Figure 2 presents a timeline illustrating the adjustment of the system to a new configuration of views (one of states m=1,...,M) and a simplified representation of signalling between the nodes. Different configurations of views and different states of the system are illustrated above and below the timeline.
The state of the network is in agreement with the current configuration of views when the index of the current state matches the index of the current configuration of views. Conversely, when the indices do not match, the state of the network is in disagreement with the current configuration of views.
The detection of the current configuration of views is performed on-line, during standard operation of the system. The case analysis component 113 analyses signals received by the fusion center 110 (in Figure 2, these signals are denoted as standard INL signals).
At first, there is agreement between the state of the network and the current configuration of views, both being m=1.
At 201, there is a view change and the current configuration of views changes to M. At this point, the state of the network remains as 1.
As time progresses, the analysis component 113 collects more data and tries to assess whether the current state of the system is in agreement with the current configuration of views. The analysis component may also take into account local beliefs optionally sent by the sensing nodes, indicated at 202.
When the analysis component 113 gathers enough evidence indicating a disagreement between the current configuration of views and the state of the system, it selects a new state for the system and broadcasts the decision to all nodes in the network, as shown at 203. The state change is shown at 204 in Figure 2.
Therefore, the fusion node can receive the respective second data, and optionally the local assessments, from the input nodes over a time period and select one of the finite set of network states in dependence on the second data and optional local assessments received during that time period.
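One possible decision rule of this kind can be sketched as follows (a hypothetical majority-vote rule with an evidence threshold; the disclosure does not prescribe a specific rule, so the function name, vote counts and threshold are illustrative assumptions):

```python
from collections import Counter

def select_state(estimates, current_state, evidence_threshold=5):
    """Switch the network state only once enough recent per-sample state
    estimates disagree with the current state."""
    best_state, votes = Counter(estimates).most_common(1)[0]
    if best_state != current_state and votes >= evidence_threshold:
        return best_state  # this decision would be broadcast to all nodes
    return current_state

# Shortly after a view change, evidence is still too weak to switch:
unchanged = select_state([1, 1, 4, 4, 4], current_state=1)
# Once enough samples indicate the new configuration, the state switches:
switched = select_state([4, 4, 4, 4, 4, 1], current_state=1)
```

Delaying the switch until sufficient evidence has accumulated avoids spurious state changes caused by a few noisy samples.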
In the preferred implementation, the case analysis component processes second data, which may be a compressed representation of the full observations (i.e. the first data), from the sensing nodes. For different configurations of views, some of the sensing nodes observe some different features of the target. As a consequence, these differences are generally pronounced in the compressed data as well as in the uncompressed data, thus allowing the analysis component to reliably detect the current configuration of views (e.g., presence of a fog) from the second data from the input nodes.
The optional local beliefs from the sensing nodes may be useful for the main case analysis component. Nonetheless, the case analysis component makes the global decisions because it can collect data from all sensing nodes.
The size of the signals sent by a sensing node (i.e., the size of the output layer of the selected neural network at that node) may change depending on the state of the network. This may be justifiable in situations when a sensing node observes many or few target features depending on the current configuration of views.
In one example, it is desired to detect/classify an object (for example, a car, pedestrian, etc.) in an outdoor environment. The classification of an object is based on images taken by three remote surveillance cameras. The classification (i.e. the inference) is performed at a remote fusion center.
In this example, depicted in Figure 3, the assumed model architecture comprises five nodes 1-5, indicated at 301-305 respectively. A target 306 is observed by three cameras (nodes 1, 2, and 3, shown at 301, 302 and 303 respectively), and the final classification is performed at the remote fusion center (node 5, 305).
Node 4, 304, is an intermediate node which receives the output of input nodes 302 and 303. This intermediate node can perform some further processing and/or compression on the second data output by nodes 302, 303 before it is provided to the fusion node 305. Therefore, one or more of the nodes in the network may send data derived from the respective second data output by one or more input nodes in the network to the fusion node 305. The output of any intermediate nodes can be considered to comprise second data from the input nodes from which an intermediate node receives data. The fusion node may therefore receive respective second data from at least one of the input nodes directly or indirectly (i.e. via an intermediate node).
The view of the area observed by the cameras can change over time. The view of the target seen by nodes 301 and 302 depends on the level of illumination (high/low, e.g., day/night). Furthermore, node 301 can be additionally affected by a fog (present/absent).
In this example, each of the five nodes 301-305 is equipped with M=4 NNs, where each NN is associated with a specific configuration of views. The four considered configurations of views are listed in Table 1.
Table 1 : The four considered configurations of views (network states) for the example shown in Figure 3.
The sizes of input/output layers of these NNs may vary between the nodes and between different states. In this example, the in-network learning framework imposes that the combined size of signals output by the sending nodes must match the input of the following node. In the assumed example, for every state m = 1, 2, 3, 4, the following relations hold:
• NN1,m at node 1 (301) has an input layer of size dim(X1) and an output layer of size K1,5,m,
• NN2,m at node 2 (302) has an input layer of size dim(X2) and an output layer of size K2,4,m,
• NN3,m at node 3 (303) has an input layer of size dim(X3) and an output layer of size K3,4,m,
• NN4,m at node 4 (304) has an input layer of size (K2,4,m + K3,4,m) and an output layer of size K4,5,m,
• NN5,m at node 5 (305, fusion center) has an input layer of size (K1,5,m + K4,5,m) and an output layer of size dim(Y).
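These relations can be checked numerically for one state (the K values below are arbitrary illustrative choices, not values from the disclosure):

```python
# Hypothetical message sizes K[i, j] for one state m in the Figure 3
# topology (node i sends a vector of this size to node j).
K = {(1, 5): 6, (2, 4): 5, (3, 4): 5, (4, 5): 6}

# Constraint: a node's input layer size equals the combined size of the
# signals it receives.
node4_input_size = K[(2, 4)] + K[(3, 4)]   # node 4 concatenates nodes 2 and 3
node5_input_size = K[(1, 5)] + K[(4, 5)]   # fusion node concatenates nodes 1 and 4
```

Repeating this check for each state m = 1, 2, 3, 4 confirms that the NNs at adjacent nodes remain dimensionally compatible even when the message sizes differ between states.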
Preferably, the NNs are trained using a distributed backpropagation technique. The NNs in all nodes that are associated with the same state are trained jointly using dedicated training datasets. Furthermore, the NNs associated with the same state are trained independently of NNs associated with other states.
In Figure 3, the system is currently in network state 1. Each of the nodes 301-305 is currently implementing the NN corresponding to state 1 , shown at 310-314 respectively. Each of the nodes 301-305 may also implement NNs corresponding to states 2-4. The NNs corresponding to state 4 are indicated at 315-319 for nodes 301-305 respectively.
In one example of the operation of the system, it is assumed that the surveillance cameras at nodes 301, 302, and 303 take a measurement (307, 308, 309 respectively) of the observed area repeatedly over time (for example, every minute). At each time step, the measurements are compressed by the NNs currently implemented at the sensing nodes 301, 302, 303 to form respective second data.
In this example, the compressed measurements from node 301 are sent directly to the fusion center at node 305. Nodes 302 and 303 send their compressed measurements to node 304. Node 304 concatenates both inputs and processes them using its own neural network. Next, node 304 sends the output of its neural network to the fusion center at node 305. The output of intermediate node 304 can be considered to comprise second data from input nodes 302 and 303. At the fusion center 305, the signals from node 301 and node 304 are concatenated and processed by the final neural network. The inferred output Y of NN 314, which is currently implemented by the fusion node 305 in Figure 3 (corresponding to state 1), is shown at 320. The case analysis component, which in this example comprises a trained model, is shown at 321.
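The data flow just described (compression at the sensing nodes, concatenation and re-encoding at node 304, fusion at node 305) can be sketched as follows. Simple list operations stand in for the neural networks; all names are illustrative.

```python
# Minimal sketch of the inference-phase data flow for one state m.
# List concatenation models the concatenation of signals; a toy "halve"
# function stands in for the compressing neural networks at each node.

def forward_pass(x1, x2, x3, nns):
    z1 = nns[1](x1)        # node 1 compresses and sends directly to the fusion center
    z2 = nns[2](x2)        # node 2 compresses and sends to intermediate node 4
    z3 = nns[3](x3)        # node 3 compresses and sends to intermediate node 4
    z4 = nns[4](z2 + z3)   # node 4 concatenates both inputs and re-encodes them
    y = nns[5](z1 + z4)    # fusion center concatenates and infers the output Y
    return y

# toy "networks": keep the first half of the vector to mimic compression
halve = lambda v: v[: max(1, len(v) // 2)]
nns = {i: halve for i in (1, 2, 3, 4, 5)}
y = forward_pass([1] * 8, [2] * 8, [3] * 8, nns)
```

In a real deployment each `nns[i]` would be the neural network of node i for the currently selected state, with the layer sizes obeying the relations listed above.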
Figure 4 depicts an example of a timeline illustrating transitions of the view seen by the nodes of Figure 3 and of the system's state. Selected stages 1-6 in the considered example will now be described.
Stage 1. Nodes 1, 2, and 3 observe an area that is well illuminated and free of fog. The system operates in state m=1, in agreement with the current configuration of views. The case analysis component analyses the received signals and does not detect any mismatch between the configuration of views and the current state of the system.
Stage 2. Illumination fades, as indicated at 401. As the illumination level goes down, node 1 and node 2 do not observe the same features of the target as before. The case analysis component detects that the distribution of received signals from node 1 and node 4 changes. In addition, node 1 and node 2 may occasionally send short messages (local beliefs) indicating that the illumination level of their views dropped significantly.
Stage 3. The case analysis component continues to receive more data indicating the change of the views of node 1 and node 2. Once the analysis component gathers enough evidence, it decides to select a new state, m=3. The decision is broadcast to all nodes. From then on, the nodes process the input data using the newly selected neural networks corresponding to state 3.
Stage 4. Fog appears, as indicated at 402. Node 1 observes very few details of the target variable. The case analysis component observes that the distribution of received signals from node 1 changes again. In addition, node 1 may occasionally send short messages (local beliefs) indicating that its view is dark and possibly obscured by the fog.
Stage 5. The case analysis component continues to receive more data indicating the change of the view from node 1. Once it gathers enough evidence, it decides to select a new state, m=4. The decision is broadcast to all nodes. From then on, the nodes process the input data using the newly selected neural networks.
Stage 6. The system operates in a state that is again in agreement with the current configuration of views. The adjustment procedure can repeat each time the case analysis component detects a change of the views. Therefore, the system for distributed inference described herein is a finite-state system that can dynamically adjust to the current configuration of views. The decision about switching to a new state is made at the fusion center based on signalling and optional local assessments from the sensing nodes. The nodes are equipped with several neural networks that can be switched depending on the selected network state.
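The evidence-gathering behaviour of stages 2-6 can be sketched as a small state machine. The `classify` callback, the evidence threshold, and the message shapes are assumptions; the description above does not prescribe a particular decision rule.

```python
# Hedged sketch of the case analysis component at the fusion center:
# evidence for a candidate state accumulates over time steps, and a switch
# is broadcast only once enough evidence has been gathered.

class CaseAnalysis:
    def __init__(self, classify, threshold=3, initial_state=1):
        self.classify = classify    # maps received signals/beliefs -> likely state
        self.threshold = threshold  # evidence required before switching states
        self.state = initial_state
        self.evidence = {}

    def step(self, signals, local_beliefs=()):
        """One time step; returns the new state to broadcast, or None."""
        candidate = self.classify(signals, local_beliefs)
        if candidate == self.state:
            self.evidence.clear()   # views match the current state (stage 1/6)
            return None
        self.evidence[candidate] = self.evidence.get(candidate, 0) + 1
        if self.evidence[candidate] >= self.threshold:   # stage 3/5: enough evidence
            self.state = candidate
            self.evidence.clear()
            return candidate        # decision to broadcast to all nodes
        return None                 # stage 2/4: keep gathering evidence

# toy classifier: trust a "likely_state" field in the received signals
ca = CaseAnalysis(classify=lambda s, b: s["likely_state"], threshold=3)
decisions = [ca.step({"likely_state": 3}) for _ in range(3)]
```

Requiring several consistent observations before switching mirrors the "gathers enough evidence" behaviour and avoids oscillating on transient changes.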
The fusion node can broadcast a short, preferably identical message to all nodes in the network each time it infers that the configuration of views seen by the sensing nodes changes. Preferably, the broadcast message is always the same whenever a particular configuration of views appears.
Preferably, the dimensionality and/or the distribution of standard INL signals changes immediately after receiving the feedback signalling from the fusion center.
The sensing nodes can optionally send local assessments as additional short messages to the fusion center. These messages are preferably different whenever a view of a sensing node changes.
Figure 5 shows an example of a method 500 for implementation at a device in a communication network in accordance with embodiments of the present invention. As described in the above examples, the communication network comprises multiple input nodes, each configured to process respective first data relating to an entity and output respective second data, and each configured to operate according to a finite set of network states. At step 501, the method comprises receiving respective second data from at least one of the input nodes. At step 502, the method comprises, in dependence on the or each respective second data from the at least one of the input nodes, selecting one of the finite set of network states. At step 503, the method comprises sending a signal indicating the selected one of the finite set of network states to the at least one of the input nodes.
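A minimal sketch of steps 501-503, with the receiving, state-selection, and signalling operations supplied as assumed callbacks:

```python
# Illustrative shape of method 500 at the fusion device; receive,
# select_state, and send are placeholders for the surrounding system.

def method_500(receive, select_state, send):
    second_data = receive()               # step 501: receive second data
    selected = select_state(second_data)  # step 502: select a network state
    send(selected)                        # step 503: signal the selection
    return selected

sent = []
state = method_500(
    receive=lambda: {"node_1": [0.2, 0.9], "node_4": [0.1]},
    select_state=lambda data: 3 if data["node_1"][1] > 0.5 else 1,
    send=sent.append,
)
```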
Figure 6 shows an example of a method 600 for implementation at a network node in a communications network in accordance with embodiments of the present invention. As described in the examples above, the network node is configured to communicate with a device in the communications network and to process first data relating to an entity and output second data, the network node being configured to implement multiple neural networks for processing the first data, each neural network corresponding to one of a finite set of network states. At step 601, the method comprises receiving a signal from the device indicating a selected one of the finite set of network states. At step 602, the method comprises processing the first data using the neural network corresponding to the indicated one of the finite set of network states.
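Steps 601-602 at the network node can be sketched as follows; the per-state networks are placeholder callables keyed by state index.

```python
# Illustrative shape of method 600 at a sensing node: receive the selected
# state from the device, then process the first data with the matching NN.

def method_600(receive_signal, nns, first_data):
    m = receive_signal()       # step 601: receive the selected network state
    return nns[m](first_data)  # step 602: process with the NN for state m

# placeholder per-state networks that tag their output with the state index
nns = {m: (lambda m: lambda x: (m, len(x)))(m) for m in (1, 2, 3, 4)}
out = method_600(receive_signal=lambda: 2, nns=nns, first_data=[5, 6, 7])
```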
Embodiments of the distributed learning and inference architecture described herein can adjust dynamically to changes in the characteristics of sensed or collected data during the inference phase among a finite set of possible states. The finite set of states may comprise two or more states.
The solutions described above are applicable to a network having an arbitrary number of sensing nodes, optionally an arbitrary number of intermediary nodes, and a fusion node (a fusion center). All nodes in the network are active during both the training phase and the inference phase. Furthermore, every node is equipped with as many neural networks as the number of considered network states. Switching between the neural networks in every node is dictated by feedback signalling from the fusion center, which selects a new state for all nodes in the network. The decision about selecting a new state is made by the fusion center by analysing its received signals.
The distributed learning and inference architecture is applicable to any machine learning scenario in which the inference phase requires two or more sources of input data which may change over time to one of a finite number of possibilities.
The solution can be extended to an arbitrary number of input nodes working together to infer a location or position, or other property, of an entity.
One use of the network described above may be, for example, to form an estimated diagnosis of a disease or condition in a patient. In this case, the multiple input nodes may each measure a property of the patient (for example, blood pressure, breathing rate, heart rate etc) and process the data using their respective neural networks. The inference at the fusion node device may form an estimated diagnosis for the patient.
In another example, multiple sensors may observe some area (mountains, forest, a factory) in order to detect some dangerous events (for example, a fire, an accident, etc.) or phenomena.
In other examples, the entity may be a person, a vehicle or a mobile device.
The communication network is preferably a wireless network. The fusion node and each of the input nodes may be base stations. The network may alternatively be a wired network. For example, a network may comprise several cameras with computing resources (sensing nodes) which are connected over wire to some remote computing server (acting as the fusion node).
The applicant hereby discloses in isolation each individual feature described herein and any combination of two or more such features, to the extent that such features or combinations are capable of being carried out based on the present specification as a whole in the light of the common general knowledge of a person skilled in the art, irrespective of whether such features or combinations of features solve any problems disclosed herein, and without limitation to the scope of the claims. The applicant indicates that aspects of the present invention may consist of any such individual feature or combination of features. In view of the foregoing description it will be evident to a person skilled in the art that various modifications may be made within the scope of the invention.

Claims

1. A device (110, 305) for a communication network, the communication network comprising multiple input nodes (101 , 102, 301 , 302, 303) each configured to process respective first data (104, 105, 307, 308, 309) relating to an entity and output respective second data and being configured to operate according to a finite set of network states, the device being configured to: receive (501) respective second data from at least one of the input nodes (101 , 102, 301 , 302, 303); in dependence on the or each respective second data from the at least one of the input nodes (101 , 102, 301 , 302, 303), select (502) one of the finite set of network states; and send (503) a signal indicating the selected one of the finite set of network states to the at least one of the input nodes (101 , 102, 301 , 302, 303).
2. The device (110, 305) as claimed in claim 1 , wherein the device is configured to implement multiple neural networks (111 , 112, 314, 319), each neural network corresponding to one of the finite set of network states, the device being configured to process the or each respective second data from the at least one of the input nodes (101 , 102, 301 , 302, 303) using the neural network corresponding to the selected one of the finite set of network states.
3. The device (110, 305) as claimed in claim 1 or claim 2, wherein the device is configured to dynamically adjust the selected one of the finite set of network states in dependence on the or each respective second data from the at least one of the input nodes (101 , 102, 301 , 302, 303).
4. The device (110, 305) as claimed in any preceding claim, wherein the device is configured to receive the respective second data from at least one of the input nodes (101 , 102, 301 , 302, 303) over a time period and select the one of the finite set of network states in dependence on the or each respective second data from the at least one of the input nodes received during the time period.
5. The device (110, 305) as claimed in any preceding claim, wherein the device is configured to: input the respective second data from the at least one of the input nodes (101 , 102, 301 , 302, 303) into a trained analysis model (113, 321) implemented by the device; and select the one of the finite set of network states in dependence on an output of the trained analysis model (113, 321).
6. The device (110, 305) as claimed in any preceding claim, wherein the device is further configured to receive a respective local assessment (114, 115) of the respective first data from the at least one of the input nodes (101 , 102) and select the one of the finite set of network states in further dependence on the or each local assessment (114, 115).
7. The device (110, 305) as claimed in claim 6 as dependent on claim 5, wherein the device is further configured to input the or each respective local assessment (114, 115) into the trained analysis model (113) and select the one of the finite set of network states in dependence on the output of the trained analysis model (113).
8. The device (110, 305) as claimed in any preceding claim, wherein the respective second data is a compressed representation of the respective first data.
9. The device (110, 305) as claimed in any preceding claim, wherein the respective second data comprises an output of a respective neural network (106, 107, 108, 109, 310, 311 , 312, 315, 316, 317) implemented by the respective input node.
10. The device (110, 305) as claimed in claim 9, wherein the respective neural network (106,
107, 108, 109, 310, 311 , 312, 315, 316, 317) is one of multiple neural networks each corresponding to a network state of the finite set of network states.
11. The device (110, 305) as claimed in claim 9 or claim 10, wherein the respective second data comprises an activation vector of a last layer of the respective neural network (106, 107,
108, 109, 310, 311 , 312, 315, 316, 317) implemented by the respective input node.
12. The device (110, 305) as claimed in any preceding claim, wherein the device is configured to process respective second data received by the device from two or more of the input nodes, and wherein the device is configured to concatenate the respective second data from the two or more of the input nodes to form a combined input and select the one of the finite set of network states in dependence on the combined input.
13. The device (110, 305) as claimed in any preceding claim, wherein the respective first data represents a view of the entity and each network state of the finite set of network states is associated with a specific configuration of views.
14. The device (110, 305) as claimed in any preceding claim, wherein the device and each of the input nodes are base stations.
15. The device (110, 305) as claimed in any preceding claim, wherein the entity is a vehicle or a mobile device.
16. The device (110, 305) as claimed in any preceding claim, wherein the respective first data processed at each input node comprises channel state information.
17. A method (500) for implementation at a device (110, 305) in a communication network, the communication network comprising multiple input nodes (101 , 102, 301 , 302, 303) each configured to process respective first data relating to an entity and output respective second data and being configured to operate according to a finite set of network states, the method comprising: receiving (501) respective second data from at least one of the input nodes (101 , 102, 301 , 302, 303); in dependence on the or each respective second data from the at least one of the input nodes (101 , 102, 301 , 302, 303), selecting (502) one of the finite set of network states; and sending (503) a signal indicating the selected one of the finite set of network states to the at least one of the input nodes (101 , 102, 301 , 302, 303).
18. A network node (101 , 102, 301 , 302, 303) configured to communicate with a device (110, 305) in a communications network and to process first data relating to an entity and output second data and being configured to operate according to a finite set of network states, the network node (101 , 102, 301 , 302, 303) being configured to implement multiple neural networks (106, 107, 108, 109, 310, 311 , 312, 315, 316, 317) for processing the first data, each neural network corresponding to one of the finite set of network states, the network node being configured to: receive (601) a signal from the device (110, 305) indicating a selected one of the finite set of network states; and process (602) the first data using the neural network corresponding to the indicated one of the finite set of network states.
19. The network node (101 , 102, 301 , 302, 303) as claimed in claim 18, wherein the network node is configured to send the second data output by the network node to the device (110, 305).
20. The network node (101 , 102, 301 , 302, 303) as claimed in claim 18 or claim 19, wherein the second data comprises the output of the neural network (106, 107, 108, 109, 310, 311 , 312, 315, 316, 317) corresponding to the one of the finite set of network states.
21. The network node (101 , 102, 301 , 302, 303) as claimed in claim 20, wherein the second data comprises an activation vector of a last layer of the neural network (106, 107, 108, 109, 310, 311 , 312, 315, 316, 317) implemented by the network node.
22. The network node (101 , 102, 301 , 302, 303) as claimed in any of claims 18 to 21 , wherein the network node is configured to perform a local assessment (114, 115) of the first data and indicate an estimate of a current network state of the finite set of network states to the device.
23. The network node (101 , 102, 301 , 302, 303) as claimed in any of claims 18 to 22, wherein the first data represents a view of the entity.
24. The network node (101 , 102, 301 , 302, 303) as claimed in any of claims 18 to 23, wherein the second data is a compressed representation of the first data.
25. The network node (101 , 102, 301 , 302, 303) as claimed in any of claims 18 to 24, wherein the first data comprises channel state information.
26. A method (600) for implementation at a network node in a communications network, the network node being configured to communicate with a device (110, 305) in the communications network and to process first data relating to an entity and output second data, the network node being configured to implement multiple neural networks (106, 107, 108, 109, 310, 311 , 312, 315, 316, 317) for processing the first data, each neural network corresponding to one of a finite set of network states, the network node being configured to: receive (601) a signal from the device (110, 305) indicating a selected one of the finite set of network states; and process (602) the first data using the neural network corresponding to the indicated one of the finite set of network states.
27. A computer-readable storage medium having stored thereon computer-readable instructions that, when executed at a computer system, cause the computer system to perform the method of claim 17 or claim 26.
PCT/EP2022/059018 2022-04-05 2022-04-05 Apparatus and method for finite-state dynamic in-network learning WO2023193891A1 (en)
