US20200356835A1 - Sensor-Action Fusion System for Optimising Sensor Measurement Collection from Multiple Sensors

Info

Publication number: US20200356835A1
Application number: US16/407,290
Authority: US
Inventors: Luke Anthony William ROBINSON; Vladimir Ceperic
Original assignee: Lgn Innovations Ltd
Current assignee: Lgn Innovations Ltd
Priority date: 2019-05-09
Filing date: 2019-05-09
Publication date: 2020-11-12
Also published as: WO2020224910A1; EP3966748A1

Abstract

The embodiments described herein aim to improve environmental sensing by providing a computationally efficient and accurate means for fusing sensor data and using this fused data to control sensors to focus on areas that would most reduce the uncertainty in the sensing system. In this way, the system can direct sensors to focus on the most important areas and features within the environment in order to provide the most effective sensor data (e.g. for use by a control system). The methods described herein make use of multi-agent sensor-action fusion. The methods are multi-agent in that a set of machine learning agents are trained in order to control the sensors to focus on the most important features and regions. The embodiments implement sensor-action fusion in that sensor fusion is performed in order to obtain a combined view of the environment and this combined view is utilised to determine the most appropriate actions.

Description

TECHNICAL FIELD

The present disclosure relates to computer implemented systems and methods for controlling the configuration of one or more sensors based on information shared between a plurality of sensors. In particular, but without limitation, this disclosure relates to methods for controlling sensors (e.g. in autonomous vehicles) in order to improve data acquisition, for instance, to reduce the uncertainty within the system or to focus on predefined features of interest. Certain embodiments focus on the areas of highest uncertainty in order to provide improved sensor data for optimized environmental sensing (e.g. to an autonomous vehicle control system).

BACKGROUND

Environmental sensing is an important aspect of control systems engineering. As the number and variety of sensors increases, it is important for control systems to be able to accurately combine sensor data in order to determine an appropriate control action. In addition, the large amount of data being transferred between sensors and central control systems can lead to large transmission and computing overheads.
One field of control systems engineering that is progressing rapidly is control systems for autonomous vehicles. Recent advances in machine learning have led to a number of significant improvements to such autonomous vehicle control systems. Having said this, the large amount of sensor data being provided to such systems, and the need for such systems to operate in real time with very little lag, provide significant technical hurdles.

SUMMARY

In light of the above, the embodiments described herein provide the ability to fuse information from a number of sensors and combine this with the means to adjust sensor configurations based on this fused data. This allows each individual sensor to be controlled to improve data acquisition across the whole system based on the combination of data across multiple sensors.
According to a first aspect there is provided a method for controlling the configuration of one or more sensors based on information shared between a plurality of sensors, the method comprising establishing a hierarchical network of nodes comprising at least a first level comprising a plurality of child nodes and a second level comprising one or more parent nodes, wherein each of the child nodes is assigned to a corresponding sensor and each of the one or more parent nodes is assigned to a corresponding group of child nodes to combine sensor data from the corresponding group of child nodes. The method further comprises, at each parent node: receiving, from each of the child nodes in the corresponding group of child nodes for the parent node, sensor data for the sensor corresponding to that child node, the sensor data occupying a corresponding sensor feature space for the child node; encoding the received sensor data to form an encoded combination of sensor data by mapping the received sensor data to a latent space for the parent node; decoding the encoded combination of sensor data to map, for each of one or more of the child nodes of the corresponding group, the encoded combination of sensor data to the sensor feature space for the child node to form a corresponding decoded combination of sensor data; and sending each decoded combination of sensor data to the child node corresponding to the sensor feature space for that decoded combination of sensor data. The method further comprises, at each child node that receives a decoded combination of sensor data, determining an action for updating a configuration of the corresponding sensor based on the received decoded combination of sensor data, and issuing an instruction to adjust the configuration of the corresponding sensor in accordance with the action.
Accordingly, in an embodiment a hierarchical network of nodes is provided for controlling one or more sensors based on data from a number of sensors. In the second level, data from a plurality of sensors is combined. In the first level, one or more of the child nodes receives a decoded combination of sensor data from the second level and takes an action to adjust the configuration of the corresponding sensor. The action may be for improving data acquisition by the sensor. This allows the individual sensor to be adjusted based on the information shared across a number of sensors. This provides improved sensing resolution across the system and allows less expensive sensors to be utilised whilst maintaining sensing resolution across the system.
Each node within the hierarchical network of nodes may be configured to encode data by mapping data onto a corresponding latent space via a bottleneck architecture comprising an encoder and a decoder. This may be implemented by a corresponding neural network for the node. The encoder may be configured to map information into a corresponding latent space of reduced dimensions to compress the information. A decoder may be provided to map encoded data out of the corresponding latent space. In this way, each node is able to compress the data effectively to improve the efficiency of the system by reducing the amount of data that is transferred between nodes. The bottleneck architecture also provides a means for combining data at parent nodes.
According to an embodiment the method further comprises at each child node: receiving one or more sensor measurements from the sensor corresponding to the child node; encoding the one or more sensor measurements to compress the one or more sensor measurements by mapping the one or more sensor measurements onto a sensor feature space for the child node to form the sensor data, the sensor feature space being a latent space; and sending the sensor data to the parent node corresponding to the child node.
Encoding data at each child node compresses the data thereby improving the efficiency of the system. The latent space of each child node may have reduced dimensions relative to the one or more sensor measurements. That is, the sensor data provided to the parent node for the child node may be compressed relative to the one or more sensor measurements.
Compression reduces the bandwidth across the system. By training the system to learn an effective latent space for compressing the data, only the most important features need to be shared between nodes.
The first level may comprise one node for each sensor, wherein for each sensor, the sensor data for the sensor is input into a corresponding neural network for encoding.
According to an embodiment each child node that receives a decoded combination of sensor data implements an agent for determining the action that is biased towards selecting an action that achieves one or more of reducing prediction error or focusing on one or more predefined features of interest.
Accordingly, in an embodiment, a machine learning agent is implemented in one or more nodes in the first level for determining actions for updating the configuration of the one or more corresponding sensors. The machine learning agent may be biased to select actions that focus on predefined features of interest (e.g. user defined features) or to reduce prediction error in the system or in the child node itself. The biasing may be through training, by training the agent to choose actions and biasing those actions (through a reward or loss function) accordingly. Whilst the agent may be biased towards one or more features of interest, it could equally be considered to be biased away from features of low importance.
According to an embodiment each child node that receives a decoded combination of sensor data implements a classifier configured to identify the one or more predefined features of interest within the corresponding decoded combination of sensor data and bias the agent towards an action that focuses on the one or more predefined features of interest.
According to a further embodiment each agent: determines predicted sensor data based on the decoded combination of sensor data; determines a prediction error based on the predicted sensor data; and is biased towards an action that minimises a cost function comprising the prediction error.
By minimising a cost function comprising prediction error, each agent can attempt to minimise surprise within the system. It should be noted that minimising a cost function is functionally equivalent to maximising a reward function provided that the terms within each function are adapted accordingly (e.g. by taking the inverse or negative of certain terms).
According to a further embodiment each agent: determines predicted sensor data based on the combination of sensor data, determines the prediction error based on the predicted sensor data and determines a gradient of the prediction error over the action space for the node; and is biased towards determining an action that minimises a cost function comprising the gradient of the prediction error.
By attempting to minimise a cost function comprising the gradient of the prediction error, each agent attempts to direct attention towards regions in the environment that provide the greatest increase in knowledge (the greatest decrease in prediction error). Each agent may be biased towards regions in the environment corresponding to the most negative gradient in the prediction error.
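By way of illustration only, the gradient-based biasing described above might be sketched as follows. The sketch assumes a one-dimensional action space (a pan angle), a finite-difference estimate of the gradient, and a cost function consisting of the gradient alone; the names `error_gradient`, `choose_action` and `toy_error` are hypothetical and do not appear in the embodiments.

```python
def error_gradient(predict_error, action, eps=1e-4):
    """Central finite-difference estimate of d(prediction error)/d(action)."""
    return (predict_error(action + eps) - predict_error(action - eps)) / (2 * eps)


def choose_action(predict_error, candidate_actions):
    """Pick the candidate whose cost (here, just the error gradient) is lowest,
    i.e. the action at which the prediction error is falling most steeply."""
    return min(candidate_actions, key=lambda a: error_gradient(predict_error, a))


# Hypothetical toy error model: prediction error as a function of pan angle.
def toy_error(pan_angle):
    return (pan_angle - 2.0) ** 2


candidates = [0.0, 1.0, 2.0, 3.0]
best = choose_action(toy_error, candidates)
print(best)
```

In a practical cost function the gradient term would typically be combined with other terms, such as the prediction error itself or a weighting towards predefined features of interest.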
According to a further embodiment the action comprises one or more of: adjusting a resolution of the corresponding sensor; adjusting a focus of the corresponding sensor; and directing the corresponding sensor to sense an updated region.
Sensing an updated region may be via an actuator moving the sensor or the sensor configuring itself to adjust its sensitivity in a certain direction (e.g. the adjustment of a phased array of antennas).
According to an embodiment each child node is implemented in a corresponding processor connected directly to the corresponding sensor for that child node. Each child node may be integrated within the corresponding sensor. This reduces the amount of data that is transferred from the sensor, as data may be compressed by the child node prior to it being sent from the child node to the parent node. Directly connecting the child node to the parent reduces transmission overheads and reduces latency within the system.
According to a further embodiment the second level is implemented in a second set of one or more processors and wherein the processors for the first level communicate the sensor data to the one or more processors for the second level.
According to an embodiment, the hierarchical network predicts future sensor measurements based on the sensor data and, in response to a sensor measurement at a specified time differing from a predicted sensor measurement by more than a predefined amount, outputs details of the sensor measurement for validation. This allows potentially erroneous or doctored measurements to be flagged up for evaluation. This can help improve the security of the system by flagging up potential attempts to infiltrate the system and doctor measurements.
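A minimal sketch of this validation check is given below, assuming scalar sensor measurements and a fixed absolute threshold. The function name `flag_for_validation`, the parameter `max_deviation` and the sample values are illustrative assumptions only.

```python
def flag_for_validation(measurements, predictions, max_deviation):
    """Return (index, measured, predicted) for each reading that differs from
    its prediction by more than the predefined amount."""
    flagged = []
    for i, (measured, predicted) in enumerate(zip(measurements, predictions)):
        if abs(measured - predicted) > max_deviation:
            flagged.append((i, measured, predicted))
    return flagged


# Hypothetical readings: the third measurement departs sharply from its prediction.
measured = [20.1, 20.3, 35.0, 20.2]
predicted = [20.0, 20.2, 20.4, 20.3]
flagged = flag_for_validation(measured, predicted, max_deviation=1.0)
print(flagged)
```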
According to an embodiment the method further comprises, at each child node that receives a decoded combination of sensor data, determining a bandwidth action to adjust a size of the sensor feature space for the child node based on the received decoded combination of sensor data and adjusting the size of the sensor feature space in accordance with the action.
Adjusting the sensor feature space allows the amount of compression to be varied at each node to improve efficiency. The system may be trained to optimise the bandwidth (the size of each sensor feature space) such that information is shared effectively whilst maintaining efficiency.
According to an embodiment the bandwidth action is biased towards a bandwidth action that reduces the size of the sensor feature space but is biased away from an action that increases a prediction error for the child node.
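One way to picture this trade-off, offered as a sketch rather than as the method of the embodiments, is a cost that adds a penalty proportional to the size of the sensor feature space to the prediction error obtained at that size. The weights, the candidate sizes and the error values below are hypothetical.

```python
def bandwidth_cost(latent_size, prediction_error, size_weight=0.01, error_weight=1.0):
    """Cost that grows with the feature-space size and with the prediction error."""
    return size_weight * latent_size + error_weight * prediction_error


def choose_latent_size(error_by_size, size_weight=0.01, error_weight=1.0):
    """Pick the candidate feature-space size with the lowest combined cost."""
    return min(
        error_by_size,
        key=lambda size: bandwidth_cost(
            size, error_by_size[size], size_weight, error_weight
        ),
    )


# Hypothetical prediction error measured at each candidate feature-space size.
error_by_size = {2: 0.50, 4: 0.20, 8: 0.05, 16: 0.04, 32: 0.04}
chosen = choose_latent_size(error_by_size)
print(chosen)
```

Raising `size_weight` pushes the choice towards stronger compression, whereas raising `error_weight` pushes it towards larger feature spaces.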
According to an aspect there is provided a node for controlling the configuration of a sensor based on information shared between a plurality of sensors, the node comprising a processor configured to: receive one or more sensor measurements from the sensor; encode the one or more sensor measurements to compress the one or more sensor measurements by mapping the one or more sensor measurements onto a latent space for the node to form encoded sensor data; send the encoded sensor data to a parent node for combination with further encoded sensor data from one or more other sensors of the plurality of sensors; receive from the parent node a combination of sensor data comprising a combination of the encoded sensor data for the node and the further encoded sensor data from the one or more other sensors mapped to the latent space of the node; determine an action for updating a configuration of the sensor based on the combination of sensor data; and issue an instruction to adjust the configuration of the sensor in accordance with the action.
Accordingly, a single (child) node is able to encode sensor data to compress the sensor data, send the encoded sensor data to a parent node and determine an action for adjusting a configuration based on a combination of encoded sensor data received from the parent node.
According to an embodiment the node is biased towards selecting an action that achieves one or more of reducing prediction error or focusing on one or more predefined features of interest.
According to an embodiment the processor is configured to implement a classifier configured to identify one or more predefined features of interest within the combination of sensor data and bias the node towards an action that focuses on the one or more predefined features of interest.
According to an embodiment the processor is configured to determine predicted sensor data based on the combination of sensor data and determine the prediction error based on the predicted sensor data and the processor is biased towards selecting an action that minimises a cost function comprising the prediction error.
According to an embodiment the processor is configured to determine predicted sensor data based on the combination of sensor data, determine the prediction error based on the predicted sensor data and determine a gradient of the prediction error over the action space for the node and the processor is biased towards determining an action that minimises a cost function comprising the gradient of the prediction error.
According to an embodiment the action comprises one or more of: adjusting a resolution of the sensor; adjusting a focus of the sensor; and directing the sensor to sense an updated region.
According to an embodiment the processor is configured to be connected directly to the sensor for receiving the sensor data.
According to an embodiment the processor is configured to determine a bandwidth action for adjusting the size of the latent space based on the combination of sensor data, wherein the processor is biased towards a bandwidth action that reduces the size of the latent space but is biased away from an action that increases a prediction error for the node.
According to an embodiment the action is determined using a reinforcement learning agent in accordance with parameters of the agent and the processor is configured to update the parameters of the agent to reduce a cost function based on one or more of a prediction error, a gradient of the prediction error over an action space for the node, and a weighting towards one or more predefined features of interest.
According to a further aspect there is provided a parent node for combining sensor data from multiple sensors for use by one or more child nodes in controlling the configuration of one or more of the sensors, the parent node comprising a processor configured to: receive, from each of a group of child nodes, sensor data for a sensor corresponding to the child node, the sensor data occupying a corresponding sensor feature space for the child node; encode the received sensor data to form an encoded combination of sensor data by mapping the received sensor data to a latent space for the parent node; decode the encoded combination of sensor data to map, for each of one or more of the child nodes of the group, the encoded combination of sensor data to the sensor feature space for the child node to form a corresponding decoded combination of sensor data; and send each decoded combination of sensor data to the child node corresponding to the sensor feature space for that decoded combination of sensor data to enable the child node to determine and issue an action for updating a configuration of the corresponding sensor for that child node based on the corresponding combination of sensor data.
According to an embodiment there is provided a computing system comprising one or more processors configured to implement any of the methods described herein.
According to an embodiment there is provided a non-transitory computer readable medium comprising computer executable instructions that, when executed by a processor, cause the processor to implement any of the methods described herein.

BRIEF DESCRIPTION OF THE DRAWINGS

Arrangements of the present invention will be understood and appreciated more fully from the following detailed description, made by way of example only and taken in conjunction with drawings in which:

FIG. 1 shows a schematic of a sensor fusion system according to an embodiment;

FIG. 2 shows a schematic detailing sensor fusion for a multi-level sensor-action fusion system according to an embodiment;

FIG. 3 shows a schematic detailing action determination for a multi-level sensor-action fusion system according to an embodiment;

FIG. 4 shows a schematic detailing a node within a sensor fusion system according to an embodiment; and

FIG. 5 shows a method for determining an action for adjusting the configuration of a sensor according to an embodiment.

DETAILED DESCRIPTION

The embodiments described herein aim to improve environmental sensing by providing a computationally efficient and accurate means for fusing sensor data and using this fused data to control sensors to focus on areas that would most reduce the uncertainty in the sensing system. In this way, the system can direct sensors to focus on the most important areas and features within the environment in order to provide the most effective sensor data (e.g. for use by a control system).
The methods described herein make use of multi-agent sensor-action fusion. The methods are multi-agent in that a set of machine learning agents are trained in order to control the sensors to focus on the most important features and regions. The embodiments implement sensor-action fusion in that sensor fusion is performed in order to obtain a combined view of the environment and this combined view is utilised to determine the most appropriate actions.
Sensor fusion is the combination of data from several sensors. This allows the system to obtain a holistic view of the environment and mitigates the effect of individual errors or faults in individual sensors. It also provides improved resolution relative to independent measurements. This means that less advanced (and correspondingly, less expensive) sensors may be utilised whilst still obtaining the same resolution of data. In the present embodiments, sensor fusion is obtained by encoding sensor data from multiple sensors into a combined latent space.
The systems described herein are configured to sit between the sensors (e.g. a camera or LiDAR) and a machine learning system (such as a control system for an automated vehicle). As the data is encoded into a combined latent space, it is in an ideal format for being processed further by the machine learning system.
The embodiments described herein compress sensor data by encoding the sensor data using specifically trained encoders. The compression reduces the amount of data sent to, and optimises the data for, the machine learning system. Directional feedback loops are provided to focus the sensors on the most important information in the environment. The result is a self-directed, active learning system that maps an environment faster and with higher resolution than alternative methods. This can provide the following advantages:

- increased speed of feature detection, machine learning rate and accuracy;
- increased resolution of sensors (allowing for simpler, less expensive and more efficient sensors to be utilised); and
- reduced volume of data sent from the sensors to the machine learning system. This increases the throughput of the system, reduces the power consumption of both the sensor and the machine learning system and reduces the cost of the machine learning system (due to the decreased computational requirements for the machine learning system).

The embodiments described herein implement a hierarchical network of neural networks (in specific embodiments, these are machine learning agents that each include an encoder to a corresponding latent space and a decoder out of the latent space). Sensor fusion may be performed at a number of levels for various levels of resolution/generality. For instance, the sensor data for an autonomous vehicle may be combined by a processor on a vehicle before being transmitted to a regional controller that combines the sensor data for a number of vehicles. A higher level may be provided for combining data across a number of regions and this may continue up to a global level of generality. This allows the sensor data to be interrogated by one or more control systems at various resolution levels. For instance, a driver or a control system for an automated vehicle may require the combined data for a single car, whereas a control or monitoring system for a fleet of vehicles may require combined data across the fleet. The combined data across multiple vehicles can be fed back to individual vehicles to further improve the data acquisition function at each vehicle.
FIG. 1 shows a schematic of a sensor fusion system according to an embodiment. The system comprises three nodes: a first sensor node 10, a second sensor node 20 and a fusion node 30. Each node is implemented on a separate processor.
The first 10 and second 20 sensor nodes form a first level within the network. The fusion node 30 sits above the sensor nodes 10, 20 in a second level within the network. The fusion node 30 can be considered a parent node and the first 10 and second 20 sensor nodes can be considered child nodes of the parent node.
The first 10 and second 20 sensor nodes are implemented in processors connected directly to a corresponding sensor. Accordingly, the first sensor node 10 receives first sensor data S₁ from a first sensor and the second sensor node 20 receives second sensor data S₂ from a second sensor. The first S₁ and second S₂ sensor data each include one or more sensor measurements.
Each of the first 10 and second 20 sensor nodes is configured to compress its respective sensor data to produce an encoded version of that sensor data. The first sensor node 10 compresses the first sensor data S₁ to produce first encoded data E₁. The second sensor node 20 compresses the second sensor data S₂ to produce second encoded data E₂. The first 10 and second 20 sensor nodes compress the sensor data through mapping the sensor data onto a corresponding latent space via a machine learning model such as a neural network. In the present embodiment, this is achieved through the use of a specially trained bottleneck architecture comprising an encoder and a corresponding decoder. Each latent space encodes a learned and controllable translation in time, space or level of abstraction (such as bandwidth or compression).
Accordingly, the first 10 and second 20 sensor nodes implement a corresponding encoder to compress input data through a mapping onto a corresponding latent space. Each latent space for each sensor node 10, 20 can be considered a sensor feature space.
Each node includes an encoder 12, 22 and a decoder 14, 24. Each encoder 12, 22 maps input sensor data onto a corresponding latent space. Each decoder 14, 24 maps encoded data back onto the input space of the sensor data to produce decoded sensor data S₁′, S₂′. The decoded sensor data S₁′, S₂′ is a prediction of the sensor data based on the encoded data.
Each node may implement an autoencoder, such that the output (the decoded sensor data) attempts to recreate the original input (the sensor data). In this case, each node (each encoder-decoder pair) would be trained in order to minimise the reproduction error (the error between the input data S₁, S₂ and the decoded data S₁′, S₂′). Having said this, the bottleneck architecture need not be an autoencoder, and alternative architectures may be used that do not attempt to reproduce the original input.
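The encode, decode and reproduction-error steps for a single sensor node can be sketched as follows. This is an illustrative toy only: the encoder and decoder are fixed linear maps with hand-chosen weights rather than trained neural networks, and the names `ENCODER`, `DECODER` and `matvec` are assumptions introduced for the example.

```python
def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def mean_squared_error(a, b):
    """Reproduction error between an input and its decoded reproduction."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Hypothetical fixed weights (a trained node would learn these).
# Encoder: 4-D sensor data -> 2-D latent space, averaging neighbouring pairs.
ENCODER = [[0.5, 0.5, 0.0, 0.0],
           [0.0, 0.0, 0.5, 0.5]]
# Decoder: 2-D latent space -> 4-D sensor space, duplicating each latent feature.
DECODER = [[1.0, 0.0],
           [1.0, 0.0],
           [0.0, 1.0],
           [0.0, 1.0]]

s1 = [1.0, 3.0, 10.0, 12.0]          # sensor data S1
e1 = matvec(ENCODER, s1)             # encoded data E1 (compressed)
s1_decoded = matvec(DECODER, e1)     # decoded sensor data S1'
error = mean_squared_error(s1, s1_decoded)
print(e1, s1_decoded, error)
```

Training would adjust the encoder and decoder weights to reduce `error` over many sensor readings.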
The nodes make use of a bottleneck architecture such that the input data is mapped to a lower dimensional latent space. This may be through defining a lower-dimensional latent space, or through the use of a sparse encoder. Sparse encoders have an embedding space that is larger (has more dimensions) than the input/output; however, a loss function is implemented in order that only a few of the embedded features are used. The dimensions of the bottleneck can be varied by varying the drop-out rate (the loss function) of the sparse encoder.
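As an illustration of how a loss function can restrict the number of embedded features in use, the sketch below adds an L1 penalty on the latent activations to the reproduction error. The L1 penalty is a common sparsity technique chosen here as an assumption; the embodiments do not specify the form of the sparsity term, and the names `sparse_loss`, `active_features` and `sparsity_weight` are hypothetical.

```python
def sparse_loss(inputs, reconstruction, latent, sparsity_weight=0.1):
    """Reproduction error plus an L1 penalty that discourages active latent features."""
    m = len(inputs)
    reproduction_error = sum((x - r) ** 2 for x, r in zip(inputs, reconstruction)) / m
    sparsity_penalty = sum(abs(z) for z in latent)
    return reproduction_error + sparsity_weight * sparsity_penalty


def active_features(latent, tol=1e-6):
    """Count the latent features actually in use (the effective bottleneck size)."""
    return sum(1 for z in latent if abs(z) > tol)


# Hypothetical 4-D embedding of a 2-D input in which only two features are active.
latent = [0.0, 3.0, 0.0, -1.0]
loss = sparse_loss([1.0, 2.0], [1.0, 2.0], latent, sparsity_weight=0.5)
print(loss, active_features(latent))
```

Increasing `sparsity_weight` tends to drive more latent features to zero, which narrows the effective bottleneck.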
The size of the bottleneck (the number of dimensions, or the number of encoded features used) can be varied. Generally, the smallest bottleneck is chosen whilst maintaining a given level of accuracy (keeping the reproduction error below a set level). Having said this, the system may adjust the size of each bottleneck in order to optimise the amount of compression. As the amount of compression affects the amount of data transferred between the nodes in the network, this is a form of automatic bandwidth control. This embodiment is described in more detail below.
One or more of the autoencoders may be variational autoencoders (VAEs). These encode data as a distribution (e.g. a Gaussian distribution represented by a mean and standard deviation). When decoded, the encoded distribution may be sampled to produce a latent vector that is then passed through the decoder to produce an output. Variational autoencoders may be trained to minimise a loss term which includes the reproduction error (e.g. the mean-squared error) as well as the latent loss which represents how well the variables match a predefined statistical distribution (e.g. the Kullback-Leibler divergence representing the difference between the encoded distribution and a unit Gaussian distribution):
$\mathrm{Loss} = \frac{1}{m} \sum_{i = 1}^{m} {(y_{i} - \tilde{y}_{i})}^{2} + \mathrm{KL}$
where $m$ is the number of training examples (the number of sensor readings), $\tilde{y}_{i}$ is the predicted output for the $i$-th example, $y_{i}$ is the corresponding ground truth output (equal to the input for the purposes of an autoencoder, or equal to a future input where the system attempts to minimise prediction error), and $\mathrm{KL}$ is the Kullback-Leibler divergence.
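The loss above can be evaluated numerically as in the following sketch. It assumes a diagonal Gaussian encoding compared against a unit Gaussian, for which the Kullback-Leibler divergence has the standard closed form $\frac{1}{2}\sum_{j}(\mu_{j}^{2} + \sigma_{j}^{2} - 1 - \ln \sigma_{j}^{2})$. The function names are illustrative only.

```python
import math


def kl_to_unit_gaussian(means, std_devs):
    """Closed-form KL divergence between a diagonal Gaussian N(mu, sigma^2)
    and a unit Gaussian N(0, 1), summed over the latent dimensions."""
    return 0.5 * sum(
        mu ** 2 + sd ** 2 - 1.0 - math.log(sd ** 2)
        for mu, sd in zip(means, std_devs)
    )


def vae_loss(targets, predictions, means, std_devs):
    """Mean-squared reproduction error over m examples plus the latent (KL) loss."""
    m = len(targets)
    mse = sum((y - y_hat) ** 2 for y, y_hat in zip(targets, predictions)) / m
    return mse + kl_to_unit_gaussian(means, std_devs)


# When the encoded distribution is already a unit Gaussian the KL term vanishes,
# leaving only the reproduction error.
loss = vae_loss([1.0, 2.0], [1.0, 4.0], means=[0.0, 0.0], std_devs=[1.0, 1.0])
print(loss)
```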
The encoded data E₁, E₂ from each node is sent to the fusion node 30, which occupies a second level within the network. The fusion node 30 is configured to combine the sensor data by mapping the encoded data E₁, E₂ onto a latent space for the fusion node 30 to produce combined encoded data C₁. As with the sensor nodes 10, 20, the fusion node 30 implements a bottleneck with an encoder and decoder. The encoder maps the encoded sensor data E₁, E₂ from the first 10 and second 20 nodes onto a latent space via a bottleneck in order to produce the combined encoded data C₁. This represents a compressed version of the combined information from the first S₁ and second S₂ sensor data.
The decoder of the fusion node 30 can decode the combined encoded data C₁ to produce corresponding predictions E₁′, E₂′ of the encoded sensor data E₁, E₂ from the first 10 and second 20 nodes. These predictions take the shared information from both sensors but represent this information in the latent space for each sensor. These predictions can then be decoded (using the decoders 14, 24 of the first 10 and second 20 nodes) to produce predicted sensor data S₁′, S₂′ for the corresponding sensors. In the present embodiment this is achieved by sending the predictions E₁′, E₂′ of the encoded sensor data back to the corresponding sensor nodes 10, 20 for decoding. Alternatively, this may be achieved through the fusion node 30 implementing copies of the decoders 14, 24 of the sensor nodes 10, 20. The fusion node itself may therefore be able to decode the predicted encoded sensor data to produce predicted sensor data.
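The flow of data through the fusion node 30 can be sketched as below. The example is a toy under stated assumptions: fixed linear maps stand in for the trained encoder and decoder, the fusion decoder is modelled as one output head per child node, and the weights and the names `FUSION_ENCODER`, `FUSION_DECODERS` and `fuse` are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


# Hypothetical toy weights (a trained fusion node would learn these).
# Fusion encoder: concatenated [E1, E2] (4-D) -> combined latent C1 (2-D).
FUSION_ENCODER = [[0.5, 0.0, 0.5, 0.0],
                  [0.0, 0.5, 0.0, 0.5]]
# One decoder head per child node, mapping C1 into that child's feature space.
FUSION_DECODERS = {
    "node_10": [[1.0, 0.0], [0.0, 1.0]],
    "node_20": [[2.0, 0.0], [0.0, 2.0]],
}


def fuse(e1, e2):
    """Encode both children's data into C1, then decode a prediction for each child."""
    combined = matvec(FUSION_ENCODER, e1 + e2)  # list concatenation, then encoding
    predictions = {
        child: matvec(decoder, combined)
        for child, decoder in FUSION_DECODERS.items()
    }
    return combined, predictions


c1, predicted = fuse([2.0, 4.0], [4.0, 8.0])
print(c1, predicted)
```

Each entry of `predicted` corresponds to a prediction E₁′ or E₂′ that would be sent back to the respective sensor node for decoding into predicted sensor data.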
The combined encoded data C₁ may be output by the fusion node 30 for use by a monitoring system so that the monitoring system can make use of the combined data to determine an overall view of the sensor data. The monitoring system may be configured to output analysis data relating to the sensor information or determine control steps to be taken. For instance, the monitoring system might send control signals to the network based on the combined encoded data C₁.
The whole system can be trained by training the encoders and decoders of the first 10 and second 20 sensor nodes and the fusion node 30 to minimise their reproduction error. This can include minimising the reproduction error of the input sensor data S₁, S₂ when encoding onto the combined latent space and then decoding back to the input space (through the encoders and decoders of all nodes).
In one embodiment, the system forms a generative adversarial network. In this case, each node attempts to recreate all of the input data and a discriminator (or classifier) attempts to determine whether the recreated data is the true input data or generated data. The node is trained to increase the accuracy of the generated data, and therefore, increase the classification error of the discriminator; whereas, the discriminator is trained to decrease its classification error. By utilising this adversarial technique, the nodes can be trained to more accurately encode data into a latent space and decode the data to form reproductions of the data.
Whilst each node can be trained independently, additional advantages may be provided through training the whole system holistically. For instance, training operations can be run across the network of nodes. Sensor data may be input at the first level, encoded and propagated through the network of nodes. Cyclical training can be used to train across a variety of paths throughout the network of nodes. Each node implements an encoder-decoder pair with a corresponding latent space. As each node implements an encoder-decoder pair, it should be possible to encode the sensor data by passing the data up the network through a set of linked encoders and then return to the original feature space through the corresponding set of decoders. Whilst the input need not match the output (as shall be discussed later with regard to predictions after actions have been taken), the prediction error can still be tested to ensure that information was not lost through the series of encoding and decoding steps.
Accordingly, when training the system, data can be passed through the network along cyclical paths and any divergence between decoded outputs and ground-truth input values can be used as a training signal (as a quantifier of signal loss). The methods of generative adversarial networks may be used in this regard to train the system.
Taking the system of FIG. 1, the input data S₁ may be encoded at the first sensor node 10 and sent to the fusion node 30 for further encoding. This may then be decoded by the fusion node 30 and the first sensor node 10 to produce decoded input S₁″. A discriminator or classifier may be used to determine whether or not the decoded input S₁″ is equivalent to the input S₁. This may be through a comparison of the two (S₁ with S₁″) or through a machine learning system configured to determine whether a given input is a genuine input or a decoded input. The parameters of the nodes can then be updated based on the accuracy of the decoded input.
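By way of illustration only, the cyclical path described above (encode at the first sensor node 10, encode at the fusion node 30, then decode back through both nodes) may be sketched as follows. The linear maps, the hand-picked weights and the helper names `matvec`, `cycle` and `reproduction_error` are assumptions made for this sketch; a trained system would use learned neural networks, and the mean squared error is only one possible measure of divergence between the input and the decoded input.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m (rows x cols) by vector v (length cols)."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def reproduction_error(original: Vector, decoded: Vector) -> float:
    """Mean squared difference between an input and its decoded reproduction."""
    return sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)


# Hypothetical, hand-picked weights (assumption: a real system learns these).
sensor_encoder: Matrix = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]    # 3 -> 2
fusion_encoder: Matrix = [[1.0, 0.0]]                           # 2 -> 1
fusion_decoder: Matrix = [[1.0], [0.0]]                         # 1 -> 2
sensor_decoder: Matrix = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]   # 2 -> 3


def cycle(s1: Vector) -> Vector:
    """Encode the input up through both nodes, then decode back down."""
    e1 = matvec(sensor_encoder, s1)        # encoder of the sensor node
    c1 = matvec(fusion_encoder, e1)        # encoder of the fusion node
    e1_pred = matvec(fusion_decoder, c1)   # decoder of the fusion node
    return matvec(sensor_decoder, e1_pred) # decoder of the sensor node


s1 = [2.0, 1.0, 0.5]
s1_decoded = cycle(s1)
loss = reproduction_error(s1, s1_decoded)  # training signal for the nodes
```

In this sketch the information discarded by each level of compression appears directly in `loss`, which is the quantity a parameter update would seek to reduce.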
When establishing the network of nodes, the system may be trained by initially training the nodes within the first level to operate autonomously (to take actions and make predictions based only on the sensor data for their corresponding sensor). The initially trained neural networks (encoder and decoders) for the nodes in the first level can then be used as the basis for the nodes in the second level. For instance, the fusion nodes can initially be formed by copying the neural networks from their child nodes and, for instance, concatenating the neural networks (e.g. concatenating the encoders and concatenating the decoders). Following this initialisation, the nodes in the second level (the parent nodes) can then be trained to more effectively encode their received sensor data. This process can then be repeated for the level above, continuing up until the top level has been trained.
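One possible reading of the initialisation step above may be sketched as follows. The sketch assumes that each child encoder is a single linear layer and that "concatenating" means placing the copied child weights side by side in a block-diagonal matrix, so that the initial fusion encoder reproduces each child's encoding of its own part of the input. Both assumptions, and the name `concatenate_encoders`, are illustrative only; other forms of concatenation are equally consistent with the description.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m (rows x cols) by vector v (length cols)."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def concatenate_encoders(a: Matrix, b: Matrix) -> Matrix:
    """Copy two child encoders into one block-diagonal parent encoder."""
    cols_a, cols_b = len(a[0]), len(b[0])
    top = [list(row) + [0.0] * cols_b for row in a]      # child a acts on the first block
    bottom = [[0.0] * cols_a + list(row) for row in b]   # child b acts on the second block
    return top + bottom


# Hypothetical child encoders after their initial, independent training.
child_encoder_1: Matrix = [[0.5, 0.5]]               # 2 -> 1
child_encoder_2: Matrix = [[1.0, 0.0], [0.0, 2.0]]   # 2 -> 2

fusion_encoder_init = concatenate_encoders(child_encoder_1, child_encoder_2)
fused = matvec(fusion_encoder_init, [2.0, 4.0, 1.0, 3.0])
```

Starting from such a copy, the parent node would then be trained further so that its latent space captures information shared between the children rather than merely stacking their encodings.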
In light of the above, data fusion can be obtained through the mapping of at least two representations of separate sensor data (in this case, the encoded sensor data E₁, E₂) into a combined representation C₁ in a corresponding latent space. Whilst this fusion can be obtained without the encoding of data by the sensor nodes 10, 20, this extra level of encoding makes the system more efficient by compressing the data to avoid transmission overheads in the situation where sensors are distributed and so sensor data needs to be transmitted to the fusion node 30 for combination.
The sensor nodes 10, 20 may be integrated into the corresponding sensors. That is, the first sensor node 10 may be integrated into a first sensor obtaining the first sensor data S₁. The second sensor node 20 may be similarly integrated into a second sensor obtaining the second sensor data S₂. This may be through the addition of a specially configured processor for implementing the functionality of the corresponding sensor node, or may be through integration of software into the processor of the sensor (that is also configured to obtain the sensor data). Integrating the sensor nodes into the hardware of the sensors allows the data to be compressed before it is output by the sensor thereby increasing the efficiency of the overall system.
Whilst the embodiment of FIG. 1 shows only two sensor nodes 10, 20 and a single fusion node 30, the methods described herein may be expanded to any number of sensors, any number of levels of fusion and any corresponding network structure.
FIG. 2 shows a schematic detailing sensor fusion within a multi-level sensor-action fusion system according to an embodiment. As with the system of FIG. 1, the system is made up of a number of nodes in various levels in the form of a graph or tree structure. Each node implements an encoder-decoder pair (not shown) to provide compression and, at higher levels, to combine multiple inputs.
A set of I sensor inputs 110 {S_i}_{i=0}^{I} are input into a first level of nodes (sensor nodes 120). In the present embodiment, there are n sensor nodes. Each node is represented by its level number j and the number i of the node within the level via N_i^j.
Each sensor node 120 receives at least one corresponding sensor input 110; however, multiple sensor inputs may be received by a single sensor node and combined in a manner similar to that described with regard to the fusion node 30 of FIG. 1 (embedding the sensor inputs into a single latent space). An example of this is shown in FIG. 2 at node N_{n-1}^1. In this case, there is no need to independently encode the sensor inputs before combining them at the sensor node, but instead the sensor inputs can be mapped directly onto the embedding space.
Each sensor node 120 encodes its input(s) and outputs the encoding to the nodes 130 in the next level. The next level then combines the input encoding data to form one or more combinations of the encodings. These combinations are then passed up to nodes 130 in the next level for combination, and this process continues until a final top/root node 140 (on level L) is reached which combines the encodings input from the penultimate level into one overall combination that contains a representation of information from all of the sensors 110.
The system therefore forms a network of neural networks (each node being a self-contained neural network). Various options for the arrangement of the overall network are shown in FIG. 2. For instance, data from sensor node 120 N_3^1 is not combined in the second level but is instead passed directly to a higher level for combination with one or more of the outputs of a higher level (such as output(s) of the second level). Equally, more than two encodings may be combined at each node, as shown with the top/root node 140 N_1^L which combines multiple encodings.
Through the arrangement of FIG. 2, data can be efficiently compressed at each node and combined at various levels of generality to form various combinations of sensor information. At each level, the encoded information (the combined information) may be output to a monitoring system. Equally, each node may store all of the relevant decoders for its descendants within the network. This allows each node to decode the data to reproduce the sensor data (translate the data back into the input space) or to make predictions for any encoded representation at any lower level within the network.
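The upward flow of encodings through such a network may be sketched, purely as an example, with a small tree of nodes. The `Node` class, the placeholder encoder `halve` (which averages adjacent pairs of values) and the node names are assumptions introduced for this sketch and stand in for the trained encoder of each node. The third sensor node bypasses the second level and feeds the root directly, mirroring one of the arrangements described above.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Vector = List[float]


def halve(v: Vector) -> Vector:
    """Placeholder encoder: average adjacent pairs, halving the length."""
    return [(v[i] + v[i + 1]) / 2.0 for i in range(0, len(v) - 1, 2)]


@dataclass
class Node:
    name: str
    children: List["Node"] = field(default_factory=list)
    encoder: Callable[[Vector], Vector] = halve

    def encode(self, sensor_inputs: Dict[str, Vector]) -> Vector:
        """Encode own sensor input (leaf) or the combined child encodings."""
        if not self.children:
            return self.encoder(sensor_inputs[self.name])
        combined: Vector = []
        for child in self.children:
            combined.extend(child.encode(sensor_inputs))
        return self.encoder(combined)


# Two sensor nodes fused at the second level; a third joins at the root.
n1, n2, n3 = Node("sensor_1"), Node("sensor_2"), Node("sensor_3")
fusion = Node("fusion", children=[n1, n2])
root = Node("root", children=[fusion, n3])

inputs = {
    "sensor_1": [1.0, 3.0, 5.0, 7.0],
    "sensor_2": [2.0, 2.0, 4.0, 4.0],
    "sensor_3": [10.0, 20.0, 30.0, 40.0],
}
root_encoding = root.encode(inputs)
```

Each call to `encode` on an intermediate node yields the combined representation at that level, which corresponds to the information that may be output to a monitoring system at that level of generality.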
The levels may be arranged to represent various resolutions within the data, with the resolution of the sensor data decreasing up the levels. This allows the sensor data to be interrogated at various resolutions for the various control processes (e.g. lower levels controlling more specific functionality, such as the control of individual sensors or the automation of individual cars (based on a fusion of the sensor data for that car), and higher levels controlling more general functions, such as the distribution of a fleet of cars).
The above discussion relates primarily to the manner in which sensor data is reported upwards within the network for the purposes of data fusion. Having said this, the embodiments described herein also pass data back down the network (back towards the sensor nodes) in order to provide feedback for actions {A_i}_{i=1}^{I} to be performed based on the fused sensor data. This may be achieved by implementing machine learning agents for issuing control signals in response to the sensor data. The agents may make use of recurrent neural networks (such as recurrent autoencoders) determining actions based on the present and past states. Such a system may be trained via reinforcement learning.
Accordingly, one or more of the input nodes 120 not only encode and decode data but also implement machine learning agents for determining actions based on the sensor data and issuing instructions to their corresponding sensors to control the sensors to improve data acquisition. By passing sensor data back down the network, individual agents can make use of shared information across a number of sensors to make more informed decisions.
The agents may be used to improve the quality of the data acquired by the sensors. For instance, each agent may issue control signals in accordance with determined actions to control the corresponding sensors for that agent. The control signals may direct the sensor's attention (e.g. via changing resolution or changing sensing direction) in order to focus on more useful or important features. For instance, a camera may be mounted on an actuator such that the orientation of the camera may be changed. Equally, the zoom, focus (or, in general, the resolution) of the camera may be altered through control signals to the camera which can adjust this resolution by controlling the arrangement of its lenses or mirrors.
In the present embodiment, the decision regarding where to focus each sensor is made based not only on the sensor data for that individual sensor 90, but also based on additional information passed down from higher levels in the network. This allows each agent to make a more informed decision with regard to where to focus the sensor in order to best improve the knowledge within the system.
An example of such a system would be the combination of a low-resolution camera with a wide-angle lens and a higher resolution camera with a narrower angle of view lens (a longer focal length). The wide-angle camera can be used to determine an overall view of the environment and this information can be used to direct the narrower-angle camera to focus on the most important features in the environment (that may currently be out of view for the narrower-angle camera). Accordingly, by implementing sensor-action fusion as described herein, a less expensive sensor set-up may be implemented without losing fidelity in the sensing resolution for the most important features in the environment.
FIG. 3 shows a schematic detailing action determination for a multi-level sensor-action fusion system according to an embodiment. The network has the same structure as that of FIG. 2; however, data is now being passed back down the network from higher levels to the input level. Data is passed down the network to each node (other than the root node 140) from its corresponding parent node.
The sensor nodes 120 in the first level implement machine learning agents that control the sensors connected to each sensor node 120. Each sensor node 120 is able to issue control signals to direct the attention of its corresponding sensor(s) to focus the sensor(s) on specific features (specific regions or areas within the environment being sensed).
FIG. 4 shows a schematic detailing a node within a sensor fusion system according to an embodiment.
The present node is a sensor node within the first level of the hierarchical network. As described previously, the node includes an encoder 12 and a decoder 14. The node receives sensor data 50 from a sensor 90, along with a sensor identifier (ID) 54 and a time-stamp 56. The sensor ID 54 uniquely identifies the sensor. The time-stamp 56 conveys the time at which the sensor data 50 (the measurement(s)) was taken.
The encoder maps the sensor data 50 to a latent space to form encoded sensor data 60. The encoded sensor data 60 can be shared with a parent node in a higher level which, in turn, can provide predicted encoded sensor data that includes a fusion of data from a variety of sensors. This predicted encoded sensor data is decoded by the decoder 14 to produce predicted sensor data 70. The decoder also determines a predicted sensor ID 74 and a predicted time-stamp 76. That is, the decoder generates a reproduction of each input parameter. This allows the node to be trained via generative adversarial techniques to further improve its accuracy at compressing the data.
The node not only encodes and decodes data for use in sensor fusion, but also acts as a machine learning agent for controlling the sensor to improve sensor data acquisition. The agent is configured to determine an action 80 in response to the current state (e.g. an action to adjust the configuration of the associated sensor(s) to focus on more important features within the environment). At each time-step, the node is configured to determine an action 80 based on encoded data 60 (be that the encoded sensor data from the encoder 12 or the predicted encoded sensor data from the parent node). Reinforcement learning can then be used to train the system to learn the optimal actions for various states. Whilst this embodiment relates to the determination of an action at each time-step, in alternative embodiments, an action may be determined after a predefined number of time-steps or after a certain criterion has been met (e.g. at a specified time-step).
The actions relate to the adjustment of a configuration of one or more sensors that the given node is controlling. In one embodiment, one node controls one sensor, with shared information between sensors being obtained at higher levels within the network of neural networks.
In the present embodiment, the predicted sensor data 70 relates to the sensor data at a future time-step (such as the immediately succeeding time-step), after the determined action 80 has been implemented. In this case, the prediction error is assessed relative to the measured sensor data at the future time-step (after the action 80 has been implemented).
Whilst the node ideally makes use of the predicted encoded sensor data received from its parent node, the node can operate independently of the hierarchical network so that the sensor 90 can be controlled even when connection with the other nodes in the network has been lost. In this case, the action 80 would be based just on the encoded sensor data from the sensor 90 for that node. Having said this, the node can make more informed decisions by making use of the shared information from the other sensors. In this case, the action would be based on the predicted encoded sensor data received from the parent node. This predicted sensor data is a mapping of the shared information across the siblings for the node (the child nodes of the node's parent node) into the latent space for the node.
The agent can be implemented through a recurrent neural network. This means that the neural network can take into account the information from previous time-steps when determining an encoded representation of input sensor data. For each time-step, a hidden state is calculated and used to determine the output. The hidden state is passed to the next time-step. The time-step does not have to be fixed. The hidden state from the previous time-step is then used to condition the output (the encoding) for that time-step. This provides some form of memory (via the hidden state) to the system to learn features (e.g. patterns) over time. This allows the agent to determine more effective actions.
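The role of the hidden state may be illustrated with a deliberately small sketch. A single scalar recurrent unit with fixed, hypothetical weights `w_x` and `w_h` is assumed here in place of a trained recurrent neural network; the function names are likewise illustrative.

```python
import math
from typing import List


def recurrent_step(x: float, h_prev: float, w_x: float = 0.5, w_h: float = 0.5) -> float:
    """Compute the new hidden state from the current input and the previous state."""
    return math.tanh(w_x * x + w_h * h_prev)


def encode_sequence(xs: List[float]) -> List[float]:
    """Run the recurrent unit over a sequence, returning the hidden state per time-step."""
    h = 0.0  # initial hidden state (assumption: no prior history)
    states = []
    for x in xs:
        h = recurrent_step(x, h)  # hidden state is passed on to the next time-step
        states.append(h)
    return states


# The final input is 0.0 in both sequences, yet the outputs differ because
# the hidden state carries information from the earlier time-step.
with_history = encode_sequence([1.0, 0.0])
without_history = encode_sequence([0.0, 0.0])
```

Because the same present input produces different encodings depending on what preceded it, an agent built in this way can respond to patterns that unfold over time, such as an object moving through the field of view.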
The agent can be trained via reinforcement learning. That is, each time an action is determined based on an input state and the action is applied to the environment (the configuration of the sensor is adapted), an updated state is received (updated sensor data) which is then used to determine a reward (based on a reward function) or, conversely, a loss (based on a loss function). The parameters of the agent are then updated to minimise the loss or maximise the reward. The remainder of the application discusses the use of a loss function. Having said this, a reward function may equally be used (for instance, by taking the inverse of the parameters that are included in the loss function). Accordingly, for the purposes of this application, the maximisation of a reward function is considered equivalent to the minimisation of a cost function.
The system can be trained to determine the optimal actions in order to minimise an uncertainty within the system (by incorporating an uncertainty term into the loss function). Specifically, in one embodiment, the system is trained in order to minimize surprise. This may be based on the prediction error. The system may be trained in order to be biased towards an action that provides the largest decrease in prediction error. This allows the system to operate in the area between the known and the unknown and take the actions that best improve the knowledge within the system (that provide the greatest increase in knowledge/largest decrease in prediction error).
Specifically, the node is trained to identify the prediction error in the system. The node predicts a future state (e.g. the sensor data for a future time-step such as the next time-step) and, upon measuring the actual future state (receiving the sensor data for the future time-step) determines the prediction error (the difference between the predicted state and the actual measured state).
The prediction error is included within the loss function (the cost function) so that the system is trained to minimise the prediction error. This trains the system to make more accurate predictions.
The system can be configured to find the gradient of the prediction error over various actions and learn to take actions that tend towards steeper prediction error gradients. This biases the node to take actions that focus on regions of input space where the prediction error gradient/input gradient is higher. This is effectively achieved by including a corresponding term (relating to the gradient of the prediction error) in the cost function being implemented during training. This results in a system that learns to take actions that lead to the fastest rate of learning. That is, the system is biased towards actions that decrease the prediction error by the greatest amount; the system learns to take the actions that are most likely to increase the knowledge (decrease the prediction error) in the system at the fastest rate.
By optimising based on the prediction error gradients (biasing the system towards areas that provide the greatest learning), the actions in the system are biased towards regions of novelty (e.g. regions with new features less observed with higher uncertainty about their subsequent states) and/or changes in the environment (e.g. previously observed objects moving within the environment).
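As a non-limiting example, the bias towards actions that reduce the prediction error most quickly may be approximated by tracking, for each candidate action, how the prediction error changed the last time that action was taken. The finite-difference estimate of the gradient, the action names and the greedy selection rule below are assumptions made for this sketch; in the embodiments the corresponding term is incorporated into the cost function used during training rather than applied as an explicit rule.

```python
from typing import Dict, List


def learning_progress(errors: List[float]) -> float:
    """Decrease in prediction error between the two most recent observations."""
    if len(errors) < 2:
        return 0.0  # no estimate available yet
    return errors[-2] - errors[-1]


def select_action(error_history: Dict[str, List[float]]) -> str:
    """Pick the action whose prediction error has been falling fastest."""
    return max(error_history, key=lambda a: learning_progress(error_history[a]))


# Hypothetical prediction-error histories for three sensor actions.
history = {
    "pan_left": [0.90, 0.85],   # high error but little improvement
    "pan_right": [0.80, 0.40],  # error falling quickly: most to learn here
    "hold": [0.10, 0.10],       # already well predicted
}
chosen = select_action(history)
```

Note that the selection favours neither the largest nor the smallest absolute error but the steepest improvement, which is what directs attention towards regions that are novel yet learnable.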
As mentioned previously, the agent may be implemented using a recurrent neural network. This provides some form of memory to allow the agent to take into account changes in the input over time.
Furthermore, the actions of the system can be weighted to focus on certain predefined features that may be considered more important. These features can be defined by the user. An example of a potential feature is the image of a stop-sign for an autonomous vehicle. A sensor action fusion system implemented within an autonomous vehicle can be biased to direct more attention to stop-signs as these will need to be recognised accurately for an autonomous vehicle control system to control the vehicle safely. The system may be biased in this way towards specific features through the use of a machine learning classifier.
For instance, a node may implement a classifier to determine whether a given set of features are present within the received sensor data. The future action for the node may then be conditioned on the classification. That is, a value, reward or cost function implemented by each agent (the action generator) to determine the next action may include a weighting towards predefined features that are identified as being important. Alternatively, or in addition, one or more terms may be included that bias the agent away from regions containing predefined features that are identified as not important. Accordingly, the agent makes use of the classification of the data to implement actions to direct attention towards features of importance to the user.
The classifier for identifying specific user defined features can be trained via supervised learning based on labelled data. Classification is included within the embodiment of FIG. 4. The node receives, as an input, a label 52 indicating the classification of the sensor data 50. The classification may include classifications for various regions within the environment (e.g. corresponding to various coordinates in the sensor data). The node then generates a classification (a predicted label) 72 based on the sensor data 50, the encoded sensor data 60 or, preferably, the predicted encoded sensor data received from the parent node. The system is trained to minimise the classification error based on the ground truth label 52.
Once the classifier is trained, the agent utilises the classification to bias actions towards regions containing features of importance, or away from regions containing features of less importance. This is achieved by applying a weighting to the system based on the identified features. For instance, the cost function may be weighted to reward actions that focus on predefined features of importance and punish actions that focus on predefined features of lower importance. An example of a lower-importance feature in the context of autonomous vehicles might be a section of the vehicle that the sensor is attached to (e.g. the bonnet of the car that the sensor is attached to), whilst an example of a feature of importance might be a pedestrian, another vehicle or a stop-sign.
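The weighting described above may be sketched as a per-feature contribution to the cost. The numerical weights, the feature and region names and the helper functions are hypothetical values chosen for illustration; negative weights act as rewards for important features and positive weights as penalties for features of lower importance.

```python
from typing import Dict, List

# Hypothetical user-defined weights: negative values reward attention,
# positive values penalise it.
FEATURE_WEIGHTS: Dict[str, float] = {
    "pedestrian": -1.0,
    "stop_sign": -0.8,
    "bonnet": 0.5,
}


def feature_cost(labels: List[str]) -> float:
    """Sum the weights of the features the classifier predicts for a region."""
    return sum(FEATURE_WEIGHTS.get(label, 0.0) for label in labels)


def pick_region(predicted_labels: Dict[str, List[str]]) -> str:
    """Choose the region whose predicted features give the lowest cost."""
    return min(predicted_labels, key=lambda r: feature_cost(predicted_labels[r]))


# Predicted labels per candidate region, as might be produced by the classifier.
regions = {
    "ahead": ["stop_sign"],
    "down": ["bonnet"],
    "left": ["pedestrian", "bonnet"],
    "sky": [],
}
target = pick_region(regions)
```

In a full system this feature term would be one component of the overall cost alongside terms such as the prediction error, rather than the sole basis for the action.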
Each classifier may be applied either to the data input into the node or to the data embedded by the node. In one embodiment, the classifier makes use of the predicted encoded sensor data obtained from the parent node. This allows the classifier to make use of the shared information between the various sensors.
Whilst the above embodiment discusses a classifier, one or more discriminators may equally be used to determine whether a given feature or set of features are present.
In addition to the above, each node is able to implement variable bandwidth control. This allows the nodes to adapt the amount of data transferred between nodes to avoid excessive transmission overheads. That is, the system can be trained to determine the amount of data that is to be transferred between nodes in order to balance latency and transmission overhead requirements against the accuracy of the system.
The amount of data that is transferred between nodes can be changed by changing the amount of compression performed by a node (e.g. the size of the latent space). To achieve this, the node receives a variable bandwidth hyperparameter 58. This defines an initial setting for the amount of data shared between the nodes. The variable bandwidth hyperparameter 58 may be chosen by the user in accordance with the technical criteria for the system (e.g. the latency requirements and transmission overheads within the system).
The variable bandwidth hyperparameter 58 is input into the encoder 12, which generates the encoded sensor data 60. The decoder 14 determines, from the encoded sensor data 60, a bandwidth action 78. The bandwidth action 78 defines the bandwidth for the node (the amount of data to be sent to the parent node). The agent is configured to adjust the bandwidth (adjust the size of the latent space) based on a trained reinforcement learning model. The agent (via the decoder 14) outputs a bandwidth action 78. The bandwidth action 78 is an action on the bandwidth to adjust the bandwidth (adjust the size of the latent space).
The agent can learn to trade off the transmission and performance requirements of the system. This can be through the inclusion of one or more parameters within the loss function that punish the transmission of large amounts of data but reward improvements in performance (e.g. improvements in prediction error).
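A minimal sketch of such a trade-off is given below. It assumes a loss formed as the prediction error plus a linear penalty on the latent-space size, and a table of hypothetical prediction errors for a handful of candidate sizes; neither the linear form nor the exhaustive comparison is taken from the embodiments, in which the bandwidth action is produced by a trained reinforcement learning model.

```python
from typing import Dict


def bandwidth_loss(prediction_error: float, latent_size: int, penalty: float) -> float:
    """Loss that rewards accuracy but punishes transmitting a larger encoding."""
    return prediction_error + penalty * latent_size


def choose_bandwidth(error_by_size: Dict[int, float], penalty: float) -> int:
    """Select the latent-space size with the lowest combined loss."""
    return min(
        error_by_size,
        key=lambda n: bandwidth_loss(error_by_size[n], n, penalty),
    )


# Hypothetical prediction errors observed at different latent-space sizes.
observed_error = {2: 0.90, 4: 0.50, 8: 0.45, 16: 0.44}

constrained = choose_bandwidth(observed_error, penalty=0.05)
unconstrained = choose_bandwidth(observed_error, penalty=0.0)
```

With a non-zero penalty the sketch settles on a moderate latent-space size once the gain in accuracy no longer justifies the additional data, whereas with no penalty it simply selects the largest size.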
At the next time-step, the bandwidth action 78 is enacted (the updated bandwidth is used). The node therefore shares data with the other nodes (e.g. the parent node) in accordance with the updated bandwidth and receives a response from the parent node (e.g. including the predicted encoded sensor data). The node uses this response to determine a new bandwidth action 78. This process repeats for each time-step.
Where automatic bandwidth control is implemented, the predicted sensor data 70 may be predicted based not only on the action 80 performed on the sensor but also on the bandwidth action 78 to be performed on the bandwidth.
Whilst the above embodiment discusses updating the bandwidth at each time-step, the action may be performed over any time-period, such as after a predefined number of time-steps.
By adjusting the amount of data transferred by adjusting the compression (the size of the latent space) along with the other optimisation functions (such as optimising based on prediction error), the network learns to prioritise the transfer of novelty (e.g. new, or relatively rare, features within the data) over common features.
The above embodiment relates to a node that controls how much data it transmits to a parent node. In an alternative embodiment, data is pulled up from the parent node. That is, the parent node can control how much data the child node shares. This mechanism is the same as in the above embodiment; however, the parent node determines the bandwidth (the size of the latent space) for the child node and sends this bandwidth to the child node to instruct the child node to adjust its bandwidth accordingly (adjust the size of the bottleneck for the child node).
Each agent is autonomous, in that it is able to take actions regardless of the actions taken by other agents in the system. Each agent takes actions based on the information that it has available to it. Accordingly, where a node loses communication with the network, it is still able to operate, but will only make decisions based on its local sensor data (the locally encoded sensor data 60), rather than any shared sensor data obtained from other nodes (the predicted sensor data from the parent node). This provides resilience within the system.
FIG. 5 shows a method for determining an action for adjusting the configuration of a sensor according to an embodiment. This method may be performed by one of the nodes of the first level (the sensor nodes). In the present embodiment, the node is associated with one sensor. The node may be integrated within the sensor.
The node receives sensor data from the sensor 202. The sensor data is encoded 204 by mapping the sensor data onto a corresponding latent space. This compresses the sensor data.
The encoded data is sent 206 to a fusion node in the level above the node. This fusion node is the parent node for the present node. The fusion node is configured to combine encoded sensor data from multiple nodes by mapping each encoded sensor data onto a corresponding latent space to produce an encoded combination. This encoded combination is decoded into the latent space of the node and this decoded information is passed back to the node. This forms a recurrent configuration. The shared latent space in the fusion node supports this translation of the mutual information across the different lower-level nodes.
The node receives the information passed back from the fusion node 208. The node then decodes this information to determine an action. The action is a change to the configuration of the sensor in order to optimise information acquisition. The action is biased towards minimising surprise (reducing prediction uncertainty). The action may also be biased towards features identified by the user as being of importance. Further weightings/biases may be applied to the actions, for instance, to penalise large changes in configuration (e.g. large movements).
The node determines its action based on a policy. This can be trained via reinforcement learning. As mentioned above, the agent for each node may be trained by determining the action and determining predicted sensor data. The node may be trained via a parameter update method that attempts to minimize the following function F:
F = x + y + z + A*error(E) + B*predicted error(E(E)) + C*predicted predicted error(E(E(E))) + ...
where the parameters x, y and z are user defined cost functions over action states and environmental states. These action states can include bandwidth control (controlling the size of the latent space). A, B and C are tuneable hyperparameters.
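For illustration, the function F may be evaluated as a weighted sum once values for its terms are available. The sketch below assumes that x, y and z have already been evaluated to scalar costs and that the error, the predicted error and the predicted predicted error are supplied as an ordered sequence paired with the hyperparameters A, B, C and so on; the function name `objective` and the example values are hypothetical.

```python
from typing import Sequence


def objective(
    x: float,
    y: float,
    z: float,
    errors: Sequence[float],
    weights: Sequence[float],
) -> float:
    """Evaluate F = x + y + z + A*error + B*predicted error + C*... ."""
    if len(errors) != len(weights):
        raise ValueError("each error term needs exactly one weight")
    weighted_errors = sum(w * e for w, e in zip(weights, errors))
    return x + y + z + weighted_errors


# Hypothetical values: user-defined costs followed by the error terms
# (error, predicted error, predicted predicted error) and weights (A, B, C).
f_value = objective(
    x=0.1,
    y=0.2,
    z=0.3,
    errors=[1.0, 0.5, 0.25],
    weights=[1.0, 0.5, 0.25],
)
```

A parameter update method would then adjust the parameters of the node so as to reduce this value over successive time-steps.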
The node then issues an instruction to the sensor to update its configuration according to the determined action 212. The sensor makes the required changes and then obtains further sensor data using the updated configuration. This further sensor data is passed back to the node, which then implements the method of FIG. 5 again to determine combined data and determine a new action for further changing the configuration of the sensor.
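One iteration of the method of FIG. 5 may be sketched as follows, including the fallback to local data when the parent node cannot be reached. The function `determine_action`, the stand-in `encode`, `policy` and parent functions, and the use of `ConnectionError` to signal a lost link are all assumptions made for this sketch; in the embodiments the encoder and policy are trained neural networks.

```python
from typing import Callable, List, Optional

Vector = List[float]


def determine_action(
    sensor_data: Vector,
    encode: Callable[[Vector], Vector],
    query_parent: Callable[[Vector], Vector],
    policy: Callable[[Vector], str],
) -> str:
    """Encode sensor data, consult the parent if reachable, and pick an action."""
    encoded = encode(sensor_data)  # map the sensor data onto the latent space
    try:
        feedback: Optional[Vector] = query_parent(encoded)  # send up, receive fused data
    except ConnectionError:
        feedback = None  # link lost: operate autonomously on local data
    basis = feedback if feedback is not None else encoded
    return policy(basis)


# Hypothetical stand-ins for the trained components.
def encode(v: Vector) -> Vector:
    return [sum(v) / len(v)]


def policy(latent: Vector) -> str:
    return "pan_right" if latent[0] > 0.5 else "hold"


def connected_parent(encoded: Vector) -> Vector:
    return [0.9]  # fused view suggests something of interest elsewhere


def disconnected_parent(encoded: Vector) -> Vector:
    raise ConnectionError("parent node unreachable")


action_with_fusion = determine_action([0.2, 0.4], encode, connected_parent, policy)
action_local_only = determine_action([0.2, 0.4], encode, disconnected_parent, policy)
```

The two results differ because the fused information from the parent node indicates a feature that the local sensor data alone does not reveal, while the node remains operable in either case.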
By utilising data fusion to inform actions to improve data acquisition, the data obtained by the system can be improved, and the system can be better configured to focus on the most important aspects of the environment. This can provide higher resolution than would be available with each sensor independently (through data fusion) and can help the system adapt to the loss of information from various sensors (e.g. through damage or temporary loss of communication).
As discussed herein, the various nodes within the system share information to allow for better decisions to be taken by each node. Data may be pushed from one node to another, or may be pulled from one node to another. Each communication link within the network of nodes may be governed by dedicated communication modules within each node. These may implement variable bandwidth control (automatic bandwidth management) in order to adjust the amount of data shared between the nodes.
Local bandwidth can be controlled by an output state of the local node which in turn may be controlled by cyclical and/or recurrent processes acting on the node. Part of the output of the node (through the decoder) is the bandwidth action that controls the amount of data that is transmitted to another node (via controlling the size of the latent space for the node). This ensures that the local and global bandwidth requirements are reduced within the network, thereby leading to a more efficient network.
Training may continue during use, such that the system continues to train itself and adapt to new scenarios by updating the parameters of the network based on new sensor data.
In addition to making use of surprise minimisation for improving data acquisition, this may also be used by the system to flag anomalies to a user. This can help determine whether there are any errors in the system and/or whether the data has been tampered with (for instance, by an intruder within the system).
Novel sensor data may be flagged for checking. This may be through the administration of a threshold (for instance, threshold prediction error). A dedicated neural network may be implemented for reviewing flagged sensor data. This may be trained to distinguish between genuine sensor data and erroneous or edited sensor data. If the data is flagged as accurate, then it is passed back to the system to further train the system. If the data is flagged as erroneous, then it can be passed to an administrator for review.
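The threshold-based flagging described above may be sketched as a simple comparison of each prediction error against a chosen threshold. The function name and the threshold value are hypothetical, and the subsequent review by a dedicated neural network is not shown.

```python
from typing import List


def flag_for_review(prediction_errors: List[float], threshold: float) -> List[int]:
    """Return the indices of time-steps whose prediction error exceeds the threshold."""
    return [i for i, err in enumerate(prediction_errors) if err > threshold]


# Hypothetical prediction errors over four time-steps.
flagged = flag_for_review([0.1, 0.9, 0.2, 1.5], threshold=0.8)
```

The sensor data at the flagged time-steps would then be passed on for checking, with data judged accurate being returned to the system for further training.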
Further security improvements are provided by the fact that only encoded data is shared between nodes. As this data is encoded via machine learning techniques, it can only be decoded by a system that has the corresponding decoder. This means that the data is protected from man-in-the-middle attacks, as anyone who intercepts the communication would be unable to recover the sensor data (which may include private information) without the corresponding decoder.
In light of the above, methods and systems are presented for optimising sensor data acquisition. This may be for use with machine learning systems such as control systems for autonomous vehicles. Sensor data is shared between nodes, with data being compressed before transmission to improve efficiency. Each sensor may be provided with its own sensor node to compress data and to control its configuration. Data is fused at higher levels within the network and shared information is passed back to the bottom level to help inform actions to change the configuration of the sensors to adjust their attention/focus towards the most important features in the environment. This may be based on surprise minimisation, biasing towards predefined features of importance and biasing towards novel and/or changing features. In this way, the system is able to configure its sensors to obtain more useful information through sharing of information across the system.
Implementations of the subject matter and the operations described in this specification can be realized in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them. For instance, hardware may include processors, microprocessors, electronic circuitry, electronic components, integrated circuits, etc. Implementations of the subject matter described in this specification can be realized using one or more computer programs, i.e., one or more modules of computer program instructions, encoded on computer storage medium for execution by, or to control the operation of, data processing apparatus. Alternatively or in addition, the program instructions can be encoded on an artificially-generated propagated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal that is generated to encode information for transmission to suitable receiver apparatus for execution by a data processing apparatus. A computer storage medium can be, or be included in, a computer-readable storage device, a computer-readable storage substrate, a random or serial access memory array or device, or a combination of one or more of them. Moreover, while a computer storage medium is not a propagated signal, a computer storage medium can be a source or destination of computer program instructions encoded in an artificially-generated propagated signal. The computer storage medium can also be, or be included in, one or more separate physical components or media (e.g., multiple CDs, disks, or other storage devices).
While certain arrangements have been described, the arrangements have been presented by way of example only, and are not intended to limit the scope of protection. The inventive concepts described herein may be implemented in a variety of other forms. In addition, various omissions, substitutions and changes to the specific implementations described herein may be made without departing from the scope of protection defined in the following claims.

Claims

1. A method for controlling the configuration of one or more sensors based on information shared between a plurality of sensors, the method comprising:

establishing a hierarchical network of nodes comprising at least a first level comprising a plurality of child nodes and a second level comprising one or more parent nodes, wherein each of the child nodes is assigned to a corresponding sensor and each of the one or more parent nodes is assigned to a corresponding group of child nodes to combine sensor data from the corresponding group of child nodes;

at each parent node:

receiving, from each of the child nodes in the corresponding group of child nodes for the parent node, sensor data for the sensor corresponding to that child node, the sensor data occupying a corresponding sensor feature space for the child node;

encoding the received sensor data to form an encoded combination of sensor data by mapping the received sensor data to a latent space for the parent node;

decoding the encoded combination of sensor data to map, for each of one or more of the child nodes of the corresponding group, the encoded combination of sensor data to the sensor feature space for the child node to form a corresponding decoded combination of sensor data; and

sending each decoded combination of sensor data to the child node corresponding to the sensor feature space for that decoded combination of sensor data; and

at each child node that receives a decoded combination of sensor data:

determining an action for updating a configuration of the corresponding sensor based on the received decoded combination of sensor data; and

issuing an instruction to adjust the configuration of the corresponding sensor in accordance with the action.

2. The method of claim 1 further comprising at each child node:

receiving one or more sensor measurements from the sensor corresponding to the child node;

encoding the one or more sensor measurements to compress the one or more sensor measurements by mapping the one or more sensor measurements onto a sensor feature space for the child node to form the sensor data, the sensor feature space being a latent space; and

sending the sensor data to the parent node corresponding to the child node.

3. The method of claim 1 wherein each child node that receives a decoded combination of sensor data implements an agent for determining the action that is biased towards selecting an action that achieves one or more of reducing prediction error or focusing on one or more predefined features of interest.

4. The method of claim 3 wherein each child node that receives a decoded combination of sensor data implements a classifier configured to identify the one or more predefined features of interest within the corresponding decoded combination of sensor data and bias the agent towards an action that focuses on the one or more predefined features of interest.

5. The method of claim 3 wherein each agent:

determines predicted sensor data based on the decoded combination of sensor data;

determines a prediction error based on the predicted sensor data; and

is biased towards an action that minimises a cost function comprising the prediction error.

6. The method of claim 5 wherein each agent:

determines predicted sensor data based on the combination of sensor data, determines the prediction error based on the predicted sensor data and determines a gradient of the prediction error over the action space for the node; and

is biased towards determining an action that minimises a cost function comprising the gradient of the prediction error.

7. The method of claim 1 wherein the action comprises one or more of:

adjusting a resolution of the corresponding sensor;

adjusting a focus of the corresponding sensor; and

directing the corresponding sensor to sense an updated region.

8. The method of claim 7 wherein each child node is implemented in a corresponding processor connected directly to the corresponding sensor for that child node.

9. The method of claim 8 wherein the second level is implemented in a second set of one or more processors and wherein the processors for the first level communicate the sensor data to the one or more processors for the second level.

10. The method of claim 1 further comprising:

at each child node that receives a decoded combination of sensor data, determining a bandwidth action to adjust a size of the sensor feature space for the child node based on the received decoded combination of sensor data and adjusting the size of the sensor feature space in accordance with the action.

11. The method of claim 10 wherein the bandwidth action is biased towards a bandwidth action that reduces the size of the sensor feature space but is biased away from an action that increases a prediction error for the child node.

12. A node for controlling the configuration of a sensor based on information shared between a plurality of sensors, the node comprising a processor configured to:

receive one or more sensor measurements from the sensor;

encode the one or more sensor measurements to compress the one or more sensor measurements by mapping the one or more sensor measurements onto a latent space for the node to form encoded sensor data;

send the encoded sensor data to a parent node for combination with further encoded sensor data from one or more other sensors of the plurality of sensors;

receive from the parent node a combination of sensor data comprising a combination of the encoded sensor data for the node and the further encoded sensor data from the one or more other sensors mapped to the latent space of the node;

determine an action for updating a configuration of the sensor based on the combination of sensor data; and

issue an instruction to adjust the configuration of the sensor in accordance with the action.

13. The node of claim 12 wherein the node is biased towards selecting an action that achieves one or more of reducing prediction error or focusing on one or more predefined features of interest.

14. The node of claim 13 wherein the processor is configured to implement a classifier configured to identify one or more predefined features of interest within the combination of sensor data and bias the node towards an action that focuses on the one or more predefined features of interest.

15. The node of claim 13 wherein:

the processor is configured to determine predicted sensor data based on the combination of sensor data and determine the prediction error based on the predicted sensor data; and

the processor is biased towards selecting an action that minimises a cost function comprising the prediction error.

16. The node of claim 13 wherein:

the processor is configured to determine predicted sensor data based on the combination of sensor data, determine the prediction error based on the predicted sensor data and determine a gradient of the prediction error over the action space for the node; and

the processor is biased towards determining an action that minimises a cost function comprising the gradient of the prediction error.

17. The node of claim 12 wherein the action comprises one or more of:

adjusting a resolution of the sensor;

adjusting a focus of the sensor; and

directing the sensor to sense an updated region.

18. The node of claim 12 wherein the processor is configured to be connected directly to the sensor for receiving the sensor data.

19. The node of claim 12 wherein the processor is configured to determine a bandwidth action for adjusting the size of the latent space based on the combination of sensor data, wherein the processor is biased towards a bandwidth action that reduces the size of the latent space but is biased away from an action that increases a prediction error for the node.

20. The node of claim 12 wherein the action is determined using a reinforcement learning agent in accordance with parameters of the agent and wherein the processor is configured to update the parameters of the agent to reduce a cost function based on one or more of a prediction error, a gradient of the prediction error over an action space for the node, and a weighting towards one or more predefined features of interest.

21. A parent node for combining sensor data from multiple sensors for use by one or more child nodes in controlling the configuration of one or more of the sensors, the parent node comprising a processor configured to:

receive, from each of a group of child nodes, sensor data for a sensor corresponding to the child node, the sensor data occupying a corresponding sensor feature space for the child node;

encode the received sensor data to form an encoded combination of sensor data by mapping the received sensor data to a latent space for the parent node;

decode the encoded combination of sensor data to map, for each of one or more of the child nodes of the group, the encoded combination of sensor data to the sensor feature space for the child node to form a corresponding decoded combination of sensor data; and

send each decoded combination of sensor data to the child node corresponding to the sensor feature space for that decoded combination of sensor data to enable the child node to determine and issue an action for updating a configuration of the corresponding sensor for that child node based on the corresponding combination of sensor data.

22. A computing system comprising one or more processors configured to implement the method of claim 1.

23. A non-transitory computer readable medium comprising computer executable instructions that, when executed by a processor, cause the processor to implement the method of claim 1.