AU2008322797A1

AU2008322797A1 - Sensor control

Info

Publication number: AU2008322797A1
Application number: AU2008322797A
Authority: AU
Inventors: David Nicholson; Antony James Waldock
Original assignee: BAE Systems PLC
Current assignee: BAE Systems PLC
Priority date: 2007-11-12
Filing date: 2008-11-11
Publication date: 2009-05-22
Anticipated expiration: 2028-11-11
Also published as: WO2009063182A1; JP4991876B2; JP2010503874A; AU2008322797B2; EP2215602A1

Description

WO 2009/063182 PCT/GB2008/003791

Sensor Control

The present invention relates to controlling at least one sensor.

Sensors are widely used for monitoring and surveillance applications and it is often useful to detect and track moving targets. Generally, any particular sensor will be limited in terms of the number of targets that it can sense at any one time and in many cases a sensor may be limited to sensing one target at any given time interval. Therefore, when there are multiple targets a decision will need to be made regarding which target a particular sensor is to measure/sense.

According to one aspect of the present invention there is provided a method of controlling a sensor to measure one target from a plurality of targets, the method including:
predicting states of a plurality of targets;
receiving information regarding a state of a said target from the plurality of targets obtained by at least one other sensor;
generating a set of probability distributions, each said probability distribution in the set representing a setting or settings of at least one control parameter of the sensor;
calculating an expected information gain value for each said control parameter in the set, a said information gain value representing an expected quality of a measurement of one of the targets taken by the sensor if controlled according to the control parameter, based on the predicted state of the target;
updating the set of probability distributions to identify the sensor control parameters that maximise the expected information gain value, and
controlling the sensor in accordance with the maximising control parameters,
the steps of generating the set of probability distributions and calculating the information gain values include:
generating a sample block using the probability distributions over the control parameters of the sensor and the at least one other sensor;
evaluating a global objective function, and
updating the set of probability distributions for the sensor and the at least one other sensor using the global objective function,
characterised in that a Monte Carlo Optimisation technique involving immediate sampling and parametric learning is used for the updating of the set of probability distributions.

The parametric learning technique may comprise cross-validation. The step of predicting the states of the targets may be implemented using an information filter technique, such as an Information-Based Kalman filter. The target state may correspond to a spatial state of the target, such as coordinates representing its position, its bearing/trajectory and/or its velocity, and the step of predicting the state may use a target motion model.
The expected information gain value may be calculated for a set of control parameters $\theta$ using an equation:

$$\mathcal{I}_{\theta}(k) = \frac{1}{2} \log \frac{\left| Y(k \mid k-1) + I_{\theta}(k) \right|}{\left| Y(k \mid k-1) \right|}$$

where $Y(k \mid k-1)$ is an information matrix at time $k$ based on all measurements made by the sensor and/or the at least one other sensor up to time $k-1$ and $I_{\theta}(k)$ is an information matrix associated with a measurement made by the first-mentioned sensor at time $k$ for a set of said control parameters.

The method may further include transferring information regarding the state of the one target obtained by the sensor to the at least one other sensor. The at least one other sensor may be configured to execute at least some of the steps of the method of the first-mentioned sensor.

According to another aspect of the present invention there is provided a sensor controllable to measure one target from a plurality of targets, the sensor including:
means for predicting states of a plurality of targets;
means for receiving information regarding a state of a said target from the plurality of targets obtained by at least one other sensor;
means for generating a set of probability distributions, each said probability distribution in the set representing a setting or settings of at least one control parameter of the sensor;
means for calculating an expected information gain value for each said control parameter in the set, a said information gain value representing an expected quality of a measurement of one of the targets taken by the sensor if controlled according to the control parameter, based on the predicted state of the target;
means for updating the set of probability distributions to identify the sensor control parameters that maximise the expected information gain value, and
means for controlling the sensor in accordance with the maximising control parameters,
the means for generating the set of probability distributions and the means for calculating the expected information gain values being configured to:
generate a sample block using the probability distributions over the control parameters of the sensor and the at least one other sensor;
evaluate a global objective function, and
update the set of probability distributions for the sensor and the at least one other sensor using the global objective function,
characterised in that a Monte Carlo Optimisation technique involving immediate sampling and parametric learning is used for the updating of the set of probability distributions.

According to another aspect of the present invention there is provided a computer program product comprising a computer readable medium, having thereon computer program code means, when the program code is loaded, to make the computer execute a method of configuring a sensor to sense one target from a plurality of targets substantially as described herein.

According to yet another aspect of the present invention there is provided a plurality of sensors substantially as described herein, each sensor being configured to communicate information that it obtains regarding the state of the one target to at least one other said sensor.

Whilst the invention has been described above, it extends to any inventive combination of features set out above or in the following description. Although illustrative embodiments of the invention are described in detail herein with reference to the accompanying drawings, it is to be understood that the invention is not limited to these precise embodiments. As such, many modifications and variations will be apparent to practitioners skilled in the art. Furthermore, it is contemplated that a particular feature described either individually or as part of an embodiment can be combined with other individually described features, or parts of other embodiments, even if the other features and embodiments make no mention of the particular feature.
Thus, the invention extends to such specific combinations not already described.

The invention may be performed in various ways, and, by way of example only, embodiments thereof will now be described, reference being made to the accompanying drawings in which:
Figure 1 is a schematic diagram of a plurality of sensors and plurality of targets;
Figure 2 illustrates schematically steps performed by an example process executed on a sensor;
Figures 3 and 4 illustrate schematically further detail regarding the steps performed;
Figure 5 is a graph showing a mean number of samples required for a time step during the target sensing process;
Figures 6A and 6B are graphs showing the mean global information and samples during the sensing process involving different numbers of sensors, and
Figures 7A and 7B are graphical representations of example probability distributions.

Referring to Figure 1, three sensors 102A - 102C are shown in an environment in which there are three moving targets 104A - 104C. It will be appreciated that the Figure is exemplary only and the system can operate with any number (from one upwards) of sensors and any reasonable number of targets. In the example the targets are moving (along the direction indicated by the respective arrows), whilst the sensors are essentially static. In other situations the sensor may be mounted on a vehicle, e.g. an autonomous land, air or water-based vehicle. The sensors can be controlled to improve sensing of a particular target, e.g. if the sensor includes an image input (e.g. a camera or infra-red image receiver) then the focus, orientation, etc. of the image-receiving component can be set. An example of a sensor that could be used is the AXIS 214 PTZ Camera used in CCTV systems. In alternative embodiments the sensors may be controllable in other ways, e.g. be relocated.
The term "control" herein is intended to cover adjustments that may commonly be called "re-configuration", e.g. contrast or compression modification, as well as conventional control operations, e.g. movement.

In some embodiments, the sensors can be configured as agents that are networked together and engage in tracking multiple targets in their environment (although other embodiments only require a single sensor). In multiple sensor embodiments, each sensor can take a resource-constrained action (orientate towards a particular target in the environment) that results in measuring the position of a limited number (e.g. one) of target(s). In embodiments where there is more than one sensor, two or more of the sensors may be allowed to form a coalition and measure the same target. The overall aim in such embodiments is to select a set of joint actions (sensor control parameters) that reduces the total amount of uncertainty associated with position and velocity estimates of the targets across the entire sensor network. In the example given herein, each sensor 102A - 102C is only allowed to sense one of the targets during any time step, although two or more sensors can sense the same target.

Figure 2 illustrates schematically steps performed by a processor that is in communication with one of the sensors 102A - 102C. The processor may be integral with the sensor, or it may be remote. The process 200 is based on a Decentralised Data Fusion (DDF) algorithm (DDF is described in J. Manyika and H.F. Durrant-Whyte, Data Fusion and Sensor Management: A Decentralised Information-Theoretic Approach, Ellis Horwood, 1994). DDF imposes architectural constraints on the sensor network, which eliminate the conventional notion of a fusion centre as well as access to full knowledge of the global network topology by each sensor node.
DDF also defines probabilistic information update algorithms that map to a variety of sensor network architectures. The algorithms are implemented at each sensor node, to filter and fuse their local data and to assimilate processed data from the other nodes.

In the example the sensor network comprises N (3) stationary sensors engaged in tracking M (3) mobile targets in their environment. The sensors implement DDF-based algorithms to estimate the states of the targets. In the example, the states estimated are dynamic states, specifically position and velocity, but it will be understood that other states (e.g. bearing, temperature, identity, etc.) could be processed. Interleaved within each sensor's DDF-based process is a target assignment algorithm that informs the sensor nodes about which target to observe, given the constraint that they can only observe one-out-of-M targets at each sensing opportunity. However, two or more sensors may simultaneously observe the same target.

At step 202 the sensor algorithm and the physical sensor are initialised. The initialisation of the physical sensor will vary from device to device, but the algorithm initialisation will normally include initialising the sensor model. This can involve specifying the sensor position in Cartesian coordinates (assumed known); a nonlinear model relating sensor observations in polar coordinates (range and bearing) to target state variables (position and velocity) in Cartesian coordinates; or range and bearing observation noise standard deviations. The target model is also initialised by specifying either a dynamic model (e.g. Newton's equations, but could be more complex) or a process model (i.e. a zero-mean noise process of specified standard deviation which captures the difference between the target's true motion and its predicted motion). Sensor filter initialisation is also performed. A filter in this sense is a technique for calculating optimum (or near-optimum) estimates of process variables (target states) in the presence of noise. The filter is initialised to track the target state. The filter can be implemented using a variety of known techniques including a Kalman filter, particle filter or a grid-based tracking technique. The target state(s) tracked can vary depending on the exact application, but in the example they comprise position and velocity. In embodiments where there are a plurality of cooperating sensors, step 202 can also involve registration and discovery of the sensors on a distributed sensor network.

At step 204 the sensor algorithm predicts the states of the targets by means of a software implementation of equations that predict the target state (position and velocity) at one or more time steps in the future, based on the current state of the target and the target motion model. Kalman filter prediction equations are utilised in the example.
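As an illustration of the prediction at step 204, the following is a minimal sketch of a Kalman prediction for a single Cartesian axis, followed by conversion of the predicted covariance to an information matrix. It is not taken from the specification: the function names, the one-axis state `[position, velocity]`, the constant-velocity transition matrix and the white-noise-acceleration process noise with intensity `q` are all assumptions chosen to keep the example small.

```python
def mat_mul(a, b):
    """Multiply two small matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(row) for row in zip(*a)]


def predict_state(x, p, dt, q):
    """Kalman prediction for a constant-velocity target on one axis.

    x is [position, velocity], p is the 2x2 covariance, dt is the time
    step and q is an assumed process noise intensity (hypothetical).
    """
    f = [[1.0, dt], [0.0, 1.0]]  # linear motion model
    noise = [[q * dt ** 3 / 3.0, q * dt ** 2 / 2.0],
             [q * dt ** 2 / 2.0, q * dt]]  # additive Gaussian process noise
    x_pred = [x[0] + dt * x[1], x[1]]
    f_p_ft = mat_mul(mat_mul(f, p), transpose(f))
    p_pred = [[f_p_ft[r][c] + noise[r][c] for c in range(2)] for r in range(2)]
    return x_pred, p_pred


def information_matrix(p):
    """Invert a 2x2 covariance to obtain the information matrix Y = P^-1."""
    det = p[0][0] * p[1][1] - p[0][1] * p[1][0]
    return [[p[1][1] / det, -p[0][1] / det],
            [-p[1][0] / det, p[0][0] / det]]
```

A caller would run `predict_state` once per target and per time step, then pass the result of `information_matrix` to whatever assignment logic is in use.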
The DDF-based algorithm 200 maintains information states relating to the targets for computational and communication efficiency. However, information also provides a direct normative basis on which to manage the sensor-to-target assignments. The key quantity is the Fisher information matrix, $Y(k \mid k)$, which is calculated directly by the information form of the Kalman filter. The notation $(k \mid l)$ refers to an estimate at time $k$ conditioned on all observations up to and including time $l$. At step 204 of the example, the information filter of the sensor predicts the target state $Y(k \mid k-1)$ using a motion model for the specific target under track. The movement of the targets in the example is assumed to be based on a linear motion model with additive Gaussian process noise.

At step 206 the control/reconfiguration parameters of the sensor with respect to the predicted target states are set. The control or reconfiguration parameters can include, but are not limited to, position, orientation, internal reconfiguration or environmental manipulation. More specifically in the example, the parameter setting is performed so that the sensor is assigned to the one of the plurality of targets that it is to sense. A common assignment strategy is to assign sensor $i$ to target $j$ in order to maximise the mutual information gain $\mathcal{I}_{i,j}(k)$ defined in Equation 1 below. The sensor $i$ uses its observation model to predict the amount of information, $I_{i,j}(k)$, associated with observing each target $j$ at time $k$. The mutual information gain for an assignment $i \rightarrow j$ is:

$$\mathcal{I}_{i,j}(k) = \frac{1}{2} \log \frac{\left| Y_{i,j}(k \mid k-1) + I_{i,j}(k) \right|}{\left| Y_{i,j}(k \mid k-1) \right|}$$

Equation 1: Mutual Information Gain

At step 208 the sensor is controlled or reconfigured to observe/measure/sense the state of the selected target. The target state can be measured and expressed as a mean and standard deviation.

Steps 210 and 212 are performed in embodiments where there are a plurality of cooperating sensors.
In embodiments involving one sensor only, these two steps are omitted and the other steps in the process 200 of Figure 2 utilise information based on the measurements taken by the single sensor alone.

At step 210, when the sensor $i$ has observed its assigned target $j$, the observed information $I_{i,j}(k)$ is sent to all other sensors in the sensor network. The target measurements can be communicated via a globally broadcast message or propagated across the network between sensors via a point-to-point protocol.

At step 212 the sensor receives information from the other sensors regarding the targets they have observed. The sensor $i$ then assimilates its own information about the target with the information it has received about the target from its communication channels. The assimilation equation has the advantage of being additive in DDF:

$$Y_{i,j}(k \mid k) = Y_{i,j}(k \mid k-1) + \sum_{i=1}^{N} I_{i,j}(k)$$

Equation 2: DDF Information Update

At step 214 the sensor processor updates the estimated target states. The filter of the sensor is updated using the measurements of the targets that have been taken by the sensor itself and, in embodiments involving several cooperating sensors, the measurements received from at least one of the other sensors. The distributed data fusion products can be fed back to the target state prediction step 204 to form the basis for further sensor control and distributed data fusion steps. The exchange and assimilation of observation information in DDF couples future sensor-to-target assignment decisions, leading to coordinated decisions.

Step 216 is performed when the system is to be switched off and involves shutting down the sensor and network interfaces in a controlled manner.

In embodiments involving several cooperating sensors, negotiation techniques can be used to improve performance in terms of maximising the overall information gain resulting from substantially optimal sensor-to-target assignment.
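The assignment rule of Equation 1 and the additive update of Equation 2 can be sketched as follows. This is a hedged illustration only: it reads Equation 1 as half the log of the ratio of determinants, restricts every information matrix to 2x2 for brevity, and uses hypothetical function names that do not appear in the specification.

```python
import math


def det2(m):
    """Determinant of a 2x2 matrix stored as a list of rows."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def add2(a, b):
    """Element-wise sum of two 2x2 matrices."""
    return [[a[r][c] + b[r][c] for c in range(2)] for r in range(2)]


def mutual_information_gain(y_pred, i_obs):
    """Equation 1 as read here: 0.5 * log(|Y(k|k-1) + I(k)| / |Y(k|k-1)|)."""
    return 0.5 * math.log(det2(add2(y_pred, i_obs)) / det2(y_pred))


def best_target(y_preds, i_obs_per_target):
    """Index of the target whose observation yields the largest gain."""
    gains = [mutual_information_gain(y, i)
             for y, i in zip(y_preds, i_obs_per_target)]
    return max(range(len(gains)), key=gains.__getitem__)


def fuse(y_pred, observed_infos):
    """Equation 2: add every received observation information matrix."""
    fused = y_pred
    for info in observed_infos:
        fused = add2(fused, info)
    return fused
```

With identical observation information, the gain is larger for the target whose predicted information matrix is smaller, so a sensor acting alone under this rule tends to pick the target it is most uncertain about.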
However, such negotiating does incur the expense of additional communication between the sensors. Explicit cooperation can be viewed as a distributed optimisation and a technique called Probability Collectives (PC) can be used to find the optimal joint action. For the example sensor configuration application, the cost function is defined in terms of the predicted information gain achievable from measuring a target's position. For a single sensor embodiment, the optimal control parameters (those that result in the minimum cost) can be found using an array of optimisation techniques because sampling from the cost function is relatively efficient. Sampling from the global cost function is computationally and communicationally expensive if the problem is not factorable, i.e. capable of being split into two single agent problems. Hence, the approach must intelligently sample from the global function to reduce the amount of computation and communication.

The problem of explicit cooperation within the context of a distributed sensor network can be formulated as a distributed optimisation using the joint objective function defined in Equation 3 below:

$$\underset{A}{\text{maximise}} \; \sum_{i=0} \sum_{j=0} \frac{1}{2} \log \frac{\left| Y_{i,j}(k \mid k) \right|}{\left| Y_{i,j}(k \mid k-1) \right|}$$

Equation 3: Joint Objective

Equation 3 defines that, for a given target-to-sensor assignment A, a sensor must evaluate the mutual information gain for all sensors and targets given the measurements taken by all sensors. This joint objective function is used for this work, but other objective functions are likely to include further terms that incorporate power requirements for sensing; time to execute the action; probability of acquisition, etc. Existing techniques that address explicit cooperation as a distributed optimisation are either centralised or rely on a smooth and differentiable utility function; however, the present inventors' approach eliminates this requirement and treats the optimisation as a Monte Carlo Optimisation (MCO).

PC can be used to efficiently perform a distributed MCO. PC is a broad framework for analysing and controlling distributed systems (see D.H. Wolpert, Collective Intelligence, Computational Intelligence Beyond 2001: Real and Imagined, Wiley, 2001). Typically an optimisation problem is solved by manipulating a set of optimisation variables, in a deterministic or stochastic fashion (e.g. Simulated Annealing), until some global objective or cost function of those variables is minimised. PC regards the variables as independent agents playing an iterated game. However, what is manipulated by PC is probability distributions over those variables.
The manipulation process seeks to induce a distribution that is highly peaked about the value of the variables that optimise the global objective function. A key result of PC is that the minimum value of the global cost function can be found by considering the maxent Lagrangian equation for each agent (variable). This is written as:

$$L_i(q_i) = g_i(q_i) - T \times S(q_i)$$

Equation 4: Lagrangian Equation

Here, $q_i$ is agent $i$'s probability distribution over its actions, denoted $x_i$; $g_i(q_i)$ is the expected cost evaluated with respect to the distributions of the agents other than $i$; $T$ is temperature; $S(q_i)$ is the entropy associated with the probability distribution $q_i$. PC algorithms are still being actively researched and matured, but the example employs the following algorithm for optimising the Lagrangian:

Algorithm 2 PC Optimisation
1: beta <- betamin
2: repeat
3:   iterations <- 0
4:   repeat
5:     Generate a Sample Block using qi from each agent
6:     Evaluate the expected global cost gi(qi)
7:     Update qi using gi(qi)
8:     iterations <- iterations + 1
9:   until (iterations > Imax) OR (S(qi) < Smin)
10:   beta <- alpha x beta
11: until beta > betamax

The maxent Lagrangian is convex over the set of product distributions over the agents' action space. By operating on $q_i$ in this convex space it is possible to use powerful search methods for finding function extrema developed for continuous domain problems, such as gradient descent. Note that while adding entropy makes the descent easier, it also biases the solution away from extreme solutions. That bias is gradually lowered by annealing $T$. The minimisation of the Lagrangian is amenable to solution using gradient descent or Newton updating since both the gradient and the Hessian are obtained in closed form.
Using Newton updating and forcing the constraint on total probability, the following update rule is obtained:

$$q_i(x_i) \leftarrow q_i(x_i) - \alpha \, q_i(x_i) \times \left[ \frac{E[G \mid x_i] - E[G]}{T} + S(q_i) + \ln q_i(x_i) \right]$$

Equation 4: PC Update Rule

where $x_i$ is agent $i$'s action and $G$ is the global cost function. The parameter $\alpha$ plays the role of a step size since the expectations result from the current probability distributions of all the agents. Constraints can be included by augmenting the global cost function with Lagrange multipliers and the constraint functions.

Performing the update involves a separate conditional expected utility for each agent. These are estimated either directly if a closed form expression is available, or with Monte Carlo sampling if no simple closed form exists. In Monte Carlo sampling the agents repeatedly and jointly draw independent and identically distributed (iid) samples from their probability distributions to generate joint actions, and the associated costs/utilities are recorded. Since accurate estimates usually require extensive sampling, the global cost $G$ occurring in each agent $i$'s update rule can be replaced with a private cost $g_i$ chosen to ensure that the Monte Carlo estimation of $E(g_i \mid x_i)$ has both low bias with respect to estimating $E(G \mid x_i)$ and low variance.

Now that the PC algorithm has been defined, the global cost function $G$ used to enable cooperative behaviour needs to be identified. DDF and PC are coupled by an information theoretic utility function: DDF operations create the utility function and PC determines the set of actions (here sensor-to-target assignments) that maximise it. Specifically, the total information contained in sensor $i$'s DDF estimates of the target set is defined in Equation 5. The global objective is simply the sum of the individual sensors' information contributions from across the sensor network.
Thus:

$$G_A(k) = -\sum_{i,j} \log \left| Y_{i,j}(k \mid k-1) + I_{i,j}(k) \right|$$

Equation 5: Joint Objective Function for PC

The minus sign appears above because PC performs minimisation. Now that the global cost function and actions have been defined, PC can be used to derive the optimal assignment.

A decentralised implementation of PC will now be discussed. As part of its optimisation process, PC requires each sensor to sample its probability distribution over sensor-to-target assignments. To perform the sampling in a decentralised sensor network (steps 5 and 6 in the PC optimisation algorithm above) a strategy based on the known token-ring strategy can be implemented. In token-ring message passing, the sensors are logically organised into a circle. A token travels around the circle to all the sensors on the network. To send a message around the network, a sensor catches the token and attaches a message to it. First, the token is passed around the network to build a sample block containing a set of joint actions. As the token arrives at each sensor, the current probability distribution over target assignments is used to populate the block with actions. Once a sample block has been constructed (i.e. has been passed around the entire network), the token is passed back around the network to allow each sensor to evaluate the set of joint actions within the sample block.

As the global cost function (Equation 5 above) is a sum over all the predicted global information, the local cost is added to the sample block as the token circulates. At this stage, the sample block contained in the token represents the expected cost from using the current probability distributions over target assignments. This expected cost can be used to update the probability distributions locally on each sensor. This approach enables the sample block to be generated and evaluated in a distributed manner without using a centralised oracle.
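The two token passes and the local update described above can be simulated in a single process as follows. This is a sketch under stated assumptions rather than the patented implementation: the update follows the PC Update Rule as read here (conditional minus mean expected cost divided by temperature, plus entropy, plus log-probability), the temperature is held fixed instead of annealed, probabilities are clipped and renormalised for numerical safety, and all function names and the callable `local_costs` interface are hypothetical.

```python
import math
import random


def entropy(q):
    """Shannon entropy S(q) of a discrete distribution."""
    return -sum(p * math.log(p) for p in q if p > 0.0)


def build_sample_block(dists, n_samples, rng):
    """First token pass: each sensor, in ring order, appends one sampled
    action to every joint action in the block."""
    block = [[] for _ in range(n_samples)]
    for q in dists:
        picks = rng.choices(range(len(q)), weights=q, k=n_samples)
        for joint, action in zip(block, picks):
            joint.append(action)
    return block


def evaluate_block(block, local_costs):
    """Second token pass: each sensor adds its local cost to every sample."""
    totals = [0.0] * len(block)
    for local_cost in local_costs:
        for n, joint in enumerate(block):
            totals[n] += local_cost(joint)
    return totals


def update_distribution(q, agent, block, costs, temperature, alpha):
    """Apply the update rule to one agent's distribution over its actions."""
    mean_cost = sum(costs) / len(costs)
    s = entropy(q)
    updated = []
    for action, p in enumerate(q):
        hits = [c for joint, c in zip(block, costs) if joint[agent] == action]
        # With no samples for this action, fall back to the overall mean.
        conditional = sum(hits) / len(hits) if hits else mean_cost
        step = (conditional - mean_cost) / temperature + s + math.log(p)
        updated.append(max(p - alpha * p * step, 1e-9))  # clip (assumption)
    total = sum(updated)
    return [p / total for p in updated]


def run_pc(local_costs, n_actions, iterations=30, n_samples=200,
           temperature=0.5, alpha=0.2, seed=0):
    """Inner loop of the PC optimisation at a fixed temperature."""
    rng = random.Random(seed)
    dists = [[1.0 / n_actions] * n_actions for _ in local_costs]
    for _ in range(iterations):
        block = build_sample_block(dists, n_samples, rng)
        costs = evaluate_block(block, local_costs)
        dists = [update_distribution(q, i, block, costs, temperature, alpha)
                 for i, q in enumerate(dists)]
    return dists
```

In a real network each list of picks and each local cost would be computed on a different sensor as the token arrives; here the loop order merely stands in for the ring order.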
The global information for a single sensor can be defined as:

$$\sum_{j} \frac{1}{2} \log \left[ (2 \pi e)^{-n} \left| Y_{i,j}(k \mid k) \right| \right]$$

Equation 6: Global Information for a Single Sensor

where $n$ is the dimension of the target state.

Figure 5 shows the mean number of samples required for each time step during the tracking scenario involving three static sensors and three moving targets. The profile of the sampling required follows the cooperation required in the tracking scenario. The two peaks in the sampling performed correspond to handover points in the scenario. A handover point is when the sensor-to-target assignment strategy changes, for example when two sensors swap the targets to measure. At these points, a greater number of samples is required to determine the optimal assignment, which provides evidence that the PC algorithm naturally adapts the communication depending on the cooperation required. For example, at handover points when tight cooperation is required, the amount of sampling is increased. Although this result is encouraging, the amount of sampling and hence communication is disappointing because on average the sensors are performing between 40 and 77 samples per time step.

Figures 6A and 6B show the mean global information and samples for the same tracking scenario with different numbers of sensors. Figure 6A shows that the gap in performance between the joint optimal action and a selfish solution increases as the number of sensors is increased. The difference between the performance of PC and a brute force or optimal solution is difficult to see because the performance is the same. This verifies that the PC algorithm results in the optimal joint action to perform. Figure 6B compares the number of samples required by PC to the complexity of the brute force or optimal approach.
As Figure 6B shows, the complexity of the brute force approach increases exponentially as the number of sensors is increased, while the complexity (samples required) of the PC algorithm remains constant. This provides a promising indication that the PC approach proposed will scale up to higher-dimensional problems.

Figure 3 illustrates steps performed during step 206 of Figure 2. At step 302 initialisation takes place by specifying the probability of the target being measured by the sensor over the range of the sensor's allowed control or reconfiguration parameters (based on the target's predicted position as computed at step 204). In general this can be a flat distribution, or it could be biased toward a specific control parameter if there is good prior knowledge or operational reasons to support this.

At step 304 the sensor maintains a probability distribution, which may be over its own parameters, or over the parameters of at least some of the other sensors in the network (joint control parameters), depending on the characteristic of the control problem. In this step the sensor draws multiple independent samples from these distributions to generate a sample block. A cost/utility value is also associated with each sample, as will be explained below with reference to Figure 4.

At step 306 the sensor updates the set of probability distributions over the control parameters. The probability distributions can be updated using a range of techniques, such as gradient descent or Nearest Newton. The update can be performed using only the latest sample block (delayed sampling) or using all the sample blocks (Immediate Sampling). With Immediate Sampling, the probability distributions are updated using all the samples contained in the previous sample blocks using a weighted average. An unbiased estimate is achieved by using a weight based on the inverse variance of the sample block.
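The inverse-variance weighting mentioned for Immediate Sampling can be illustrated with a short sketch. It assumes each sample block is reduced to a list of cost samples and that the quantity being combined is the mean cost per block; the function names and the variance floor are hypothetical and not taken from the specification.

```python
from statistics import mean, variance


def inverse_variance_weights(blocks, floor=1e-12):
    """Normalised weights proportional to 1 / sample variance of each block.

    Each block needs at least two cost samples; the floor (an assumption)
    guards against division by zero for a block with identical costs.
    """
    raw = [1.0 / max(variance(costs), floor) for costs in blocks]
    total = sum(raw)
    return [w / total for w in raw]


def combined_cost_estimate(blocks):
    """Weighted average of the block means, reusing every stored block."""
    weights = inverse_variance_weights(blocks)
    return sum(w * mean(costs) for w, costs in zip(weights, blocks))
```

Blocks whose costs are widely scattered contribute less to the combined estimate, which is one way older or noisier blocks can be reused without dominating the update.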
Immediate Sampling enables a well-principled approach to the reuse of previous samples and hence reduces the number of samples that must be taken.

Another method to reduce the number of samples taken, and hence the communication between the sensors, is to adjust the parameter beta (e.g. beta = 1/T in Equation 4 above) automatically using parametric learning techniques. These are a general set of techniques that can be used by Immediate Sampling to refine its optimisation performance. One such parametric learning technique is cross-validation. Rather than use a fixed cooling schedule, the value of beta can be adapted to enable rapid cooling when possible. The optimal beta parameter is calculated using cross-validation. Cross-validation is implemented by dividing the complete set of samples into two: a training set and a test set. The training set is used to update the probability distributions as above and then the test set is used to evaluate the cost associated with the new probability distributions. The beta parameter that results in minimising the cost function is used to update the probability distributions using the complete set of samples. Using cross-validation to adapt the beta parameter results in fewer samples and hence reduced communication between sensors.
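The cross-validation step can be sketched for a single agent as follows. This is a simplified illustration, not the method of the specification: a Boltzmann distribution over the training-set mean costs stands in for the full probability update, the samples are split by alternating index, and the function names and the `(action, cost)` sample format are assumptions.

```python
import math


def mean_cost_per_action(samples, n_actions):
    """Average recorded cost for each action; unseen actions get the overall mean."""
    totals = [0.0] * n_actions
    counts = [0] * n_actions
    for action, cost in samples:
        totals[action] += cost
        counts[action] += 1
    overall = sum(totals) / max(sum(counts), 1)
    return [totals[a] / counts[a] if counts[a] else overall
            for a in range(n_actions)]


def boltzmann(costs, beta):
    """Distribution proportional to exp(-beta * cost), shifted for stability."""
    low = min(costs)
    weights = [math.exp(-beta * (c - low)) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]


def choose_beta(samples, n_actions, candidate_betas):
    """Pick the beta whose training-set distribution has the lowest
    expected cost when scored on the held-out test set."""
    train, test = samples[0::2], samples[1::2]
    train_costs = mean_cost_per_action(train, n_actions)
    test_costs = mean_cost_per_action(test, n_actions)

    def held_out_cost(beta):
        q = boltzmann(train_costs, beta)
        return sum(p * c for p, c in zip(q, test_costs))

    return min(candidate_betas, key=held_out_cost)
```

When the training and test sets agree, the largest candidate beta wins and cooling can proceed quickly; when they disagree, a smaller beta is preferred, which keeps the distribution broad until more samples are available.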

The inventors have recognised the value of using these two techniques for solving sensor control problems in distributed networks. Specifically, they do not require a central control point and are potentially less bandwidth-intensive than alternative distributed control solutions. The reasons for this are two-fold:

1. Immediate Sampling allows efficient reuse of old sample blocks such that the amount of communication between the sensors is significantly reduced.
2. Often, in stochastic optimisation methods, one has to 'guess' a cooling schedule and does so conservatively in order to capture a global solution. Cross-validation enables the cooling schedule to be set automatically so bandwidth (and compute) resources are not wasted on inappropriately fine-scaled searches.

The probability distributions can be updated using a variety of optimisation techniques, e.g. a simple hill-climbing algorithm. It will also be appreciated that other parametric learning techniques, such as Gaussian Processes, could be used to "intelligently" draw samples from the sensors' probability distributions over their actions. As these samples are communicated (typically over bandwidth-constrained links) a reduced sample set is desirable.

At step 308 the set of probability distributions about the optimal control parameters are sharpened. This can be achieved using an iterative process that is terminated by a convergence criterion relating to a judgement about how sharp those distributions need to be in practice. In practice, this iterative process is likely to be controlled by two parameters: an upper limit on the time taken to perform the optimisation, and the accuracy of the sensor actuation. For example, if the sensor can only orientate to within +/- 5 degrees then this will determine the variance (sharpness) of the target probability distribution required.
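One possible form of the convergence criterion for step 308 is sketched below. It assumes a discrete distribution over candidate orientation angles and treats the distribution as sharp enough once its standard deviation falls within the actuation tolerance; the function names, the use of standard deviation as the sharpness measure and the elapsed-time argument are all assumptions made for illustration.

```python
import math


def distribution_std(values, probs):
    """Standard deviation of a discrete distribution over control values."""
    centre = sum(v * p for v, p in zip(values, probs))
    spread = sum(p * (v - centre) ** 2 for v, p in zip(values, probs))
    return math.sqrt(spread)


def should_stop(values, probs, actuation_tolerance, elapsed_s, time_limit_s):
    """Stop sharpening when the time budget is spent or the distribution
    is already narrower than the sensor can actuate."""
    if elapsed_s >= time_limit_s:
        return True
    return distribution_std(values, probs) <= actuation_tolerance
```

For a sensor that can only orientate to within 5 degrees, `actuation_tolerance` would be set to 5.0 when `values` are angles in degrees.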
At step 310 the terminal probability distributions are sampled for a final time and the mean (or median) of those samples defines the sensing action, which is then executed on the sensors to control/reconfigure them.

Figure 4 illustrates steps performed during step 304 of Figure 3. At step 402 an initial/empty sample block is populated with a joint set of control or reconfiguration parameters. A sample block can either be populated by a single sensor or by involving all sensors, depending on the type of control problem being solved. The control parameters are drawn from the current probability distributions. Graphical representations of example probability distributions are shown in Figures 7A and 7B. In Figure 7A the X-axis represents a control parameter of the sensor, whilst the Y-axis represents the probability of the sensor measuring the target. In the example, the line peaks around value 0 on the X-axis, indicating that those control parameter values are the ones at which the sensor is most likely to measure the target. It will be appreciated that the probability distributions can be based on more than one control parameter of the sensor. In Figure 7B the X-axis and the Y-axis represent two control parameters (e.g. tilt and pan angles), with the shading/colour of the plot representing the probability of the target being measured at those parameters.

At step 404, once the sample block has been populated with the control or reconfiguration parameters for all sensors, the cost of executing these parameters can be evaluated by all the sensors using Equation 5. The table below illustrates an example sample block.
θ1    θ2    θ3    H(θ1)   H(θ2)   H(θ3)   G
23    45    67    0.8     0.1     0.6     1
45    23    67    0.1     0.2     0.8     3

Where θ1-3 are a set of three control parameters of the sensor (the angle to which the sensor will orientate in this case); H is the sampling probability with which the action (θn) was selected and G is the associated cost for the joint set of actions.
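A sample block of this form might be populated as sketched below. The sketch assumes discrete probability distributions over a few candidate angles and uses a placeholder joint cost, since Equation 5 is not reproduced in this passage; `SampleRow`, `populate_block`, `final_action`, `joint_cost` and the example distributions are hypothetical names and values. The last helper mirrors step 310 by taking the median of the sampled parameters.

```python
import random
import statistics
from dataclasses import dataclass


@dataclass
class SampleRow:
    thetas: tuple  # drawn control parameters, in degrees
    probs: tuple   # H: probability with which each parameter was drawn
    cost: float    # G: cost of the joint set of actions


def populate_block(distributions, joint_cost, n_rows, rng):
    """Draw n_rows joint actions, one parameter per distribution."""
    block = []
    for _ in range(n_rows):
        thetas, probs = [], []
        for dist in distributions:
            angles = list(dist)
            theta = rng.choices(angles, weights=[dist[a] for a in angles])[0]
            thetas.append(theta)
            probs.append(dist[theta])
        block.append(SampleRow(tuple(thetas), tuple(probs), joint_cost(thetas)))
    return block


def final_action(block):
    """Median of the sampled values for each control parameter."""
    n = len(block[0].thetas)
    return tuple(statistics.median(row.thetas[i] for row in block)
                 for i in range(n))


def joint_cost(thetas):
    # Placeholder for the global cost: penalise duplicated orientations.
    return float(len(thetas) - len(set(thetas)))


# Hypothetical distributions over three candidate angles per parameter.
distributions = [
    {23: 0.8, 45: 0.1, 67: 0.1},
    {23: 0.2, 45: 0.7, 67: 0.1},
    {23: 0.1, 45: 0.1, 67: 0.8},
]

block = populate_block(distributions, joint_cost, 9, random.Random(2))
for row in block:
    print(row.thetas, row.probs, row.cost)
print(final_action(block))
```

Storing H alongside each drawn parameter is what allows an old block to be re-weighted later rather than discarded.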

Claims

1. A method of controlling a sensor to sense one target from a plurality of targets, the method including:
predicting states of a plurality of targets;
receiving information regarding a state of a said target from the plurality of targets obtained by at least one other sensor;
generating a set of probability distributions, each said probability distribution in the set representing a setting or settings of at least one control parameter of the sensor;
calculating an expected information gain value for each said control parameter in the set, a said information gain value representing an expected quality of a measurement of one of the targets taken by the sensor if controlled according to the control parameter, based on the predicted state of the target;
updating the set of probability distributions to identify the sensor control parameters that maximise the expected information gain value, and
controlling the sensor in accordance with the maximising control parameters,
the steps of generating the set of probability distributions and calculating the information gain values include:
generating a sample block using the probability distributions over the control parameters of the sensor and the at least one other sensor;
evaluating a global objective function, and
updating the set of probability distributions for the sensor and the at least one other sensor using the global objective function, wherein a Monte Carlo Optimisation technique involving immediate sampling and parametric learning is used for the updating of the set of probability distributions.

2. A method according to claim 1, wherein the parametric learning technique comprises cross-validation.

3. A method according to claim 1 or 2, wherein the step of predicting the states of the targets is implemented using an information filter technique, such as an Information-Based Kalman filter.

4. A method according to claim 3, wherein the target state corresponds to a spatial state of the target, such as coordinates representing its position, its bearing/trajectory and/or its velocity, and the step of predicting the target state uses a target motion model for the information filter technique.

5. A method according to any one of the preceding claims, wherein the expected information gain value is calculated using an equation:

$$I(k) = \frac{1}{2} \log \frac{\left| Y(k \mid k-1) + I_\theta(k) \right|}{\left| Y(k \mid k-1) \right|}$$

where $Y(k \mid k-1)$ is an information matrix at time k based on all measurements made by the sensor and the at least one other sensor up to time k-1 and $I_\theta(k)$ is an information matrix associated with a measurement made by the first-mentioned sensor at time k for a set of said control parameters.

6. A computer program product comprising computer readable medium, having thereon computer program code means, when the program code is loaded, to make the computer execute a method of controlling a sensor to measure one target from a plurality of targets by:
predicting states of a plurality of targets;
receiving information regarding a state of a said target from the plurality of targets obtained by at least one other sensor;
generating a set of probability distributions, each said probability distribution in the set representing a setting or settings of at least one control parameter of the sensor;
calculating an expected information gain value for each said control parameter in the set, a said information gain value representing an expected quality of a measurement of one of the targets taken by the sensor if controlled according to the control parameter, based on the predicted state of the target;
updating the set of probability distributions to identify the sensor control parameters that maximise the expected information gain value, and
controlling the sensor in accordance with the maximising control parameters,
the generating of the set of probability distributions and the calculating of the information gain values including:
generating a sample block using the probability distributions over the control parameters of the sensor and the at least one other sensor;
evaluating a global objective function, and
updating the set of probability distributions for the sensor and the at least one other sensor using the global objective function, wherein a Monte Carlo Optimisation technique involving immediate sampling and parametric learning is used for the updating of the set of probability distributions.

7. A sensor controllable to measure one target from a plurality of targets, the sensor including:
means for predicting states of a plurality of targets;
means for receiving information regarding a state of a said target from the plurality of targets obtained by at least one other sensor;
means for generating a set of probability distributions, each said probability distribution in the set representing a setting or settings of at least one control parameter of the sensor;
means for calculating an expected information gain value for each said control parameter in the set, a said information gain value representing an expected quality of a measurement of one of the targets taken by the sensor if controlled according to the control parameter, based on the predicted state of the target;
means for updating the set of probability distributions to identify the sensor control parameters that maximise the expected information gain value, and
means for controlling the sensor in accordance with the maximising control parameters,
the means for generating the set of probability distributions and the means for calculating the expected information gain values being configured to:
generate a sample block using the probability distributions over the control parameters of the sensor and the at least one other sensor;
evaluate a global objective function, and
update the set of probability distributions for the sensor and the at least one other sensor using the global objective function, wherein a Monte Carlo Optimisation technique involving immediate sampling and parametric learning is used for the updating of the set of probability distributions.
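By way of illustration only, the expected information gain recited in claim 5 may be evaluated as sketched below. The sketch assumes the gain takes the form of half the logarithm of the ratio of the determinants of the updated and predicted information matrices, and that the predicted information matrix is positive definite; the matrices and the helper names `det` and `expected_information_gain` are hypothetical.

```python
import math


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d  # a row swap flips the sign of the determinant
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d


def expected_information_gain(y_pred, i_theta):
    """0.5 * log(|Y(k|k-1) + I_theta(k)| / |Y(k|k-1)|) (assumed form)."""
    y_post = [[y + i for y, i in zip(row_y, row_i)]
              for row_y, row_i in zip(y_pred, i_theta)]
    return 0.5 * math.log(det(y_post) / det(y_pred))


# Hypothetical 2x2 example: predicted information and two candidate settings.
y_pred = [[1.0, 0.0], [0.0, 1.0]]
candidates = {
    "setting_a": [[3.0, 0.0], [0.0, 0.0]],  # informative along one axis
    "setting_b": [[0.5, 0.0], [0.0, 0.5]],  # weakly informative along both
}
gains = {name: expected_information_gain(y_pred, info)
         for name, info in candidates.items()}
best_setting = max(gains, key=gains.get)
print(gains, best_setting)
```

A setting whose measurement adds no information yields a gain of zero, so ranking candidate settings by this value favours those expected to reduce uncertainty about the target state the most.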