WO2012034169A1 - Decentralised Control - Google Patents


Info

Publication number
WO2012034169A1
WO2012034169A1 (PCT/AU2011/001175)
Authority
WO
WIPO (PCT)
Prior art keywords
vehicle
decision
decentralised
variables
discrete
Prior art date
Application number
PCT/AU2011/001175
Other languages
English (en)
Inventor
Zhe XU
Salah Sukkarieh
Original Assignee
The University Of Sydney
Priority date
Filing date
Publication date
Priority claimed from AU2010904113A external-priority patent/AU2010904113A0/en
Application filed by The University Of Sydney filed Critical The University Of Sydney
Publication of WO2012034169A1

Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05D SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00 Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/02 Control of position or course in two dimensions
    • G05D1/021 Control of position or course in two dimensions specially adapted to land vehicles
    • G05D1/0287 Control of position or course in two dimensions specially adapted to land vehicles involving a plurality of land vehicles, e.g. fleet or convoy travelling
    • G05D1/0291 Fleet control
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05D SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00 Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/02 Control of position or course in two dimensions
    • G05D1/021 Control of position or course in two dimensions specially adapted to land vehicles
    • G05D1/0231 Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means
    • G05D1/0246 Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means using a video camera in combination with image processing means

Definitions

  • the present invention relates to pluralities of vehicles and methods of operating a plurality of vehicles.
  • the present invention relates to a decentralised optimisation process for operating a plurality of vehicles.
  • a centralised approach to coordination has a number of disadvantages. For example, if the central decision making node fails, the rest of the system tends to fail also. Also, as the number of robots in the team increases, the computational load on the central node increases. Also, the resulting solution tends not to be modular, since the central node must have an understanding of the system as a whole.
  • the present invention provides a method of operating a plurality of vehicles, wherein the plurality of vehicles is tasked to perform a common task, the method comprising: for each vehicle, defining a decision vector; wherein defining a decision vector comprises: defining at least one continuous variable; defining at least one discrete variable; and forming into a vector, values for each of the defined variables; and updating the values of the variables in the decision vector of each vehicle by performing a decentralised optimisation process; wherein each variable corresponds to a decision to be made by the corresponding vehicle; the decentralised optimisation process comprises determining a function representative of the common task; the function representative of the common task is a function of the continuous variable(s) and the discrete variable(s); and the discrete variables are constrained in the function representative of the common task to being continuous over a range of values.
  • the decentralised optimisation process may comprise performing an iterative process until either: (a) values for the discrete and continuous variables have been converged upon, or (b) until a predetermined number of iterations has been performed.
  • the discrete variables may be constrained using one or more penalty multipliers.
  • the method may further comprise iteratively updating the decision vectors of each vehicle by iteratively performing the decentralised optimisation process.
  • the penalty multipliers may be reduced.
  • the penalty multipliers may be reduced geometrically with a common ratio.
  • the penalty multipliers may be reduced according to a schedule.
  • the schedule may be implemented using a plurality of synchronised clocks, and each vehicle may comprise one or more clocks.
  • the decentralised optimisation process may be iteratively performed until the penalty multipliers have values below a predetermined threshold value.
  • the method may further comprise, for each vehicle, determining decisions corresponding to the updated decision vector for that vehicle.
  • the method may further comprise each vehicle enacting the corresponding determined decisions.
  • the decentralised optimisation process may further comprise, for each vehicle, determining a partial derivative of the function representative of the common task with respect to the decision vector of that vehicle.
  • the present invention provides a system comprising: a plurality of vehicles, each vehicle comprising a processor and a sensor; wherein for each vehicle, a decision vector is defined using a process of defining a decision vector; the process of defining a decision vector comprises: defining at least one continuous variable; defining at least one discrete variable; and forming into a vector, values for each of the defined variables; each variable corresponds to a decision to be made by the corresponding vehicle; each sensor is arranged to measure the value of one or more parameter(s) for the corresponding vehicle; and using the measured parameter values, the processors are arranged to update the values of the variables in the decision vector of each vehicle by performing a decentralised optimisation process; the decentralised optimisation process comprises determining a function representative of the common task; the function representative of the common task is a function of the continuous variable(s) and the discrete variable(s); and the discrete variables are constrained in the function representative of the common task to being continuous over a range of values.
  • the present invention provides a program or plurality of programs arranged such that when executed by a computer system or one or more processors it/they cause the computer system or the one or more processors to operate in accordance with the method of any of the above aspects.
  • the present invention provides a machine readable storage medium storing a program or at least one of the plurality of programs according to the above aspect.
  • Figure 1 is a schematic illustration (not to scale) of an example vehicle that will be used in an embodiment of a decentralised optimisation algorithm
  • Figure 2 is a schematic illustration (not to scale) of a scenario in which the vehicle will be used to implement an embodiment of the decentralised optimisation algorithm
  • Figure 3 is a diagram showing certain steps of an embodiment of the decentralised optimisation algorithm.
  • Figure 1 is a schematic illustration (not to scale) of an example vehicle 2 that will be used in an embodiment of a decentralised optimisation algorithm, which is described in more detail later below with reference to Figure 3.
  • the vehicle 2 is an autonomous, unmanned land-based vehicle.
  • the vehicle 2 comprises a sensor 40, a navigation module 41 , a processor 5, and a transceiver 6.
  • the sensor 40 is a bearings-only sensor, e.g. a camera.
  • the sensor 40 has a limited field of view.
  • the field of view of the sensor 40 is schematically illustrated by dotted lines in Figure 1 , which are indicated by the reference numeral 7.
  • the sensor 40 is mounted on a pan-tilt mount such that it can be pointed in different directions for a given vehicle facing.
  • the sensor 40 is coupled to the processor 5. Thus, in operation, data gathered using the sensor 40 is sent from the sensor 40 to the processor 5 by which it is processed, as described in more detail later below.
  • the navigation module 41 measures a plurality of internal parameters of the vehicle 2.
  • internal parameters is used herein to refer to parameters whose values are only determined by the vehicle 2 itself, i.e. parameters whose values are not determined by an external entity, for example a further vehicle.
  • the internal parameters measured by the navigation module 41 may include the velocity, orientation, and position of the vehicle 2.
  • the internal parameters may include parameters corresponding to an accelerometer, a gyroscope, slippage, tyre shape, camera calibration etc.
  • the navigation module 41 is coupled to the processor 5. Thus, in operation, data gathered using the navigation module 41 is sent from the navigation module 41 to the processor 5 by which it is processed, as described in more detail later below.
  • the transceiver 6 is adapted to receive data from entities that are remote from the vehicle 2, for example other vehicles (as described in more detail later below).
  • the transceiver 6 is coupled to the processor 5. Thus, in operation, information received by the transceiver 6 is sent from the transceiver 6 to the processor 5 by which it is processed, as described in more detail later below.
  • the transceiver 6 is adapted to transmit data to entities that are remote from the vehicle 2, for example other vehicles (as described in more detail later below).
  • the processor 5 is arranged to output information to the transceiver 6.
  • information received by the transceiver 6 from the processor 5 is sent from the transceiver 6 to entities external to the vehicle 2, as described in more detail later below.
  • Figure 2 is a schematic illustration (not to scale) of a scenario in which the vehicle 2 will be used to implement an embodiment of the decentralised optimisation algorithm, which is described in more detail later below with reference to Figure 3.
  • a team of three vehicles 2 is tasked to localise a number of static point targets (each indicated by an 'X' and the reference numeral 8 in Figure 2).
  • Each of the vehicles 2 in the team of vehicles is a vehicle as described above with reference to Figure 1.
  • the targets 8 are spread out relative to each other in an open environment.
  • the sensor 40 on each of the vehicles 2 is a bearings-only sensor which has a limited field of view.
  • a particular sensor 40 can track only one target 8 at a time.
  • the goal of the team of vehicles is to gather information about the environment. Specifically, the team of vehicles is tasked to localise a set of point targets 8 within the environment.
  • the sensors 40 which the vehicles 2 use to localise the targets 8 have a limited field of view, but can rotate to face any direction.
  • each vehicle 2 must make two decisions:
  • each vehicle 2 must decide which direction to travel in.
  • this decision is made in a continuous space.
  • the decision parameter may be the steering angle for the vehicle 2 which can be any of a continuous range of values.
  • each vehicle 2 must decide at which target 8 it should point its sensor 40.
  • Each vehicle 2 has a state vector.
  • a state vector comprises, for example, a velocity vector, a position vector, etc. for a particular vehicle 2.
  • the vector x_i is the state vector of the ith vehicle 2.
  • a team state vector x is a vector comprising the state vectors of each vehicle 2 in the team of vehicles.
  • Each vehicle 2 has a decision vector.
  • the vector u_i is the decision vector of the ith vehicle 2.
  • Each vehicle 2 has a discrete decision space (comprising e.g. the decision as to which target should the vehicle 2 point the sensor 40 at), and a continuous decision space (comprising e.g. the decision as to which direction to travel in).
  • Each vehicle 2 represents its discrete decision space with a vector of binary indicator variables of length equal to the number of the discrete decisions available for that vehicle 2.
  • an indicator variable is associated with each possible discrete decision available to the vehicle 2.
  • the scalar a_ij is the binary indication (0 or 1) of the jth discrete decision available to the ith vehicle 2.
  • exactly one indicator variable must be 1 at a time; the rest must be 0 (since the sensor 40 of the vehicle can only be pointed at one target 8 at a time).
  • the remainder of the individual decision vector u_i comprises the continuous decision variables, where c_ij is the value of the jth continuous decision available to the ith vehicle 2.
  • a team decision vector U is a vector comprising the decision vectors of each vehicle 2 in the team of vehicles.
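The decision-vector structure described above can be sketched as follows; the function name and the two-target example values are illustrative, not taken from the patent:

```python
import numpy as np

def make_decision_vector(indicators, continuous):
    """Form a per-vehicle decision vector u_i by stacking the binary
    indicator variables (one per available discrete decision) with the
    continuous decision variables."""
    return np.concatenate([np.asarray(indicators, float),
                           np.asarray(continuous, float)])

# Vehicle 1 points its sensor at the first of two targets (discrete
# decision) while steering at 0.35 rad (continuous decision).
u_1 = make_decision_vector(indicators=[1, 0], continuous=[0.35])

# The team decision vector stacks the decision vectors of all vehicles.
u_team = np.concatenate([u_1, make_decision_vector([0, 1], [-0.10])])
```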
  • Each vehicle 2 can communicate (via its transceiver 6) its decision vector u_i to the other vehicles in the team.
  • the deterministic belief held by the ith vehicle 2 of the decision vector of the jth vehicle 2 is u_j.
  • the discrete and continuous components of u_j are a_j and c_j respectively.
  • Each vehicle 2 is able to evaluate the Jacobian of the team utility U(x, u) with respect to its own decision variables.
  • these constraints can be either inter- or intra-vehicle.
  • the constraints may be any continuous and twice differentiable function of the state and joint decision vector.
  • constraints on the continuous decision variables are also modelled in this framework. In this embodiment, the binary restrictions on the indicator variables a_ij are relaxed. This is so that these variables can take any value in the range [0, 1].
  • the problem of coordinating the vehicles 2 is advantageously transformed into a continuous optimisation problem.
  • the relaxation of the binary restrictions on the indicator variables advantageously transforms the Mixed Integer Programming problem (which has NP complexity) into a continuous optimisation problem. This advantageously tends to be easier to solve.
  • Figure 3 is a diagram showing certain steps of an embodiment of the decentralised optimisation algorithm.
  • each vehicle 2 in the team predicts future observations of its sensor 40 and its internal parameters using a mathematical model of the motion of the vehicle, and broadcasts this information to other vehicles 2 in the team.
  • each vehicle 2 has a model of its own motion and sensors. In this embodiment, this model is stored and updated by the processor 5 of the vehicle 2.
  • each vehicle 2 forward predicts the information gain given its decision vector u_i.
  • this information update is computed without knowledge of the actual values of the sensor observations, but rather only from the sensor model and the trajectory of the vehicle 2 relative to a target 8. This is predicted from the vehicle motion model and the current estimate of the position of the target held by the vehicle 2.
  • the predicted observations made by a vehicle 2 are in the form of an information matrix update I .
  • This information matrix update I is broadcast to the other vehicles 2 in the team.
  • This broadcasting of I is represented schematically in Figure 3 by the dotted arrow pointing from the box of step s2.
  • this broadcasting of I is performed because one vehicle 2 may not know the sensor model of another vehicle 2 in the team, and because the other vehicles 2 in the team need this information to compute the utilities of their actions (as described in more detail later below with reference to step s6 of Figure 3).
  • the prediction of observations made by a vehicle 2 occurs over a pre-set prediction horizon.
  • the pre-set prediction horizon is 15 seconds.
  • the prediction of observations made by a vehicle 2 (i.e. the information matrix update I) is then used to compute an objective function, as described in more detail later below at step s6.
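The observation-free information prediction described above can be sketched as follows, assuming a generic linearised sensor model H_k along the predicted trajectory; the matrices below are hypothetical:

```python
import numpy as np

def predicted_information_gain(H_seq, R, Y_prior):
    """Accumulate predicted information-matrix updates I_k = H_k^T R^-1 H_k
    over a prediction horizon. No actual observation values are needed:
    each update depends only on the (linearised) sensor model H_k, which
    itself depends on the predicted vehicle trajectory relative to the
    target, not on what the sensor actually measures."""
    Y = Y_prior.copy()
    R_inv = np.linalg.inv(R)
    for H in H_seq:
        Y = Y + H.T @ R_inv @ H  # information-filter update, observation-free
    return Y

# Hypothetical 2D target state; sensor Jacobians at three horizon steps.
Y0 = np.eye(2) * 0.1
Hs = [np.array([[0.6, -0.8]]),
      np.array([[0.5, -0.87]]),
      np.array([[0.4, -0.92]])]
Y_pred = predicted_information_gain(Hs, R=np.array([[0.01]]), Y_prior=Y0)
```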
  • a process of decentralised data fusion is performed by each vehicle to estimate a location of a target 8.
  • a location of a target 8 is estimated by each vehicle 2 using an Information Filter (IF), as described in "Data Fusion and Sensor Management: A Decentralized Information-Theoretic Approach", J. Manyika and H. Durrant-Whyte, Chichester: Ellis Horwood Limited, 1994 (which is incorporated herein by reference).
  • the states being tracked by the IF are the 3D Cartesian coordinates of the targets 8 and the velocities in the x- and y-axes.
  • each vehicle 2 broadcasts its information matrix (hereinafter denoted by Y ) and its information vector (hereinafter denoted by y ).
  • the other vehicles 2 receive these broadcasts (via their transceivers 6).
  • the receiving of the information matrix Y and the information vector y from each other vehicle 2 in the team is represented schematically in Figure 3 by the dotted arrow pointing to the box of step s4.
  • These received estimates are fused into the receiving vehicle's own estimates of a location of a target 8 using covariance intersection.
  • the process of covariance intersection is as described in "A non-divergent estimation algorithm in the presence of unknown correlations", S. Julier and J. Uhlmann, in Proc. American Control Conference, 1997, vol. 4, pp. 2369-2373 (which is incorporated herein by reference).
  • the process of covariance intersection utilises a vehicle's own estimate of the state of a target 8, as well as one or more estimates received from other vehicle(s).
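A minimal sketch of covariance intersection in information form, following Julier and Uhlmann; the grid search over the weighting parameter omega is one simple choice, not necessarily the optimisation used in the embodiment:

```python
import numpy as np

def covariance_intersection(Ya, ya, Yb, yb, n_grid=101):
    """Fuse two estimates in information form (Y = P^-1, y = P^-1 x)
    without knowing their cross-correlation. omega is chosen here by a
    simple grid search minimising the trace of the fused covariance."""
    best = None
    for w in np.linspace(0.0, 1.0, n_grid):
        Yc = w * Ya + (1.0 - w) * Yb
        if np.linalg.det(Yc) <= 0:
            continue  # skip non-invertible combinations
        cost = np.trace(np.linalg.inv(Yc))
        if best is None or cost < best[0]:
            best = (cost, w)
    w = best[1]
    return w * Ya + (1.0 - w) * Yb, w * ya + (1.0 - w) * yb

# Hypothetical estimates from two vehicles, each confident in a
# different axis of the target position.
Ya, ya = np.diag([4.0, 1.0]), np.array([2.0, 1.0])
Yb, yb = np.diag([1.0, 4.0]), np.array([1.0, 2.0])
Yc, yc = covariance_intersection(Ya, ya, Yb, yb)
```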
  • a vehicle 2 transmits its own estimate of the state of a target 8 to other vehicles.
  • These transmitted estimates, and one or more estimates received from other vehicle(s) are represented schematically in Figure 3 by a dotted two-headed arrow pointing to and from the box of step s4.
  • the target estimates received from the other vehicles are delineated from the vehicle's own sensor data (represented schematically by the solid line pointing to step s4).
  • the vehicle's sensor data are processed into information matrix updates (I) and information vector updates (i), and are fused using Information Filter equations.
  • the vehicle's updated estimates of a location of a target 8 are used to determine an objective function (described in more detail later below with reference to step s6).
  • an objective function, i.e. a function representative of the team objective
  • the Jacobian of the objective function, i.e. the matrix of all first-order partial derivatives of the objective function
  • n is the number of discrete decisions available to a vehicle 2
  • m is the number of continuous decisions available to a vehicle 2
  • p is the number of vehicles 2 in the team
  • l_ij is a limit for the jth continuous variable of the ith vehicle (i.e. a limit for c_ij)
  • λ1, λ2 and λ3 are penalty multipliers, as described in more detail later below.
  • the first term on the right-hand side of the equation encapsulates the sum of the linear combinations of indicator variables
  • the second term on the right-hand side of the equation is a quadratic penalty term that enforces the linear constraints on the indicator variables.
  • the third term on the right-hand side of the equation is a quartic penalty term that enforces the constraint that the indicator variables lie in the range [0, 1].
  • the fourth term on the right-hand side of the equation is a further quadratic penalty term that enforces the limits on the continuous decision variables.
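Since the equation itself is not reproduced in this excerpt, the following is one plausible realisation of the four terms described above; the exact functional forms, variable names, and signs are assumptions:

```python
import numpy as np

def penalised_objective(a, c, utility_coeffs, limits, lam1, lam2, lam3):
    """One plausible four-term objective: a linear utility in the relaxed
    indicator variables a, minus a quadratic penalty enforcing sum(a) = 1,
    a quartic penalty enforcing a in [0, 1], and a quadratic penalty
    enforcing the limits on the continuous decisions c."""
    a, c = np.asarray(a, float), np.asarray(c, float)
    utility = utility_coeffs @ a                       # linear in indicators
    eq_pen = lam1 * (a.sum() - 1.0) ** 2               # quadratic: sum(a) = 1
    box_pen = lam2 * np.sum(np.maximum(-a, 0) ** 4     # quartic: a in [0, 1]
                            + np.maximum(a - 1, 0) ** 4)
    lim_pen = lam3 * np.sum(np.maximum(np.abs(c) - limits, 0) ** 2)
    return utility - eq_pen - box_pen - lim_pen

# Feasible point: indicators sum to 1, steering within its limit.
J = penalised_objective(a=[0.7, 0.3], c=[0.2],
                        utility_coeffs=np.array([2.0, 1.0]),
                        limits=np.array([0.5]), lam1=1.0, lam2=1.0, lam3=1.0)
```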
  • each vehicle 2 determines the Jacobian of the objective function (i.e. partial derivative) with respect to the discrete component of its decision vector.
  • an analytical expression for the above Jacobian of the objective function with respect to a generic continuous decision variable is given by:
  • each vehicle 2 determines the Jacobian of the objective function (i.e. partial derivative) with respect to the continuous component of its decision vector.
  • the penalty method is in contrast to the methodology proposed in "Asynchronous gradient-based algorithms for team decision making and control", G. Mathews, H. Durrant-Whyte, and M. Prokopenko, IEEE Transactions on Robotics.
  • projection methods were used to observe constraints.
  • the penalty method requires the penalty multipliers λ1, λ2 and λ3 to be initialised to some positive value.
  • the objective function may then be optimised (as described later below with reference to step s8).
  • a heuristic approach is adopted for interpreting the relaxed indicator variables that artificially scales the noise of observations of a target when the indicator variable for a target is less than one.
  • the information matrix update is scaled by the corresponding indicator variable before being added to the inverse covariance matrix when computing the information gain, i.e.
  • the information matrix updates from predicted observations by other vehicles 2 in the team are obtained via communication between the vehicles 2.
  • the information matrix may be computed by:
  • I = Hᵀ R⁻¹ H
  • H is a sensor model
  • R represents the covariance of the sensor noise.
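A sketch of the I = Hᵀ R⁻¹ H computation for a bearings-only sensor; the linearisation of the bearing measurement is a standard textbook one and is assumed here rather than taken from the patent:

```python
import numpy as np

def bearing_information_update(vehicle_xy, target_xy, sigma_bearing):
    """Information-matrix update I = H^T R^-1 H for a bearings-only sensor.
    H is the Jacobian of the bearing measurement atan2(dy, dx) with respect
    to the 2D target position; R is the scalar bearing-noise covariance."""
    dx = target_xy[0] - vehicle_xy[0]
    dy = target_xy[1] - vehicle_xy[1]
    r2 = dx * dx + dy * dy
    H = np.array([[-dy / r2, dx / r2]])    # 1x2 measurement Jacobian
    R = np.array([[sigma_bearing ** 2]])   # bearing-noise covariance
    return H.T @ np.linalg.inv(R) @ H

# Target 10 m along the x-axis: a bearing measurement constrains the
# target's y-coordinate but carries no information about its range (x).
I = bearing_information_update((0.0, 0.0), (10.0, 0.0), sigma_bearing=0.05)
```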
  • the scaling of the information matrix update(s) associated with each vehicle's own predicted observations is removed.
  • these scalings are only applied to the information matrix updates received via communication, i.e. the utility as computed by the ith vehicle 2 is:
  • each vehicle 2 in the team determines the Jacobian of the objective function with respect to its own decision vector.
  • each vehicle 2 in the team updates its own decision vector u_i using the Jacobian of the objective function with respect to its own decision vector.
  • the optimisation of the objective function is performed in a distributed manner in accordance with the method proposed in "Asynchronous gradient-based algorithms for team decision making and control".
  • each vehicle 2 broadcasts updates to its own decision vector to each other vehicle 2 in the team.
  • This information broadcast is represented schematically in Figure 3 by a dotted arrow pointing from the box of step s8.
  • the other vehicles 2 in the team receive these updates and use the information to estimate the coupling between the decisions of the vehicles 2.
  • This information receiving is represented schematically in Figure 3 by a dotted arrow pointing to the box of step s8.
  • This coupling is used, along with communications delays, to compute bounds on the update step in the gradient descent optimisation to guarantee convergence.
  • the direction of steepest descent, i.e. the negative of the Jacobian, is used as the update direction.
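The update step can be sketched as a plain gradient descent on the objective; the fixed step size below stands in for the convergence-guaranteeing bound computed from the inter-vehicle coupling and communication delays, which is not reproduced here:

```python
import numpy as np

def gradient_step(u_i, jacobian, gamma):
    """Single decentralised update: move the vehicle's own decision vector
    in the direction of steepest descent (negative Jacobian)."""
    return u_i - gamma * jacobian

def optimise(u0, grad_fn, gamma=0.1, tol=1e-6, max_iter=20):
    """Iterate until the step is small (convergence) or a fixed iteration
    budget (e.g. twenty, as in the embodiment) is exhausted."""
    u = np.asarray(u0, float)
    for _ in range(max_iter):
        u_next = gradient_step(u, grad_fn(u), gamma)
        if np.linalg.norm(u_next - u) < tol:
            return u_next
        u = u_next
    return u

# Toy quadratic cost with minimum at u = (1, -2); gradient is 2(u - u*).
u_star = optimise([0.0, 0.0], lambda u: 2 * (u - np.array([1.0, -2.0])),
                  gamma=0.25, max_iter=200)
```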
  • at step s10, after each update of its decision vector by a vehicle 2, a check is made to determine whether the optimisation process for determining a vehicle's decision vector has converged. If convergence has not been reached, steps s2 to s10 are iterated (as represented schematically in Figure 3 by a dashed arrow from the box of step s10 to the box of step s2).
  • if convergence is not reached after a fixed number of iterations, e.g. twenty iterations, the optimisation process is terminated and the process proceeds to step s12.
  • if convergence is reached at step s10, the process also proceeds to step s12.
  • at step s12, once the optimisation has converged (or has been terminated), the penalty multipliers λ1, λ2 and λ3 are reduced and the optimisation is re-run with the initial estimate set to the output of the previous optimisation step.
  • This re-running of the optimisation is represented schematically in Figure 3 by a solid line from the box of step s12 to the box of step s6.
  • the penalty multipliers λ1, λ2 and λ3 are reduced on a fixed schedule during the decision time. This advantageously tends to ensure that the same optimisation problem is being solved across the entire team of vehicles.
  • this reduction of the penalty multipliers according to a fixed schedule is advantageously facilitated by providing that clocks on each vehicle 2 in the team are synchronised. This synchronisation may be provided by, for example, mechanisms such as Network Time Protocol or Global Positioning System time.
  • the penalty multipliers are reduced geometrically with a common ratio.
  • the common ratio is preferably between 0.1 and 0.7, as suggested in "Numerical Optimization", J. Nocedal and S. J. Wright, Springer Series in Operations Research, New York: Springer-Verlag, 1999.
  • the penalty multipliers λ1, λ2 and λ3 are initialised to 1. These values may be selected depending on the application.
  • the penalty multipliers λ1, λ2 and λ3 are reduced on a schedule using clocks synchronised across the vehicles, i.e. in this embodiment, each vehicle comprises such a clock to facilitate this reduction of the penalty multipliers (on a schedule).
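The geometric reduction schedule can be sketched as follows; the ratio and threshold values match this embodiment, while the function name is illustrative:

```python
def penalty_schedule(lam0=1.0, ratio=0.5, threshold=0.05):
    """Geometric reduction of a penalty multiplier with a common ratio,
    run on a synchronised clock so that every vehicle is solving the
    same optimisation problem at any given time. A ratio in [0.1, 0.7]
    is the range suggested by Nocedal and Wright."""
    lam = lam0
    schedule = [lam]
    while lam >= threshold:
        lam *= ratio
        schedule.append(lam)
    return schedule

sched = penalty_schedule()
# 1.0, 0.5, 0.25, 0.125, 0.0625, 0.03125 -- stops once below 0.05
```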
  • a check is made to determine whether the penalty multipliers have been reduced to below a threshold. In this embodiment, this threshold is 0.05.
  • if the penalty multipliers have not been reduced to below the threshold at step s14, steps s2 to s14 are iterated (as represented schematically in Figure 3 by a dashed arrow from the box of step s14 to the box of step s2).
  • if the penalty multipliers have been reduced to below the threshold at step s14, the process proceeds to step s16.
  • at step s16, the discrete decision variables and the continuous variables in the decision vectors of the vehicles 2 in the team are resolved. In other words, values for the variables of the vehicles' decision vectors are determined.
  • the continuous decision variables are taken directly from the output of the optimisation.
  • the discrete decision represented by the indicator variable with the highest value is selected.
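Resolving the relaxed decision vector can be sketched as follows (function and variable names are illustrative):

```python
import numpy as np

def resolve_decisions(a_relaxed, c_optimised):
    """Resolve the relaxed decision vector: the continuous decisions are
    taken directly from the optimiser output, while the discrete decision
    is the index of the indicator variable with the highest value."""
    discrete_choice = int(np.argmax(a_relaxed))
    return discrete_choice, list(c_optimised)

# Relaxed indicators lean toward the third target (index 2); the
# steering angle is taken as-is from the optimisation.
target_idx, controls = resolve_decisions([0.1, 0.2, 0.7], [0.35])
```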
  • each vehicle 2 performs the action corresponding to the determined values of its decision vector.
  • each vehicle enacts the decision corresponding to the determined values of its decision vector.
  • Receding horizon control advantageously tends to account for impediments such as imperfect sensor and motion models, the limited prediction horizon, and the dynamic nature of the environment (i.e. the distributions over the positions of the targets are constantly changing).
  • decentralised optimisation process is provided.
  • An advantage provided by the above described decentralised optimisation process is that it tends to out-perform conventional approaches to decentralised decision making.
  • the above described decentralised optimisation process tends to out-perform processes that implement an "Implicit Coordination" approach.
  • the vehicles share observations of the targets, but do not explicitly coordinate their future actions by communicating their decisions or the predicted observations that result from those decisions.
  • each vehicle makes a locally greedy decision, with the coordination arising due to the fact that the robots are making their decisions on the same belief of the state of the targets.
  • the above described decentralised optimisation process tends to out-perform process that implement a "Best Response" approach.
  • a best response algorithm has each vehicle optimise its decisions, both discrete and continuous, based on the previously communicated predicted observations of the other vehicles in the team. Once the vehicle has determined its "best response" to the predictions of the other vehicles, it broadcasts predicted observations that result from the best response decision. The cycle is then repeated. The iteration repeats for a fixed number of times.
  • the above described decentralised optimisation process tends to out-perform process that implement a "Mutually Exclusive Assignments" approach.
  • This approach decouples the discrete and continuous components of the decision and simplifies the assignment problem by assuming mutually exclusive vehicle-target assignments.
  • the continuous controls are optimised for each target, and the resulting utilities, ignoring observations from other vehicles, are computed.
  • the Kuhn-Munkres algorithm is then typically used to compute the optimal mutually-exclusive assignments.
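For illustration, the mutually exclusive assignment objective can be solved exhaustively for a small team; the Kuhn-Munkres (Hungarian) algorithm computes the same optimum in polynomial time, and the utility values below are hypothetical:

```python
from itertools import permutations

def best_mutually_exclusive_assignment(utility):
    """Optimal mutually exclusive vehicle-to-target assignment by brute
    force over permutations; utility[i][j] is vehicle i's utility for
    target j. This exhaustive sketch only illustrates the objective that
    Kuhn-Munkres maximises efficiently."""
    n = len(utility)
    best_perm, best_val = None, float("-inf")
    for perm in permutations(range(n)):
        val = sum(utility[i][perm[i]] for i in range(n))
        if val > best_val:
            best_perm, best_val = perm, val
    return best_perm, best_val

# Hypothetical 3-vehicle, 3-target utilities.
U = [[4.0, 1.0, 3.0],
     [2.0, 0.0, 5.0],
     [3.0, 2.0, 2.0]]
assignment, total = best_mutually_exclusive_assignment(U)
```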
  • a further advantage is that making decisions that involve both discrete and continuous parameters tends to be advantageously facilitated. Furthermore, those decisions are made in a decentralised manner. This tends to be in contrast to conventional techniques such as those that can only operate over continuous variables. Such techniques tend not to be able to use discrete components.
  • a further example of a conventional technique is a process in which the continuous component of the decision space is discretised. Decisions are then made in a completely discrete space. However, this tends to result in a large decision vector. Also, discrete decision making algorithms tend to scale poorly with the dimension of the decision (for example, the max-sum algorithm). Furthermore, approximations due to the discretisation tend to have to be made. The above described decentralised optimisation process advantageously tends to avoid these disadvantages.
  • a further advantage is that the assigning of multiple vehicles to multiple targets in a manner that couples the vehicle motion control to the vehicle-to-target assignments tends to be facilitated. Moreover, non-mutually-exclusive vehicle-target assignments tend to be allowed.
  • a further advantage of the above described decentralised optimisation process is that team performance tends not to be impeded by the vehicles "racing" each other towards a goal. This tends to be in contrast to implicit coordination methods.
  • a further advantage is that the above described method tends to perform well compared to other decentralised coordination techniques.
  • Apparatus including the processors 5 of the vehicles 2, for implementing the above arrangement, and performing the method steps to be described later below, may be provided by configuring or adapting any suitable apparatus, for example one or more computers or other processing apparatus or processors, and/or providing additional modules.
  • the apparatus may comprise a computer, a network of computers, or one or more processors, for implementing instructions and using data, including instructions and data in the form of a computer program or plurality of computer programs stored in or on a machine readable storage medium such as computer memory, a computer disk, ROM, PROM, etc., or any combination of these or other storage media.
  • the vehicles are autonomous, unmanned land-based vehicles.
  • one or more of the vehicles may be a different appropriate type of vehicle.
  • one or more of the vehicles may be semi-autonomous or manned.
  • one or more of the vehicles may be an aircraft (e.g. a UAV), or a water-based vehicle (e.g. a ship).
  • one or more of the vehicles may be replaced with a different type of entity, for example, a stationary entity (e.g. a building) upon which a moveable sensor may be mounted.
  • each vehicle comprises a single sensor which detects a target.
  • a vehicle may comprise any number of sensors.
  • one or more of the vehicles may have a different number of sensors, or one or more different types of sensor, compared to the other vehicles in the team.
  • the sensors are bearings-only cameras, each of which has a limited field of view.
  • one or more of the sensors may be a different type of sensor, for example an infrared camera.
  • a sensor is mounted on a pan-tilt mount such that it can be pointed in different directions for a given vehicle facing.
  • one or more of the sensors may be mounted to a vehicle in a different way, e.g. such that a sensor is not moveable with respect to the vehicle.
  • one or more of the sensors may have a field of view that is limited to a different degree.
  • a sensor may not have a limited field of view (i.e. it has a 360° field of view).
  • one or more sensors may be able to detect more than one target at a time.
  • each navigation module may measure any appropriate parameters of the vehicle to which it belongs.
  • the team of vehicles comprises three vehicles which are tasked to localise a plurality of targets.
  • the team may have a different number of vehicles which may be tasked to localise any number of targets.
  • prior knowledge about the location of the targets exists, and the team of vehicles is tasked to refine that prior knowledge.
  • a method of recursive estimation is not used; thus, prior knowledge about the target locations is not used.
  • any prior knowledge may have a known level of uncertainty. In other embodiments, this uncertainty is not known.
  • the team of vehicles may be assigned a different task. For example, in other embodiments the team of vehicles is tasked to evaluate and/or verify the prior knowledge.
  • the vehicles address the problems of determining which direction they should move in, and which direction their sensor should be pointed in.
  • a vehicle may be required to make different decisions instead of or in addition to those problems.
  • the discrete and continuous spaces for each vehicle are as described above. However, in other embodiments those spaces may differ from those described above, provided that they describe the decisions that are to be made by a vehicle.
  • the discrete decisions made by a vehicle are binary. However, in other embodiments one or more discrete decision variables may have a different number of possible discrete values. In such embodiments, a discrete decision variable that has a different number of possible discrete values may be formulated as a series of binary indicator variables.
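The reformulation above can be sketched in code. The following is an illustrative example (the variable names and the set of headings are assumptions, not taken from the patent) of encoding a discrete decision with more than two possible values as a series of binary indicator variables, as is standard in MILP formulations:

```python
# Illustrative sketch: a discrete decision variable with four possible
# values, encoded as four binary indicator variables with an implicit
# "exactly one" constraint (which a MILP solver would enforce as a
# linear equality). HEADINGS is a hypothetical example choice set.

HEADINGS = ["north", "east", "south", "west"]

def encode_one_hot(choice: str) -> list[int]:
    """Represent one discrete decision as a vector of binary indicators."""
    indicators = [1 if h == choice else 0 for h in HEADINGS]
    assert sum(indicators) == 1  # exactly-one constraint
    return indicators

def decode_one_hot(indicators: list[int]) -> str:
    """Recover the discrete value from its binary indicator vector."""
    return HEADINGS[indicators.index(1)]
```

In a MILP, the exactly-one constraint would appear as the linear equality that the indicators sum to 1, letting the solver treat one multi-valued decision as several binary ones.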
  • each vehicle has a model of its own motion and sensors. This allows a prediction of the information gain given the vehicle's decision vector to be made. This can be computed without knowledge of the actual observations, but rather from only the sensor model and the trajectory of the vehicle relative to a target (which can be predicted using a motion model and the current estimate of the target).
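The key point above — that the predicted information gain does not depend on the actual observation — can be illustrated for a bearings-only sensor. The sketch below is an assumed example (the function name and noise value are illustrative, not from the patent): the Fisher information contribution of a bearing measurement depends only on the sensor noise model and the vehicle-target geometry, so it can be computed before any observation is made.

```python
import numpy as np

# Illustrative sketch: predicted information-matrix contribution of one
# bearings-only observation. The update H^T R^-1 H depends only on the
# relative geometry (via the Jacobian H) and the noise variance R, not
# on the measured bearing itself.

def predicted_info_update(vehicle_xy, target_xy, bearing_var=0.01):
    """2x2 information update H^T R^-1 H for one bearing observation."""
    dx = target_xy[0] - vehicle_xy[0]
    dy = target_xy[1] - vehicle_xy[1]
    r2 = dx**2 + dy**2
    # Jacobian of bearing = atan2(dy, dx) with respect to target (x, y)
    H = np.array([[-dy / r2, dx / r2]])
    return H.T @ H / bearing_var
```

Note that the update is rank one: a single bearing constrains the target only in the direction perpendicular to the line of sight, which is why multiple vantage points are needed to localise a target.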
  • the predicted observations made by a vehicle are in the form of an information matrix update I.
  • This information matrix update I is broadcast to the other vehicles in the team. This is done because one vehicle may not know the sensor model of another, and because the other vehicles use this update to compute the utilities of their actions.
  • information matrix updates I and/or the decision vector are communicated in a different way.
  • the information matrix update I and the decision vector are communicated on a tree network.
  • each sensor k could communicate to its parent the summation of the I matrices representing observations by itself and its children, the set of which is denoted by J_k, each weighted by its associated indicator variable, as per the following equation: Σ_{j ∈ J_k} a_{k,j} I(x, c_{k,j})
  • Updates to the decision vector could also be passed on this tree network. This means two messages are passed on each link of the tree network per iteration of the decentralised optimisation.
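The upward pass described above can be sketched as follows. This is an assumed illustration (the class and field names are hypothetical, and the per-node weights and matrices are generic): each node sends its parent the sum of its own indicator-weighted information update and the sums received from its children, so the root accumulates the weighted summation over the whole tree with one upward message per link.

```python
import numpy as np

# Illustrative sketch of the tree-network summation: each node's message
# to its parent is a * I for itself plus the messages from its children,
# so the root receives sum_j a_j * I_j over the entire tree.

class SensorNode:
    def __init__(self, a, I, children=()):
        self.a = a              # binary indicator weighting this node's update
        self.I = np.asarray(I)  # this node's predicted information update
        self.children = list(children)

    def message_to_parent(self):
        total = self.a * self.I
        for child in self.children:
            total = total + child.message_to_parent()
        return total
```

Because each message is a single matrix regardless of subtree size, the per-link communication cost stays constant as the team grows, which is the scalability benefit of the tree arrangement.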

Landscapes

  • Engineering & Computer Science (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Traffic Control Systems (AREA)

Abstract

A system and method are disclosed for operating a plurality of vehicles that are to perform a common task. The method comprises: for each vehicle (2), defining a decision vector by defining at least one continuous variable, defining at least one discrete variable, and forming the values of each of the defined variables into a vector; and updating the values of the variables in the decision vector of each vehicle (2) by performing a decentralised optimisation process. Each variable corresponds to a decision to be made by the corresponding vehicle (2). The decentralised optimisation process comprises determining a function representative of the common task, the function being a function of the continuous variable(s) and of the discrete variable(s), wherein, within the function, the discrete variables are forced to be continuous over a range of values.
PCT/AU2011/001175 2010-09-13 2011-09-13 Commande décentralisée WO2012034169A1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
AU2010904113A AU2010904113A0 (en) 2010-09-13 Decentralised Control
AU2010904113 2010-09-13
AU2011900384A AU2011900384A0 (en) 2011-02-07 Decentralised Control
AU2011900384 2011-02-07

Publications (1)

Publication Number Publication Date
WO2012034169A1 true WO2012034169A1 (fr) 2012-03-22

Family

ID=45830861

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/AU2011/001175 WO2012034169A1 (fr) 2010-09-13 2011-09-13 Commande décentralisée

Country Status (1)

Country Link
WO (1) WO2012034169A1 (fr)


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6377878B1 (en) * 1999-06-24 2002-04-23 Sandia Corporation Convergent method of and apparatus for distributed control of robotic systems using fuzzy logic
US20060235584A1 (en) * 2005-04-14 2006-10-19 Honeywell International Inc. Decentralized maneuver control in heterogeneous autonomous vehicle networks
US20080141220A1 (en) * 2004-05-12 2008-06-12 Korea Institute Of Industrial Technology Robot Control Software Framework in Open Distributed Process Architecture

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014027247A3 (fr) * 2012-08-17 2014-04-10 King Abdullah University Of Science And Technology Système et procédé pour surveiller le trafic tout en préservant la confidentialité personnelle
US20150221217A1 (en) * 2012-08-17 2015-08-06 King Abdullah University Of Science And Technology Methylation biomarkers for breast cancer
US11055988B2 (en) * 2012-08-17 2021-07-06 King Abdullah Univercity Of Science And Technology System and method for monitoring traffic while preserving personal privacy
WO2019215003A1 (fr) * 2018-05-07 2019-11-14 Audi Ag Procédé d'apprentissage d'algorithmes de commande à auto-apprentissage pour dispositifs mobiles autonomes et dispositif mobile autonome

Similar Documents

Publication Publication Date Title
Robinson et al. An efficient algorithm for optimal trajectory generation for heterogeneous multi-agent systems in non-convex environments
Allen et al. A real-time framework for kinodynamic planning with application to quadrotor obstacle avoidance
Rigatos Modelling and control for intelligent industrial systems
Pasqualetti et al. Cooperative patrolling via weighted tours: Performance analysis and distributed algorithms
US6993397B2 (en) System and method for implementing real-time applications based on stochastic compute time algorithms
Han et al. Formation tracking control for time-delayed multi-agent systems with second-order dynamics
Grocholsky et al. Scalable control of decentralised sensor platforms
Hu et al. Robust formation coordination of robot swarms with nonlinear dynamics and unknown disturbances: Design and experiments
Gan et al. Online decentralized information gathering with spatial–temporal constraints
Xu et al. Decentralized coordinated tracking with mixed discrete–continuous decisions
Neto et al. Multi-agent rapidly-exploring pseudo-random tree
Yao et al. Null-space-based modulated reference trajectory generator for multi-robots formation in obstacle environment
Rodrigues et al. Leader-following graph-based distributed formation control
Hung et al. Image-based multi-uav tracking system in a cluttered environment
Kulathunga et al. Trajectory tracking for quadrotors: An optimization‐based planning followed by controlling approach
Fakoorian et al. Rose: Robust state estimation via online covariance adaption
Wesselowski et al. A dual-mode model predictive controller for robot formations
CN111176324B (zh) Method for distributed cooperative formation of multiple unmanned aerial vehicles avoiding dynamic obstacles
WO2012034169A1 (fr) Commande décentralisée
Hurtado et al. Decentralized control for a swarm of vehicles performing source localization
Zhou et al. Time varying control set design for uav collision avoidance using reachable tubes
Xu et al. Decentralised coordination of mobile robots for target tracking with learnt utility models
Gomes et al. Model predictive control for autonomous underwater vehicles
Liu et al. Towards collaborative mapping and exploration using multiple micro aerial robots
Jafari et al. A game theoretic based biologically-inspired distributed intelligent Flocking control for multi-UAV systems with network imperfections

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11824348

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 11824348

Country of ref document: EP

Kind code of ref document: A1