CN114154383B

CN114154383B - Multi-robot source search scheme generation method and system based on cognitive search strategy

Info

Publication number: CN114154383B
Application number: CN202111457545.1A
Authority: CN
Inventors: 陈彬; 季雅泰; 吕欣; 赵勇; 刘忠; 王锐; 王昊冉; 何华; 肖军浩; 卢惠民
Original assignee: National University of Defense Technology
Current assignee: National University of Defense Technology
Priority date: 2021-12-01
Filing date: 2021-12-01
Publication date: 2024-07-12
Anticipated expiration: 2041-12-01
Also published as: CN114154383A

Abstract

The invention discloses a method and a system for generating a multi-robot source search scheme based on a cognitive search strategy. The method comprises: establishing a gas leakage model and a detection model of a sensor; modeling the leakage source search process based on the gas leakage model and the detection model of the sensor; representing the iterative update of the source term parameter estimate during the search with a particle filtering method, and establishing a source search action scheme based on a cognitive search strategy; improving the obstacle avoidance capability of the source search action scheme for actual obstacle scenes; and combining the improved source search action scheme with multi-robot control technology to generate a multi-robot source search action scheme based on the cognitive search strategy. The invention enables multiple robots to search for a gas leakage source while accounting for the uncertainty in the search process; a multi-robot cooperative algorithm controls the robots to search collaboratively in scenes of large spatial scale, which can effectively improve search efficiency and accuracy.

Description

Multi-robot source search scheme generation method and system based on cognitive search strategy

Technical Field

The invention relates to the field of robot control, in particular to a method and a system for generating a multi-robot source search scheme based on a cognitive search strategy, which can be applied to the requirements of various gas source search scenes including chemical industry park hazard chemical leakage accident source searching, residential building gas leakage source searching and the like.

Background

Leakage accidents involving dangerous gases can cause serious casualties and economic losses. To protect personnel and avoid property loss, quickly locating the leakage source and obtaining the source term parameters is an important task. Autonomous source searching with a mobile device equipped with sensors, such as a robot, is an effective approach: the robot moves step by step toward the leakage source according to the information collected by its sensors until the source is found. How to generate a specific action scheme to control the robot's movement in this process is a pressing open problem. For sensing and searching for gas leakage sources, four main classes of methods can support the generation of action schemes: gradient-based algorithms, bionic algorithms, probability- and map-based algorithms, and information-theory-based algorithms.

Algorithms based purely on the gas concentration gradient are the most basic approach in this field. They use the concentration gradient to guide the robot to the leakage source, but in a real environment the gas distribution during diffusion is strongly affected by factors such as turbulence, so the method is difficult to apply in real scenes. Bionic algorithms plan paths by imitating how animals find odor sources; they also require a stable concentration gradient and are mostly used in small-scale source search experiments. Probability- and map-based algorithms model the position of the odor source as a probability distribution; through continued observation the distribution converges to a Dirac delta function, which determines the source position. Such algorithms involve complex mathematical calculations in the later stages, but they are widely studied because of their good results, for example searching for a gas diffusion source in a room with a particle filtering method.

To solve the source localization problem in turbulent environments, cognitive search strategies based on information theory have been widely studied and applied. Such a strategy locates the odor source through probability estimation while making sequential decisions under uncertainty with a reward function based on the information state. Specifically, the source search process can be described as a Partially Observable Markov Decision Process (POMDP) consisting of three elements: the information state, the action set, and the reward function. Vergassola et al. proposed the earliest cognitive search strategy, the Infotaxis algorithm, which maintains the information state with a grid-based approach. Researchers later replaced the grid-based method in Infotaxis with a sequential Monte Carlo framework based on particle filtering, so that the algorithm can estimate both the source intensity and the source location. Different cognitive search strategies may use different reward functions. The original Infotaxis algorithm designed a reward function with two terms to trade off exploration against exploitation. Another cognitive search algorithm, Entrotaxis, was proposed by Hutchinson et al.; its reward function is based on the principle of maximum entropy sampling and measures the uncertainty of the expected future detection events rather than the uncertainty of the source term information considered in Infotaxis.

One feature of gas leakage source search scenarios is their large spatial scale. In such a scenario, searching for the diffusion source with a single robot may take too long, or, because the robot is far from the source, effective iterative calculations cannot be performed and the source is not found. In a multi-robot system, multiple robots complete tasks through coordinated actions, so many complex tasks that are difficult for a single robot can be completed efficiently. The problem of multi-robot cooperative control has attracted considerable attention. Soares et al. proposed a graph-based formation control algorithm that can organize robots into arbitrary, evolving shapes to trace gas diffusion sources. The odor source localization algorithm proposed by Jatmiko, based on particle swarm optimization, combines chemotaxis and wind direction methods in an improved particle swarm optimization algorithm; it can locate the source in environments with obstacles and addresses the dynamic advection-diffusion problem. In addition, Hajieghrary and Ani proposed an "information rental" strategy for multi-robot collaboration to cope with large-scale robot teams. However, given the uncertainty in the source search process, how to control multiple robots to search collaboratively in a scene of large spatial scale, so as to improve search efficiency and accuracy, still leaves wide room for research.

Disclosure of Invention

The invention aims to solve the technical problems: aiming at the problems in the prior art, the invention provides a method and a system for generating a multi-robot source searching scheme based on a cognitive searching strategy, which can realize the gas leakage source searching of multiple robots, consider the uncertainty existing in the source searching process, utilize a multi-robot cooperative algorithm to control the multiple robots to perform cooperative source searching in a scene with large space scale, can effectively improve the source searching efficiency and the source searching accuracy, and are suitable for the requirements of searching various gas source searching scenes including the source searching of dangerous chemical leakage accidents in chemical industry parks, searching gas leakage sources in residential buildings and the like.

In order to solve the technical problems, the invention adopts the following technical scheme:

a multi-robot source search scheme generation method based on a cognitive search strategy comprises the following steps:

1) Establishing a gas leakage model and a detection model of a sensor;

2) Modeling a leakage source search process based on the gas leakage model and the detection model of the sensor;

3) Representing the iterative update of the source term parameter estimate during the leakage source search with a particle filtering method, and establishing a source search action scheme based on a cognitive search strategy;

4) Improving the obstacle avoidance capability of the source search action scheme for actual obstacle scenes;

5) Combining the improved source search action scheme with multi-robot control technology to generate a multi-robot source search action scheme based on the cognitive search strategy.

Optionally, the functional expression of the gas leakage model established in step 1) is:

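$$0 = V\frac{\partial c(r|\theta_0)}{\partial y} + D\,\Delta c(r|\theta_0) - \frac{1}{\tau}\,c(r|\theta_0) + Q\,\delta(r - r_0)$$

In the above formula, $V$ is the average wind speed, $\partial/\partial y$ is the gradient in the y-axis direction, $c(r|\theta_0)$ is the gas concentration at position $r = \{x, y\}$, $\theta_0 = \{r_0, Q\}$ is the source term parameter, $r_0 = \{x_0, y_0\}$ is the position of the leakage source, $Q$ is the diffusion intensity, $D$ is the effective diffusion coefficient of the gas, $\Delta c(r|\theta_0)$ is the Laplacian of the concentration, $\tau$ is the lifetime of the gas molecules, and $\delta$ is the Dirac delta function.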

Optionally, the functional expression of the detection model of the sensor established in step 1) is:

$$P(d(r)|\theta_0) = \frac{R(r|\theta_0)^{d}}{d!}\,e^{-R(r|\theta_0)}$$

In the above formula, $P(d(r)|\theta_0)$ is the probability that the sensor at position $r = \{x, y\}$ contacts gas molecules $d$ times per unit time, $d$ is the number of contacts between the sensor and gas molecules per unit time at position $r$, and $R(r|\theta_0)$ is the average number of contacts between the sensor and gas molecules per unit time, where:

R(r|θ₀)＝4πDac(r|θ₀)

In the above formula, $D$ is the effective diffusion coefficient of the gas, $a$ is the radius of the spherical sensor, and $c(r|\theta_0)$ is the gas concentration at position $r = \{x, y\}$, where:

$$c(r|\theta_0) = \frac{Q}{4\pi D\,|r - r_0|}\exp\!\left[-\frac{(y - y_0)V}{2D}\right]\exp\!\left[-\frac{|r - r_0|}{\lambda}\right]$$

In the above formula, $c(r|\theta_0)$ is the gas concentration at position $r = \{x, y\}$, $\theta_0 = \{r_0, Q\}$ is the source term parameter, $r_0 = \{x_0, y_0\}$ is the position of the leakage source, $Q$ is the diffusion intensity, $V$ is the average wind speed, and the intermediate variable $\lambda$ is given by:

$$\lambda = \sqrt{\frac{D\tau}{1 + \dfrac{V^{2}\tau}{4D}}}$$
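The detection model can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: it assumes the standard closed-form plume solution $c = \frac{Q}{4\pi D|r-r_0|}e^{-(y-y_0)V/(2D)}e^{-|r-r_0|/\lambda}$ with $\lambda = \sqrt{D\tau/(1+V^2\tau/(4D))}$ and a Poisson count model with rate $R = 4\pi D a c$. The function names and the parameter values in the usage note are hypothetical.

```python
import math


def decay_length(D, tau, V):
    """Intermediate variable lambda = sqrt(D*tau / (1 + V^2*tau / (4*D)))."""
    return math.sqrt(D * tau / (1.0 + V * V * tau / (4.0 * D)))


def concentration(r, r0, Q, D, tau, V):
    """Mean concentration c(r | theta0); the wind is assumed to blow along -y."""
    dist = math.hypot(r[0] - r0[0], r[1] - r0[1])
    if dist == 0.0:
        raise ValueError("the closed form is singular at the source position")
    lam = decay_length(D, tau, V)
    return (Q / (4.0 * math.pi * D * dist)
            * math.exp(-(r[1] - r0[1]) * V / (2.0 * D))
            * math.exp(-dist / lam))


def hit_rate(r, r0, Q, D, tau, V, a):
    """Average number of sensor/molecule contacts per unit time, R = 4*pi*D*a*c."""
    return 4.0 * math.pi * D * a * concentration(r, r0, Q, D, tau, V)


def detection_prob(d, rate):
    """Poisson probability of exactly d contacts per unit time for a given rate."""
    return math.exp(-rate) * rate ** d / math.factorial(d)
```

For example, `hit_rate((0.0, -5.0), (0.0, 0.0), 1.0, 1.0, 250.0, 1.0, 1.0)` evaluates the rate five length units downwind of a source at the origin; a point at the same distance upwind gives a much smaller rate, which is what lets the readings carry directional information.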

Optionally, step 2) includes: representing the source term parameter to be estimated, $\theta_0 = \{r_0, Q\}$, by a probability density function, and taking the posterior probability density function $P(\theta_k|D_k)$ after any k-th step as the information state obtained from all the information collected in the first k steps, where $\theta_k$ is the source term parameter estimated at step k, $D_k = \{d_1(r_1), d_2(r_2), \ldots, d_k(r_k)\}$ denotes all the information collected up to step k, $r_k = \{x_k, y_k\}$ is the position at step k, and $d_1(r_1)$ to $d_k(r_k)$ are the sensor readings obtained at steps 1 to k respectively; the initial probability density function $P(\theta_0)$ is preset from prior knowledge, and the posterior probability density function $P(\theta_k|D_k)$ of any k-th step is updated with Bayes' formula as follows:

$$P(\theta_k|D_k) = \frac{P(\theta_k|D_{k-1})\,P(d_k(r_k)|\theta_k)}{P(d_k(r_k)|D_{k-1})}$$

In the above formula, $P(\theta_k|D_{k-1})$ is the posterior probability density of step k−1, $P(d_k(r_k)|\theta_k)$ is the density weight (the likelihood of the reading), and $P(d_k(r_k)|D_{k-1})$ is the normalization factor. Under four-connectivity the optional action set is $U = \{\uparrow, \downarrow, \leftarrow, \rightarrow\}$, whose four elements represent moves up, down, left and right respectively. At each step the source-seeking robot uses a reward function to calculate the information gain $I(u)$ obtainable at each candidate position in the optional action set $U$ and selects the position $u^{*}$ with the largest information gain as the next position, which gives the model of the leakage source search process:

$$u^{*} = \arg\max_{u \in U} I(u)$$

In the above formula, $u^{*}$ is the selected next position, $u$ is the candidate position corresponding to an element of the optional action set $U$, $I(u)$ is the information gain obtainable at the candidate position, and $\arg\max$ selects the position at which the information gain $I(u)$ is largest.
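The selection rule "evaluate a reward at each of the four neighbouring positions and move to the best one" can be sketched as follows. The patent does not spell out the reward at this point, so this sketch assumes an Entrotaxis-style reward: the entropy of the predicted reading distribution at the candidate position, computed from a weighted set of source hypotheses. `rate_fn`, `MOVES`, `d_max` and all other names are hypothetical.

```python
import math

# Four-connected action set: name -> unit displacement (dx, dy).
MOVES = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}


def poisson_pmf(d, rate):
    """Poisson probability of d contacts for a given mean rate."""
    return math.exp(-rate) * rate ** d / math.factorial(d)


def predictive_entropy(rates, weights, d_max=20):
    """Entropy of the predicted reading at one candidate position.

    rates[i] is the mean contact rate implied by hypothesis i; readings
    above d_max are ignored (an assumed truncation for this sketch).
    """
    h = 0.0
    for d in range(d_max + 1):
        p = sum(w * poisson_pmf(d, r) for w, r in zip(weights, rates))
        if p > 0.0:
            h -= p * math.log(p)
    return h


def select_action(position, step, hypotheses, weights, rate_fn):
    """Return (move name, next position) maximising the assumed reward I(u)."""
    best = None
    for name, (dx, dy) in MOVES.items():
        cand = (position[0] + dx * step, position[1] + dy * step)
        rates = [rate_fn(cand, theta) for theta in hypotheses]
        gain = predictive_entropy(rates, weights)
        if best is None or gain > best[0]:
            best = (gain, name, cand)
    return best[1], best[2]
```

A candidate position where all hypotheses predict the same reading has low reward under this choice, while a position where the hypotheses disagree has high reward, so the robot is drawn to measurements that discriminate between source estimates.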

Optionally, step 3) includes: sampling the posterior probability density function $P(\theta_k|D_k)$ of any k-th step to produce N weighted random samples $\{\theta_k^{(i)}, w_k^{(i)}\}_{i=1}^{N}$, where $\theta_k^{(i)}$ is the estimate of the source term parameter by the i-th particle at step k, $w_k^{(i)}$ is the weight of particle $\theta_k^{(i)}$, and the N weights $w_k^{(i)}$ sum to 1; using the N weighted random samples as the N particles of the particle filtering method, the posterior probability density function $P(\theta_k|D_k)$ of the k-th step is approximately represented as:

$$P(\theta_k|D_k) \approx \sum_{i=1}^{N} w_k^{(i)}\,\delta\!\left(\theta - \theta_k^{(i)}\right)$$

In the above formula, $\delta$ is the Dirac delta function. The update of the posterior probability density function $P(\theta_k|D_k)$ at any k-th step of the leakage source search is represented as the particle update of the particle filtering method. Each time the particles are updated, the effective sample size $N_{\mathrm{eff}}$ is calculated to quantify the degree of particle degeneracy; if $N_{\mathrm{eff}}$ falls below a set threshold $\eta$, a residual resampling method is used to sample a new particle set from the original particle set.
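A short sketch of the particle update and the degeneracy check follows. It assumes that the weight of each particle is multiplied by the Poisson likelihood of the new reading and then renormalised, and that the effective sample size is the commonly used $N_{\mathrm{eff}} = 1/\sum_i (w^{(i)})^2$; both are assumptions of this sketch, and the function names are hypothetical.

```python
import math


def poisson_pmf(d, rate):
    """Poisson probability of d contacts for a given mean rate."""
    return math.exp(-rate) * rate ** d / math.factorial(d)


def update_weights(weights, rates, reading):
    """Bayes update of particle weights for one sensor reading.

    rates[i] is the mean contact rate that particle i predicts at the
    robot's current position (assumed to be computed elsewhere).
    """
    raw = [w * poisson_pmf(reading, r) for w, r in zip(weights, rates)]
    total = sum(raw)
    if total == 0.0:
        raise ValueError("all particles have zero likelihood for this reading")
    return [x / total for x in raw]


def effective_sample_size(weights):
    """N_eff = 1 / sum(w_i^2); equals N for uniform weights, 1 if degenerate."""
    return 1.0 / sum(w * w for w in weights)


def needs_resampling(weights, eta):
    """True when N_eff has fallen below the threshold eta."""
    return effective_sample_size(weights) < eta
```

With four uniform weights the effective sample size is 4; after a reading that only one particle explains well, nearly all weight moves to that particle and the effective sample size approaches 1, which triggers resampling for any threshold above that.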

Optionally, sampling the new particle set from the original particle set by residual resampling includes: calculating the cumulative probability of the k-th step according to the following formula:

$$C_k^{(j)} = C_k^{(j-1)} + w_k^{(j)}, \quad j = 1, \ldots, N$$

In the above formula, $C_k^{(j)}$ is the cumulative probability of the first j weights at step k, $C_k^{(j-1)}$ is the cumulative probability of the first j−1 weights at step k, $w_k^{(j)}$ is the j-th weight of the original particle set $\{\theta_k^{(i)}, w_k^{(i)}\}_{i=1}^{N}$, and N is the number of particles in the particle set. N random numbers $\{u_i\}_{i=1}^{N}$ uniformly distributed on the interval [0, 1] are generated; for each random number $u_i$, the smallest j satisfying $C_k^{(j)} \ge u_i$ is found, and the i-th particle of the new set is then set to $\tilde{\theta}_k^{(i)} = \theta_k^{(j)}$ with weight $\tilde{w}_k^{(i)} = 1/N$, which gives the resampled particle set $\{\tilde{\theta}_k^{(i)}, \tilde{w}_k^{(i)}\}_{i=1}^{N}$. After resampling, a new particle set is sampled from the resampled particle set with the Metropolis-Hastings algorithm, using a normal distribution probability function as the transition probability.
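The resampling and diversification steps can be sketched as below. The sketch assumes that resampled particles receive equal weights 1/N and that the Metropolis-Hastings target is supplied by the caller as a log-density (for instance the log-posterior up to a constant); `log_target`, `sigma` and the function names are hypothetical.

```python
import bisect
import itertools
import math
import random


def resample(particles, weights, rng):
    """Draw N particles by searching the cumulative weights C_j."""
    cumulative = list(itertools.accumulate(weights))
    cumulative[-1] = 1.0  # guard against floating-point rounding
    n = len(particles)
    new = []
    for _ in range(n):
        u = rng.random()                        # uniform on [0, 1)
        j = bisect.bisect_left(cumulative, u)   # smallest j with C_j >= u
        new.append(particles[j])
    return new, [1.0 / n] * n


def mh_move(particles, log_target, sigma, rng):
    """One Metropolis-Hastings step per particle with a normal proposal."""
    moved = []
    for theta in particles:
        proposal = tuple(rng.gauss(v, s) for v, s in zip(theta, sigma))
        log_ratio = log_target(proposal) - log_target(theta)
        # The normal proposal is symmetric, so the acceptance probability
        # is min(1, target(proposal) / target(theta)).
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            moved.append(proposal)
        else:
            moved.append(theta)
    return moved
```

Resampling alone only duplicates existing particles; the Metropolis-Hastings step afterwards perturbs the duplicates while keeping them distributed according to the target, which restores diversity.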

Optionally, step 4) includes:

4.1) Initializing the source term parameters and determining the parameter estimation range; initializing the particle filter parameters and setting the step index k to 1;

4.2) Obtaining the sensor reading $d_k(r_k)$ at the k-th step position $r_k = \{x_k, y_k\}$;

4.3) Performing the k-th update of the particle filter;

4.4) Calculating the information entropy of each candidate position in the passable action set $U_k^{j}$ of step k and of each candidate position in the action set $U_k$ of step k, and calculating the optimal passable action direction $u_k^{j*}$ and the optimal action direction $u_k^{*}$ of step k according to the following formula:

$$u_k^{j*} = \arg\max_{u_k^{j} \in U_k^{j}} I(u_k^{j}), \qquad u_k^{*} = \arg\max_{u_k \in U_k} I(u_k)$$

In the above formula, $I(u_k^{j})$ is the information entropy of each candidate position in the passable action set $U_k^{j}$ of step k, and $I(u_k)$ is the information entropy of each candidate position in the action set $U_k$ of step k;

4.5) Judging whether $u_k^{j*}$ and $u_k^{*}$ of step k are equal; if they are equal, selecting $u_k^{j*}$ as the next position, moving, and jumping to step 4.7);

4.6) Traversing the most recent memory steps, from 1 to m, and counting the number count of steps that satisfy a two-part condition (joined by &) on $u_{k-m}^{j*}$ and $u_{k-m}^{*}$, where $u_{k-m}^{j*}$ is the optimal passable action direction of step k−m and $u_{k-m}^{*}$ is the optimal action direction of step k−m; if count is smaller than a preset threshold n, selecting $u_k^{j*}$ and moving until the next intersection, then turning to the $u_k^{*}$ direction; otherwise, selecting $u_k^{j*}$ as the next position, moving, and jumping to step 4.7);

4.7) Judging whether a preset stopping condition is met; if so, ending the source search and jumping to step 5); otherwise, jumping to step 4.2) to continue the iteration.

Optionally, in step 5), when the improved source search action scheme is combined with multi-robot control technology to generate the multi-robot source search action scheme based on the cognitive search strategy, each robot i of the multiple robots sends its own position $r_k^{i}$ and sensor reading $d_k^{i}$ to the other robots nearby at each step; each robot i receives the sensor readings collected by nearby robots, performs the same source search individually, and maintains its own k-th step posterior probability density function $P_i(\theta_k|D_k)$, the posterior probability density functions $P_i(\theta_k|D_k)$ of all robots being identical.

In addition, the invention also provides a multi-robot-source search scheme generation system based on the cognitive search strategy, which comprises a microprocessor and a memory which are connected with each other, wherein the microprocessor is programmed or configured to execute the steps of the multi-robot-source search scheme generation method based on the cognitive search strategy.

Furthermore, the present invention provides a computer-readable storage medium having stored therein a computer program programmed or configured to perform the cognitive search strategy-based multi-robot search scheme generation method.

Compared with the prior art, the invention has the following advantages: the method comprises the steps of establishing a gas leakage model and a detection model of a sensor; modeling a leakage source search process based on the gas leakage model and the detection model of the sensor; the iterative update of source item parameter estimation in the leakage source searching process is represented by using a particle filtering method, and a source searching action scheme based on a cognitive searching strategy is established; aiming at an actual obstacle scene, improving the obstacle avoidance function of the source searching action scheme; and generating a multi-robot source searching action scheme based on the cognitive searching strategy aiming at the combination of the improved source searching action scheme and the multi-robot control technology. According to the invention, the gas leakage source search of multiple robots can be realized, the uncertainty existing in the source searching process is considered, the multiple robots are controlled to perform collaborative source searching in a scene with a large space scale by utilizing a multi-robot collaborative algorithm, the source searching efficiency and the source searching accuracy can be effectively improved, and the method can be suitable for the requirements of searching various gas sources in the scene including the source searching of dangerous chemical leakage accidents in a chemical industry park, the searching of gas leakage sources in residential buildings and the like.

Drawings

FIG. 1 is a schematic diagram of a basic flow of a method according to an embodiment of the present invention.

FIG. 2 is a schematic diagram of a detection model of a sensor according to an embodiment of the present invention.

FIG. 3 is a schematic diagram of an obstacle avoidance scheme according to an embodiment of the present invention.

Fig. 4 is a schematic flow chart of step 4) in the embodiment of the invention.

FIG. 5 is a schematic diagram of a coordinated control action scheme of multiple robots in an ideal case according to an embodiment of the invention.

FIG. 6 is a schematic diagram of a coordinated control action scheme of multiple robots in case of blocked information communication according to an embodiment of the present invention.

Fig. 7 is a schematic flow chart of step 5) in the embodiment of the invention.

Detailed Description

A method for generating an action scenario for controlling a multi-robot to perform a gas leakage source search based on a cognitive search strategy will be described in detail below by taking a large obstacle scenario as an example.

As shown in fig. 1, the method for generating a multi-robot search scheme based on a cognitive search strategy according to the present embodiment includes:

1) Establishing a gas leakage model and a detection model of a sensor;

In this embodiment, step 1) builds a gas leakage model based on an atmospheric transport and diffusion model and builds a detection model of the sensor to process the observed data. Step 2) models the leakage source search process as a partially observable Markov decision process and determines the information state, the available action set and the reward function. Step 3) represents the iterative update of the source term estimate with a particle filtering method, adds a resampling step to avoid particle degeneracy, samples new particles with the Metropolis-Hastings algorithm, and then redesigns the reward function according to the distribution and weights of the particles to control the robot's actions. Step 4) takes a complex obstacle scene as the background, considers the influence of road network constraints at the algorithm level, and introduces an intermittent search strategy to improve the robot action scheme based on the cognitive search strategy. Step 5) addresses the large spatial scale of complex obstacle scenes: taking into account the uncertainty in the source search process, it uses a multi-robot cooperative algorithm to control multiple robots to search collaboratively in a scene of large spatial scale.

Through these steps, multiple robots can search for a gas leakage source while the uncertainty in the search process is taken into account, search efficiency and accuracy can be effectively improved, and the method is suitable for a variety of gas source search scenes, including searching for the source of hazardous chemical leakage accidents in chemical industry parks and searching for gas leakage sources in residential buildings.

In this embodiment, the functional expression of the gas leakage model established in step 1) is:

$$0 = V\frac{\partial c(r|\theta_0)}{\partial y} + D\,\Delta c(r|\theta_0) - \frac{1}{\tau}\,c(r|\theta_0) + Q\,\delta(r - r_0) \tag{1}$$

In the above formula, $V$ is the average wind speed, $\partial/\partial y$ is the gradient in the y-axis direction, $c(r|\theta_0)$ is the gas concentration at position $r = \{x, y\}$, $\theta_0 = \{r_0, Q\}$ is the source term parameter, $r_0 = \{x_0, y_0\}$ is the position of the leakage source, $Q$ is the diffusion intensity, $D$ is the effective diffusion coefficient of the gas, $\Delta c(r|\theta_0)$ is the Laplacian of the concentration, $\tau$ is the lifetime of the gas molecules, and $\delta$ is the Dirac delta function. The gas leakage model is derived from the steady-state advection-diffusion equation; the average wind blows in the negative direction of the y axis. The specific coordinate system is shown in fig. 2, where the abscissa and ordinate represent the scene size. In this embodiment, the functional expression of the detection model of the sensor established in step 1) is:

$$P(d(r)|\theta_0) = \frac{R(r|\theta_0)^{d}}{d!}\,e^{-R(r|\theta_0)} \tag{2}$$

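$$R(r|\theta_0) = 4\pi D a\,c(r|\theta_0) \tag{3}$$

$$c(r|\theta_0) = \frac{Q}{4\pi D\,|r - r_0|}\exp\!\left[-\frac{(y - y_0)V}{2D}\right]\exp\!\left[-\frac{|r - r_0|}{\lambda}\right] \tag{4}$$

In the above formulas, $R(r|\theta_0)$ is the average number of contacts between the sensor and gas molecules per unit time, $a$ is the radius of the spherical sensor, and $c(r|\theta_0)$ is the gas concentration at position $r = \{x, y\}$, which is also the three-dimensional analytical solution of formula (1); $\theta_0 = \{r_0, Q\}$ is the source term parameter, $r_0 = \{x_0, y_0\}$ is the position of the leakage source, $Q$ is the diffusion intensity, $V$ is the average wind speed, and the intermediate variable $\lambda$ is given by:

$$\lambda = \sqrt{\frac{D\tau}{1 + \dfrac{V^{2}\tau}{4D}}} \tag{5}$$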

In an actual source search, the limited accuracy of existing gas sensors means that the detected concentration can differ considerably from the true concentration. A detection model of the sensor is therefore needed to process the observed data. The contact between the sensor and gas molecules can be treated by analogy with an electromagnetic phenomenon: the Smoluchowski formula is introduced to convert the concentration at any position into the average number of contacts between the sensor and gas molecules per unit time, which gives the average contact model of formula (3). Where the gas concentration is higher, the sensor contacts gas molecules more times per unit time.

At the same time, turbulence disturbs the concentration field, so the sensor obtains only sporadic, intermittent effective readings, and this effect must be taken into account. A Poisson process is introduced into the detection model of the sensor to approximate the effect of turbulence on gas diffusion. A Poisson process is a random process defined by the times at which events occur. A random process $N(t)$ is a time-homogeneous one-dimensional Poisson process if it satisfies two conditions: first, the numbers of events occurring in two disjoint intervals are independent random variables; second, the probability distribution of the number of events occurring within the interval $[t, t+\tau]$ is:

$$P\left[N(t+\tau) - N(t) = n\right] = \frac{e^{-\mu\tau}(\mu\tau)^{n}}{n!}, \quad n = 0, 1, 2, \ldots \tag{6}$$

In the above equation, μ is a positive number, commonly referred to as the arrival rate.

In this embodiment, the average number of contacts between the sensor and gas molecules per unit time is taken as the arrival rate, i.e. $\mu = R(r|\theta_0)$, so the probability that the sensor at position $r = \{x, y\}$ contacts gas molecules d times per unit time is given by formula (2). Fig. 2 is a schematic diagram of the detection model of the sensor in this embodiment for an example scene, where (a) shows the detection values of the sensor before the turbulence effect is considered and (b) shows the detection values after the turbulence effect is considered.
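As a quick check of how the detection probability follows from the Poisson process, one can substitute the arrival rate $\mu = R(r|\theta_0)$, the count $n = d$ and a unit interval $\tau = 1$ into the Poisson probability. The numerical value $R = 2$ below is an arbitrary illustration and does not come from the patent.

```latex
\begin{aligned}
P(d(r)\mid\theta_0)
  &= \left.\frac{e^{-\mu\tau}(\mu\tau)^{n}}{n!}\right|_{\mu = R(r\mid\theta_0),\ \tau = 1,\ n = d}
   = \frac{R(r\mid\theta_0)^{d}}{d!}\,e^{-R(r\mid\theta_0)} \\[4pt]
\text{For } R = 2:\quad
P(0) &= e^{-2} \approx 0.135, \qquad
P(1) = 2e^{-2} \approx 0.271, \qquad
P(2) = \tfrac{2^{2}}{2!}e^{-2} \approx 0.271
\end{aligned}
```

Even at a position where two contacts per unit time are expected on average, the sensor reports no contact about 13.5% of the time, which is the kind of intermittent reading the model is meant to capture.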

In this embodiment, step 2) includes: representing the source term parameter to be estimated, $\theta_0 = \{r_0, Q\}$, by a probability density function, and taking the posterior probability density function $P(\theta_k|D_k)$ after any k-th step as the information state obtained from all the information collected in the first k steps, where $\theta_k$ is the source term parameter estimated at step k, $D_k = \{d_1(r_1), d_2(r_2), \ldots, d_k(r_k)\}$ denotes all the information collected up to step k, $r_k = \{x_k, y_k\}$ is the position at step k, and $d_1(r_1)$ to $d_k(r_k)$ are the sensor readings obtained at steps 1 to k respectively; the initial probability density function $P(\theta_0)$ is preset from prior knowledge, and the posterior probability density function $P(\theta_k|D_k)$ of any k-th step is updated with Bayes' formula as follows:

$$P(\theta_k|D_k) = \frac{P(\theta_k|D_{k-1})\,P(d_k(r_k)|\theta_k)}{P(d_k(r_k)|D_{k-1})} \tag{7}$$

The next position is selected according to the following formula:

$$u^{*} = \arg\max_{u \in U} I(u) \tag{8}$$

In the above formula, $u^{*}$ is the selected next position, $u$ is the candidate position corresponding to an element of the optional action set $U$, $I(u)$ is the information gain obtainable at the candidate position, and $\arg\max$ selects the position at which the information gain $I(u)$ is largest. At the beginning of the search no source term information has been collected, so the initial probability density function $P(\theta_0)$ of the source term parameters is obtained from prior knowledge. It is assumed here that the parameter estimation range can be determined from prior knowledge and that the initial probability density function is a uniform distribution. The cognitive search algorithm directs the source-seeking robot to move by a fixed step size at each step. Only the four-connected case is considered in this embodiment, so the optional action set is $U = \{\uparrow, \downarrow, \leftarrow, \rightarrow\}$. At each step the source-seeking robot uses a reward function to calculate the information gain $I(u)$ obtainable at each candidate position in the optional action set and selects the position $u^{*}$ with the largest information gain as the next position.

In this embodiment, step 3) includes: sampling the posterior probability density function $P(\theta_k|D_k)$ of any k-th step to produce N weighted random samples $\{\theta_k^{(i)}, w_k^{(i)}\}_{i=1}^{N}$, where $\theta_k^{(i)}$ is the estimate of the source term parameter by the i-th particle at step k, $w_k^{(i)}$ is the weight of particle $\theta_k^{(i)}$, and the N weights $w_k^{(i)}$ sum to 1; using the N weighted random samples as the N particles of the particle filtering method, the posterior probability density function $P(\theta_k|D_k)$ of the k-th step is approximately represented as:

$$P(\theta_k|D_k) \approx \sum_{i=1}^{N} w_k^{(i)}\,\delta\!\left(\theta - \theta_k^{(i)}\right) \tag{9}$$

In this embodiment, obtaining a new particle set from the original particle set by residual resampling includes calculating the cumulative probability of the k-th step according to the following formula:

$$C_k^{(j)} = C_k^{(j-1)} + w_k^{(j)}, \quad j = 1, \ldots, N$$

Step 3) includes particle filtering and resampling.

In particle filtering, the iterative update of the source term estimate is represented with a particle filtering method: the posterior probability density function $P(\theta_k|D_k)$ of the k-th step is sampled to generate N weighted random samples $\{\theta_k^{(i)}, w_k^{(i)}\}_{i=1}^{N}$, where $\theta_k^{(i)}$ is the estimate of the source term parameter by the i-th particle at step k and $w_k^{(i)}$ is the corresponding particle weight. Approximating $P(\theta_k|D_k)$ with these N particles gives formula (9). When the sensor obtains the reading $d_k$ at position $r_k$ at step k, the particle weights are updated according to the Bayesian update described above. If the probability that a particle's estimate of the source term parameters yields the reading $d_k$ is greater, the updated weight of that particle is greater. After several iterations, the few particles that happen to be close to the true source term parameters carry more and more of the weight, which leads to the particle degeneracy problem. The effective sample size $N_{\mathrm{eff}}$ is generally used to quantify the degree of degeneracy of the particle filter:

$$N_{\mathrm{eff}} = \frac{1}{\sum_{i=1}^{N}\left(w_k^{(i)}\right)^{2}}$$

In the above formula, N is the number of weighted random samples and $w_k^{(i)}$ is the weight of the i-th particle. When $N_{\mathrm{eff}}$ falls below the set threshold $\eta$, a residual resampling method is used to solve the particle degeneracy problem. A new particle set $\{\tilde{\theta}_k^{(i)}\}_{i=1}^{N}$ is sampled from the existing particle set $\{\theta_k^{(i)}, w_k^{(i)}\}_{i=1}^{N}$ in the following main steps: the cumulative probability $C_k^{(j)} = C_k^{(j-1)} + w_k^{(j)}$ is calculated; N random numbers $\{u_i\}_{i=1}^{N}$ uniformly distributed on the interval [0, 1] are generated; for each $u_i$, the smallest j such that $C_k^{(j)} \ge u_i$ is found, and then $\tilde{\theta}_k^{(i)} = \theta_k^{(j)}$ is set. In this way, particles with larger weights are duplicated many times, and particles with small weights, which contribute very little to the calculation of the posterior probability function, are removed. However, the resampling step may cause a loss of particle diversity: after several iterations all particles may become copies of a few particles. Furthermore, updating the particle filter cannot reduce or eliminate the error of the initial particles' point estimates of the source term parameters. Particle diversity therefore needs to be increased: after the resampling step, a new particle set is sampled from the resampled particle set with the Metropolis-Hastings algorithm, using a normal distribution probability function as the transition probability.
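To make the Metropolis-Hastings move concrete, the acceptance rule for a normal transition probability can be written out as follows. The symbols $\pi$ (target density, here the posterior $P(\theta_k|D_k)$ up to a constant), $q$ (proposal density), $\sigma$ (proposal standard deviation) and $\alpha$ (acceptance probability) are notation introduced for this sketch and do not appear in the patent.

```latex
\begin{aligned}
\theta' &\sim q(\theta' \mid \theta) = \mathcal{N}\!\left(\theta,\ \sigma^{2} I\right), \\[4pt]
\alpha(\theta \to \theta')
  &= \min\!\left(1,\ \frac{\pi(\theta')\,q(\theta \mid \theta')}{\pi(\theta)\,q(\theta' \mid \theta)}\right)
   = \min\!\left(1,\ \frac{\pi(\theta')}{\pi(\theta)}\right)
\end{aligned}
```

The second equality holds because a normal proposal centred on the current particle is symmetric, $q(\theta \mid \theta') = q(\theta' \mid \theta)$. A proposed particle is therefore always accepted when it is more probable under the posterior and is otherwise accepted with probability equal to the posterior ratio, so duplicated particles spread out without drifting away from the posterior.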

Under the road network constraint, the robot's optional actions at the k-th step are restricted to the passable action set Uj_k, and the source-seeking robot can only select its next position from Uj_k. However, this may cause the robot to wander within one area without being able to move toward the source. To address this, following the intermittent search principle, if within the previous m steps the robot has been prevented by the road network constraint from moving in the same direction n times, a long-distance turning movement is adopted: the robot leaves the area and, at the nearest intersection, turns toward the direction in which it could not previously move, so as to move toward the source. The process is shown in FIG. 3, in which the orientation has been calibrated, the solid black line represents the robot trajectory, the blue rectangles represent obstacles, and the red points mark the positions at which the robot's sensor operates. The blue circle in the left figure represents the current position of the robot, and the dotted arrow points in the direction the robot would move without the road network constraint. It can be seen that the robot is repeatedly blocked from moving forward by the obstacle and can only choose to move left or right, so a long-distance turning movement is used to bypass the obstacle, during which the sensor does not work.

As shown in FIG. 4, step 4) in this embodiment includes:

4.2) Obtaining the sensor reading d_k(r_k) at the k-th step position r_k = {x_k, y_k};

4.3) Performing the k-th step update of the particle filter;
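The road network constraint used in step 4) can be illustrated by filtering the four-connected action set down to the passable set Uj_k. The sketch below assumes a grid map with obstacle cells; the grid representation, the function name `passable_actions`, and the obstacle layout are hypothetical and only serve to show how Uj_k can differ from the unconstrained set.

```python
# Four-connected moves: up, down, left, right (grid coordinates, y grows upward).
ACTIONS = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}

def passable_actions(position, obstacles, width, height):
    """Return the actions whose target cell lies inside the map and is not an obstacle."""
    x, y = position
    result = {}
    for name, (dx, dy) in ACTIONS.items():
        nx, ny = x + dx, y + dy
        if 0 <= nx < width and 0 <= ny < height and (nx, ny) not in obstacles:
            result[name] = (nx, ny)
    return result

# Hypothetical layout: an obstacle directly above the robot blocks the "up" move.
obstacles = {(2, 3), (3, 3)}
moves = passable_actions((2, 2), obstacles, width=6, height=6)
```

In this layout the robot at (2, 2) can only move down, left, or right, which is the situation in which the long-distance turning movement becomes relevant.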

As shown in FIG. 5 and FIG. 6, in the multi-robot collaborative search algorithm it is assumed that each robot i in a team of N robots individually performs the Entrotaxis-Turn source-seeking algorithm described above and maintains its own source term estimate P_i(θ_k|D_k), which describes where the source may be. During the search, each robot transmits its own position r_k^i and the sensor reading d_k^i obtained there to the other robots nearby, so each robot can receive the information collected by nearby robots and update its estimate of the source term. Each robot decides its next position from its own probability estimate P_i(θ_k|D_k). Without communication delay or information loss, every robot holds the same probability estimate of the source position, i.e., P₁(θ_k|D_k) = P₂(θ_k|D_k) = ... = P_N(θ_k|D_k). At the algorithm level, this means that all robots share a single particle filter to estimate the source position, and this one particle filter can be used to determine the next positions of all robots. FIG. 5 shows the ideal case, P₁(θ_k|D_k) = P₂(θ_k|D_k) = P₃(θ_k|D_k); FIG. 6 shows a non-ideal case in which communication is blocked, P₁(θ_k|D_k) ≠ P₂(θ_k|D_k) = P₃(θ_k|D_k).

In this embodiment, when the improved source-seeking action scheme is combined with the multi-robot control technique in step 5) to generate the multi-robot source-seeking action scheme based on the cognitive search strategy, each robot i transmits its own position r_k^i and sensor reading d_k^i to the other robots nearby at each step; each robot i receives the sensor readings collected by nearby robots, individually performs the same source search, and maintains its own k-th step posterior probability density function P_i(θ_k|D_k); and the k-th step posterior probability density function P_i(θ_k|D_k) is the same for every robot.
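The claim that all robots hold the same posterior when every message arrives, and that the posteriors diverge when a message is lost, can be checked with a small sketch. Everything specific here is an assumption made for illustration: the `Robot` class, the Poisson likelihood, and the distance-based `expected_count` stand in for the sensor detection model, whose formula is not reproduced in this text.

```python
import math

def poisson_likelihood(reading, mean):
    """P(reading | mean) for a Poisson-distributed count (hypothetical sensor model)."""
    return math.exp(-mean) * mean ** reading / math.factorial(reading)

def expected_count(particle, position):
    """Hypothetical stand-in for the detection model: counts fall off with distance."""
    x0, y0, q = particle
    dist = math.hypot(position[0] - x0, position[1] - y0)
    return q / (1.0 + dist)

class Robot:
    def __init__(self, particles):
        self.particles = list(particles)
        self.weights = [1.0 / len(particles)] * len(particles)

    def receive(self, position, reading):
        """Fold one (position, reading) message into this robot's own posterior."""
        raw = [w * poisson_likelihood(reading, expected_count(p, position))
               for p, w in zip(self.particles, self.weights)]
        total = sum(raw)
        self.weights = [r / total for r in raw]

particles = [(1.0, 1.0, 5.0), (4.0, 4.0, 5.0), (7.0, 2.0, 5.0)]
messages = [((1.0, 2.0), 3), ((5.0, 4.0), 0), ((2.0, 1.0), 4)]

# Ideal case: every robot receives every message, so the posteriors coincide.
team = [Robot(particles) for _ in range(3)]
for position, reading in messages:
    for robot in team:
        robot.receive(position, reading)
ideal_equal = all(r.weights == team[0].weights for r in team)

# Non-ideal case: robot 0 misses the last message, so P_1 differs from P_2 = P_3.
lossy = [Robot(particles) for _ in range(3)]
for idx, (position, reading) in enumerate(messages):
    for rid, robot in enumerate(lossy):
        if rid == 0 and idx == len(messages) - 1:
            continue
        robot.receive(position, reading)
```

Because each robot applies the same deterministic update to the same readings, identical message histories yield identical weights, which is why the ideal case behaves like one shared particle filter.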

As shown in FIG. 7, step 5) in this embodiment includes:

5.1) Initializing the source term parameters and determining the parameter estimation range; initializing the particle filter parameters; initializing the step counter k to 1;

5.2) Obtaining the sensor reading d_k(r_k) at the k-th step position r_k = {x_k, y_k};

5.3) Performing the k-th step update of the particle filter;

5.4) Calculating the information entropy of each candidate position in the passable action set Uj_k of the k-th step and of each candidate position in the action set U_k of the k-th step, and calculating the optimal passable action direction uj*_k and the optimal action direction u*_k of the k-th step according to the following formula:

$$\mathit{uj}_k^{*} = \arg\max_{\mathit{uj}_k \in \mathit{Uj}_k} I(\mathit{uj}_k), \qquad u_k^{*} = \arg\max_{u_k \in U_k} I(u_k)$$

In the above formula, I(uj_k) is the information entropy of each candidate position in the passable action set Uj_k of the k-th step, and I(u_k) is the information entropy of each candidate position in the action set U_k of the k-th step;

5.5) If uj*_k and u*_k of the k-th step are equal, selecting uj*_k as the next position, moving there, and jumping to step 5.7);

5.6) Traversing the most recent m memory steps, from 1 to m, and counting the steps that satisfy the condition on uj*_{k-m} and u*_{k-m}, where uj*_{k-m} is the optimal passable action direction at step k-m and u*_{k-m} is the optimal action direction at step k-m; if the count is greater than a preset threshold n, selecting the direction away from the other robots and moving until the next intersection, where the robot turns to the u*_k direction; otherwise, selecting uj*_k as the next position, moving there, and jumping to step 5.7);

5.7) Judging whether a preset stopping condition is met; if so, ending the source search and exiting step 5); otherwise, jumping to step 5.2).

In this embodiment, increasing the number of robots reduces the time needed to find the source position and reduces the variance of the source position estimate, because more robots effectively raise the detection rate of the team. A source-seeking team of N robots can therefore complete the search with fewer particle filter iterations, which shortens the search time and improves the success rate.
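The decision made in steps 5.5) and 5.6) can be sketched as a small function. This is an illustration under stated assumptions: the exact counting condition is not recoverable from this text, so the sketch assumes the caller records, for each of the last m steps, whether the road network constraint blocked the preferred move. The names `choose_move` and `blocked_history` are hypothetical, and the rule of turning away from the other robots is left out.

```python
from collections import deque

def choose_move(best_passable, best_unconstrained, blocked_history, n):
    """Decide between a normal step and a long-distance turning movement.

    blocked_history holds one boolean per remembered step (at most m of them),
    True where the constraint prevented the preferred move (an assumed reading
    of the counting condition in step 5.6).
    """
    if best_passable == best_unconstrained:
        # Step 5.5: the constraint does not interfere, take the optimal move.
        return ("step", best_passable)
    count = sum(blocked_history)
    if count > n:
        # Step 5.6: blocked too often, travel to the next intersection and turn.
        return ("long_turn", best_unconstrained)
    return ("step", best_passable)

m, n = 5, 2  # hypothetical memory length and threshold
history = deque([True, False, True, True], maxlen=m)
decision = choose_move("left", "up", history, n)
```

Here three of the remembered steps were blocked, which exceeds n = 2, so the sketch returns a long-distance turn toward the unconstrained optimum instead of another sideways step.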

In summary, the method of this embodiment comprises: constructing a gas leakage model and a sensor detection model according to an atmospheric transport and diffusion model; modeling the leakage source search process; establishing a source-seeking action scheme based on a cognitive search strategy; improving the action scheme to avoid obstacles in actual obstacle scenes; and, for large obstacle scenes, combining the action scheme with multi-robot control technology to generate a multi-robot source search action scheme based on the cognitive search strategy. The method is applicable to a wide range of scenes, achieves high search efficiency and success rate, and is robust in complex environments. The key to this embodiment is the method for generating a multi-robot source search action scheme based on a cognitive search strategy, which provides an action scheme for gas source search in large obstacle scenes. The method establishes a gas leakage model and, to account for the limited accuracy of the robot sensor, a detection model; it models the whole source search process based on the cognitive search strategy, designs an obstacle avoidance algorithm, and provides a multi-robot control action scheme for the corresponding large-scale scenes. Compared with existing autonomous source-seeking action schemes, the method adapts to a wider range of scenes and significantly improves search efficiency and success rate in large obstacle scenes, which makes it more practical in real applications.

In addition, this embodiment also provides a multi-robot source search scheme generation system based on the cognitive search strategy, which comprises a microprocessor and a memory connected to each other, wherein the microprocessor is programmed or configured to execute the steps of the multi-robot source search scheme generation method based on the cognitive search strategy.

In addition, this embodiment also provides a computer-readable storage medium storing a computer program programmed or configured to perform the aforementioned multi-robot source search scheme generation method based on the cognitive search strategy.

It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-readable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein. The present application is described with reference to flowcharts and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the present application. The instructions, when executed by a processor of a computer or other programmable data processing apparatus, produce means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer-implemented process, such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

The above description is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above examples, and all technical solutions belonging to the concept of the present invention belong to the protection scope of the present invention. It should be noted that modifications and adaptations to the present invention may occur to one skilled in the art without departing from the principles of the present invention and are intended to be within the scope of the present invention.

Claims

1. A method for generating a multi-robot source search scheme based on a cognitive search strategy, characterized by comprising the following steps:

1) Establishing a gas leakage model and a detection model of a sensor;

5) Combining the improved source searching action scheme with a multi-robot control technology to generate a multi-robot source searching action scheme based on a cognitive searching strategy;

the functional expression of the gas leakage model established in the step 1) is as follows:

In the above formula, V is the average wind speed, ∇_y is the gradient in the y-axis direction, c(r|θ₀) is the gas concentration at the position r = {x, y}, θ₀ = {r₀, Q} is the source term parameter, r₀ = {x₀, y₀} is the position of the leakage source, Q is the diffusion intensity, D is the effective diffusion coefficient of the gas, … is the variation of concentration, τ is the gas molecular lifetime, and δ is the Dirac function;

The functional expression of the detection model of the sensor established in the step 1) is as follows:

In the above formula, c(r|θ₀) is the gas concentration at the position r = {x, y}, θ₀ = {r₀, Q} is the source term parameter, r₀ = {x₀, y₀} is the position of the leakage source, Q is the diffusion intensity, V is the average wind speed, and the functional expression of the intermediate variable λ is:

；

Step 2) comprises: representing the source term parameter to be estimated, θ₀ = {r₀, Q}, by a probability density function, and determining the k-th step posterior probability density function P(θ_k|D_k), which represents the information state obtained after any k-th step from all the information collected in the first k steps, where θ_k is the source term parameter estimated at the k-th step, D_k = {d₁(r₁), d₂(r₂), …, d_k(r_k)} denotes all the information collected up to the k-th step, r_k = {x_k, y_k} is the position at the k-th step, and d₁(r₁) to d_k(r_k) are the sensor readings obtained at steps 1 to k respectively; the initial posterior probability density function P(θ₀) is preset from prior knowledge, and the posterior probability density function P(θ_k|D_k) of any k-th step is updated by Bayes' formula as follows:

$$P(\theta_k \mid D_k) = \frac{P(d_k(r_k) \mid \theta_k)\, P(\theta_k \mid D_{k-1})}{P(d_k(r_k) \mid D_{k-1})}$$

In the above formula, P(θ_k|D_{k-1}) is the posterior probability density of the (k-1)-th step, P(d_k(r_k)|θ_k) is the density weight, and P(d_k(r_k)|D_{k-1}) is the normalization factor; under the four-connected condition, the optional action set U is U = {↑, ↓, ←, →}, whose four elements ↑, ↓, ←, → represent the four directions of action; at each step, the source-seeking robot uses a reward function to calculate the information gain I(u) obtainable at each candidate position in the optional action set U and selects the position u* with the largest information gain I(u) as the next position according to the following formula, which gives the model function expression of the leakage source search process:

$$u^{*} = \arg\max_{u \in U} I(u)$$

In the above formula, u* is the selected next position, u is a candidate position corresponding to an element of the optional action set U, I(u) is the information gain obtainable at that candidate position, and arg max selects the position at which the information gain I(u) is largest.

2. The method for generating a multi-robot source search scheme based on a cognitive search strategy of claim 1, wherein step 3) comprises: sampling the posterior probability density function P(θ_k|D_k) of any k-th step to obtain weighted random samples {x_k^i, w_k^i}_{i=1:N}, where x_k^i is the estimate of the source term parameter given by the i-th particle at the k-th step, w_k^i is the weight of the particle x_k^i, and the N weights w_k^i sum to 1; and using the weighted random samples {x_k^i, w_k^i}_{i=1:N} as the N particles of the particle filtering method to approximate the k-th step posterior probability density function P(θ_k|D_k) as follows:

$$P(\theta_k \mid D_k) \approx \sum_{i=1}^{N} w_k^i\, \delta\!\left(\theta_k - x_k^i\right)$$

3. The method for generating a multi-robot source search scheme based on a cognitive search strategy according to claim 2, wherein sampling a new particle set from the original particle set by residual resampling comprises: calculating the cumulative probability of the k-th step according to the following formula:

$$C_k^{j} = C_k^{j-1} + w_k^{j}$$

In the above formula, C_k^j is the cumulative probability of the first j weights at the k-th step, C_k^{j-1} is the cumulative probability of the first j-1 weights at the k-th step, w_k^j is the j-th weight of the original particle set, and N is the number of particles in the particle set; generating a set of N random numbers {μ_k^i}_{i=1:N} uniformly distributed on the interval [0, 1]; for each random number μ_k^i in the set {μ_k^i}_{i=1:N}, finding the smallest value of j such that C_k^j > μ_k^i and, once such a j is found, setting the new particle θ_k^i equal to the j-th particle of the original particle set and w_k^i = 1/N, thereby obtaining the resampled particle set {(θ_k^i, w_k^i)}_{i=1:N}; then sampling a new particle set from the resampled particle set {(θ_k^i, w_k^i)}_{i=1:N} using the Metropolis-Hastings algorithm, with a normal distribution probability function as the transition probability of the Metropolis-Hastings sampling.

4. The method for generating a multi-robot source search scheme based on a cognitive search strategy of claim 3, wherein step 4) comprises:

4.2) Obtaining the sensor reading d_k(r_k) at the k-th step position r_k = {x_k, y_k};

4.3) Performing the k-th step update of the particle filter;

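$$\mathit{uj}_k^{*} = \arg\max_{\mathit{uj}_k \in \mathit{Uj}_k} I(\mathit{uj}_k), \qquad u_k^{*} = \arg\max_{u_k \in U_k} I(u_k)$$

In the above formula, I(uj_k) is the information entropy of each candidate position in the passable action set Uj_k of the k-th step, and I(u_k) is the information entropy of each candidate position in the action set U_k of the k-th step;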

4.6) Traversing the most recent m memory steps, from 1 to m, and counting the steps that satisfy the condition on uj*_{k-m} and u*_{k-m}, where uj*_{k-m} is the optimal passable action direction at step k-m and u*_{k-m} is the optimal action direction at step k-m; if the count is greater than a preset threshold n, selecting uj*_k and moving until the next intersection, where the robot turns to the u*_k direction; otherwise, selecting uj*_k as the next position, moving there, and jumping to step 4.7);

5. The method for generating a multi-robot source search scheme based on a cognitive search strategy according to claim 4, wherein, when the improved source-seeking action scheme is combined with the multi-robot control technology in step 5) to generate the multi-robot source search action scheme based on the cognitive search strategy, each robot i transmits its own position r_k^i and sensor reading d_k^i to the other robots nearby at each step; each robot i receives the sensor readings collected by nearby robots, individually performs the same source search, and maintains its own k-th step posterior probability density function P_i(θ_k|D_k); and the k-th step posterior probability density function P_i(θ_k|D_k) is the same for every robot.

6. A multi-robot source search scheme generation system based on a cognitive search strategy, comprising a microprocessor and a memory connected to each other, characterized in that the microprocessor is programmed or configured to perform the steps of the multi-robot source search scheme generation method based on a cognitive search strategy according to any one of claims 1 to 5.

7. A computer-readable storage medium having stored therein a computer program programmed or configured to perform the multi-robot source search scheme generation method based on a cognitive search strategy of any one of claims 1 to 5.
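For illustration only, the Bayesian update and arg max selection recited in step 2) of claim 1 can be sketched on a particle set. The sketch is not part of the claims and rests on several assumptions: the Poisson likelihood and the distance-based `expected_count` are hypothetical stand-ins for the sensor detection model, the reward I(u) is taken as the entropy of the predicted reading (an Entrotaxis-style choice), and the predicted distribution is truncated at `MAX_READING`.

```python
import math

ACTIONS = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}
MAX_READING = 10  # hypothetical cap on the discrete sensor count

def poisson(d, mean):
    """Poisson probability of observing the count d for a given mean."""
    return math.exp(-mean) * mean ** d / math.factorial(d)

def expected_count(theta, r):
    """Hypothetical detection model: mean count decays with distance to the source."""
    x0, y0, q = theta
    return q / (1.0 + math.hypot(r[0] - x0, r[1] - y0))

def bayes_update(particles, weights, r, d):
    """P(theta|D_k) = P(d|theta) P(theta|D_{k-1}) / P(d|D_{k-1}) on a particle set."""
    raw = [w * poisson(d, expected_count(p, r)) for p, w in zip(particles, weights)]
    evidence = sum(raw)  # normalization factor P(d | D_{k-1})
    return [x / evidence for x in raw]

def reward(particles, weights, r):
    """Entropy of the predicted reading at r, summed over counts 0..MAX_READING."""
    entropy = 0.0
    for d in range(MAX_READING + 1):
        p_d = sum(w * poisson(d, expected_count(p, r))
                  for p, w in zip(particles, weights))
        if p_d > 0.0:
            entropy -= p_d * math.log(p_d)
    return entropy

def best_action(particles, weights, position):
    """u* = arg max over the four-connected action set of the reward I(u)."""
    scores = {name: reward(particles, weights, (position[0] + dx, position[1] + dy))
              for name, (dx, dy) in ACTIONS.items()}
    return max(scores, key=scores.get), scores

particles = [(1.0, 1.0, 5.0), (4.0, 4.0, 5.0), (7.0, 2.0, 5.0)]
weights = [1.0 / 3.0] * 3
weights = bayes_update(particles, weights, r=(2.0, 2.0), d=2)
action, scores = best_action(particles, weights, position=(2.0, 2.0))
```

The update reweights each particle by how well it explains the new reading, and the selection then moves the robot to the neighbouring position where the next reading is expected to be most informative under the assumed reward.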