CN114154383A

CN114154383A - Multi-robot-source search scheme generation method and system based on cognitive search strategy

Info

Publication number: CN114154383A
Application number: CN202111457545.1A
Authority: CN
Inventors: 陈彬; 季雅泰; 吕欣; 赵勇; 刘忠; 王锐; 王昊冉; 何华; 肖军浩; 卢惠民
Original assignee: National University of Defense Technology
Current assignee: National University of Defense Technology
Priority date: 2021-12-01
Filing date: 2021-12-01
Publication date: 2022-03-08

Abstract

The invention discloses a multi-robot-source search scheme generation method and system based on a cognitive search strategy. The method comprises: establishing a gas leakage model and a detection model of a sensor; modeling the leakage source search process based on the gas leakage model and the detection model of the sensor; representing the iterative updating of the source term parameter estimate during the leakage source search with a particle filtering method, and establishing a source search action scheme based on a cognitive search strategy; adding an obstacle avoidance function to the source search action scheme for real obstacle scenes; and combining the improved source search action scheme with multi-robot control technology to generate a multi-robot source search action scheme based on the cognitive search strategy. The invention enables multiple robots to search for a gas leakage source, takes into account the uncertainty in the source search process, and uses a multi-robot cooperative algorithm to control multiple robots to search cooperatively in scenes of large spatial scale, thereby effectively improving source search efficiency and accuracy.

Description

Multi-robot-source search scheme generation method and system based on cognitive search strategy

Technical Field

The invention relates to the field of robot control, in particular to a multi-robot-source search scheme generation method and system based on a cognitive search strategy, which are suitable for various gas source search scenes including the search of sources of hazardous chemical leakage accidents in chemical industrial parks, the search of gas leakage sources in residential buildings and the like.

Background

Dangerous gas leakage accidents can cause serious casualties and economic losses. To protect personnel and avoid property loss, quickly locating the leakage source and acquiring the source term parameters is important work. Autonomous source search using mobile devices such as robots equipped with sensors is an effective approach: the robot moves toward the leakage source step by step according to the information acquired by its sensors until the source is found. How to generate a specific action scheme to control the robot's movement in this process is an urgent problem. For sensing and searching for gas leakage sources, four main classes of methods can support the generation of action schemes: gradient-based algorithms, bionic algorithms, probability- and map-based algorithms, and algorithms based on information-theoretic principles.

Algorithms based purely on the gas concentration gradient are the most primitive in this field. They guide the robot to the leakage source along the concentration gradient, but because the gas distribution in a real environment is strongly affected by factors such as turbulence, they are difficult to apply in real scenes. Bionic algorithms plan paths by imitating how animals locate odor sources; they also require a stable concentration gradient and are mostly used for source search experiments in small scenes. Probability- and map-based algorithms model the position of the odor source as a probability distribution that, through continuous observation, converges to a Dirac function, thereby determining the source position. Such algorithms appeared relatively late and are mathematically complex, but they are widely studied for their effectiveness, for example in searching for an indoor gas diffusion source using particle filtering.

To solve the problem of source localization in a turbulent environment, cognitive search strategies based on information-theoretic principles have been widely researched and applied. Such a strategy uses probabilistic estimation to locate the odor source while using a reward function based on the information state to make sequential decisions under uncertainty. In particular, the source search process can be described as a Partially Observable Markov Decision Process (POMDP) consisting of three elements: information state, action set, and reward function. Vergassola et al. proposed the earliest cognitive search strategy, Infotaxis, which uses a grid-based approach to maintain the information state. Researchers later replaced the grid-based method in the Infotaxis algorithm with a sequential Monte Carlo framework based on particle filtering, so that the algorithm can estimate both the source intensity and the source position. Different cognitive search strategies may employ different reward functions. The original Infotaxis algorithm uses a reward function containing two terms to achieve a trade-off between exploration and exploitation. Another cognitive search algorithm, Entrotaxis, proposed by Hutchinson et al., designs its reward function on the principle of maximum entropy sampling. This reward function measures the uncertainty of the expected future detection event rather than the uncertainty of the source information considered in the Infotaxis algorithm.

One feature of the gas leakage source search scenario is its large spatial scale. In such a scenario, searching for the diffusion source with a single robot may take too much time, or the robot may be too far from the source to perform effective iterative calculations, so that no source is found. In a multi-robot system, multiple robots complete tasks through coordinated actions, so that many complex tasks that are difficult for a single robot can be completed effectively. The problem of multi-robot cooperative control has attracted wide attention. Soares et al. propose a graph-based formation control algorithm that can organize robots into arbitrary, evolving shapes and trace gas diffusion sources. The odor source localization algorithm based on particle swarm optimization proposed by Jatmiko combines chemotaxis and anemotaxis in an improved particle swarm optimization algorithm, can locate a source in an environment with obstacles, and addresses the dynamic advection-diffusion problem. In addition, Hajieghrary and Ani propose a multi-robot cooperative "Infotaxis" strategy for large-scale robot teams. However, given the uncertainty in the source search process, how to control multiple robots to search cooperatively in a scene of large spatial scale so as to improve source search efficiency and accuracy still leaves wide room for research.

Disclosure of Invention

The technical problem to be solved by the invention is to provide a multi-robot-source search scheme generation method and system based on a cognitive search strategy. The invention enables multiple robots to search for a gas leakage source; taking into account the uncertainty in the source search process, it uses a multi-robot cooperative algorithm to control multiple robots to search cooperatively in scenes of large spatial scale, which effectively improves source search efficiency and accuracy. The method and system are suitable for a variety of gas source search scenes, including source search for hazardous chemical leakage accidents in chemical industrial parks and gas leakage source search in residential buildings.

In order to solve the technical problems, the invention adopts the technical scheme that:

a multi-robot-source search scheme generation method based on a cognitive search strategy comprises the following steps:

1) establishing a gas leakage model and a detection model of a sensor;

2) modeling a leakage source searching process based on a gas leakage model and a detection model of a sensor;

3) representing iterative updating of source item parameter estimation in a leakage source searching process by using a particle filtering method, and establishing a source searching action scheme based on a cognitive search strategy;

4) improving the obstacle avoidance function of the source searching action scheme aiming at the actual obstacle scene;

5) aiming at the combination of the improved source searching action scheme and the multi-robot control technology, a multi-robot source searching action scheme based on a cognitive search strategy is generated.

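Optionally, the functional expression of the gas leakage model established in step 1) is:

$$0 = V\frac{\partial c(r|\theta_0)}{\partial y} + D\,\Delta c(r|\theta_0) - \frac{1}{\tau}c(r|\theta_0) + Q\,\delta(r - r_0)$$

in the above formula, $V$ is the average wind speed, $\partial c(r|\theta_0)/\partial y$ is the concentration gradient in the y-axis direction, $c(r|\theta_0)$ is the gas concentration at position $r = \{x, y\}$, $\theta_0 = \{r_0, Q\}$ is the source term parameter of the leakage source, $r_0 = \{x_0, y_0\}$ is the position of the leakage source, $Q$ is the diffusion intensity, $D$ is the effective diffusion coefficient of the gas, $\Delta c(r|\theta_0)$ is the Laplacian of the concentration, $\tau$ is the lifetime of the gas molecules, and $\delta$ is the Dirac function.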

Optionally, the functional expression of the detection model of the sensor established in step 1) is:

$$P(d(r)|\theta_0) = \frac{R(r|\theta_0)^{d}\,e^{-R(r|\theta_0)}}{d!}$$

in the above formula, $P(d(r)|\theta_0)$ is the probability that the sensor contacts $d$ gas molecules per unit time at position $r = \{x, y\}$, $d$ is the number of contacts between the sensor and gas molecules per unit time at position $r = \{x, y\}$, and $R(r|\theta_0)$ is the average number of contacts between the sensor and gas molecules per unit time, with:

$$R(r|\theta_0) = 4\pi D a\,c(r|\theta_0)$$

in the above formula, $D$ is the effective diffusion coefficient of the gas, $a$ is the radius of the spherical sensor, and $c(r|\theta_0)$ is the gas concentration at position $r = \{x, y\}$, with:

$$c(r|\theta_0) = \frac{Q}{4\pi D\,|r - r_0|}\exp\left[-\frac{(y - y_0)V}{2D}\right]\exp\left[-\frac{|r - r_0|}{\lambda}\right]$$

in the above formula, $c(r|\theta_0)$ is the gas concentration at position $r = \{x, y\}$, $\theta_0 = \{r_0, Q\}$ is the source term parameter of the leakage source, $r_0 = \{x_0, y_0\}$ is the position of the leakage source, $Q$ is the diffusion intensity, $V$ is the average wind speed, and the functional expression of the intermediate variable $\lambda$ is:

$$\lambda = \sqrt{\frac{D\tau}{1 + \dfrac{V^{2}\tau}{4D}}}$$

Optionally, step 2) comprises: the source term parameter to be estimated is $\theta_0 = \{r_0, Q\}$, and its estimate is represented by a probability density function. The information state after any k-th step is represented by the posterior probability density function $P(\theta_k|D_k)$ of the k-th step, which is determined by all the information collected in the first k steps, where $\theta_k$ is the source term parameter estimated at the k-th step, $D_k = \{d_1(r_1), d_2(r_2), \ldots, d_k(r_k)\}$ denotes all the information collected up to the k-th step, $r_k = \{x_k, y_k\}$ is the position at the k-th step, and $d_1(r_1)$ to $d_k(r_k)$ are the sensor readings obtained in steps 1 to k respectively. The initial probability density function $P(\theta_0)$ is preset from prior knowledge, and the posterior probability density function $P(\theta_k|D_k)$ of any k-th step is updated with the Bayesian formula:

$$P(\theta_k|D_k) = \frac{P(\theta_k|D_{k-1})\,P(d_k(r_k)|\theta_k)}{P(d_k(r_k)|D_{k-1})}$$

in the above formula, $P(\theta_k|D_{k-1})$ is the posterior probability density of step k-1, $P(d_k(r_k)|\theta_k)$ is the density weight, and $P(d_k(r_k)|D_{k-1})$ is a normalization factor. Considering the four-connected case, the selectable action set is $U = \{\uparrow, \downarrow, \leftarrow, \rightarrow\}$, whose four elements represent the four directions of action. At each step the source-searching robot uses a reward function to calculate the information gain $I(u)$ obtainable at each candidate position in the selectable action set $U$ and selects the position $u^*$ with the largest information gain as the next position, so that the model function expression of the leakage source search process is:

$$u^* = \arg\max_{u \in U} I(u)$$

in the above formula, $u^*$ is the next position to be selected, $u$ is a candidate position corresponding to an element of the selectable action set $U$, $I(u)$ is the information gain obtainable at the candidate position, and argmax selects the position at which the information gain $I(u)$ is maximal.

Optionally, step 3) comprises: for any k-th step, sampling the posterior probability density function $P(\theta_k|D_k)$ to generate N weighted random samples $\{\theta_k^{(i)}, w_k^{(i)}\}_{i=1}^{N}$, where $\theta_k^{(i)}$ represents the estimate of the source term parameters by the i-th particle at the k-th step, $w_k^{(i)}$ is the weight corresponding to particle $\theta_k^{(i)}$, and the N weights $w_k^{(i)}$ sum to 1. Using the N weighted random samples as the N particles of the particle filtering method, the posterior probability density function $P(\theta_k|D_k)$ of the k-th step is approximated as:

$$P(\theta_k|D_k) \approx \sum_{i=1}^{N} w_k^{(i)}\,\delta\big(\theta - \theta_k^{(i)}\big)$$

in the above formula, $\delta$ is the Dirac function. The update of the posterior probability density function $P(\theta_k|D_k)$ of any k-th step in the leakage source search process is thereby expressed as the particle update of the particle filtering method. At each particle update, the effective sample size $N_{eff}$ is calculated to measure the degree of degeneracy of the particle filter; if $N_{eff}$ falls below a set threshold $\eta$, a new particle set is obtained by sampling from the original particle set with a residual resampling method.
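The particle update and the degeneracy check can be sketched in Python as follows. This is a minimal sketch rather than the claimed implementation: the function names are hypothetical, and the definition N_eff = 1 / Σ w_i² is assumed here as the usual effective sample size for normalized weights.

```python
def update_weights(weights, likelihoods):
    """Bayes update: scale each weight by its particle's likelihood, then normalize."""
    unnormalized = [w * p for w, p in zip(weights, likelihoods)]
    total = sum(unnormalized)
    if total <= 0.0:
        raise ValueError("all particles have zero likelihood")
    return [w / total for w in unnormalized]


def effective_sample_size(weights):
    """N_eff = 1 / sum(w_i^2) for normalized weights (assumed definition)."""
    return 1.0 / sum(w * w for w in weights)


def needs_resampling(weights, eta):
    """True when N_eff has dropped below the threshold eta."""
    return effective_sample_size(weights) < eta
```

With uniform weights N_eff equals the number of particles N, and it falls toward 1 as the weight concentrates on a single particle, which is why a threshold on N_eff is a convenient trigger for resampling.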

Optionally, obtaining a new particle set by sampling from the original particle set with the residual resampling method comprises: calculating the cumulative probability of the k-th step according to the following formula:

$$C_k^{(j)} = C_k^{(j-1)} + w_k^{(j)}, \quad j = 1, \ldots, N$$

in the above formula, $C_k^{(j)}$ is the cumulative probability of the first j particles at the k-th step, $C_k^{(j-1)}$ is the cumulative probability of the first j-1 particles at the k-th step, $w_k^{(j)}$ is the j-th weight in the original particle set $\{\theta_k^{(i)}, w_k^{(i)}\}_{i=1}^{N}$, and N is the number of particles in the particle set; generating a set $\{\xi_i\}_{i=1}^{N}$ of N random numbers uniformly distributed in the interval [0, 1]; for each random number $\xi_i$ in the set, finding the minimum value of j that satisfies $C_k^{(j)} \geq \xi_i$; if such a j is found, setting the new particle $\tilde{\theta}_k^{(i)} = \theta_k^{(j)}$ with weight $\tilde{w}_k^{(i)} = 1/N$, thereby obtaining the resampled particle set $\{\tilde{\theta}_k^{(i)}, \tilde{w}_k^{(i)}\}_{i=1}^{N}$; then sampling a new particle set from the resampled particle set with the Metropolis-Hastings algorithm, taking a normal distribution probability function as the transition probability of the Metropolis-Hastings sampling.
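The resampling steps listed above (cumulative weights, uniform random numbers, smallest index whose cumulative weight reaches the draw) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the function name `resample` is hypothetical, and the new weights are set to 1/N, which is the usual convention after resampling.

```python
import bisect
import itertools
import random


def resample(particles, weights, rng):
    """Draw N particles with probability proportional to their weights."""
    cumulative = list(itertools.accumulate(weights))  # C_1, ..., C_N
    total = cumulative[-1]
    n = len(particles)
    new_particles = []
    for _ in range(n):
        draw = rng.random() * total                # uniform number scaled to [0, C_N)
        j = bisect.bisect_left(cumulative, draw)   # smallest j with C_j >= draw
        new_particles.append(particles[min(j, n - 1)])
    return new_particles, [1.0 / n] * n


# Example: the heavy particle "b" is copied, the weightless ones disappear.
survivors, uniform_weights = resample(["a", "b", "c"], [0.0, 1.0, 0.0], random.Random(0))
```

Using a binary search over the cumulative weights keeps the cost of each draw logarithmic in the number of particles.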

Optionally, step 4) comprises:

4.1) initializing source item parameters and determining a parameter estimation range; initializing particle filter parameters and setting the value of step k to be 1;

4.2) obtaining the sensor reading $d_k(r_k)$ at the k-th step position $r_k = \{x_k, y_k\}$;

4.3) executing the particle filter update of the k-th step;

4.4) calculating the information entropy of each candidate position in the passable action set $Uj_k$ of the k-th step and in the action set $U_k$ of the k-th step, and obtaining the optimal passable action direction $uj_k^*$ and the optimal action direction $u_k^*$ according to the following formula:

$$uj_k^* = \arg\max_{uj_k \in Uj_k} I(uj_k), \qquad u_k^* = \arg\max_{u_k \in U_k} I(u_k)$$

in the above formula, $I(uj_k)$ is the information entropy of each candidate position in the passable action set $Uj_k$ of the k-th step, and $I(u_k)$ is the information entropy of each candidate position in the action set $U_k$ of the k-th step;

4.5) judging whether $uj_k^*$ and $u_k^*$ of the k-th step are equal; if they are equal, selecting $uj_k^*$ as the next position, moving, and jumping to step 4.7);

4.6) traversing m from 1 to the recent memory step number and counting the steps that satisfy

$$uj_{k-m}^* \neq u_{k-m}^* \;\&\; u_{k-m}^* = u_k^*$$

where $uj_{k-m}^*$ is the optimal passable action direction of step k-m and $u_{k-m}^*$ is the optimal action direction of step k-m; if the count reaches a preset threshold n, moving along the $uj_k^*$ direction to the next intersection and turning to the $u_k^*$ direction; otherwise, selecting $uj_k^*$ as the next position and moving; jumping to step 4.7);

4.7) judging whether a preset stopping condition is met, if so, ending the source searching, and skipping to the step 5); otherwise, jumping to step 4.2) and continuing iteration.
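The decision made in steps 4.5) and 4.6) can be sketched in Python as follows. This is a hedged illustration, not the claimed procedure: the function name, the `history` structure, and the return format are hypothetical, and the counting condition (a past step counts when the robot was blocked and wanted the same direction as now) is one reading of the text, with "reaches the threshold" taken as `>=`.

```python
def choose_move(best_passable, best_unconstrained, history, m, n):
    """Decide between a normal step and a long-distance detour.

    best_passable      -- uj*_k, best direction among the passable actions
    best_unconstrained -- u*_k, best direction ignoring obstacles
    history            -- list of (uj*, u*) pairs of earlier steps, most recent last
    m, n               -- memory length and blocking threshold
    Returns (mode, move_direction, turn_direction).
    """
    if best_passable == best_unconstrained:
        return ("step", best_passable, None)
    recent = history[-m:] if m > 0 else []
    # Count remembered steps where the robot was blocked from this same direction.
    blocked = sum(1 for uj, u in recent if uj != u and u == best_unconstrained)
    if blocked >= n:
        # Travel along uj*_k to the next intersection, then turn toward u*_k.
        return ("detour", best_passable, best_unconstrained)
    return ("step", best_passable, None)
```

For example, a robot that wanted to go "up" but could only go "left" in three remembered steps would, with n = 3, switch to a detour instead of taking a fourth sideways step.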

Optionally, in step 5), when the improved source search action scheme is combined with multi-robot control technology to generate the multi-robot source search action scheme based on the cognitive search strategy, each robot i among the multiple robots transmits its position $r_k^{i}$ and its sensor reading $d_k^{i}(r_k^{i})$ at each step to other nearby robots and receives the sensor readings collected by nearby robots; each robot i independently executes the same source search and maintains its own posterior probability density function $P_i(\theta_k|D_k)$ of the k-th step, and the posterior probability density functions $P_i(\theta_k|D_k)$ of the k-th step of all robots are of the same size.
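The information sharing described in step 5) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: "nearby" is modeled by a hypothetical communication radius `comm_range`, the function names are illustrative, and `likelihood(reading, position, particle)` stands for whatever detection model the robot uses.

```python
import math


def neighbours(index, positions, comm_range):
    """Indices of robots within communication range of robot `index`."""
    xi, yi = positions[index]
    return [j for j, (x, y) in enumerate(positions)
            if j != index and math.hypot(x - xi, y - yi) <= comm_range]


def fuse_readings(particles, weights, observations, likelihood):
    """Update one robot's particle weights with its own and received readings.

    observations -- list of (position, reading) pairs, own reading included
    likelihood   -- callable giving P(reading | particle) at a position
    """
    fused = list(weights)
    for position, reading in observations:
        fused = [w * likelihood(reading, position, p)
                 for w, p in zip(fused, particles)]
    total = sum(fused)
    if total <= 0.0:
        raise ValueError("all particles have zero likelihood")
    return [w / total for w in fused]
```

Because every robot keeps its own filter, a robot that temporarily has no neighbours in range simply updates with its own reading and continues the search.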

In addition, the invention also provides a multi-robot-source search scheme generation system based on the cognitive search strategy, which comprises a microprocessor and a memory which are connected with each other, wherein the microprocessor is programmed or configured to execute the steps of the multi-robot-source search scheme generation method based on the cognitive search strategy.

In addition, the invention also provides a computer readable storage medium, wherein a computer program programmed or configured to execute the cognitive search strategy-based multi-robot-source search scheme generation method is stored in the computer readable storage medium.

Compared with the prior art, the invention has the following advantages: the method comprises the steps of establishing a gas leakage model and a detection model of a sensor; modeling a leakage source searching process based on a gas leakage model and a detection model of a sensor; representing iterative updating of source item parameter estimation in a leakage source searching process by using a particle filtering method, and establishing a source searching action scheme based on a cognitive search strategy; improving the obstacle avoidance function of the source searching action scheme aiming at the actual obstacle scene; aiming at the combination of the improved source searching action scheme and the multi-robot control technology, a multi-robot source searching action scheme based on a cognitive search strategy is generated. The invention can realize the search of the gas leakage source of a plurality of robots, takes the uncertainty in the source searching process into consideration, utilizes a multi-robot cooperative algorithm to control a plurality of robots to cooperatively search the source in a scene with large spatial scale, can effectively improve the source searching efficiency and the source searching accuracy, and is suitable for the search scenes of various gas sources, including the source searching of the hazardous chemical leakage accidents in chemical industrial parks, the search of the gas leakage source in residential buildings and the like.

Drawings

FIG. 1 is a schematic diagram of a basic flow of a method according to an embodiment of the present invention.

Fig. 2 is a schematic diagram of a detection model of a sensor in an embodiment of the invention.

Fig. 3 is a schematic diagram of an obstacle avoidance action scheme according to an embodiment of the present invention.

FIG. 4 is a schematic flow chart of step 4) in the embodiment of the present invention.

FIG. 5 is a schematic diagram of an ideal multi-robot cooperative control action scenario in the embodiment of the present invention.

FIG. 6 is a schematic diagram illustrating a multi-robot cooperative control action scenario under the condition of blocked information communication according to an embodiment of the present invention.

FIG. 7 is a schematic flow chart of step 5) in the embodiment of the present invention.

Detailed Description

The following describes in detail a method for generating an action plan for controlling a plurality of robots to search for a gas leakage source based on a cognitive search strategy, taking a large obstacle scene as an example.

As shown in fig. 1, the method for generating a multi-robot-source search scheme based on a cognitive search policy in this embodiment includes:

1) establishing a gas leakage model and a detection model of a sensor;

In this embodiment, step 1) establishes a gas leakage model based on an atmospheric transport and diffusion model and establishes a detection model of the sensor to process the observed data. Step 2) models the leakage source search process by converting it into a partially observable Markov decision process and determining the information state, the available action set, and the reward function. Step 3) represents the iterative updating of the source term estimate with a particle filtering method, adds a resampling step to avoid the particle degeneracy problem, samples new particles with the Metropolis-Hastings algorithm, and then redesigns the reward function according to the distribution and weights of the particles to control the action of the robot. Step 4) takes a complex obstacle scene as the background, considers the influence of the road network constraint at the algorithm level, and introduces an intermittent search strategy to improve the robot action scheme based on the cognitive search strategy. Step 5) addresses the large spatial scale of a complex obstacle scene: considering the uncertainty in the source search process, it uses a multi-robot cooperative algorithm to control multiple robots to search cooperatively in a scene of large spatial scale.
Through the steps, the gas leakage source search of multiple robots can be realized, uncertainty in the source searching process is considered, the multiple robots are controlled to search the source in a large-space-scale scene in a collaborative mode through the multi-robot collaborative algorithm, the source searching efficiency and the source searching accuracy can be effectively improved, and the method is suitable for the requirements of searching multiple gas source scenes including chemical industry park hazardous chemical leakage accident source searching, gas leakage source search in residential buildings and the like.

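In this embodiment, the functional expression of the gas leakage model established in step 1) is:

$$0 = V\frac{\partial c(r|\theta_0)}{\partial y} + D\,\Delta c(r|\theta_0) - \frac{1}{\tau}c(r|\theta_0) + Q\,\delta(r - r_0) \tag{1}$$

in the above formula, $V$ is the average wind speed, $\partial c(r|\theta_0)/\partial y$ is the concentration gradient in the y-axis direction, $c(r|\theta_0)$ is the gas concentration at position $r = \{x, y\}$, $\theta_0 = \{r_0, Q\}$ is the source term parameter of the leakage source, $r_0 = \{x_0, y_0\}$ is the position of the leakage source, $Q$ is the diffusion intensity, $D$ is the effective diffusion coefficient of the gas, $\Delta c(r|\theta_0)$ is the Laplacian of the concentration, $\tau$ is the lifetime of the gas molecules, and $\delta$ is the Dirac function. The gas leakage model is derived from the steady-state advection-diffusion equation, in which $V$ is the average wind speed and the average wind direction coincides with the negative direction of the y-axis; the coordinate system is shown in Fig. 2, where the horizontal and vertical coordinates represent the size of the scene.

In this embodiment, the functional expression of the detection model of the sensor established in step 1) is:

$$P(d(r)|\theta_0) = \frac{R(r|\theta_0)^{d}\,e^{-R(r|\theta_0)}}{d!} \tag{2}$$

$$R(r|\theta_0) = 4\pi D a\,c(r|\theta_0) \tag{3}$$

$$c(r|\theta_0) = \frac{Q}{4\pi D\,|r - r_0|}\exp\left[-\frac{(y - y_0)V}{2D}\right]\exp\left[-\frac{|r - r_0|}{\lambda}\right]$$

in the above formulas, $P(d(r)|\theta_0)$ is the probability that the sensor contacts $d$ gas molecules per unit time at position $r = \{x, y\}$, $R(r|\theta_0)$ is the average number of contacts between the sensor and gas molecules per unit time, $a$ is the radius of the spherical sensor, and $c(r|\theta_0)$ is the gas concentration at position $r = \{x, y\}$, which is also the three-dimensional analytical solution of formula (1); $\theta_0 = \{r_0, Q\}$ is the source term parameter of the leakage source, $r_0 = \{x_0, y_0\}$ is the position of the leakage source, $Q$ is the diffusion intensity, $V$ is the average wind speed, and the functional expression of the intermediate variable $\lambda$ is:

$$\lambda = \sqrt{\frac{D\tau}{1 + \dfrac{V^{2}\tau}{4D}}}$$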

In the actual source search process, because of the limited accuracy of existing gas sensors, the detected concentration can differ greatly from the actual concentration. A detection model of the sensor is therefore needed to process the observed data. The contact process between the sensor and gas molecules can be treated by analogy with an electromagnetic phenomenon: the Smoluchowski formula is introduced to convert the concentration at any position into the average number of contacts between the sensor and gas molecules per unit time, which finally yields the model of the average number of contacts per unit time given in formula (3). Where the gas concentration is higher, the sensor contacts more gas molecules per unit time.
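The concentration field and the mean contact rate of formula (3) can be sketched in Python as follows. This is an illustrative sketch, not the embodiment's code: the function names are hypothetical, and the closed-form concentration used here, c = Q / (4πD|r - r₀|) · exp(-(y - y₀)V / 2D) · exp(-|r - r₀| / λ) with λ = sqrt(Dτ / (1 + V²τ / 4D)), is assumed to be the three-dimensional steady-state solution referred to in the text, with the mean wind blowing along the negative y-axis.

```python
import math


def decay_length(D, tau, V):
    """Intermediate variable lambda = sqrt(D*tau / (1 + V^2*tau / (4*D)))."""
    return math.sqrt(D * tau / (1.0 + V * V * tau / (4.0 * D)))


def concentration(r, r0, Q, D, tau, V):
    """Steady-state concentration c(r | theta0) for a source of strength Q at r0."""
    dist = math.hypot(r[0] - r0[0], r[1] - r0[1])
    if dist == 0.0:
        raise ValueError("the concentration is singular at the source position")
    return (Q / (4.0 * math.pi * D * dist)
            * math.exp(-(r[1] - r0[1]) * V / (2.0 * D))
            * math.exp(-dist / decay_length(D, tau, V)))


def mean_hit_rate(r, r0, Q, D, tau, V, a):
    """Smoluchowski rate R(r | theta0) = 4*pi*D*a*c(r | theta0), sensor radius a."""
    return 4.0 * math.pi * D * a * concentration(r, r0, Q, D, tau, V)
```

Evaluating `mean_hit_rate` at two points equally far from the source, one at lower y and one at higher y, shows the asymmetry introduced by the wind: the downwind point (lower y) receives the higher rate.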

At the same time, turbulence disturbs the concentration field, so that the sensor obtains only sporadic, intermittent valid readings, and this effect must be taken into account. A Poisson process is introduced into the detection model of the sensor to approximate the influence of turbulence on gas diffusion. A Poisson process is a random process defined by the occurrence times of events. A random process $N(t)$ is a time-homogeneous one-dimensional Poisson process if it satisfies the following two conditions: first, the numbers of events occurring in two disjoint intervals are mutually independent random variables; second, the probability distribution of the number of events occurring within the interval $[t, t + \tau]$ is:

$$P\big(N(t + \tau) - N(t) = n\big) = \frac{e^{-\mu\tau}(\mu\tau)^{n}}{n!}, \quad n = 0, 1, 2, \ldots$$

in the above equation, $\mu$ is a positive number generally referred to as the arrival rate.

In this embodiment, the average number of contacts between the sensor and gas molecules per unit time is used as the arrival rate, i.e. $\mu = R(r|\theta_0)$, so the probability that the sensor contacts $d$ molecules per unit time at position $r = \{x, y\}$ is given by formula (2). Fig. 2 is a schematic diagram of the detection model of the sensor in this embodiment and shows the detection values of the sensor in one case scenario: Fig. 2 (a) shows the detection values before the turbulence effect is considered, and Fig. 2 (b) shows the detection values after the turbulence effect is considered.
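The Poisson detection model can be sketched in Python as follows. This is a minimal sketch: the function names are hypothetical, the observation interval is taken as one unit of time, and the sampler inverts the cumulative distribution directly, which is only suitable for moderate rates.

```python
import math
import random


def hit_probability(d, rate):
    """Probability of d contacts in unit time for a mean contact rate `rate`."""
    return rate ** d * math.exp(-rate) / math.factorial(d)


def sample_reading(rate, rng):
    """Draw one intermittent sensor reading from the Poisson model."""
    draw = rng.random()
    d = 0
    p = math.exp(-rate)      # P(d = 0)
    cdf = p
    # Walk up the CDF; stop if p underflows so the loop always terminates.
    while draw > cdf and p > 0.0:
        d += 1
        p *= rate / d        # P(d) = P(d - 1) * rate / d
        cdf += p
    return d


# Example: far from the source the rate is low and most readings are zero.
readings = [sample_reading(0.2, random.Random(seed)) for seed in range(10)]
```

The many zero readings produced at low rates are the sporadic, intermittent detections that the turbulence discussion above refers to.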

In this embodiment, step 2) includes: the source term parameter to be estimated is $\theta_0 = \{r_0, Q\}$, and its estimate is represented by a probability density function. The information state after any k-th step is represented by the posterior probability density function $P(\theta_k|D_k)$ of the k-th step, which is determined by all the information collected in the first k steps, where $\theta_k$ is the source term parameter estimated at the k-th step, $D_k = \{d_1(r_1), d_2(r_2), \ldots, d_k(r_k)\}$ denotes all the information collected up to the k-th step, $r_k = \{x_k, y_k\}$ is the position at the k-th step, and $d_1(r_1)$ to $d_k(r_k)$ are the sensor readings obtained in steps 1 to k respectively. The initial probability density function $P(\theta_0)$ is preset from prior knowledge, and the posterior probability density function $P(\theta_k|D_k)$ of any k-th step is updated with the Bayesian formula:

$$P(\theta_k|D_k) = \frac{P(\theta_k|D_{k-1})\,P(d_k(r_k)|\theta_k)}{P(d_k(r_k)|D_{k-1})}$$

in the above formula, $P(\theta_k|D_{k-1})$ is the posterior probability density of step k-1, $P(d_k(r_k)|\theta_k)$ is the density weight, and $P(d_k(r_k)|D_{k-1})$ is a normalization factor. The next position is selected according to:

$$u^* = \arg\max_{u \in U} I(u)$$

in the above formula, $u^*$ is the next position to be selected, $u$ is a candidate position corresponding to an element of the selectable action set $U$, $I(u)$ is the information gain obtainable at the candidate position, and argmax selects the position at which the information gain $I(u)$ is maximal. When the source search starts, no source term information has been collected, so the initial probability density function $P(\theta_0)$ of the source term parameters is obtained from prior knowledge. It is assumed here that the parameter estimation ranges can be determined from prior knowledge and that the initial probability density function is a uniform distribution. The cognitive search algorithm moves the source-searching robot by a fixed step length at each step. In this embodiment only the four-connected case is considered, and the selectable action set is $U = \{\uparrow, \downarrow, \leftarrow, \rightarrow\}$. At each step the source-searching robot uses a reward function to calculate the information gain $I(u)$ obtainable at each candidate position in the selectable action set and selects the position $u^*$ with the maximum information gain as the next position.
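The fixed-step, four-connected action selection can be sketched in Python as follows. This is a hedged sketch: the names are hypothetical, and because the text here does not give the reward function in closed form, `predictive_entropy` is an Entrotaxis-style stand-in (the entropy of the reading predicted by the weighted particles) and is an assumption, not the embodiment's reward.

```python
import math

# Four-connected action set; the direction names are illustrative.
ACTIONS = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}


def predictive_entropy(rates, weights, d_max=30):
    """Entropy of the predicted reading P(d) = sum_i w_i * Poisson(d; rate_i).

    rates[i] is the mean contact rate that particle i predicts at one candidate
    position; readings above d_max are ignored (truncation assumption).
    """
    entropy = 0.0
    for d in range(d_max + 1):
        p_d = sum(w * r ** d * math.exp(-r) / math.factorial(d)
                  for r, w in zip(rates, weights))
        if p_d > 0.0:
            entropy -= p_d * math.log(p_d)
    return entropy


def select_action(position, step, information_gain):
    """Return (action name, next position) maximizing I(u) over the action set."""
    x, y = position
    candidates = {name: (x + dx * step, y + dy * step)
                  for name, (dx, dy) in ACTIONS.items()}
    best = max(candidates, key=lambda name: information_gain(candidates[name]))
    return best, candidates[best]
```

In use, `information_gain` would wrap a reward such as `predictive_entropy`, evaluating for each candidate position the contact rates predicted by the current particles.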

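In this embodiment, step 3) includes: for any k-th step, sampling the posterior probability density function $P(\theta_k|D_k)$ to generate N weighted random samples $\{\theta_k^{(i)}, w_k^{(i)}\}_{i=1}^{N}$, where $w_k^{(i)}$ is the weight corresponding to particle $\theta_k^{(i)}$ and the N weights $w_k^{(i)}$ sum to 1, and using the N weighted random samples as the N particles of the particle filtering method to approximate $P(\theta_k|D_k)$.

In this embodiment, obtaining a new particle set from the original particle set with the residual resampling method includes: calculating the cumulative probability of the k-th step according to the following formula:

$$C_k^{(j)} = C_k^{(j-1)} + w_k^{(j)}, \quad j = 1, \ldots, N$$

in the above formula, $C_k^{(j)}$ is the cumulative probability of the first j particles at the k-th step, $C_k^{(j-1)}$ is the cumulative probability of the first j-1 particles at the k-th step, and $w_k^{(j)}$ is the j-th weight in the original particle set $\{\theta_k^{(i)}, w_k^{(i)}\}_{i=1}^{N}$; for each random number $\xi_i$ in a set $\{\xi_i\}_{i=1}^{N}$ of N random numbers uniformly distributed in [0, 1], finding the minimum value of j that satisfies $C_k^{(j)} \geq \xi_i$; if such a j is found, setting the new particle $\tilde{\theta}_k^{(i)} = \theta_k^{(j)}$ with weight $\tilde{w}_k^{(i)} = 1/N$, thereby obtaining the resampled particle set $\{\tilde{\theta}_k^{(i)}, \tilde{w}_k^{(i)}\}_{i=1}^{N}$; then sampling a new particle set from the resampled particle set with the Metropolis-Hastings algorithm.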

Step 3) comprises particle filtering and resampling.

During particle filtering, the iterative updating of the source term estimate is expressed with the particle filtering method. The posterior probability density function $P(\theta_k|D_k)$ of the k-th step is sampled to generate N weighted random samples $\{\theta_k^{(i)}, w_k^{(i)}\}_{i=1}^{N}$, where the weight corresponding to particle $\theta_k^{(i)}$ is $w_k^{(i)}$, and $P(\theta_k|D_k)$ is approximated with these N particles as shown in formula (9):

$$P(\theta_k|D_k) \approx \sum_{i=1}^{N} w_k^{(i)}\,\delta\big(\theta - \theta_k^{(i)}\big) \tag{9}$$

In the update step, when the sensor obtains the reading $d_k$ at position $r_k$ in the k-th step, the particle weights are updated as:

$$w_k^{(i)} = \frac{w_{k-1}^{(i)}\,P(d_k(r_k)|\theta_k^{(i)})}{\sum_{j=1}^{N} w_{k-1}^{(j)}\,P(d_k(r_k)|\theta_k^{(j)})}$$

It can be seen that the larger the probability that a particle's estimated source parameters generate the reading $d_k$, the larger the updated weight of that particle. After several iterations, the weight becomes concentrated on a few particles that may be close to the true source parameters, which leads to the particle degeneracy problem. The effective sample size $N_{eff}$ is typically used to measure the degree of degeneracy of the particle filter:

$$N_{eff} = \frac{1}{\sum_{i=1}^{N}\big(w_k^{(i)}\big)^{2}}$$

in the above formula, N is the number of weighted random samples and $w_k^{(i)}$ is the weight corresponding to the i-th particle. When $N_{eff}$ falls below a set threshold $\eta$, a residual resampling method is adopted to solve the particle degeneracy problem. A new particle set $\{\tilde{\theta}_k^{(i)}, \tilde{w}_k^{(i)}\}_{i=1}^{N}$ is obtained by sampling from the existing particle set $\{\theta_k^{(i)}, w_k^{(i)}\}_{i=1}^{N}$. The main steps are: calculating the cumulative probability $C_k$, with $C_k^{(j)} = C_k^{(j-1)} + w_k^{(j)}$; generating N random numbers $\{\xi_i\}_{i=1}^{N}$ uniformly distributed in the interval [0, 1]; for each $\xi_i$, finding the minimum j such that $C_k^{(j)} \geq \xi_i$, and then letting $\tilde{\theta}_k^{(i)} = \theta_k^{(j)}$ and $\tilde{w}_k^{(i)} = 1/N$.

With this method, particles with larger weights are copied several times, and particles with small weights, which contribute little to the calculation of the posterior probability function, are eliminated. The resampling step, however, causes a loss of particle diversity: after several iterations, all particles may become copies of a few particles. Furthermore, updating the particle filter alone cannot reduce or eliminate the error of the initial particles' point estimates of the source term parameters. The diversity of the particles therefore needs to be increased, and after the resampling step a new particle set is sampled from the resampled particle set with the Metropolis-Hastings algorithm, with a normal distribution probability function adopted as the transition probability.
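The Metropolis-Hastings step that restores particle diversity after resampling can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the names are hypothetical, each particle is a list of parameter values, the proposal is an independent normal perturbation per parameter with standard deviations `sigmas`, and `log_posterior` is assumed to return the log of the unnormalized posterior (prior times the likelihood of all readings so far).

```python
import math
import random


def mh_move(particle, log_posterior, sigmas, rng):
    """One Metropolis-Hastings move with a symmetric normal proposal."""
    proposal = [x + rng.gauss(0.0, s) for x, s in zip(particle, sigmas)]
    log_ratio = log_posterior(proposal) - log_posterior(particle)
    # 1 - random() lies in (0, 1], so the logarithm is always defined.
    if math.log(1.0 - rng.random()) < log_ratio:
        return proposal          # accept the perturbed particle
    return list(particle)        # reject and keep the current particle


def rejuvenate(particles, log_posterior, sigmas, rng):
    """Apply one MH move to every particle of a resampled set."""
    return [mh_move(p, log_posterior, sigmas, rng) for p in particles]
```

Because the normal proposal is symmetric, the acceptance test only needs the ratio of posterior values; duplicated particles produced by resampling are spread out while still being distributed according to the posterior.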

Under the road network constraint, the set of actions available to the source searching robot at the k-th step is the passable action set Uj_k, and the robot can only select its next position from Uj_k. However, this may cause the robot to wander in one area without being able to move toward the source. In this case, following the intermittent search principle, if within the previous m steps the robot has been unable to move in the same direction n times because of the road network constraint, it makes a long-distance steering movement: it leaves the area and, at the nearest intersection, turns into the direction that the road network constraint had blocked, thereby moving toward the source. The process is shown in fig. 3, where the orientation is marked, the solid black line represents the robot trajectory, the blue rectangles represent obstacles, and the red dots represent the positions at which the robot's sensor is working. The blue circle in the left picture represents the robot's position, and the dotted arrow points in the direction in which the robot would move without the road network constraint. It can be seen that, because the obstacle blocks it, the robot repeatedly cannot move forward and can only move left or right; it therefore uses a long-distance steering movement to bypass the obstacle, and the sensor does not work during this period.

As shown in fig. 4, step 4) in this embodiment includes:

4.3) performing the particle filter update for the k-th step;

4.4) calculating the information entropy of each candidate position in the passable action set Uj_k of the k-th step and in the action set U_k of the k-th step, and calculating the optimal passable action direction uj_k^* and the optimal action direction u_k^* of the k-th step according to the following formula:

4.6) traversing the recent memory steps from 1 to m and counting the number of steps that satisfy

&
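
The trigger for the long-distance steering movement can be illustrated with a small Python sketch. The exact condition of step 4.6) is not recoverable from the text, so the test used here is an assumption derived from the description of being unable to move in the same direction n times within the previous m steps: a remembered step is counted when its optimal direction equals the current optimal direction but differed from the passable direction actually available. The names `StepRecord` and `should_long_turn`, and the use of "at least n" as the threshold test, are hypothetical.

```python
from typing import NamedTuple


class StepRecord(NamedTuple):
    best_passable: str  # uj*: best direction allowed by the road network
    best_free: str      # u*: best direction ignoring the road network


def should_long_turn(history, current, m, n):
    """Return True if the currently preferred direction was blocked at
    least n times within the last m remembered steps (assumed trigger)."""
    recent = list(history)[-m:] if m > 0 else []
    count = sum(
        1
        for rec in recent
        # Counted: same preferred direction as now, and it was not passable.
        if rec.best_free == current.best_free
        and rec.best_passable != rec.best_free
    )
    return count >= n
```

For example, a robot that preferred "N" but could only move "W" in three of its last five steps would trigger the long-distance steering movement with m＝5 and n＝3 under this reading.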

As shown in FIGS. 5 and 6, in the multi-robot collaborative search algorithm it is assumed that each robot i in the team (composed of N robots) individually executes the aforementioned Entrotaxi-Turn source searching algorithm and maintains its own source term estimate P_i(θ_k|D_k), which describes where the source may be located. While searching for the source, each robot transmits its position r_k^i and the sensor reading d_k^i obtained there to nearby robots. Each robot may receive the information collected by nearby robots and use it to update its estimate of the source term. Each robot decides where to move next from its own probability estimation function P_i(θ_k|D_k). Without communication delay or information loss, the probability estimates of all robots for the source location are the same, i.e., P₁(θ_k|D_k)＝P₂(θ_k|D_k)＝...＝P_N(θ_k|D_k). This means that, at the algorithm level, all robots share the same particle filter to estimate the source position, and that particle filter can estimate the positions to which all robots will move. FIG. 5 shows the ideal case, in which P₁(θ_k|D_k)＝P₂(θ_k|D_k)＝P₃(θ_k|D_k); FIG. 6 shows the non-ideal case in which communication is blocked, so that P₁(θ_k|D_k)≠P₂(θ_k|D_k)＝P₃(θ_k|D_k).
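
The two cases of FIGS. 5 and 6 can be reproduced with a toy Python sketch in which each robot keeps a discrete posterior over a few candidate source positions and applies a Bayesian update for every measurement it receives. The sensor likelihood used here is a hypothetical stand-in and not the detection model of the embodiment; the names `likelihood`, `bayes_update`, and `run_team` are likewise assumptions made for illustration.

```python
import math


def likelihood(reading, position, source):
    """Hypothetical sensor model: the expected reading decays with distance."""
    expected = 10.0 / (1.0 + math.dist(position, source))
    return math.exp(-0.5 * (reading - expected) ** 2)


def bayes_update(posterior, candidates, reading, position):
    """Multiply by the likelihood of one measurement and renormalize."""
    unnorm = [
        p * likelihood(reading, position, s)
        for p, s in zip(posterior, candidates)
    ]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def run_team(candidates, measurements, received):
    """measurements: list of (sender, position, reading) in broadcast order.
    received[i]: set of senders whose messages reach robot i."""
    n = len(candidates)
    posteriors = {i: [1.0 / n] * n for i in received}
    for sender, position, reading in measurements:
        for i in received:
            if sender in received[i]:
                posteriors[i] = bayes_update(
                    posteriors[i], candidates, reading, position
                )
    return posteriors
```

When every robot receives every message, all posteriors are updated with the same data in the same order and remain identical; if robot 1 only hears its own readings, its posterior diverges from those of robots 2 and 3, which still agree with each other.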

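In this embodiment, when the improved source searching action scheme is combined with the multi-robot control technology in step 5) to generate the multi-robot source search action scheme based on the cognitive search strategy, each robot i of the multiple robots transmits its own position r_k^i and sensor reading d_k^i at each step to other nearby robots, receives the sensor readings collected by nearby robots, independently executes the same source search, and maintains its posterior probability density function P_i(θ_k|D_k) of the k-th step; the posterior probability density functions P_i(θ_k|D_k) of the k-th step of all robots are the same.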

As shown in fig. 7, step 5) in this embodiment includes:

5.1) initializing the source term parameters and determining the parameter estimation range; initializing the particle filter parameters; initializing the step counter k to 1;

5.2) obtaining the sensor reading d_k(r_k) at the position r_k＝{x_k, y_k} of the k-th step;

5.3) performing the particle filter update for the k-th step;

5.4) calculating the information entropy of each candidate position in the passable action set Uj_k of the k-th step and in the action set U_k of the k-th step, and calculating the optimal passable action direction uj_k^* and the optimal action direction u_k^* of the k-th step according to the following formula:

In the above formula, I(uj_k) is the information entropy of each candidate position in the passable action set Uj_k of the k-th step, and I(u_k) is the information entropy of each candidate position in the action set U_k of the k-th step;

5.5) if uj_k^* and u_k^* of the k-th step are equal, selecting uj_k^* as the next position and moving, then jumping to step 5.7);

5.6) traversing the recent memory steps from 1 to m and counting the number of steps that satisfy

&

the resulting number being denoted count, where uj_{k-m}^* is the optimal passable action direction of step k-m and u_{k-m}^* is the optimal action direction of step k-m; if count reaches the preset threshold n, selecting a direction away from the other robots to move in, and turning to the u_k^* direction at the next intersection; otherwise, selecting uj_k^* as the next position and moving, then jumping to step 5.7);

5.7) judging whether a preset stopping condition is met; if so, ending the source search and exiting step 5); otherwise, jumping to step 5.2).

In this embodiment, increasing the number of robots reduces the time needed to find the source position and also reduces the variance of the source position estimate, because a larger number of robots effectively raises the detection rate of the team. A source searching team consisting of N robots can therefore complete the source search with fewer particle filter iterations, which shortens the source searching time and improves the source searching success rate.
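
Step 5.6) does not spell out how "a direction away from the other robots" is chosen. One possible reading, sketched here in Python purely as an assumption, is to pick, among the passable directions, the one whose next position lies farthest from the nearest teammate. The function name `direction_away_from_team`, the four grid directions in `MOVES`, and the unit step length are hypothetical.

```python
import math

# Assumed four-neighbour moves on a grid-like road network.
MOVES = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}


def direction_away_from_team(position, passable, teammates, step=1.0):
    """Among the passable directions, pick the one that maximizes the
    distance to the closest teammate (ties broken by order of `passable`)."""
    if not passable:
        raise ValueError("no passable direction")

    def nearest_distance(direction):
        dx, dy = MOVES[direction]
        nxt = (position[0] + dx * step, position[1] + dy * step)
        # With no teammates, every direction is treated as equally good.
        return min((math.dist(nxt, t) for t in teammates), default=math.inf)

    return max(passable, key=nearest_distance)
```

A robot at the origin with one teammate to its east and one to its west, and with "N", "E" and "W" passable, would move north under this rule, since that step keeps it farthest from both teammates.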

In summary, the method of this embodiment comprises: constructing a gas leakage model and a sensor detection model according to an atmospheric transport and diffusion model; modeling the leakage source searching process; establishing a source searching action scheme based on a cognitive search strategy; improving the obstacle avoidance function of the action scheme for actual obstacle scenes; and, for large-scale obstacle scenes, combining the action scheme with the multi-robot control technology to generate a multi-robot source search action scheme based on the cognitive search strategy. The method is applicable to a wide range of scenes, achieves high search efficiency and success rate, and is robust in complex environments. The key to this embodiment is the method for generating a multi-robot source search action scheme based on a cognitive search strategy, which provides an action scheme for the gas source search problem in large obstacle scenes. The method establishes a gas leakage model and, to account for the limited accuracy of the robot sensor, a detection model. The whole source searching process is modeled on the basis of the cognitive search strategy, an obstacle avoidance algorithm is designed, and a corresponding multi-robot control action scheme is provided for large-scale scenes. Compared with existing autonomous source searching action schemes, the method adapts to a wider range of scenes and significantly improves search efficiency and success rate in large obstacle scenes, which makes it more practical in real applications.

In addition, the embodiment also provides a system for generating the multi-robot-source search scheme based on the cognitive search strategy, which comprises a microprocessor and a memory which are connected with each other, wherein the microprocessor is programmed or configured to execute the steps of the method for generating the multi-robot-source search scheme based on the cognitive search strategy.

In addition, the present embodiment also provides a computer-readable storage medium, in which a computer program is stored, which is programmed or configured to execute the aforementioned multi-robot-source search scheme generation method based on the cognitive search strategy.

As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-readable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein. The present application is described with reference to flowcharts and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application, wherein instructions executed by a processor of a computer or other programmable data processing apparatus create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer-implemented process, such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

The above description is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above embodiments, and all technical solutions belonging to the idea of the present invention belong to the protection scope of the present invention. It should be noted that modifications and embellishments within the scope of the invention may occur to those skilled in the art without departing from the principle of the invention, and are considered to be within the scope of the invention.

Claims

1. A multi-robot-source search scheme generation method based on a cognitive search strategy is characterized by comprising the following steps:

1) establishing a gas leakage model and a detection model of a sensor;

2. The method for generating the multi-robot-source search scheme based on the cognitive search strategy as claimed in claim 1, wherein the functional expression of the gas leakage model established in the step 1) is as follows:

in the above formula, V is the average wind speed,

3. The method for generating the multi-robot-source search scheme based on the cognitive search strategy as claimed in claim 2, wherein the functional expression of the detection model of the sensor established in the step 1) is as follows:

R(r|θ₀)＝4πDac(r|θ₀)

4. The multi-robot-source search scheme generation method based on the cognitive search strategy as claimed in claim 3, wherein the step 2) comprises: the source term parameters to be estimated are θ₀＝{r₀, Q}, and the estimate of the source term parameters is represented by a probability density function; the posterior probability density function P(θ_k|D_k), which represents the information state obtained after any k-th step, is determined by all the information collected in the first k steps, where θ_k is the source term parameter estimated at the k-th step, D_k＝{d₁(r₁), d₂(r₂), …, d_k(r_k)} denotes all the information collected up to the k-th step, at which the position is r_k＝{x_k, y_k}, and d₁(r₁)～d_k(r_k) are the sensor readings obtained in steps 1 to k, respectively; the initial posterior probability density function P(θ₀) is preset from prior knowledge, and the posterior probability density function P(θ_k|D_k) of any k-th step is updated using the Bayesian formula according to the following formula:

5. The multi-robot-source search scheme generation method based on the cognitive search strategy as claimed in claim 4, wherein the step 3) comprises: for any k-th step, sampling from the posterior probability density function P(θ_k|D_k) to generate N weighted random samples {x_k^i, w_k^i}_{i＝1:N}, where x_k^i represents the source term parameter estimate of the i-th particle at the k-th step, w_k^i is the weight corresponding to particle x_k^i, and the N weights w_k^i sum to 1; the N weighted random samples {x_k^i, w_k^i}_{i＝1:N} are used as the N particles of the particle filtering method, and the posterior probability density function P(θ_k|D_k) of the k-th step is approximated as:

6. The method for generating the multi-robot-source search scheme based on the cognitive search strategy according to claim 5, wherein obtaining the new particle set from the original particle set by using the residual resampling method comprises: calculating the cumulative probability of the k-th step according to the following formula:

$$C_k^j = C_k^{j-1} + w_k^j$$

In the above formula, C_k^j is the cumulative probability of the first j weights of the k-th step, C_k^{j-1} is the cumulative probability of the first j-1 weights of the k-th step, w_k^j is the j-th weight in the original particle set, and N is the number of particles in the particle set; generating a set {μ_k^i}_{i＝1:N} of N random numbers uniformly distributed in the interval [0,1]; for each random number μ_k^i in the set {μ_k^i}_{i＝1:N}, finding the minimum value of j such that

$$C_k^j \ge \mu_k^i$$

and, if such a value of j is found, letting the i-th particle of the new particle set equal the j-th particle of the original particle set, with weight w_k^i＝1/N, thereby obtaining the resampled particle set {(θ_k^i, w_k^i)}_{i＝1:N}; then, for the resampled particle set {(θ_k^i, w_k^i)}_{i＝1:N}, sampling a new particle set using the Metropolis-Hastings algorithm, with a normal distribution probability function as the transition probability of the Metropolis-Hastings sampling.

7. The multi-robot-source search scheme generation method based on the cognitive search strategy as claimed in claim 6, wherein the step 4) comprises:

4.3) performing the particle filter update for the k-th step;

4.6) traversing the recent memory steps from 1 to m and counting the number of steps that satisfy

8. The method as claimed in claim 7, wherein, when the improved source searching action scheme is combined with the multi-robot control technology in the step 5) to generate the multi-robot source search action scheme based on the cognitive search strategy, each robot i of the multiple robots transmits its own position r_k^i and sensor reading d_k^i at each step to other nearby robots, receives the sensor readings collected by nearby robots, independently executes the same source search, and maintains the posterior probability density function P_i(θ_k|D_k) of the k-th step, and the posterior probability density functions P_i(θ_k|D_k) of the k-th step of all robots are the same.

9. A multi-robot-source search scheme generation system based on a cognitive search strategy, comprising a microprocessor and a memory connected to each other, wherein the microprocessor is programmed or configured to perform the steps of the multi-robot-source search scheme generation method based on the cognitive search strategy according to any one of claims 1 to 8.

10. A computer-readable storage medium having stored thereon a computer program programmed or configured to perform the multi-robot-source search scheme generation method based on the cognitive search strategy according to any one of claims 1 to 8.