CN114043476A

CN114043476A - Swarm robot control method based on particle swarm algorithm under rejection environment

Info

Publication number: CN114043476A
Application number: CN202111301771.0A
Authority: CN
Inventors: 张军旗; 刘欢; 王成; 臧笛; 刘春梅; 康琦
Original assignee: Tongji University
Current assignee: Tongji University
Priority date: 2021-11-04
Filing date: 2021-11-04
Publication date: 2022-02-15
Anticipated expiration: 2041-11-04
Also published as: CN114043476B

Abstract

The invention relates to a swarm robot control method based on a particle swarm algorithm in a denial environment, comprising the following steps: step 1, establishing an attack-defense confrontation scene in a denial environment and initializing the parameters of the particle swarm algorithm; step 2, each attacking robot detects surrounding environment information through its sensors, acquires situation information about friend and enemy robots, and calculates the position of the enemy territory in real time using inertial navigation; step 3, the attacking robot constructs a fitness function; step 4, the fitness function is optimized with the particle swarm algorithm to obtain the optimal position of the attacking robot; step 5, the attacking robot moves and attacks; step 6, if any attacking robot enters the enemy territory, the task is completed; otherwise, it is judged whether the maximum running time has been reached; if so, the task fails; if not, the method returns to step 2 for the next time slice. Compared with the prior art, the method avoids global positioning, requires no pre-training, and avoids the curse of dimensionality.

Description

Swarm robot control method based on particle swarm algorithm under rejection environment

Technical Field

The invention relates to the field of cooperative and game-confrontation control among multi-agent clusters, and in particular to a swarm robot control method based on a particle swarm algorithm in a denial environment.

Background

With the rapid development and maturing application of intelligent unmanned technology, cooperative combat among unmanned devices has become possible, and cooperative attack-defense confrontation between unmanned clusters is gradually becoming an important mode of future warfare. As an application carrier of multi-agent technology, an unmanned cluster judges the surrounding situation by sensing the environment and, following a given attack-defense strategy, takes actions such as concentrated-fire attack, retreat when damaged, obstacle avoidance, intra-group collision avoidance, dispersion, concentration, cooperation and assistance to carry out the confrontation.

The cooperative attack-defense confrontation of swarm robots can be described as an optimal decision problem under complex multi-constraint conditions, the most classical instance being the territory-defense problem. In this problem, the confrontation environment consists of two multi-agent groups, the intruders and the defenders. The intruders try to get as close to the territory as possible and enter it, while the defenders try to intercept the intruders as far from the territory as possible. The quality of the situation in the confrontation environment depends on the relationship among the intruders, the defenders and the territory. Because the state space of the multi-agent attack-defense task is high-dimensional, the strategy solution space grows exponentially with the number of entities, the situation is complex and changes quickly, and attack-defense strategies are varied, the problem is difficult to solve and an efficient decision algorithm is needed.

At present, the most popular swarm confrontation methods are multi-agent deep reinforcement learning methods. However, such algorithms require extensive pre-training, are limited by the curse of dimensionality, and rely on accurate global positioning and communication, so they cannot achieve effective cooperation and confrontation in a denial environment.

Disclosure of Invention

The invention aims to overcome the defects of the prior art by providing a swarm robot control method based on a particle swarm algorithm in a denial environment. The control method addresses the bottlenecks of most existing multi-agent algorithms, namely their reliance on global positioning and communication and their limitation by the curse of dimensionality.

The purpose of the invention can be realized by the following technical scheme:

The invention provides a swarm robot control method based on a particle swarm algorithm in a denial environment, comprising the following steps:

step 1, establishing an attack-defense confrontation scene in a denial environment, and initializing the parameters of the particle swarm algorithm;

step 2, each attacking robot detects surrounding environment information through its sensors, acquires situation information about the friend robots and the enemy robots, and calculates the position of the enemy territory in real time using inertial navigation;

step 3, the attacking robot uses the detected surrounding environment information to construct a fitness function containing enemy robot state information, friend robot state information and enemy territory information;

step 4, the fitness function is optimized with the particle swarm algorithm to obtain the optimal position of the attacking robot, which guides its movement and attack in the next time slice;

step 5, the attacking robot carries out moving and attacking operations;

step 6, if any attacking robot enters the enemy territory, the task is completed; otherwise, it is judged whether the maximum running time has been reached; if so, the task fails; if not, the method returns to step 2 for the next time slice.
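The six steps above form a loop over time slices. The following Python sketch illustrates only that outer loop and the termination test of step 6; it is not the patented implementation. The names `run_mission`, `plan_and_move`, `territory_radius` and `max_slices` are hypothetical, the territory is assumed to be a disc, and the success check uses simulator-side global coordinates purely for bookkeeping (the robots themselves are assumed to work in their own local frames).

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float]


def run_mission(
    attackers: List[Point],
    territory: Point,
    territory_radius: float,
    max_slices: int,
    plan_and_move: Callable[[int, List[Point]], Point],
) -> bool:
    """Return True if any attacker enters the territory before time runs out."""
    for _ in range(max_slices):
        # Steps 2-5 (sense, build fitness, optimize, move) are delegated to the
        # hypothetical callback, applied to every attacker for this time slice.
        attackers = [plan_and_move(i, attackers) for i in range(len(attackers))]
        # Step 6: the task is completed as soon as one attacker is inside.
        for x, y in attackers:
            dx, dy = x - territory[0], y - territory[1]
            if dx * dx + dy * dy <= territory_radius ** 2:
                return True
    # Maximum running time reached without entering the territory: task failed.
    return False
```

In a fuller simulation, `plan_and_move` would wrap the sensing, fitness construction, particle swarm optimization and motion steps for one robot.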

Preferably, establishing the attack-defense confrontation scene in the denial environment and initializing the parameters of the particle swarm algorithm in step 1 specifically comprises: initializing the positions of the N attacking robots, the positions of the M defending robots and the GPS coordinate U of the enemy territory; and initializing the particle swarm algorithm parameters, namely the initial number of particles, the acceleration factors c₁ and c₂, the inertia weight w and the problem dimension D;

all robots have the same attributes and P points of blood volume (health); when a robot is attacked its blood volume is reduced by P, and when its blood volume is less than or equal to 0 the robot is destroyed.

Preferably, step 2 specifically comprises: each attacking robot constructs a coordinate system from its own position and moving direction, detects surrounding environment information through its sensors to obtain the coordinates of the friend robots and the enemy robots, and calculates the position of the enemy territory in real time using inertial navigation.

Preferably, in step 3, the attacking robot constructs its fitness function from the detected surrounding environment information, specifically: attacking robot A_i constructs a fitness function F_i from the enemy robot state information, the friend robot state information and the enemy territory information, expressed as:

F_i = f₁ + f₂ + f₃

where f₁ is the confrontation fitness constructed from the state information of the enemy robots, f₂ is the cooperative fitness constructed from the state information of the friend robots, and f₃ is the enemy-territory fitness.

Preferably, the confrontation fitness function f₁ is expressed as:

where Ψ is the set of indices of all enemy robots within the neighborhood R₁ of attacking robot A_i; f_Bk is a sub-fitness function generated from the state information of the k-th enemy robot; Q_k is the coordinates of the k-th enemy robot; x = (x¹, x²) is the independent variable, representing the coordinates of a position in the rectangular coordinate system of the attacking robot; σ₁ and w₁ define the width and the amplitude of the Gaussian-like model, respectively; S(i, k) is an index that measures whether the situation of the i-th attacking robot A_i is better than that of enemy robot B_k and thus determines whether A_i advances toward B_k or retreats from it, and is expressed as:

the situation of a robot is calculated from the number and the total blood volume of its friend robots, where "-1" and "1" indicate that attacking robot A_i is in an unfavorable and a favorable situation, respectively; N_i and its associated total blood volume value are the number and total blood volume of all friend robots within the attack range R₀ of attacking robot A_i; M_k and its associated total blood volume value are the number and total blood volume of all enemy robots within the attack range R₀ of enemy robot B_k;

when S(i, k) = -1, f_Bk is a valley-shaped function, meaning that the farther attacking robot A_i is from enemy robot B_k, the higher the fitness of A_i;

when S(i, k) = 1, f_Bk is a peak-shaped function, meaning that the closer attacking robot A_i is to enemy robot B_k, the higher the fitness of A_i.

Preferably, the cooperative fitness function f₂ of attacking robot A_i is expressed as:

where Φ is the set of indices of all friend robots within the neighborhood R₁ of attacking robot A_i; f_Ak is a sub-fitness function generated from the state information of the k-th attacking robot, in which P_k is the coordinates of the k-th attacker; x = (x¹, x²) is the independent variable, representing the coordinates of a position in the rectangular coordinate system of the attacking robot; σ₂ and w₂ define the width and the amplitude of the Gaussian-like model, respectively;

if attacking robot A_i has been at a disadvantage against the surrounding enemy robots throughout τ time slices, A_i needs to break free of the constraint of its companions and leave the group; the cooperative fitness function f₂ is then set to 0 and A_i acts independently to search for a better attack position;

when the distance between A_i and its friend A_k is less than a preset threshold Δ = 10|Ψ|, f_Ak is set to 0 to avoid a collision between the two attacking robots.

Preferably, the enemy-territory fitness f₃ of attacking robot A_i is expressed as:

where U = (U¹, U²) is the coordinates of the center of the enemy territory; x = (x¹, x²) is the independent variable, representing the coordinates of a position in the rectangular coordinate system of the attacking robot; σ₃ and w₃ define the width and the amplitude of the Gaussian-like model, respectively.

Preferably, step 4 specifically comprises: in each time slice, the particle swarm algorithm is executed to optimize the fitness function and obtain the optimal position p_g of attacking robot A_i in its current situation; the search space of the particle swarm algorithm is the circular region centered on the coordinates of attacking robot A_i with radius R₁.

Preferably, the velocity and position update expressions used to optimize the fitness function are:

where c₁ and c₂ are constant acceleration factors and w is the inertia weight; the remaining quantities are the velocity and the position of the i-th particle in the d-th dimension, d ∈ [1, D], and two random number vectors; D is the dimension of the environment, with D = 2 for a two-dimensional confrontation environment and D = 3 for a three-dimensional one.

Preferably, the step 4 specifically includes:

each attacking robot moves toward the calculated optimal position p_g; if the distance from the robot to p_g is less than the maximum distance the attacking robot can move within one time slice, the robot moves to p_g; otherwise, it moves that maximum distance in the direction of p_g; during movement, if other robots are present in the warning area around the robot's position in the next time slice, the robot's moving direction is rotated counterclockwise by a degrees; if the number of rotations exceeds a preset number and the robot still cannot find a suitable collision-free path, it stays at its current position until the next time slice;

if an enemy enters the attack range of the attacking robot during movement, the nearest enemy is selected and attacked; if the situation of the attacking robot is better than that of the enemy robot, the attacking robot moves toward the enemy robot and attacks, that is, the closer the attacking robot is to the enemy robot, the higher its fitness; conversely, if the situation of the attacking robot is worse than that of the enemy robot, the closer it is to the enemy robot, the lower its fitness;

the attacking robot and the friend of the attacking robot are in a cooperative relationship and form a group to attack the enemy robot group, and the closer the attacking robot is to the friend of the attacking robot, the higher the fitness of the attacking robot is.

Compared with the prior art, the invention has the following advantages:

1) each robot in the swarm uses its onboard sensors to sense its own motion and the surrounding environment, without depending on a global navigation system; each robot constructs its own coordinate system, and cooperation and confrontation are achieved by acquiring the relative coordinates of surrounding agents, without depending on a global positioning system;

2) the fitness function of the robot integrates information of friend, enemy and enemy territory, and the control of the robot in the cooperative and antagonistic environments is realized;

3) compared with a reinforcement learning method, the distributed control of the swarm robots based on the particle swarm algorithm has the advantages of no need of pre-training and strong expandability.

Drawings

FIG. 1 is a flow chart of a swarm robot control method based on a particle swarm algorithm in a rejection environment.

FIG. 2 is an exemplary diagram of a rectangular coordinate system of an attack robot;

FIG. 3 shows an example of the fitness model of attacking robot A₁.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, shall fall within the scope of protection of the present invention.

The embodiment provides a swarm robot attacking method based on a particle swarm algorithm in a denial environment, and as shown in fig. 1, the method includes the following steps:

step 1, establishing an attack-defense confrontation scene in a denial environment: setting the positions of the N attacking robots, the positions of the M defending robots and the position coordinates of the enemy territory; all robots have the same attributes and 10 points of blood volume (health); when a robot is attacked its blood volume is reduced by 1, and when its blood volume is less than or equal to 0 the robot is destroyed; and initializing the particle swarm algorithm parameters, namely the initial number of particles, the acceleration factors c₁ and c₂, the inertia weight w and the problem dimension D.
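As a small illustration of the robot attributes set in step 1, the sketch below models a robot with 10 points of blood volume that loses 1 point per hit and is destroyed at 0 or below. The class name `Robot` and its field and method names are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass
class Robot:
    x: float
    y: float
    health: int = 10  # 10 points of blood volume, as in this embodiment

    def take_hit(self, damage: int = 1) -> None:
        """Reduce blood volume by 1 point per attack (embodiment value)."""
        self.health -= damage

    @property
    def destroyed(self) -> bool:
        """A robot is destroyed once its blood volume is <= 0."""
        return self.health <= 0
```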

Step 2, each attacking robot detects the surrounding environment through its sensors, acquires situation information about the friend robots and the enemy robots, and calculates the position of the enemy territory in real time using inertial navigation.

As shown in FIG. 2, each attacking robot A_i constructs a rectangular coordinate system with its own position as the origin and its moving direction as the positive direction of the horizontal axis, where Q_j, P_k and U are the coordinates of the j-th defender, the k-th attacker and the territory, respectively; attacking robot A_i obtains Q_j and P_k by sensing the environment with its sensors.

The coordinate U of the enemy territory is known to each attacking robot at its initial position; while the robot moves, the approximate coordinate of the enemy territory is calculated in real time using inertial navigation.
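One way to picture the inertial update of the territory coordinate is simple dead reckoning in the robot's own frame. The sketch below is an assumption-laden illustration, not the method's actual navigation filter: it assumes that in each time slice the robot first turns counterclockwise by `turn` radians and then advances `forward` units along its new heading, both measured without error. The function name `update_territory_estimate` and its parameters are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def update_territory_estimate(u: Point, turn: float, forward: float) -> Point:
    """Re-express the territory coordinate U in the robot's new local frame.

    The local frame has the robot at the origin and its moving direction as
    the positive horizontal axis. Turning the robot counterclockwise by `turn`
    rotates every fixed point by -turn in the local frame; advancing by
    `forward` then shifts every fixed point by -forward along the axis.
    """
    ux, uy = u
    c, s = math.cos(turn), math.sin(turn)
    rotated_x = c * ux + s * uy
    rotated_y = -s * ux + c * uy
    return (rotated_x - forward, rotated_y)
```

For example, a territory 5 units to the robot's left, at (0, 5), ends up about 3 units straight ahead after a quarter turn to the left followed by a 2-unit advance. Real inertial measurements drift, which is why the text speaks of an approximate coordinate.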

Step 3, the attacking robot constructs a fitness function. For each attacking robot, the fitness function contains three kinds of information: state information of the enemy robots, state information of the friend robots, and enemy territory information.

The fitness function F_i of attacking robot A_i is calculated as:

F_i = f₁ + f₂ + f₃

1) Constructing the confrontation fitness function f₁ of the attacking robot

If the attacking robot is in a better situation than the enemy robot, it moves toward the enemy robot and initiates an attack. This is mapped into the fitness function model as follows: the closer the attacking robot is to the enemy robot, the higher its fitness. Conversely, if the attacking robot is in a worse situation than the enemy robot, the closer it is to the enemy robot, the lower its fitness. The confrontation fitness function model is constructed using a Gaussian-like distribution. The confrontation fitness function f₁ of the attacking robot is:

where Ψ is the set of indices of all enemy robots within the neighborhood R₁ of attacking robot A_i, and f_Bk is a sub-fitness function generated from the state information of the k-th enemy robot. Q_k is the coordinates of the k-th enemy robot; x = (x¹, x²) is the independent variable, representing the coordinates of a position in the rectangular coordinate system; σ₁ and w₁ define the width and the amplitude of the Gaussian-like model, respectively.

S(i, k) is an index that measures whether the situation of the i-th attacking robot A_i is better than that of enemy robot B_k. The situation of a robot is calculated from the number of its friends and their total blood volume, where "-1" and "1" indicate that A_i is in an unfavorable and a favorable situation, respectively; N_i and its associated total blood volume value are the number and total blood volume of the attacking robots (including A_i itself) within the attack range R₀ of A_i; M_k and its associated total blood volume value are the number and total blood volume of the enemy robots within the attack range R₀ of B_k. When S(i, k) = -1, f_Bk is a valley-shaped function: the farther A_i is from B_k, the higher the fitness of A_i. When S(i, k) = 1, f_Bk is a peak-shaped function: the closer A_i is to B_k, the higher the fitness of A_i. S(i, k) thus determines whether A_i advances toward B_k or retreats from it.
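The peak and valley behavior described above can be sketched as follows. The exact expressions for f_Bk and S(i, k) are not reproduced in this text, so the code rests on two labeled assumptions: the Gaussian-like bump is taken to be w₁·exp(-‖x - Q_k‖² / (2σ₁²)), and the hypothetical `situation` rule returns +1 only when A_i's side is at least as large and as healthy as B_k's side. The function names are illustrative as well.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def situation(n_friends: int, friend_health: int,
              m_enemies: int, enemy_health: int) -> int:
    """Hypothetical S(i, k): +1 (favorable) if A_i's side matches or exceeds
    B_k's side in both head count and total blood volume, otherwise -1."""
    if n_friends >= m_enemies and friend_health >= enemy_health:
        return 1
    return -1


def confrontation_fitness(x: Point, enemies: Sequence[Point],
                          signs: Sequence[int],
                          sigma1: float, w1: float) -> float:
    """f1(x): sum over nearby enemies of a signed Gaussian-like bump.

    A sign of +1 gives a peak at the enemy position (approach), a sign of -1
    gives a valley there (retreat). The Gaussian form is an assumption.
    """
    total = 0.0
    for (qx, qy), s in zip(enemies, signs):
        d2 = (x[0] - qx) ** 2 + (x[1] - qy) ** 2
        total += s * w1 * math.exp(-d2 / (2.0 * sigma1 ** 2))
    return total
```

With a sign of +1 the value rises as x approaches the enemy; with a sign of -1 it rises as x moves away, which matches the peak-shaped and valley-shaped cases in the text.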

2) Constructing the cooperative fitness function f₂ of attacking robot A_i

The attacking robot and its friends are in a cooperative relationship and form a group to attack the enemy robot group. This is mapped into the fitness function model as follows: the closer the attacking robot is to its friends, the higher its fitness. The cooperative fitness function f₂ of attacking robot A_i is:

where Φ is the set of indices of all friend robots within the neighborhood R₁ of A_i, and f_Ak is a sub-fitness function generated from the state information of the k-th attacking robot, in which P_k is the coordinates of the k-th attacker. Notably, f₂ allows the attacking group to form subgroups dynamically. If A_i has been at a disadvantage against the surrounding enemy robots throughout τ time slices, A_i needs to break free of the constraint of its companions and leave the group; f₂ is then set to 0 and A_i acts independently to search for a better attack position. When the distance between A_i and its friend A_k is less than the threshold Δ = 10|Ψ|, f_Ak is set to 0 to avoid a collision between the two attacking robots.
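The two gating rules on f₂ (leaving the group after τ unfavorable time slices, and dropping a friend's term inside the threshold Δ) can be sketched as below. The Gaussian-like form w₂·exp(-‖x - P_k‖² / (2σ₂²)) is again an assumption, since the expression itself is not reproduced in this text. The counter `slices_at_disadvantage` and the function name are hypothetical, and A_i is taken to sit at the origin of its own coordinate system.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def cooperative_fitness(x: Point, friends: Sequence[Point],
                        sigma2: float, w2: float, delta: float,
                        slices_at_disadvantage: int, tau: int) -> float:
    """f2(x) in A_i's local frame, where A_i itself is at (0, 0)."""
    # After tau unfavorable time slices A_i leaves the group: f2 is set to 0.
    if slices_at_disadvantage >= tau:
        return 0.0
    total = 0.0
    for px, py in friends:
        # A friend closer than delta to A_i contributes nothing, so that the
        # optimizer is not pulled toward a collision with it.
        if math.hypot(px, py) < delta:
            continue
        d2 = (x[0] - px) ** 2 + (x[1] - py) ** 2
        total += w2 * math.exp(-d2 / (2.0 * sigma2 ** 2))  # assumed form
    return total
```

In the text the threshold is Δ = 10|Ψ|; here it is passed in as `delta` so the sketch stays independent of how Ψ is stored.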

3) Constructing the fitness function f₃ from the position information of the enemy territory

The attacking robot aims to enter the enemy territory: the closer the attacking robot is to the enemy territory, the higher its fitness. Therefore, the fitness function f₃ is:

where U = (U¹, U²) is the coordinates of the center of the enemy territory.

FIG. 3 shows an example of the fitness model of attacking robot A₁, in which the two diamonds represent A₁ and A₂, the three triangles represent enemy robots B₁, B₂ and B₃, and the star represents the enemy territory T. A₁ and A₂ are in a cooperative relationship, while A₁ is in an adversarial relationship with B₁, B₂ and B₃. The fitness model is optimized with the particle swarm algorithm to obtain the swarm robot attack strategy in a denial environment, which guides the movement and attack of the attacking robots.

Step 4, in each time slice, the particle swarm algorithm (PSO) is executed to find the best position p_g of attacking robot A_i in its current situation. Since the distance an attacker can move within one time slice is limited and the environment changes dynamically, it only needs to find the position with the best fitness within its neighborhood R₁.

Therefore, the search space of the PSO is constrained to the circular region centered on the coordinates of attacking robot A_i with radius R₁.

The velocity and position update formulas used to optimize the fitness function are as follows:

where c₁ and c₂ are constant acceleration factors and w is the inertia weight; the remaining quantities are the velocity and the position of the i-th particle in the d-th dimension, d ∈ [1, D], and two random number vectors; D is the dimension of the environment, with D = 2 for a two-dimensional confrontation environment and D = 3 for a three-dimensional one;
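The update formulas are not reproduced in this text, so the sketch below substitutes the canonical inertia-weight PSO update, v ← w·v + c₁·r₁·(pbest - x) + c₂·r₂·(gbest - x) followed by x ← x + v, which agrees with the parameters listed above but is an assumption about the exact form used. It works in the robot's local frame (robot at the origin, D = 2), keeps particles inside the disc of radius R₁ by projecting them back onto it, and seeds one particle at the robot's current position. The function names, default parameter values and the projection rule are all hypothetical choices.

```python
import math
import random
from typing import Callable, List, Tuple

Point = Tuple[float, float]


def clamp_to_disc(p: List[float], radius: float) -> List[float]:
    """Project a point back onto the disc of the given radius if outside."""
    n = math.hypot(p[0], p[1])
    if n <= radius:
        return p
    return [p[0] * radius / n, p[1] * radius / n]


def pso_best_position(fitness: Callable[[Point], float], radius: float,
                      n_particles: int = 20, iters: int = 30,
                      w: float = 0.7, c1: float = 1.5, c2: float = 1.5,
                      seed: int = 0) -> Point:
    """Maximize `fitness` over the disc of radius R1 around the robot."""
    rng = random.Random(seed)
    xs: List[List[float]] = [[0.0, 0.0]]  # staying put is always a candidate
    while len(xs) < n_particles:
        r = radius * math.sqrt(rng.random())  # uniform over the disc area
        a = 2.0 * math.pi * rng.random()
        xs.append([r * math.cos(a), r * math.sin(a)])
    vs = [[0.0, 0.0] for _ in xs]
    pbest = [x[:] for x in xs]
    pbest_f = [fitness((x[0], x[1])) for x in xs]
    g = max(range(len(xs)), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    for _ in range(iters):
        for i in range(len(xs)):
            for d in range(2):
                r1, r2 = rng.random(), rng.random()
                vs[i][d] = (w * vs[i][d]
                            + c1 * r1 * (pbest[i][d] - xs[i][d])
                            + c2 * r2 * (gbest[d] - xs[i][d]))
                xs[i][d] += vs[i][d]
            xs[i] = clamp_to_disc(xs[i], radius)
            f = fitness((xs[i][0], xs[i][1]))
            if f > pbest_f[i]:
                pbest[i], pbest_f[i] = xs[i][:], f
                if f > gbest_f:
                    gbest, gbest_f = xs[i][:], f
    return (gbest[0], gbest[1])
```

Passing a territory-only fitness such as a Gaussian-like bump centered on U (an assumed form for f₃) should pull the returned position toward the side of the disc facing the territory. A three-dimensional environment would need a third coordinate throughout.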

step 5, the attacking robot moves and attacks;

each robot moves toward the calculated optimal position p_g; if the distance from the robot to p_g is less than the maximum distance the robot can move within one time slice, the robot moves to p_g; otherwise, it moves that maximum distance in the direction of p_g.

If an enemy enters the robot's attack range during movement, the robot selects the nearest enemy and attacks it. During movement, if other robots are present in the warning area around the robot's position in the next time slice, the robot's moving direction is rotated counterclockwise by 15 degrees; if, after a series of rotations (23 times), the robot still cannot find a suitable collision-free path, it stays at its current position until the next time slice.
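The movement rule of step 5 can be sketched as a bounded step toward p_g with a rotating fallback. In the sketch below, `is_blocked` is a hypothetical predicate standing in for the warning-area check (it returns True when another robot would be too close to the candidate position), and `next_position`, `max_step`, `turn_deg` and `max_turns` are illustrative names. The 15-degree step and the 23 rotations are the values given in this embodiment.

```python
import math
from typing import Callable, Tuple

Point = Tuple[float, float]


def next_position(pos: Point, target: Point, max_step: float,
                  is_blocked: Callable[[Point], bool],
                  turn_deg: float = 15.0, max_turns: int = 23) -> Point:
    """Return the robot's position at the next time slice."""
    dx, dy = target[0] - pos[0], target[1] - pos[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return pos
    step = min(dist, max_step)  # never move farther than one slice allows
    heading = math.atan2(dy, dx)
    for k in range(max_turns + 1):  # k = 0 is the direct heading
        if k == 0 and dist <= max_step:
            cand = target  # p_g is reachable within this time slice
        else:
            # Rotate the moving direction counterclockwise by k * turn_deg.
            angle = heading + k * math.radians(turn_deg)
            cand = (pos[0] + step * math.cos(angle),
                    pos[1] + step * math.sin(angle))
        if not is_blocked(cand):
            return cand
    # No collision-free direction found: stay put until the next time slice.
    return pos
```

With 23 rotations of 15 degrees the robot tries every heading on a 15-degree grid once (0 to 345 degrees) before giving up for the current time slice.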

To verify the performance of the method on the swarm robot territory invasion problem more intuitively, confrontation experiments were carried out against the following three algorithms applied to the enemy robot group:

(1) a rule-based algorithm, in which the enemy robot always moves to the midpoint between itself and the attacking robot closest to the territory in order to intercept that attacking robot.

(2) DPSO attack task assignment algorithm, "Cooperative Multi-task assignment for multiple UAVs," Electronics Optics & controls, vol.24, No.1, pp.46-50, 2017.

(3) SDPSO attack task allocation algorithm, "UAV cooperative multiple-task assistance based on discrete particle timing algorithm," Computer Simulation, vol.35, No.2, pp.22-28, 2018.

The confrontation results are shown in Table 1.

TABLE 1

Clearly, when the attacking robot group and the enemy robot group are equal in number, the overall win rate of the attacking robot group using the PSO-AS method is 100%. When the attacking robot group is only 75% the size of the enemy robot group, the proposed method still achieves a win rate of more than 50%.

While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

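1. A swarm robot control method based on a particle swarm algorithm in a denial environment, characterized by comprising the following steps:

step 1, establishing an attack-defense confrontation scene in a denial environment, and initializing the parameters of the particle swarm algorithm;

step 2, each attacking robot detects surrounding environment information through its sensors, acquires situation information about the friend robots and the enemy robots, and calculates the position of the enemy territory in real time using inertial navigation;

step 3, the attacking robot uses the detected surrounding environment information to construct a fitness function containing enemy robot state information, friend robot state information and enemy territory information;

step 4, the fitness function is optimized with the particle swarm algorithm to obtain the optimal position of the attacking robot, which guides its movement and attack in the next time slice;

step 5, the attacking robot carries out moving and attacking operations;

step 6, if any attacking robot enters the enemy territory, the task is completed; otherwise, it is judged whether the maximum running time has been reached; if so, the task fails; if not, the method returns to step 2 for the next time slice.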

2. The swarm robot control method based on a particle swarm algorithm in a denial environment according to claim 1, wherein establishing the attack-defense confrontation scene in the denial environment and initializing the parameters of the particle swarm algorithm in step 1 specifically comprises: initializing the positions of the N attacking robots, the positions of the M defending robots and the GPS coordinate U of the enemy territory; and initializing the particle swarm algorithm parameters, namely the initial number of particles, the acceleration factors c₁ and c₂, the inertia weight w and the problem dimension D;

3. The swarm robot control method based on a particle swarm algorithm in a denial environment according to claim 1, wherein step 2 specifically comprises: each attacking robot constructs a coordinate system from its own position and moving direction, detects surrounding environment information through its sensors to obtain the coordinates of the friend robots and the enemy robots, and calculates the position of the enemy territory in real time using inertial navigation.

4. The swarm robot control method based on a particle swarm algorithm in a denial environment according to claim 1, wherein in step 3 the attacking robot constructs its fitness function from the detected surrounding environment information, specifically: attacking robot A_i constructs a fitness function F_i from the enemy robot state information, the friend robot state information and the enemy territory information, expressed as:

F_i = f₁ + f₂ + f₃

5. The swarm robot control method based on a particle swarm algorithm in a denial environment according to claim 4, wherein the confrontation fitness function f₁ is expressed as:

where Q_k is the coordinates of the k-th enemy robot; x = (x¹, x²) is the independent variable, representing the coordinates of a position in the rectangular coordinate system of the attacking robot; σ₁ and w₁ define the width and the amplitude of the Gaussian-like model, respectively; S(i, k) is an index that measures whether the situation of the i-th attacking robot A_i is better than that of enemy robot B_k and thus determines whether A_i advances toward B_k or retreats from it, and is expressed as:

where the number and the total blood volume of all friend robots within the attack range R₀ of attacking robot A_i, and the number and the total blood volume of all enemy robots within the attack range R₀ of enemy robot B_k, are used;

when S(i, k) = -1, f_Bk is a valley-shaped function, meaning that the farther attacking robot A_i is from enemy robot B_k, the higher the fitness of A_i;

when S(i, k) = 1, f_Bk is a peak-shaped function, meaning that the closer attacking robot A_i is to enemy robot B_k, the higher the fitness of A_i.

6. The swarm robot control method based on a particle swarm algorithm in a denial environment according to claim 4, wherein the cooperative fitness function f₂ of attacking robot A_i is expressed as:

where P_k is the coordinates of the k-th attacker; x = (x¹, x²) is the independent variable, representing the coordinates of a position in the rectangular coordinate system of the attacking robot; σ₂ and w₂ define the width and the amplitude of the Gaussian-like model, respectively;

7. The swarm robot control method based on a particle swarm algorithm in a denial environment according to claim 4, wherein the enemy-territory fitness f₃ of attacking robot A_i is expressed as:

where U = (U¹, U²) is the coordinates of the center of the enemy territory; x = (x¹, x²) is the independent variable, representing the coordinates of a position in the rectangular coordinate system of the attacking robot; σ₃ and w₃ define the width and the amplitude of the Gaussian-like model, respectively.

8. The swarm robot control method based on a particle swarm algorithm in a denial environment according to claim 1, wherein step 4 specifically comprises: in each time slice, the particle swarm algorithm is executed to optimize the fitness function and obtain the optimal position p_g of attacking robot A_i in its current situation; the search space of the particle swarm algorithm is the circular region centered on the coordinates of attacking robot A_i with radius R₁.

9. The swarm robot control method based on a particle swarm algorithm in a denial environment according to claim 8, wherein the velocity and position update expressions used to optimize the fitness function are:

where c₁ and c₂ are constant acceleration factors and w is the inertia weight; the remaining quantities are the velocity and the position of the i-th particle in the d-th dimension, d ∈ [1, D], and two random number vectors.

10. The swarm robot control method based on a particle swarm algorithm in a denial environment according to claim 8, wherein step 4 specifically comprises:

each attacking robot moves toward the calculated optimal position p_g; if the distance from the robot to p_g is less than the maximum distance the attacking robot can move within one time slice, the robot moves to p_g; otherwise, it moves that maximum distance in the direction of p_g; during movement, if other robots are present in the warning area around the robot's position in the next time slice, the robot's moving direction is rotated counterclockwise by a degrees; if the number of rotations exceeds a preset number and the robot still cannot find a suitable collision-free path, it stays at its current position until the next time slice;