CN112306070A - Multi-AUV dynamic maneuver decision method based on interval information game - Google Patents


Info

Publication number
CN112306070A
CN112306070A CN202011150930.7A CN202011150930A CN112306070A CN 112306070 A CN112306070 A CN 112306070A CN 202011150930 A CN202011150930 A CN 202011150930A CN 112306070 A CN112306070 A CN 112306070A
Authority
CN
China
Prior art keywords
auv
interval
payment
enemy
game
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202011150930.7A
Other languages
Chinese (zh)
Inventor
刘禄
张立川
白春梅
张硕
任染臻
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwestern Polytechnical University filed Critical Northwestern Polytechnical University
Priority to CN202011150930.7A priority Critical patent/CN112306070A/en
Publication of CN112306070A publication Critical patent/CN112306070A/en
Withdrawn legal-status Critical Current

Classifications

    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05DSYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/04Control of altitude or depth
    • G05D1/06Rate of change of altitude or depth
    • G05D1/0692Rate of change of altitude or depth specially adapted for under-water vehicles

Landscapes

  • Engineering & Computer Science (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a multi-AUV dynamic maneuver decision method based on an interval information game. First, on the basis of the multi-AUV maneuvering strategy, an advantage function consisting of situation advantages and energy efficiency advantages is proposed. Next, a payment matrix is constructed from interval information and a payment interval ranking that combines four-parameter interval sets with relative entropy. Then, Nash equilibrium conditions satisfying the interval game are given, and a Nash equilibrium maneuvering decision model for the dynamic marine environment is established. Finally, an improved differential evolution algorithm is applied to solve the model and find the optimal strategy. The invention addresses the modeling difficulty caused by the weak connectivity, uncertainty and variability of the underwater environment, so that the established model is more convincing and more reliable when applied in real waters.

Description

Multi-AUV dynamic maneuver decision method based on interval information game
Technical Field
The invention belongs to the field of multi-underwater robot cooperative confrontation, and particularly relates to a multi-AUV cooperative confrontation method based on an interval information game.
Background
With the development of science and technology, autonomous underwater vehicles (AUVs) have been widely used in marine observation, marine rescue, mine-area search, enemy reconnaissance and related fields. Because of its spatio-temporal distribution and redundant configuration, a multi-AUV system is efficient and reliable, and it offers a new solution for complex ocean tasks. Multi-AUV game cooperation can be used for ocean research and military confrontation, including underwater multi-target tracking, monitoring and detection; it can effectively enlarge the underwater combat radius and reduce equipment losses and casualties.
The maneuver decision is the key to multi-AUV cooperative confrontation and is the basic action of each confrontation step. There has been much research on unilateral strategy optimization but little on bilateral games. Introducing cooperative game theory into the maneuver decisions of an unmanned system cluster therefore allows a more scientific and more accurate real-time confrontation strategy to be made.
Disclosure of Invention
Technical problem to be solved
The existing methods ignore the complexity and uncertainty of the underwater environment and cannot obtain accurate real-time underwater environment characteristics, so the established decision model has low reliability. The conventional decision algorithms they adopt easily fall into local optima, so an accurate and reliable decision scheme is difficult to obtain and the methods cannot be applied in a real sea area. To address these shortcomings, the invention provides a multi-AUV dynamic maneuver decision algorithm based on an interval information game.
Technical scheme
A multi-AUV dynamic maneuver decision method based on an interval information game is characterized by comprising the following steps:
step 1: obtaining the advantage functions of the multi-AUV systems of the two opposing sides from the situation advantage and the energy efficiency advantage:
the situation advantage comprises an angle advantage $A_{ag}$, a speed advantage $A_s$ and a distance advantage $A_{dis}$:
Figure BDA0002741228470000021
where $|AA|$ is the viewing angle of the two AUV players and $ATA$ is the target incident angle;
Figure BDA0002741228470000022
where $v_{n1i}$ and $v_{n2j}$ are the velocity vectors of the two parties in the game; the subscripts $n1$ and $n2$ denote the two opposing sides, and $i$ and $j$ denote the $i$th AUV of one side and the $j$th AUV of the other;
Figure BDA0002741228470000023
where $D_{ij}$ is the distance between the AUVs; $R_0 = (R_{max} + R_{min})/2$; $R_{max}$ is the maximum starting distance and $R_{min}$ is the minimum starting distance;
the overall situation advantage function is $W_A = k_1 A_{ag} + k_2 A_s + k_3 A_{dis}$, where $k_1, k_2, k_3$ are weighting coefficients with $k_1 + k_2 + k_3 = 1$;
The energy efficiency advantage function is written as:
Figure BDA0002741228470000024
where $C_{n1i}$ and $C_{n2j}$ are the energy efficiencies of the AUV systems of the two opposing sides;
the overall advantage function of our multi-AUV system is:
Figure BDA0002741228470000025
where $\delta_1, \delta_2$ are weighting coefficients satisfying $\delta_1 + \delta_2 = 1$;
Figure BDA0002741228470000026
Figure BDA0002741228470000027
represent the advantage function with its lower and upper bounds;
the overall advantage function of the enemy in the game is obtained in the same way:
Figure BDA0002741228470000028
Step 2: obtaining the payment matrix of the multi-AUV system from the interval information and the payment interval ranking that combines four-parameter interval sets with relative entropy:
the payment function of the multi-AUV game under uncertain information, obtained from the advantage function in step 1, is established as follows:
Figure BDA0002741228470000031
where $x_{ij}$ and $y_{ji}$ are binary decision variables: $x_{ij} = 1$ means that our $i$th AUV attacks the enemy's $j$th AUV, and $x_{ij} = 0$ means that it does not; likewise, $y_{ji}$ indicates whether the enemy's $j$th AUV attacks our $i$th AUV;
the payment matrix under uncertain information is therefore:
Figure BDA0002741228470000032
the method combining four-parameter interval sets and relative entropy is refined by letting
Figure BDA0002741228470000033
Figure BDA0002741228470000034
$W_{mn}$ is the normalized advantage function,
Figure BDA0002741228470000035
where $x_1, x_2, \ldots, x_m$ are the maneuver strategies of our multi-AUV system and $y_1, y_2, \ldots, y_n$ are the maneuver strategies of the enemy multi-AUV system,
Figure BDA0002741228470000036
represents the payment when our AUV system uses the $m$th strategy and the enemy AUV system uses the $n$th strategy;
Step 3: solving for the Nash equilibrium optimal solution and finding the optimal strategy:
the confrontation trajectory of the multiple AUVs is regarded as a combination of individual actions; a k-stage dynamic game is used in the confrontation process, and each stage of the game comprises 7 actions, namely keeping the original course, accelerating, decelerating, turning left, turning right, climbing and diving;
considering the practical undersea environment constraints, solving the Nash equilibrium problem can be converted into an optimization problem with interval uncertainty parameters:
Figure BDA0002741228470000041
Figure BDA0002741228470000042
where $x_i$ denotes the probability that our AUV adopts the $i$th strategy,
Figure BDA0002741228470000043
is the threshold value of the participant's benefit,
Figure BDA0002741228470000044
is the payment when our AUV system uses the $i$th strategy and the enemy AUV system uses the $j$th strategy;
the optimal parameter $x_i$ in the optimal solution is the probability of adopting the corresponding strategy at the current moment, and the AUV executes the action corresponding to the optimal strategy.
In step 3, an improved differential evolution algorithm is used to solve for the Nash equilibrium optimal solution and find the optimal strategy. The improved differential evolution algorithm comprises mutation, crossover and selection steps. To select the best fitness, it is combined with the game algorithm: when a new mutation vector is generated, the scaling factor F is determined from the evolution time and the difference between the best individual and the worst individual:
Figure BDA0002741228470000045
where $\Delta g$ is the ratio between the current iteration number $g$ and the maximum number of iterations $G$; $f_{best}$ is the best fitness, $f_{worst}$ is the worst fitness and $f_i$ is the fitness of the current individual; $F_{max}$ and $F_{min}$ are the maximum and minimum values of $F$.
Advantageous effects
The multi-AUV dynamic maneuver decision method based on the interval information game has the following beneficial effects:
(1) It addresses the modeling difficulty caused by the weak connectivity, uncertainty and variability of the underwater environment. Interval information can represent underwater environment characteristics that include various uncertainties, so the established model is more convincing and more reliable when applied in real waters.
(2) It addresses the tendency of the decision algorithm to fall into local optima. The method uses the improved differential evolution algorithm, which searches for the optimal solution globally and effectively avoids being trapped in local optima, so the results are more accurate and credible.
Drawings
FIG. 1: level k gaming for multiple AUV systems
FIG. 2: operating procedure of IDE Algorithm
FIG. 3: expected revenue
FIG. 4: multi-AUV collaborative dynamic maneuver decision: first stage
FIG. 5: multi-AUV collaborative dynamic maneuver decision: second stage
FIG. 6: multi-AUV collaborative dynamic maneuver decision: the third stage
FIG. 7: multi-AUV collaborative dynamic maneuver decision: fourth stage
FIG. 8: multi-AUV collaborative dynamic maneuver decision: the fifth stage
Detailed Description
The invention will now be further described with reference to the following examples and drawings:
The technical scheme is as follows. First, on the basis of the multi-AUV maneuvering strategy, an advantage function consisting of situation advantages and energy efficiency advantages is proposed. Next, a payment matrix is constructed from interval information and a payment interval ranking that combines four-parameter interval sets with relative entropy. Then, Nash equilibrium conditions satisfying the interval game are given, and a Nash equilibrium maneuvering decision model for the dynamic marine environment is established. An improved differential evolution algorithm is then applied to solve the model and find the optimal strategy. Finally, the superiority of the proposed multi-AUV dynamic maneuver decision algorithm is verified by an example. The method comprises the following steps:
Step 1: obtaining the advantage functions of the multi-AUV systems of the two opposing sides from the situation advantage and the energy efficiency advantage:
To establish an interval information payment matrix, a multi-AUV maneuvering attribute evaluation method is proposed according to the situation information of the two opposing sides, as shown in the k-stage dynamic game in FIG. 1. The confrontation trajectory of the multiple AUVs is treated as a combination of individual actions. The enemy multi-AUV system and our multi-AUV system are the decision-making parties in the game.
The game model of the multi-AUV system based on uncertain information can be expressed as:
Figure BDA0002741228470000061
where $N = \{N_1, N_2\}$ is the set of decision-making parties in the game;
Figure BDA0002741228470000062
is the strategy space of the decision-making parties,
Figure BDA0002741228470000063
means that our side selects the $i$th strategy,
Figure BDA0002741228470000064
means that the enemy selects the $j$th strategy in the $k$th stage;
Figure BDA0002741228470000065
is the revenue interval corresponding to each strategy that the multi-AUV systems participating in the game may select. According to the game tree shown in FIG. 1, the actions of the multi-AUV system in the stage-k game can be represented by an information set, so a maneuver strategy is in effect the set of action rules of the multi-AUV system at each information set.
The main difference between multi-AUV confrontation and the confrontation of other autonomous robots is the mode of information transfer. Owing to the marine environment, information in a multi-AUV confrontation is carried mainly by underwater acoustic waves. The shallow-sea acoustic channel varies in space, time and frequency; it has strong multipath interference, high ambient noise, large transmission loss and severe Doppler shift. The information available during a multi-AUV confrontation is therefore highly uncertain, and it is difficult to quantify the threat level of the two sides accurately in the decision process. For this reason, in the present invention each attribute is represented by an interval information set in the decision process. The advantage function used to evaluate the payment of each AUV consists of two parts: a situation advantage and an energy efficiency advantage.
To attack the enemy multi-AUV system, our multi-AUV system must occupy a favorable attack position and minimize its own attack risk.
(1) The situation advantage comprises an angle advantage $A_{ag}$, a speed advantage $A_s$ and a distance advantage $A_{dis}$:
Figure BDA0002741228470000066
where $|AA|$ is the viewing angle of the two players and $ATA$ is the target incident angle;
Figure BDA0002741228470000067
where $v_{n1i}$ and $v_{n2j}$ are the velocity vectors of the two parties in the game; the subscripts $n1$ and $n2$ denote the two opposing sides, and $i$ and $j$ denote the $i$th AUV of one side and the $j$th AUV of the other.
Figure BDA0002741228470000071
where $D_{ij}$ is the distance between the AUVs; $R_0 = (R_{max} + R_{min})/2$; $R_{max}$ is the maximum starting distance and $R_{min}$ is the minimum starting distance;
the overall situation advantage function is $W_A = k_1 A_{ag} + k_2 A_s + k_3 A_{dis}$, where $k_1, k_2, k_3$ are weighting coefficients with $k_1 + k_2 + k_3 = 1$.
(2) The energy efficiency advantage function may be written as:
Figure BDA0002741228470000072
where $C_{n1i}$ and $C_{n2j}$ are the energy efficiencies of the AUV systems of the two opposing sides.
(3) The overall advantage function of our multi-AUV system is:
Figure BDA0002741228470000073
where $\delta_1, \delta_2$ are weighting coefficients satisfying $\delta_1 + \delta_2 = 1$;
Figure BDA0002741228470000074
Figure BDA0002741228470000075
represent the advantage function with its lower and upper bounds.
(4) By exchanging the situation information parameters of the two parties, the overall advantage function W E of the enemy in the game can be obtained2
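The weighting scheme in Step 1 can be illustrated with a short sketch. It assumes that each advantage is an interval (lower, upper) and that the overall advantage is the $\delta$-weighted sum of the situation and energy efficiency advantages; that combination rule, the helper name `weighted_interval_sum` and all input scores are illustrative assumptions, since the exact overall formula is not reproduced in the text. Only the weights $k_1, k_2, k_3, \delta_1, \delta_2$ are taken from the example in this document.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound)


def weighted_interval_sum(terms: List[Interval], weights: Sequence[float]) -> Interval:
    """Weighted sum of intervals, valid for non-negative weights that sum to 1."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    lower = sum(w * lo for w, (lo, _) in zip(weights, terms))
    upper = sum(w * hi for w, (_, hi) in zip(weights, terms))
    return (lower, upper)


# Hypothetical interval-valued scores (not from the patent).
A_ag = (0.60, 0.70)   # angle advantage
A_s = (0.40, 0.50)    # speed advantage
A_dis = (0.50, 0.80)  # distance advantage
W_C = (0.30, 0.40)    # energy efficiency advantage

k = (0.445, 0.222, 0.333)  # k1, k2, k3 from the example
delta = (0.9, 0.1)         # delta1, delta2 from the example

# W_A = k1*A_ag + k2*A_s + k3*A_dis, applied to both interval bounds.
W_A = weighted_interval_sum([A_ag, A_s, A_dis], k)
# ASSUMPTION: overall advantage = delta1*W_A + delta2*W_C.
W = weighted_interval_sum([W_A, W_C], delta)
print(W_A, W)
```

Because all weights are non-negative, the lower and upper bounds can be combined independently, which keeps the result a valid interval.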
Step 2: obtaining the payment matrix of the multi-AUV system from the interval information and the payment interval ranking that combines four-parameter interval sets with relative entropy.
The revenue matrix of the multi-AUV system is then constructed. The payment matrix consists of interval information and the payment interval ranking that combines four-parameter interval sets with relative entropy. The payment in a game is the final gain or loss of a player resulting from its strategy selection. In a multi-AUV confrontation, a gain for our AUVs is necessarily a loss for the enemy AUVs, so the game of the present invention is a two-player zero-sum game.
Because of the various underwater interference factors, the multi-AUV system cannot obtain accurate information in an actual undersea maneuver decision. Reasonable analysis of the confrontation situation shows that each interference factor usually varies within a certain interval, so the revenue matrix of each multi-AUV system is established from interval information.
The payment function of the multi-AUV game under uncertain information, obtained from the advantage function in step 1, is established as follows:
Figure BDA0002741228470000081
where $x_{ij}$ and $y_{ji}$ are binary decision variables: $x_{ij} = 1$ means that our $i$th AUV attacks the enemy's $j$th AUV, and $x_{ij} = 0$ means that it does not; likewise, $y_{ji}$ indicates whether the enemy's $j$th AUV attacks our $i$th AUV.
Interval information sets cannot be compared by magnitude in the way real numbers can. Ranking methods based on possibility degree may fail, while ranking methods based on geometric distance may lose significant information. To avoid these drawbacks, a method combining four-parameter interval sets and relative entropy is proposed.
The payment interval
Figure BDA0002741228470000082
is derived from the combined information of the two parties to the game, but it does not take into account how points are distributed within the interval. In practice, the payment interval cannot simply be treated as uniformly distributed; its distribution should change as the underwater confrontation changes. For a strategy $x_i$, when the confrontation situation favors the attacker, the attacker's benefit inevitably tends toward fR; otherwise it tends toward fL. To make full use of the advantage matrix information, the payment interval is converted into a four-parameter interval set.
The payment matrix under uncertain information is therefore:
Figure BDA0002741228470000083
The method combining four-parameter interval sets and relative entropy is refined by letting
Figure BDA0002741228470000084
Figure BDA0002741228470000085
$W_{mn}$ is the normalized advantage function,
Figure BDA0002741228470000086
where $x_1, x_2, \ldots, x_m$ are the maneuver strategies of our multi-AUV system and $y_1, y_2, \ldots, y_n$ are the maneuver strategies of the enemy multi-AUV system,
Figure BDA0002741228470000087
represents the payment when our AUV system uses the $m$th strategy and the enemy AUV system uses the $n$th strategy.
The basic idea of the ranking method is to use information entropy to measure the difference between an AUV's own revenue and the maximum (or minimum) revenue under different strategies, and to select the strategy with the smallest difference from the maximum revenue (or the largest difference from the minimum revenue). The maximum revenue corresponds to the AUV completing the intended task with no casualties; the minimum revenue corresponds to the AUV failing the intended task with the greatest casualties.
Step 3: applying the improved differential evolution algorithm to solve for the Nash equilibrium optimal solution and find the optimal strategy.
The confrontation trajectory of the multiple AUVs is regarded as a combination of individual actions. A k-stage dynamic game is used in the confrontation process, and each stage of the game comprises 7 basic actions: keeping the original course, accelerating, decelerating, turning left, turning right, climbing and diving.
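As a small illustration of the action set above, the following sketch enumerates the seven maneuvers and counts the combinations they generate. It assumes, for simplicity, that each side chooses one maneuver per stage; the names `Maneuver`, `joint_actions` and `count_sequences` are hypothetical and not part of the patent.

```python
from enum import Enum
from itertools import product


class Maneuver(Enum):
    """The seven basic actions available in each stage of the game."""
    KEEP = "keep the original course"
    ACCELERATE = "accelerate"
    DECELERATE = "decelerate"
    TURN_LEFT = "turn left"
    TURN_RIGHT = "turn right"
    CLIMB = "climb"
    DIVE = "dive"


def joint_actions():
    """All (our maneuver, enemy maneuver) pairs in a single stage.

    ASSUMPTION: one maneuver per side per stage, so the single-stage
    payment matrix would have one entry per pair.
    """
    return list(product(Maneuver, Maneuver))


def count_sequences(k: int) -> int:
    """Number of maneuver sequences one side can follow over k stages."""
    return len(Maneuver) ** k


print(len(joint_actions()))  # size of the single-stage strategy grid
print(count_sequences(3))    # growth of the game tree over 3 stages
```

The rapid growth of the sequence count with k is one reason a heuristic solver is attractive for the resulting optimization problem.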
Considering the practical undersea environment constraints, solving the Nash equilibrium problem can be converted into an optimization problem with interval uncertainty parameters:
Figure BDA0002741228470000091
Figure BDA0002741228470000092
where $x_i$ denotes the probability that our AUV adopts the $i$th strategy,
Figure BDA0002741228470000093
is the threshold value of the participant's benefit,
Figure BDA0002741228470000094
is the payment when our AUV system uses the $i$th strategy and the enemy AUV system uses the $j$th strategy.
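To make the role of the probabilities $x_i$ concrete, the sketch below evaluates a mixed strategy against an interval payment matrix. It is not the patent's optimization model: it only assumes that the matrix is stored as separate lower-bound and upper-bound matrices, and the function names and the 2×2 numbers are hypothetical.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def _check_probabilities(p: Sequence[float]) -> None:
    if any(v < 0 for v in p) or abs(sum(p) - 1.0) > 1e-9:
        raise ValueError("a mixed strategy must be non-negative and sum to 1")


def expected_interval_payoff(x: Sequence[float], y: Sequence[float],
                             lower: Matrix, upper: Matrix) -> Tuple[float, float]:
    """Expected payment interval [x^T L y, x^T U y] for mixed strategies x and y."""
    _check_probabilities(x)
    _check_probabilities(y)
    lo = sum(x[i] * lower[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))
    hi = sum(x[i] * upper[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))
    return (lo, hi)


def guaranteed_lower_bound(x: Sequence[float], lower: Matrix) -> float:
    """Worst-case lower-bound payment of x over all pure enemy strategies."""
    _check_probabilities(x)
    n_cols = len(lower[0])
    return min(sum(x[i] * lower[i][j] for i in range(len(x))) for j in range(n_cols))


# Hypothetical 2x2 interval payment matrix (lower and upper bounds).
L = [[0.2, 0.6], [0.5, 0.1]]
U = [[0.4, 0.8], [0.7, 0.3]]
x = [0.5, 0.5]  # our mixed strategy
y = [0.5, 0.5]  # enemy mixed strategy
print(expected_interval_payoff(x, y, L, U))
print(guaranteed_lower_bound(x, L))
```

Because the probabilities are non-negative, the two bounds of the expected payment can be computed independently from the two bound matrices.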
Differential evolution (DE) is an intelligent optimization method based on a population-difference heuristic. DE uses the differences between individuals in the population to perturb individual evolution and searches the whole optimization space with a greedy rule to find an optimal solution. It updates the population through mutation, crossover and selection and then finds the best solution. DE is easy to operate, robust and has strong optimization ability. However, when DE is applied, it may converge slowly and fall into local optima, which makes it difficult to meet the requirements of real-time confrontation.
To address these problems of the DE algorithm, the invention provides an improved differential evolution (IDE) algorithm. The algorithm flow is shown in FIG. 2:
(1) Fitness function
Generally, the best strategy for participant $N_1$ is to maximize its revenue under the constraints, while the other participant $N_2$ does the opposite. The fitness function can therefore be the objective function of the optimization problem given above.
(2) Mutation
The scaling factor F is used to scale each base vector and generate a new mutation vector. A larger F searches for a potentially best solution over a larger range; conversely, a smaller F increases convergence speed and improves accuracy. When an individual's fitness is good, F should be small to reduce the disturbance to better individuals. Conversely, when an individual's fitness is relatively poor, the search range of the solution should be expanded, so a larger F can be applied. In combination with the game algorithm presented here, the scaling factor F is determined from the evolution time and the difference between the best and worst individuals:
Figure BDA0002741228470000101
where $\Delta g$ is the ratio between the current iteration number $g$ and the maximum number of iterations $G$; $f_{best}$ is the best fitness, $f_{worst}$ is the worst fitness and $f_i$ is the fitness of the current individual; $F_{max}$ and $F_{min}$ are the maximum and minimum values of $F$. If the fitness difference between the current individual and the best individual is large, the individual is far from the best individual in the search space. $F_i$ then takes a larger value and the perturbation of the individual is larger, which enlarges the search range of the algorithm and enhances its global search capability. If the fitness difference is small, $F_i$ takes a smaller value and the perturbation of the individual is smaller, so the search is confined to a small area near the individual, which enhances the exploitation capability of the algorithm. Furthermore, in the later stages of evolution the value of $\Delta g$ is preferably relatively small, so that the search is carried out in local areas near the current individual and the accuracy of the algorithm is ensured.
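The qualitative behavior described above can be sketched in code. The rule below is a hypothetical stand-in and not the patent's equation, which is not reproduced in the text: it only mimics the stated properties, namely that $F_i$ stays within $[F_{min}, F_{max}]$, grows with the fitness gap between the current and the best individual, and shrinks as the evolution proceeds. The name `adaptive_scale`, the variable `progress` and the default bounds are assumptions.

```python
def adaptive_scale(f_i: float, f_best: float, f_worst: float,
                   g: int, G: int,
                   F_min: float = 0.4, F_max: float = 0.9) -> float:
    """Hypothetical adaptive scaling factor for the current individual.

    ASSUMPTION: illustrative rule only, not the patent's formula.
    """
    if not 0 <= g <= G or G <= 0:
        raise ValueError("require 0 <= g <= G and G > 0")
    progress = g / G                      # 0 at the start, 1 at the end
    spread = abs(f_worst - f_best)
    # Normalized distance of the current individual from the best one.
    gap = 0.0 if spread == 0 else abs(f_i - f_best) / spread
    return F_min + (F_max - F_min) * gap * (1.0 - progress)


# The best individual is barely perturbed; the worst is perturbed most,
# and the perturbation decays over the iterations.
print(adaptive_scale(1.0, 1.0, 5.0, g=0, G=300))
print(adaptive_scale(5.0, 1.0, 5.0, g=0, G=300))
print(adaptive_scale(5.0, 1.0, 5.0, g=150, G=300))
```

Using the normalized gap makes the rule independent of the scale of the fitness values, which is a design choice of this sketch rather than something stated in the source.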
Using the DE/current-to-best strategy, the mutation vector is:
$$w_{i,g} = v_{i,g} + F_i\,(v_{best,g} - v_{i,g}) + F_i\,(v_{r1,g} - v_{r2,g})$$
where $w_{i,g}$ is the mutation vector; $F_i$ is the scaling factor of the current individual determined by the previous equation; $v_{i,g}$ is the current individual vector and $v_{best,g}$ is the best individual of the population; $r1$ and $r2$ are two different integers with $0 < r1, r2 < NP$, where $NP$ is the population size.
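The mutation formula above translates directly into code. The sketch below is a minimal pure-Python version; the function name `mutate` and the choice to exclude the current index $i$ when drawing $r1$ and $r2$ are assumptions (the text only requires $r1 \neq r2$).

```python
import random
from typing import List

Vector = List[float]


def mutate(pop: List[Vector], i: int, best: int, F_i: float,
           rng: random.Random) -> Vector:
    """w_i = v_i + F_i * (v_best - v_i) + F_i * (v_r1 - v_r2)."""
    # ASSUMPTION: r1 and r2 are also chosen different from i, a common DE convention.
    candidates = [r for r in range(len(pop)) if r != i]
    r1, r2 = rng.sample(candidates, 2)
    v_i, v_best, v_r1, v_r2 = pop[i], pop[best], pop[r1], pop[r2]
    return [a + F_i * (b - a) + F_i * (c - d)
            for a, b, c, d in zip(v_i, v_best, v_r1, v_r2)]


rng = random.Random(0)
# Hypothetical population of 4 individuals in 2 dimensions.
population = [[0.0, 0.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
w = mutate(population, i=0, best=1, F_i=0.5, rng=rng)
print(w)
```

In this toy population every individual other than the current one is identical, so the difference term vanishes and the mutant lies halfway between the current and the best individual.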
(3) Crossover
The crossover rate CR determines, for each dimension, the probability of crossover between the mutant and the original individual. Individuals with larger fitness values may take a larger CR, which accelerates changes in the individual's structure. It is better to use a smaller CR in the later stages of evolution, to reduce the interference of the target individual with the trial individual and ensure the convergence speed of the algorithm. The designed crossover rate is as follows:
Figure BDA0002741228470000111
where
Figure BDA0002741228470000112
is the current average fitness; $CR_i$ is the current crossover rate, and $CR_{max}$ and $CR_{min}$ are the maximum and minimum values of $CR$. When the fitness of the target individual $v_{i,g}$ is less than the average fitness, the target individual is relatively good, so a smaller $CR_i$ should be chosen and the trial vector takes more of its information from the target vector $v_{i,g}$. Otherwise, the trial vector $u_{i,g}$ takes more of its information from the mutation vector $w_{i,g}$, which improves the diversity of the population. $\Delta g$ ensures that a larger $CR_i$ is obtained early in evolution, which increases population diversity and speeds up convergence. In the later stages of evolution, a smaller $CR_i$ helps to find the optimal solution.
The crossover operation can be expressed as:
Figure BDA0002741228470000113
where $u_{ij,g}$ is the $j$th component of the trial vector $u_{i,g}$; $rnbr$ is a random integer less than the integer $D$; and $rand[0,1]$ is a random number between 0 and 1.
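A minimal sketch of the crossover step follows. It assumes the standard binomial crossover of differential evolution, in which a component is taken from the mutation vector when a random draw does not exceed $CR_i$ or when its index equals $rnbr$, and it assumes that $D$ is the vector dimension; the patent's own expression is not reproduced in the text, so this rule and the name `crossover` are assumptions.

```python
import random
from typing import List

Vector = List[float]


def crossover(target: Vector, mutant: Vector, CR_i: float,
              rng: random.Random) -> Vector:
    """Binomial crossover between the target vector and the mutation vector.

    ASSUMPTION: standard DE rule; rnbr forces at least one mutant component.
    """
    D = len(target)
    rnbr = rng.randrange(D)  # random integer in [0, D)
    return [mutant[j] if (rng.random() <= CR_i or j == rnbr) else target[j]
            for j in range(D)]


rng = random.Random(0)
v = [0.0, 0.0, 0.0, 0.0]  # hypothetical target vector
w = [1.0, 2.0, 3.0, 4.0]  # hypothetical mutation vector
u_all = crossover(v, w, CR_i=1.0, rng=rng)  # every component from the mutant
u_few = crossover(v, w, CR_i=0.0, rng=rng)  # rnbr still injects one component
print(u_all, u_few)
```

Forcing one component through `rnbr` guarantees that the trial vector never duplicates the target exactly, which is why the index appears in the rule at all.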
(4) Selection
The selection operation keeps whichever of the newly generated trial vector and the original target vector has the better fitness as a member of the next generation. This is a "greedy" selection operation, which may be described as follows:
Figure BDA0002741228470000114
where $v_{i,g+1}$ is the individual of the next generation.
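The greedy selection can be sketched in a few lines. The sketch assumes that smaller fitness values are better, which matches the earlier remark that a target individual with below-average fitness is relatively good; the function name `select` and the quadratic test fitness are hypothetical.

```python
from typing import Callable, List

Vector = List[float]


def select(target: Vector, trial: Vector,
           fitness: Callable[[Vector], float]) -> Vector:
    """Greedy selection: keep the trial vector only if it is at least as fit.

    ASSUMPTION: fitness is minimized.
    """
    return trial if fitness(trial) <= fitness(target) else target


def sphere(v: Vector) -> float:
    """Hypothetical fitness used only for this illustration."""
    return sum(c * c for c in v)


v_next = select([1.0, 1.0], [0.5, 0.5], sphere)  # trial is better, so it survives
v_same = select([0.1, 0.1], [0.5, 0.5], sphere)  # trial is worse, target is kept
print(v_next, v_same)
```

Because the survivor is never worse than the target, the best fitness in the population cannot deteriorate from one generation to the next.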
The optimal solution is obtained with the improved differential evolution algorithm. The optimal parameter $x_i$ in the optimal solution is the probability of adopting the corresponding strategy at the current moment, and the AUV executes the action corresponding to the optimal strategy.
Example (c): assume that "a" and "D" participate in a 2-to-2 AUV underwater confrontation. The initial positions of 'a1', 'a2' are (0m, 100m, 200m), (0m, -100m, 200m), 'D1', 'D2' are (800m, 100m, 200m), (800m, -100m, 200 m). The speeds, deflection angles and pitch angles of A1 and A2 are 23m/s, -60 degrees, 5 degrees and 23m/s, 60 degrees and-5 degrees respectively; the velocities, yaw angles and pitch angles of "D1" and "D2" were 25m/s, 120 °, 3 ° and 25m/s, respectively, and-120 ° and-3 °. Both have the same control capability, and the time interval of the opposite step is 5 s. It is clear that "D" has advantages from the outset. It should also be noted that the maximum maneuver step should be determined based on the effectiveness of the AUV used in the confrontation. For comparison of the challenge performance, "a" uses the collaborative dynamic maneuver decision algorithm proposed by the present invention, and "D" uses the max-min decision algorithm in the multiple AUV challenge process. The three-dimensional challenge process with 5 main stages is shown in fig. 4-7. "+" shows the initial position and "4" shows the current position. The confrontation is ended when the expected profit of one party reaches absolute advantage.
The parameters used in the calculation are as follows:
$k_1 = 0.445$, $k_2 = 0.222$, $k_3 = 0.333$, $\delta_1 = 0.9$, $\delta_2 = 0.1$,
Figure BDA0002741228470000121
$\omega_1 = \omega_2 = \omega_3 = \omega_4 = 0.25$, $G = 300$, $NP = 100$.
As shown in FIG. 3, the confrontation process has 50 steps, and the figure shows the expected revenue along the way. The expected revenue obtained in the final part indicates that the Nash equilibrium of the interval information game is satisfied.
As shown in FIG. 4, "A" dominates: "A1" attempts to attack "D2" and "A2" moves toward "D1".
As shown in FIG. 5, "A1" and "A2" attempt to attack "D1", while "D2" attempts to overtake "A2". The situation then changes, and "D" is dominant in stage 3. This can also be verified in FIG. 3, where the expected revenue changes from positive to negative.
As shown in FIG. 6, "D1" and "D2" still try to attack "A2", but "A2" continues to pursue "D1" and "A1" turns back toward "D2".
As shown in FIG. 7, the situation changes again: "A" dominates and the expected revenue changes from negative to positive. "A2" continues to turn and successfully drives "D1" away; "A1" and "A2" then try to attack "D2", but "D1" and "D2" escape in two different directions.
As shown in FIG. 8, "A1" and "A2" both finally occupy dominant positions, so "A" gains absolute advantage and the confrontation ends.

Claims (2)

1. A multi-AUV dynamic maneuver decision method based on an interval information game is characterized by comprising the following steps:
step 1: obtaining the advantage functions of the multi-AUV systems of the two opposing sides from the situation advantage and the energy efficiency advantage:
the situation advantage comprises an angle advantage $A_{ag}$, a speed advantage $A_s$ and a distance advantage $A_{dis}$:
Figure FDA0002741228460000011
where $|AA|$ is the viewing angle of the two AUV players and $ATA$ is the target incident angle;
Figure FDA0002741228460000012
where $v_{n1i}$ and $v_{n2j}$ are the velocity vectors of the two parties in the game; the subscripts $n1$ and $n2$ denote the two opposing sides, and $i$ and $j$ denote the $i$th AUV of one side and the $j$th AUV of the other;
Figure FDA0002741228460000013
where $D_{ij}$ is the distance between the AUVs; $R_0 = (R_{max} + R_{min})/2$; $R_{max}$ is the maximum starting distance and $R_{min}$ is the minimum starting distance;
the overall situation advantage function is $W_A = k_1 A_{ag} + k_2 A_s + k_3 A_{dis}$, where $k_1, k_2, k_3$ are weighting coefficients with $k_1 + k_2 + k_3 = 1$;
The energy efficiency advantage function is written as:
Figure FDA0002741228460000014
where $C_{n1i}$ and $C_{n2j}$ are the energy efficiencies of the AUV systems of the two opposing sides;
the overall advantage function of our multi-AUV system is:
Figure FDA0002741228460000015
where $\delta_1, \delta_2$ are weighting coefficients satisfying $\delta_1 + \delta_2 = 1$;
Figure FDA0002741228460000016
Figure FDA0002741228460000017
represent the advantage function with its lower and upper bounds;
the overall advantage function of the enemy in the game is obtained in the same way:
Figure FDA0002741228460000018
Step 2: obtaining a payment matrix of the multi-AUV system according to the interval information and payment interval grades combining the four parameter interval sets and the relative entropy:
the payment function of the multi-AUV game under uncertain information obtained according to the advantage function in the step 1 is established as follows:
Figure FDA0002741228460000021
wherein xij,yjiIs a binary decision variable, xij1 denotes the ith AUV of my party attacking the jth AUV of the enemy, and xij0 means that the ith AUV of my party does not attack the jth AUV of the enemy; likewise, yjiWhether the jth AUV representing the enemy attacks the ith AUV of the enemy or not;
the payment matrix under uncertain information is therefore:
Figure FDA0002741228460000022
an improved method combining the four-parameter interval sets and the relative entropy is applied, letting
Figure FDA0002741228460000023
Figure FDA0002741228460000024
$W_{mn}$ is the normalized advantage function,
Figure FDA0002741228460000025
wherein $x_1, x_2, \dots, x_m$ are the maneuver strategies of our multi-AUV system and $y_1, y_2, \dots, y_n$ are the maneuver strategies of the enemy multi-AUV system,
Figure FDA0002741228460000026
represents the payment when our multi-AUV system uses the m-th strategy and the enemy multi-AUV system uses the n-th strategy;
Step 3: solving for the Nash equilibrium optimal solution and finding the optimal strategy:
the confrontation trajectory of the multiple AUVs is regarded as a combination of individual actions; a k-stage dynamic game is used in the confrontation process, and each stage of the game offers 7 actions, namely keeping the current motion, accelerating, decelerating, turning left, turning right, climbing, and diving;
considering the practical subsea environment constraints, solving the Nash equilibrium problem can be converted into an optimization problem with interval uncertainty parameters:
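The seven-action set and the k-stage structure can be illustrated as follows. This is a sketch under assumptions: the `Maneuver` enum, its member names, and `maneuver_sequences` are introduced here for illustration and are not named in the text.

```python
from enum import Enum
from itertools import product


class Maneuver(Enum):
    """The seven per-stage actions listed in step 3."""
    KEEP = "keep current motion"
    ACCELERATE = "accelerate"
    DECELERATE = "decelerate"
    TURN_LEFT = "turn left"
    TURN_RIGHT = "turn right"
    CLIMB = "climb"
    DIVE = "dive"


def maneuver_sequences(k):
    """Return an iterator over every length-k tuple of maneuvers."""
    return product(Maneuver, repeat=k)
```

A k-stage game has 7**k possible action sequences per AUV, which is one reason to choose an action stage by stage rather than enumerate whole trajectories.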
Figure FDA0002741228460000031
Figure FDA0002741228460000032
wherein $x_i$ denotes the probability that our AUV adopts the i-th strategy,
Figure FDA0002741228460000033
is the threshold value of the benefit of the participant,
Figure FDA0002741228460000034
the payment of my AUV system using the ith strategy and the payment of the enemy AUV system using the jth strategy;
the optimal parameter $x_i$ in the optimal solution is the probability of adopting the corresponding strategy at the current stage, and the AUV executes the action corresponding to the optimal strategy.
2. The multi-AUV dynamic maneuver decision method based on the interval information game as claimed in claim 1, wherein in step 3 an improved differential evolution algorithm is used to solve for the Nash equilibrium optimal solution and find the optimal strategy; the improved differential evolution algorithm comprises mutation, crossover, and selection steps and, in order to select the best fitness, is combined with the game algorithm; when a new mutation vector is generated, the scaling factor F is determined from the evolution time and the difference between the best individual and the worst individual:
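To make the link between a mixed strategy and the executed action concrete, the sketch below approximates an equilibrium strategy for a small matrix game and then picks the most probable action. It rests on several assumptions that go beyond the text: the interval payments are taken to have been reduced to single numbers already, the game is treated as zero-sum, and fictitious play is swapped in for the interval optimization problem of step 3, whose exact form is given only in the figures. The names `fictitious_play` and `most_probable_action` are hypothetical.

```python
def fictitious_play(payoff, iterations=20000):
    """Approximate the row player's equilibrium mixed strategy.

    payoff[i][j] is the (crisp) payment to the row player when the
    row player uses strategy i and the column player uses strategy j.
    """
    m, n = len(payoff), len(payoff[0])
    row_counts = [0] * m
    row_gain = [0.0] * m  # cumulative payment of each row vs. column history
    col_loss = [0.0] * n  # cumulative payment conceded by each column
    i, j = 0, 0           # arbitrary initial pure strategies
    for _ in range(iterations):
        row_counts[i] += 1
        for r in range(m):
            row_gain[r] += payoff[r][j]
        for c in range(n):
            col_loss[c] += payoff[i][c]
        # Each side best-responds to the opponent's empirical play so far.
        i = max(range(m), key=row_gain.__getitem__)
        j = min(range(n), key=col_loss.__getitem__)
    return [count / iterations for count in row_counts]


def most_probable_action(x):
    """Index of the strategy with the highest probability in x."""
    return max(range(len(x)), key=x.__getitem__)
```

For a game with a dominant row, such as `[[2, 3], [0, 1]]`, the returned strategy puts all probability on that row and `most_probable_action` returns its index. Sampling from `x` instead of taking the maximum would be an equally reasonable reading of "probability of adopting the strategy".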
Figure FDA0002741228460000035
wherein $\Delta G = g/G$, $G$ is the maximum number of iterations, and $g$ is the current number of iterations; $f_{best}$ is the best fitness, $f_{worst}$ is the worst fitness, and $f_i$ is the fitness of the current individual; $F_{max}$ and $F_{min}$ are the maximum and minimum values of $F$.
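For orientation, one generation of standard differential evolution (the DE/rand/1/bin variant) with mutation, crossover, and selection can be sketched as below. This is not the improved algorithm of claim 2: the adaptive rule for the scaling factor is given only in a figure and is not reproduced, so `f_scale` is simply passed in by the caller. The function name `de_step`, the crossover rate `cr`, and the minimization convention are assumptions made for the example.

```python
import random


def de_step(population, fitness, f_scale, cr=0.9, rng=random):
    """One DE/rand/1/bin generation that minimizes `fitness`.

    population: list of at least 4 equal-length lists of floats.
    f_scale:    scaling factor F applied to the difference vector.
    """
    dim = len(population[0])
    next_population = []
    for i, target in enumerate(population):
        # Mutation: combine three distinct individuals other than the target.
        others = [p for k, p in enumerate(population) if k != i]
        a, b, c = rng.sample(others, 3)
        mutant = [a[d] + f_scale * (b[d] - c[d]) for d in range(dim)]
        # Crossover: take each gene from the mutant with probability cr;
        # one randomly chosen gene always comes from the mutant.
        forced = rng.randrange(dim)
        trial = [mutant[d] if (rng.random() < cr or d == forced) else target[d]
                 for d in range(dim)]
        # Selection: keep whichever of trial and target is at least as fit.
        next_population.append(
            trial if fitness(trial) <= fitness(target) else target)
    return next_population
```

An adaptive variant would recompute `f_scale` before each mutation from the iteration ratio and the fitness spread of the population, in place of the fixed value used here.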

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011150930.7A CN112306070A (en) 2020-10-24 2020-10-24 Multi-AUV dynamic maneuver decision method based on interval information game

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011150930.7A CN112306070A (en) 2020-10-24 2020-10-24 Multi-AUV dynamic maneuver decision method based on interval information game

Publications (1)

Publication Number Publication Date
CN112306070A true CN112306070A (en) 2021-02-02

Family

ID=74327579

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011150930.7A Withdrawn CN112306070A (en) 2020-10-24 2020-10-24 Multi-AUV dynamic maneuver decision method based on interval information game

Country Status (1)

Country Link
CN (1) CN112306070A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112651107A (en) * 2021-02-23 2021-04-13 西安工业大学 Game-resisting target damage strategy evaluation method
CN112651107B (en) * 2021-02-23 2023-06-20 西安工业大学 Method for evaluating damage strategy of countergame target
CN113033107A (en) * 2021-04-16 2021-06-25 西北工业大学 Multi-AUV cluster game countermeasure model construction method based on central intelligence set theory
CN114079882A (en) * 2021-11-15 2022-02-22 广东工业大学 Method and device for cooperative computing and path control of multiple unmanned aerial vehicles
CN114079882B (en) * 2021-11-15 2024-04-05 广东工业大学 Method and device for cooperative calculation and path control of multiple unmanned aerial vehicles

Similar Documents

Publication Publication Date Title
CN112306070A (en) Multi-AUV dynamic maneuver decision method based on interval information game
CN110928329B (en) Multi-aircraft track planning method based on deep Q learning algorithm
CN112305913A (en) Multi-UUV collaborative dynamic maneuver decision method based on intuitive fuzzy game
CN108318032A (en) A kind of unmanned aerial vehicle flight path Intelligent planning method considering Attack Defence
CN111240353A (en) Unmanned aerial vehicle collaborative air combat decision method based on genetic fuzzy tree
CN113221444B (en) Behavior simulation training method for air intelligent game
CN113052289B (en) Method for selecting cluster hitting position of unmanned ship based on game theory
CN110673488A (en) Double DQN unmanned aerial vehicle concealed access method based on priority random sampling strategy
CN114638339A (en) Intelligent agent task allocation method based on deep reinforcement learning
CN115525058B (en) Unmanned submarine vehicle cluster cooperative countermeasure method based on deep reinforcement learning
CN114139023B (en) Multi-target hierarchical grouping method for marine situation generation based on Louvain algorithm
CN116185059A (en) Unmanned aerial vehicle air combat autonomous evasion maneuver decision-making method based on deep reinforcement learning
CN116225049A (en) Multi-unmanned plane wolf-crowd collaborative combat attack and defense decision algorithm
Liu et al. Multi‐UUV Cooperative Dynamic Maneuver Decision‐Making Algorithm Using Intuitionistic Fuzzy Game Theory
CN110163519B (en) UUV red and blue threat assessment method for base attack and defense tasks
CN116680509A (en) Dynamic matching method for multi-spacecraft escape-tracking game task
CN111624996A (en) Multi-unmanned-boat incomplete information trapping method based on game theory
CN111773722B (en) Method for generating maneuver strategy set for avoiding fighter plane in simulation environment
CN116661496B (en) Multi-patrol-missile collaborative track planning method based on intelligent algorithm
CN117408376A (en) Soldier chess operator position prediction method and system based on battlefield division and attraction map
CN117270528A (en) Unmanned ship escape game control method and controller
CN116432030A (en) Air combat multi-intention strategy autonomous generation method based on deep reinforcement learning
CN116225065A (en) Unmanned plane collaborative pursuit method of multi-degree-of-freedom model for multi-agent reinforcement learning
CN113255234B (en) Method for carrying out online target distribution on missile groups
CN113095465B (en) Underwater unmanned cluster task allocation method for quantum salmon migration mechanism evolution game

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20210202