CN112306070A - Multi-AUV dynamic maneuver decision method based on interval information game - Google Patents
- Publication number
- CN112306070A (application CN202011150930.7A)
- Authority
- CN
- China
- Prior art keywords
- auv
- interval
- payment
- enemy
- game
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05D—SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
- G05D1/00—Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
- G05D1/04—Control of altitude or depth
- G05D1/06—Rate of change of altitude or depth
- G05D1/0692—Rate of change of altitude or depth specially adapted for under-water vehicles
Landscapes
- Engineering & Computer Science (AREA)
- Aviation & Aerospace Engineering (AREA)
- Radar, Positioning & Navigation (AREA)
- Remote Sensing (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Automation & Control Theory (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention relates to a multi-AUV dynamic maneuver decision method based on an interval information game. First, based on the multi-AUV maneuver strategies, an advantage function consisting of situation advantages and energy efficiency advantages is proposed. Subsequently, a payment matrix is constructed, which consists of interval information and payment interval levels obtained by combining four-parameter interval sets with relative entropy. Then, Nash equilibrium conditions satisfying the interval game conditions are given, and a Nash equilibrium maneuver decision model for the dynamic marine environment is established. Meanwhile, an improved differential evolution algorithm is applied to solve the model and find the optimal strategy. The invention addresses the modeling difficulty caused by the weak connectivity, uncertainty and variability of the underwater environment, and makes the established model more convincing and more reliable when applied in actual waters.
Description
Technical Field
The invention belongs to the field of multi-underwater robot cooperative confrontation, and particularly relates to a multi-AUV cooperative confrontation method based on an interval information game.
Background
With the development of science and technology, Autonomous Underwater Vehicles (AUVs) have been widely used in marine observation, marine rescue, mine area search, enemy reconnaissance and related fields. Owing to its space-time distribution and redundant configuration, a multi-AUV system is efficient and reliable, and it provides a new solution for complex ocean tasks. Multi-AUV game cooperation can be used for ocean research and military confrontation, including underwater multi-target tracking, monitoring and detection; it can effectively enlarge the underwater combat radius and reduce equipment losses and casualties.
The maneuver decision is the key to multi-AUV cooperative confrontation and is the basic action of each confrontation step. There has been much research on unilateral strategy optimization, but little on bilateral game theory. Introducing cooperative game theory into the maneuver decision of the multi-AUV cluster therefore allows a more scientific and more accurate real-time confrontation strategy to be made.
Disclosure of Invention
Technical problem to be solved
Existing methods ignore the complexity and uncertainty of the underwater environment and cannot obtain accurate real-time underwater environment characteristics, so the reliability of the established decision model is low. In addition, the conventional decision algorithms they adopt easily fall into local optima. As a result, an accurate and reliable decision scheme is difficult to obtain, and the methods cannot be applied in a real sea area. Aiming at these defects, the invention provides a multi-AUV dynamic maneuver decision algorithm based on an interval information game.
Technical scheme
A multi-AUV dynamic maneuver decision method based on an interval information game is characterized by comprising the following steps:
Step 1: obtaining the advantage function of the multi-AUV systems of the two opposing sides from the situation advantage and the energy efficiency advantage:
the situation advantage comprises an angle advantage $A_{ag}$, a speed advantage $A_s$ and a distance advantage $A_{dis}$;
where $v_{n1i}$ and $v_{n2j}$ are the velocity vectors of the two sides in the game; the subscripts n1 and n2 denote the two opposing sides, and i and j index the i-th and j-th AUV of the respective side;
where $D_{ij}$ is the distance between the AUVs; $R_0 = (R_{max} + R_{min})/2$; $R_{max}$ is the maximum starting distance and $R_{min}$ is the minimum starting distance;
the overall situation advantage function is $W_A = k_1 A_{ag} + k_2 A_s + k_3 A_{dis}$, where $k_1, k_2, k_3$ are weighting coefficients with $k_1 + k_2 + k_3 = 1$;
The energy efficiency advantage function is written as:
where $\delta_1, \delta_2$ are weighting coefficients satisfying $\delta_1 + \delta_2 = 1$; […] represents an advantage function having upper and lower bounds;
Step 2: obtaining a payment matrix of the multi-AUV system according to the interval information and payment interval grades combining the four parameter interval sets and the relative entropy:
the payment function of the multi-AUV game under uncertain information obtained according to the advantage function in the step 1 is established as follows:
where $x_{ij}$ and $y_{ji}$ are binary decision variables: $x_{ij} = 1$ means that our i-th AUV attacks the enemy's j-th AUV, and $x_{ij} = 0$ means that it does not; likewise, $y_{ji}$ indicates whether the enemy's j-th AUV attacks our i-th AUV;
the payment matrix under uncertain information is therefore:
the method combining four-parameter interval sets and relative entropy is applied so that $W_{mn}$ is a normalized advantage function, where $x_1, x_2, \dots, x_m$ are the maneuver strategies of our multi-AUV system, $y_1, y_2, \dots, y_n$ are the maneuver strategies of the enemy multi-AUV system, and […] represents the payment when our AUV system uses the m-th strategy and the enemy AUV system uses the n-th strategy;
Step 3: solving for the Nash equilibrium optimal solution and finding the optimal strategy:
the confrontation trajectory of the multiple AUVs is regarded as a combination of individual actions; a k-level dynamic game is used in the confrontation process, and each level of the game comprises 7 actions: maintaining the current course, accelerating, decelerating, turning left, turning right, climbing and diving;
considering the practical subsea environment constraints, solving the Nash equilibrium problem can be converted into an optimization problem with interval uncertainty parameters:
where $x_i$ is the probability that our AUV system adopts the i-th strategy, […] is the payoff threshold of the participant, and […] is the payment when our AUV system uses the i-th strategy and the enemy AUV system uses the j-th strategy;
the optimal parameter $x_i$ in the optimal solution is the probability of adopting the optimal strategy at the current moment, and the AUV executes the action corresponding to the optimal strategy.
In step 3, an improved differential evolution algorithm is adopted to solve for the Nash equilibrium optimal solution and find the optimal strategy; the improved differential evolution algorithm comprises the steps of mutation, crossover and selection. To select the best fitness, and in combination with the game algorithm, the scaling factor F used when generating a new mutation vector is determined according to the evolution time and the difference between the best and the worst individual:
where $\Delta g = g/G$, $G$ is the maximum number of iterations and $g$ is the current number of iterations; $f_{best}$ is the best fitness, $f_{worst}$ is the worst fitness, and $f_i$ is the fitness of the current individual; $F_{max}$ and $F_{min}$ are the maximum and minimum values of $F$.
Advantageous effects
The multi-AUV dynamic maneuver decision method based on the interval information game has the following beneficial effects:
(1) It addresses the modeling difficulty caused by the weak connectivity, uncertainty and variability of the underwater environment. Interval information can represent underwater environment characteristics that include various uncertainties, so the established model is more convincing and more reliable when applied in actual waters.
(2) It addresses the problem of the decision algorithm falling into a local optimum. The improved differential evolution algorithm selected for solving the problem searches for the optimal solution globally and effectively avoids being trapped in local optima, so the result obtained is more accurate and credible.
Drawings
FIG. 1: level k gaming for multiple AUV systems
FIG. 2: operating procedure of IDE Algorithm
FIG. 3: expected revenue
FIG. 4: multi-AUV collaborative dynamic maneuver decision: first stage
FIG. 5: multi-AUV collaborative dynamic maneuver decision: second stage
FIG. 6: multi-AUV collaborative dynamic maneuver decision: the third stage
FIG. 7: multi-AUV collaborative dynamic maneuver decision: fourth stage
FIG. 8: multi-AUV collaborative dynamic maneuver decision: the fifth stage
Detailed Description
The invention will now be further described with reference to the following examples and drawings:
The technical scheme is as follows. First, based on the multi-AUV maneuver strategies, an advantage function consisting of situation advantages and energy efficiency advantages is proposed. Subsequently, a payment matrix is constructed, which consists of interval information and payment interval levels obtained by combining four-parameter interval sets with relative entropy. Then, Nash equilibrium conditions satisfying the interval game conditions are given, and a Nash equilibrium maneuver decision model for the dynamic marine environment is established. Meanwhile, an improved differential evolution algorithm is applied to solve the model and find the optimal strategy. Finally, the superiority of the proposed multi-AUV dynamic maneuver decision algorithm is verified by an example. The method comprises the following steps:
Step 1: obtaining the advantage function of the multi-AUV systems of the two opposing sides from the situation advantage and the energy efficiency advantage:
To establish the interval-information payment matrix, a maneuver attribute evaluation method for multiple AUVs is proposed based on the situation information of the two opposing sides, as shown in the k-level dynamic game of FIG. 1. The confrontation trajectory of the multiple AUVs is treated as a combination of individual actions. The enemy's multi-AUV system and our multi-AUV system are the decision makers in the game.
The gaming model of the multiple AUV system based on uncertain information can be expressed as:
where $N = \{N_1, N_2\}$ is the set of decision makers in the game; […] is the strategy space of the decision makers, in which […] means that our side selects the i-th strategy and […] means that the enemy selects the j-th strategy in the k-th stage; […] is the payoff interval corresponding to each strategy that the multi-AUV systems participating in the game may select. According to the game tree shown in FIG. 1, the actions of a multi-AUV system in the k-th stage of the game can be represented by an information set, so a maneuver strategy is in effect a set of action rules of the multi-AUV system for each information set.
The main difference between multi-AUV confrontation and the confrontation of other autonomous robots is the mode of information transfer. Owing to the marine environment, information in the multi-AUV confrontation process is transmitted mainly by underwater acoustic waves. The shallow-sea acoustic channel varies in space, time and frequency; it suffers from strong multipath interference, high environmental noise, large transmission loss and a severe Doppler shift effect. The information available during multi-AUV confrontation is therefore highly uncertain, and it is difficult to quantify the threat level of both sides accurately in the decision-making process. For this reason, in the present invention each attribute is represented by an interval information set in the decision process. The advantage function used to evaluate the payment of each AUV consists of two parts: a situation advantage and an energy efficiency advantage.
In order to attack an enemy multi-AUV system, it is necessary to occupy a favorable attack position and minimize the attack risk of our multi-AUV system.
(1) The situation advantage comprises an angle advantage $A_{ag}$, a speed advantage $A_s$ and a distance advantage $A_{dis}$;
where $v_{n1i}$ and $v_{n2j}$ are the velocity vectors of the two sides in the game; the subscripts n1 and n2 denote the two opposing sides, and i and j index the i-th and j-th AUV of the respective side.
where $D_{ij}$ is the distance between the AUVs; $R_0 = (R_{max} + R_{min})/2$; $R_{max}$ is the maximum starting distance and $R_{min}$ is the minimum starting distance;
the overall situation advantage function is $W_A = k_1 A_{ag} + k_2 A_s + k_3 A_{dis}$, where $k_1, k_2, k_3$ are weighting coefficients with $k_1 + k_2 + k_3 = 1$.
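The weighted combination above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the individual advantage values are taken as given inputs (their formulas are not reproduced here), and the function names and default weights are hypothetical.

```python
from typing import Sequence


def reference_distance(r_max: float, r_min: float) -> float:
    """R_0 = (R_max + R_min) / 2, the midpoint of the starting-distance range."""
    return (r_max + r_min) / 2.0


def situation_advantage(a_ag: float, a_s: float, a_dis: float,
                        k: Sequence[float] = (0.4, 0.3, 0.3)) -> float:
    """Overall situation advantage W_A = k1*A_ag + k2*A_s + k3*A_dis.

    The default weights are placeholders chosen for illustration only;
    the only requirement taken from the text is k1 + k2 + k3 = 1.
    """
    if len(k) != 3:
        raise ValueError("exactly three weighting coefficients are required")
    if abs(sum(k) - 1.0) > 1e-9:
        raise ValueError("weighting coefficients must sum to 1")
    k1, k2, k3 = k
    return k1 * a_ag + k2 * a_s + k3 * a_dis
```

For instance, `situation_advantage(1.0, 0.5, 0.0, (0.5, 0.3, 0.2))` weights the angle advantage most heavily and ignores the zero distance advantage.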
(2) The energy efficiency advantage function may be written as:
where $\delta_1, \delta_2$ are weighting coefficients satisfying $\delta_1 + \delta_2 = 1$; […] represents an advantage function with upper and lower bounds.
(4) By exchanging the situation information parameters of the two parties, the overall advantage function W E of the enemy in the game can be obtained2。
Step 2: obtaining the payment matrix of the multi-AUV system from the interval information and the payment interval levels obtained by combining four-parameter interval sets with relative entropy.
The payment matrix of the multi-AUV system is constructed. It consists of interval information and payment interval levels obtained by combining four-parameter interval sets with relative entropy. The payment in a game is the final gain or loss of a player resulting from its strategy selection. In multi-AUV confrontation, a gain for our AUVs is necessarily a loss for the enemy AUVs. The game of the present invention therefore belongs to the category of two-player zero-sum games.
Because of the various underwater interference factors, a multi-AUV system cannot obtain accurate information in an actual underwater maneuver decision. A reasonable analysis of the confrontation situation shows that each interference factor usually varies within a certain interval. The payment matrix of each multi-AUV system is therefore established on the basis of interval information.
The payment function of the multi-AUV game under uncertain information obtained according to the advantage function in the step 1 is established as follows:
where $x_{ij}$ and $y_{ji}$ are binary decision variables: $x_{ij} = 1$ means that our i-th AUV attacks the enemy's j-th AUV, and $x_{ij} = 0$ means that it does not; likewise, $y_{ji}$ indicates whether the enemy's j-th AUV attacks our i-th AUV.
Unlike real numbers, interval information sets cannot be compared directly by magnitude. Ranking methods based on degree of likelihood may fail, while ranking methods based on geometric distance may cause significant information loss. To avoid these drawbacks, a method combining four-parameter interval sets and relative entropy is proposed.
The payment interval […] is derived from the combined information of the two sides of the game. However, the payment does not take into account the distribution of points within the interval. In practice, the points within the payment interval cannot simply be assumed to be uniformly distributed; the distribution should change as the underwater confrontation evolves. For a strategy $x_i$, when the confrontation situation favors the attacker, the attacker's payoff inevitably tends toward $f^R$; otherwise, it tends toward $f^L$. To make full use of the advantage matrix information, the payment interval is converted into a four-parameter interval set.
The payment matrix under uncertain information is therefore:
the method combining four-parameter interval sets and relative entropy is applied so that $W_{mn}$ is a normalized advantage function, where $x_1, x_2, \dots, x_m$ are the maneuver strategies of our multi-AUV system, $y_1, y_2, \dots, y_n$ are the maneuver strategies of the enemy multi-AUV system, and […] represents the payment when our AUV system uses the m-th strategy and the enemy AUV system uses the n-th strategy.
The basic idea of the ranking method is to use the information entropy to measure the difference between the AUV self income and the maximum income (minimum income) under different strategies, and select the strategy with the minimum difference between the AUV self income and the maximum income (or the strategy with the maximum difference between the AUV self income and the minimum income). In fact, the highest reward indicates that the AUV has completed the intended task and there is no casualty. The lowest profit represents the situation where the AUV has not completed the intended task and casualties are greatest.
Step 3: applying the improved differential evolution algorithm to solve for the Nash equilibrium optimal solution and find the optimal strategy.
The confrontation trajectory of the multiple AUVs is regarded as a combination of individual actions. A k-level dynamic game is used in the confrontation process, and each level of the game comprises 7 basic actions: maintaining the current course, accelerating, decelerating, turning left, turning right, climbing and diving.
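To make the size of the action space concrete, the seven basic actions can be enumerated and combined per game level. The following is a minimal sketch; the enum and function names are hypothetical, and it assumes that each side picks one of the seven actions per AUV at every level.

```python
from enum import Enum
from itertools import product
from typing import List, Tuple


class Maneuver(Enum):
    """The seven basic actions available at each level of the game."""
    HOLD_COURSE = "maintain current course"
    ACCELERATE = "accelerate"
    DECELERATE = "decelerate"
    TURN_LEFT = "turn left"
    TURN_RIGHT = "turn right"
    CLIMB = "climb"
    DIVE = "dive"


def joint_actions() -> List[Tuple[Maneuver, Maneuver]]:
    """All (our action, enemy action) pairs for one AUV per side at one level."""
    return list(product(Maneuver, Maneuver))


def count_action_sequences(levels: int) -> int:
    """Number of distinct action sequences for a single AUV over k levels."""
    if levels < 0:
        raise ValueError("levels must be non-negative")
    return len(Maneuver) ** levels
```

Because the number of sequences grows as 7^k, exhaustive enumeration quickly becomes impractical, which is one motivation for solving each level with an optimization algorithm.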
Considering the practical subsea environment constraints, solving the Nash equilibrium problem can be converted into an optimization problem with interval uncertainty parameters:
where $x_i$ is the probability that our AUV system adopts the i-th strategy, […] is the payoff threshold of the participant, and […] is the payment when our AUV system uses the i-th strategy and the enemy AUV system uses the j-th strategy.
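As an illustration of how a mixed strategy interacts with interval payments, the sketch below evaluates the expected payment interval for given strategy probabilities. It is not the optimization model of the invention: the representation of the payment matrix as separate lower-bound and upper-bound matrices (`w_low`, `w_high`) and all function names are assumptions made for this example.

```python
from typing import Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def _check_distribution(p: Sequence[float], name: str) -> None:
    """Validate that p is a probability vector."""
    if any(v < 0 for v in p) or abs(sum(p) - 1.0) > 1e-9:
        raise ValueError(f"{name} must be non-negative and sum to 1")


def expected_interval_payoff(x: Sequence[float], y: Sequence[float],
                             w_low: Matrix, w_high: Matrix) -> Tuple[float, float]:
    """Expected payment interval for our mixed strategy x against enemy mixed strategy y.

    Probabilities are non-negative, so the bounds of the expectation are the
    expectations of the lower and upper bounds respectively.
    """
    _check_distribution(x, "x")
    _check_distribution(y, "y")
    low = sum(x[i] * y[j] * w_low[i][j]
              for i in range(len(x)) for j in range(len(y)))
    high = sum(x[i] * y[j] * w_high[i][j]
               for i in range(len(x)) for j in range(len(y)))
    return low, high


def guaranteed_lower_payoff(x: Sequence[float], w_low: Matrix) -> float:
    """Worst-case lower-bound payment of x over all pure enemy strategies."""
    _check_distribution(x, "x")
    n_cols = len(w_low[0])
    return min(sum(x[i] * w_low[i][j] for i in range(len(x)))
               for j in range(n_cols))
```

A solver searching for the equilibrium would vary `x` to push a quantity such as `guaranteed_lower_payoff` above the participant's threshold; the exact constraints are those of the optimization problem stated in the text.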
Differential evolution (DE) is an intelligent optimization method based on population differences. DE uses the differences between individuals in the population to perturb individual evolution and searches the entire optimization space with a greedy rule to find an optimal solution. It updates the population through mutation, crossover and selection and then finds the best solution. DE is easy to operate, robust, and has strong optimization ability. However, when the DE algorithm is applied, it may converge slowly and fall into a local optimum, which makes it difficult to meet the requirements of real-time confrontation.
Aiming at the problems of the DE algorithm, the invention provides an improved differential evolution algorithm (IDE). The algorithm flow is shown in fig. 2:
(1) Fitness function
Generally, the best strategy for participant $N_1$ is to maximize its payoff under the constraints, while the other participant $N_2$ does the opposite. The objective function of the optimization problem above can therefore serve as the fitness function.
(2) Mutation
The scaling factor F is used to scale each basis vector and generate a new mutation vector. A larger F searches for a potentially best solution over a wider range; conversely, a smaller F increases convergence speed and improves accuracy. When the fitness of an individual is good, F should be small in order to reduce the disturbance of better individuals. Conversely, when the fitness of an individual is relatively poor, it is preferable to expand the search range of the solution, so a larger F can be applied. In combination with the game algorithm presented herein, the scaling factor F is determined from the evolution time and the difference between the best and the worst individual:
where $\Delta g = g/G$, $G$ is the maximum number of iterations and $g$ is the current number of iterations; $f_{best}$ is the best fitness, $f_{worst}$ is the worst fitness, and $f_i$ is the fitness of the current individual; $F_{max}$ and $F_{min}$ are the maximum and minimum values of $F$. If the fitness difference between the current individual and the best individual is large, the individual is far from the best individual in the search space. The larger the value of $F_i$, the larger the perturbation of the individual, which widens the search range of the algorithm and strengthens its global search capability. If the fitness difference is small, $F_i$ can take a smaller value and the perturbation of the individual is also smaller, so the search is confined to a small region near the individual, which strengthens the exploitation ability of the algorithm. Furthermore, in the later stages of evolution the value of $F_i$ should be relatively small, so that the search takes place in local regions near the current individual and the accuracy of the algorithm is ensured.
Using the DE/current-to-best strategy, the following mutation vector is obtained:
$$w_{i,g} = v_{i,g} + F_i\,(v_{best,g} - v_{i,g}) + F_i\,(v_{r1,g} - v_{r2,g})$$
where $w_{i,g}$ is the mutation vector; $F_i$ is the scaling factor of the current individual determined by the preceding equation; $v_{i,g}$ is the current individual vector and $v_{best,g}$ is the best individual of the population; r1 and r2 are two different integers with 0 < r1, r2 < NP, where NP is the population size.
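The mutation formula can be sketched directly in Python. In this sketch the adaptive scaling factor $F_i$ is passed in as an argument rather than computed, and excluding the current index i when drawing r1 and r2 is a common convention assumed here, not something stated in the text; the function name is hypothetical.

```python
import random
from typing import List, Sequence


def mutate_current_to_best(population: Sequence[Sequence[float]], i: int,
                           best_index: int, f_i: float,
                           rng: random.Random) -> List[float]:
    """Return w_i = v_i + F_i*(v_best - v_i) + F_i*(v_r1 - v_r2).

    r1 and r2 are two distinct random indices; they are also kept different
    from i, which is an assumption of this sketch.
    """
    candidates = [r for r in range(len(population)) if r != i]
    if len(candidates) < 2:
        raise ValueError("population must contain at least three individuals")
    r1, r2 = rng.sample(candidates, 2)
    v_i, v_best = population[i], population[best_index]
    v_r1, v_r2 = population[r1], population[r2]
    return [vi + f_i * (vb - vi) + f_i * (a - b)
            for vi, vb, a, b in zip(v_i, v_best, v_r1, v_r2)]
```

With $F_i = 0$ the mutation vector equals the current individual, which matches the description that a small $F_i$ keeps the search close to the individual.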
(3) Crossover
The crossover rate CR determines, for each component, the probability of crossover between the mutant and the original individual. Individuals with greater fitness values may be given a larger CR, which accelerates changes in individual structure. In the later stage of evolution it is better to use a smaller CR, to reduce the disturbance of the trial individual and ensure the convergence speed of the algorithm. The designed crossover rate is as follows:
where […] is the current average fitness; $CR_i$ is the current crossover rate, and $CR_{max}$ and $CR_{min}$ are the maximum and minimum values of CR. When the fitness of the target individual $v_{i,g}$ is less than the average fitness, the target individual is relatively superior; a smaller $CR_i$ should then be chosen so that the trial vector obtains more information from the target vector $v_{i,g}$. Otherwise, the trial vector $u_{i,g}$ obtains more information from the mutation vector $w_{i,g}$, which improves the diversity of the population. $\Delta g$ ensures that a larger $CR_i$ is obtained early in the evolution, increasing population diversity and speeding up convergence, while in later stages a smaller $CR_i$ helps to find the optimal solution.
The crossover operation can be expressed as:
where $u_{ij,g}$ is the j-th component of the trial vector $u_{i,g}$; rnbr is a random integer less than the integer D; rand[0,1] is a random number between 0 and 1.
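The symbols rnbr and rand[0,1] suggest the standard binomial crossover of differential evolution, which can be sketched as follows. That the invention uses exactly this standard form is an assumption, as is treating D as the vector dimension; the crossover rate is passed in as a plain argument and the function name is hypothetical.

```python
import random
from typing import List, Sequence


def binomial_crossover(target: Sequence[float], mutant: Sequence[float],
                       cr: float, rng: random.Random) -> List[float]:
    """Build a trial vector component by component.

    Component j is taken from the mutant when a fresh random number is below
    cr, or when j equals the randomly chosen index rnbr; otherwise it is
    copied from the target. The rnbr rule guarantees that at least one
    component comes from the mutant.
    """
    if len(target) != len(mutant):
        raise ValueError("target and mutant must have the same dimension")
    d = len(target)
    rnbr = rng.randrange(d)  # random integer in [0, D)
    return [mutant[j] if (rng.random() < cr or j == rnbr) else target[j]
            for j in range(d)]
```

With `cr = 0` exactly one component is inherited from the mutant, and with `cr = 1` the trial vector equals the mutant.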
(4) Selection
The selection operation compares the newly generated trial vector with the original target vector and keeps the one with the better fitness as a member of the next generation of the population. This is a "greedy" selection operation, which may be described as follows:
where $v_{i,g+1}$ is the individual of the next generation.
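A minimal sketch of the greedy selection step follows. Whether a smaller or a larger fitness value counts as better depends on how the fitness function is defined, so the direction is left as a parameter; the default of minimization and the function name are assumptions of this sketch.

```python
from typing import Callable, Sequence


def greedy_select(target: Sequence[float], trial: Sequence[float],
                  fitness: Callable[[Sequence[float]], float],
                  minimize: bool = True) -> Sequence[float]:
    """Return the vector that survives into generation g+1.

    The trial vector replaces the target only if its fitness is at least as
    good; ties favour the trial vector.
    """
    f_target, f_trial = fitness(target), fitness(trial)
    trial_is_better = f_trial <= f_target if minimize else f_trial >= f_target
    return trial if trial_is_better else target
```

Applied to every individual after mutation and crossover, this step ensures that the fitness of each population slot never gets worse from one generation to the next.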
The optimal solution is obtained with the improved differential evolution algorithm. The optimal parameter $x_i$ in the optimal solution is the probability of adopting the optimal strategy at the current moment, and the AUV executes the action corresponding to the optimal strategy.
Example: assume that sides "A" and "D" take part in a 2-versus-2 AUV underwater confrontation. The initial positions of "A1" and "A2" are (0 m, 100 m, 200 m) and (0 m, -100 m, 200 m); those of "D1" and "D2" are (800 m, 100 m, 200 m) and (800 m, -100 m, 200 m). The speed, yaw angle and pitch angle of "A1" are 23 m/s, -60° and 5°, and those of "A2" are 23 m/s, 60° and -5°; the speed, yaw angle and pitch angle of "D1" are 25 m/s, 120° and 3°, and those of "D2" are 25 m/s, -120° and -3°. Both sides have the same control capability, and the time interval of each confrontation step is 5 s. Clearly, "D" has the advantage at the outset. Note also that the maximum maneuver step should be determined according to the effectiveness of the AUVs used in the confrontation. To compare confrontation performance, "A" uses the collaborative dynamic maneuver decision algorithm proposed by the present invention, while "D" uses the max-min decision algorithm. The three-dimensional confrontation process, with 5 main stages, is shown in FIGS. 4-8. "+" marks the initial position and "4" marks the current position. The confrontation ends when the expected revenue of one side reaches an absolute advantage.
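The initial conditions of the example can be written down as data, from which the starting distances and velocity vectors follow. This is a sketch only: the axis convention used to convert speed, yaw and pitch into a velocity vector is an assumption, since the text does not define the coordinate frame, and all names are hypothetical.

```python
import math
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]

# Initial positions in metres, as given in the example.
INITIAL_POSITIONS: Dict[str, Vec3] = {
    "A1": (0.0, 100.0, 200.0), "A2": (0.0, -100.0, 200.0),
    "D1": (800.0, 100.0, 200.0), "D2": (800.0, -100.0, 200.0),
}

# (speed in m/s, yaw in degrees, pitch in degrees), as given in the example.
INITIAL_STATES: Dict[str, Vec3] = {
    "A1": (23.0, -60.0, 5.0), "A2": (23.0, 60.0, -5.0),
    "D1": (25.0, 120.0, 3.0), "D2": (25.0, -120.0, -3.0),
}


def velocity_vector(speed: float, yaw_deg: float, pitch_deg: float) -> Vec3:
    """Velocity components, assuming yaw is measured in the x-y plane from
    the x axis and pitch is the elevation above that plane."""
    yaw, pitch = math.radians(yaw_deg), math.radians(pitch_deg)
    return (speed * math.cos(pitch) * math.cos(yaw),
            speed * math.cos(pitch) * math.sin(yaw),
            speed * math.sin(pitch))


def initial_distance(a: str, b: str) -> float:
    """Euclidean distance in metres between two AUVs at the start."""
    return math.dist(INITIAL_POSITIONS[a], INITIAL_POSITIONS[b])
```

Under these data, each AUV starts 800 m from the opponent directly opposite it and about 825 m from the diagonal opponent.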
The calculation results of the invention are as follows:
As shown in FIG. 3, the confrontation process comprises 50 steps, and the figure plots the expected revenue along the way. In the final part, the expected revenue obtained indicates that the Nash equilibrium of the interval information game is satisfied.
As shown in FIG. 4, "A" dominates: "A1" attempts to attack "D2" while "A2" moves toward "D1".
As shown in FIG. 5, "A1" and "A2" attempt to attack "D1", while "D2" attempts to overtake "A2". The situation then changes, and "D" is dominant in stage 3. This can also be verified in FIG. 3, where the expected revenue changes from positive to negative.
As shown in FIG. 6, "D1" and "D2" still try to attack "A2", but "A2" continues to chase "D1", and "A1" turns back toward "D2".
As shown in FIG. 7, the situation changes again: "A" dominates and the expected revenue changes from negative to positive. "A2" continues to turn and successfully drives "D1" away; "A1" and "A2" then try to attack "D2", but "D1" and "D2" escape in two different directions.
As shown in FIG. 8, "A1" and "A2" finally both occupy dominant positions, so "A" gains an absolute advantage and the confrontation ends.
Claims (2)
1. A multi-AUV dynamic maneuver decision method based on an interval information game is characterized by comprising the following steps:
Step 1: obtaining the advantage function of the multi-AUV systems of the two opposing sides from the situation advantage and the energy efficiency advantage:
the situation advantage comprises an angle advantage $A_{ag}$, a speed advantage $A_s$ and a distance advantage $A_{dis}$;
where $v_{n1i}$ and $v_{n2j}$ are the velocity vectors of the two sides in the game; the subscripts n1 and n2 denote the two opposing sides, and i and j index the i-th and j-th AUV of the respective side;
where $D_{ij}$ is the distance between the AUVs; $R_0 = (R_{max} + R_{min})/2$; $R_{max}$ is the maximum starting distance and $R_{min}$ is the minimum starting distance;
the overall situation advantage function is $W_A = k_1 A_{ag} + k_2 A_s + k_3 A_{dis}$, where $k_1, k_2, k_3$ are weighting coefficients with $k_1 + k_2 + k_3 = 1$;
The energy efficiency advantage function is written as:
where $\delta_1, \delta_2$ are weighting coefficients satisfying $\delta_1 + \delta_2 = 1$; […] represents an advantage function having upper and lower bounds;
Step 2: obtaining a payment matrix of the multi-AUV system according to the interval information and payment interval grades combining the four parameter interval sets and the relative entropy:
the payment function of the multi-AUV game under uncertain information obtained according to the advantage function in the step 1 is established as follows:
where $x_{ij}$ and $y_{ji}$ are binary decision variables: $x_{ij} = 1$ means that our i-th AUV attacks the enemy's j-th AUV, and $x_{ij} = 0$ means that it does not; likewise, $y_{ji}$ indicates whether the enemy's j-th AUV attacks our i-th AUV;
the payment matrix under uncertain information is therefore:
the method combining four-parameter interval sets and relative entropy is applied so that $W_{mn}$ is a normalized advantage function, where $x_1, x_2, \dots, x_m$ are the maneuver strategies of our multi-AUV system, $y_1, y_2, \dots, y_n$ are the maneuver strategies of the enemy multi-AUV system, and […] represents the payment when our AUV system uses the m-th strategy and the enemy AUV system uses the n-th strategy;
Step 3: solving for the Nash equilibrium optimal solution and finding the optimal strategy:
the confrontation trajectory of the multiple AUVs is regarded as a combination of individual actions; a k-level dynamic game is used in the confrontation process, and each level of the game comprises 7 actions: maintaining the current course, accelerating, decelerating, turning left, turning right, climbing and diving;
considering the practical subsea environment constraints, solving the Nash equilibrium problem can be converted into an optimization problem with interval uncertainty parameters:
where $x_i$ is the probability that our AUV system adopts the i-th strategy, […] is the payoff threshold of the participant, and […] is the payment when our AUV system uses the i-th strategy and the enemy AUV system uses the j-th strategy;
the optimal parameter $x_i$ in the optimal solution is the probability of adopting the optimal strategy at the current moment, and the AUV executes the action corresponding to the optimal strategy.
2. The multi-AUV dynamic maneuver decision method based on the interval information game as claimed in claim 1, wherein in step 3 an improved differential evolution algorithm is adopted to solve for the Nash equilibrium optimal solution and find the optimal strategy; the improved differential evolution algorithm comprises the steps of mutation, crossover and selection; to select the best fitness, and in combination with the game algorithm, the scaling factor F used when generating a new mutation vector is determined according to the evolution time and the difference between the best and the worst individual, where $\Delta g = g/G$, $G$ is the maximum number of iterations and $g$ is the current number of iterations; $f_{best}$ is the best fitness, $f_{worst}$ is the worst fitness, and $f_i$ is the fitness of the current individual; $F_{max}$ and $F_{min}$ are the maximum and minimum values of $F$.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011150930.7A CN112306070A (en) | 2020-10-24 | 2020-10-24 | Multi-AUV dynamic maneuver decision method based on interval information game |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011150930.7A CN112306070A (en) | 2020-10-24 | 2020-10-24 | Multi-AUV dynamic maneuver decision method based on interval information game |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112306070A (en) | 2021-02-02 |
Family
ID=74327579
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011150930.7A Withdrawn CN112306070A (en) | 2020-10-24 | 2020-10-24 | Multi-AUV dynamic maneuver decision method based on interval information game |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112306070A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112651107A (en) * | 2021-02-23 | 2021-04-13 | 西安工业大学 | Game-resisting target damage strategy evaluation method |
CN113033107A (en) * | 2021-04-16 | 2021-06-25 | 西北工业大学 | Multi-AUV cluster game countermeasure model construction method based on central intelligence set theory |
CN114079882A (en) * | 2021-11-15 | 2022-02-22 | 广东工业大学 | Method and device for cooperative computing and path control of multiple unmanned aerial vehicles |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112651107A (en) * | 2021-02-23 | 2021-04-13 | 西安工业大学 | Game-resisting target damage strategy evaluation method |
CN112651107B (en) * | 2021-02-23 | 2023-06-20 | 西安工业大学 | Method for evaluating damage strategy of countergame target |
CN113033107A (en) * | 2021-04-16 | 2021-06-25 | 西北工业大学 | Multi-AUV cluster game countermeasure model construction method based on central intelligence set theory |
CN114079882A (en) * | 2021-11-15 | 2022-02-22 | 广东工业大学 | Method and device for cooperative computing and path control of multiple unmanned aerial vehicles |
CN114079882B (en) * | 2021-11-15 | 2024-04-05 | 广东工业大学 | Method and device for cooperative calculation and path control of multiple unmanned aerial vehicles |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112306070A (en) | Multi-AUV dynamic maneuver decision method based on interval information game | |
CN110928329B (en) | Multi-aircraft track planning method based on deep Q learning algorithm | |
CN112305913A (en) | Multi-UUV collaborative dynamic maneuver decision method based on intuitive fuzzy game | |
CN108318032A (en) | A kind of unmanned aerial vehicle flight path Intelligent planning method considering Attack Defence | |
CN111240353A (en) | Unmanned aerial vehicle collaborative air combat decision method based on genetic fuzzy tree | |
CN113221444B (en) | Behavior simulation training method for air intelligent game | |
CN113052289B (en) | Method for selecting cluster hitting position of unmanned ship based on game theory | |
CN110673488A (en) | Double DQN unmanned aerial vehicle concealed access method based on priority random sampling strategy | |
CN114638339A (en) | Intelligent agent task allocation method based on deep reinforcement learning | |
CN115525058B (en) | Unmanned submarine vehicle cluster cooperative countermeasure method based on deep reinforcement learning | |
CN114139023B (en) | Multi-target hierarchical grouping method for marine situation generation based on Louvain algorithm | |
CN116185059A (en) | Unmanned aerial vehicle air combat autonomous evasion maneuver decision-making method based on deep reinforcement learning | |
CN116225049A (en) | Multi-unmanned plane wolf-crowd collaborative combat attack and defense decision algorithm | |
Liu et al. | Multi‐UUV Cooperative Dynamic Maneuver Decision‐Making Algorithm Using Intuitionistic Fuzzy Game Theory | |
CN110163519B (en) | UUV red and blue threat assessment method for base attack and defense tasks | |
CN116680509A (en) | Dynamic matching method for multi-spacecraft escape-tracking game task | |
CN111624996A (en) | Multi-unmanned-boat incomplete information trapping method based on game theory | |
CN111773722B (en) | Method for generating maneuver strategy set for avoiding fighter plane in simulation environment | |
CN116661496B (en) | Multi-patrol-missile collaborative track planning method based on intelligent algorithm | |
CN117408376A (en) | Soldier chess operator position prediction method and system based on battlefield division and attraction map | |
CN117270528A (en) | Unmanned ship escape game control method and controller | |
CN116432030A (en) | Air combat multi-intention strategy autonomous generation method based on deep reinforcement learning | |
CN116225065A (en) | Unmanned plane collaborative pursuit method of multi-degree-of-freedom model for multi-agent reinforcement learning | |
CN113255234B (en) | Method for carrying out online target distribution on missile groups | |
CN113095465B (en) | Underwater unmanned cluster task allocation method for quantum salmon migration mechanism evolution game |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication | ||
Application publication date: 20210202 |