CN110196605A - Method for cooperatively searching multiple dynamic targets in an unknown sea area with a reinforcement-learning unmanned aerial vehicle cluster - Google Patents
- Publication number: CN110196605A (application CN201910346512.6A)
- Authority: CN (China)
- Prior art keywords: unmanned aerial vehicle, unmanned aerial vehicle group, grid, unmanned plane
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05D—SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
- G05D1/00—Control of position, course or altitude of land, water, air, or space vehicles, e.g. automatic pilot
- G05D1/0088—Control of position, course or altitude of land, water, air, or space vehicles, e.g. automatic pilot characterized by the autonomous decision making process, e.g. artificial intelligence, predefined behaviours
- G05D1/12—Target-seeking control
Abstract
The invention discloses a method for cooperatively searching multiple dynamic targets in an unknown sea area with a reinforcement-learning unmanned aerial vehicle (UAV) cluster, comprising the following steps. S1: divide the search area with a grid method and build a territory-awareness information map from the pheromone concentration each UAV deposits in its local area. S2: design a Q-value table from the UAV state information and the decision u(k). S3: select and execute each UAV's flight path with a Boltzmann-distribution mechanism based on the Q values of the cluster's current state. S4: design a reward-penalty function from the search-efficiency function to evaluate the UAV flight state, and use it to update the Q value of the new state the cluster reaches. S5: take the new state as the current state and keep making flight-path decisions until the whole Q-value table is learned; the cluster then makes decisions from the trained Q table and completes the search mission.
Description
Technical field
The present invention relates to the field of unmanned aerial vehicle (UAV) control, and more particularly to a method for cooperatively searching multiple dynamic targets in an unknown sea area with a reinforcement-learning UAV cluster.
Background art
With the rapid development of sensing, wireless communication, and intelligent control technologies, UAV swarm systems keep gaining capability and their fields of application keep expanding. Because of their scalability, strong cooperation, and low cost, UAV swarm systems draw growing application research from academia, industry, and defense. Cooperative multi-UAV search can effectively improve search efficiency, and it is especially advantageous for searching for dynamic targets under complex sea conditions with uncertainty and strong interference; cooperative multi-UAV sea-area search is therefore an important direction of UAV swarm research.
Traditional methods use coverage search, such as square-spiral sweeps and exhaustive traversal, which generally try to cover the mission area as completely as possible so as to find as many targets as they can. In recent years, search-map models built from target existence probability and solved with distributed model predictive control have effectively reduced the scale of the search-decision problem, but they are limited to static targets. For dynamic targets, Bayesian methods can compute the mean detection time and mean detection probability, but they only apply to a single maritime target and cannot meet the demands of multi-target search.
Summary of the invention
To address these problems of the prior art, the invention discloses a method for cooperatively searching multiple dynamic targets in an unknown sea area with a reinforcement-learning UAV cluster. The method first builds a multi-UAV sea-area search map from the environment, the UAV dynamics, the target dynamics, and the sensor detection model; it then updates and extends this search map with a territory-awareness information map. Finally, reinforcement learning with a reward-penalty function designed from the search-efficiency function generates the cooperative multi-UAV search paths online.
The method specifically includes the following steps:
S1: Divide the search area with a grid method. Build the multi-UAV sea-area search map from the sea environment, the UAV dynamics, the dynamics of moving ships at sea, and the sensor detection model; build a territory-awareness information map from the pheromone concentration each UAV deposits in its local area, and use it to extend the sea-area search map.
S2: Design a Q-value table from the UAV state information and the decision u(k).
S3: Select and execute each UAV's flight path with a Boltzmann-distribution mechanism based on the Q values of the cluster's current state. When the cluster reaches a new state, obtain the search-efficiency function as the weighted sum of the target-detection gain Jp, the environment-search gain Jχ, the execution cost C, and the collision cost I.
S4: Design the reward-penalty function that evaluates the UAV flight state from the search-efficiency function, and use it to update the Q value of the new state the cluster reaches.
S5: Take the new state as the current state and keep making flight-path decisions until the whole Q-value table is learned; the cluster then makes decisions from the trained Q table and completes the search mission.
S1 specifically proceeds as follows:
S11: Build the territory-awareness information map. When UAV Vi searches grid (m, n) it deposits pheromone Hi(mn)(k), which diffuses to the other grids of the search map according to a diffusion transfer function evaluated at each grid (a, b), where ρ and β are constants.
When Nv UAVs execute the search mission, Nv kinds of pheromone are continuously deposited and diffused. Taking grid (c, d) as an example, the pheromone concentration at the current time is the sum of what remains of the last moment's concentration after volatilization and the newly deposited pheromone that diffuses into the grid; the update equation uses a volatilization factor τH ∈ [0, 1].
When UAV Vi detects a high concentration of other UAVs' pheromone in grid (m, n), other UAVs are frequently active there; Vi's detected concentration of the others' pheromone is computed accordingly.
S12: Build the target-probability map. The target-probability update uses Pmn(k), the probability that a target is at (m, n) at time k, the sensor detection probability pD, the sensor false-alarm probability pF, and the target-probability dynamic-information factor τ ∈ [0, 1]. ΔPmn(k) is the probability change at grid (m, n) when that grid is not visited by a UAV but other grids are; in its expression, D(k) is the set of all grids visited at time k and Nv is the number of UAVs.
S13: Build the certainty map. The certainty update equation uses τc, the dynamic-information factor of the certainty, and a constant χ ∈ [0, 1].
S14: Let Hmn(k) be the total pheromone concentration at grid (m, n); the pheromone concentration is a function of grid position and time, and together these maps give the environment search map.
S2 specifically proceeds as follows: the size of the Q-value table is determined by the UAV state and the control input. There are Lx × Ly possible positions, z possible headings at each grid, and l selectable control inputs per UAV, so the designed Q table has Lx × Ly × z rows and l columns.
S3 specifically proceeds as follows:
S31: The collision cost I is defined from the territory awareness exhibited by UAV Vi, i.e., the concentration of other UAVs' pheromone that it detects; in its calculation, Hmn(k) is the total pheromone deposited at grid (m, n) by all UAVs.
S4 specifically proceeds as follows:
S41: Without considering no-fly zones, the reward-penalty function is designed so that a is a constant that influences the generalization ability of the learning process and a × Jk(s(k), u(k)) ∈ (−R, R); the maximum reward is R and the maximum penalty is −R. Here d is the actual distance between UAVs, J(s(k), u(k)) is the search-efficiency function, and D is the minimum safe distance; d ≥ D must hold to guarantee that every UAV flies safely.
S42: When there is a no-fly zone, let B be the distance from the UAV to the no-fly zone's center; B must exceed the no-fly-zone radius D*. The reward-penalty function is then further refined so that a UAV that collides or flies into a no-fly zone always receives the maximum penalty.
With the above technical solution, the method provided by the invention solves the primary safety problem of collision avoidance in multi-UAV cooperation, and uses the search-efficiency function to design a new reward-penalty function. With reinforcement learning it can plan multi-UAV search tracks online according to the achieved efficiency and update the search map with the search results, greatly improving search efficiency.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below cover only some embodiments recorded in this application; for those of ordinary skill in the art, other drawings can be obtained from them without creative effort.
Fig. 1 is airborne sensor detection model schematic diagram;
Fig. 2 is intensified learning search initial stage schematic diagram;
Fig. 3 is sea area information learning search process schematic diagram;
Fig. 4 is UAV search result schematic diagram;
Fig. 5 is random search track plot;
Fig. 6 is traversal search track plot;
Fig. 7 is the flow chart of this method.
Specific embodiment
To make the technical solution and advantages of the present invention clearer, the technical solution in the embodiments of the invention is described clearly and completely below with reference to the accompanying drawings.
As shown in Fig. 7 and Fig. 1, a method for cooperatively searching multiple dynamic targets in an unknown sea area with a reinforcement-learning UAV cluster specifically includes the following steps:
S1: Divide the search area into Lx × Ly grids with a grid method. Build the multi-UAV sea-area search map from the sea environment, the UAV dynamics, the dynamics of moving ships at sea, and the sensor detection model, where (m, n) is the grid coordinate and k is the time; the specific values are calculated as follows:
S11: Build the territory-awareness information map. When UAV Vi searches grid (m, n) it deposits pheromone Hi(mn)(k), which diffuses to the other grids of the search map according to a diffusion transfer function evaluated at each grid (a, b), where ρ and β are constants.
When Nv UAVs execute the search mission, Nv kinds of pheromone are continuously deposited and diffused. Taking grid (c, d) as an example, the pheromone concentration at the current time is the sum of what remains of the last moment's concentration after volatilization and the newly deposited pheromone that diffuses into the grid; the update equation uses a volatilization factor τH ∈ [0, 1].
When UAV Vi detects a high concentration of other UAVs' pheromone in grid (m, n), other UAVs are frequently active there; Vi's detected concentration of the others' pheromone is computed accordingly.
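The deposit-diffuse-volatilize cycle above can be sketched as follows. This is a minimal illustration: the patent's diffusion transfer function is not reproduced here, so a Gaussian-style kernel ρ·exp(−β·d²) is assumed purely as a stand-in, and the grid size is arbitrary.

```python
import numpy as np

def diffuse_pheromone(source, rho=1.0, beta=0.5, shape=(10, 10)):
    """Pheromone deposited at `source` spread over every grid cell.

    Assumed kernel rho * exp(-beta * d^2); the patent's actual transfer
    function with constants rho, beta may differ.
    """
    m, n = source
    ys, xs = np.indices(shape)
    d2 = (ys - m) ** 2 + (xs - n) ** 2
    return rho * np.exp(-beta * d2)

def update_pheromone(H, new_deposits, tau_H=0.1):
    """H(k+1) = residual after volatilization + newly diffused pheromone."""
    return (1.0 - tau_H) * H + sum(new_deposits)

H = np.zeros((10, 10))                      # empty search map
H = update_pheromone(H, [diffuse_pheromone((2, 3))], tau_H=0.1)
```

The concentration is highest at the visited cell and decays with distance, which is what lets another UAV read "frequent activity" off its neighbors' deposits.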
S12: Build the target-probability map. The target-probability update uses Pmn(k), the probability that a target is at (m, n) at time k, the sensor detection probability pD, the sensor false-alarm probability pF, and the target-probability dynamic-information factor τ ∈ [0, 1]. ΔPmn(k) is the probability change at grid (m, n) when that grid is not visited by a UAV but other grids are; in its expression, D(k) is the set of all grids visited at time k and Nv is the number of UAVs.
S13: Build the certainty map. The certainty update equation uses τc, the dynamic-information factor of the certainty, and a constant χ ∈ [0, 1].
S14: Under the inertial coordinate frame, the UAV motion model is established as follows: (xi, yi) ∈ R² is Vi's position in the search plane, ψi and vi are Vi's yaw angle and speed, ui ∈ [−1, 1] is the decision variable, and ηmax,i is Vi's peak turn rate. Constrained by UAV performance, the speed must satisfy vi ∈ [vmin, vmax].
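A minimal kinematic sketch of such a model follows. The patent's state equations are not reproduced here, so a Dubins-style unicycle is assumed, with the decision u scaling the peak turn rate; all numeric limits are illustrative.

```python
import math

def step(x, y, psi, v, u, eta_max=math.radians(30), dt=1.0,
         v_min=10.0, v_max=30.0):
    """One time step of an assumed Dubins-style UAV model.

    u in [-1, 1] scales the peak turn rate eta_max (rad/s);
    speed is clamped to the performance envelope [v_min, v_max].
    """
    u = max(-1.0, min(1.0, u))
    v = max(v_min, min(v_max, v))
    psi = psi + u * eta_max * dt          # yaw update from the decision
    x = x + v * math.cos(psi) * dt        # advance along the new heading
    y = y + v * math.sin(psi) * dt
    return x, y, psi

x, y, psi = step(0.0, 0.0, 0.0, v=20.0, u=0.0)   # u = 0: straight flight
```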
S15: The UAV carries a visible-light sensor installed at a fixed angle and flies horizontally at a fixed height. As shown in Fig. 1, under the relative coordinate frame the detection width is described by
du = 2hu · tan γu / sin αu
where hu is the UAV's flying height, αu the sensor installation angle, and γu the sensor's horizontal field-of-view angle.
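The swath-width formula above transcribes directly; the numbers in the example (100 m altitude, 30° field-of-view angle, 60° installation angle) are illustrative only.

```python
import math

def detection_width(h_u, gamma_u, alpha_u):
    """Swath width d_u = 2 * h_u * tan(gamma_u) / sin(alpha_u).

    h_u: flying height; gamma_u: horizontal field-of-view angle (rad);
    alpha_u: sensor installation angle (rad).
    """
    return 2.0 * h_u * math.tan(gamma_u) / math.sin(alpha_u)

w = detection_width(100.0, math.radians(30), math.radians(60))
```

At these illustrative angles the swath is 400/3 ≈ 133.3 m, so the grid edge length would normally be chosen no larger than this so that one pass covers a cell.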
S2 specifically proceeds as follows: the UAV has z possible headings at each grid and each UAV has l selectable control inputs, so the designed Q table has Lx × Ly × z rows and l columns. Every entry of the Q-value table is initialized to 0.
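The sizing rule above can be sketched as follows; the grid dimensions, heading count, and action count are illustrative assumptions, not values from the patent.

```python
import numpy as np

# Assumed discretization: Lx*Ly grid cells, z discrete headings, l inputs.
Lx, Ly, z, l = 20, 20, 8, 3

# One row per (cell, heading) state, one column per control input,
# initialized to zero as the text specifies.
Q = np.zeros((Lx * Ly * z, l))
```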
S3 specifically proceeds as follows:
S31: Based on the Q values of the cluster's current state, select a decision probabilistically with a Boltzmann-distribution mechanism. In state s(k), the probability that strategy u(k) is selected is given by a Boltzmann distribution over u ∈ A, where u is one of the executable strategies in the decision set A. The size of T determines the ability to explore unknown state space: the larger T is, the stronger the exploration of new decisions (as T → ∞, P(u(k)) = 1/m over the m strategies, i.e., fully random decisions). T is defined as
T = T0 · n^(−1/λ)
where λ > 1 and T0 > 0.
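Boltzmann (softmax) selection of this kind can be sketched as follows; the max-subtraction for numerical stability is an implementation detail added here, not part of the patent.

```python
import math, random

def boltzmann_select(q_values, T, rng=random.random):
    """Pick an action index with probability proportional to exp(Q/T)."""
    m = max(q_values)
    # Subtracting the max leaves the distribution unchanged but avoids overflow.
    weights = [math.exp((q - m) / T) for q in q_values]
    total = sum(weights)
    r = rng() * total
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if r <= acc:
            return i
    return len(q_values) - 1
```

As T grows, all weights approach 1 and selection becomes uniform (P = 1/m, matching the text); as T shrinks, selection becomes greedy on the Q values.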
S32: After executing the decision the cluster reaches a new state and the search-efficiency function is generated. It is the weighted sum of the target-detection gain Jp, the environment-search gain Jχ, the execution cost C, and the collision cost I:
J(s(k), u(k)) = w1·Jp(k) + w2·Jχ(k) − w3·C(k) − w4·I(k)
where 0 ≤ wi ≤ 1 (i = 1, 2, 3, 4) are weights. Note that the gains and costs above have different dimensions, so each must be normalized before summation.
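The weighted sum above is a one-liner; the weight values in this sketch are illustrative, and the inputs are assumed to be already normalized to comparable scales, as the text requires.

```python
def efficiency(Jp, Jchi, C, I, w=(0.4, 0.3, 0.2, 0.1)):
    """J = w1*Jp + w2*Jchi - w3*C - w4*I (inputs pre-normalized)."""
    w1, w2, w3, w4 = w
    return w1 * Jp + w2 * Jchi - w3 * C - w4 * I

J = efficiency(Jp=0.8, Jchi=0.5, C=0.2, I=0.0)
```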
S33: The target-detection gain Jp is computed from the target probability. The target-probability update uses the sensor detection probability pD, the false-alarm probability pF, and the dynamic-information factor τ ∈ [0, 1]. ΔPmn(k) is the probability change at grid (m, n) when that grid is not visited by a UAV but other grids are; in its expression, D(k) is the set of all grids visited at time k. If a UAV platform visits (m, n), the update of Pmn(k) depends on the sensor detection variable bk: bk = 1 means the airborne sensor detects a target, bk = 0 means it does not. The target-detection gain then follows.
S34: Compute the environment-search gain Jχ. As the UAVs search and their sensors observe, the search region becomes gradually better known and the information entropy of the corresponding search map decreases, so the environment-search gain is defined as the reduction of information entropy:
Jχ(k) = H(k) − H(k+1)
where H(k) is the information entropy at time k, which describes the degree of uncertainty of the current environment.
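The entropy reduction can be sketched as follows. The patent's entropy expression is not reproduced here, so each cell is assumed to be an independent Bernoulli variable with the map's target probability; the numbers are illustrative.

```python
import math

def map_entropy(P):
    """Shannon entropy of a target-probability map (assumed form:
    independent Bernoulli cells, natural log)."""
    h = 0.0
    for p in P:
        if 0.0 < p < 1.0:
            h -= p * math.log(p) + (1.0 - p) * math.log(1.0 - p)
    return h

before = [0.5, 0.5]          # two fully uncertain cells
after = [0.9, 0.5]           # one cell resolved toward "target present"
J_chi = map_entropy(before) - map_entropy(after)   # positive gain
```

Observing a cell moves its probability away from 0.5, lowering the map entropy, so Jχ rewards actions that actually reduce uncertainty.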
S35: Compute the execution cost C. The execution cost is the time and fuel the UAV consumes flying to the target point, and can be estimated with formula (14).
S36: The collision cost I is defined from the territory awareness exhibited by UAV Vi, i.e., the concentration of other UAVs' pheromone that it detects. When UAV Vi searches grid (m, n) it deposits pheromone Hi(mn)(k), which diffuses to the other grids of the search map according to a diffusion transfer function at each grid (a, b), where ρ and β are constants. When Nv UAVs execute the search mission, Nv kinds of pheromone are continuously deposited and diffused; taking grid (c, d) as an example, the pheromone concentration at the current time is the sum of what remains of the last moment's concentration after volatilization and the newly deposited pheromone that diffuses into the grid, with volatilization factor τH ∈ [0, 1].
S4 specifically proceeds as follows:
S41: A higher overall efficiency J(s(k), u(k)) is desired, so after a UAV executes a search action it is immediately rewarded if the efficiency achieved is high and immediately punished if it is low. The reward-penalty function r(k) is designed so that a is a constant that influences the generalization ability of the learning process and a × Jk(s(k), u(k)) ∈ (−R, R); the maximum reward is R and the maximum penalty is −R. J(s(k), u(k)) is determined by formula (16). d is the actual distance between UAVs and D the minimum safe distance; to guarantee that every UAV flies safely, d ≥ D must hold.
S42: Considering no-fly zones, let B be the distance from the UAV to a no-fly zone's center; B should exceed the no-fly-zone radius D*. The reward-penalty function is then further refined so that a UAV that collides or flies into a no-fly zone always receives the maximum penalty. As shown in Fig. 2, searching a shallow-water zone or a no-fly zone is punished.
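A sketch of such a reward-penalty function follows. The patent's branch expressions are not reproduced here, so the safe-flight branch a·J (clipped into (−R, R)) is an assumption; only the constraints stated in the text (d ≥ D, B > D*, maximum penalty −R on violation) are taken as given.

```python
def reward(J, d, D, B=None, D_star=None, a=1.0, R=10.0):
    """Reward-penalty sketch: scaled efficiency when flight is safe,
    maximum penalty -R on a collision risk (d < D) or a no-fly-zone
    entry (B <= D_star). The a*J safe branch is an assumed form."""
    if d < D:
        return -R                        # UAVs closer than the safe distance
    if B is not None and D_star is not None and B <= D_star:
        return -R                        # inside a no-fly zone
    return max(-R, min(R, a * J))        # keep a*J inside (-R, R)

r = reward(J=0.43, d=50.0, D=20.0)
```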
S43: The Q-value function update rule is the standard Q-learning update, where si(k) is Vi's current state; ui(k) is the currently selected decision, i.e., a change of the UAV's yaw angle; r(k) is the immediate reward or penalty the aircraft obtains for executing strategy ui(k) in state si(k) and reaching state si(k+1); the bootstrap term is the maximum Q value obtainable over strategies u in state si(k+1); α ∈ [0, 1] is the learning rate and γ is the discount factor.
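The quantities named above (learning rate α, discount γ, immediate reward, greedy bootstrap over the next state) define the textbook Q-learning update, sketched here with a toy two-state table:

```python
def q_update(Q, s, u, r, s_next, alpha=0.1, gamma=0.9):
    """Q(s,u) <- Q(s,u) + alpha * (r + gamma * max_u' Q(s',u') - Q(s,u))."""
    Q[s][u] += alpha * (r + gamma * max(Q[s_next]) - Q[s][u])

# Toy table: state -> list of Q values, one per action.
Q = {0: [0.0, 0.0], 1: [1.0, 0.0]}
q_update(Q, s=0, u=0, r=1.0, s_next=1)   # Q[0][0] moves toward r + gamma*1
```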
S5 specifically proceeds as follows: the new state the cluster reaches is taken as the current state, and decisions are made continually until the learning of the entire Q-value table is complete. After the Q table finally converges, the sea-area information recorded is as shown in Fig. 3. The cluster then makes decisions from the trained Q table and completes the search mission; Fig. 4 shows the sea-area information the UAVs have grasped after the mission, including all shallow-water zones, the no-fly zone, and the sea area covered by 9 ships cruising along straight lines.
Under these experimental conditions, the search performance of the Q-learning algorithm is compared with random search and traversal search over 500 Monte Carlo runs; the simulated random-search and traversal-search tracks are shown in Figs. 5 and 6. Of the three methods, reinforcement-learning search has the highest efficiency, finding on average about one more target per time step than random search; traversal search may eventually find all targets, but its efficiency is extremely low. The simulation experiments demonstrate the validity of the algorithm, and comparative analysis verifies that it realizes cooperative multi-UAV dynamic-target search more effectively than the original search methods.
The foregoing is only a preferred embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any equivalent substitution or change that a person skilled in the art can make within the technical scope disclosed by the present invention, according to its technical solution and inventive concept, shall be covered by the protection scope of the present invention.
Claims (5)
1. A method for cooperatively searching multiple dynamic targets in an unknown sea area with a reinforcement-learning unmanned aerial vehicle cluster, characterized by comprising the following steps:
S1: dividing the search area with a grid method: building a multi-UAV sea-area search map from the sea environment, the UAV dynamics, the dynamics of moving ships at sea, and the sensor detection model; building a territory-awareness information map from the pheromone concentration each UAV deposits in its local area, and extending the sea-area search map with it;
S2: designing a Q-value table from the UAV state information and the decision u(k);
S3: selecting and executing each UAV's flight path with a Boltzmann-distribution mechanism based on the Q values of the cluster's current state; when the cluster reaches a new state, obtaining the search-efficiency function as the weighted sum of the target-detection gain Jp, the environment-search gain Jχ, the execution cost C, and the collision cost I;
S4: designing the reward-penalty function that evaluates the UAV flight state from the search-efficiency function, and updating the Q value of the new state the cluster reaches with it;
S5: taking the new state as the current state and continually making flight-path decisions until the whole Q-value table is learned; the cluster makes decisions from the trained Q table and completes the search mission.
2. The method for cooperatively searching multiple dynamic targets in an unknown sea area with a reinforcement-learning unmanned aerial vehicle cluster according to claim 1, further characterized in that S1 specifically proceeds as follows:
S11: building the territory-awareness information map: when UAV Vi searches grid (m, n) it deposits pheromone Hi(mn)(k), which diffuses to the other grids of the search map according to a diffusion transfer function at each grid (a, b), where ρ and β are constants;
when Nv UAVs execute the search mission, Nv kinds of pheromone are deposited and diffused; at grid (c, d), the current pheromone concentration is the sum of what remains of the last moment's concentration after volatilization and the newly deposited pheromone that diffuses into the grid, the update equation using a volatilization factor τH ∈ [0, 1];
when UAV Vi detects a high concentration of other UAVs' pheromone in grid (m, n), other UAVs are frequently active at (m, n), and Vi's detected concentration of the others' pheromone is computed accordingly;
S12: building the target-probability map: the target-probability update uses Pmn(k), the probability that a target is at (m, n) at time k, the sensor detection probability pD, the sensor false-alarm probability pF, and the dynamic-information factor τ ∈ [0, 1]; ΔPmn(k) is the probability change at grid (m, n) when that grid is not visited by a UAV but other grids are, where D(k) is the set of all grids visited at time k and Nv is the number of UAVs;
S13: building the certainty map: the certainty update equation uses τc, the dynamic-information factor of the certainty, and a constant χ ∈ [0, 1];
S14: letting Hmn(k) be the total pheromone concentration at grid (m, n), the pheromone concentration being a function of grid position and time, to obtain the environment search map.
3. The method for cooperatively searching multiple dynamic targets in an unknown sea area with a reinforcement-learning unmanned aerial vehicle cluster according to claim 1, further characterized in that S2 specifically proceeds as follows: the size of the Q-value table is determined by the UAV state and the control input, with Lx × Ly possible positions, z possible headings at each grid, and l selectable control inputs per UAV, so the designed Q table has Lx × Ly × z rows and l columns.
4. The method for cooperatively searching multiple dynamic targets in an unknown sea area with a reinforcement-learning unmanned aerial vehicle cluster according to claim 1, further characterized in that S3 specifically proceeds as follows:
S31: the collision cost I is defined from the territory awareness exhibited by UAV Vi, i.e., the concentration of other UAVs' pheromone that it detects, in whose calculation Hmn(k) is the total pheromone deposited at grid (m, n) by all UAVs.
5. The method for cooperatively searching multiple dynamic targets in an unknown sea area with a reinforcement-learning unmanned aerial vehicle cluster according to claim 1, further characterized in that S4 specifically proceeds as follows:
S41: without considering no-fly zones, the reward-penalty function is designed so that a is a constant influencing the generalization ability of the learning process and a × Jk(s(k), u(k)) ∈ (−R, R); the maximum reward is R and the maximum penalty is −R; d is the actual distance between UAVs, J(s(k), u(k)) is the search-efficiency function, and D is the minimum safe distance; to guarantee that every UAV flies safely, d ≥ D must hold;
S42: when there is a no-fly zone, letting B be the distance from the UAV to the no-fly zone's center, B must exceed the no-fly-zone radius D*; the reward-penalty function is then further refined so that a UAV that collides or flies into a no-fly zone always receives the maximum penalty.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910346512.6A CN110196605B (en) | 2019-04-26 | 2019-04-26 | Method for cooperatively searching multiple dynamic targets in unknown sea area by reinforcement learning unmanned aerial vehicle cluster |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110196605A true CN110196605A (en) | 2019-09-03 |
CN110196605B CN110196605B (en) | 2022-03-22 |
Family
ID=67752255
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910346512.6A Active CN110196605B (en) | 2019-04-26 | 2019-04-26 | Method for cooperatively searching multiple dynamic targets in unknown sea area by reinforcement learning unmanned aerial vehicle cluster |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110196605B (en) |
Cited By (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110806591A (en) * | 2019-10-11 | 2020-02-18 | 广东工业大学 | Unmanned aerial vehicle coverage search method and search device based on coherent theory |
CN111045445A (en) * | 2019-10-23 | 2020-04-21 | 浩亚信息科技有限公司 | Aircraft intelligent collision avoidance method, equipment and medium based on reinforcement learning |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107024220A (en) * | 2017-04-14 | 2017-08-08 | 淮安信息职业技术学院 | Robot path planning method based on a reinforcement learning cockroach algorithm |
CN107729953A (en) * | 2017-09-18 | 2018-02-23 | 清华大学 | Robot plume tracing method based on continuous state-action space reinforcement learning |
CN108319286A (en) * | 2018-03-12 | 2018-07-24 | 西北工业大学 | UAV air combat maneuvering decision method based on reinforcement learning |
CN108594834A (en) * | 2018-03-23 | 2018-09-28 | 哈尔滨工程大学 | Adaptive target search and obstacle avoidance method for multiple AUVs in unknown environments |
CN108762281A (en) * | 2018-06-08 | 2018-11-06 | 哈尔滨工程大学 | Real-time underwater intelligent robot decision-making method based on memory-association embedded reinforcement learning |
CN109241552A (en) * | 2018-07-12 | 2019-01-18 | 哈尔滨工程大学 | Underwater robot motion planning method based on multi-constraint objectives |
WO2019047646A1 (en) * | 2017-09-05 | 2019-03-14 | 百度在线网络技术(北京)有限公司 | Obstacle avoidance method and device for vehicle |
2019-04-26: application CN201910346512.6A filed in CN; granted as patent CN110196605B (status: Active)
Non-Patent Citations (2)
Title |
---|
ZHANG Jingjing et al.: "A UAV target search algorithm based on reinforcement learning", 《计算机应用研究》 (Application Research of Computers) *
HAO Chuanchuan et al.: "Q-learning-based three-dimensional flight path planning algorithm for UAVs", 《上海交通大学学报》 (Journal of Shanghai Jiaotong University) *
Cited By (31)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110806591A (en) * | 2019-10-11 | 2020-02-18 | 广东工业大学 | Unmanned aerial vehicle coverage search method and search device based on coherent theory |
CN110806591B (en) * | 2019-10-11 | 2022-02-11 | 广东工业大学 | Unmanned aerial vehicle coverage search method and search device based on coherent theory |
CN111045445A (en) * | 2019-10-23 | 2020-04-21 | 浩亚信息科技有限公司 | Aircraft intelligent collision avoidance method, equipment and medium based on reinforcement learning |
CN111045445B (en) * | 2019-10-23 | 2023-11-28 | 浩亚信息科技有限公司 | Intelligent collision avoidance method, equipment and medium for aircraft based on reinforcement learning |
US20220189312A1 (en) * | 2019-10-30 | 2022-06-16 | Wuhan University Of Technology | Intelligent collision avoidance method for a swarm of unmanned surface vehicles based on deep reinforcement learning |
WO2021082864A1 (en) * | 2019-10-30 | 2021-05-06 | 武汉理工大学 | Deep reinforcement learning-based intelligent collision-avoidance method for swarm of unmanned surface vehicles |
CN111538059A (en) * | 2020-05-11 | 2020-08-14 | 东华大学 | Self-adaptive rapid dynamic positioning system and method based on improved Boltzmann machine |
CN111667513A (en) * | 2020-06-01 | 2020-09-15 | 西北工业大学 | Unmanned aerial vehicle maneuvering target tracking method based on DDPG transfer learning |
CN111667513B (en) * | 2020-06-01 | 2022-02-18 | 西北工业大学 | Unmanned aerial vehicle maneuvering target tracking method based on DDPG transfer learning |
CN111708355A (en) * | 2020-06-19 | 2020-09-25 | 中国人民解放军国防科技大学 | Multi-unmanned aerial vehicle action decision method and device based on reinforcement learning |
CN111708355B (en) * | 2020-06-19 | 2023-04-18 | 中国人民解放军国防科技大学 | Multi-unmanned aerial vehicle action decision method and device based on reinforcement learning |
CN111680934B (en) * | 2020-06-30 | 2023-04-07 | 西安电子科技大学 | Unmanned aerial vehicle task allocation method based on group entropy and Q learning |
CN111680934A (en) * | 2020-06-30 | 2020-09-18 | 西安电子科技大学 | Unmanned aerial vehicle task allocation method based on group entropy and Q learning |
CN111880567A (en) * | 2020-07-31 | 2020-11-03 | 中国人民解放军国防科技大学 | Fixed-wing unmanned aerial vehicle formation coordination control method and device based on deep reinforcement learning |
CN111880567B (en) * | 2020-07-31 | 2022-09-16 | 中国人民解放军国防科技大学 | Fixed-wing unmanned aerial vehicle formation coordination control method and device based on deep reinforcement learning |
CN112327918A (en) * | 2020-11-12 | 2021-02-05 | 大连海事大学 | Multi-swarm sea area environment self-adaptive search algorithm based on elite learning |
CN112327918B (en) * | 2020-11-12 | 2023-06-02 | 大连海事大学 | Multi-swarm sea area environment self-adaptive search algorithm based on elite learning |
CN112947575A (en) * | 2021-03-17 | 2021-06-11 | 中国人民解放军国防科技大学 | Unmanned aerial vehicle cluster multi-target searching method and system based on deep reinforcement learning |
CN113342030A (en) * | 2021-04-27 | 2021-09-03 | 湖南科技大学 | Multi-unmanned aerial vehicle cooperative self-organizing control method and system based on reinforcement learning |
CN113505431A (en) * | 2021-06-07 | 2021-10-15 | 中国人民解放军国防科技大学 | ST-DQN-based target searching method, device, equipment and medium for marine unmanned aerial vehicle |
CN113382060A (en) * | 2021-06-07 | 2021-09-10 | 北京理工大学 | Unmanned aerial vehicle track optimization method and system in Internet of things data collection |
CN113505431B (en) * | 2021-06-07 | 2022-05-06 | 中国人民解放军国防科技大学 | Method, device, equipment and medium for searching targets of maritime unmanned aerial vehicle based on ST-DQN |
CN113382060B (en) * | 2021-06-07 | 2022-03-22 | 北京理工大学 | Unmanned aerial vehicle track optimization method and system in Internet of things data collection |
CN113985913A (en) * | 2021-09-24 | 2022-01-28 | 大连海事大学 | Integrated and separated type multi-unmanned aerial vehicle rescue system based on urban fire spread prediction |
CN113985913B (en) * | 2021-09-24 | 2024-04-12 | 大连海事大学 | Integrated and separated type multi-unmanned aerial vehicle rescue system based on urban fire spread prediction |
CN113671996A (en) * | 2021-10-22 | 2021-11-19 | 中国电子科技集团公司信息科学研究院 | Heterogeneous unmanned aerial vehicle reconnaissance method and system based on pheromone |
CN114200964A (en) * | 2022-02-17 | 2022-03-18 | 南京信息工程大学 | Unmanned aerial vehicle cluster cooperative reconnaissance coverage distributed autonomous optimization method |
CN114446121A (en) * | 2022-02-24 | 2022-05-06 | 汕头市快畅机器人科技有限公司 | Control method of life search cluster education robot |
CN114446121B (en) * | 2022-02-24 | 2024-03-05 | 汕头市快畅机器人科技有限公司 | Control method of life search cluster education robot |
CN115328143A (en) * | 2022-08-26 | 2022-11-11 | 齐齐哈尔大学 | Master-slave water surface robot recovery guiding method based on environment driving |
CN115328143B (en) * | 2022-08-26 | 2023-04-18 | 齐齐哈尔大学 | Master-slave water surface robot recovery guiding method based on environment driving |
Also Published As
Publication number | Publication date |
---|---|
CN110196605B (en) | 2022-03-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110196605A (en) | Reinforcement learning method for cooperative search of multiple dynamic targets by a UAV swarm in an unknown sea area | |
Jensen et al. | Algorithms at war: the promise, peril, and limits of artificial intelligence | |
CN108801266B (en) | Flight path planning method for searching uncertain environment by multiple unmanned aerial vehicles | |
Yu et al. | A knee-guided differential evolution algorithm for unmanned aerial vehicle path planning in disaster management | |
CN109254588A (en) | UAV cluster coordinated reconnaissance method based on crossover-and-mutation pigeon swarm optimization | |
CN106705970A (en) | Multi-UAV (Unmanned Aerial Vehicle) cooperative path planning method based on ant colony algorithm | |
CN105892480A (en) | Self-organizing method for cooperative scouting and hitting task of heterogeneous multi-unmanned-aerial-vehicle system | |
US9030347B2 (en) | Preemptive signature control for vehicle survivability planning | |
US9240001B2 (en) | Systems and methods for vehicle survivability planning | |
CN108318032A (en) | Intelligent UAV flight path planning method considering attack and defense | |
US6718261B2 (en) | Architecture for real-time maintenance of distributed mission plans | |
Johansson | Evaluating the performance of TEWA systems | |
US8831793B2 (en) | Evaluation tool for vehicle survivability planning | |
Leboucher et al. | A two-step optimisation method for dynamic weapon target assignment problem | |
CN114020031B (en) | Unmanned aerial vehicle cluster collaborative dynamic target searching method based on improved pigeon colony optimization | |
CN109885082B (en) | Unmanned aerial vehicle track planning method based on task driving | |
CN112486200B (en) | Multi-unmanned aerial vehicle cooperative confrontation online re-decision method | |
Chen et al. | Cooperative area reconnaissance for multi-UAV in dynamic environment | |
CN116679751A (en) | Multi-aircraft collaborative search method considering flight constraint | |
Su et al. | An improved adaptive differential evolution algorithm for single unmanned aerial vehicle multitasking | |
Zhang et al. | Design of the fruit fly optimization algorithm based path planner for UAV in 3D environments | |
He et al. | Learning-based airborne sensor task assignment in unknown dynamic environments | |
Zheng et al. | Coevolving and cooperating path planner for multiple unmanned air vehicles | |
Yue et al. | Reinforcement learning based approach for multi-UAV cooperative searching in unknown environments | |
Lv et al. | Maritime static target search based on particle swarm algorithm |
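Many of the documents above, like the present patent's abstract (steps S2-S5), build on tabular Q-learning with Boltzmann-distribution action selection and a reward-penalty update of the Q table. The following is a hedged sketch only: the function names, parameters, and toy Q table are invented for illustration and do not reproduce the patent's actual implementation.

```python
import math
import random

def boltzmann_select(q_row, temperature=1.0):
    """Sample an action index with probability proportional to exp(Q / T).

    High temperature gives near-uniform exploration; low temperature
    approaches greedy selection of the highest-Q action.
    """
    weights = [math.exp(q / temperature) for q in q_row]
    total = sum(weights)
    r = random.uniform(0.0, total)
    acc = 0.0
    for action, w in enumerate(weights):
        acc += w
        if r <= acc:
            return action
    return len(weights) - 1  # numerical fallback

def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).
    """
    target = reward + gamma * max(q_table[next_state])
    q_table[state][action] += alpha * (target - q_table[state][action])
```

In a grid-decomposed search area (step S1 of the abstract), `state` would encode a UAV's grid cell and `reward` would come from a search-efficiency reward-penalty function as in step S4; both encodings are left abstract here.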
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||