CN110196605A - Method for cooperatively searching multiple dynamic targets in an unknown sea area by a reinforcement-learning unmanned aerial vehicle group - Google Patents

Method for cooperatively searching multiple dynamic targets in an unknown sea area by a reinforcement-learning unmanned aerial vehicle group

Info

Publication number
CN110196605A
CN110196605A (application CN201910346512.6A)
Authority
CN
China
Prior art keywords
unmanned aerial vehicle
grid
unmanned aerial vehicle group
unmanned plane
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910346512.6A
Other languages
Chinese (zh)
Other versions
CN110196605B (en)
Inventor
岳伟
关显赫
刘中常
王丽媛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian Maritime University
Original Assignee
Dalian Maritime University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian Maritime University
Priority to CN201910346512.6A priority Critical patent/CN110196605B/en
Publication of CN110196605A publication Critical patent/CN110196605A/en
Application granted granted Critical
Publication of CN110196605B publication Critical patent/CN110196605B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05DSYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00Control of position, course or altitude of land, water, air, or space vehicles, e.g. automatic pilot
    • G05D1/0088Control of position, course or altitude of land, water, air, or space vehicles, e.g. automatic pilot characterized by the autonomous decision making process, e.g. artificial intelligence, predefined behaviours
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05DSYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00Control of position, course or altitude of land, water, air, or space vehicles, e.g. automatic pilot
    • G05D1/12Target-seeking control

Abstract

The invention discloses a method for cooperatively searching multiple dynamic targets in an unknown sea area by a reinforcement-learning unmanned aerial vehicle (UAV) group, comprising the following steps. S1: divide the search area with the grid method and establish a territory-awareness information map from the pheromone concentration each UAV produces within its area. S2: design a Q-value table from the UAV state information and the decision u(k). S3: select and execute each UAV's flight path with a Boltzmann distribution mechanism according to the Q values of the group's current state. S4: design a reward-penalty function for evaluating the UAV flight state from the search efficiency function, and use it to update the Q value of the new state the group reaches. S5: set the reached new state as the current state and keep making flight-path decisions until the whole Q-value table is learned; the UAV group then makes decisions according to the trained Q table and completes the search mission.

Description

Method for cooperatively searching multiple dynamic targets in an unknown sea area by a reinforcement-learning unmanned aerial vehicle group
Technical field
The present invention relates to the technical field of unmanned aerial vehicle (UAV) control, and in particular to a method for cooperatively searching multiple dynamic targets in an unknown sea area by a reinforcement-learning UAV group.
Background technique
With the rapid development of technologies such as sensing, wireless communication and intelligent control, unmanned swarm systems have become increasingly capable and their fields of application keep expanding. Because of their scalability, strong cooperativity and low cost, unmanned swarm systems receive growing attention and application research from academia, industry and national defense. Multi-UAV cooperative search systems can effectively improve search efficiency and hold great advantages, especially for searching dynamic targets under complex sea conditions with uncertainty and strong interference; cooperative multi-UAV sea-area search is therefore one of the important directions of unmanned swarm system research.
Traditional methods use coverage search, such as square-spiral ("回"-shaped) search and traversal search, which generally maximize coverage of the mission area so as to find as many targets as possible. In recent years, search-graph models combining target existence probability have been built and solved with distributed model predictive control, effectively reducing the solution scale of the search decision problem, but they are limited to static targets. For dynamic targets, Bayesian methods have been used to compute average detection time and average detection probability, but they apply only to a single maritime target and cannot satisfy the demands of multi-target search.
Summary of the invention
In view of the problems in the prior art, the invention discloses a method for cooperatively searching multiple dynamic targets in an unknown sea area by a reinforcement-learning UAV group. The method first considers the environment, the UAV dynamics, the target dynamics and the sensor detection model to establish a multi-UAV sea-area search graph; it then updates and extends the original search graph with the concept of a territory-awareness information map; finally it applies a reinforcement learning method, designing a reward-penalty function from the search efficiency function, to generate multi-UAV cooperative search paths online.
The method specifically comprises the following steps:
S1: divide the search area with the grid method; establish a multi-UAV sea-area search graph based on the sea environment, the UAV dynamics, the dynamics of ships moving at sea and the sensor detection model; establish a territory-awareness information map from the pheromone concentration each UAV produces within its area, and use it to extend the multi-UAV sea-area search graph;
S2: design a Q-value table from the UAV state information and the decision u(k);
S3: select and execute each UAV's flight path with a Boltzmann distribution mechanism according to the Q values of the group's current state; when the group reaches a new state, obtain the search efficiency function as the weighted sum of the target detection gain Jp, the environment search gain Jχ, the execution cost C and the collision cost I;
S4: design a reward-penalty function for evaluating the UAV flight state from the search efficiency function, and use it to update the Q value of the new state the group reaches;
S5: set the reached new state as the current state and keep making flight-path decisions until the whole Q-value table is learned; the UAV group then makes decisions according to the trained Q table and completes the search mission.
S1 specifically proceeds as follows:
S11: establish the territory-awareness information map. When UAV Vi searches grid cell (m, n) it generates pheromone Hi(mn)(k), which diffuses to the other cells of the search graph; at cell (a, b) the diffusion transfer function is:
wherein ρ and β are constants;
When Nv UAVs execute the search mission, Nv kinds of pheromone are continuously generated and diffused. Taking cell (c, d) as an example, the current pheromone concentration is the sum of the concentration left after evaporation from the last moment and the newly generated pheromone diffused to the cell; the update equation is:
wherein τH ∈ [0, 1] is the evaporation factor;
When UAV Vi detects a high concentration of other UAVs' pheromone in cell (m, n), this indicates that other UAVs are frequently active at (m, n); the concentration of other UAVs' pheromone detected by Vi is:
S12: establish the target probability map. The target probability update formula is:
where Pmn(k) is the probability that a target exists at (m, n) at time k, pD is the sensor detection probability, pF is the sensor false-alarm probability, and τ ∈ [0, 1] is the target-probability dynamic information factor; ΔPmn(k) is the probability change at cell (m, n) caused by other cells being visited while (m, n) itself is not visited by a UAV:
where D(k) is the set of all cells visited at time k and Nv is the number of UAVs.
S13: establish the certainty map. The certainty update equation is:
wherein τc is the dynamic information factor of certainty and χ ∈ [0, 1] is a constant.
S14: let Hmn(k) be the total pheromone concentration at cell (m, n), a function of grid position and time; the environment search graph is then obtained as
S2 specifically proceeds as follows:
The size of the Q-value table is determined by the UAV state and the control input: there are Lx × Ly location states, z possible headings at each cell, and l selectable control inputs per UAV, so the designed Q table has Lx × Ly × z rows and l columns.
S3 specifically proceeds as follows:
S31: the collision cost I is defined as,
where the term is the territory awareness exhibited by UAV Vi, i.e. the concentration of other UAVs' pheromone it detects, calculated as follows:
In the above formula, Hmn(k) is the total pheromone generated by all UAVs at cell (m, n).
S4 specifically proceeds as follows:
S41: without considering no-fly zones, the reward-penalty function is designed as follows,
where a is a constant affecting the generalization ability of the learning process and a × J(s(k), u(k)) ∈ (−R, R); the maximum reward is R and the maximum penalty is set to −R; d is the actual distance between UAVs; J(s(k), u(k)) is the search efficiency function; D is the minimum safe distance, and d ≥ D must hold to guarantee the safe flight of each UAV.
S42: when there is a no-fly zone, let B be the distance from the UAV to the no-fly zone center; B must be greater than the no-fly zone radius D*, and the reward-penalty function is further refined as follows,
that is, a UAV that collides or flies into a no-fly zone receives the maximum penalty.
By adopting the above technical solution, the method for cooperatively searching multiple dynamic targets in an unknown sea area by a reinforcement-learning UAV group provided by the invention solves the primary safety problem of multi-UAV cooperative collision avoidance, designs a new reward-penalty function from the search efficiency function, plans multi-UAV search trajectories online according to efficiency with the reinforcement learning method, and updates the search graph with the search results, greatly improving search efficiency.
Detailed description of the invention
In order to explain the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some of the embodiments recorded in the present application; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of the onboard sensor detection model;
Fig. 2 is a schematic diagram of the initial stage of the reinforcement-learning search;
Fig. 3 is a schematic diagram of the sea-area information learned during search;
Fig. 4 is a schematic diagram of the UAV search result;
Fig. 5 is a random-search trajectory plot;
Fig. 6 is a traversal-search trajectory plot;
Fig. 7 is the flow chart of the method.
Specific embodiment
To make the technical solution and advantages of the present invention clearer, the technical solution in the embodiments of the invention is described clearly and completely below with reference to the drawings:
As shown in Fig. 1 and Fig. 7, a method for cooperatively searching multiple dynamic targets in an unknown sea area by a reinforcement-learning UAV group specifically comprises the following steps:
S1: divide the search area into Lx × Ly grid cells with the grid method. Establish the multi-UAV sea-area search graph based on the sea environment, the UAV dynamics, the dynamics of ships moving at sea and the sensor detection model, where (m, n) is the cell coordinate and k is the time instant; the specific values are calculated as follows:
S11: establish the territory-awareness information map. When UAV Vi searches grid cell (m, n) it generates pheromone Hi(mn)(k), which diffuses to the other cells of the search graph; at cell (a, b) the diffusion transfer function is:
wherein ρ and β are constants;
When Nv UAVs execute the search mission, Nv kinds of pheromone are continuously generated and diffused. Taking cell (c, d) as an example, the current pheromone concentration is the sum of the concentration left after evaporation from the last moment and the newly generated pheromone diffused to the cell; the update equation is:
wherein τH ∈ [0, 1] is the evaporation factor;
When UAV Vi detects a high concentration of other UAVs' pheromone in cell (m, n), this indicates that other UAVs are frequently active at (m, n); the concentration of other UAVs' pheromone detected by Vi is:
S12: establish the target probability map. The target probability update formula is:
where Pmn(k) is the probability that a target exists at (m, n) at time k, pD is the sensor detection probability, pF is the sensor false-alarm probability, and τ ∈ [0, 1] is the target-probability dynamic information factor; ΔPmn(k) is the probability change at cell (m, n) caused by other cells being visited while (m, n) itself is not visited by a UAV:
where D(k) is the set of all cells visited at time k and Nv is the number of UAVs.
S13: establish the certainty map. The certainty update equation is:
wherein τc is the dynamic information factor of certainty and χ ∈ [0, 1] is a constant.
S14: in the inertial coordinate frame, the UAV motion model is established as follows:
where (xi, yi) ∈ R² is the location state of Vi in the search plane, the yaw angle and the speed vi are those of Vi, ui ∈ [−1, 1] is the decision variable, and ηmaxi is the peak turn rate of Vi; constrained by UAV performance, the speed must satisfy vi ∈ [vmin, vmax].
S15: the UAV carries a visible-light sensor installed at a fixed angle and flies horizontally at a fixed altitude. As shown in Fig. 1, in the relative coordinate frame the detection width is described by the following formula:
du = 2hu · tan γu / sin αu
where hu is the flight altitude of the UAV, αu is the sensor installation angle, and γu is the horizontal field-of-view angle of the sensor.
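The detection-width formula above is straightforward to evaluate; a minimal sketch (the altitude and angle values in the example are illustrative, not from the patent):

```python
import math

def detection_width(h_u: float, gamma_u: float, alpha_u: float) -> float:
    """Ground-projected sensor detection width d_u = 2 * h_u * tan(gamma_u) / sin(alpha_u).

    h_u: UAV flight altitude; gamma_u: horizontal field-of-view angle;
    alpha_u: sensor installation angle. Angles in radians.
    """
    return 2.0 * h_u * math.tan(gamma_u) / math.sin(alpha_u)

# Example: 100 m altitude, 30-degree field of view, 45-degree installation angle
w = detection_width(100.0, math.radians(30.0), math.radians(45.0))
```

A wider field of view or a shallower installation angle enlarges the swath each UAV sweeps per time step, which in turn sets how many grid cells a single visit can update.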
S2 specifically proceeds as follows: there are z possible headings at each grid cell and l selectable control inputs per UAV, so the designed Q table has Lx × Ly × z rows and l columns; every entry of the Q-value table is initialized to 0.
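The Q-table layout described above can be sketched as a zero-initialized array; the concrete values of Lx, Ly, z and l below are hypothetical placeholders:

```python
import numpy as np

Lx, Ly = 20, 20   # grid dimensions (hypothetical values)
z = 8             # possible headings at each cell (hypothetical)
l = 3             # selectable control inputs per UAV (hypothetical)

# One row per (cell, heading) state, one column per control input,
# every entry initialized to 0 as the patent specifies
Q = np.zeros((Lx * Ly * z, l))
```

Indexing a row then amounts to flattening (cell x, cell y, heading) into a single state index.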
S3 specifically proceeds as follows:
S31: according to the Q values of the group's current state, select the decision with probability given by the Boltzmann distribution mechanism. In state s(k), the probability that strategy set u(k) is selected is:
where u ∈ A means that strategy u is one of the executable strategies in decision set A; the magnitude of T determines the ability of learning to explore the unknown space: the larger T, the stronger the ability to explore the new decision space (if T is infinite, P(u(k)) = 1/m, i.e. a random decision). T is defined as,
T = T0 n^(−1/λ)
where the parameters satisfy λ > 1 and T0 > 0.
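The Boltzmann selection and annealing schedule described above can be sketched as follows; the default T0 and λ values are illustrative, not from the patent:

```python
import numpy as np

def boltzmann_select(q_row, T, rng=np.random.default_rng()):
    """Pick an action index with P(u) = exp(Q(s,u)/T) / sum_v exp(Q(s,v)/T)."""
    # Subtract the max before exponentiating for numerical stability
    prefs = np.exp((q_row - np.max(q_row)) / T)
    probs = prefs / prefs.sum()
    return int(rng.choice(len(q_row), p=probs))

def temperature(n, T0=10.0, lam=2.0):
    """Annealing schedule T = T0 * n**(-1/lam): exploration decays with episode n."""
    return T0 * n ** (-1.0 / lam)
```

With large T the distribution is nearly uniform (pure exploration); as T decays the selection concentrates on the highest-Q action (exploitation).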
S32: after executing the decision, the group reaches a new state and generates the search efficiency function, obtained as the weighted sum of the target detection gain Jp, the environment search gain Jχ, the execution cost C and the collision cost I,
J(s(k), u(k)) = w1 Jp(k) + w2 Jχ(k) − w3 C(k) − w4 I(k)
where 0 ≤ wi ≤ 1 (i = 1, 2, 3, 4) are weights. Note that the above gains and costs have different dimensions, so each term must be normalized before summation.
S33: the target detection gain Jp is computed from the target probability, whose update formula is,
where pD is the sensor detection probability, pF is the sensor false-alarm probability and τ ∈ [0, 1] is the target-probability dynamic information factor; ΔPmn(k) is the probability change at cell (m, n) caused by other cells being visited while (m, n) itself is not visited by a UAV,
where D(k) is the set of all cells visited at time k; if a UAV platform visits (m, n), the update of Pmn(k) is related to the platform sensor's detection variable bk: bk = 1 means the onboard sensor detects a target and bk = 0 means it does not.
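The patent's exact probability-map update appears only as an image, but the sensor model it names (pD, pF, bk) fits a common Bayesian form; the sketch below is that standard form, stated as an assumption, not the patent's formula:

```python
def update_target_prob(p, detected, pD=0.9, pF=0.05):
    """One common Bayesian update for a visited cell's target probability.

    ASSUMED form (the patent's formula is not reproduced in the text):
    the prior p is revised by the sensor model, where pD is the detection
    probability, pF the false-alarm probability, and `detected` is the
    detection variable b_k (True when the onboard sensor reports a target).
    The default pD/pF values are illustrative.
    """
    if detected:          # b_k = 1: sensor reports a target
        num = pD * p
        den = pD * p + pF * (1.0 - p)
    else:                 # b_k = 0: no detection this visit
        num = (1.0 - pD) * p
        den = (1.0 - pD) * p + (1.0 - pF) * (1.0 - p)
    return num / den
```

A detection pushes the cell's probability toward 1 and a miss pushes it toward 0, at rates set by the sensor's reliability.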
The target detection gain is then,
S34: compute the environment search gain Jχ. As the UAVs search and the sensors observe, the UAVs gradually come to know the search area and the information entropy of the corresponding search graph gradually decreases, so the environment search gain is defined as the reduction of information entropy:
Jχ(k) = H(k) − H(k+1)
where H(k) is the information entropy at time k, which describes the degree of uncertainty of the current environment.
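The entropy-reduction gain Jχ(k) = H(k) − H(k+1) can be sketched as below; the per-cell binary entropy and the natural-log base are assumptions, since the patent's entropy expression is given only as an image:

```python
import numpy as np

def info_entropy(P):
    """Map entropy as the sum of per-cell binary entropies of the target
    probabilities (ASSUMED form; base and exact expression may differ
    from the patent's image formula)."""
    P = np.clip(np.asarray(P, dtype=float), 1e-12, 1.0 - 1e-12)
    return float(-(P * np.log(P) + (1.0 - P) * np.log(1.0 - P)).sum())

def environment_gain(P_before, P_after):
    """J_chi(k) = H(k) - H(k+1): searching reduces map uncertainty."""
    return info_entropy(P_before) - info_entropy(P_after)
```

Cells near probability 0.5 contribute the most entropy, so visiting poorly-known cells yields the largest gain.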
S35: compute the execution cost C, i.e. the time and fuel consumed by the UAV while flying to the target point, which can be estimated with formula (14):
S36: the collision cost I is defined as,
where the term is the territory awareness exhibited by UAV Vi, i.e. the concentration of other UAVs' pheromone it detects; its calculation formula is as follows:
When UAV Vi searches cell (m, n) it generates pheromone Hi(mn)(k), which diffuses to the other cells of the search graph; at cell (a, b) the diffusion transfer function is,
wherein ρ and β are constants. When Nv UAVs execute the search mission, Nv kinds of pheromone are continuously generated and diffused; taking cell (c, d) as an example, the current pheromone concentration is the sum of the concentration left after evaporation from the last moment and the newly generated pheromone diffused to the cell; the update equation is:
where τH ∈ [0, 1] is the evaporation factor.
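The generate–diffuse–evaporate cycle described above can be sketched as one map update; the distance-decay kernel rho * exp(-beta * d²) is an assumed form, since the patent's diffusion transfer function is given only as an image:

```python
import numpy as np

def pheromone_step(H, sources, tau_H=0.3, rho=1.0, beta=0.5):
    """One update of the territory-awareness pheromone map.

    H: current concentration grid. sources: (m, n) cells searched this step,
    each emitting pheromone that diffuses over the grid. The kernel
    rho * exp(-beta * d^2) is an ASSUMPTION standing in for the patent's
    image-only diffusion function; tau_H in [0, 1] is the evaporation factor.
    """
    Lx, Ly = H.shape
    mm, nn = np.meshgrid(np.arange(Lx), np.arange(Ly), indexing="ij")
    new = (1.0 - tau_H) * H                    # what evaporation leaves behind
    for (m, n) in sources:
        d2 = (mm - m) ** 2 + (nn - n) ** 2
        new += rho * np.exp(-beta * d2)        # diffusion from this UAV's cell
    return new
```

Cells another UAV visited recently carry high concentration, which raises the collision cost I there and steers the group apart.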
S4 specifically proceeds as follows:
S41: a higher overall efficiency J(s(k), u(k)) is desired, so after each search action a UAV that obtains higher efficiency is rewarded immediately, and one that obtains lower efficiency is punished immediately.
The reward-penalty function r(k) is designed as follows,
where a is a constant affecting the generalization ability of the learning process and a × J(s(k), u(k)) ∈ (−R, R); the maximum reward is R and the maximum penalty is set to −R; J(s(k), u(k)) is determined by formula (16); d is the actual distance between UAVs,
and D is the minimum safe distance: d ≥ D must hold to guarantee the safe flight of each UAV.
S42: considering no-fly zones, let B be the distance from the UAV to the no-fly zone center; B should be greater than the no-fly zone radius D*, and the reward-penalty function is further refined as follows,
that is, a UAV that collides or flies into a no-fly zone receives the maximum penalty. As shown in Fig. 2, searching a shallow-water zone or a no-fly zone is punished.
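The behavior described in S41–S42 can be sketched as a piecewise reward; the exact piecewise form in the patent is an image, so this reproduces only the stated rules (scaled efficiency inside the safe region, maximum penalty −R on collision risk or no-fly-zone entry), with illustrative parameter defaults:

```python
def reward(J, d, B=None, a=1.0, R=10.0, D=5.0, D_star=0.0):
    """Sketch of the reward-penalty function r(k).

    J: search efficiency J(s(k), u(k)); d: actual distance between UAVs;
    B: distance to the no-fly zone center (None if no no-fly zone);
    a scales J so that a*J stays inside (-R, R); D is the minimum safe
    distance and D_star the no-fly zone radius. Defaults are illustrative.
    """
    if d < D:                             # collision risk between UAVs
        return -R
    if B is not None and B <= D_star:     # inside the no-fly zone
        return -R
    return a * J                          # safe flight: reward tracks efficiency
```

Because both unsafe events map to the same maximum penalty −R, the learner treats collision and no-fly-zone entry as equally unacceptable.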
S43: the update rule of the Q-value function is,
where si(k) is the current state of Vi; ui(k) is the currently selected decision, i.e. the change of the UAV's yaw angle; r(k) is the immediate reward or penalty obtained after the aircraft in state si(k) executes strategy ui(k) and reaches state si(k+1); the max term is the maximum Q value attainable from state si(k+1) over strategies u; α ∈ [0, 1] is the learning rate; γ is the discount factor.
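The rule described in S43 is the standard Q-learning update; a minimal sketch over the NumPy Q table, assuming rows index states and columns index decisions:

```python
import numpy as np

def q_update(Q, s, u, r, s_next, alpha=0.1, gamma=0.9):
    """Q(s,u) <- Q(s,u) + alpha * (r + gamma * max_u' Q(s',u') - Q(s,u)).

    alpha is the learning rate and gamma the discount factor; the defaults
    here are illustrative, not the patent's values.
    """
    Q[s, u] += alpha * (r + gamma * Q[s_next].max() - Q[s, u])
    return Q
```

Repeating this update along the trajectories selected in S3 is what "completes the learning of the whole Q-value table" in S5.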
S5 specifically proceeds as follows: set the new state reached by the group as the current state and keep making decisions until the learning of the whole Q-value table is completed. After the Q table finally converges, the recorded sea-area information is as shown in Fig. 3. The UAV group makes decisions according to the trained Q table and completes the search mission; Fig. 4 shows the sea-area information grasped by the UAVs after the search mission, including all shallow-water zones, the no-fly zones, and the sea area covered by 9 vessels cruising along straight lines.
Under these experimental conditions, the search effect of the Q-learning algorithm is compared with random search and traversal search using the Monte Carlo method over 500 runs; the simulated random-search and traversal-search trajectories are shown in Fig. 5 and Fig. 6. Among the three methods, reinforcement-learning search has the highest efficiency, finding on average about one more target per time step than random search; traversal search may eventually find all targets, but its efficiency is extremely low. The simulation experiments show the validity of the algorithm, and comparative analysis verifies that it realizes cooperative multi-UAV dynamic target search more effectively than the original search methods.
The above is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitution or change made by a person skilled in the art within the technical scope disclosed by the present invention, according to the technical solution of the present invention and its inventive concept, shall be covered by the scope of protection of the present invention.

Claims (5)

1. A method for cooperatively searching multiple dynamic targets in an unknown sea area by a reinforcement-learning unmanned aerial vehicle group, characterized by comprising the following steps:
S1: divide the search area with the grid method; establish a multi-UAV sea-area search graph based on the sea environment, the UAV dynamics, the dynamics of ships moving at sea and the sensor detection model; establish a territory-awareness information map from the pheromone concentration each UAV produces within its area, and use it to extend the multi-UAV sea-area search graph;
S2: design a Q-value table from the UAV state information and the decision u(k);
S3: select and execute each UAV's flight path with a Boltzmann distribution mechanism according to the Q values of the group's current state; when the group reaches a new state, obtain the search efficiency function as the weighted sum of the target detection gain Jp, the environment search gain Jχ, the execution cost C and the collision cost I;
S4: design a reward-penalty function for evaluating the UAV flight state from the search efficiency function, and update the Q value of the new state the group reaches according to it;
S5: set the reached new state as the current state and keep making flight-path decisions until the whole Q-value table is learned; the UAV group then makes decisions according to the trained Q table and completes the search mission.
2. The method for cooperatively searching multiple dynamic targets in an unknown sea area by a reinforcement-learning unmanned aerial vehicle group according to claim 1, further characterized in that S1 specifically proceeds as follows:
S11: establish the territory-awareness information map. When UAV Vi searches grid cell (m, n) it generates pheromone Hi(mn)(k), which diffuses to the other cells of the search graph; the pheromone diffusion transfer function at cell (a, b) is:
wherein ρ and β are constants;
When Nv UAVs execute the search mission, Nv kinds of pheromone are generated and diffused; at cell (c, d), the current pheromone concentration is the sum of the concentration left after evaporation from the last moment and the newly generated pheromone diffused to the cell; the update equation is:
wherein τH ∈ [0, 1] is the evaporation factor;
When UAV Vi detects a high concentration of other UAVs' pheromone in cell (m, n), this indicates that other UAVs are frequently active at (m, n); the concentration of other UAVs' pheromone detected by Vi is:
S12: establish the target probability map. The target probability update formula is:
where Pmn(k) is the probability that a target exists at (m, n) at time k, pD is the sensor detection probability, pF is the sensor false-alarm probability, τ ∈ [0, 1] is the target-probability dynamic information factor, and ΔPmn(k) is the probability change at cell (m, n) caused by other cells being visited while (m, n) itself is not visited by a UAV:
where D(k) is the set of all cells visited at time k and Nv is the number of UAVs;
S13: establish the certainty map. The certainty update equation is:
wherein τc is the dynamic information factor of certainty and χ ∈ [0, 1] is a constant;
S14: let Hmn(k) be the total pheromone concentration at cell (m, n), a function of grid position and time; the environment search graph is then obtained as
3. The method for cooperatively searching multiple dynamic targets in an unknown sea area by a reinforcement-learning unmanned aerial vehicle group according to claim 1, further characterized in that S2 specifically proceeds as follows:
the size of the Q-value table is determined by the UAV state and the control input: there are Lx × Ly location states, z possible headings at each cell, and l selectable control inputs per UAV, so the designed Q table has Lx × Ly × z rows and l columns.
4. The method for cooperatively searching multiple dynamic targets in an unknown sea area by a reinforcement-learning unmanned aerial vehicle group according to claim 1, further characterized in that S3 specifically proceeds as follows:
S31: the collision cost I is defined as,
where the term is the territory awareness exhibited by UAV Vi, i.e. the concentration of other UAVs' pheromone it detects, calculated as follows:
where Hmn(k) is the total pheromone generated by all UAVs at cell (m, n).
5. The method for cooperatively searching multiple dynamic targets in an unknown sea area by a reinforcement-learning unmanned aerial vehicle group according to claim 1, further characterized in that S4 specifically proceeds as follows:
S41: without considering no-fly zones, the reward-penalty function is designed as follows,
where a is a constant affecting the generalization ability of the learning process and a × J(s(k), u(k)) ∈ (−R, R); the maximum reward is R and the maximum penalty is set to −R; d is the actual distance between UAVs; J(s(k), u(k)) is the search efficiency function; D is the minimum safe distance, and d ≥ D must hold to guarantee the safe flight of each UAV;
S42: when there is a no-fly zone, let B be the distance from the UAV to the no-fly zone center; B is greater than the no-fly zone radius D*, and the reward-penalty function is further refined as follows,
that is, a UAV that collides or flies into a no-fly zone receives the maximum penalty.
CN201910346512.6A 2019-04-26 2019-04-26 Method for cooperatively searching multiple dynamic targets in unknown sea area by reinforcement learning unmanned aerial vehicle cluster Active CN110196605B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910346512.6A CN110196605B (en) 2019-04-26 2019-04-26 Method for cooperatively searching multiple dynamic targets in unknown sea area by reinforcement learning unmanned aerial vehicle cluster

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910346512.6A CN110196605B (en) 2019-04-26 2019-04-26 Method for cooperatively searching multiple dynamic targets in unknown sea area by reinforcement learning unmanned aerial vehicle cluster

Publications (2)

Publication Number Publication Date
CN110196605A true CN110196605A (en) 2019-09-03
CN110196605B CN110196605B (en) 2022-03-22

Family

ID=67752255

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910346512.6A Active CN110196605B (en) 2019-04-26 2019-04-26 Method for cooperatively searching multiple dynamic targets in unknown sea area by reinforcement learning unmanned aerial vehicle cluster

Country Status (1)

Country Link
CN (1) CN110196605B (en)

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110806591A (en) * 2019-10-11 2020-02-18 广东工业大学 Unmanned aerial vehicle coverage search method and search device based on coherent theory
CN111045445A (en) * 2019-10-23 2020-04-21 浩亚信息科技有限公司 Aircraft intelligent collision avoidance method, equipment and medium based on reinforcement learning
CN111538059A (en) * 2020-05-11 2020-08-14 东华大学 Self-adaptive rapid dynamic positioning system and method based on improved Boltzmann machine
CN111667513A (en) * 2020-06-01 2020-09-15 西北工业大学 Unmanned aerial vehicle maneuvering target tracking method based on DDPG transfer learning
CN111680934A (en) * 2020-06-30 2020-09-18 西安电子科技大学 Unmanned aerial vehicle task allocation method based on group entropy and Q learning
CN111708355A (en) * 2020-06-19 2020-09-25 中国人民解放军国防科技大学 Multi-unmanned aerial vehicle action decision method and device based on reinforcement learning
CN111880567A (en) * 2020-07-31 2020-11-03 中国人民解放军国防科技大学 Fixed-wing unmanned aerial vehicle formation coordination control method and device based on deep reinforcement learning
CN112327918A (en) * 2020-11-12 2021-02-05 大连海事大学 Multi-swarm sea area environment self-adaptive search algorithm based on elite learning
WO2021082864A1 (en) * 2019-10-30 2021-05-06 武汉理工大学 Deep reinforcement learning-based intelligent collision-avoidance method for swarm of unmanned surface vehicles
CN112947575A (en) * 2021-03-17 2021-06-11 中国人民解放军国防科技大学 Unmanned aerial vehicle cluster multi-target searching method and system based on deep reinforcement learning
CN113342030A (en) * 2021-04-27 2021-09-03 湖南科技大学 Multi-unmanned aerial vehicle cooperative self-organizing control method and system based on reinforcement learning
CN113382060A (en) * 2021-06-07 2021-09-10 北京理工大学 Unmanned aerial vehicle track optimization method and system in Internet of things data collection
CN113505431A (en) * 2021-06-07 2021-10-15 中国人民解放军国防科技大学 ST-DQN-based target searching method, device, equipment and medium for marine unmanned aerial vehicle
CN113671996A (en) * 2021-10-22 2021-11-19 中国电子科技集团公司信息科学研究院 Heterogeneous unmanned aerial vehicle reconnaissance method and system based on pheromone
CN113985913A (en) * 2021-09-24 2022-01-28 大连海事大学 Integrated and separated type multi-unmanned aerial vehicle rescue system based on urban fire spread prediction
CN114200964A (en) * 2022-02-17 2022-03-18 南京信息工程大学 Unmanned aerial vehicle cluster cooperative reconnaissance coverage distributed autonomous optimization method
CN114446121A (en) * 2022-02-24 2022-05-06 汕头市快畅机器人科技有限公司 Control method of life search cluster education robot
CN115328143A (en) * 2022-08-26 2022-11-11 齐齐哈尔大学 Environment-driven recovery guiding method for a master-slave water surface robot

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107024220A (en) * 2017-04-14 2017-08-08 淮安信息职业技术学院 Robot path planning method based on a reinforcement learning cockroach algorithm
CN107729953A (en) * 2017-09-18 2018-02-23 清华大学 Robot plume tracing method based on continuous state-action space reinforcement learning
CN108319286A (en) * 2018-03-12 2018-07-24 西北工业大学 UAV air combat maneuvering decision method based on reinforcement learning
CN108594834A (en) * 2018-03-23 2018-09-28 哈尔滨工程大学 Adaptive target search and obstacle avoidance method for multiple AUVs in unknown environments
CN108762281A (en) * 2018-06-08 2018-11-06 哈尔滨工程大学 Real-time underwater intelligent robot decision-making method based on memory association and embedded reinforcement learning
CN109241552A (en) * 2018-07-12 2019-01-18 哈尔滨工程大学 Underwater robot motion planning method based on multi-constraint objectives
WO2019047646A1 (en) * 2017-09-05 2019-03-14 百度在线网络技术(北京)有限公司 Obstacle avoidance method and device for vehicle

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Zhang Jingjing et al., "A UAV target search algorithm based on reinforcement learning", Application Research of Computers *
Hao Chuanchuan et al., "Three-dimensional UAV flight path planning algorithm based on Q-learning", Journal of Shanghai Jiao Tong University *

Cited By (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110806591A (en) * 2019-10-11 2020-02-18 广东工业大学 Unmanned aerial vehicle coverage search method and search device based on coherent theory
CN110806591B (en) * 2019-10-11 2022-02-11 广东工业大学 Unmanned aerial vehicle coverage search method and search device based on coherent theory
CN111045445A (en) * 2019-10-23 2020-04-21 浩亚信息科技有限公司 Aircraft intelligent collision avoidance method, equipment and medium based on reinforcement learning
CN111045445B (en) * 2019-10-23 2023-11-28 浩亚信息科技有限公司 Intelligent collision avoidance method, equipment and medium for aircraft based on reinforcement learning
US20220189312A1 (en) * 2019-10-30 2022-06-16 Wuhan University Of Technology Intelligent collision avoidance method for a swarm of unmanned surface vehicles based on deep reinforcement learning
WO2021082864A1 (en) * 2019-10-30 2021-05-06 武汉理工大学 Deep reinforcement learning-based intelligent collision-avoidance method for swarm of unmanned surface vehicles
CN111538059A (en) * 2020-05-11 2020-08-14 东华大学 Self-adaptive rapid dynamic positioning system and method based on improved Boltzmann machine
CN111667513A (en) * 2020-06-01 2020-09-15 西北工业大学 Unmanned aerial vehicle maneuvering target tracking method based on DDPG transfer learning
CN111667513B (en) * 2020-06-01 2022-02-18 西北工业大学 Unmanned aerial vehicle maneuvering target tracking method based on DDPG transfer learning
CN111708355A (en) * 2020-06-19 2020-09-25 中国人民解放军国防科技大学 Multi-unmanned aerial vehicle action decision method and device based on reinforcement learning
CN111708355B (en) * 2020-06-19 2023-04-18 中国人民解放军国防科技大学 Multi-unmanned aerial vehicle action decision method and device based on reinforcement learning
CN111680934B (en) * 2020-06-30 2023-04-07 西安电子科技大学 Unmanned aerial vehicle task allocation method based on group entropy and Q learning
CN111680934A (en) * 2020-06-30 2020-09-18 西安电子科技大学 Unmanned aerial vehicle task allocation method based on group entropy and Q learning
CN111880567A (en) * 2020-07-31 2020-11-03 中国人民解放军国防科技大学 Fixed-wing unmanned aerial vehicle formation coordination control method and device based on deep reinforcement learning
CN111880567B (en) * 2020-07-31 2022-09-16 中国人民解放军国防科技大学 Fixed-wing unmanned aerial vehicle formation coordination control method and device based on deep reinforcement learning
CN112327918A (en) * 2020-11-12 2021-02-05 大连海事大学 Multi-swarm sea area environment self-adaptive search algorithm based on elite learning
CN112327918B (en) * 2020-11-12 2023-06-02 大连海事大学 Multi-swarm sea area environment self-adaptive search algorithm based on elite learning
CN112947575A (en) * 2021-03-17 2021-06-11 中国人民解放军国防科技大学 Unmanned aerial vehicle cluster multi-target searching method and system based on deep reinforcement learning
CN113342030A (en) * 2021-04-27 2021-09-03 湖南科技大学 Multi-unmanned aerial vehicle cooperative self-organizing control method and system based on reinforcement learning
CN113505431A (en) * 2021-06-07 2021-10-15 中国人民解放军国防科技大学 ST-DQN-based target searching method, device, equipment and medium for marine unmanned aerial vehicle
CN113382060A (en) * 2021-06-07 2021-09-10 北京理工大学 Unmanned aerial vehicle track optimization method and system in Internet of things data collection
CN113505431B (en) * 2021-06-07 2022-05-06 中国人民解放军国防科技大学 Method, device, equipment and medium for searching targets of maritime unmanned aerial vehicle based on ST-DQN
CN113382060B (en) * 2021-06-07 2022-03-22 北京理工大学 Unmanned aerial vehicle track optimization method and system in Internet of things data collection
CN113985913A (en) * 2021-09-24 2022-01-28 大连海事大学 Integrated and separated type multi-unmanned aerial vehicle rescue system based on urban fire spread prediction
CN113985913B (en) * 2021-09-24 2024-04-12 大连海事大学 Integrated and separated type multi-unmanned aerial vehicle rescue system based on urban fire spread prediction
CN113671996A (en) * 2021-10-22 2021-11-19 中国电子科技集团公司信息科学研究院 Heterogeneous unmanned aerial vehicle reconnaissance method and system based on pheromone
CN114200964A (en) * 2022-02-17 2022-03-18 南京信息工程大学 Unmanned aerial vehicle cluster cooperative reconnaissance coverage distributed autonomous optimization method
CN114446121A (en) * 2022-02-24 2022-05-06 汕头市快畅机器人科技有限公司 Control method of life search cluster education robot
CN114446121B (en) * 2022-02-24 2024-03-05 汕头市快畅机器人科技有限公司 Control method of life search cluster education robot
CN115328143A (en) * 2022-08-26 2022-11-11 齐齐哈尔大学 Environment-driven recovery guiding method for a master-slave water surface robot
CN115328143B (en) * 2022-08-26 2023-04-18 齐齐哈尔大学 Environment-driven recovery guiding method for a master-slave water surface robot

Also Published As

Publication number Publication date
CN110196605B (en) 2022-03-22

Similar Documents

Publication Publication Date Title
CN110196605A (en) Method for cooperatively searching multiple dynamic targets in unknown sea area by reinforcement learning unmanned aerial vehicle cluster
Jensen et al. Algorithms at war: the promise, peril, and limits of artificial intelligence
CN108801266B (en) Flight path planning method for searching uncertain environment by multiple unmanned aerial vehicles
Yu et al. A knee-guided differential evolution algorithm for unmanned aerial vehicle path planning in disaster management
CN109254588A (en) UAV cluster cooperative reconnaissance method based on crossover-and-mutation pigeon-inspired optimization
CN106705970A (en) Multi-UAV (unmanned aerial vehicle) cooperative path planning method based on ant colony algorithm
CN105892480A (en) Self-organizing method for cooperative reconnaissance and strike tasks of a heterogeneous multi-UAV system
US9030347B2 (en) Preemptive signature control for vehicle survivability planning
US9240001B2 (en) Systems and methods for vehicle survivability planning
CN108318032A (en) Intelligent unmanned aerial vehicle flight path planning method considering attack and defense
US6718261B2 (en) Architecture for real-time maintenance of distributed mission plans
Johansson Evaluating the performance of TEWA systems
US8831793B2 (en) Evaluation tool for vehicle survivability planning
Leboucher et al. A two-step optimisation method for dynamic weapon target assignment problem
CN114020031B (en) Unmanned aerial vehicle cluster collaborative dynamic target searching method based on improved pigeon-inspired optimization
CN109885082B (en) Task-driven unmanned aerial vehicle trajectory planning method
CN112486200B (en) Multi-unmanned aerial vehicle cooperative confrontation online re-decision method
Chen et al. Cooperative area reconnaissance for multi-UAV in dynamic environment
CN116679751A (en) Multi-aircraft collaborative search method considering flight constraint
Su et al. An improved adaptive differential evolution algorithm for single unmanned aerial vehicle multitasking
Zhang et al. Design of the fruit fly optimization algorithm based path planner for UAV in 3D environments
He et al. Learning-based airborne sensor task assignment in unknown dynamic environments
Zheng et al. Coevolving and cooperating path planner for multiple unmanned air vehicles
Yue et al. Reinforcement learning based approach for multi-UAV cooperative searching in unknown environments
Lv et al. Maritime static target search based on particle swarm algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant