CN116149375B - Unmanned aerial vehicle search planning method and device for online decision, electronic equipment and medium

Info

Publication number
CN116149375B
CN116149375B (application CN202310430208.6A)
Authority
CN
China
Prior art keywords
search
vertex
aerial vehicle
unmanned aerial
target vertex
Prior art date
Legal status
Active
Application number
CN202310430208.6A
Other languages
Chinese (zh)
Other versions
CN116149375A
Inventor
肖开明
段浩鹏
刘丽华
王懋
杨晧宇
李璇
黄宏斌
Current Assignee
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date
Filing date
Publication date
Application filed by National University of Defense Technology
Priority to CN202310430208.6A
Publication of CN116149375A
Application granted
Publication of CN116149375B

Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05D SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00 Control of position, course or altitude of land, water, air, or space vehicles, e.g. automatic pilot
    • G05D1/10 Simultaneous control of position or course in three dimensions
    • G05D1/101 Simultaneous control of position or course in three dimensions specially adapted for aircraft
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The embodiments of the disclosure provide an unmanned aerial vehicle search planning method and device for online decision, an electronic device and a medium, wherein the method comprises the following steps: acquiring, according to a directed network graph, the vertex at which the unmanned aerial vehicle stayed in stage t-1, the remaining energy and the decision variable; determining the nearest non-traversed vertex as the target vertex of stage t; judging, according to the remaining energy and the movement cost, whether the unmanned aerial vehicle satisfies the two conditions of travelling to the target vertex and returning from the target vertex to the starting point; when the conditions are met, controlling the unmanned aerial vehicle to travel to the target vertex and acquiring the search cost and the search benefit of the target vertex online; determining the search variable of the target vertex from the search cost, the search benefit and the decision variable of stage t, and determining whether to search the target vertex. The method and the device reduce the energy consumption of the unmanned aerial vehicle on the path in an optimal manner under the limited total energy of the unmanned aerial vehicle, so that the accumulated search benefit is as high as possible.

Description

Unmanned aerial vehicle search planning method and device for online decision, electronic equipment and medium
Technical Field
The invention relates to the field of unmanned aerial vehicles, in particular to an unmanned aerial vehicle search planning method and device for online decision, electronic equipment and medium.
Background
With the rapid development of unmanned technologies, unmanned aerial vehicles have been widely used in fields such as information searching, environmental monitoring, border reconnaissance and material transportation, owing to their advantages in maneuverability, mobility and flexibility. In general, unmanned aerial vehicle operation includes both path planning and search decisions, which directly affect the efficiency of task execution.
Unmanned aerial vehicle search planning is an important mode of information acquisition, and online planning for unmanned aerial vehicles has become a focus of attention because unmanned aerial vehicles offer advantages over traditional manned aircraft. Recent research has been directed at the joint optimization of path planning and the search process in unmanned aerial vehicle search planning.
Conventional methods follow an offline two-stage paradigm, considering path selection and search time allocation in turn and making all decisions before acting. In the first stage, the path planning problem is solved on its own by methods such as greedy search or ant colony optimization (ACO), without considering information such as the search benefit and cost of each search point. The second stage is concerned with the allocation of time or cost, taking into account the difficulty and potential value of searching a particular point. To solve this problem, a series of methods has been proposed, including a rapid scoring method based on the significance of information gain, a heuristic method based on partially observable distributed Markov processes, and a time allocation method based on Newton's method.
All of the above methods presuppose that prior parameters of the search benefits and costs are known before a decision is made. However, in actual unmanned aerial vehicle search operations the environment to be searched is often unknown; that is, the search benefits and costs generally become known online rather than before acting. Most unmanned aerial vehicle planning methods target globally or locally known environments and plan the search trajectory offline in advance, so their search efficiency degrades and they lack the ability to adapt dynamically when facing unknown environments.
Disclosure of Invention
In view of the above, embodiments of the present disclosure provide an unmanned aerial vehicle search planning method, apparatus, electronic device and medium for online decision, which at least partially solve the problems existing in the prior art.
In a first aspect, an embodiment of the present disclosure provides an unmanned aerial vehicle search planning method for online decision, including:
acquiring, according to the directed network graph of the current round, the vertex at which the unmanned aerial vehicle stayed in stage t-1, the remaining energy and the decision variable, wherein the directed network graph is generated according to the unknown environment to be searched and comprises a vertex set and an edge set connecting any two vertices, the vertex set includes a starting point, and the length of each edge in the edge set is known;
determining the non-traversed vertex closest to the vertex at which the unmanned aerial vehicle currently stays as the target vertex of stage t;
judging, according to the remaining energy and the movement cost of the unmanned aerial vehicle, whether the unmanned aerial vehicle satisfies the two conditions of travelling to the target vertex and returning from the target vertex to the starting point;
when the conditions are met, controlling the unmanned aerial vehicle to travel to the target vertex, and acquiring the search cost and the search benefit of the target vertex online;
determining the search variable of the target vertex according to the search cost, the search benefit and the decision variable of stage t;
and determining whether to search the target vertex according to the search variable.
According to a specific implementation of an embodiment of the disclosure, the decision variable is initially 0.
According to a specific implementation of an embodiment of the disclosure, the method further includes:
and when the condition is not met, controlling the unmanned aerial vehicle to return to the starting point, and updating the directed network graph according to the vertices traversed in the current round.
According to one specific implementation of an embodiment of the present disclosure, the search variable of stage t is calculated from the search cost of the target vertex, the search benefit of the target vertex and the decision variable; when the search variable equals 1, the unmanned aerial vehicle performs the search of the target vertex, and when the search variable equals 0, the unmanned aerial vehicle does not perform the search of the target vertex.
According to a specific implementation manner of the embodiment of the present disclosure, after determining whether to search the target vertex according to the search variable, the method further includes:
updating the decision variable of stage t+1 according to the search variable, a preset learning rate, the search benefit, the total energy of the unmanned aerial vehicle and the movement cost.
According to a specific implementation of an embodiment of the present disclosure, the updated decision variable of stage t+1 is expressed in terms of the total energy b of the unmanned aerial vehicle, the movement cost from the stop position of stage t to the stop position of stage t+1, and the learning rate.
According to a specific implementation of an embodiment of the disclosure, the learning rate is determined by n, the number of vertices in the vertex set.
In a second aspect, an embodiment of the present disclosure provides an unmanned aerial vehicle search planning apparatus for online decision making, including:
a state acquisition unit, configured to acquire, according to the directed network graph of the current round, the vertex at which the unmanned aerial vehicle stayed in stage t-1, the remaining energy and the decision variable, wherein the directed network graph is generated according to the search environment and comprises a vertex set and an edge set connecting any two vertices, the vertex set includes a starting point, and the length of each edge in the edge set is known;
a target vertex decision unit, configured to determine the non-traversed vertex closest to the vertex at which the unmanned aerial vehicle currently stays as the target vertex of stage t;
a movement judging unit, configured to judge, according to the remaining energy and the movement cost of the unmanned aerial vehicle, whether the unmanned aerial vehicle can travel to the target vertex and return from the target vertex to the starting point;
an online acquisition unit, configured to control the unmanned aerial vehicle to travel to the target vertex if the conditions are satisfied, and to acquire the search cost and the search benefit of the target vertex online;
a search variable calculation unit, configured to determine the search variable of the target vertex according to the search cost, the search benefit and the decision variable;
and a search judging unit, configured to determine whether to search the target vertex according to the search variable.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including:
at least one processor; the method comprises the steps of,
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the unmanned aerial vehicle search planning method of online decision making described above.
In a fourth aspect, embodiments of the present disclosure provide a non-transitory computer-readable storage medium storing computer instructions for causing the computer to perform the unmanned aerial vehicle search planning method of online decision-making described above.
In summary, in this embodiment, in order to cope with the challenges of an unknown search environment, path planning and search decisions are made in an integrated online manner: the vertex to which the unmanned aerial vehicle travels, together with the search cost and search benefit of that vertex, is determined online, and a decision on whether to search the vertex is then made using the idea of linear programming, so that the energy consumption of the unmanned aerial vehicle on the path is reduced in an optimal manner under the limited total energy of the unmanned aerial vehicle and the accumulated search benefit is as high as possible.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present disclosure, and other drawings may be obtained according to these drawings without inventive effort to a person of ordinary skill in the art.
Fig. 1 is a schematic flow chart of an unmanned aerial vehicle search planning method for online decision-making according to a first embodiment of the present invention;
FIG. 2 is a schematic diagram of a directed network diagram according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a greedy path planning strategy according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a multi-round search plan provided by an embodiment of the present invention;
FIG. 5 (a) is a graph of the impact of energy budget on search revenue in an offline two-stage process;
FIG. 5 (b) is a graph of the impact of energy budget on search benefits according to an embodiment of the present invention;
FIG. 6 is a schematic diagram showing the performance of the offline two-stage method and the embodiment when planning a single-round search;
FIG. 7 is a schematic diagram showing the performance of the offline two-stage method and the embodiment in multi-round search planning;
fig. 8 is a schematic structural diagram of an unmanned aerial vehicle search planning device for online decision according to a second embodiment of the present invention.
Detailed Description
Embodiments of the present disclosure are described in detail below with reference to the accompanying drawings.
Other advantages and effects of the present disclosure will become readily apparent to those skilled in the art from the following disclosure, which describes embodiments of the present disclosure by way of specific examples. It will be apparent that the described embodiments are merely some, but not all embodiments of the present disclosure. The disclosure may be embodied or practiced in other different specific embodiments, and details within the subject specification may be modified or changed from various points of view and applications without departing from the spirit of the disclosure. It should be noted that the following embodiments and features in the embodiments may be combined with each other without conflict. All other embodiments, which can be made by one of ordinary skill in the art without inventive effort, based on the embodiments in this disclosure are intended to be within the scope of this disclosure.
It is noted that various aspects of the embodiments are described below within the scope of the following claims.
It should be apparent that the aspects described herein may be embodied in a wide variety of forms and that any specific structure and/or function described herein is merely illustrative. Based on the present disclosure, one skilled in the art will appreciate that one aspect described herein may be implemented independently of any other aspect, and that two or more of these aspects may be combined in various ways. For example, an apparatus may be implemented and/or a method practiced using any number of the aspects set forth herein. In addition, such apparatus may be implemented and/or such methods practiced using other structure and/or functionality in addition to one or more of the aspects set forth herein.
It should also be noted that the illustrations provided in the following embodiments merely illustrate the basic concepts of the disclosure by way of illustration, and only the components related to the disclosure are shown in the drawings and are not drawn according to the number, shape and size of the components in actual implementation, and the form, number and proportion of the components in actual implementation may be arbitrarily changed, and the layout of the components may be more complicated.
In addition, in the following description, specific details are provided in order to provide a thorough understanding of the examples. However, it will be understood by those skilled in the art that the aspects may be practiced without these specific details.
Referring to fig. 1, a first embodiment of the present invention provides an unmanned aerial vehicle search planning method for online decision making, which may be executed by an electronic device, in particular, one or more processors in the electronic device, to implement the following steps:
s101, acquiring the position of the unmanned aerial vehicle according to a directed network diagram of the current turnt-Vertex of 1 stage stay, residual energy, and decision variables.
In this embodiment, the electronic device may be a device with an operational capability, such as a notebook computer, a desktop computer, a workstation, or a server, and may control operation of the unmanned aerial vehicle by communicating with the unmanned aerial vehicle. The electronic device may also be an unmanned aerial vehicle itself, which implements the steps of the present embodiment by executing a preset computer program.
In this embodiment, when an unknown environment needs to be searched by the unmanned aerial vehicle, a starting point and the points to be searched in the unknown environment are first set, and a directed network graph can then be constructed from the starting point and the points to be searched. The directed network graph can be denoted G = (V, E): it has a vertex set V and an edge set E, the vertex set V includes the starting point, and the length of each edge in the edge set E is known, so that the distance between two vertices can be determined from the edge lengths.
Fig. 2 shows a schematic diagram of a directed network graph. In Fig. 2, the vertex set V of the directed network graph G comprises 8 vertices in total, one of which is the starting point of the unmanned aerial vehicle.
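As a concrete illustration of the data structure just described (this code is not part of the patent; all class, attribute and value names are assumptions introduced for illustration), a directed network graph with a vertex set, a starting point and known edge lengths could be sketched in Python as follows:

from dataclasses import dataclass
from typing import Dict, Set, Tuple

@dataclass
class DirectedNetworkGraph:
    vertices: Set[int]                        # vertex identifiers, including the starting point
    start: int                                # starting point of the unmanned aerial vehicle
    edge_cost: Dict[Tuple[int, int], float]   # known length (movement cost) of each directed edge

    def cost(self, i: int, j: int) -> float:
        # movement cost of travelling from vertex i to vertex j; edge lengths are known in advance
        return self.edge_cost[(i, j)]

# Illustrative instance with 8 vertices, mirroring Fig. 2, with vertex 1 taken as the start.
graph = DirectedNetworkGraph(
    vertices=set(range(1, 9)),
    start=1,
    edge_cost={(i, j): float(abs(i - j)) for i in range(1, 9) for j in range(1, 9) if i != j},
)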
In this embodiment, the energy of the unmanned aerial vehicle may be understood as the electric quantity of the unmanned aerial vehicle. The unmanned aerial vehicle consumes corresponding energy in the flight and searching processes, the remaining energy of the unmanned aerial vehicle can be determined through the remaining electric quantity of the unmanned aerial vehicle, and the remaining energy of the unmanned aerial vehicle can also be determined through the total energy budget of the unmanned aerial vehicle and the consumed energy.
In this embodiment, the decision variable is used to decide whether to search a vertex, and its initial value is 0.
S102, determining the non-traversed vertex closest to the vertex at which the unmanned aerial vehicle currently stays as the target vertex of stage t.
In this embodiment, at every stage the unmanned aerial vehicle greedily selects, through a greedy path planning strategy, the nearest vertex that has not yet been traversed as the target vertex of the next search, namely
v_t = argmin { c(v_{t-1}, v) : v in V and v not yet traversed },
where v_{t-1} denotes the vertex at which the unmanned aerial vehicle stays in stage t-1 and c(u, v) denotes the energy consumed in traversing from vertex u to vertex v.
As can be seen from the above formula, the vertex of stage t is chosen as the non-traversed vertex whose movement cost from v_{t-1} is smallest.
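A minimal sketch of this greedy target-vertex selection, built on the illustrative graph structure above (function and variable names are assumptions, not the patent's):

def select_target_vertex(graph, current, traversed):
    # Step S102: among the vertices not yet traversed, pick the one with the
    # smallest movement cost from the vertex where the UAV currently stays.
    candidates = graph.vertices - set(traversed)
    if not candidates:
        return None
    return min(candidates, key=lambda v: graph.cost(current, v))

For the illustrative instance above, select_target_vertex(graph, graph.start, {graph.start}) returns vertex 2, the cheapest unvisited neighbour of the starting point.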
S103, judging, according to the remaining energy and the movement cost of the unmanned aerial vehicle, whether the unmanned aerial vehicle can travel to the target vertex and return from the target vertex to the starting point.
In this embodiment, an online path planning function is defined for each stage, and c(v_{t-1}, v_t) denotes the energy consumed in traversing from v_{t-1} to v_t. The final objective of online path planning is to reduce the energy consumption of the unmanned aerial vehicle on the path in an optimal way under a limited energy budget, thereby maximizing the cumulative search benefit. Here, a_i is the cost for the unmanned aerial vehicle to search the i-th vertex v_i (written in vector form as a), and the movement cost c_{ij} (vector form c) denotes the energy consumed in moving along the edge (v_i, v_j), i.e. the movement cost. b denotes the limited energy budget of the unmanned aerial vehicle, and b is assumed to be insufficient to support searching all vertices of the directed network graph G. The goal of the unmanned aerial vehicle is to plan a travel route that covers all vertices and to decide at which vertices to perform the search operation. The search variable x_i (vector form x) represents the search choice: if the unmanned aerial vehicle chooses to spend the search cost a_i to search vertex v_i, then x_i = 1, otherwise x_i = 0. The corresponding search benefit obtained by searching at vertex v_i is denoted r_i. The path variable y_{ij} (vector form y) represents the path choice: if the decision maker chooses to pass through the edge (v_i, v_j), then y_{ij} = 1, otherwise y_{ij} = 0.
Meanwhile, since the unmanned aerial vehicle needs to return to its base before its energy is exhausted, the following constraint should be considered in the route planning process: the energy required to move to a target vertex plus the energy required to return from that vertex to the starting point must not exceed the current remaining energy.
As shown in Fig. 3, the unmanned aerial vehicle is currently at vertex v_{t-1}. According to the above constraint, moving to the target vertex v_t is allowed only when the total energy consumption of moving from v_{t-1} to v_t and from v_t back to the starting point is smaller than the current remaining energy.
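The admissibility condition of step S103 (enough remaining energy both to reach the target vertex and to return from it to the starting point) can be sketched as follows, again using the illustrative names assumed above:

def move_is_feasible(graph, current, target, remaining_energy):
    # Energy to reach the candidate target plus energy to return from it to the
    # starting point must not exceed the UAV's remaining energy.
    return graph.cost(current, target) + graph.cost(target, graph.start) <= remaining_energy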
S104, when the conditions are met, controlling the unmanned aerial vehicle to travel to the target vertex, and acquiring the search cost and the search benefit of the target vertex online.
In this embodiment, if the above conditions are satisfied, the unmanned aerial vehicle travels to the target vertex, where the search cost and the search benefit of the target vertex can be obtained online.
S105, determining the search variable of the target vertex according to the search cost, the search benefit and the decision variable;
s106, determining whether to search the target vertex according to the search variable.
As described above, the final objective of this embodiment is to reduce the energy consumption of the unmanned aerial vehicle on the path in an optimal way under a limited energy budget, thereby maximizing the cumulative search benefit. To this end, the electronic device makes an irrevocable online decision at each stage as to which vertex to visit and whether to search that vertex. Once the greedy path planning strategy has been obtained, it is combined with the online unmanned aerial vehicle search planning process.
Likewise, the goal is to make the selected vertices maximize the cumulative search benefit, i.e. the sum of r_i x_i over the searched vertices. When all online data are assumed to be known in advance, the integrated offline unmanned aerial vehicle search planning problem (InOffSP) can be expressed as the problem P-InOffSP of maximizing this cumulative search benefit subject to the energy budget, where the search and path variables are determined by the greedy path planning described above and c(v_{t-1}, v_t) is the traversal energy from v_{t-1} to v_t.
P-InOffSP is an integer linear program. Since a, r and c only become known in an online manner, the optimal solution of problem InOffSP cannot be obtained directly. To obtain a near-optimal online policy for OnUAVSP, the integrality constraint on x in P-InOffSP can first be relaxed, and the relaxed problem can then be dualized, yielding the dual problem DLP-InOffSP by introducing the dual decision variables, denoted φ and μ.
Let x̄ denote the optimal solution of the linear relaxation of InOffSP, and let φ̄ and μ̄ denote the optimal solution of DLP-InOffSP. According to the complementary slackness conditions, a relation between these primal and dual optimal solutions can be obtained; when the corresponding condition holds with equality, the decision variable x̄ may be non-integer.
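For readability, the relaxed primal problem and its dual can be written out as below. This is a reconstruction under the assumption of a single knapsack-type energy budget with the path variables y fixed by the greedy strategy; the patent's exact formulation is not reproduced in the available text.

\begin{aligned}
\text{(relaxed P-InOffSP)}\quad & \max_{x}\ \sum_{i} r_i x_i
\quad \text{s.t.}\quad \sum_{i} a_i x_i \le b - \sum_{(i,j)} c_{ij}\,y_{ij},\quad 0 \le x_i \le 1,\\
\text{(DLP-InOffSP)}\quad & \min_{\varphi,\mu}\ \Big(b - \sum_{(i,j)} c_{ij}\,y_{ij}\Big)\varphi + \sum_{i}\mu_i
\quad \text{s.t.}\quad a_i\varphi + \mu_i \ge r_i,\quad \varphi \ge 0,\ \mu_i \ge 0.
\end{aligned}

Under this formulation, complementary slackness forces x_i = 1 whenever r_i > a_i·φ and x_i = 0 whenever r_i < a_i·φ, leaving a possibly fractional x_i only when r_i = a_i·φ, which is consistent with the observation above that the relaxed decision variable may be non-integer.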
In summary, based on the above analysis, the present embodiment obtains the expression of the search variable of the target vertex of stage t from its search cost and search benefit, which are acquired online, and from the current decision variable.
According to the search variable, it can then be determined whether the target vertex is searched: specifically, when the search variable equals 1, the search of the target vertex is performed, and when it equals 0, the search of the target vertex is not performed.
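A sketch of the resulting online search decision at a target vertex, once its cost and benefit have been revealed. The threshold form (benefit greater than cost times decision variable) is an assumption motivated by the complementary slackness discussion above and by standard primal-dual online algorithms; the patent's own expression is not reproduced in the available text.

def decide_search(search_benefit, search_cost, phi):
    # ASSUMED threshold rule: search the vertex only if its revealed benefit
    # exceeds its revealed cost weighted by the current decision (dual) variable phi.
    return 1 if search_benefit > search_cost * phi else 0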
In this embodiment, after the search variable of the target vertex has been obtained, the decision variable of the next stage needs to be updated in order to make the search planning decision at the vertex of the next stage. The updated decision variable of stage t+1 is expressed in terms of the search variable, the search benefit, the energy budget b of the unmanned aerial vehicle, the movement cost from the stop position of stage t to the stop position of stage t+1, and the learning rate.
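Since the patent's exact update expression is not reproduced in the available text, the sketch below uses a generic multiplicative-weights style update driven by the quantities the patent lists (search variable, search benefit, movement cost, energy budget b and learning rate); it is an assumption-labelled placeholder, not the claimed formula. The choice of learning rate (for example 1/n, with n the number of vertices) is likewise only an assumption consistent with the statement that the learning rate is determined by n.

def update_decision_variable(phi, x, search_benefit, move_cost, total_energy, learning_rate):
    # ASSUMED update (not the patent's formula): the decision (dual) variable grows with
    # the share of the energy budget spent on the move, plus a term tied to the benefit
    # actually collected when the search was performed (x = 1).
    growth = 1.0 + learning_rate * move_cost / total_energy
    bonus = learning_rate * x * search_benefit / total_energy
    return phi * growth + bonus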
In summary, in this embodiment, in order to cope with the challenges of an unknown search environment, path planning and search decisions are made in an integrated online manner: the vertex to which the unmanned aerial vehicle travels, together with the search cost and search benefit of that vertex, is determined online, and a decision on whether to search the vertex is then made using the idea of linear programming, so that the energy consumption of the unmanned aerial vehicle on the path is reduced in an optimal manner under the limited total energy of the unmanned aerial vehicle and the accumulated search benefit is as high as possible.
In addition, this embodiment can be applied very conveniently to multi-round scenarios.
As described above for step S104, when the condition is not satisfied, the unmanned aerial vehicle is controlled to return to the starting point, and the search planning of one round is regarded as finished. As shown in Fig. 4, in each round the unmanned aerial vehicle obtains a search benefit and, in an online manner, the set of vertices traversed in that round. By deleting these traversed vertices from the full vertex set V, the directed network graph is updated in the multi-round scenario, and search planning over multiple rounds can be realized.
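A sketch of this multi-round procedure: after each round the traversed vertices are removed from the graph and planning restarts from the base. The helper run_single_round is assumed (for illustration only) to execute one round of the online method (steps S101 to S106) and to return the benefit collected and the set of vertices traversed in that round.

def multi_round_planning(graph, energy_per_round, num_rounds, run_single_round):
    total_benefit = 0.0
    for _ in range(num_rounds):
        benefit, traversed = run_single_round(graph, energy_per_round)
        total_benefit += benefit
        # Update the directed network graph: delete the traversed vertices,
        # keeping the starting point for the next departure.
        graph.vertices -= (set(traversed) - {graph.start})
        if graph.vertices <= {graph.start}:
            break
    return total_benefit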
To evaluate the performance of a planning method on MOUAVSP, the sum of the search benefits of all rounds is an intuitive evaluation criterion. However, such a summation criterion does not reflect the process characteristics of information acquisition in a multi-round scenario. For example, two planning methods may yield the same sum of benefits, yet mission planners prefer the method that collects more information early, especially in emergency situations such as search-and-rescue operations. Therefore, this embodiment designs another criterion, based on the sequence of search results planned over the multiple rounds, to evaluate the ability to collect information early.
Definition 1 (preempt advantage): one plan is said to have a preempt advantage over another plan if the cumulative search benefit it has collected after each round is no smaller than that of the other plan, and the associated progress ratio measures the extent of this first-come dominance of one plan over the other.
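One way to make this criterion precise, under the assumption that it compares cumulative benefits round by round (the patent's exact condition is not reproduced in the available text), is:

\text{plan } \pi_1 \text{ preempt-dominates } \pi_2 \text{ at progress ratio } \theta
\;\Longleftrightarrow\;
\frac{1}{K}\left|\left\{\, k \le K : \sum_{j=1}^{k} R_j^{\pi_1} \ge \sum_{j=1}^{k} R_j^{\pi_2} \,\right\}\right| \ge \theta,

where R_j^π denotes the search benefit obtained by plan π in round j and K is the number of rounds; a progress ratio of 0.6, as reported in the multi-round experiment below, would then mean that one plan is ahead in cumulative benefit for at least 60% of the rounds.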
To facilitate understanding of the present invention, the feasibility of the embodiment is verified below through experimental results and discussion. The performance of single-round and multi-round search planning with the method of this embodiment is evaluated on instances from the well-known TSPLIB data set. The proposed programs and algorithms were tested with the Gurobi 9.0.1 solver on a Windows 10 (64-bit) computer equipped with an Intel Core i7 CPU and 16.0 GB of RAM.
Planning for single round searches
In this experiment, this embodiment compares the performance of different methods of solving the problem OnUAVSP. First, the common situation in which the unmanned aerial vehicle has a limited energy budget and cannot traverse all vertices is analysed. As shown in Table 1, under the same energy budget, the online joint planning method adopted in this embodiment and the offline two-stage method were tested on the same TSPLIB cases. In the offline two-stage method, a route plan over all vertices is first established, then vertices are selected using the remaining energy of the unmanned aerial vehicle, and the search is performed in a fully known manner. In this setting the offline two-stage method is more likely to converge to a local optimum: although the information about the search costs is known, energy is wasted on the path, whereas the online joint planning method is affected by the unknown environment. Nevertheless, as shown in Table 1, in the cases "att48" and "pr1002" the method adopted in this embodiment is still superior to the offline two-stage method and obtains a higher search benefit. Meanwhile, this embodiment obtained a high competitive ratio in all cases (the results range from 88.99% to 94.52%).
TABLE 1. Comparison of the performance of the two methods
Second, the time efficiency of the two methods was tested, recording the time consumption of the online and offline policies of the different methods. Note that the time consumption of the offline two-stage method consists of two parts, namely offline route planning (in this embodiment the TSP problem is solved by the ACO algorithm) and offline search planning (solving the integer linear program). As shown in Table 1, this embodiment clearly has a large advantage in time efficiency over the offline two-stage method, with 97.79 seconds versus 63,702.50 CPU seconds reported in case "pr1002" (a difference of about 3 orders of magnitude). These results indicate that the present embodiment is superior in time efficiency to the offline two-stage method when the energy budget of the unmanned aerial vehicle is limited. Finally, this embodiment analyses the impact of the energy budget on the performance of both methods. In the "ch130" case shown in Fig. 6, the cumulative search benefits of the two methods are recorded as the energy budget b varies from 25 to 350. When the energy budget is relatively small (below 125), the search payoff of this embodiment (the online joint planning approach) is significantly greater than that of the offline two-stage approach. To illustrate this phenomenon intuitively in Fig. 5(a) and Fig. 5(b), two specific unmanned aerial vehicle search planning schemes are selected at random for the situation in which the energy budget b supports only a partial vertex search task and the unmanned aerial vehicle needs to return to the base vertex for battery replacement or charging. Fig. 5(a) and Fig. 5(b) show the results for the "ch130" case; the pentagram symbol represents the departure base of the unmanned aerial vehicle and the squares represent the vertices determined to be searched. The lines with arrows represent the flight paths of the unmanned aerial vehicle under the different strategies. Under the same energy budget b, the solution of the offline two-stage approach first satisfies the energy consumption of traversing all vertices and then allocates the remaining energy budget to valuable vertices to complete the search task (see Fig. 5(a)), reporting a search payoff of 94.45. In contrast, this embodiment allocates the total energy budget to the path planning and search tasks in an integrated online manner, reporting a search payoff of 141.80. Owing to the advantage of integrated allocation, more of the energy budget is devoted to completing the search task and less energy is wasted on unproductive routing.
Planning for multiple rounds of searches
This experiment focuses on the performance of the different methods in a multi-round online search scenario. For the "ch130" case with a fixed energy budget per round, the sum of the benefits over the multiple rounds of searching and the first-order (preempt) dominance criterion are discussed. Because of the energy limitation of each round of travel and the relatively large number of points to be searched, the number of rounds is set to 10. As shown in Fig. 7, the search payoffs of the 10 rounds of trips sum to 968.98 for this embodiment (the online joint planning approach) and to 866.16 for the offline two-stage approach. Furthermore, the two methods show different progress over the multi-round task: this embodiment tends to achieve higher information-collection benefits in the early rounds of the mission, while the benefits under the offline two-stage approach fluctuate randomly. From a quantitative point of view, the plan derived by this embodiment preempt-dominates the plan of the offline two-stage method at a progress ratio of 0.6, which suggests that this embodiment is highly advantageous in collecting information early.
Referring to fig. 8, a second embodiment of the present invention provides an unmanned aerial vehicle search planning apparatus for online decision, which includes:
a state obtaining unit 210, configured to obtain that the unmanned aerial vehicle is in the vehicle according to the directed network map of the current turnt-The method comprises the steps of 1, stay vertexes, residual energy and decision variables, wherein the directed network graph is generated according to a search environment and comprises a group of vertex sets and a group of edge sets connecting any two vertexes, the vertex sets comprise starting points, and the length of each edge in the edge sets is known;
a target vertex decision unit 220 for determining, as the vertex closest to the currently stopped vertex and not traversedtA target vertex of the stage;
a movement judging unit 230 for judging whether the unmanned aerial vehicle meets the requirement of going to the target vertex and returning to the starting point from the target vertex according to the residual energy and the movement cost of the unmanned aerial vehicle;
an online obtaining unit 240, configured to control the unmanned aerial vehicle to go to the target vertex if the target vertex is satisfied, and obtain the search cost and the search benefit of the target vertex online;
a search variable calculation unit 250 for determining a search variable of the target vertex according to the search cost, the search benefit pair, and the decision variable;
the search judging unit 260 determines whether to search the target vertex according to the search variable.
The third embodiment of the present invention also provides an electronic device, including:
at least one processor; the method comprises the steps of,
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the unmanned aerial vehicle search planning method of any of the previous embodiments.
The fourth embodiment of the present invention also provides a non-transitory computer readable storage medium storing computer instructions for causing a computer to execute the unmanned aerial vehicle search planning method for online decision making according to any one of the preceding embodiments.
A fifth embodiment of the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the unmanned aerial vehicle search planning method of any of the preceding embodiments.
The foregoing is merely specific embodiments of the disclosure, but the protection scope of the disclosure is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the disclosure are intended to be covered by the protection scope of the disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (7)

1. An unmanned aerial vehicle search planning method for online decision making, characterized by comprising:
acquiring, according to the directed network graph of the current round, the vertex at which the unmanned aerial vehicle stayed in stage t-1, the remaining energy and the decision variable, wherein the directed network graph is generated according to the unknown environment to be searched and comprises a vertex set and an edge set connecting any two vertices, the vertex set includes a starting point, and the length of each edge in the edge set is known;
determining the non-traversed vertex closest to the vertex at which the unmanned aerial vehicle currently stays as the target vertex of stage t;
judging, according to the remaining energy and the movement cost of the unmanned aerial vehicle, whether the unmanned aerial vehicle satisfies the two conditions of travelling to the target vertex and returning from the target vertex to the starting point;
when the conditions are met, controlling the unmanned aerial vehicle to travel to the target vertex, and acquiring the search cost and the search benefit of the target vertex online;
determining the search variable of the target vertex according to the search cost, the search benefit and the decision variable of stage t;
determining whether to search the target vertex according to the search variable;
wherein the search variable of stage t is calculated from the search benefit of the target vertex, the search cost of the target vertex and the decision variable; when the search variable equals 1, the unmanned aerial vehicle performs the search of the target vertex, and when the search variable equals 0, the unmanned aerial vehicle does not perform the search of the target vertex;
after determining whether to search the target vertex according to the search variable, the method further comprises:
updating the decision variable of stage t+1 according to the search variable, a preset learning rate, the search benefit, the total energy of the unmanned aerial vehicle and the movement cost;
wherein the updated decision variable of stage t+1 is expressed in terms of the total energy b of the unmanned aerial vehicle, the movement cost from the stop position of stage t to the stop position of stage t+1, and the learning rate.
2. The unmanned aerial vehicle search planning method of on-line decision according to claim 1, wherein the decision variable is initially 0.
3. The unmanned aerial vehicle search planning method of on-line decision making according to claim 1, further comprising:
and when the condition is not met, controlling the unmanned aerial vehicle to return to the starting point, and updating the directed network graph according to the vertices traversed in the current round.
4. The unmanned aerial vehicle search planning method for online decision making according to claim 3, wherein the learning rate is determined by n, the number of vertices in the vertex set.
5. An unmanned aerial vehicle search planning apparatus for online decision making, characterized by comprising:
a state acquisition unit, configured to acquire, according to the directed network graph of the current round, the vertex at which the unmanned aerial vehicle stayed in stage t-1, the remaining energy and the decision variable, wherein the directed network graph is generated according to the search environment and comprises a vertex set and an edge set connecting any two vertices, the vertex set includes a starting point, and the length of each edge in the edge set is known;
a target vertex decision unit, configured to determine the non-traversed vertex closest to the vertex at which the unmanned aerial vehicle currently stays as the target vertex of stage t;
wherein the search variable of stage t is calculated from the search benefit of the target vertex, the search cost of the target vertex and the decision variable; when the search variable equals 1, the unmanned aerial vehicle performs the search of the target vertex, and when the search variable equals 0, the unmanned aerial vehicle does not perform the search of the target vertex;
after determining whether to search the target vertex according to the search variable, the decision variable of stage t+1 is updated according to the search variable, a preset learning rate, the search benefit, the total energy of the unmanned aerial vehicle and the movement cost;
the updated decision variable of stage t+1 is expressed in terms of the total energy b of the unmanned aerial vehicle, the movement cost from the stop position of stage t to the stop position of stage t+1, and the learning rate;
a movement judging unit, configured to judge, according to the remaining energy and the movement cost of the unmanned aerial vehicle, whether the unmanned aerial vehicle can travel to the target vertex and return from the target vertex to the starting point;
an online acquisition unit, configured to control the unmanned aerial vehicle to travel to the target vertex if the conditions are satisfied, and to acquire the search cost and the search benefit of the target vertex online;
a search variable calculation unit, configured to determine the search variable of the target vertex according to the search cost, the search benefit and the decision variable;
and a search judging unit, configured to determine whether to search the target vertex according to the search variable.
6. An electronic device, comprising:
at least one processor; the method comprises the steps of,
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the unmanned aerial vehicle search planning method of any of claims 1 to 4.
7. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the unmanned aerial vehicle search planning method of on-line decision making of any of claims 1 to 4.
CN202310430208.6A 2023-04-21 2023-04-21 Unmanned aerial vehicle search planning method and device for online decision, electronic equipment and medium Active CN116149375B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310430208.6A CN116149375B (en) 2023-04-21 2023-04-21 Unmanned aerial vehicle search planning method and device for online decision, electronic equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310430208.6A CN116149375B (en) 2023-04-21 2023-04-21 Unmanned aerial vehicle search planning method and device for online decision, electronic equipment and medium

Publications (2)

Publication Number Publication Date
CN116149375A CN116149375A (en) 2023-05-23
CN116149375B 2023-07-07

Family

ID=86374012

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310430208.6A Active CN116149375B (en) 2023-04-21 2023-04-21 Unmanned aerial vehicle search planning method and device for online decision, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN116149375B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110836675A (en) * 2019-10-25 2020-02-25 北京航空航天大学 Decision tree-based automatic driving search decision method
CN115237119A (en) * 2022-06-06 2022-10-25 苏州瑞志行远智能科技有限公司 AGV collaborative transfer target distribution and decision algorithm

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10768623B2 (en) * 2017-12-21 2020-09-08 Intel Corporation Drone path planning
US11537906B2 (en) * 2018-07-12 2022-12-27 The Regents Of The University Of California Machine learning based target localization for autonomous unmanned vehicles
JP7139896B2 (en) * 2018-11-07 2022-09-21 トヨタ自動車株式会社 Route information determination device, route information system, terminal, and method for determining route information
CN112162569B (en) * 2020-09-09 2022-02-18 北京航空航天大学 Method for planning and deciding path of aircraft around multiple no-fly zones
CN114840029A (en) * 2022-04-12 2022-08-02 北京机电工程研究所 Jumping grid searching method considering unmanned aerial vehicle maneuverability constraint
CN115185297A (en) * 2022-08-04 2022-10-14 西安电子科技大学 Unmanned aerial vehicle cluster distributed cooperative target searching method
CN115185303B (en) * 2022-09-14 2023-03-24 南开大学 Unmanned aerial vehicle patrol path planning method for national parks and natural protected areas

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110836675A (en) * 2019-10-25 2020-02-25 北京航空航天大学 Decision tree-based automatic driving search decision method
CN115237119A (en) * 2022-06-06 2022-10-25 苏州瑞志行远智能科技有限公司 AGV collaborative transfer target distribution and decision algorithm

Also Published As

Publication number Publication date
CN116149375A (en) 2023-05-23


Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant