CN115562330A - Unmanned aerial vehicle control method for restraining wind disturbance of similar field - Google Patents

Unmanned aerial vehicle control method for restraining wind disturbance of similar field Download PDF

Info

Publication number
CN115562330A
CN115562330A CN202211381428.6A CN202211381428A CN115562330A CN 115562330 A CN115562330 A CN 115562330A CN 202211381428 A CN202211381428 A CN 202211381428A CN 115562330 A CN115562330 A CN 115562330A
Authority
CN
China
Prior art keywords
network
unmanned aerial
aerial vehicle
wind
disturbance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211381428.6A
Other languages
Chinese (zh)
Other versions
CN115562330B (en
Inventor
李湛
宋罘林
于兴虎
郑晓龙
高会军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology
Original Assignee
Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology filed Critical Harbin Institute of Technology
Priority to CN202211381428.6A priority Critical patent/CN115562330B/en
Publication of CN115562330A publication Critical patent/CN115562330A/en
Application granted granted Critical
Publication of CN115562330B publication Critical patent/CN115562330B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05DSYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/08Control of attitude, i.e. control of roll, pitch, or yaw
    • G05D1/0808Control of attitude, i.e. control of roll, pitch, or yaw specially adapted for aircraft
    • G05D1/0816Control of attitude, i.e. control of roll, pitch, or yaw specially adapted for aircraft to ensure stability
    • G05D1/0825Control of attitude, i.e. control of roll, pitch, or yaw specially adapted for aircraft to ensure stability using mathematical models
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05DSYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/10Simultaneous control of position or course in three dimensions
    • G05D1/101Simultaneous control of position or course in three dimensions specially adapted for aircraft
    • G05D1/106Change initiated in response to external conditions, e.g. avoidance of elevated terrain or of no-fly zones
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02EREDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
    • Y02E10/00Energy generation through renewable energy sources
    • Y02E10/70Wind energy
    • Y02E10/72Wind turbines with rotation axis in wind direction

Landscapes

  • Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Mathematical Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Mathematical Optimization (AREA)
  • Algebra (AREA)
  • Control Of Position, Course, Altitude, Or Attitude Of Moving Bodies (AREA)
  • Feedback Control In General (AREA)

Abstract

An unmanned aerial vehicle control method for restraining wind disturbance of a similar field belongs to the technical field of unmanned aerial vehicle disturbance rejection. The invention comprises the following steps: s1, acquiring a wind source image through a camera carried on an autonomous unmanned aerial vehicle, tracking a target interference source in the image by using a tracking network to acquire the position of the target interference source, and acquiring an action compensation amount according to the position of the target interference source by using a compensation network; the tracking network comprises a feature extractor and a convolutional layer; the wind source image tracking network obtains a disturbance source characteristic diagram; the compensation network is realized by adopting a deep reinforcement learning algorithm; s2, adding the control compensation quantity and the control quantity output by the unmanned aerial vehicle controller to serve as the input of the controlled autonomous unmanned aerial vehicle; s3, updating network parameters of the wind disturbance compensation network of the similar field; s4, repeating S1 to S3 until the unmanned aerial vehicle flies away from the wind field area, and solving the problem that the autonomous unmanned aerial vehicle is easy to crash when flying under the urban crowded environment and is disturbed by wind in an artificial field.

Description

Unmanned aerial vehicle control method for restraining wind disturbance of similar field
Technical Field
The invention relates to an unmanned aerial vehicle control method for inhibiting wind disturbance of a similar field, and belongs to the technical field of unmanned aerial vehicle disturbance rejection.
Background
In recent years, in view of the rapid development of urban logistics, unmanned planes capable of rapidly shuttling through buildings crowded in modern cities are important research and development targets for many units. In the face of narrow passable space between buildings, most researchers focus on obstacle avoidance and path planning of unmanned aerial vehicles. However, the crowded urban space and the large number of obstacles are not the only difficulties that urban autonomous drones need to face. In modern cities, artificial wind sources such as ventilation fans and air conditioner outdoor units are widely available. The wind disturbance generated by the artificial wind source is greatly different from natural wind disturbance, has obvious directivity and has strong correlation with the space position relative to the wind source, and the wind disturbance generated by the artificial wind source is similar to field wind disturbance. Will receive the wind field interference of impact nature when autonomic unmanned aerial vehicle passes through their air outlet, this will produce devastating strike to unmanned aerial vehicle between crowded building, has very big probability to lead to unmanned aerial vehicle to bump into upstairs or other obstacles and crash.
According to the existing research, most researchers studying the immunity algorithm of drones use feedback compensation methods that enable drones to suppress completely unknown disturbances. For example, ADRC and LADRC algorithms are widely used in drone anti-jamming control. In addition, the adaptive neural network backstepping controller and various instruction filters are also used for estimating unknown disturbance on line, and become a common method in the field of unmanned aerial vehicle disturbance rejection control. However, most of these methods are directed to completely unknown disturbances, so that only feedback compensation methods can be used.
Model-free reinforcement learning provides a new idea for approaching unknown models by means of interaction with the environment. Reinforcement learning has received increasing attention from researchers in recent years and has been applied to the study of some robots, including the four rotor control problem. The autonomous unmanned aerial vehicle can well inhibit the wind disturbance of the quasi-field by using the model-free reinforcement learning algorithm, but the method has the problem of low data utilization rate, and convergence becomes difficult as the abstraction level of the neural network is deepened.
Disclosure of Invention
Aiming at the problem of how to inhibit the influence of artificial wind interference on the unmanned aerial vehicle, the invention provides an unmanned aerial vehicle control method for inhibiting wind interference of a similar field.
The invention discloses an unmanned aerial vehicle control method for inhibiting wind disturbance of a similar field, which comprises the following steps:
s1, acquiring a wind source image through a camera carried on an autonomous unmanned aerial vehicle, tracking a target interference source in the image by using a tracking network to acquire the position of the target interference source, and acquiring an action compensation amount u' according to the position of the target interference source by using a compensation network;
the tracking network comprises a feature extractor and a convolutional layer;
the wind source images are sequentially input into the feature extractor and the convolution layer to obtain a disturbance source feature map;
the compensation network is realized by adopting a deep reinforcement learning network, the input of the compensation network is a disturbance source characteristic diagram, the output is an action a output by a behavior network, the action a represents the normalization result of the control compensation quantity of the unmanned aerial vehicle in the x, y and z directions, and the normalized action a is mapped into an action compensation quantity u';
s2, adding the control compensation amount u' and the control amount output by the unmanned aerial vehicle controller to obtain u which is used as the input of the controlled autonomous unmanned aerial vehicle;
s3, updating network parameters of the deep reinforcement learning network;
and S4, repeating the steps from S1 to S3 until the unmanned aerial vehicle flies away from the wind field area.
Preferably, in S1, the industrial fan is identified by a camera carried by the autonomous unmanned aerial vehicle, a group of image samples labeled to a boundary frame of the industrial fan constitutes a training set, a feature extractor and a model predictor are trained by the training set, after training is completed, the feature extractor is used as a feature extractor of the tracking network, and a convolutional layer is extracted from the trained model predictor and used as a convolutional layer of the tracking network.
Preferably, the feature extractor is a ResNet-50 backbone network.
Preferably, the compensation network comprises a behavior network, a target behavior network, an evaluation network and a target evaluation network, which are all fully connected layers;
action values output by a behavior network
Figure BDA0003926836510000021
Figure BDA0003926836510000022
Feature diagram of disturbance source at time t, W a Weights representing the behavioral network;
action values output by the target behavior network
Figure BDA0003926836510000023
Figure BDA0003926836510000024
A disturbance source feature map W representing the disturbance source at time t +1 after the action at time t is executed a ' represents the weight of the target behavior network;
evaluating an action State value output by a network
Figure BDA0003926836510000025
W c A weight representing an evaluation network;
action state value output by target evaluation network
Figure BDA0003926836510000026
W c ' denotes the weight of the target evaluation network.
Preferably, in S3, the gradient of the behavior network is:
Figure BDA0003926836510000027
m is the minimum batch of each data sample, and the weight of the behavior network is updated:
Figure BDA0003926836510000031
Figure BDA0003926836510000032
Figure BDA0003926836510000033
Figure BDA0003926836510000034
Figure BDA0003926836510000035
wherein
Figure BDA0003926836510000036
Is an intermediate variable, and the superscript represents time,
Figure BDA0003926836510000037
is a hyperparameter of a Is the learning rate, ξ is the infinitesimal quantity;
Figure BDA0003926836510000038
represents to W a The gradient is calculated, and the gradient is calculated,
Figure BDA0003926836510000039
the gradient of a, J (W) is determined a ) An objective function representing a behavioral network;
the weights of the target behavior network are updated as follows:
W a ′←τW a +(1-τ)W a
τ is the soft update rate.
Preferably, in S3, the gradient of the evaluation network is:
Figure BDA00039268365100000310
wherein ,
Figure BDA00039268365100000311
is shown to W c Finding the gradient, L (W) c ) An objective function representing the evaluation network;
Figure BDA00039268365100000312
r i represents the reward, γ is the decay coefficient, M is the minimum batch per data sample;
updating the weight of the evaluation network:
Figure BDA00039268365100000313
Figure BDA00039268365100000314
Figure BDA00039268365100000315
Figure BDA00039268365100000316
Figure BDA00039268365100000317
wherein
Figure BDA00039268365100000318
Is an intermediate variable, and the superscript represents time,
Figure BDA00039268365100000319
is a hyperparameter of c Is the learning rate, ξ is the infinitesimal quantity;
the weight of the target evaluation network is updated as follows:
W c ′←τW c +(1-τ)W c
τ is the soft update rate.
Preferably, the prize r i Comprises the following steps:
Figure BDA0003926836510000041
e 1 、e 3 、e 5 respectively representing the x, y and z axis errors of the target track point and the current position, and C is a constant.
Preferably, u = [ u ] x ,u y ,u z ] T
Figure BDA0003926836510000042
Figure BDA0003926836510000043
Figure BDA0003926836510000044
Where m is the mass of the drone, target trajectory r d =[x d ,y d ,z d ] T
Figure BDA0003926836510000045
Figure BDA0003926836510000046
k 1 ,k 2 ,k 3 ,k 4 ,k 5 ,k 6 Is a controller parameter;
error variable
Figure BDA0003926836510000047
Figure BDA0003926836510000048
x 1 ,x 3 ,x 5 Is the position component of the x, y and z axes of the unmanned aerial vehicle in the inertial coordinate system, x 2 ,x 4 ,x 6 Respectively representing the speed components of x, y and z axes under an inertial coordinate system of the unmanned aerial vehicle;
g is the acceleration of gravity, F is the resultant force of the four motors, d x ,d y ,d z Is the component of the disturbance on the unmanned plane in three directions; phi, theta, psi is the attitude euler angle of the drone.
The unmanned aerial vehicle tracking system has the beneficial effects that the problem that the unmanned aerial vehicle flying in the urban crowded environment is easy to have the danger of explosion when being disturbed by wind in an artificial field is solved, the position of a target interference source is obtained through a tracking network, action compensation is carried out according to a reinforcement learning compensation network, and the interference of artificial wind on the unmanned aerial vehicle is inhibited. By using the feedforward compensation method, the unmanned aerial vehicle can make a compensation action in advance before the interference comes, and the method has faster response and higher compensation precision compared with the general feedback compensation.
Drawings
FIG. 1 is a schematic diagram of the principles of the present invention;
FIG. 2 is a schematic diagram of an unmanned aerial vehicle suppressing interference of an industrial fan;
FIG. 3 is a schematic diagram of a neural network model;
FIG. 4 is a comparison of the track following effect using the feed forward compensation of the present invention and using PID.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the embodiments and features of the embodiments may be combined with each other without conflict.
The invention is further described with reference to the following drawings and specific examples, which are not intended to be limiting.
According to the unmanned aerial vehicle control method for restraining the wind disturbance of the similar field, the wind source is identified through the camera carried by the autonomous unmanned aerial vehicle, and the target disturbance source is tracked by adopting the tracking network. And inputting the output of the trained tracking network into a wind disturbance compensation network, and adding the compensation action generated by the compensation network and the output of the unmanned aerial vehicle feedback controller to be used as the input of the controlled autonomous unmanned aerial vehicle.
Then, updating network parameters of the deep reinforcement learning network;
and repeating the compensation process and the updating process until the unmanned aerial vehicle flies away from the wind field area. And the normal controller without feed forward compensation is recovered when the wind disturbance area is separated.
The structure of the neural network for realizing tracking in the present embodiment is shown in the left half of fig. 3. The neural network is characterized in that an industrial fan is identified through a camera carried on an autonomous unmanned aerial vehicle, a group of image samples marked on a boundary frame of the industrial fan form a training set, a feature extractor and a model predictor are trained through the training set, the feature extractor uses a ResNet-50 backbone network, then feature mapping is output to the model predictor, after training is completed, the feature extractor serves as a feature extractor of a tracking network, and a convolutional layer is extracted from the trained model predictor and serves as a convolutional layer of the tracking network. And applied to the features extracted from the test frames to compute a target confidence score. And the positioning of disturbance sources is realized by combining an architecture based on an overlap maximization strategy introduced in an ATOM algorithm. The effect of finally tracking the network is that an enclosure frame is generated all the time to mark the position of the industrial fan. And connecting the trained tracking network with a compensation network of the wind disturbance of the similar field. The final output of the target tracking problem is a bounding box, and the field-like wind disturbance compensation problem is control compensation generated according to the position of a tracked disturbance source. Although the outputs of the two tasks are different, the knowledge of processing the two-dimensional image signal is similar. This embodiment will be describedThe weights in the neural network for realizing tracking are divided into two parts, the former part is a convolution layer for extracting the characteristics of disturbance sources, and the latter part is a full connection layer for generating a surrounding frame. For both the tasks of target tracking and disturbance compensation, the characteristics of the target are unchanged, so the convolution layer can be frozen, the dimension of the output layer of the full connection layer is replaced to the dimension of the compensation action, and a new full connection layer is transferred and updated to generate the compensation action. As shown in the right half of fig. 3, W is a neural network weight transmitted from the neural network trained during tracking, and is divided into two parts, where Conv is a convolutional layer for extracting image features, and FC is a fully-connected layer for generating a compensation operation. The compensation network is realized by adopting a deep reinforcement learning algorithm, and in the embodiment, the reinforcement learning state space and the action space are designed to realize the compensation network. The state space is the feature map of the artificial wind source acquired by the convolutional layer. The action space is the control compensation quantity of the unmanned aerial vehicle in the x, y and z directions. W as shown on the right half of FIG. 3 a Is the weight of the behavioral network, W a ' weight of target behavior network, W c Is the weight of the evaluation network, W c ' is the weight of the target evaluation network, a is the action of the action network output, a ' is the action of the target action network output, and Q ' are the action state value function and the target action state value function, respectively. The reinforcement learning state s is represented by a characteristic diagram of convolutional layer output, the action a is a result of normalization of compensation amounts in three directions of x, y, and z, values in each direction are between-1 and 1, and action analysis maps the normalized action to an actual action compensation amount u'.
The specific embodiments of the present invention are described with reference to a practical example:
as shown in fig. 2, the autonomous drone flies under the wind disturbance of the industrial fan. In the figure C i Is a horizontal inertial coordinate system, x, y, z are C i Orthogonal coordinate basis satisfying right hand rule; c b Is a fixed coordinate system of the unmanned plane body, x b ,y b ,z b Is C b And orthogonal coordinate bases meeting the right-hand rule, wherein phi, theta and psi are attitude Euler angles of the unmanned aerial vehicle. The resolution of the onboard camera is 640 x 480. Unmanned aerial vehicle model under disturbanceComprises the following steps:
Figure BDA0003926836510000071
wherein m is the unmanned aerial vehicle mass, g is the gravitational acceleration, F is the resultant force of four motors, d x ,d y ,d z Are the components of the disturbance experienced by the drone in three directions.
The basic controller of the drone uses reverse-step control, and the control block is as shown in fig. 1, with u '= [ u' x ,u′ y ,u′ z ] T Is a compensation amount, u = [ u = x ,u y ,u z ] T Is the final control output, x = [ x ] 1 ,x 2 ,x 3 ,x 4 ,x 5 ,x 6 ] T Is the state of the drone, specifically the position and position derivative, r, of the drone under the inertial system d =[x d ,y d ,z d ] T Is the target trajectory. The following error variables are defined:
Figure BDA0003926836510000072
Figure BDA0003926836510000073
wherein
Figure BDA0003926836510000074
k 1 ,k 2 ,k 3 ,k 4 ,k 5 ,k 6 Is a controller parameter. Using a back-stepping controller, the formula is as follows:
Figure BDA0003926836510000075
Figure BDA0003926836510000076
Figure BDA0003926836510000077
acquiring a wind source image through a camera carried on the autonomous unmanned aerial vehicle, tracking a target interference source in the image by using a tracking network to obtain the position of the target interference source, and obtaining an action compensation amount u' according to the position of the target interference source by using a compensation network;
adding the control compensation amount u' and the control amount output by the unmanned aerial vehicle controller to obtain u which is used as the input of the controlled autonomous unmanned aerial vehicle;
the compensation network in fig. 3 includes a behavior network, a target behavior network, an evaluation network, and a target evaluation network, all of which are fully connected layers;
action values output by a behavior network
Figure BDA0003926836510000078
Figure BDA0003926836510000079
Feature diagram of disturbance source at time t, W a Weights representing the behavioral network;
action values output by the target behavior network
Figure BDA0003926836510000081
Figure BDA0003926836510000082
Represents a disturbance source feature map, W ', at time t +1 after the execution of the operation at time t' a Weights representing the target behavior network;
evaluating an action State value output by a network
Figure BDA0003926836510000083
W c A weight representing an evaluation network;
action state value output by target evaluation network
Figure BDA0003926836510000084
W′ c Representing the weight of the target evaluation network.
The behavioral network gradient is designed as follows:
Figure BDA0003926836510000085
in the formula ,
Figure BDA0003926836510000086
is shown to W a The gradient is calculated and the gradient is calculated,
Figure BDA0003926836510000087
the gradient of a, J (W) is determined a ) An objective function representing a behavioral network;
m is the minimum batch per data sample;
the weight update of the behavioral network is as follows:
Figure BDA0003926836510000088
Figure BDA0003926836510000089
Figure BDA00039268365100000810
Figure BDA00039268365100000811
Figure BDA00039268365100000812
wherein
Figure BDA00039268365100000813
Is an intermediate variable, and the superscript represents time,
Figure BDA00039268365100000814
is a hyperparameter of a Is the learning rate, ξ is the infinitesimal quantity;
evaluation of network gradients the design was as follows:
Figure BDA00039268365100000815
wherein ,
Figure BDA00039268365100000816
represents to W c Finding the gradient, L (W) c ) An objective function representing the evaluation network;
Figure BDA00039268365100000817
r i represents the reward, γ is the decay coefficient, M is the minimum batch per data sample;
Figure BDA00039268365100000818
is the state (new feature map) that results after the action is performed. The weight update of the evaluation network is the same as the behavior network, as follows:
Figure BDA0003926836510000091
wherein
Figure BDA0003926836510000092
Is an intermediate variable, and the superscript represents time,
Figure BDA0003926836510000093
is a hyperparameter of c Is the learning rate, ξ is the infinitesimal quantity;
the weights of the target behavior network and the weights of the target evaluation network are updated as follows:
Figure BDA0003926836510000094
where τ is the soft update rate, the greater τ the faster the update rate.
A reward function is designed in the implementation mode, the unmanned aerial vehicle is not influenced by an industrial fan in the implementation mode, and therefore the smaller the position following error is, the higher the score is. The reward function is designed as a negative correlation function of the trajectory following error:
Figure BDA0003926836510000095
e 1 、e 3 、e 5 respectively representing the x, y and z axis errors of the target locus point and the current position, and C is a constant. The purpose of this is to keep the value of the reward function within a relatively reasonable range.
A simulation scene is set up, the track following effect of feedforward compensation by using a compensation network and without using the compensation network in the scene is shown in figure 4, and it can be clearly seen that the tracking accuracy by using the feedforward compensation is far better by using a PID algorithm.
And (4) conclusion: the example shows that the method can effectively realize feedforward compensation of the similar-field wind disturbance generated by the visual artificial wind source, improve the track tracking precision and greatly increase the flight safety of the autonomous unmanned aerial vehicle among modern urban buildings.
Although the invention herein has been described with reference to particular embodiments, it is to be understood that these embodiments are merely illustrative of the principles and applications of the present invention. It is therefore to be understood that numerous modifications may be made to the illustrative embodiments and that other arrangements may be devised without departing from the spirit and scope of the present invention as defined by the appended claims. It should be understood that features described in different dependent claims and herein may be combined in ways different from those described in the original claims. It is also to be understood that features described in connection with individual embodiments may be used in other described embodiments.

Claims (10)

1. An unmanned aerial vehicle control method for suppressing wind disturbance of a similar field, the method comprising:
s1, acquiring a wind source image through a camera carried on an autonomous unmanned aerial vehicle, tracking a target interference source in the image by using a tracking network to acquire the position of the target interference source, and acquiring an action compensation amount u' according to the position of the target interference source by using a compensation network;
the tracking network comprises a feature extractor and a convolutional layer;
the wind source images are sequentially input into the feature extractor and the convolution layer to obtain a disturbance source feature map;
the compensation network is realized by adopting a deep reinforcement learning network, the input of the compensation network is a disturbance source characteristic diagram, the output of the compensation network is an action a output by a behavior network, the action a represents the normalization result of the control compensation quantity of the unmanned aerial vehicle in the x, y and z directions, and the normalized action a is mapped into an action compensation quantity u';
s2, adding the control compensation amount u' and the control amount output by the unmanned aerial vehicle controller to obtain u which is used as the input of the controlled autonomous unmanned aerial vehicle;
s3, updating network parameters of the deep reinforcement learning network;
and S4, repeating the steps from S1 to S3 until the unmanned aerial vehicle flies away from the wind field area.
2. The method according to claim 1, wherein in S1, the industrial wind turbine is identified by a camera carried by the autonomous unmanned aerial vehicle, a group of image samples labeled to the boundary frame of the industrial wind turbine constitutes a training set, a feature extractor and a model predictor are trained by the training set, after the training is completed, the feature extractor is used as a feature extractor of the tracking network, and a convolutional layer is extracted from the trained model predictor and used as a convolutional layer of the tracking network.
3. The drone controlling method for suppressing wind disturbance like farm according to claim 1, wherein the feature extractor is a ResNet-50 backbone network.
4. The unmanned aerial vehicle control method of suppressing field-like wind disturbances according to claim 1 or 2, wherein the compensation network comprises a behavior network, a target behavior network, an evaluation network and a target evaluation network, which are all fully connected layers;
action values output by a behavior network
Figure FDA0003926836500000011
Figure FDA0003926836500000012
Feature diagram of disturbance source at time t, W a Weights representing the behavioral network;
action values output by the target behavior network
Figure FDA0003926836500000013
Figure FDA0003926836500000014
A disturbance source feature map W representing the disturbance source at time t +1 after the action at time t is executed a ' weight representing target behavior network;
evaluating an action State value output by a network
Figure FDA0003926836500000015
W c A weight representing an evaluation network;
action state value output by target evaluation network
Figure FDA0003926836500000016
W c ' denotes the weight of the target evaluation network.
5. The unmanned aerial vehicle control method for suppressing wind disturbance like the farm according to claim 4, wherein in S3, the gradient of the behavior network is as follows:
Figure FDA0003926836500000021
m is the minimum batch of each data sample, and the weight of the behavior network is updated:
Figure FDA0003926836500000022
Figure FDA0003926836500000023
Figure FDA0003926836500000024
Figure FDA0003926836500000025
Figure FDA0003926836500000026
wherein m,v,
Figure FDA0003926836500000027
is an intermediate variable, and the superscript represents time,
Figure FDA0003926836500000028
is a hyperparameter, eta a Is the learning rate, ξ is the infinitesimal quantity;
Figure FDA0003926836500000029
is shown to W a The gradient is calculated, and the gradient is calculated,
Figure FDA00039268365000000210
the gradient of a, J (W) is determined a ) An objective function representing a behavioral network;
m is the minimum batch per data sample;
the weights of the target behavior network are updated as follows:
W′ a ←τW a +(1-τ)W′ a
τ is the soft update rate.
6. The unmanned aerial vehicle control method for suppressing wind disturbance like the farm according to claim 4, wherein in S3, the gradient of the evaluation network is:
Figure FDA00039268365000000211
wherein ,
Figure FDA00039268365000000212
represents to W c Finding the gradient, L (W) c ) An objective function representing the evaluation network;
Figure FDA00039268365000000213
r i represents the reward, γ is the decay coefficient, M is the minimum batch per data sample;
updating the weight of the evaluation network:
Figure FDA0003926836500000031
Figure FDA0003926836500000032
Figure FDA0003926836500000033
Figure FDA0003926836500000034
Figure FDA0003926836500000035
wherein m,v,
Figure FDA0003926836500000036
is an intermediate variable, and the superscript represents time,
Figure FDA0003926836500000037
is a hyperparameter of c Is the learning rate, ξ is the infinitesimal quantity;
the weight of the target evaluation network is updated as follows:
W′ c ←τW c +(1-τ)W′ c
τ is the soft update rate.
7. The method of controlling a drone for suppressing farm-like wind disturbances according to claim 6, wherein a reward r is i Comprises the following steps:
Figure FDA0003926836500000038
e 1 、e 3 、e 5 respectively representing the x, y and z axis errors of the target locus point and the current position, and C is a constant.
8. The method of claim 1, wherein u = [ u ] u unmanned aerial vehicle control method for suppressing wind disturbances in the quasi-field x ,u y ,u z ] T
Figure FDA0003926836500000039
Figure FDA00039268365000000310
Figure FDA00039268365000000311
Where m is the mass of the drone, target trajectory r d =[x d ,y d ,z d ] T
Figure FDA00039268365000000312
Figure FDA00039268365000000313
k 1 ,k 2 ,k 3 ,k 4 ,k 5 ,k 6 Is a controller parameter;
error variable
Figure FDA00039268365000000314
Figure FDA0003926836500000041
x 1 ,x 3 ,x 5 Is the position component of the x, y and z axes of the unmanned aerial vehicle in the inertial coordinate system, x 2 ,x 4 ,x 6 Respectively representing the speed components of x, y and z axes under an inertial coordinate system of the unmanned aerial vehicle;
g is the acceleration of gravity, F is the resultant force of the four motors, d x ,d y ,d z Components of disturbance of the unmanned aerial vehicle in three directions; phi, theta, psi is the attitude euler angle of the drone.
9. A computer-readable storage device storing a computer program, wherein the computer program when executed implements a drone controlling method for suppressing farm-like wind disturbances according to any one of claims 1 to 8.
10. An unmanned aerial vehicle control apparatus for suppressing farm-like wind disturbance, comprising a storage device, a processor, and a computer program stored in the storage device and operable on the processor, wherein the processor executes the computer program to implement the unmanned aerial vehicle control method for suppressing farm-like wind disturbance according to any one of claims 1 to 8.
CN202211381428.6A 2022-11-04 2022-11-04 Unmanned aerial vehicle control method for inhibiting wind disturbance of quasi-field Active CN115562330B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211381428.6A CN115562330B (en) 2022-11-04 2022-11-04 Unmanned aerial vehicle control method for inhibiting wind disturbance of quasi-field

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211381428.6A CN115562330B (en) 2022-11-04 2022-11-04 Unmanned aerial vehicle control method for inhibiting wind disturbance of quasi-field

Publications (2)

Publication Number Publication Date
CN115562330A true CN115562330A (en) 2023-01-03
CN115562330B CN115562330B (en) 2023-08-22

Family

ID=84768647

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211381428.6A Active CN115562330B (en) 2022-11-04 2022-11-04 Unmanned aerial vehicle control method for inhibiting wind disturbance of quasi-field

Country Status (1)

Country Link
CN (1) CN115562330B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109816695A (en) * 2019-01-31 2019-05-28 中国人民解放军国防科技大学 Target detection and tracking method for infrared small unmanned aerial vehicle under complex background
CN110398720A (en) * 2019-08-21 2019-11-01 深圳耐杰电子技术有限公司 A kind of anti-unmanned plane detection tracking interference system and photoelectric follow-up working method
US20200312163A1 (en) * 2019-03-26 2020-10-01 Sony Corporation Concept for designing and using an uav controller model for controlling an uav
KR20210088142A (en) * 2020-01-06 2021-07-14 세종대학교산학협력단 System for detecting and tracking target of unmanned aerial vehicle
CN114527776A (en) * 2022-01-07 2022-05-24 鹏城实验室 Unmanned aerial vehicle wind disturbance resisting control method and device, terminal and storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109816695A (en) * 2019-01-31 2019-05-28 中国人民解放军国防科技大学 Target detection and tracking method for infrared small unmanned aerial vehicle under complex background
US20200312163A1 (en) * 2019-03-26 2020-10-01 Sony Corporation Concept for designing and using an uav controller model for controlling an uav
CN110398720A (en) * 2019-08-21 2019-11-01 深圳耐杰电子技术有限公司 A kind of anti-unmanned plane detection tracking interference system and photoelectric follow-up working method
KR20210088142A (en) * 2020-01-06 2021-07-14 세종대학교산학협력단 System for detecting and tracking target of unmanned aerial vehicle
CN114527776A (en) * 2022-01-07 2022-05-24 鹏城实验室 Unmanned aerial vehicle wind disturbance resisting control method and device, terminal and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
安航;鲜斌;: "无人直升机的姿态增强学习控制设计与验证", 控制理论与应用, vol. 36, no. 4, pages 516 - 524 *

Also Published As

Publication number Publication date
CN115562330B (en) 2023-08-22

Similar Documents

Publication Publication Date Title
Hong et al. Energy-efficient online path planning of multiple drones using reinforcement learning
CN110806756B (en) Unmanned aerial vehicle autonomous guidance control method based on DDPG
Ergezer et al. Path planning for UAVs for maximum information collection
Bouffard et al. Learning-based model predictive control on a quadrotor: Onboard implementation and experimental results
WO2019076044A1 (en) Mobile robot local motion planning method and apparatus and computer storage medium
He et al. Deep reinforcement learning based local planner for UAV obstacle avoidance using demonstration data
CN113268074B (en) Unmanned aerial vehicle flight path planning method based on joint optimization
CN113848984B (en) Unmanned aerial vehicle cluster control method and system
CN115033022A (en) DDPG unmanned aerial vehicle landing method based on expert experience and oriented to mobile platform
Magree et al. Monocular visual mapping for obstacle avoidance on UAVs
Song et al. Learning perception-aware agile flight in cluttered environments
CN109870906A (en) A kind of high-speed rotor aircraft paths planning method based on BBO optimization Artificial Potential Field
CN111624875A (en) Visual servo control method and device and unmanned equipment
CN109375642B (en) Energy-saving control method for unmanned aerial vehicle
Fu et al. Memory-enhanced deep reinforcement learning for UAV navigation in 3D environment
Silva et al. Landing area recognition by image applied to an autonomous control landing of VTOL aircraft
Sandström et al. Fighter pilot behavior cloning
CN117215197B (en) Four-rotor aircraft online track planning method, four-rotor aircraft online track planning system, electronic equipment and medium
Orsag et al. State estimation, robust control and obstacle avoidance for multicopter in cluttered environments: Euroc experience and results
CN115562330B (en) Unmanned aerial vehicle control method for inhibiting wind disturbance of quasi-field
CN112161626B (en) High-flyability route planning method based on route tracking mapping network
CN114609925B (en) Training method of underwater exploration strategy model and underwater exploration method of bionic machine fish
CN116009583A (en) Pure vision-based distributed unmanned aerial vehicle cooperative motion control method and device
CN117130383B (en) Unmanned aerial vehicle vision tracking method and system, unmanned aerial vehicle and readable storage medium
Yin et al. Online joint control approach to formation flying simulation

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant