CN115562330A - Unmanned aerial vehicle control method for restraining wind disturbance of similar field - Google Patents
- Publication number
- CN115562330A CN115562330A CN202211381428.6A CN202211381428A CN115562330A CN 115562330 A CN115562330 A CN 115562330A CN 202211381428 A CN202211381428 A CN 202211381428A CN 115562330 A CN115562330 A CN 115562330A
- Authority
- CN
- China
- Prior art keywords
- network
- unmanned aerial
- aerial vehicle
- wind
- disturbance
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05D—SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
- G05D1/00—Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
- G05D1/08—Control of attitude, i.e. control of roll, pitch, or yaw
- G05D1/0808—Control of attitude, i.e. control of roll, pitch, or yaw specially adapted for aircraft
- G05D1/0816—Control of attitude, i.e. control of roll, pitch, or yaw specially adapted for aircraft to ensure stability
- G05D1/0825—Control of attitude, i.e. control of roll, pitch, or yaw specially adapted for aircraft to ensure stability using mathematical models
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05D—SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
- G05D1/00—Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
- G05D1/10—Simultaneous control of position or course in three dimensions
- G05D1/101—Simultaneous control of position or course in three dimensions specially adapted for aircraft
- G05D1/106—Change initiated in response to external conditions, e.g. avoidance of elevated terrain or of no-fly zones
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02E—REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
- Y02E10/00—Energy generation through renewable energy sources
- Y02E10/70—Wind energy
- Y02E10/72—Wind turbines with rotation axis in wind direction
Abstract
An unmanned aerial vehicle control method for suppressing field-like wind disturbance, belonging to the technical field of unmanned aerial vehicle disturbance rejection. The invention comprises the following steps. S1: acquire a wind-source image through a camera carried on an autonomous unmanned aerial vehicle, track the target disturbance source in the image with a tracking network to obtain its position, and obtain an action compensation amount from that position with a compensation network; the tracking network comprises a feature extractor and a convolutional layer, which together produce a disturbance-source feature map from the wind-source image; the compensation network is realized with a deep reinforcement learning algorithm. S2: add the control compensation amount to the control amount output by the unmanned aerial vehicle controller, and use the sum as the input of the controlled autonomous unmanned aerial vehicle. S3: update the network parameters of the field-like wind-disturbance compensation network. S4: repeat S1 to S3 until the unmanned aerial vehicle flies out of the wind-field area. The method solves the problem that an autonomous unmanned aerial vehicle flying in a crowded urban environment can easily crash when disturbed by wind from an artificial wind source.
Description
Technical Field
The invention relates to an unmanned aerial vehicle control method for suppressing field-like wind disturbance, and belongs to the technical field of unmanned aerial vehicle disturbance rejection.
Background
In recent years, with the rapid development of urban logistics, unmanned aerial vehicles able to shuttle quickly through the crowded buildings of modern cities have become important research and development targets for many organizations. Facing the narrow passable space between buildings, most researchers focus on obstacle avoidance and path planning for drones. However, crowded urban space and numerous obstacles are not the only difficulties urban autonomous drones must face. Artificial wind sources, such as ventilation fans and air-conditioner outdoor units, are widespread in modern cities. The wind disturbance they generate differs greatly from natural wind disturbance: it has obvious directivity and is strongly correlated with the spatial position relative to the wind source, so it resembles a field, hence "field-like" wind disturbance. When an autonomous drone passes through such an air outlet, it receives an impact-like wind-field disturbance; between crowded buildings this can be devastating and is very likely to cause the drone to collide with a building or another obstacle and crash.
According to existing research, most researchers studying disturbance-rejection algorithms for drones use feedback compensation methods, which enable drones to suppress completely unknown disturbances. For example, the ADRC and LADRC algorithms are widely used in drone anti-disturbance control. In addition, adaptive neural-network backstepping controllers and various command filters are used to estimate unknown disturbances online, and these have become common methods in the field of drone disturbance-rejection control. However, because most of these methods target completely unknown disturbances, only feedback compensation can be used.
Model-free reinforcement learning offers a new way to approximate unknown models through interaction with the environment. It has received increasing attention from researchers in recent years and has been applied to several robotics problems, including quadrotor control. An autonomous drone could suppress field-like wind disturbance well with a model-free reinforcement learning algorithm, but this approach suffers from low data utilization, and convergence becomes difficult as the abstraction level of the neural network deepens.
Disclosure of Invention
Aiming at the problem of suppressing the influence of artificial wind disturbance on unmanned aerial vehicles, the invention provides an unmanned aerial vehicle control method for suppressing field-like wind disturbance.
The unmanned aerial vehicle control method for suppressing field-like wind disturbance disclosed by the invention comprises the following steps:
s1, acquiring a wind source image through a camera carried on an autonomous unmanned aerial vehicle, tracking a target interference source in the image by using a tracking network to acquire the position of the target interference source, and acquiring an action compensation amount u' according to the position of the target interference source by using a compensation network;
the tracking network comprises a feature extractor and a convolutional layer;
the wind source images are sequentially input into the feature extractor and the convolution layer to obtain a disturbance source feature map;
the compensation network is realized with a deep reinforcement learning network; its input is the disturbance-source feature map and its output is the action a produced by a behavior network, where a represents the normalized control compensation of the unmanned aerial vehicle in the x, y and z directions, and the normalized action a is mapped to the action compensation amount u';
s2, adding the control compensation amount u' and the control amount output by the unmanned aerial vehicle controller to obtain u which is used as the input of the controlled autonomous unmanned aerial vehicle;
s3, updating network parameters of the deep reinforcement learning network;
and S4, repeating the steps from S1 to S3 until the unmanned aerial vehicle flies away from the wind field area.
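A minimal sketch of this S1-S4 loop is given below. All function names and the placeholder computations (track_wind_source, compensation_action, map_action, backstepping_control, u_max) are illustrative assumptions, not from the patent; in the real system the first two would be the trained tracking and compensation networks.

```python
import numpy as np

def track_wind_source(image):
    """Stand-in for the tracking network: image -> disturbance-source feature map (S1)."""
    return np.tanh(image.mean(axis=0))              # placeholder feature map

def compensation_action(feature_map):
    """Stand-in for the behavior network: feature map -> normalized action a in [-1, 1]^3."""
    return np.tanh(feature_map[:3])                 # placeholder policy

def map_action(a, u_max=2.0):
    """Map the normalized action a to the physical compensation amount u'."""
    return u_max * a

def backstepping_control(state, target):
    """Stand-in for the base feedback controller's output."""
    return target - state[:3]                       # placeholder feedback law

def control_step(image, state, target):
    """One S1-S2 iteration: u = controller output + compensation u'."""
    s = track_wind_source(image)                    # S1: feature map
    u_prime = map_action(compensation_action(s))    # S1: compensation amount u'
    u = backstepping_control(state, target) + u_prime   # S2: summed control input
    return u
```

In operation this step would run every control cycle, with S3 (network updates) interleaved, until the drone leaves the wind field (S4).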
Preferably, in S1, an industrial fan is identified by the camera carried on the autonomous unmanned aerial vehicle; a group of image samples labeled with bounding boxes of the industrial fan constitutes a training set; a feature extractor and a model predictor are trained with this training set; after training is completed, the feature extractor is used as the feature extractor of the tracking network, and a convolutional layer is extracted from the trained model predictor and used as the convolutional layer of the tracking network.
Preferably, the feature extractor is a ResNet-50 backbone network.
Preferably, the compensation network comprises a behavior network, a target behavior network, an evaluation network and a target evaluation network, which are all fully connected layers;
the action value output by the behavior network is a_t = μ(s_t; W_a), where s_t is the disturbance-source feature map at time t and W_a represents the weights of the behavior network;
the action value output by the target behavior network is a'_{t+1} = μ'(s_{t+1}; W_a'), where s_{t+1} is the disturbance-source feature map at time t+1, after the action at time t is executed, and W_a' represents the weights of the target behavior network;
the action-state value output by the evaluation network is Q_t = Q(s_t, a_t; W_c), where W_c represents the weights of the evaluation network;
the action-state value output by the target evaluation network is Q'_{t+1} = Q'(s_{t+1}, a'_{t+1}; W_c'), where W_c' represents the weights of the target evaluation network.
Preferably, in S3, the gradient of the behavior network is:

∇_{W_a} J(W_a) ≈ (1/M) Σ_{i=1..M} ∇_a Q(s_i, a; W_c)|_{a=μ(s_i)} ∇_{W_a} μ(s_i; W_a)

where M is the size of each minibatch of data samples. The weights of the behavior network are updated with the Adam rule:

m^t = β_1 m^{t-1} + (1 - β_1) ∇_{W_a} J(W_a)
v^t = β_2 v^{t-1} + (1 - β_2) (∇_{W_a} J(W_a))^2
W_a ← W_a + η_a m̂^t / (√(v̂^t) + ξ)

where m and v are intermediate variables whose superscript denotes time (m̂^t and v̂^t are their bias-corrected values), β_1 and β_2 are hyperparameters, η_a is the learning rate, and ξ is an infinitesimal quantity; ∇_{W_a} denotes the gradient with respect to W_a, ∇_a the gradient with respect to a, and J(W_a) is the objective function of the behavior network.

The weights of the target behavior network are updated as follows:

W_a' ← τ W_a + (1 - τ) W_a'

where τ is the soft update rate.
Preferably, in S3, the gradient of the evaluation network is:

∇_{W_c} L(W_c) = (1/M) Σ_{i=1..M} 2 (Q(s_i, a_i; W_c) - y_i) ∇_{W_c} Q(s_i, a_i; W_c), with y_i = r_i + γ Q'(s_{i+1}, a'_{i+1}; W_c')

where ∇_{W_c} denotes the gradient with respect to W_c and L(W_c) is the objective function of the evaluation network. The weights of the evaluation network are updated with the Adam rule:

m^t = β_1 m^{t-1} + (1 - β_1) ∇_{W_c} L(W_c)
v^t = β_2 v^{t-1} + (1 - β_2) (∇_{W_c} L(W_c))^2
W_c ← W_c - η_c m̂^t / (√(v̂^t) + ξ)

where m and v are intermediate variables whose superscript denotes time, β_1 and β_2 are hyperparameters, η_c is the learning rate, and ξ is an infinitesimal quantity.

The weights of the target evaluation network are updated as follows:

W_c' ← τ W_c + (1 - τ) W_c'

where τ is the soft update rate.
Preferably, the reward r_i is a decreasing function of the trajectory-following error, where e_1, e_3, e_5 respectively denote the x-, y- and z-axis errors between the target trajectory point and the current position, and C is a constant that keeps the reward within a reasonable range.
Preferably, u = [u_x, u_y, u_z]^T, where m is the mass of the drone, r_d = [x_d, y_d, z_d]^T is the target trajectory, and k_1, k_2, k_3, k_4, k_5, k_6 are controller parameters;
x_1, x_3, x_5 are the position components of the drone along the x, y and z axes in the inertial coordinate system, and x_2, x_4, x_6 are the corresponding velocity components;
g is the gravitational acceleration, F is the resultant force of the four motors, d_x, d_y, d_z are the components of the disturbance acting on the drone in the three directions, and φ, θ, ψ are the attitude Euler angles of the drone.
The invention has the beneficial effect of addressing the risk that an autonomous drone crashes when disturbed by field-like artificial wind while flying through a crowded urban environment: the position of the target disturbance source is obtained by the tracking network, action compensation is generated by the reinforcement-learning compensation network, and the disturbance of artificial wind on the drone is suppressed. Because this is feedforward compensation, the drone can act before the disturbance arrives, giving a faster response and higher compensation accuracy than typical feedback compensation.
Drawings
FIG. 1 is a schematic diagram of the principles of the present invention;
FIG. 2 is a schematic diagram of an unmanned aerial vehicle suppressing interference of an industrial fan;
FIG. 3 is a schematic diagram of a neural network model;
FIG. 4 is a comparison of the track following effect using the feed forward compensation of the present invention and using PID.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the embodiments and features of the embodiments may be combined with each other without conflict.
The invention is further described with reference to the following drawings and specific examples, which are not intended to be limiting.
According to the unmanned aerial vehicle control method for restraining the wind disturbance of the similar field, the wind source is identified through the camera carried by the autonomous unmanned aerial vehicle, and the target disturbance source is tracked by adopting the tracking network. And inputting the output of the trained tracking network into a wind disturbance compensation network, and adding the compensation action generated by the compensation network and the output of the unmanned aerial vehicle feedback controller to be used as the input of the controlled autonomous unmanned aerial vehicle.
The network parameters of the deep reinforcement learning network are then updated, and the compensation and update processes are repeated until the drone flies out of the wind-field area, at which point the normal controller without feedforward compensation is restored.
The structure of the neural network that implements tracking in this embodiment is shown in the left half of fig. 3. An industrial fan is identified by the camera carried on the autonomous drone, and a set of image samples with labeled bounding boxes of the industrial fan forms the training set. A feature extractor and a model predictor are trained on this set; the feature extractor is a ResNet-50 backbone whose feature maps are fed to the model predictor. After training, the feature extractor is used as the feature extractor of the tracking network, and a convolutional layer extracted from the trained model predictor is used as the convolutional layer of the tracking network and applied to the features extracted from test frames to compute a target confidence score. Disturbance-source localization is achieved by combining this with the overlap-maximization architecture introduced in the ATOM algorithm. The effect of the tracking network is that a bounding box is generated continuously to mark the position of the industrial fan. The trained tracking network is then connected to the field-like wind-disturbance compensation network. The final output of the target-tracking problem is a bounding box, while the output of the field-like wind-disturbance compensation problem is a control compensation generated from the position of the tracked disturbance source. Although the outputs of the two tasks differ, the knowledge needed to process the two-dimensional image signal is similar. In this embodiment, the weights of the tracking neural network are divided into two parts: the front part consists of convolutional layers that extract disturbance-source features, and the rear part is a fully connected layer that generates the bounding box.
For both the target-tracking and disturbance-compensation tasks, the characteristics of the target are unchanged, so the convolutional layers can be frozen; the output dimension of the fully connected layer is replaced by the dimension of the compensation action, and the new fully connected layer is transferred and updated to generate the compensation action. As shown in the right half of fig. 3, W is the neural-network weight transferred from the network trained for tracking, divided into two parts: Conv is the convolutional layers for extracting image features, and FC is the fully connected layer for generating the compensation action. The compensation network is realized with a deep reinforcement learning algorithm, for which this embodiment designs the state space and action space. The state space is the feature map of the artificial wind source obtained from the convolutional layers; the action space is the control compensation of the drone in the x, y and z directions. In the right half of fig. 3, W_a is the weight of the behavior network, W_a' the weight of the target behavior network, W_c the weight of the evaluation network, and W_c' the weight of the target evaluation network; a is the action output by the behavior network and a' the action output by the target behavior network; Q and Q' are the action-state value function and the target action-state value function, respectively. The reinforcement-learning state s is the feature map output by the convolutional layers; the action a is the normalized compensation in the x, y and z directions, with each component between -1 and 1, and action parsing maps the normalized action to the actual action compensation amount u'.
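The freeze-and-replace transfer described above can be sketched with toy matrices standing in for the real ResNet-50 convolutional stack and fully connected heads. All names, shapes and the random "weights" here are illustrative assumptions, not the patent's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

feat_dim, box_dim, act_dim = 8, 4, 3
W_conv = rng.normal(size=(feat_dim, 16))        # transferred from tracking, then frozen
W_fc_box = rng.normal(size=(box_dim, feat_dim)) # old head: bounding-box regression
W_fc_act = np.zeros((act_dim, feat_dim))        # new head: compensation action, trained from scratch

FROZEN = {"W_conv"}                             # only the new head is updated during RL training

def features(image_vec):
    """Frozen feature extraction (stand-in for the Conv part of W)."""
    return np.tanh(W_conv @ image_vec)

def action(image_vec):
    """New FC head; tanh keeps each component of a in [-1, 1],
    matching the normalized action space."""
    return np.tanh(W_fc_act @ features(image_vec))
```

The design point this illustrates: the convolutional features serve both tasks unchanged, so only the small output head must be relearned for the compensation task.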
The specific embodiments of the present invention are described with reference to a practical example:
As shown in fig. 2, the autonomous drone flies under the wind disturbance of an industrial fan. In the figure, C_i is a horizontal inertial coordinate system with orthogonal basis x, y, z satisfying the right-hand rule; C_b is the body-fixed coordinate system of the drone with orthogonal basis x_b, y_b, z_b satisfying the right-hand rule; φ, θ, ψ are the attitude Euler angles of the drone. The resolution of the onboard camera is 640 x 480. The drone model under disturbance is:

ẋ_1 = x_2, ẋ_2 = (F/m)(cos φ sin θ cos ψ + sin φ sin ψ) + d_x/m
ẋ_3 = x_4, ẋ_4 = (F/m)(cos φ sin θ sin ψ - sin φ cos ψ) + d_y/m
ẋ_5 = x_6, ẋ_6 = (F/m) cos φ cos θ - g + d_z/m

where m is the drone mass, g is the gravitational acceleration, F is the resultant force of the four motors, and d_x, d_y, d_z are the components of the disturbance experienced by the drone in the three directions.
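Assuming the model above is the standard quadrotor translational dynamics written in the state variables x_1..x_6 (a reconstruction; the patent's own formula is only available as an image), one Euler integration step might look like:

```python
import numpy as np

def drone_step(x, F, angles, d, m=1.0, g=9.81, dt=0.01):
    """One explicit-Euler step of the translational model.
    x = [x1..x6]: positions and velocities on the x, y, z axes,
    F: total thrust, angles: (phi, theta, psi), d: disturbance (dx, dy, dz)."""
    phi, theta, psi = angles
    ax = (F / m) * (np.cos(phi) * np.sin(theta) * np.cos(psi)
                    + np.sin(phi) * np.sin(psi)) + d[0] / m
    ay = (F / m) * (np.cos(phi) * np.sin(theta) * np.sin(psi)
                    - np.sin(phi) * np.cos(psi)) + d[1] / m
    az = (F / m) * np.cos(phi) * np.cos(theta) - g + d[2] / m
    x = np.array(x, dtype=float)
    x[0::2] += dt * x[1::2]                  # positions integrate velocities
    x[1::2] += dt * np.array([ax, ay, az])   # velocities integrate accelerations
    return x
```

A quick sign sanity check: at hover (φ = θ = 0, F = mg, no disturbance) the state stays constant.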
The base controller of the drone uses backstepping control; the control block diagram is shown in fig. 1, where u' = [u'_x, u'_y, u'_z]^T is the compensation amount, u = [u_x, u_y, u_z]^T is the final control output, x = [x_1, x_2, x_3, x_4, x_5, x_6]^T is the state of the drone, specifically its position and position derivative in the inertial frame, and r_d = [x_d, y_d, z_d]^T is the target trajectory. The following error variables are defined:

e_1 = x_d - x_1, e_2 = x_2 - ẋ_d - k_1 e_1
e_3 = y_d - x_3, e_4 = x_4 - ẏ_d - k_3 e_3
e_5 = z_d - x_5, e_6 = x_6 - ż_d - k_5 e_5

where k_1, k_2, k_3, k_4, k_5, k_6 are controller parameters. The backstepping controller is:

u_x = m(ẍ_d + (1 - k_1²) e_1 - (k_1 + k_2) e_2)
u_y = m(ÿ_d + (1 - k_3²) e_3 - (k_3 + k_4) e_4)
u_z = m(z̈_d + (1 - k_5²) e_5 - (k_5 + k_6) e_6 + g)
acquiring a wind source image through a camera carried on the autonomous unmanned aerial vehicle, tracking a target interference source in the image by using a tracking network to obtain the position of the target interference source, and obtaining an action compensation amount u' according to the position of the target interference source by using a compensation network;
adding the control compensation amount u' and the control amount output by the unmanned aerial vehicle controller to obtain u which is used as the input of the controlled autonomous unmanned aerial vehicle;
the compensation network in fig. 3 includes a behavior network, a target behavior network, an evaluation network, and a target evaluation network, all of which are fully connected layers;
action values output by a behavior network Feature diagram of disturbance source at time t, W a Weights representing the behavioral network;
action values output by the target behavior network Represents a disturbance source feature map, W ', at time t +1 after the execution of the operation at time t' a Weights representing the target behavior network;
evaluating an action State value output by a networkW c A weight representing an evaluation network;
action state value output by target evaluation networkW′ c Representing the weight of the target evaluation network.
The behavior-network gradient is designed as follows:

∇_{W_a} J(W_a) ≈ (1/M) Σ_{i=1..M} ∇_a Q(s_i, a; W_c)|_{a=μ(s_i)} ∇_{W_a} μ(s_i; W_a)

where ∇_{W_a} denotes the gradient with respect to W_a, ∇_a the gradient with respect to a, J(W_a) is the objective function of the behavior network, and M is the size of each minibatch of data samples.

The weight update of the behavior network follows the Adam rule:

m^t = β_1 m^{t-1} + (1 - β_1) ∇_{W_a} J(W_a)
v^t = β_2 v^{t-1} + (1 - β_2) (∇_{W_a} J(W_a))^2
W_a ← W_a + η_a m̂^t / (√(v̂^t) + ξ)

where m and v are intermediate variables whose superscript denotes time (m̂^t and v̂^t are their bias-corrected values), β_1 and β_2 are hyperparameters, η_a is the learning rate, and ξ is an infinitesimal quantity.
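The update just described, with intermediate variables m and v indexed by time, hyperparameters β_1 and β_2, a learning rate, and an infinitesimal ξ, matches the Adam rule; a scalar sketch:

```python
import math

def adam_step(w, grad, m, v, t, eta=1e-3, beta1=0.9, beta2=0.999,
              xi=1e-8, ascent=True):
    """One Adam step for a single weight. ascent=True because the behavior
    network maximizes its objective J(W_a); the evaluation network would
    use ascent=False to descend on its loss."""
    t += 1
    m = beta1 * m + (1.0 - beta1) * grad            # first-moment estimate
    v = beta2 * v + (1.0 - beta2) * grad ** 2       # second-moment estimate
    m_hat = m / (1.0 - beta1 ** t)                  # bias correction
    v_hat = v / (1.0 - beta2 ** t)
    step = eta * m_hat / (math.sqrt(v_hat) + xi)
    return (w + step if ascent else w - step), m, v, t
```

On the first step with a unit gradient the bias correction makes the update magnitude approximately equal to the learning rate.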
The evaluation-network gradient is designed as follows:

L(W_c) = (1/M) Σ_{i=1..M} (y_i - Q(s_i, a_i; W_c))^2, with y_i = r_i + γ Q'(s_{i+1}, a'_{i+1}; W_c')

where ∇_{W_c} denotes the gradient with respect to W_c, L(W_c) is the objective function of the evaluation network, r_i is the reward, γ is the decay coefficient, M is the size of each minibatch of data samples, and s_{i+1} is the state (the new feature map) obtained after the action is performed. The weight update of the evaluation network is the same as for the behavior network:

m^t = β_1 m^{t-1} + (1 - β_1) ∇_{W_c} L(W_c)
v^t = β_2 v^{t-1} + (1 - β_2) (∇_{W_c} L(W_c))^2
W_c ← W_c - η_c m̂^t / (√(v̂^t) + ξ)

where m and v are intermediate variables whose superscript denotes time, β_1 and β_2 are hyperparameters, η_c is the learning rate, and ξ is an infinitesimal quantity.
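Forming the TD targets y_i = r_i + γ·Q' and the evaluation-network loss from a minibatch can be sketched in plain Python (function names are illustrative):

```python
def td_targets(rewards, q_next, gamma=0.99):
    """y_i = r_i + gamma * Q'(s_{i+1}, a'_{i+1}; W_c'), element-wise over the minibatch."""
    return [r + gamma * q for r, q in zip(rewards, q_next)]

def critic_loss(q_pred, y):
    """L(W_c) = (1/M) * sum_i (y_i - Q(s_i, a_i; W_c))^2."""
    m = len(y)
    return sum((yi - qi) ** 2 for yi, qi in zip(y, q_pred)) / m
```

The target networks supply q_next, which keeps the regression targets stable while W_c is being updated.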
The weights of the target behavior network and the target evaluation network are updated as follows:

W_a' ← τ W_a + (1 - τ) W_a'
W_c' ← τ W_c + (1 - τ) W_c'

where τ is the soft update rate; the larger τ is, the faster the target networks track the online networks.
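The soft update can be sketched element-wise as:

```python
def soft_update(w_target, w, tau=0.005):
    """W' <- tau * W + (1 - tau) * W', applied to each weight.
    Larger tau makes the target weights track the online weights faster."""
    return [tau * wi + (1.0 - tau) * wti for wi, wti in zip(w, w_target)]
```

With tau = 1 this reduces to a hard copy of the online network; small tau gives the slowly moving targets used during training.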
A reward function is designed so that, in this embodiment, the drone remains unaffected by the industrial fan: the smaller the position-following error, the higher the score. The reward function is therefore designed as a function negatively correlated with the trajectory-following error, where e_1, e_3, e_5 respectively denote the x-, y- and z-axis errors between the target trajectory point and the current position, and C is a constant whose purposese is to keep the value of the reward function within a relatively reasonable range.
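The reward formula itself is only available as an image in the source; the sketch below is consistent with the text (negatively correlated with the x/y/z tracking errors and scaled by a constant C), but the quadratic form is an assumption, not the patent's exact expression:

```python
def reward(e1, e3, e5, C=10.0):
    """Illustrative reward: decreases as the x/y/z tracking errors grow,
    divided by C to keep values in a reasonable range. The exact functional
    form used in the patent is not recoverable from the text."""
    return -(e1 ** 2 + e3 ** 2 + e5 ** 2) / C
```

Any strictly decreasing function of the error magnitudes, scaled by C, would satisfy the description equally well.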
A simulation scene is set up; fig. 4 shows the trajectory-following effect in this scene with and without the feedforward compensation of the compensation network. It can be clearly seen that the tracking accuracy with feedforward compensation is far better than with the PID algorithm alone.
Conclusion: the example shows that the method can effectively realize feedforward compensation of the field-like wind disturbance generated by a visible artificial wind source, improve trajectory-tracking accuracy, and greatly increase the flight safety of autonomous drones among modern urban buildings.
Although the invention herein has been described with reference to particular embodiments, it is to be understood that these embodiments are merely illustrative of the principles and applications of the present invention. It is therefore to be understood that numerous modifications may be made to the illustrative embodiments and that other arrangements may be devised without departing from the spirit and scope of the present invention as defined by the appended claims. It should be understood that features described in different dependent claims and herein may be combined in ways different from those described in the original claims. It is also to be understood that features described in connection with individual embodiments may be used in other described embodiments.
Claims (10)
1. An unmanned aerial vehicle control method for suppressing wind disturbance of a similar field, the method comprising:
s1, acquiring a wind source image through a camera carried on an autonomous unmanned aerial vehicle, tracking a target interference source in the image by using a tracking network to acquire the position of the target interference source, and acquiring an action compensation amount u' according to the position of the target interference source by using a compensation network;
the tracking network comprises a feature extractor and a convolutional layer;
the wind source images are sequentially input into the feature extractor and the convolution layer to obtain a disturbance source feature map;
the compensation network is realized with a deep reinforcement learning network; its input is the disturbance-source feature map and its output is the action a produced by a behavior network, where a represents the normalized control compensation of the unmanned aerial vehicle in the x, y and z directions, and the normalized action a is mapped to the action compensation amount u';
s2, adding the control compensation amount u' and the control amount output by the unmanned aerial vehicle controller to obtain u which is used as the input of the controlled autonomous unmanned aerial vehicle;
s3, updating network parameters of the deep reinforcement learning network;
and S4, repeating the steps from S1 to S3 until the unmanned aerial vehicle flies away from the wind field area.
2. The unmanned aerial vehicle control method for suppressing field-like wind disturbance according to claim 1, wherein in S1 an industrial fan is identified by the camera carried on the autonomous unmanned aerial vehicle, a group of image samples labeled with bounding boxes of the industrial fan constitutes a training set, a feature extractor and a model predictor are trained with the training set, and, after training is completed, the feature extractor is used as the feature extractor of the tracking network and a convolutional layer is extracted from the trained model predictor and used as the convolutional layer of the tracking network.
3. The unmanned aerial vehicle control method for suppressing field-like wind disturbance according to claim 1, wherein the feature extractor is a ResNet-50 backbone network.
4. The unmanned aerial vehicle control method for suppressing field-like wind disturbance according to claim 1 or 2, wherein the compensation network comprises a behavior network, a target behavior network, an evaluation network and a target evaluation network, all of which are fully connected layers;
the action value output by the behavior network is a_t = μ(s_t; W_a), where s_t is the disturbance-source feature map at time t and W_a represents the weights of the behavior network;
the action value output by the target behavior network is a'_{t+1} = μ'(s_{t+1}; W_a'), where s_{t+1} is the disturbance-source feature map at time t+1, after the action at time t is executed, and W_a' represents the weights of the target behavior network;
the action-state value output by the evaluation network is Q_t = Q(s_t, a_t; W_c), where W_c represents the weights of the evaluation network;
the action-state value output by the target evaluation network is Q'_{t+1} = Q'(s_{t+1}, a'_{t+1}; W_c'), where W_c' represents the weights of the target evaluation network.
5. The unmanned aerial vehicle control method for suppressing field-like wind disturbance according to claim 4, wherein in S3 the gradient of the behavior network is:

∇_{W_a} J(W_a) ≈ (1/M) Σ_{i=1..M} ∇_a Q(s_i, a; W_c)|_{a=μ(s_i)} ∇_{W_a} μ(s_i; W_a)

where M is the size of each minibatch of data samples, ∇_{W_a} denotes the gradient with respect to W_a, ∇_a the gradient with respect to a, and J(W_a) is the objective function of the behavior network; the weights of the behavior network are updated with the Adam rule:

m^t = β_1 m^{t-1} + (1 - β_1) ∇_{W_a} J(W_a)
v^t = β_2 v^{t-1} + (1 - β_2) (∇_{W_a} J(W_a))^2
W_a ← W_a + η_a m̂^t / (√(v̂^t) + ξ)

where m and v are intermediate variables whose superscript denotes time, β_1 and β_2 are hyperparameters, η_a is the learning rate, and ξ is an infinitesimal quantity;

the weights of the target behavior network are updated as follows:

W_a' ← τ W_a + (1 - τ) W_a'

where τ is the soft update rate.
6. The unmanned aerial vehicle control method for suppressing wind-farm-like wind disturbance according to claim 4, wherein in S3, the gradient of the evaluation network is:

$\nabla_{W_c} L(W_c) = \frac{1}{m}\sum_{i=1}^{m} \big(Q(s_i, a_i \mid W_c) - y_i\big)\, \nabla_{W_c} Q(s_i, a_i \mid W_c)$

where $y_i = r_i + \gamma\, Q\big(s_{i+1}, \mu'(s_{i+1} \mid W'_a) \mid W'_c\big)$ is the target value computed with the target networks, $\nabla_{W_c}$ denotes the gradient with respect to $W_c$, and $L(W_c)$ is the objective function of the evaluation network;

the weights of the evaluation network are updated as:

$m^t = \beta_1 m^{t-1} + (1-\beta_1)\,\nabla_{W_c} L(W_c)$
$v^t = \beta_2 v^{t-1} + (1-\beta_2)\,\big(\nabla_{W_c} L(W_c)\big)^2$
$\hat m^t = m^t/(1-\beta_1^t), \quad \hat v^t = v^t/(1-\beta_2^t)$
$W_c \leftarrow W_c - \eta_c\, \hat m^t / (\sqrt{\hat v^t} + \xi)$

where $m$, $v$, $\hat m$, $\hat v$ are intermediate variables, the superscript denotes time, $\beta_1$ and $\beta_2$ are hyperparameters, $\eta_c$ is the learning rate, and $\xi$ is an infinitesimal quantity;

the weights of the target evaluation network are updated as follows:

$W'_c \leftarrow \tau W_c + (1-\tau) W'_c$

where $\tau$ is the soft update rate.
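A minimal sketch of the evaluation-network objective of claim 6, under the simplifying assumption of a linear critic so that the loss gradient has a closed form (the patent's evaluation network is a fully connected deep network). `critic_loss_and_grad` is a hypothetical helper and the mini-batch, discount factor and target values are illustrative.

```python
import numpy as np

def critic_loss_and_grad(W_c, s, a, r, q_next, gamma=0.99):
    """Mean-squared TD loss L(W_c) and its gradient for a linear critic
    Q(s, a | W_c) = [s, a] @ W_c."""
    phi = np.concatenate([s, a], axis=1)   # critic input features per sample
    q = phi @ W_c                          # Q(s_i, a_i | W_c)
    y = r + gamma * q_next                 # target values y_i from the target networks
    err = q - y
    loss = 0.5 * np.mean(err ** 2)
    grad = phi.T @ err / len(err)          # nabla_{W_c} L(W_c)
    return loss, grad

rng = np.random.default_rng(1)
s = rng.standard_normal((8, 4))            # batch of state features
a = rng.standard_normal((8, 2))            # batch of actions
W_c = np.zeros(6)
r = np.ones(8)                             # illustrative rewards
q_next = np.zeros(8)                       # illustrative target-network outputs
loss0, grad = critic_loss_and_grad(W_c, s, a, r, q_next)
loss1, _ = critic_loss_and_grad(W_c - 0.05 * grad, s, a, r, q_next)  # one descent step
```

A single small gradient step on this convex quadratic strictly reduces the TD loss, which is the behavior the update in claim 6 relies on.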
8. The unmanned aerial vehicle control method for suppressing wind-farm-like wind disturbance according to claim 1, wherein the controller output is $u = [u_x, u_y, u_z]^T$, where:
m is the mass of the drone, $r_d = [x_d, y_d, z_d]^T$ is the target trajectory, and $k_1, k_2, k_3, k_4, k_5, k_6$ are controller parameters;
$x_1, x_3, x_5$ are the position components of the unmanned aerial vehicle along the x, y and z axes in the inertial coordinate system, and $x_2, x_4, x_6$ are the corresponding velocity components;
g is the acceleration of gravity, F is the resultant force of the four motors, $d_x, d_y, d_z$ are the components of the disturbance acting on the unmanned aerial vehicle in the three directions, and $\phi, \theta, \psi$ are the attitude Euler angles of the drone.
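The specific control law of claim 8 is not reproduced in this text, so the sketch below is only a conventional PD law with disturbance compensation assembled from the symbols the claim defines ($r_d$, $k_1 \ldots k_6$, $x_1 \ldots x_6$, $d_x, d_y, d_z$); the structure of the actual claimed controller may differ.

```python
import numpy as np

def position_controller(state, r_d, v_d, d_hat, k):
    """Hypothetical PD position law with disturbance compensation.

    state : (x1, x2, x3, x4, x5, x6) - positions and velocities on x, y, z
    r_d   : target position [x_d, y_d, z_d]; v_d : target velocity
    d_hat : compensated disturbance estimate (d_x, d_y, d_z)
    k     : controller parameters k1..k6
    """
    x1, x2, x3, x4, x5, x6 = state
    k1, k2, k3, k4, k5, k6 = k
    dx, dy, dz = d_hat
    u_x = k1 * (r_d[0] - x1) + k2 * (v_d[0] - x2) - dx
    u_y = k3 * (r_d[1] - x3) + k4 * (v_d[1] - x4) - dy
    u_z = k5 * (r_d[2] - x5) + k6 * (v_d[2] - x6) - dz
    return np.array([u_x, u_y, u_z])

# At the target with zero velocity error and no disturbance, the command vanishes.
u = position_controller((1.0, 0.0, 2.0, 0.0, 3.0, 0.0),
                        r_d=(1.0, 2.0, 3.0), v_d=(0.0, 0.0, 0.0),
                        d_hat=(0.0, 0.0, 0.0), k=(2.0, 1.0, 2.0, 1.0, 4.0, 2.0))
```

Subtracting the disturbance estimate from each axis command is how the compensation network's output would enter such a law: any residual wind force the estimator captures is cancelled before the PD terms act.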
9. A computer-readable storage device storing a computer program, wherein the computer program, when executed, implements the unmanned aerial vehicle control method for suppressing wind-farm-like wind disturbance according to any one of claims 1 to 8.
10. An unmanned aerial vehicle control apparatus for suppressing wind-farm-like wind disturbance, comprising a storage device, a processor, and a computer program stored in the storage device and operable on the processor, wherein the processor executes the computer program to implement the unmanned aerial vehicle control method for suppressing wind-farm-like wind disturbance according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211381428.6A CN115562330B (en) | 2022-11-04 | 2022-11-04 | Unmanned aerial vehicle control method for inhibiting wind disturbance of quasi-field |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115562330A true CN115562330A (en) | 2023-01-03 |
CN115562330B CN115562330B (en) | 2023-08-22 |
Family
ID=84768647
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211381428.6A Active CN115562330B (en) | 2022-11-04 | 2022-11-04 | Unmanned aerial vehicle control method for inhibiting wind disturbance of quasi-field |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115562330B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109816695A (en) * | 2019-01-31 | 2019-05-28 | 中国人民解放军国防科技大学 | Target detection and tracking method for infrared small unmanned aerial vehicle under complex background |
CN110398720A (en) * | 2019-08-21 | 2019-11-01 | 深圳耐杰电子技术有限公司 | A kind of anti-unmanned plane detection tracking interference system and photoelectric follow-up working method |
US20200312163A1 (en) * | 2019-03-26 | 2020-10-01 | Sony Corporation | Concept for designing and using an uav controller model for controlling an uav |
KR20210088142A (en) * | 2020-01-06 | 2021-07-14 | 세종대학교산학협력단 | System for detecting and tracking target of unmanned aerial vehicle |
CN114527776A (en) * | 2022-01-07 | 2022-05-24 | 鹏城实验室 | Unmanned aerial vehicle wind disturbance resisting control method and device, terminal and storage medium |
Non-Patent Citations (1)
Title |
---|
AN, HANG; XIAN, BIN: "Attitude reinforcement learning control design and verification for an unmanned helicopter", Control Theory & Applications, vol. 36, no. 4, pages 516-524 *
Also Published As
Publication number | Publication date |
---|---|
CN115562330B (en) | 2023-08-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Hong et al. | Energy-efficient online path planning of multiple drones using reinforcement learning | |
CN110806756B (en) | Unmanned aerial vehicle autonomous guidance control method based on DDPG | |
Ergezer et al. | Path planning for UAVs for maximum information collection | |
Bouffard et al. | Learning-based model predictive control on a quadrotor: Onboard implementation and experimental results | |
WO2019076044A1 (en) | Mobile robot local motion planning method and apparatus and computer storage medium | |
He et al. | Deep reinforcement learning based local planner for UAV obstacle avoidance using demonstration data | |
CN113268074B (en) | Unmanned aerial vehicle flight path planning method based on joint optimization | |
CN113848984B (en) | Unmanned aerial vehicle cluster control method and system | |
CN115033022A (en) | DDPG unmanned aerial vehicle landing method based on expert experience and oriented to mobile platform | |
Magree et al. | Monocular visual mapping for obstacle avoidance on UAVs | |
Song et al. | Learning perception-aware agile flight in cluttered environments | |
CN109870906A (en) | A kind of high-speed rotor aircraft paths planning method based on BBO optimization Artificial Potential Field | |
CN111624875A (en) | Visual servo control method and device and unmanned equipment | |
CN109375642B (en) | Energy-saving control method for unmanned aerial vehicle | |
Fu et al. | Memory-enhanced deep reinforcement learning for UAV navigation in 3D environment | |
Silva et al. | Landing area recognition by image applied to an autonomous control landing of VTOL aircraft | |
Sandström et al. | Fighter pilot behavior cloning | |
CN117215197B (en) | Four-rotor aircraft online track planning method, four-rotor aircraft online track planning system, electronic equipment and medium | |
Orsag et al. | State estimation, robust control and obstacle avoidance for multicopter in cluttered environments: Euroc experience and results | |
CN115562330B (en) | Unmanned aerial vehicle control method for inhibiting wind disturbance of quasi-field | |
CN112161626B (en) | High-flyability route planning method based on route tracking mapping network | |
CN114609925B (en) | Training method of underwater exploration strategy model and underwater exploration method of bionic machine fish | |
CN116009583A (en) | Pure vision-based distributed unmanned aerial vehicle cooperative motion control method and device | |
CN117130383B (en) | Unmanned aerial vehicle vision tracking method and system, unmanned aerial vehicle and readable storage medium | |
Yin et al. | Online joint control approach to formation flying simulation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||