CN109143855B - Visual servo control method of unmanned gyroplane based on fuzzy SARSA learning - Google Patents
- Publication number: CN109143855B
- Authority: CN (China)
- Legal status: Active
Classifications
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/04—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
- G05B13/042—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
- G06V20/13—Satellite images
Abstract
The invention discloses a visual servo control method for a rotor unmanned aerial vehicle based on fuzzy SARSA learning. The rotor drone acquires image information through a camera, extracts the contour features of a target with a target contour extraction algorithm based on the Freeman chain code, and performs contour compensation for the edge information of the target lost during image acquisition. The servo gain parameters of the visual servo are trained with a reinforcement learning algorithm, so that the rotor drone acquires the ability to adaptively adjust the servo gain, and the learning rate is adjusted in combination with a fuzzy control method. Using reinforcement learning, the rotor drone gains experience through training in different scenes and can change the gain by itself; at the same time, adjusting the learning rate of reinforcement learning through fuzzy control yields a faster convergence rate. The Freeman-chain-code-based target contour extraction algorithm effectively reduces the error between the extracted central feature point and the actual central feature point, improving the accuracy of feature extraction.
Description
Technical Field
The invention relates to the field of machine learning and robot automatic control, in particular to a vision servo control method of a rotor unmanned aerial vehicle based on fuzzy SARSA learning.
Background
Today, artificial intelligence and machine learning technology are developing rapidly and are applied to many aspects of production and daily life. Rotor-drone control has long relied on classical automatic control methods such as PID control or visual servo control; but as the tasks borne by rotor drones grow increasingly complex and their operating environments become unpredictable, these classical methods can no longer meet the control requirements. The PID control and image-based visual servo control methods adopted in traditional rotor-drone control suffer from low stability and slow convergence in complex scenes, making it difficult for the drone to carry out its work task efficiently in specific application scenarios. Therefore, an intelligent rotor-drone control method that improves visual servoing in combination with machine learning is needed.
Disclosure of Invention
In order to avoid the defects of the prior art, the invention provides a visual servo control method for a rotor unmanned aerial vehicle based on fuzzy SARSA learning. The rotor drone acquires image information through a downward-facing camera and extracts the contour features of the target with a target contour extraction algorithm based on the Freeman chain code; because edge information of the target is usually lost during image acquisition, a contour compensation pass is performed. The servo gain parameters of the visual servo are trained with a reinforcement learning algorithm, so that the rotor drone acquires the ability to adaptively adjust the servo gain, and the learning rate is adjusted in combination with a fuzzy control method. On this basis, the rotor drone gains experience through continuous reinforcement-learning training in different scenes, so it can change the gain automatically; at the same time, fuzzy control adaptively adjusts the learning rate of reinforcement learning, accelerating the operation of classical reinforcement learning and obtaining a faster convergence rate. The Freeman-chain-code-based target contour extraction algorithm, together with a contour compensation algorithm that completes the contour, effectively reduces the error between the extracted central feature point and the actual central feature point caused by missing edges in classical image feature extraction, improving feature extraction accuracy.
The invention solves the technical problem by adopting the technical scheme that a vision servo control method of a rotor unmanned aerial vehicle based on fuzzy SARSA learning is characterized by comprising the following steps:
step 1, performing edge extraction on the image with the Canny algorithm, then obtaining a set of N contour coordinates through filtering and noise-reduction operations; the contour is described with a Freeman chain code, and the contour pixels are recorded as C = {c_i | i = 1, ..., N}; the contour pixels of the target are rotation-normalized to obtain a Freeman chain code, and the Levenshtein distance to each graph in a standard contour library is computed; the Levenshtein distance is the number of operations required to convert shape A into shape B, the permitted operations being insertion, deletion and modification; this method can identify the shape of an object even when the image target has lost part of its edge;
step 2, after the contour pixels of the image are acquired in step 1, the system uses a contour compensation algorithm, because the photographed image sometimes has an incomplete contour. For the l-th target, one pass of Freeman chain-code processing yields N_l contour feature points f_j^l, whose set is F^l = {f_j^l | j = 1, ..., N_l}. Rotation-normalizing each element of F^l gives N_standard standard contour feature points d_j^l, whose set is denoted D^l = {d_j^l | j = 1, ..., N_standard}. Let O^l be the compensated feature-point contour set; the conversion relationship between the compensated contour O^l and the standard contour D^l is D^l·R + L = O^l, wherein R and L are the rotation matrix and the translation matrix, respectively. Writing the j-th element of the compensated contour O^l as P_j, the set contains N_standard elements in total and is recorded as O^l = {P_j | j = 1, ..., N_standard}. The central feature point of target l is taken as the feature point used for visual servo control; its coordinate is the average of the sum of the coordinates of the compensated contour feature points, recorded as P̄^l = (1/N_standard) Σ_{j=1}^{N_standard} P_j;
step 3, after the central characteristic point of the target is obtained in the step 2, establishing a bottom visual model of the rotor unmanned aerial vehicle, namely a conversion relation from a three-dimensional space to an image pixel plane;
step 4, constructing a decoupled visual servo control model of the rotor unmanned aerial vehicle through the obtained visual model of the rotor unmanned aerial vehicle, wherein the decoupled visual servo control model comprises a visual servo gain value;
step 5, establishing a single-step SARSA learning and adjusting servo gain model; using SARSA to learn and adjust the visual servo gain value of the rotor unmanned aerial vehicle in the step 4;
1) setting a state space; after the contour of the target is extracted by the image feature extraction algorithm, the target is simplified to its central feature point; the absolute values of the errors between the current feature point and the target feature point are computed and summed, and the range this sum falls in is taken as the state;
2) setting an action space; by analysing the differences among candidate servo gains, an initial value λ* is selected as the initial servo gain; the size of the action set is 2n_a + 1, and the action set forms an arithmetic progression with a set common difference d_a, recorded as A = {a_i | i = 1, 2, 3, ..., 2n_a + 1}; the servo gains on the linear velocity and the angular velocity are adjusted separately;
3) setting a reward function; the reward function is divided into three parts: reaching the expected target, losing track of the target, and all other cases; if the feature error of every dimension satisfies |e_i| < δ, where δ is a threshold value, the quad-rotor drone is considered to have reached the target position and the highest reward is given; if, after feature extraction, the real-time image shot by the quad-rotor drone is missing feature points compared with the target image, the drone is considered to have lost the target and the return value is negative; in all other cases the reward depends on how close the quad-rotor drone is to the target;
4) setting a single-step SARSA learning iterative algorithm; an iterative formula for the servo gain is set separately in the two spaces of linear-velocity gain and angular-velocity gain; the iterative process of the servo-gain algorithm follows Q learning, uses the set servo-gain iterative formula, and completes the iterative updating of the servo gain;
5) setting learning rules; the maximum time spent in one learning round of single-step SARSA learning is set to 400 time slices; in each round the quad-rotor is placed at a random position within the feasible range from which all targets are visible after it takes off to 1.0 m, and one training session consists of 5000 rounds. (1) If the quad-rotor has still not reached the designated position 400 time slices after leaving the initial position, it is forcibly returned to the starting point for the next round; (2) if the quad-rotor's motion causes the feature points to be lost, the current round ends and the next round restarts; (3) if during its motion the quad-rotor stays within 5 pixels of the target position for a certain time, the target point is considered reached and the round ends; (4) the servo gain is updated after each round ends;
step 6, fuzzy control rules; after the SARSA-learning servo-gain adjustment model is established in step 5, fuzzy control is used to adaptively adjust the learning rate. The basic rule of this adaptive adjustment is: if the gain the agent has learned causes the feature error to increase, the learning rate is reduced; otherwise it is increased. Concretely, the learning rate of reinforcement learning is changed by fuzzy control as follows: the rate of change of the feature error is taken as the observed quantity and fuzzified; a fuzzy control rule based on the max-min composition operation is set; the observed quantity is fed into the fuzzy rule to obtain the controlled quantity, namely the learning rate; and the final learning rate is obtained by defuzzification.
Advantageous effects
The invention provides a visual servo control method for a rotor unmanned aerial vehicle based on fuzzy SARSA learning. The rotor drone is controlled by an image-based visual servo control method; image-based visual servoing forms closed-loop feedback adjustment based on the feature error, so as to control reasonable movement of the drone. The servo gain is adaptively adjusted with a reinforcement learning method: the drone is trained in different scenes and, after repeated training, learns the ability to change the gain in each scene. Fuzzy control changes the learning rate of reinforcement learning: the rate of change of the feature error is taken as the observed quantity and fuzzified, a fuzzy rule based on the max-min composition operation is set, the observed quantity is fed into the rule to obtain the controlled quantity, and the final learning rate is obtained by defuzzification. After repeated training the rotor drone masters the skill of adaptive gain adjustment.
Drawings
The following describes in detail a method for controlling a visual servo of a rotary wing drone based on fuzzy SARSA learning according to the present invention with reference to the accompanying drawings and embodiments.
Fig. 1 is a flow chart of a vision servo control method of a rotor unmanned aerial vehicle based on fuzzy SARSA learning according to the present invention.
Detailed Description
The embodiment is a vision servo control method of a rotor unmanned aerial vehicle based on fuzzy SARSA learning.
Aiming at the loss of contour parts in traditional visual feature extraction algorithms, this embodiment provides a contour extraction algorithm based on the Freeman chain code: a contour compensation algorithm completes the contour, and a weighted-average method then computes the central feature point of the target. For the under-actuated, nonlinear dynamics of the rotor drone, this embodiment proposes a decoupled visual servo control model. For the problem that a fixed visual servo gain is inefficient and cannot adapt to complex environments, this embodiment provides a method of adjusting the visual servo gain with SARSA learning. For the problem that a fixed SARSA learning rate yields low learning efficiency, this embodiment proposes adjusting the learning rate of SARSA learning with a fuzzy control method.
Referring to fig. 1, the method for controlling the vision servo of the unmanned rotorcraft based on the fuzzy SARSA learning in the embodiment includes two aspects of image feature extraction and intelligent control method of the unmanned rotorcraft, and includes the following steps:
Step one, edge extraction is performed on the image with the Canny algorithm, and a set of N contour coordinates is then obtained through filtering and noise-reduction operations; the contour is described with a Freeman chain code, and the contour pixels are recorded as C = {c_i | i = 1, ..., N}. The visual feature extraction algorithm of this embodiment requires a graph library: assume the established standard contour library contains M graphs corresponding to M actual objects, their contours being D = {D_i | i = 1, ..., M}. First, the Freeman chain code is rotation-normalized using the first-order difference, according to the formula d_i = (c_{i+1} − c_i) mod 8, with indices taken cyclically.
the method comprises the steps of obtaining a Lelman chain code after carrying out rotation normalization on contour pixels of a target and respectively calculating a Levenshtein distance with a graph in a standard contour library, wherein the Levenshtein distance is calculated in the mode that an operand is needed for converting a shape A into a shape B, the operation can be only insertion, deletion and modification, and the method can be used for identifying the shape of an object under the condition that the image target loses a certain degree of edge.
Step two, because the photographed picture sometimes has an incomplete contour, the system uses a contour compensation algorithm. For the l-th target, one pass of Freeman chain-code processing yields the feature point set F^l = {f_j^l | j = 1, ..., N_l}; the set of standard contour feature points obtained after rotation-normalizing this set is D^l = {d_j^l | j = 1, ..., N_standard}.
The contour of the identified object is represented with the Freeman chain code as X = {x_k | k = 1, ..., q}, where q is the number of feature points and x_k is the coordinate of the k-th feature point. The corresponding contour in the standard contour library is X* = {x*_k | k = 1, ..., q}, where x*_k is the coordinate of the k-th feature point. The relation between X and X* is established as X*·R + L = X.
Let R be the rotation matrix R = [cos β, −sin β; sin β, cos β]. Derivation yields:
R = (H*)⁺H, L = X − X*(H*)⁺H
wherein (H*)⁺ denotes the pseudo-inverse of H*. Thus, for the standard contour D^l and the compensated contour O^l, the relationship is D^l·R + L = O^l, recorded as O^l = {P_j | j = 1, ..., N_standard}. The central feature point of target l is taken as the feature point used for visual servo control; its coordinate is the average of the sum of the coordinates of the compensated contour feature points.
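Recovering the rotation angle β and the translation L from two corresponding point sets can be sketched with the closed-form 2-D least-squares alignment below. This is a Procrustes-style solution after centring both sets — equivalent in effect to the pseudo-inverse expressions above, but the function name, the centring formulation, and the convention target ≈ R(β)·standard + L are illustrative assumptions:

```python
import math

def align(standard, target):
    # Estimate the rotation angle beta and translation L that map the
    # standard contour onto the observed contour: target ~= R(beta)*p + L.
    n = len(standard)
    csx = sum(p[0] for p in standard) / n   # centroid of the standard set
    csy = sum(p[1] for p in standard) / n
    ctx = sum(p[0] for p in target) / n     # centroid of the observed set
    cty = sum(p[1] for p in target) / n
    num = den = 0.0
    for (sx, sy), (tx, ty) in zip(standard, target):
        sx, sy, tx, ty = sx - csx, sy - csy, tx - ctx, ty - cty
        num += sx * ty - sy * tx            # cross terms -> sin(beta)
        den += sx * tx + sy * ty            # dot terms   -> cos(beta)
    beta = math.atan2(num, den)
    c, s = math.cos(beta), math.sin(beta)
    # Translation that carries the rotated standard centroid onto the target centroid.
    L = (ctx - (c * csx - s * csy), cty - (s * csx + c * csy))
    return beta, L
```

For example, a unit square rotated by 90° and shifted by (2, 3) should yield β ≈ π/2 and L ≈ (2, 3).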
establishing a conversion relation from a three-dimensional space to an image pixel plane;
let P be (X)p,Yp,Zp)TIs a point in space, Pi=(x,y,z)TFor the projection of the point P on the image plane, a formula can be obtained according to the principle of pinhole imaging:
wherein, KsIs constant, f is focal length;
the image collected by the vision sensor is stored in a computer by using a binary function, and is marked as f (u, v), wherein the (u, v) is a coordinate on an image plane, and the f (u, v) is a pixel value at the point; the relationship of a point (u, v) in the image plane coordinate system to a point (x, y) in the coordinate system of the vision sensor plane is according to the formula:
recording origin O of coordinate plane of vision sensor1The coordinate in the image plane is (u)q,vq),dx,dyFor scaling from image plane to vision sensor planeAnd (4) proportion.
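The projection chain from a camera-frame point to pixel coordinates can be sketched as follows; K_s is folded into f here, and the parameter values used in the test are arbitrary illustrative assumptions:

```python
def project(point, f, u_q, v_q, d_x, d_y):
    # Pinhole projection: metric image-plane coordinates x = f*Xp/Zp,
    # y = f*Yp/Zp, then conversion to pixel coordinates using the
    # sensor-plane origin (u_q, v_q) and the per-axis scales d_x, d_y.
    Xp, Yp, Zp = point
    x = f * Xp / Zp
    y = f * Yp / Zp
    return x / d_x + u_q, y / d_y + v_q
```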
A visual-servo-based rotor drone control model is established. Let e denote the feature error vector, e = f_c − f*, wherein f_c represents the current coordinates (u_c, v_c) of the feature point in the image plane and f* the coordinates (u*, v*) of the desired feature point. From the dynamic relationship, the rate of change of the feature error with time is a function of the angular velocity and the linear velocity:
ė = J_v·v + J_ω·ω
wherein J is the image Jacobian matrix, v = (v_x, v_y, v_z)^T is the linear velocity vector of the drone, f_c = (u_c, v_c)^T is the current position coordinate vector of the feature point, and ω = (ω_x, ω_y, ω_z)^T is the vector of the pitch, roll and yaw angular velocities of the drone. To make the feature error decrease exponentially in a decoupled manner, ė = −λ_{v,ω}·e is substituted into equation (5), which yields:
(v, ω) = −λ_{v,ω}·J⁺·e
wherein J⁺ is the pseudo-inverse of the matrix J: if J is a square matrix, the pseudo-inverse is the inverse, denoted J^(−1); if the numbers of rows and columns differ, then J⁺ = (J^T·J)^(−1)·J^T. λ_{v,ω} is the servo gain, with value range (0, 1).
Considering the dynamics of the quad-rotor drone, kinematics gives the relationship between the rate of change of (X_p, Y_p, Z_p) with time and the linear and angular velocities; combining this relationship with formula (7) yields the closed-loop error dynamics.
If the number of feature points is N, the coordinate set of the feature points is {f_i | i = 1, 2, ..., N}, so the feature-point error is e = (f_1 − f_1*, ..., f_N − f_N*)^T.
Decoupling is performed with servo gains independent for linear velocity and angular velocity, converting the control law to:
v = −λ_v·Ĵ_v⁺·e,  ω_z = −λ_ω·Ĵ_ω⁺·e
wherein λ_v and λ_ω are the servo gains for linear velocity and angular velocity respectively, Ĵ_v⁺ is the sub-matrix formed by the first three rows of J⁺, and Ĵ_ω⁺ is the sub-matrix formed by its fourth row.
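For a square, invertible image Jacobian the pseudo-inverse reduces to the ordinary inverse, so the control law v = −λ·J⁺·e can be sketched for a 2×2 case in pure Python (the nested-tuple matrix layout is an illustrative assumption):

```python
def control_law(J, e, lam):
    # v = -lam * J^{-1} e for a square, invertible 2x2 image Jacobian.
    (a, b), (c, d) = J
    det = a * d - b * c                       # assumed non-zero
    inv = ((d / det, -b / det), (-c / det, a / det))
    return tuple(-lam * (row[0] * e[0] + row[1] * e[1]) for row in inv)
```

With this choice the closed loop satisfies ė = J·v = −λ·e, i.e. the feature error decays exponentially at the rate set by the servo gain λ — which is precisely why adapting λ changes the convergence speed.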
Establishing a single-step SARSA learning and adjusting servo gain model;
1. setting a state space:
After the contour of the target is extracted by the image feature extraction algorithm, the target is simplified to its central feature point; the absolute values of the errors between the current feature point and the target feature point are computed and summed, and the range this sum falls in is taken as the state.
2. Setting an action space:
By analysing the differences among candidate servo gains, an initial value λ* is selected as the initial servo gain. The size of the action set is 2n_a + 1, and the action set forms an arithmetic progression with a set common difference d_a, recorded as A = {a_i | i = 1, 2, 3, ..., 2n_a + 1}; expanded, the action set is {−n_a·d_a, −(n_a − 1)·d_a, ..., −d_a, 0, d_a, 2d_a, ..., (n_a − 1)·d_a, n_a·d_a}. The servo gain adjustment formula adds the selected action to the current gain; according to formula (11), in the linear-velocity and angular-velocity directions this gives:
λ_v(k+1) = λ_v(k) + a_v,  λ_ω(k+1) = λ_ω(k) + a_ω
wherein λ_v(k) is the linear-velocity servo gain before adjustment, λ_v(k+1) the linear-velocity servo gain obtained after selecting action a_v, and λ_ω(k), λ_ω(k+1) the yaw-angle servo gain before and after selecting action a_ω.
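The action set and the one-step gain update can be sketched as follows; the clipping bounds that keep λ inside its stated value range (0, 1) are an assumption added for robustness:

```python
def action_set(n_a, d_a):
    # Arithmetic progression of 2*n_a + 1 gain increments:
    # {-n_a*d_a, ..., -d_a, 0, d_a, ..., n_a*d_a}.
    return [i * d_a for i in range(-n_a, n_a + 1)]

def adjust_gain(lam, action, lo=0.01, hi=1.0):
    # One-step servo-gain update lam <- lam + a, clipped so the gain
    # stays inside its valid range (bounds are illustrative assumptions).
    return min(hi, max(lo, lam + action))
```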
3. Setting a reward function:
The reward function is divided into three parts: reaching the expected target, losing track of the target, and all other cases.
If the feature error of every dimension satisfies |e_i| < δ, where δ is a threshold value, the quad-rotor drone is considered to have reached the target position and the highest reward is given.
If the feature points are missing compared with the feature points of the target image after the real-time images shot by the quad-rotor unmanned aerial vehicle pass through feature extraction, the unmanned aerial vehicle is considered to have lost the target, and the return value is a negative value.
Other situations are awarded based on how close the quad-rotor drone is to the target.
Thus, using a function-analytic expression to describe the reward function follows the formula:
where row and col represent the length and width of the image plane, respectively.
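A sketch of the three-part reward described above; the specific constants (±1, the shaping scale of the third branch, and the default `delta`, `row`, `col` values) are illustrative assumptions, not the patent's values:

```python
def reward(errors, target_lost, delta=5.0, row=480, col=640):
    # Part 2: the target has left the image -> negative return value.
    if target_lost:
        return -1.0
    # Part 1: every per-dimension feature error is below the threshold.
    if all(abs(e) < delta for e in errors):
        return 1.0
    # Part 3: shaping term that grows as the drone approaches the target,
    # normalised by the image diagonal so it stays below the success reward.
    max_err = (row ** 2 + col ** 2) ** 0.5
    dist = sum(e * e for e in errors) ** 0.5
    return 0.5 * (1.0 - dist / max_err)
```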
4. Setting a single-step SARSA learning iterative algorithm:
the iterative formula for the servo gain is:
The iterative process of the algorithm follows Q learning:
2) in the current state, two random numbers are generated, denoted rand_1 and rand_2; if rand_1 < ε, a linear-velocity action is selected at random, otherwise the action is selected according to formula (15); similarly, if rand_2 < ε an angular-velocity action is selected at random, otherwise the action is selected according to formula (15);
5) And returning to the step 2, and repeating the steps for multiple times.
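The single-step SARSA update and the ε-greedy selection used in step 2 can be sketched as follows (the dictionary Q-table keyed on (state, action) pairs is an illustrative choice):

```python
import random

def sarsa_update(Q, s, a, r, s2, a2, alpha, gamma):
    # On-policy single-step SARSA:
    # Q(s,a) <- Q(s,a) + alpha * (r + gamma*Q(s',a') - Q(s,a)),
    # where a' is the action actually selected in the next state s'.
    q = Q.get((s, a), 0.0)
    Q[(s, a)] = q + alpha * (r + gamma * Q.get((s2, a2), 0.0) - q)

def epsilon_greedy(Q, s, actions, eps):
    # With probability eps explore (random action), otherwise exploit
    # the action with the highest current Q value.
    if random.random() < eps:
        return random.choice(actions)
    return max(actions, key=lambda a: Q.get((s, a), 0.0))
```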
5. Setting learning rules:
The maximum time spent in one learning round of single-step SARSA learning is set to 400 time slices; in each round the quad-rotor is placed at a random position within the feasible range from which all targets are visible after it takes off to 1.0 m, and one training session consists of 5000 rounds. (1) If the quad-rotor has still not reached the designated position 400 time slices after leaving the initial position, it is forcibly returned to the starting point for the next round; (2) if the quad-rotor's motion causes the feature points to be lost, the current round ends and the next round restarts; (3) if during its motion the quad-rotor stays within 5 pixels of the target position for a certain time, the target point is considered reached and the round ends; (4) the servo gain is updated after each round ends.
Fuzzy control rules: fuzzy control is used to adaptively adjust the learning rate. The basic rule is: if the gain the agent has learned causes the feature error to increase, the learning rate is reduced; otherwise it is increased. In this embodiment the rate of change of the feature error is the observed quantity and the learning rate is the controlled quantity; the specific fuzzy-control steps are as follows:
1. The rate of change of the sum of the distances between each feature point and its expected position is taken as the observed quantity, and its fuzzification is described by the five labels "decreasing rapidly, decreasing slowly, roughly unchanged, increasing slowly, increasing rapidly" (DR, DS, RU, IS, IR). The learning-rate output is taken as the control quantity of the fuzzy control, i.e. the output quantity, and its fuzzification is described by the five labels "large, relatively large, medium, relatively small, small" (L, LL, M, LS, S).
2. The membership functions of the input quantity described by {DR, DS, RU, IS, IR} are denoted μ_DR(x), ..., μ_IR(x), and the membership functions of the output quantity described by {L, LL, M, LS, S} are denoted μ_L(y), ..., μ_S(y) in turn. Taking the DR membership function as an example, the expression of each membership function can be written out.
3. n discrete points x_1, ..., x_n are selected uniformly from the value range of the input quantity; each point has 5 membership values relative to the input fuzzy descriptions, so an input membership discrete matrix P = (p_ik) ∈ R^{n×5} can be constructed, with p_ik = μ_k(x_i).
Similarly, m discrete points y_1, ..., y_m are selected uniformly from the output universe, and an output membership discrete matrix Q = (q_jk) ∈ R^{m×5} is constructed, with q_jk = μ_k(y_j).
4. The fuzzy rules are set as "if DR then L; if DS then LL; if RU then M; if IS then LS; if IR then S". Combining the max-min composition operation, a fuzzy inference engine R = (r_ij) ∈ R^{n×m}, i = 1, ..., n, j = 1, ..., m is derived:
r_ij = ∨_k (p_ik ∧ q_jk)
Wherein, Λ represents the selected minimum value and v represents the selected maximum value.
5. For a particular observed quantity, the 5 membership values relative to the input fuzzy descriptions are first obtained; weighting them gives the fuzzified input quantity ρ = (ρ_j) ∈ R^{1×n}.
6. Passing the fuzzified input quantity ρ through the inference engine R with the max-min composition operation gives the corresponding fuzzy output vector β = (β_j) ∈ R^{1×m}, computed as β_j = ∨_i (ρ_i ∧ r_ij).
7. The final learning-rate control result is computed from the fuzzy control quantity β; defuzzification uses the weighted-average method, so the final output, i.e. the learning rate, is α_0 = (Σ_j β_j·y_j) / (Σ_j β_j).
Thus, for a determined observed quantity, the corresponding learning rate α_0 can be obtained by fuzzy control. Through this fuzzy-control-based variable learning rate design, the learning time can be reduced to a certain extent and the operating efficiency of the algorithm improved.
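The max-min inference and weighted-average defuzzification above can be sketched as follows; the triangular membership shape is an illustrative assumption (the patent does not fix the shape in this text):

```python
def triangular(x, a, b, c):
    # Triangular membership function rising from a, peaking at b, falling to c.
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def max_min_infer(rho, R):
    # Max-min composition: beta_j = max_i min(rho_i, r_ij).
    m = len(R[0])
    return [max(min(rho[i], R[i][j]) for i in range(len(R))) for j in range(m)]

def defuzzify(beta, ys):
    # Weighted-average defuzzification over output sample points ys:
    # alpha_0 = sum(beta_j * y_j) / sum(beta_j).
    return sum(b * y for b, y in zip(beta, ys)) / sum(beta)
```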
Claims (1)
1. A vision servo control method of a rotor unmanned aerial vehicle based on fuzzy SARSA learning is characterized by comprising the following steps:
step 1, performing edge extraction on the image with the Canny algorithm, then obtaining a set of N contour coordinates through filtering and noise-reduction operations; the contour is described with a Freeman chain code, and the contour pixels are recorded as C = {c_i | i = 1, ..., N}; the contour pixels of the target are rotation-normalized to obtain a Freeman chain code, and the Levenshtein distance to each graph in a standard contour library is computed; the Levenshtein distance is the number of operations required to convert shape A into shape B, the permitted operations being insertion, deletion and modification; this method can identify the shape of an object even when the image target has lost part of its edge;
step 2, after the contour pixels of the image are obtained in step 1, a contour compensation algorithm is used, because the photographed image sometimes has an incomplete contour; for the l-th target, one pass of Freeman chain-code processing yields N_l contour feature points f_j^l, whose set is F^l = {f_j^l | j = 1, ..., N_l}; rotation-normalizing each element of F^l gives N_standard standard contour feature points d_j^l, whose set is denoted D^l = {d_j^l | j = 1, ..., N_standard}; let O^l be the compensated feature-point contour set; the conversion relationship between the compensated contour O^l and the standard contour D^l is D^l·R + L = O^l, wherein R and L are the rotation matrix and the translation matrix, respectively; writing the j-th element of the compensated contour O^l as P_j, the set contains N_standard elements in total and is recorded as O^l = {P_j | j = 1, ..., N_standard}; the central feature point of target l is taken as the feature point used for visual servo control; its coordinate is the average of the sum of the coordinates of the compensated contour feature points, recorded as P̄^l = (1/N_standard) Σ_{j=1}^{N_standard} P_j;
step 3, after the central characteristic point of the target is obtained in the step 2, establishing a bottom visual model of the rotor unmanned aerial vehicle, namely a conversion relation from a three-dimensional space to an image pixel plane;
step 4, constructing a decoupled visual servo control model of the rotor unmanned aerial vehicle through the obtained visual model of the rotor unmanned aerial vehicle, wherein the decoupled visual servo control model comprises a visual servo gain value;
step 5, establishing a single-step SARSA learning model for adjusting the servo gain; SARSA learning is used to adjust the visual servo gain value of the rotor unmanned aerial vehicle obtained in step 4;
1) setting the state space; after the contour of the target is extracted by the image feature extraction algorithm, the target is simplified to its central feature point; the absolute values of the errors between the current feature points and the target feature points are calculated and summed, and each range of this sum defines a state;
2) setting the action space; an initial value λ* is selected as the initial value of the servo gain by analyzing the effect of different servo gains; the size of the action set is 2n_a + 1, and the action set forms an arithmetic progression with a set common difference d_a, i.e. the action set A = {a_i | i = 1, 2, 3, ..., 2n_a + 1}; the servo gains of the linear velocity and the angular velocity are adjusted separately;
3) setting the reward function; the reward function covers three cases: reaching the expected target, losing track of the target, and all other situations; if the feature error of each dimension is smaller than a threshold value δ, the quad-rotor drone is considered to have reached the target position and the highest reward is given; if, after feature extraction on the real-time image captured by the quad-rotor unmanned aerial vehicle, feature points are missing compared with the feature points of the target image, the unmanned aerial vehicle is considered to have lost the target and the return value is negative; in all other situations the reward depends on how close the quad-rotor drone is to the target;
4) setting the single-step SARSA learning iterative algorithm; an iterative formula for the servo gain is set separately in the two spaces of linear-velocity gain and angular-velocity gain; the iterative process of the servo-gain algorithm is set according to Q-learning, uses the iterative formula of the servo gain, and completes the iterative updating of the servo gain through the iterative algorithm;
5) setting the learning rules; the maximum time spent in one learning round of single-step SARSA learning is set to 400 time slices; in each round the quadrotor is placed at a random position within the feasible range such that all targets are visible after it takes off to 1.0 m, and 5000 rounds constitute one training run; first, if the quadrotor still has not reached the designated position 400 time slices after leaving the initial position, it is forced back to the starting point for the next round; second, if the feature points are lost because of the quadrotor's motion, the current round is ended and the next round is started; third, if, during its motion, the quadrotor stays within 5 pixels of the target position for a certain time, the target point is considered reached and the round is ended; fourth, the servo gain is updated after each round is finished;
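The single-step SARSA machinery of step 5 (an action set forming an arithmetic progression of gain adjustments, ε-greedy selection, on-policy update) can be sketched as follows; all numeric parameters are assumed for illustration and are not the patent's values:

```python
# Illustrative single-step SARSA for tuning a servo-gain adjustment.
import random

n_a, d_a = 2, 0.05   # assumed half-width and common difference of the action set
actions = [(i - n_a) * d_a for i in range(2 * n_a + 1)]  # size 2*n_a + 1
Q = {}               # Q[(state, action_index)] -> value, default 0.0

def choose(state, eps=0.1):
    """Epsilon-greedy action selection over the gain-adjustment set."""
    if random.random() < eps:
        return random.randrange(len(actions))
    return max(range(len(actions)), key=lambda a: Q.get((state, a), 0.0))

def sarsa_update(s, a, r, s2, a2, alpha=0.1, gamma=0.9):
    """Single-step on-policy update: Q(s,a) += alpha*[r + gamma*Q(s',a') - Q(s,a)]."""
    q, q2 = Q.get((s, a), 0.0), Q.get((s2, a2), 0.0)
    Q[(s, a)] = q + alpha * (r + gamma * q2 - q)

s, a = 0, choose(0)
sarsa_update(s, a, 1.0, 1, choose(1))   # one learning step of one round
```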
step 6, fuzzy control rules; after the SARSA-learning servo-gain adjustment model is established in step 5, fuzzy control is used for adaptive adjustment of the learning rate; the basic rule of this adaptive adjustment is: if the gain learned by the agent increases the feature error, the learning rate is reduced, otherwise the learning rate is increased; fuzzy control changes the learning rate of the reinforcement learning as follows: the rate of change of the feature error is taken as the observed quantity and fuzzified, a fuzzy control rule using "max-min composition" is set, the observed quantity is fed into the rule to obtain the controlled quantity (the learning rate), and the final learning rate is obtained by defuzzification.
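A heavily simplified sketch of the step-6 adjustment is given below: the rate of change of the feature error is fuzzified, max-min inference applies the rule "error growing → small learning rate, error shrinking → large learning rate", and centroid defuzzification yields the learning rate. The membership functions and output levels are assumed for illustration:

```python
# Illustrative Mamdani-style fuzzy adjustment of the reinforcement-learning rate.

def tri(x, a, b, c):
    """Triangular membership function on [a, c] peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def learning_rate(error_rate):
    """Max-min inference with centroid defuzzification."""
    # Fuzzify the observed rate of change of the feature error.
    shrinking = tri(error_rate, -2.0, -1.0, 0.0)   # error decreasing
    steady    = tri(error_rate, -1.0,  0.0, 1.0)
    growing   = tri(error_rate,  0.0,  1.0, 2.0)   # error increasing
    # Rule firing strengths aggregated onto assumed output levels.
    outputs = {0.5: shrinking, 0.3: steady, 0.1: growing}
    num = sum(level * mu for level, mu in outputs.items())
    den = sum(outputs.values())
    return num / den if den > 0 else 0.3           # fall back to the middle rate

learning_rate(-1.0)  # error shrinking fast -> largest rate, 0.5
learning_rate(1.0)   # error growing fast  -> smallest rate, 0.1
```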
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810855339.8A CN109143855B (en) | 2018-07-31 | 2018-07-31 | Visual servo control method of unmanned gyroplane based on fuzzy SARSA learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109143855A CN109143855A (en) | 2019-01-04 |
CN109143855B true CN109143855B (en) | 2021-04-02 |
Family
ID=64798489
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109143855B (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right |
Effective date of registration: 20210926 Address after: 710072 No. 4689, block B, Feixiang Zhongchuang space, West University of technology, 4 / F, innovation building, No. 127, Youyi West Road, Beilin District, Xi'an City, Shaanxi Province Patentee after: Xi'an Liuyi FeiMeng Information Technology Co.,Ltd. Address before: 710072 No. 127 Youyi West Road, Shaanxi, Xi'an Patentee before: Northwestern Polytechnical University |
|
TR01 | Transfer of patent right |