CN109227550A - A robotic arm control method based on an RBF neural network - Google Patents

A robotic arm control method based on an RBF neural network

Info

Publication number
CN109227550A
CN109227550A (application number CN201811338287.3A)
Authority
CN
China
Prior art keywords
network
learning
mechanical arm
behavior
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811338287.3A
Other languages
Chinese (zh)
Inventor
曲兴田
田农
王鑫
杜雨欣
张昆
李金来
刘博文
王学旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jilin University
Original Assignee
Jilin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jilin University
Priority to CN201811338287.3A
Publication of CN109227550A
Legal status: Pending

Classifications

    • B - PERFORMING OPERATIONS; TRANSPORTING
    • B25 - HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J - MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J 9/00 - Programme-controlled manipulators
    • B25J 9/16 - Programme controls
    • B25J 9/1602 - Programme controls characterised by the control system, structure, architecture
    • B25J 9/1605 - Simulation of manipulator lay-out, design, modelling of manipulator
    • B25J 9/1628 - Programme controls characterised by the control loop
    • B25J 9/163 - Programme controls characterised by the control loop: learning, adaptive, model based, rule based expert control

Landscapes

  • Engineering & Computer Science (AREA)
  • Robotics (AREA)
  • Mechanical Engineering (AREA)
  • Automation & Control Theory (AREA)
  • Feedback Control In General (AREA)

Abstract

The invention discloses a robotic arm control method based on an RBF neural network. The method is as follows: Step 1, provide a cognitive learning model mechanism for the robotic arm; Step 2, propose a behavior cognitive model and hybrid learning algorithm based on the cerebellum and basal ganglia; Step 3, establish, using an artificial neural network and a reinforcement learning method, a mathematical model that enables the robotic arm to learn autonomously; Step 4, build a robotic arm simulation experiment model in Matlab; Step 5, verify the robotic arm control method based on the RBF neural network. Advantages: the method is suitable not only for robotic arms but also for other mechanical fields, and it can be applied in other control domains; it is easy to apply and can greatly reduce the workload of programmers; a robotic arm with autonomous learning ability will be more competitive in the future.

Description

A robotic arm control method based on an RBF neural network
Technical field
The present invention relates to a robotic arm control method, and in particular to a robotic arm control method based on an RBF neural network.
Background technique
At present, intelligence is the basis on which robots depend and develop, and in robot control systems the learning mechanism and learning ability are the key. Simulating the learning mechanism of an intelligent agent so that the robot, like a living organism, automatically acquires new knowledge and skills through continuous training and achieves self-improvement is a hot topic in the field of robot control.
In practical engineering, the payload of a robotic arm changes, and many parameters cannot be predicted accurately during motion. The adaptive control method based on an RBF network has the advantage of not requiring prior knowledge of unknown parameters: for example, it does not need to know the mass of the load, the position of the arm's end effector, or the force exerted on the grasped object, so the neural network does not have to be trained offline. An RBF network can also identify the model error of the robot, guarantee the stability of the closed loop, and provide high-performance tracking. RBF networks therefore have high practical value for controlling the robotic arm as a complex system.
Summary of the invention
The purpose of the invention is to provide a cognitive learning model for a robotic arm and to propose a cerebellum-basal ganglia operant conditioning learning algorithm based on a radial basis function network, so that the robotic arm can learn autonomously and can therefore be controlled better.
The robotic arm control method based on an RBF neural network provided by the invention proceeds as follows:
Step 1: according to the working principles of the modules of the human brain's cognitive system and the mechanism of operant conditioning, provide a cognitive learning model mechanism for the robotic arm;
Step 2: propose a behavior cognitive model and a hybrid learning algorithm based on the cerebellum and basal ganglia;
Step 3: based on the cerebellum-basal ganglia operant conditioning learning algorithm designed on a radial basis function network, establish, using an artificial neural network and a reinforcement learning method, a mathematical model that enables the robotic arm to learn autonomously;
Step 4: use the cerebellum-basal ganglia operant conditioning cognitive learning model based on the radial basis function network to control the robotic arm, and build a robotic arm simulation experiment model in Matlab;
Step 5: test feasibility in Matlab by changing parameters and variables, and verify the robotic arm control method based on the RBF neural network.
Beneficial effects of the present invention:
(1) The invention proposes a cognitive learning model whose main learning mechanism is cerebellum-basal ganglia operant conditioning; it is suitable not only for robotic arms but also for other mechanical fields.
(2) The behavior cognitive mathematical model based on the cerebellum and basal ganglia is derived and optimized, and can be applied in other control fields.
(3) With the cerebellum-basal ganglia operant conditioning learning algorithm designed on a radial basis function network, the mathematical model of robotic arm autonomous learning established using an artificial neural network and a reinforcement learning method is more intelligent and easier to apply, and can greatly reduce the workload of programmers.
(4) Compared with existing robotic arm control methods, the invention is more forward-looking; a robotic arm with autonomous learning ability will be more competitive in the future.
Description of the drawings
Fig. 1 is a schematic diagram of the model structure whose main learning mechanism is cerebellum-basal ganglia operant conditioning.
Fig. 2 is a schematic diagram of the radial basis function neuron model.
Fig. 3 is a schematic diagram of the radial basis function network structure.
Fig. 4 is a flow chart of the K-means clustering algorithm.
Fig. 5 is a flow chart of the cognitive learning algorithm.
Fig. 6 shows the result of the program in which the RBF network fits the training sample points.
Fig. 7 shows the training time and parameters.
Fig. 8 is the training error performance plot.
Fig. 9 shows the output when spread = 0.5.
Fig. 10 is the error performance plot when spread = 0.5.
Fig. 11 shows the output when spread = 5.
Fig. 12 is the error performance plot when spread = 5.
Specific embodiments
Please refer to Fig. 1 to Fig. 12.
The robotic arm control method based on an RBF neural network provided by the invention proceeds as follows:
Step 1: according to the working principles of the modules of the human brain's cognitive system and the mechanism of operant conditioning, provide a cognitive learning model mechanism for the robotic arm.
According to the working mechanism of each part of the human brain, a cognitive learning model whose main learning mechanism is cerebellum-basal ganglia operant conditioning is proposed; through a behavior network, an evaluation network and a supervisor, the multi-agent system learns continuously.
As shown in Fig. 1, the behavior network is realized jointly by the cerebellum module and the basal ganglia module, and the outward exploratory behavior is realized through probabilistic action selection. The cerebellum module is in charge of supervised learning and receives signals from the supervisor. The supervised behavior and the probabilistic behavior are weighted by a coordinating factor to form a composite action that interacts with the external environment. When a positive learning effect is obtained, a reward signal is given; when a negative learning effect is obtained, a punishment signal is given. After the basal ganglia module receives the reward/punishment signal, it outputs the result to the behavior network for the next round of learning. Through repeated iterations and learning, the behavior network is continuously adjusted online; the intelligent system collects a large amount of behavior-state and training data, and this exploration information also becomes the learning database of the supervisor. Through operant conditioning training, the behavior network gradually finds the behavior best suited to itself.
Step 2: propose a behavior cognitive model and a hybrid learning algorithm based on the cerebellum and basal ganglia.
The core of the model's hybrid learning algorithm is that the exploratory behavior a_e and the supervised behavior a_s are combined by a weighted sum into the composite behavior a_f, that is:
a_f ← ω·a_e + (1 - ω)·a_s   (1)
1) The probabilistic action selection uses the behavior policy π_A(s), which is a mapping from states to actions and is approximated by an RBF network with parameter θ. As in a thermodynamic system, the randomness of the state transitions of the multi-agent system exhibits a statistical regularity, so the exploratory action selection is made to obey a probability distribution, namely the Boltzmann-Gibbs distribution:
P(a_e | s) = exp(-ε(s)/(K_B·T)) / Z   (2)
where T is the thermodynamic temperature, K_B is the Boltzmann constant, exp(-ε(s)/(K_B·T)) is the Boltzmann factor, and Z is the partition function.
Deducing from the formula, the exploratory action a_e replaces the state s, with ε(s) = ε(a_e) = (a_e - a_A)^2; T expresses the degree of behavioral exploration, i.e. the higher the temperature, the greater the exploration, and for each fixed T the system has a corresponding equilibrium point.
2) The positive or negative effect of a behavior is evaluated with the value function V(s), which is approximated by an RBF network:
V(s) = E{ r_{t+1} + γ·V(s_{t+1}) }   (3)
The secondary evaluation signal δ is estimated from the reward/punishment information r_{t+1} and the value V(s_{t+1}) produced by the next iteration:
δ = r_{t+1} + γ·V(s_{t+1}) - V(s_t)   (4)
where 0 < γ < 1 is the evaluation reward/punishment factor.
3) In the model the supervisor is given a prior knowledge set, which serves as the expected mapping of the behavior network. The update of the parameter θ in the behavior policy π_A(s) is realized jointly by the cerebellum module and the basal ganglia module, that is:
θ ← θ + ω·Δθ_BG + (1 - ω)·Δθ_CB   (5)
The error criterion for weight adjustment and the corresponding gradient-descent learning rule for the network weights are defined with η ∈ [0, 1] as the learning rate and δ as the secondary evaluation signal.
4) The coordinating factor ω indicates the proportion taken by the cerebellum's supervised learning in the cognitive process of the behavior network. In the initial stage of learning control the error of the probabilistic behavior is large and the state information collected by the behavior network is sparse and inaccurate, so the supervisor's supervised learning takes a larger proportion; as the number of iterations grows, the roles of the cerebellum and the basal ganglia change in the later stage: the effect of the cerebellum module's supervisor in the learning process decreases continuously and the reinforcement mechanism becomes dominant. The coordinating factor is therefore expressed in an exponentially increasing form.
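The iteration described in items 1) to 4) can be summarized in a short MATLAB sketch. This is only a minimal illustration under the formulas above: valueRBF, gradPolicy, supervisor, stepEnv and phi are hypothetical placeholders (standing for the RBF evaluation network, the policy gradient, the prior-knowledge supervisor, the controlled plant and the RBF feature vector), s0 and N are assumed to be given, and the exact forms of the weight increments and of the exponential coordinating-factor schedule are assumptions rather than the patent's own formulas.

% Minimal sketch of one run of the cerebellum-basal ganglia hybrid learning
% algorithm described in items 1)-4). All helper functions are placeholders.
gamma = 0.9;   eta = 0.1;   kB = 1;   T = 2;   k = 0.05;   % illustrative values
theta = zeros(10, 1);   w = zeros(10, 1);   % behavior-network / evaluation-network weights
s = s0;                                     % initial state

actions = linspace(-1, 1, 41);              % discretized candidate actions
for n = 1:N
    omega = 1 - exp(-k*n);                  % coordinating factor, grows exponentially

    % 1) probabilistic (Boltzmann-Gibbs) selection of the exploratory action
    a_A = theta'*phi(s);                    % action suggested by the behavior network
    p   = exp(-(actions - a_A).^2/(kB*T));  % Boltzmann factors, eps(a_e) = (a_e - a_A)^2
    p   = p/sum(p);                         % normalize by the partition function Z
    a_e = actions(find(rand <= cumsum(p), 1));

    a_s = supervisor(s);                    % supervised action from prior knowledge
    a_f = omega*a_e + (1 - omega)*a_s;      % composite action, formula (1)

    % 2) secondary evaluation signal from the reward/punishment information
    [s1, r] = stepEnv(s, a_f);
    delta   = r + gamma*valueRBF(w, s1) - valueRBF(w, s);   % formula (4)
    w = w + eta*delta*phi(s);               % evaluation-network update (assumed TD-style rule)

    % 3) joint update of the behavior-network parameter theta, formula (5)
    dBG   = eta*delta*gradPolicy(theta, s, a_e);       % basal-ganglia (reinforcement) increment
    dCB   = eta*(a_s - a_e)*gradPolicy(theta, s, a_e); % cerebellum (supervised) increment
    theta = theta + omega*dBG + (1 - omega)*dCB;
    s     = s1;
end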
Step 3: based on the cerebellum-basal ganglia operant conditioning learning algorithm designed on a radial basis function network, establish, using an artificial neural network and a reinforcement learning method, a mathematical model that enables the robotic arm to learn autonomously.
The mathematical model of autonomous learning is realized with an RBF neural network. The RBF neural network has a three-layer structure: an input layer, a hidden layer and an output layer, matching the "sensation-association-reaction" architecture. Fig. 2 shows the radial basis function neuron model. The input layer corresponds to the nodes of the sensory neurons, the hidden layer to the nodes of the association neurons, and the output layer to the nodes of the reaction neurons. The input layer only transmits the signal; after the signal is passed from the input layer to the hidden layer, radial basis functions are used as the "bases" of the hidden units, which process and transform the signal, and the connection weights between these two layers are all 1. The hidden layer uses a nonlinear optimization strategy, while the output layer uses a linear optimization strategy. Fig. 3 shows the structure of the radial basis function network.
The learning algorithm of the RBF neural network needs to determine three sets of parameters: the centers of the basis functions, their variances, and the weights from the hidden layer to the output layer.
1) The learning of the centers t_i (i = 1, 2, …, I) of the radial basis functions uses the K-means clustering algorithm. Assume there are I cluster centers (the value of I is determined from prior knowledge) and let t_i(n) (i = 1, 2, …, I) be the centers of the basis functions at the n-th iteration. The K-means clustering algorithm proceeds as follows:
Step 1: initialize the cluster centers, i.e. randomly select I different samples from the training set, based on experience, as the initial centers t_i(0) (i = 1, 2, …, I), and set the iteration step n = 0;
Step 2: randomly input a training sample X_k;
Step 3: find the center nearest to the training sample X_k, i.e. find i(X_k) such that
i(X_k) = arg min_i ||X_k - t_i(n)||, i = 1, 2, …, I   (10)
Step 4: update the cluster centers; the addition of X_k changes the center of the i-th class, and the new cluster centers are
t_i(n+1) = t_i(n) + η[X_k(n) - t_i(n)],  i = i(X_k)
t_i(n+1) = t_i(n),  otherwise   (11)
Step 5: judge whether the algorithm has converged. Usually a threshold is set for the change of the cluster-center values; the change of the centers is computed, and if it is smaller than this threshold the computation stops. If the centers are still changing, the algorithm has not converged and jumps back to Step 2 to continue iterating. The final centers are taken as t_i(n). Fig. 4 is a flow chart of the K-means clustering algorithm.
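A minimal MATLAB sketch of this K-means procedure for choosing the RBF centers is given below; the sample data X, the number of centers I and the stopping test are illustrative choices, not values from the patent.

% Minimal sketch of the K-means procedure above for selecting the RBF centers.
% X (training samples), I and the tolerance are illustrative values only.
X   = rand(2, 100);                  % 100 two-dimensional training samples (example data)
I   = 5;                             % number of cluster centers (from prior knowledge)
eta = 0.1;                           % center learning rate
tol = 1e-4;                          % convergence threshold on center movement

t = X(:, randperm(size(X, 2), I));   % Step 1: I distinct random samples as initial centers
for n = 1:1000
    t_old = t;
    Xk = X(:, randi(size(X, 2)));            % Step 2: randomly input one training sample
    [~, i] = min(sum((t - Xk).^2, 1));       % Step 3: nearest center, formula (10)
    t(:, i) = t(:, i) + eta*(Xk - t(:, i));  % Step 4: move only that center, formula (11)
    if max(abs(t(:) - t_old(:))) < tol       % Step 5: stop once the centers barely move
        break;
    end
end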
2) The variances σ_i (i = 1, 2, …, I) of the radial basis functions.
Once the centers are fixed, the variances of the basis functions must be determined. The basis function is a Gaussian function:
G(||X - t_i||) = exp(-||X - t_i||^2 / (2σ_i^2)),  i = 1, 2, …, I
and the variance is taken as σ_i = d_max / √(2I), where d_max is the maximum distance between the centers and I is the number of hidden units.
3) The learning of the weights w_ij (i = 1, 2, …, I; j = 1, 2, …, J) of the radial basis functions.
The neurons of the RBF network's output layer compute a weighted sum of the hidden-layer outputs, and the actual output of the RBF network is
Y(n) = G(n)·W(n)   (13)
Each neuron of the input layer corresponds to one input variable; letting the number of input neurons be n, the input vector is x = (x_1, x_2, …, x_n)^T. Each node of the hidden layer corresponds to one Gaussian basis function; with j hidden nodes, the hidden-layer output is h = [h_j]^T, where h_j, the output of the j-th hidden neuron, is h_j = exp(-||x - c_j||^2 / (2 b_j^2)). Here c_j is the coordinate vector of the center of the j-th hidden neuron's Gaussian basis function, c = (c_1, c_2, …, c_j)^T, and b_j is the width of the j-th hidden neuron's Gaussian basis function, with width vector b = (b_1, b_2, …, b_j)^T. In the third layer, i.e. the output layer, the neural network weights are w = [w_1, w_2, …, w_m]^T and the network output is y(t) = w^T·h = w_1·h_1 + … + w_m·h_m. The error of the l-th output with respect to the ideal output is e_l = y_l^d - y_l, and the error index over the whole sample is E = (1/2)·Σ_l e_l^2.
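A short MATLAB sketch of this forward pass (Gaussian hidden units followed by a linear output layer) is given below; all numerical values are illustrative.

% Forward pass of the RBF network: Gaussian hidden outputs and weighted sum.
x = [0.2; -0.5];                    % input vector (n = 2 here)
c = [-1 -0.5 0 0.5 1;               % centers c_j of the hidden Gaussian units
     -1 -0.5 0 0.5 1];
b = 10*ones(1, 5);                  % widths b_j of the Gaussian basis functions
w = [0.1; -0.3; 0.7; 0.2; -0.1];    % output-layer weights

h = exp(-sum((x - c).^2, 1)./(2*b.^2))';   % hidden-layer outputs h_j
y = w'*h;                                  % network output y = w'*h, cf. formula (13)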
The behavior network and the evaluation network mentioned earlier in the model both use the same RBF network structure; the input is the initial state s_0, the weights of the behavior network are denoted by θ, and the weights of the evaluation network are denoted by w. Fig. 5 is a flow chart of the cognitive learning algorithm.
Step 4: use the cerebellum-basal ganglia operant conditioning cognitive learning model based on the radial basis function network to control the robotic arm, and build a robotic arm simulation experiment model in Matlab.
A multi-joint robotic arm is a nonlinear system; here it is reduced to the ideal model of a two-joint arm and controlled with the computed torque method:
M(q)q'' + C(q, q')q' + G(q) = τ + d   (15)
where q = (q_1, q_2)^T is the joint displacement vector, M(q) is the 2×2 positive-definite inertia matrix of the arm, τ = (τ_1, τ_2)^T is the vector of torques acting on the joints, C(q, q') is the 2×2 matrix of centrifugal, Coriolis and friction terms, G(q) is the 2×1 gravity term, and d is an unknown additional disturbance, which is neglected here. In practical engineering the inertia matrix, the centrifugal/Coriolis terms and the gravity term of a robotic arm are usually unknown, so M(q), C(q, q') and G(q) are each approximated with an RBF network.
The parameters are set as follows: arm lengths: upper-arm length l_1 = forearm length l_2 = 0.5 m; initial system state q_0 = [0, 0]^T, q'_0 = [0, 0]^T; the Gaussian function parameters are c_i = [-1, -0.5, 0, 0.5, 1] and width b = 10; the number of hidden-layer nodes is 10; the initial weight vector w of each node is set to 0; and the adaptive-law gains are Γ_M = 100, Γ_C = 100, Γ_G = 100.
The arm is trained on the given sample points, and after the trajectory curve has been fitted it moves along the trajectory.
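A minimal MATLAB sketch of this set-up is given below. Only the numerical values (link lengths, initial state, Gaussian centers and width, zero initial weights, adaptive gains) come from the description; the way a shared Gaussian regressor is mapped to estimates of M(q), C(q, q') and G(q), and the use of five centers per input dimension rather than ten hidden nodes, are simplifying assumptions.

% Sketch of the simulation set-up described above; the regressor-to-matrix
% mapping and the number of centers are simplifying assumptions.
l1 = 0.5;  l2 = 0.5;                        % upper-arm and forearm lengths (m)
q  = [0; 0];   dq = [0; 0];                 % initial joint positions and velocities
ci = repmat([-1 -0.5 0 0.5 1], 2, 1);       % Gaussian centers over the 2-D joint space
b  = 10;                                    % Gaussian width
GammaM = 100;  GammaC = 100;  GammaG = 100; % adaptive-law gains

h  = exp(-sum((q - ci).^2, 1)/(2*b^2))';    % shared hidden-layer regressor for input q
WM = zeros(numel(h), 4);                    % initial weights of the M(q) network (2x2 entries)
WC = zeros(numel(h), 4);                    % initial weights of the C(q,q') network
WG = zeros(numel(h), 2);                    % initial weights of the G(q) network

Mhat = reshape(h'*WM, 2, 2);                % current estimate of M(q) (zero before adaptation)
Ghat = (h'*WG)';                            % current estimate of G(q)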
In Matlab, a radial basis function network can be created with the newrbe() function, called as follows:
net = newrbe(P, T, spread)
where P is an R×Q matrix of input vectors, T is an S×Q matrix of desired output vectors (the target values), R is the dimension of the input vectors, Q is the number of training samples, and S is the dimension of the output vectors; spread is the spread constant of the radial basis functions, with a default value of 1. If nodes are to be added to the radial basis function network incrementally, until the mean squared error meets the requirement, the newrb() function with additional parameters is used. Its syntax is:
net = newrb(P, T, goal, spread, MN, DF)
where goal is the target mean squared error, with a default value of 0; MN is the maximum number of hidden neurons, with a default value of Q; and DF specifies how many neurons are added between progress displays.
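As a minimal illustration of the difference between the two functions (with toy sine data, not the trajectory samples used below): newrbe places one neuron on every training sample and fits the samples exactly, while newrb adds neurons incrementally until the error goal is met.

% Toy comparison of newrbe (exact design) and newrb (incremental design).
P = -1:0.1:1;                 % training inputs (illustrative)
T = sin(2*pi*P);              % training targets (illustrative)

netExact = newrbe(P, T, 1);       % one neuron per sample, exact fit, spread = 1
netIncr  = newrb(P, T, 0.01, 1);  % add neurons until the MSE goal 0.01 is reached

Ptest = -1:0.01:1;
plot(Ptest, sim(netExact, Ptest), Ptest, sim(netIncr, Ptest), P, T, 'o');
legend('newrbe', 'newrb', 'training data');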
According to the trajectory required by the robotic arm, 21 training samples are given. The raw data are defined as follows:
x = 0:20;
y = [1, 3, 4, 6, 9, 14, 21, 29, 38, 48, 58, 66, 73, 79, 85, 89, 93, 95, 97, 99, 100];
Next the network is designed and then tested, as sketched below.
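The sketch below reconstructs the design and test steps from the data and function calls given above; the plotting details and variable names are illustrative.

% Sketch of the network design and test on the 21 trajectory samples.
x = 0:20;
y = [1 3 4 6 9 14 21 29 38 48 58 66 73 79 85 89 93 95 97 99 100];

goal   = 0;                        % target mean squared error
spread = 1;                        % spread constant of the radial basis functions

tic;
net = newrb(x, y, goal, spread);   % design the RBF network on the training samples
time_cost = toc;                   % training time

xt = 0:0.5:20;                     % test inputs with spacing 0.5
yt = sim(net, xt);                 % network output on the test inputs

plot(x, y, 'o', xt, yt, '-');
xlabel('x'); ylabel('y');
legend('training samples', 'RBF network output');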
Step 5: test feasibility in Matlab by changing parameters and variables, and verify the robotic arm control method based on the RBF neural network.
1) Results of the test with the initial values and preliminary analysis
The initial variable parameters are set as follows: mean squared error goal = 0; spread constant of the radial basis functions spread = 1.
The raw data are the 21 data points of x from 0 to 20, with a spacing of 1 between points; the test data use x from 0 to 20 with a spacing of 0.5. Fig. 6 shows the result of the program in which the RBF network fits the training sample points.
The training time is time_cost = 1.7719 s. Fig. 7 lists the training time and parameters. Fig. 8 is the training error performance plot.
The command line prints the process of adding hidden nodes and the decrease of the mean squared error (MSE):
NEWRB, neurons=0, MSE=1349.25
NEWRB, neurons=2, MSE=734.587
NEWRB, neurons=3, MSE=544.161
NEWRB, neurons=4, MSE=296.501
NEWRB, neurons=5, MSE=205.978
NEWRB, neurons=6, MSE=138.405
NEWRB, neurons=7, MSE=95.8257
NEWRB, neurons=8, MSE=86.2323
NEWRB, neurons=9, MSE=57.6582
NEWRB, neurons=10, MSE=29.0238
NEWRB, neurons=11, MSE=10.2131
NEWRB, neurons=12, MSE=9.33213
NEWRB, neurons=13, MSE=5.79217
NEWRB, neurons=14, MSE=3.89062
NEWRB, neurons=15, MSE=0.882868
NEWRB, neurons=16, MSE=0.757605
NEWRB, neurons=17, MSE=0.165323
NEWRB, neurons=18, MSE=0.0372311
NEWRB, neurons=19, MSE=0.0358684
NEWRB, neurons=20, MSE=4.21501e-029
NEWRB, neurons=21, MSE=1.83917e-027
It can be seen that the RBF network fits the shape of the trajectory well.
2) Results of tests with different variables and preliminary analysis
Changing the training parameters of the radial basis functions also produces different simulation results, including the degree of fitting, the training error, and the number of hidden neurons needed to meet the requirement.
The network fitting is observed by changing the value of the spread constant: its initial value is spread = 1, and below it is changed to 0.5 and then to 5 and the output images are observed. A sketch of this comparison is given below.
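The sketch simply retrains the network with the three spread values and overlays the fitted curves; the plotting details are illustrative.

% Retrain the RBF network with spread = 0.5, 1 and 5 and compare the fits.
x = 0:20;
y = [1 3 4 6 9 14 21 29 38 48 58 66 73 79 85 89 93 95 97 99 100];
xt = 0:0.5:20;

figure; hold on;
plot(x, y, 'ko', 'DisplayName', 'training samples');
for s = [0.5 1 5]
    net = newrb(x, y, 0, s);                 % goal = 0, varying spread constant
    plot(xt, sim(net, xt), 'DisplayName', sprintf('spread = %g', s));
end
legend('show'); xlabel('x'); ylabel('y'); hold off;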
When spread = 0.5, the output image shown in Fig. 9 is obtained.
The training time is 1.6969 s. From the output image it can be seen that when the spread constant is 0.5 the trajectory is not fitted as well as when its value is 1: the spread is chosen too small, which causes overfitting. The error performance image is shown in Fig. 10.
The command line prints the process of adding hidden nodes and the decrease of the network mean squared error (MSE):
NEWRB, neurons=0, MSE=1349.25
NEWRB, neurons=2, MSE=1083.85
NEWRB, neurons=3, MSE=970.283
NEWRB, neurons=4, MSE=832.636
NEWRB, neurons=5, MSE=738.65
NEWRB, neurons=6, MSE=604.904
NEWRB, neurons=7, MSE=474.016
NEWRB, neurons=8, MSE=362.99
NEWRB, neurons=9, MSE=268.685
NEWRB, neurons=10, MSE=175.586
NEWRB, neurons=11, MSE=106.236
NEWRB, neurons=12, MSE=58.7686
NEWRB, neurons=13, MSE=29.4558
NEWRB, neurons=14, MSE=12.8321
NEWRB, neurons=15, MSE=4.65652
NEWRB, neurons=16, MSE=1.55368
NEWRB, neurons=17, MSE=0.546924
NEWRB, neurons=18, MSE=0.198805
NEWRB, neurons=19, MSE=0.0843713
NEWRB, neurons=20, MSE=9.93589e-029
It can be seen that after the spread constant is reduced, the mean squared error at every stage is significantly larger, and the MSE only falls below 1 when the number of neurons reaches 17.
When spread = 5, the output image shown in Fig. 11 is obtained.
The training time is 1.7474 s. When the spread constant is 5, the middle section of the trajectory is fitted well, but the deviation at the two ends is larger. The error performance image is shown in Fig. 12.
The conclusion drawn from the error performance plot is similar to that from the output image: the error decreases quickly, and between about x = 4 and x = 19 the deviation is small, but cusps with larger deviation appear at the two ends. The same conclusion can be drawn from the MSE listing below.
NEWRB, neurons=0, MSE=1349.25
NEWRB, neurons=2, MSE=105.28
NEWRB, neurons=3, MSE=29.3692
NEWRB, neurons=4, MSE=0.452869
NEWRB, neurons=5, MSE=0.411198
NEWRB, neurons=6, MSE=0.263052
NEWRB, neurons=7, MSE=0.0828302
NEWRB, neurons=8, MSE=0.0645026
NEWRB, neurons=9, MSE=0.0550501
NEWRB, neurons=10, MSE=0.0354879
NEWRB, neurons=11, MSE=0.028415
NEWRB, neurons=12, MSE=0.0274097
NEWRB, neurons=13, MSE=0.0228389
NEWRB, neurons=14, MSE=0.0164181
NEWRB, neurons=15, MSE=0.011896
NEWRB, neurons=16, MSE=0.0115202
NEWRB, neurons=17, MSE=0.0114105
NEWRB, neurons=18, MSE=0.00630194
NEWRB, neurons=19, MSE=0.0062908
NEWRB, neurons=20, MSE=4.891
This simulation experiment was completed on the Matlab platform by calling the RBF neural network toolbox functions. With 21 groups of training data the RBF network can be trained well; when the number of neurons is 15 or more the mean squared error is well controlled, enabling the robotic arm to achieve autonomous learning.

Claims (3)

1. A robotic arm control method based on an RBF neural network, characterized in that the method is as follows:
Step 1: according to the working principles of the modules of the human brain's cognitive system and the mechanism of operant conditioning, provide a cognitive learning model mechanism for the robotic arm;
Step 2: propose a behavior cognitive model and a hybrid learning algorithm based on the cerebellum and basal ganglia;
Step 3: based on the cerebellum-basal ganglia operant conditioning learning algorithm designed on a radial basis function network, establish, using an artificial neural network and a reinforcement learning method, a mathematical model that enables the robotic arm to learn autonomously;
Step 4: use the cerebellum-basal ganglia operant conditioning cognitive learning model based on the radial basis function network to control the robotic arm, and build a robotic arm simulation experiment model in Matlab;
Step 5: test feasibility in Matlab by changing parameters and variables, and verify the robotic arm control method based on the RBF neural network.
2. The robotic arm control method based on an RBF neural network according to claim 1, characterized in that the core of the hybrid learning algorithm in said Step 2 is that the exploratory behavior a_e and the supervised behavior a_s are combined by a weighted sum into the composite behavior a_f, that is:
a_f ← ω·a_e + (1 - ω)·a_s   (1)
1) the probabilistic action selection uses the behavior policy π_A(s), which is a mapping from states to actions and is approximated by an RBF network with parameter θ; as in a thermodynamic system, the randomness of the state transitions of the multi-agent system exhibits a statistical regularity, so the exploratory action selection is made to obey a probability distribution, namely the Boltzmann-Gibbs distribution:
P(a_e | s) = exp(-ε(s)/(K_B·T)) / Z   (2)
where T is the thermodynamic temperature, K_B is the Boltzmann constant, exp(-ε(s)/(K_B·T)) is the Boltzmann factor, and Z is the partition function;
deducing from the formula, the exploratory action a_e replaces the state s, with ε(s) = ε(a_e) = (a_e - a_A)^2; T expresses the degree of behavioral exploration, i.e. the higher the temperature, the greater the exploration, and for each fixed T the system has a corresponding equilibrium point;
2) the positive or negative effect of a behavior is evaluated with the value function V(s), which is approximated by an RBF network:
V(s) = E{ r_{t+1} + γ·V(s_{t+1}) }   (3)
the secondary evaluation signal δ is estimated from the reward/punishment information r_{t+1} and the value V(s_{t+1}) produced by the next iteration:
δ = r_{t+1} + γ·V(s_{t+1}) - V(s_t)   (4)
where 0 < γ < 1 is the evaluation reward/punishment factor;
3) in the model the supervisor is given a prior knowledge set, which serves as the expected mapping of the behavior network; the update of the parameter θ in the behavior policy π_A(s) is realized jointly by the cerebellum module and the basal ganglia module, that is:
θ ← θ + ω·Δθ_BG + (1 - ω)·Δθ_CB   (5)
the error criterion for weight adjustment and the corresponding gradient-descent learning rule for the network weights are defined with η ∈ [0, 1] as the learning rate and δ as the secondary evaluation signal;
4) the coordinating factor ω indicates the proportion taken by the cerebellum's supervised learning in the cognitive process of the behavior network; in the initial stage of learning control the error of the probabilistic behavior is large and the state information collected by the behavior network is sparse and inaccurate, so the supervisor's supervised learning takes a larger proportion; as the number of iterations grows, the roles of the cerebellum and the basal ganglia change in the later stage: the effect of the cerebellum module's supervisor in the learning process decreases continuously, the reinforcement mechanism becomes dominant, and the coordinating factor is expressed in an exponentially increasing form.
3. The robotic arm control method based on an RBF neural network according to claim 1, characterized in that the mathematical model of autonomous learning in said Step 3 is realized with an RBF neural network; the RBF neural network has a three-layer structure: an input layer, a hidden layer and an output layer, matching the "sensation-association-reaction" architecture; the input layer corresponds to the nodes of the sensory neurons, the hidden layer to the nodes of the association neurons, and the output layer to the nodes of the reaction neurons; the input layer only transmits the signal; after the signal is passed from the input layer to the hidden layer, radial basis functions are used as the "bases" of the hidden units, which process and transform the signal, and the connection weights between these two layers are all 1; the hidden layer uses a nonlinear optimization strategy, while the output layer uses a linear optimization strategy;
the learning algorithm of the RBF neural network needs to determine three sets of parameters: the centers of the basis functions, their variances, and the weights from the hidden layer to the output layer;
1) the learning of the centers t_i (i = 1, 2, …, I) of the radial basis functions uses the K-means clustering algorithm; assume there are I cluster centers, the value of I being determined from prior knowledge, and let t_i(n) (i = 1, 2, …, I) be the centers of the basis functions at the n-th iteration; the K-means clustering algorithm proceeds as follows:
Step 1: initialize the cluster centers, i.e. randomly select I different samples from the training set, based on experience, as the initial centers t_i(0) (i = 1, 2, …, I), and set the iteration step n = 0;
Step 2: randomly input a training sample X_k;
Step 3: find the center nearest to the training sample X_k, i.e. find i(X_k) such that
i(X_k) = arg min_i ||X_k - t_i(n)||, i = 1, 2, …, I   (10)
Step 4: update the cluster centers; the addition of X_k changes the center of the i-th class, and the new cluster centers are
t_i(n+1) = t_i(n) + η[X_k(n) - t_i(n)],  i = i(X_k)
t_i(n+1) = t_i(n),  otherwise   (11)
Step 5: judge whether the algorithm has converged; usually a threshold is set for the change of the cluster-center values, the change of the centers is computed, and if it is smaller than this threshold the computation stops; if the centers are still changing, the algorithm has not converged and jumps back to Step 2 to continue iterating; the final centers are taken as t_i(n);
2) the variances σ_i (i = 1, 2, …, I) of the radial basis functions:
once the centers are fixed, the variances of the basis functions must be determined; the basis function is a Gaussian function, G(||X - t_i||) = exp(-||X - t_i||^2 / (2σ_i^2)), and the variance is taken as σ_i = d_max / √(2I), where d_max is the maximum distance between the centers and I is the number of hidden units;
3) the learning of the weights w_ij (i = 1, 2, …, I; j = 1, 2, …, J) of the radial basis functions:
the neurons of the RBF network's output layer compute a weighted sum of the hidden-layer outputs, and the actual output of the RBF network is
Y(n) = G(n)·W(n)   (13)
each neuron of the input layer corresponds to one input variable; letting the number of input neurons be n, the input vector is x = (x_1, x_2, …, x_n)^T; each node of the hidden layer corresponds to one Gaussian basis function; with j hidden nodes, the hidden-layer output is h = [h_j]^T, where h_j, the output of the j-th hidden neuron, is h_j = exp(-||x - c_j||^2 / (2 b_j^2)); c_j is the coordinate vector of the center of the j-th hidden neuron's Gaussian basis function, c = (c_1, c_2, …, c_j)^T, and b_j is the width of the j-th hidden neuron's Gaussian basis function, i.e. the width vector is b = (b_1, b_2, …, b_j)^T; in the third layer, i.e. the output layer, the neural network weights are w = [w_1, w_2, …, w_m]^T, the network output is y(t) = w^T·h = w_1·h_1 + … + w_m·h_m, the error of the l-th output with respect to the ideal output is e_l = y_l^d - y_l, and the error index over the whole sample is E = (1/2)·Σ_l e_l^2;
the behavior network and the evaluation network mentioned earlier in the model both use the same RBF network structure; the input is the initial state s_0, the weights of the behavior network are denoted by θ, and the weights of the evaluation network are denoted by w.
CN201811338287.3A 2018-11-12 2018-11-12 A robotic arm control method based on an RBF neural network Pending CN109227550A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811338287.3A CN109227550A (en) 2018-11-12 2018-11-12 A robotic arm control method based on an RBF neural network


Publications (1)

Publication Number Publication Date
CN109227550A true CN109227550A (en) 2019-01-18

Family

ID=65078000

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811338287.3A Pending CN109227550A (en) A robotic arm control method based on an RBF neural network

Country Status (1)

Country Link
CN (1) CN109227550A (en)



Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101846974A (en) * 2010-03-30 2010-09-29 江苏六维物流设备实业有限公司 Piler neural network control technique
CN101804627A (en) * 2010-04-02 2010-08-18 中山大学 Redundant manipulator motion planning method
CN102501251A (en) * 2011-11-08 2012-06-20 北京邮电大学 Mechanical shoulder joint position control method with dynamic friction compensation
CN106406085A (en) * 2016-03-15 2017-02-15 吉林大学 Space manipulator trajectory tracking control method based on cross-scale model
CN108288093A (en) * 2018-01-31 2018-07-17 湖北工业大学 BP neural network Weighting, system and prediction technique, system
CN108594657A (en) * 2018-04-11 2018-09-28 福建省德腾智能科技有限公司 A kind of mechanical arm self-adaptation control method based on neural network

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109605377A (en) * 2019-01-21 2019-04-12 厦门大学 A kind of joint of robot motion control method and system based on intensified learning
CN109514564A (en) * 2019-01-22 2019-03-26 江西理工大学 A kind of compound quadratic form multi-joint mechanical arm method for optimally controlling
CN109514564B (en) * 2019-01-22 2021-11-30 江西理工大学 Optimal control method for composite quadratic multi-joint mechanical arm
CN110450155A (en) * 2019-07-30 2019-11-15 洛阳润信机械制造有限公司 A kind of optimum design method of the controller of multi-freedom Mechanism
CN110450155B (en) * 2019-07-30 2021-01-22 洛阳润信机械制造有限公司 Optimal design method for controller of multi-degree-of-freedom mechanical arm system
CN112223276A (en) * 2020-09-01 2021-01-15 上海大学 Multi-joint robot control method based on adaptive neural network sliding mode control
CN112223276B (en) * 2020-09-01 2023-02-10 上海大学 Multi-joint robot control method based on adaptive neural network sliding mode control


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20190118