CN109227550A - A robotic arm control method based on an RBF neural network - Google Patents

A robotic arm control method based on an RBF neural network

Info

Publication number
CN109227550A
CN109227550A (application number CN201811338287.3A)
Authority
CN
China
Prior art keywords
network
learning
mechanical arm
behavior
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811338287.3A
Other languages
Chinese (zh)
Inventor
曲兴田
田农
王鑫
杜雨欣
张昆
李金来
刘博文
王学旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jilin University
Original Assignee
Jilin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jilin University
Priority to CN201811338287.3A
Publication of CN109227550A
Legal status: Pending

Classifications

    • B - PERFORMING OPERATIONS; TRANSPORTING
    • B25 - HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J - MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J 9/00 - Programme-controlled manipulators
    • B25J 9/16 - Programme controls
    • B25J 9/1602 - Programme controls characterised by the control system, structure, architecture
    • B25J 9/1605 - Simulation of manipulator lay-out, design, modelling of manipulator
    • B25J 9/1628 - Programme controls characterised by the control loop
    • B25J 9/163 - Programme controls characterised by the control loop: learning, adaptive, model based, rule based expert control

Landscapes

  • Engineering & Computer Science (AREA)
  • Robotics (AREA)
  • Mechanical Engineering (AREA)
  • Automation & Control Theory (AREA)
  • Feedback Control In General (AREA)

Abstract

The invention discloses a robotic arm control method based on an RBF neural network. The method is as follows: Step 1, provide a cognitive learning model mechanism for the robotic arm; Step 2, propose a behavior cognitive model and hybrid learning algorithm based on the cerebellum and basal ganglia; Step 3, establish, using an artificial neural network and a reinforcement learning method, a mathematical model that enables the robotic arm to learn autonomously; Step 4, build a robotic arm simulation experiment model in Matlab; Step 5, verify the robotic arm control method based on the RBF neural network. Advantages: the method is suitable not only for robotic arms but also for other mechanical fields, and it can be applied in other control domains; it is easy to apply and can greatly reduce the workload of programmers; a robotic arm with autonomous learning ability will be more competitive in the future.

Description

A robotic arm control method based on an RBF neural network
Technical field
The present invention relates to a robotic arm control method, and in particular to a robotic arm control method based on an RBF neural network.
Background technique
At present, intelligence is the basis on which robots depend and develop, and in robot control systems the learning mechanism and learning ability are the key. Simulating the learning mechanism of an intelligent agent so that the robot, like a living organism, automatically acquires new knowledge and skills through continuous training and achieves self-improvement is a hot topic in the field of robot control.
In practical engineering, the payload of a robotic arm changes, and many parameters cannot be predicted accurately during motion. The adaptive control method based on an RBF network has the advantage of not requiring prior knowledge of unknown parameters: for example, it does not need to know the mass of the load, the position of the arm's end effector, or the force exerted on the grasped object, so the neural network does not have to be trained offline. An RBF network can also identify the model error of the robot, guarantee the stability of the closed loop, and provide high-performance tracking. RBF networks therefore have high practical value for controlling the robotic arm as a complex system.
Summary of the invention
The purpose of the invention is to provide a cognitive learning model for a robotic arm and to propose a cerebellum-basal ganglia operant conditioning learning algorithm based on a radial basis function network, so that the robotic arm can learn autonomously and can therefore be controlled better.
The robotic arm control method based on an RBF neural network provided by the invention proceeds as follows:
Step 1: according to the working principles of the modules of the human brain's cognitive system and the mechanism of operant conditioning, provide a cognitive learning model mechanism for the robotic arm;
Step 2: propose a behavior cognitive model and a hybrid learning algorithm based on the cerebellum and basal ganglia;
Step 3: based on the cerebellum-basal ganglia operant conditioning learning algorithm designed on a radial basis function network, establish, using an artificial neural network and a reinforcement learning method, a mathematical model that enables the robotic arm to learn autonomously;
Step 4: use the cerebellum-basal ganglia operant conditioning cognitive learning model based on the radial basis function network to control the robotic arm, and build a robotic arm simulation experiment model in Matlab;
Step 5: test feasibility in Matlab by changing parameters and variables, and verify the robotic arm control method based on the RBF neural network.
Beneficial effects of the present invention:
(1) The invention proposes a cognitive learning model whose main learning mechanism is cerebellum-basal ganglia operant conditioning; it is suitable not only for robotic arms but also for other mechanical fields.
(2) The behavior cognitive mathematical model based on the cerebellum and basal ganglia is derived and optimized, and can be applied in other control fields.
(3) With the cerebellum-basal ganglia operant conditioning learning algorithm designed on a radial basis function network, the mathematical model of robotic arm autonomous learning established using an artificial neural network and a reinforcement learning method is more intelligent and easier to apply, and can greatly reduce the workload of programmers.
(4) Compared with existing robotic arm control methods, the invention is more forward-looking; a robotic arm with autonomous learning ability will be more competitive in the future.
Description of the drawings
Fig. 1 is a schematic diagram of the model structure whose main learning mechanism is cerebellum-basal ganglia operant conditioning.
Fig. 2 is a schematic diagram of the radial basis function neuron model.
Fig. 3 is a schematic diagram of the radial basis function network structure.
Fig. 4 is a flow chart of the K-means clustering algorithm.
Fig. 5 is a flow chart of the cognitive learning algorithm.
Fig. 6 shows the result of the program in which the RBF network fits the training sample points.
Fig. 7 shows the training time and parameters.
Fig. 8 is the training error performance plot.
Fig. 9 shows the output when spread = 0.5.
Fig. 10 is the error performance plot when spread = 0.5.
Fig. 11 shows the output when spread = 5.
Fig. 12 is the error performance plot when spread = 5.
Specific embodiments
Please refer to Fig. 1 to Fig. 12.
The robotic arm control method based on an RBF neural network provided by the invention proceeds as follows:
Step 1: according to the working principles of the modules of the human brain's cognitive system and the mechanism of operant conditioning, provide a cognitive learning model mechanism for the robotic arm.
According to the working mechanism of each part of the human brain, a cognitive learning model whose main learning mechanism is cerebellum-basal ganglia operant conditioning is proposed; through a behavior network, an evaluation network and a supervisor, the multi-agent system learns continuously.
As shown in Fig. 1, the behavior network is realized jointly by the cerebellum module and the basal ganglia module, and the outward exploratory behavior is realized through probabilistic action selection. The cerebellum module is in charge of supervised learning and receives signals from the supervisor. The supervised behavior and the probabilistic behavior are weighted by a coordinating factor to form a composite action that interacts with the external environment. When a positive learning effect is obtained, a reward signal is given; when a negative learning effect is obtained, a punishment signal is given. After the basal ganglia module receives the reward/punishment signal, it outputs the result to the behavior network for the next round of learning. Through repeated iterations and learning, the behavior network is continuously adjusted online; the intelligent system collects a large amount of behavior-state and training data, and this exploration information also becomes the learning database of the supervisor. Through operant conditioning training, the behavior network gradually finds the behavior best suited to itself.
Step 2: propose a behavior cognitive model and a hybrid learning algorithm based on the cerebellum and basal ganglia.
The core of the model's hybrid learning algorithm is that the exploratory behavior a_e and the supervised behavior a_s are combined by a weighted sum into the composite behavior a_f, that is:
a_f ← ω·a_e + (1 - ω)·a_s   (1)
1) The probabilistic action selection uses the behavior policy π_A(s), which is a mapping from states to actions and is approximated by an RBF network with parameter θ. As in a thermodynamic system, the randomness of the state transitions of the multi-agent system exhibits a statistical regularity, so the exploratory action selection is made to obey a probability distribution, namely the Boltzmann-Gibbs distribution:
P(a_e | s) = exp(-ε(s)/(K_B·T)) / Z   (2)
where T is the thermodynamic temperature, K_B is the Boltzmann constant, exp(-ε(s)/(K_B·T)) is the Boltzmann factor, and Z is the partition function.
Deducing from the formula, the exploratory action a_e replaces the state s, with ε(s) = ε(a_e) = (a_e - a_A)^2; T expresses the degree of behavioral exploration, i.e. the higher the temperature, the greater the exploration, and for each fixed T the system has a corresponding equilibrium point.
2) The positive or negative effect of a behavior is evaluated with the value function V(s), which is approximated by an RBF network:
V(s) = E{ r_{t+1} + γ·V(s_{t+1}) }   (3)
The secondary evaluation signal δ is estimated from the reward/punishment information r_{t+1} and the value V(s_{t+1}) produced by the next iteration:
δ = r_{t+1} + γ·V(s_{t+1}) - V(s_t)   (4)
where 0 < γ < 1 is the evaluation reward/punishment factor.
3) In the model the supervisor is given a prior knowledge set, which serves as the expected mapping of the behavior network. The update of the parameter θ in the behavior policy π_A(s) is realized jointly by the cerebellum module and the basal ganglia module, that is:
θ ← θ + ω·Δθ_BG + (1 - ω)·Δθ_CB   (5)
The error criterion for weight adjustment and the corresponding gradient-descent learning rule for the network weights are defined with η ∈ [0, 1] as the learning rate and δ as the secondary evaluation signal.
4) The coordinating factor ω indicates the proportion taken by the cerebellum's supervised learning in the cognitive process of the behavior network. In the initial stage of learning control the error of the probabilistic behavior is large and the state information collected by the behavior network is sparse and inaccurate, so the supervisor's supervised learning takes a larger proportion; as the number of iterations grows, the roles of the cerebellum and the basal ganglia change in the later stage: the effect of the cerebellum module's supervisor in the learning process decreases continuously and the reinforcement mechanism becomes dominant. The coordinating factor is therefore expressed in an exponentially increasing form.
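The iteration described in items 1) to 4) can be summarized in a short MATLAB sketch. This is only a minimal illustration under the formulas above: valueRBF, gradPolicy, supervisor, stepEnv and phi are hypothetical placeholders (standing for the RBF evaluation network, the policy gradient, the prior-knowledge supervisor, the controlled plant and the RBF feature vector), s0 and N are assumed to be given, and the exact forms of the weight increments and of the exponential coordinating-factor schedule are assumptions rather than the patent's own formulas.

% Minimal sketch of one run of the cerebellum-basal ganglia hybrid learning
% algorithm described in items 1)-4). All helper functions are placeholders.
gamma = 0.9;   eta = 0.1;   kB = 1;   T = 2;   k = 0.05;   % illustrative values
theta = zeros(10, 1);   w = zeros(10, 1);   % behavior-network / evaluation-network weights
s = s0;                                     % initial state

actions = linspace(-1, 1, 41);              % discretized candidate actions
for n = 1:N
    omega = 1 - exp(-k*n);                  % coordinating factor, grows exponentially

    % 1) probabilistic (Boltzmann-Gibbs) selection of the exploratory action
    a_A = theta'*phi(s);                    % action suggested by the behavior network
    p   = exp(-(actions - a_A).^2/(kB*T));  % Boltzmann factors, eps(a_e) = (a_e - a_A)^2
    p   = p/sum(p);                         % normalize by the partition function Z
    a_e = actions(find(rand <= cumsum(p), 1));

    a_s = supervisor(s);                    % supervised action from prior knowledge
    a_f = omega*a_e + (1 - omega)*a_s;      % composite action, formula (1)

    % 2) secondary evaluation signal from the reward/punishment information
    [s1, r] = stepEnv(s, a_f);
    delta   = r + gamma*valueRBF(w, s1) - valueRBF(w, s);   % formula (4)
    w = w + eta*delta*phi(s);               % evaluation-network update (assumed TD-style rule)

    % 3) joint update of the behavior-network parameter theta, formula (5)
    dBG   = eta*delta*gradPolicy(theta, s, a_e);       % basal-ganglia (reinforcement) increment
    dCB   = eta*(a_s - a_e)*gradPolicy(theta, s, a_e); % cerebellum (supervised) increment
    theta = theta + omega*dBG + (1 - omega)*dCB;
    s     = s1;
end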
Step 3: based on the cerebellum-basal ganglia operant conditioning learning algorithm designed on a radial basis function network, establish, using an artificial neural network and a reinforcement learning method, a mathematical model that enables the robotic arm to learn autonomously.
The mathematical model of autonomous learning is realized with an RBF neural network. The RBF neural network has a three-layer structure: an input layer, a hidden layer and an output layer, matching the "sensation-association-reaction" architecture. Fig. 2 shows the radial basis function neuron model. The input layer corresponds to the nodes of the sensory neurons, the hidden layer to the nodes of the association neurons, and the output layer to the nodes of the reaction neurons. The input layer only transmits the signal; after the signal is passed from the input layer to the hidden layer, radial basis functions are used as the "bases" of the hidden units, which process and transform the signal, and the connection weights between these two layers are all 1. The hidden layer uses a nonlinear optimization strategy, while the output layer uses a linear optimization strategy. Fig. 3 shows the structure of the radial basis function network.
The learning algorithm of the RBF neural network needs to determine three sets of parameters: the centers of the basis functions, their variances, and the weights from the hidden layer to the output layer.
1) The learning of the centers t_i (i = 1, 2, …, I) of the radial basis functions uses the K-means clustering algorithm. Assume there are I cluster centers (the value of I is determined from prior knowledge) and let t_i(n) (i = 1, 2, …, I) be the centers of the basis functions at the n-th iteration. The K-means clustering algorithm proceeds as follows:
Step 1: initialize the cluster centers, i.e. randomly select I different samples from the training set, based on experience, as the initial centers t_i(0) (i = 1, 2, …, I), and set the iteration step n = 0;
Step 2: randomly input a training sample X_k;
Step 3: find the center nearest to the training sample X_k, i.e. find i(X_k) such that
i(X_k) = arg min_i ||X_k - t_i(n)||, i = 1, 2, …, I   (10)
Step 4: update the cluster centers; the addition of X_k changes the center of the i-th class, and the new cluster centers are
t_i(n+1) = t_i(n) + η[X_k(n) - t_i(n)],  i = i(X_k)
t_i(n+1) = t_i(n),  otherwise   (11)
Step 5: judge whether the algorithm has converged. Usually a threshold is set for the change of the cluster-center values; the change of the centers is computed, and if it is smaller than this threshold the computation stops. If the centers are still changing, the algorithm has not converged and jumps back to Step 2 to continue iterating. The final centers are taken as t_i(n). Fig. 4 is a flow chart of the K-means clustering algorithm.
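A minimal MATLAB sketch of this K-means procedure for choosing the RBF centers is given below; the sample data X, the number of centers I and the stopping test are illustrative choices, not values from the patent.

% Minimal sketch of the K-means procedure above for selecting the RBF centers.
% X (training samples), I and the tolerance are illustrative values only.
X   = rand(2, 100);                  % 100 two-dimensional training samples (example data)
I   = 5;                             % number of cluster centers (from prior knowledge)
eta = 0.1;                           % center learning rate
tol = 1e-4;                          % convergence threshold on center movement

t = X(:, randperm(size(X, 2), I));   % Step 1: I distinct random samples as initial centers
for n = 1:1000
    t_old = t;
    Xk = X(:, randi(size(X, 2)));            % Step 2: randomly input one training sample
    [~, i] = min(sum((t - Xk).^2, 1));       % Step 3: nearest center, formula (10)
    t(:, i) = t(:, i) + eta*(Xk - t(:, i));  % Step 4: move only that center, formula (11)
    if max(abs(t(:) - t_old(:))) < tol       % Step 5: stop once the centers barely move
        break;
    end
end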
2) The variances σ_i (i = 1, 2, …, I) of the radial basis functions.
Once the centers are fixed, the variances of the basis functions must be determined. The basis function is a Gaussian function:
G(||X - t_i||) = exp(-||X - t_i||^2 / (2σ_i^2)),  i = 1, 2, …, I
and the variance is taken as σ_i = d_max / √(2I), where d_max is the maximum distance between the centers and I is the number of hidden units.
3) The learning of the weights w_ij (i = 1, 2, …, I; j = 1, 2, …, J) of the radial basis functions.
The neurons of the RBF network's output layer compute a weighted sum of the hidden-layer outputs, and the actual output of the RBF network is
Y(n) = G(n)·W(n)   (13)
Each neuron of the input layer corresponds to one input variable; letting the number of input neurons be n, the input vector is x = (x_1, x_2, …, x_n)^T. Each node of the hidden layer corresponds to one Gaussian basis function; with j hidden nodes, the hidden-layer output is h = [h_j]^T, where h_j, the output of the j-th hidden neuron, is h_j = exp(-||x - c_j||^2 / (2 b_j^2)). Here c_j is the coordinate vector of the center of the j-th hidden neuron's Gaussian basis function, c = (c_1, c_2, …, c_j)^T, and b_j is the width of the j-th hidden neuron's Gaussian basis function, with width vector b = (b_1, b_2, …, b_j)^T. In the third layer, i.e. the output layer, the neural network weights are w = [w_1, w_2, …, w_m]^T and the network output is y(t) = w^T·h = w_1·h_1 + … + w_m·h_m. The error of the l-th output with respect to the ideal output is e_l = y_l^d - y_l, and the error index over the whole sample is E = (1/2)·Σ_l e_l^2.
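A short MATLAB sketch of this forward pass (Gaussian hidden units followed by a linear output layer) is given below; all numerical values are illustrative.

% Forward pass of the RBF network: Gaussian hidden outputs and weighted sum.
x = [0.2; -0.5];                    % input vector (n = 2 here)
c = [-1 -0.5 0 0.5 1;               % centers c_j of the hidden Gaussian units
     -1 -0.5 0 0.5 1];
b = 10*ones(1, 5);                  % widths b_j of the Gaussian basis functions
w = [0.1; -0.3; 0.7; 0.2; -0.1];    % output-layer weights

h = exp(-sum((x - c).^2, 1)./(2*b.^2))';   % hidden-layer outputs h_j
y = w'*h;                                  % network output y = w'*h, cf. formula (13)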
The behavior network and the evaluation network mentioned earlier in the model both use the same RBF network structure; the input is the initial state s_0, the weights of the behavior network are denoted by θ, and the weights of the evaluation network are denoted by w. Fig. 5 is a flow chart of the cognitive learning algorithm.
Step 4: use the cerebellum-basal ganglia operant conditioning cognitive learning model based on the radial basis function network to control the robotic arm, and build a robotic arm simulation experiment model in Matlab.
A multi-joint robotic arm is a nonlinear system; here it is reduced to the ideal model of a two-joint arm and controlled with the computed torque method:
M(q)q'' + C(q, q')q' + G(q) = τ + d   (15)
where q = (q_1, q_2)^T is the joint displacement vector, M(q) is the 2×2 positive-definite inertia matrix of the arm, τ = (τ_1, τ_2)^T is the vector of torques acting on the joints, C(q, q') is the 2×2 matrix of centrifugal, Coriolis and friction terms, G(q) is the 2×1 gravity term, and d is an unknown additional disturbance, which is neglected here. In practical engineering the inertia matrix, the centrifugal/Coriolis terms and the gravity term of a robotic arm are usually unknown, so M(q), C(q, q') and G(q) are each approximated with an RBF network.
The parameters are set as follows: arm lengths: upper-arm length l_1 = forearm length l_2 = 0.5 m; initial system state q_0 = [0, 0]^T, q'_0 = [0, 0]^T; the Gaussian function parameters are c_i = [-1, -0.5, 0, 0.5, 1] and width b = 10; the number of hidden-layer nodes is 10; the initial weight vector w of each node is set to 0; and the adaptive-law gains are Γ_M = 100, Γ_C = 100, Γ_G = 100.
The arm is trained on the given sample points, and after the trajectory curve has been fitted it moves along the trajectory.
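A minimal MATLAB sketch of this set-up is given below. Only the numerical values (link lengths, initial state, Gaussian centers and width, zero initial weights, adaptive gains) come from the description; the way a shared Gaussian regressor is mapped to estimates of M(q), C(q, q') and G(q), and the use of five centers per input dimension rather than ten hidden nodes, are simplifying assumptions.

% Sketch of the simulation set-up described above; the regressor-to-matrix
% mapping and the number of centers are simplifying assumptions.
l1 = 0.5;  l2 = 0.5;                        % upper-arm and forearm lengths (m)
q  = [0; 0];   dq = [0; 0];                 % initial joint positions and velocities
ci = repmat([-1 -0.5 0 0.5 1], 2, 1);       % Gaussian centers over the 2-D joint space
b  = 10;                                    % Gaussian width
GammaM = 100;  GammaC = 100;  GammaG = 100; % adaptive-law gains

h  = exp(-sum((q - ci).^2, 1)/(2*b^2))';    % shared hidden-layer regressor for input q
WM = zeros(numel(h), 4);                    % initial weights of the M(q) network (2x2 entries)
WC = zeros(numel(h), 4);                    % initial weights of the C(q,q') network
WG = zeros(numel(h), 2);                    % initial weights of the G(q) network

Mhat = reshape(h'*WM, 2, 2);                % current estimate of M(q) (zero before adaptation)
Ghat = (h'*WG)';                            % current estimate of G(q)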
In Matlab, a radial basis function network can be created with the newrbe() function, called as follows:
net = newrbe(P, T, spread)
where P is an R×Q matrix of input vectors, T is an S×Q matrix of desired output vectors (the target values), R is the dimension of the input vectors, Q is the number of training samples, and S is the dimension of the output vectors; spread is the spread constant of the radial basis functions, with a default value of 1. If nodes are to be added to the radial basis function network incrementally, until the mean squared error meets the requirement, the newrb() function with additional parameters is used. Its syntax is:
net = newrb(P, T, goal, spread, MN, DF)
where goal is the target mean squared error, with a default value of 0; MN is the maximum number of hidden neurons, with a default value of Q; and DF specifies how many neurons are added between progress displays.
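As a minimal illustration of the difference between the two functions (with toy sine data, not the trajectory samples used below): newrbe places one neuron on every training sample and fits the samples exactly, while newrb adds neurons incrementally until the error goal is met.

% Toy comparison of newrbe (exact design) and newrb (incremental design).
P = -1:0.1:1;                 % training inputs (illustrative)
T = sin(2*pi*P);              % training targets (illustrative)

netExact = newrbe(P, T, 1);       % one neuron per sample, exact fit, spread = 1
netIncr  = newrb(P, T, 0.01, 1);  % add neurons until the MSE goal 0.01 is reached

Ptest = -1:0.01:1;
plot(Ptest, sim(netExact, Ptest), Ptest, sim(netIncr, Ptest), P, T, 'o');
legend('newrbe', 'newrb', 'training data');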
According to the trajectory required by the robotic arm, 21 training samples are given. The raw data are defined as follows:
x = 0:20;
y = [1, 3, 4, 6, 9, 14, 21, 29, 38, 48, 58, 66, 73, 79, 85, 89, 93, 95, 97, 99, 100];
Next the network is designed and then tested, as sketched below.
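The sketch below reconstructs the design and test steps from the data and function calls given above; the plotting details and variable names are illustrative.

% Sketch of the network design and test on the 21 trajectory samples.
x = 0:20;
y = [1 3 4 6 9 14 21 29 38 48 58 66 73 79 85 89 93 95 97 99 100];

goal   = 0;                        % target mean squared error
spread = 1;                        % spread constant of the radial basis functions

tic;
net = newrb(x, y, goal, spread);   % design the RBF network on the training samples
time_cost = toc;                   % training time

xt = 0:0.5:20;                     % test inputs with spacing 0.5
yt = sim(net, xt);                 % network output on the test inputs

plot(x, y, 'o', xt, yt, '-');
xlabel('x'); ylabel('y');
legend('training samples', 'RBF network output');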
Step 5: test feasibility in Matlab by changing parameters and variables, and verify the robotic arm control method based on the RBF neural network.
1) Results of the test with the initial values and preliminary analysis
The initial variable parameters are set as follows: mean squared error goal = 0; spread constant of the radial basis functions spread = 1.
The raw data are the 21 data points of x from 0 to 20, with a spacing of 1 between points; the test data use x from 0 to 20 with a spacing of 0.5. Fig. 6 shows the result of the program in which the RBF network fits the training sample points.
The training time is time_cost = 1.7719 s. Fig. 7 lists the training time and parameters. Fig. 8 is the training error performance plot.
The command line prints the process of adding hidden nodes and the decrease of the mean squared error (MSE):
NEWRB, neurons=0, MSE=1349.25
NEWRB, neurons=2, MSE=734.587
NEWRB, neurons=3, MSE=544.161
NEWRB, neurons=4, MSE=296.501
NEWRB, neurons=5, MSE=205.978
NEWRB, neurons=6, MSE=138.405
NEWRB, neurons=7, MSE=95.8257
NEWRB, neurons=8, MSE=86.2323
NEWRB, neurons=9, MSE=57.6582
NEWRB, neurons=10, MSE=29.0238
NEWRB, neurons=11, MSE=10.2131
NEWRB, neurons=12, MSE=9.33213
NEWRB, neurons=13, MSE=5.79217
NEWRB, neurons=14, MSE=3.89062
NEWRB, neurons=15, MSE=0.882868
NEWRB, neurons=16, MSE=0.757605
NEWRB, neurons=17, MSE=0.165323
NEWRB, neurons=18, MSE=0.0372311
NEWRB, neurons=19, MSE=0.0358684
NEWRB, neurons=20, MSE=4.21501e-029
NEWRB, neurons=21, MSE=1.83917e-027
It can be seen that the RBF network fits the shape of the trajectory well.
2) Results of tests with different variables and preliminary analysis
Changing the training parameters of the radial basis functions also produces different simulation results, including the degree of fitting, the training error, and the number of hidden neurons needed to meet the requirement.
The network fitting is observed by changing the value of the spread constant: its initial value is spread = 1, and below it is changed to 0.5 and then to 5 and the output images are observed. A sketch of this comparison is given below.
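The sketch simply retrains the network with the three spread values and overlays the fitted curves; the plotting details are illustrative.

% Retrain the RBF network with spread = 0.5, 1 and 5 and compare the fits.
x = 0:20;
y = [1 3 4 6 9 14 21 29 38 48 58 66 73 79 85 89 93 95 97 99 100];
xt = 0:0.5:20;

figure; hold on;
plot(x, y, 'ko', 'DisplayName', 'training samples');
for s = [0.5 1 5]
    net = newrb(x, y, 0, s);                 % goal = 0, varying spread constant
    plot(xt, sim(net, xt), 'DisplayName', sprintf('spread = %g', s));
end
legend('show'); xlabel('x'); ylabel('y'); hold off;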
When spread = 0.5, the output image shown in Fig. 9 is obtained.
The training time is 1.6969 s. From the output image it can be seen that when the spread constant is 0.5 the trajectory is not fitted as well as when its value is 1: the spread is chosen too small, which causes overfitting. The error performance image is shown in Fig. 10.
The command line prints the process of adding hidden nodes and the decrease of the network mean squared error (MSE):
NEWRB, neurons=0, MSE=1349.25
NEWRB, neurons=2, MSE=1083.85
NEWRB, neurons=3, MSE=970.283
NEWRB, neurons=4, MSE=832.636
NEWRB, neurons=5, MSE=738.65
NEWRB, neurons=6, MSE=604.904
NEWRB, neurons=7, MSE=474.016
NEWRB, neurons=8, MSE=362.99
NEWRB, neurons=9, MSE=268.685
NEWRB, neurons=10, MSE=175.586
NEWRB, neurons=11, MSE=106.236
NEWRB, neurons=12, MSE=58.7686
NEWRB, neurons=13, MSE=29.4558
NEWRB, neurons=14, MSE=12.8321
NEWRB, neurons=15, MSE=4.65652
NEWRB, neurons=16, MSE=1.55368
NEWRB, neurons=17, MSE=0.546924
NEWRB, neurons=18, MSE=0.198805
NEWRB, neurons=19, MSE=0.0843713
NEWRB, neurons=20, MSE=9.93589e-029
It can be seen that after the spread constant is reduced, the mean squared error at every stage is significantly larger, and the MSE only falls below 1 when the number of neurons reaches 17.
When spread = 5, the output image shown in Fig. 11 is obtained.
The training time is 1.7474 s. When the spread constant is 5, the middle section of the trajectory is fitted well, but the deviation at the two ends is larger. The error performance image is shown in Fig. 12.
The conclusion drawn from the error performance plot is similar to that from the output image: the error decreases quickly, and between about x = 4 and x = 19 the deviation is small, but cusps with larger deviation appear at the two ends. The same conclusion can be drawn from the MSE listing below.
NEWRB, neurons=0, MSE=1349.25
NEWRB, neurons=2, MSE=105.28
NEWRB, neurons=3, MSE=29.3692
NEWRB, neurons=4, MSE=0.452869
NEWRB, neurons=5, MSE=0.411198
NEWRB, neurons=6, MSE=0.263052
NEWRB, neurons=7, MSE=0.0828302
NEWRB, neurons=8, MSE=0.0645026
NEWRB, neurons=9, MSE=0.0550501
NEWRB, neurons=10, MSE=0.0354879
NEWRB, neurons=11, MSE=0.028415
NEWRB, neurons=12, MSE=0.0274097
NEWRB, neurons=13, MSE=0.0228389
NEWRB, neurons=14, MSE=0.0164181
NEWRB, neurons=15, MSE=0.011896
NEWRB, neurons=16, MSE=0.0115202
NEWRB, neurons=17, MSE=0.0114105
NEWRB, neurons=18, MSE=0.00630194
NEWRB, neurons=19, MSE=0.0062908
NEWRB, neurons=20, MSE=4.891
This simulation experiment was completed on the Matlab platform by calling the RBF neural network toolbox functions. With 21 groups of training data the RBF network can be trained well; when the number of neurons is 15 or more the mean squared error is well controlled, enabling the robotic arm to achieve autonomous learning.

Claims (3)

1. A robotic arm control method based on an RBF neural network, characterized in that the method is as follows:
Step 1: according to the working principles of the modules of the human brain's cognitive system and the mechanism of operant conditioning, provide a cognitive learning model mechanism for the robotic arm;
Step 2: propose a behavior cognitive model and a hybrid learning algorithm based on the cerebellum and basal ganglia;
Step 3: based on the cerebellum-basal ganglia operant conditioning learning algorithm designed on a radial basis function network, establish, using an artificial neural network and a reinforcement learning method, a mathematical model that enables the robotic arm to learn autonomously;
Step 4: use the cerebellum-basal ganglia operant conditioning cognitive learning model based on the radial basis function network to control the robotic arm, and build a robotic arm simulation experiment model in Matlab;
Step 5: test feasibility in Matlab by changing parameters and variables, and verify the robotic arm control method based on the RBF neural network.
2. The robotic arm control method based on an RBF neural network according to claim 1, characterized in that the core of the hybrid learning algorithm in said Step 2 is that the exploratory behavior a_e and the supervised behavior a_s are combined by a weighted sum into the composite behavior a_f, that is:
a_f ← ω·a_e + (1 - ω)·a_s   (1)
1) the probabilistic action selection uses the behavior policy π_A(s), which is a mapping from states to actions and is approximated by an RBF network with parameter θ; as in a thermodynamic system, the randomness of the state transitions of the multi-agent system exhibits a statistical regularity, so the exploratory action selection is made to obey a probability distribution, namely the Boltzmann-Gibbs distribution:
P(a_e | s) = exp(-ε(s)/(K_B·T)) / Z   (2)
where T is the thermodynamic temperature, K_B is the Boltzmann constant, exp(-ε(s)/(K_B·T)) is the Boltzmann factor, and Z is the partition function;
deducing from the formula, the exploratory action a_e replaces the state s, with ε(s) = ε(a_e) = (a_e - a_A)^2; T expresses the degree of behavioral exploration, i.e. the higher the temperature, the greater the exploration, and for each fixed T the system has a corresponding equilibrium point;
2) the positive or negative effect of a behavior is evaluated with the value function V(s), which is approximated by an RBF network:
V(s) = E{ r_{t+1} + γ·V(s_{t+1}) }   (3)
the secondary evaluation signal δ is estimated from the reward/punishment information r_{t+1} and the value V(s_{t+1}) produced by the next iteration:
δ = r_{t+1} + γ·V(s_{t+1}) - V(s_t)   (4)
where 0 < γ < 1 is the evaluation reward/punishment factor;
3) in the model the supervisor is given a prior knowledge set, which serves as the expected mapping of the behavior network; the update of the parameter θ in the behavior policy π_A(s) is realized jointly by the cerebellum module and the basal ganglia module, that is:
θ ← θ + ω·Δθ_BG + (1 - ω)·Δθ_CB   (5)
the error criterion for weight adjustment and the corresponding gradient-descent learning rule for the network weights are defined with η ∈ [0, 1] as the learning rate and δ as the secondary evaluation signal;
4) the coordinating factor ω indicates the proportion taken by the cerebellum's supervised learning in the cognitive process of the behavior network; in the initial stage of learning control the error of the probabilistic behavior is large and the state information collected by the behavior network is sparse and inaccurate, so the supervisor's supervised learning takes a larger proportion; as the number of iterations grows, the roles of the cerebellum and the basal ganglia change in the later stage: the effect of the cerebellum module's supervisor in the learning process decreases continuously, the reinforcement mechanism becomes dominant, and the coordinating factor is expressed in an exponentially increasing form.
3. The robotic arm control method based on an RBF neural network according to claim 1, characterized in that the mathematical model of autonomous learning in said Step 3 is realized with an RBF neural network; the RBF neural network has a three-layer structure: an input layer, a hidden layer and an output layer, matching the "sensation-association-reaction" architecture; the input layer corresponds to the nodes of the sensory neurons, the hidden layer to the nodes of the association neurons, and the output layer to the nodes of the reaction neurons; the input layer only transmits the signal; after the signal is passed from the input layer to the hidden layer, radial basis functions are used as the "bases" of the hidden units, which process and transform the signal, and the connection weights between these two layers are all 1; the hidden layer uses a nonlinear optimization strategy, while the output layer uses a linear optimization strategy;
the learning algorithm of the RBF neural network needs to determine three sets of parameters: the centers of the basis functions, their variances, and the weights from the hidden layer to the output layer;
1) the learning of the centers t_i (i = 1, 2, …, I) of the radial basis functions uses the K-means clustering algorithm; assume there are I cluster centers, the value of I being determined from prior knowledge, and let t_i(n) (i = 1, 2, …, I) be the centers of the basis functions at the n-th iteration; the K-means clustering algorithm proceeds as follows:
Step 1: initialize the cluster centers, i.e. randomly select I different samples from the training set, based on experience, as the initial centers t_i(0) (i = 1, 2, …, I), and set the iteration step n = 0;
Step 2: randomly input a training sample X_k;
Step 3: find the center nearest to the training sample X_k, i.e. find i(X_k) such that
i(X_k) = arg min_i ||X_k - t_i(n)||, i = 1, 2, …, I   (10)
Step 4: update the cluster centers; the addition of X_k changes the center of the i-th class, and the new cluster centers are
t_i(n+1) = t_i(n) + η[X_k(n) - t_i(n)],  i = i(X_k)
t_i(n+1) = t_i(n),  otherwise   (11)
Step 5: judge whether the algorithm has converged; usually a threshold is set for the change of the cluster-center values, the change of the centers is computed, and if it is smaller than this threshold the computation stops; if the centers are still changing, the algorithm has not converged and jumps back to Step 2 to continue iterating; the final centers are taken as t_i(n);
2) the variances σ_i (i = 1, 2, …, I) of the radial basis functions:
once the centers are fixed, the variances of the basis functions must be determined; the basis function is a Gaussian function, G(||X - t_i||) = exp(-||X - t_i||^2 / (2σ_i^2)), and the variance is taken as σ_i = d_max / √(2I), where d_max is the maximum distance between the centers and I is the number of hidden units;
3) the learning of the weights w_ij (i = 1, 2, …, I; j = 1, 2, …, J) of the radial basis functions:
the neurons of the RBF network's output layer compute a weighted sum of the hidden-layer outputs, and the actual output of the RBF network is
Y(n) = G(n)·W(n)   (13)
each neuron of the input layer corresponds to one input variable; letting the number of input neurons be n, the input vector is x = (x_1, x_2, …, x_n)^T; each node of the hidden layer corresponds to one Gaussian basis function; with j hidden nodes, the hidden-layer output is h = [h_j]^T, where h_j, the output of the j-th hidden neuron, is h_j = exp(-||x - c_j||^2 / (2 b_j^2)); c_j is the coordinate vector of the center of the j-th hidden neuron's Gaussian basis function, c = (c_1, c_2, …, c_j)^T, and b_j is the width of the j-th hidden neuron's Gaussian basis function, i.e. the width vector is b = (b_1, b_2, …, b_j)^T; in the third layer, i.e. the output layer, the neural network weights are w = [w_1, w_2, …, w_m]^T, the network output is y(t) = w^T·h = w_1·h_1 + … + w_m·h_m, the error of the l-th output with respect to the ideal output is e_l = y_l^d - y_l, and the error index over the whole sample is E = (1/2)·Σ_l e_l^2;
the behavior network and the evaluation network mentioned earlier in the model both use the same RBF network structure; the input is the initial state s_0, the weights of the behavior network are denoted by θ, and the weights of the evaluation network are denoted by w.
CN201811338287.3A 2018-11-12 2018-11-12 A robotic arm control method based on an RBF neural network Pending CN109227550A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811338287.3A CN109227550A (en) 2018-11-12 2018-11-12 A robotic arm control method based on an RBF neural network


Publications (1)

Publication Number Publication Date
CN109227550A true CN109227550A (en) 2019-01-18

Family

ID=65078000

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811338287.3A Pending CN109227550A (en) A robotic arm control method based on an RBF neural network

Country Status (1)

Country Link
CN (1) CN109227550A (en)



Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101846974A (en) * 2010-03-30 2010-09-29 江苏六维物流设备实业有限公司 Piler neural network control technique
CN101804627A (en) * 2010-04-02 2010-08-18 中山大学 Redundant manipulator motion planning method
CN102501251A (en) * 2011-11-08 2012-06-20 北京邮电大学 Mechanical shoulder joint position control method with dynamic friction compensation
CN106406085A (en) * 2016-03-15 2017-02-15 吉林大学 Space manipulator trajectory tracking control method based on cross-scale model
CN108288093A (en) * 2018-01-31 2018-07-17 湖北工业大学 BP neural network Weighting, system and prediction technique, system
CN108594657A (en) * 2018-04-11 2018-09-28 福建省德腾智能科技有限公司 A kind of mechanical arm self-adaptation control method based on neural network

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109605377A (en) * 2019-01-21 2019-04-12 厦门大学 A kind of joint of robot motion control method and system based on intensified learning
CN109514564A (en) * 2019-01-22 2019-03-26 江西理工大学 A kind of compound quadratic form multi-joint mechanical arm method for optimally controlling
CN109514564B (en) * 2019-01-22 2021-11-30 江西理工大学 Optimal control method for composite quadratic multi-joint mechanical arm
CN110450155A (en) * 2019-07-30 2019-11-15 洛阳润信机械制造有限公司 A kind of optimum design method of the controller of multi-freedom Mechanism
CN110450155B (en) * 2019-07-30 2021-01-22 洛阳润信机械制造有限公司 Optimal design method for controller of multi-degree-of-freedom mechanical arm system
CN112223276A (en) * 2020-09-01 2021-01-15 上海大学 Multi-joint robot control method based on adaptive neural network sliding mode control
CN112223276B (en) * 2020-09-01 2023-02-10 上海大学 Multi-joint robot control method based on adaptive neural network sliding mode control


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20190118