CN109635932A - Graphical design and implementation method for a partially connected multilayer perceptron - Google Patents
- Publication number
- CN109635932A (application CN201811538198.3A)
- Authority
- CN
- China
- Prior art keywords
- network
- multilayer perceptron
- layer
- capsule
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G — PHYSICS
- G06 — COMPUTING; CALCULATING OR COUNTING
- G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00 — Computing arrangements based on biological models
- G06N3/02 — Neural networks
- G06N3/04 — Architecture, e.g. interconnection topology
- G06N3/044 — Recurrent networks, e.g. Hopfield networks
- G06N3/045 — Combinations of networks
- G06N3/08 — Learning methods
- G06N3/084 — Backpropagation, e.g. using gradient descent
Abstract
The invention discloses a graphical design and implementation method for a partially connected multilayer perceptron. A data set requiring training is selected and preprocessed to meet the needs of network training; a corresponding data structure is generated from the network graph drawn by the user, and the connection pattern of the network is checked for legality; the network structure is topologically sorted, the forward computation of the network is executed in that linear order, and the output value x on each neuron is computed; the backward computation of the network is executed in the reverse of the topological order, and the error value δ on each neuron is computed; the directed edges of the network are traversed and the weight value w on each directed edge is updated by gradient descent. The invention can be used to train partially connected multilayer perceptrons, and the network is produced graphically, with no need to produce it by programming.
Description
Technical field
The present invention relates to a graphical design and implementation method for a partially connected multilayer perceptron, and belongs to the field of pattern recognition. The invention can be used to train multilayer perceptrons with various connection ratios, generating the artificial neural network graphically rather than through traditional programming.
Background technique
As a technology for realizing artificial intelligence, artificial neural networks have achieved breakthrough results in multiple fields such as image recognition, speech recognition, and natural language processing, and have received a great deal of attention. The multilayer perceptron was once a popular machine learning algorithm with a wide range of application scenarios; in recent years, owing to the success of deep learning, it has attracted renewed attention.
A multilayer perceptron is a feedforward neural network described by a directed graph with multiple layers of nodes, in which every non-input node is a neuron with a nonlinear activation function and each layer is fully connected to the next. One advantage of the multilayer perceptron is that it can be trained. Training a multilayer perceptron is in fact a mathematical optimization: the goal is to find a set of optimal weights that minimizes the error between predicted values and actual values.
Given a training sample (x, y), the activation value on each neuron is first computed, layer by layer up to the last layer. Once the last layer has been computed, the error between the actual value and the predicted value is obtained. From this error, the error value δ on each neuron of each layer is computed backwards; this value can be understood as the neuron's contribution to the overall error, and its computation propagates from the last layer backwards. Finally the parameters are updated, the goal being to make the error smaller.
Writing · for the element-wise (Hadamard) product, the matrix form of the multilayer perceptron algorithm is as follows (the output-layer error and gradient formulas, lost in translation, are reconstructed here in the standard form consistent with the surviving equations):
1. Execute the forward computation, obtaining the activation values from the second layer up to the last layer; the superscript l denotes the l-th layer of the multilayer perceptron, and n_l denotes the last layer, i.e. the output layer:
z^{l+1} = W^l a^l + b^l, a^{l+1} = f(z^{l+1})
2. First compute the error value of the last layer. y denotes the label value of the sample, z denotes the weighted sum of the previous layer's output taken as input, a denotes the value of z after activation by the activation function, a^{n_l} denotes the prediction of the multilayer perceptron for the sample, and f'(z) denotes the derivative of f(z):
δ^{n_l} = -(y - a^{n_l}) · f'(z^{n_l})
3. Compute the error values in reverse order, from the second-to-last layer down to the second layer. W denotes the parameters between adjacent layers of the multilayer perceptron; W and b are exactly the parameters the whole network trains:
δ^l = ((W^l)^T δ^{l+1}) · f'(z^l)
4. Compute the gradients. ΔW and Δb denote the gradients of the network's training parameters; in each backpropagation pass the network is updated with these:
ΔW^l = δ^{l+1} (a^l)^T, Δb^l = δ^{l+1}
5. Update the weights of the multilayer perceptron. η is the learning rate of the network, with its numerical range generally controlled within 0 < η ≤ 1; it mainly controls the step size of the parameter updates:
W^l = W^l - η ΔW^l
b^l = b^l - η Δb^l
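As an illustration only (not the patent's actual implementation), the five equations above can be sketched as a single training step in Python with NumPy; the helper names (`train_step`, `sigmoid`) are hypothetical, and sigmoid is an arbitrary example activation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def train_step(W, b, x, y, eta=0.1):
    """One forward/backward pass of a fully connected MLP; returns updated W, b."""
    # Forward: z^{l+1} = W^l a^l + b^l, a^{l+1} = f(z^{l+1})
    a, zs = [x], []
    for Wl, bl in zip(W, b):
        z = Wl @ a[-1] + bl
        zs.append(z)
        a.append(sigmoid(z))
    # Output-layer error for squared loss: delta = (a - y) * f'(z)
    delta = (a[-1] - y) * sigmoid_prime(zs[-1])
    newW, newb = list(W), list(b)
    for l in range(len(W) - 1, -1, -1):
        dW = np.outer(delta, a[l])      # ΔW^l = δ^{l+1} (a^l)^T
        db = delta                      # Δb^l = δ^{l+1}
        if l > 0:                       # δ^l = ((W^l)^T δ^{l+1}) · f'(z^l)
            delta = (W[l].T @ delta) * sigmoid_prime(zs[l - 1])
        newW[l] = W[l] - eta * dW       # W^l = W^l - η ΔW^l
        newb[l] = b[l] - eta * db
    return newW, newb
```

Running this step repeatedly on a sample should reduce the squared error, matching the stated training goal.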
Summary of the invention
The graphical design and implementation method for a partially connected multilayer perceptron constructs, by graphical generation, the feedforward neural network a user needs; the network takes the form of the graph structure of a multilayer perceptron. A computational pattern-recognition method based on a partially connected multilayer perceptron is characterized by the following steps:
Step 1: select the training set. In the drawing-board interface, select a data set for the neural network; most of the optional data sets are public UCI data sets, and the invention ships with data sets such as Iris, Sonar, Diabetes, and Blood for training partially connected multilayer perceptron models. Taking the Iris data set as an example, it contains 150 samples divided into 3 classes of 50 samples each; each sample has 4 attributes and 1 label value, and the four attributes sepal length, sepal width, petal length, and petal width predict which of the three iris species a flower belongs to.
Step 1.1: to accelerate training convergence, each feature is normalized. Depending on the samples selected, batch gradient descent, stochastic gradient descent, or mini-batch gradient descent can be chosen; since the optional data sets are not large-scale, stochastic gradient descent is the default, processing one sample at a time. To keep a fixed sample order from biasing training, the order of the samples in the data set is shuffled randomly before training.
Step 1.2: after preprocessing, the feature values of the samples are put into an array X and the corresponding label values are stored, in order, in another array Y. For classification tasks the label values are additionally converted to one-hot encoding so that the error distance between label and prediction is more reasonable; for regression tasks the label values are left unchanged.
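A minimal sketch of the preprocessing in Steps 1.1–1.2, assuming min-max normalization (the patent does not specify which normalization is used); the array names X and Y follow the text, and the helper name `preprocess` is hypothetical:

```python
import numpy as np

def preprocess(features, labels, num_classes, seed=0):
    """Normalize features, one-hot encode labels, and shuffle the sample order."""
    X = np.asarray(features, dtype=float)
    # Per-feature min-max normalization to speed up training convergence
    X = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0) + 1e-12)
    # One-hot encoding of integer class labels (classification tasks only)
    Y = np.eye(num_classes)[np.asarray(labels)]
    # Shuffle so a fixed sample order does not bias SGD training
    idx = np.random.default_rng(seed).permutation(len(X))
    return X[idx], Y[idx]
```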
Step 2: construct the network structure. In the drawing-board interface, draw the neural network structure of the partially connected multilayer perceptron that meets the user's needs.
Step 2.1: in the button region of the drawing interface, select the "multilayer perceptron layer count" button; an input box for the number of layers appears, and the layer count is entered. The program then distributes the layer spacing evenly according to the width of the drawing interface and produces a reasonable initial layout according to its height. Each layer of the partially connected multilayer perceptron is represented by a capsule; the capsule structure computes and stores vector information of the layer (e.g. output values and error terms) together with the connection information related to the capsule.
Step 2.2: set the number of neurons in each layer's capsule; clicking a capsule pops up an input box requesting the neuron count of that capsule.
Step 2.3: set the activation function of each layer's capsule; by default a capsule has no activation function. The input capsule applies no activation; the capsules of the intermediate hidden layers may use activation functions such as relu, sigmoid, or tanh; the output-layer capsule's activation function is generally set to softmax for classification tasks and to none for regression tasks.
Step 2.4: once the layer count of the partially connected multilayer perceptron and the neuron counts of each layer's capsule are determined, clicking the "partial connection" button forms partial connections between adjacent capsules. The interface renders the connections between the layers of the partially connected multilayer perceptron, can display the order of the data-flow computation, and lets the user set the attributes of a connection by clicking its directed-edge graphic. Partial connection can be random connection between the neurons of two layers, or one of several specific connection rules provided by the system. For random partial connection, a mask matrix is generated according to the connection probability p set on the directed edge; the mask matrix follows a Bernoulli distribution with probability p and contains only the elements 0 and 1, where 0 severs the connection between two nodes and 1 keeps the original connection.
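The random mask of Step 2.4 can be sketched as follows (the function name `make_mask` is illustrative): each entry is an independent Bernoulli(p) draw, and the same mask is kept for both training and testing:

```python
import numpy as np

def make_mask(shape, p, seed=0):
    """0/1 mask with the weight matrix's shape; each entry is 1 with probability p."""
    rng = np.random.default_rng(seed)
    return (rng.random(shape) < p).astype(float)
```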
Step 2.5: determine the computation order of the capsules in the resulting network; the order is obtained by a topological sort of the dependency lists generated while drawing the partially connected multilayer perceptron.
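The topological sort of Step 2.5 could be implemented with Kahn's algorithm over the capsule dependency lists; the capsule names and the `deps` mapping below are illustrative, not the patent's data structure:

```python
from collections import deque

def topo_order(deps):
    """deps maps each capsule to the list of capsules it depends on."""
    indegree = {node: len(d) for node, d in deps.items()}
    users = {node: [] for node in deps}
    for node, d in deps.items():
        for dep in d:
            users[dep].append(node)
    # Start from capsules with no dependencies (the input layer)
    queue = deque(n for n, k in indegree.items() if k == 0)
    order = []
    while queue:
        n = queue.popleft()
        order.append(n)
        for m in users[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                queue.append(m)
    if len(order) != len(deps):
        raise ValueError("connection graph contains a cycle (illegal network)")
    return order
```

The cycle check doubles as the legality test mentioned in the abstract: a feedforward network's graph must be acyclic.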
Step 3: execute the computation of the partially connected multilayer perceptron, completing training and testing on the data set.
Step 3.1: execute the forward computation. Unlike the computation of a fully connected multilayer perceptron, a mask matrix of the same shape as the weight matrix is used, and the connection state between the nodes of two adjacent layers is controlled by this mask matrix. W and b denote the weights and biases between adjacent layers of the partially connected multilayer perceptron; both are parameters the network trains. z denotes the weighted sum of the previous layer's output taken as the input of the current layer, a denotes the activation value of z after the activation function, and the superscript denotes the layer of the partially connected multilayer perceptron an element belongs to. The forward computation of the fully connected multilayer perceptron is:
z^{l+1} = W^l a^l + b^l, a^{l+1} = f(z^{l+1})
The partially connected multilayer perceptron instead first multiplies the weight matrix element-wise (Hadamard product) with the corresponding mask matrix. Its mask matrix differs from the one used in the dropconnect method: in the partially connected multilayer perceptron the generated mask matrix is identical during training and testing, whereas dropconnect regenerates a random mask matrix for every training batch, uses no mask matrix at test time, and must therefore rescale the network parameters accordingly; the partially connected multilayer perceptron needs no such rescaling at test time. Denoting the mask matrix by M and the Hadamard product by ·, the forward computation of the partially connected multilayer perceptron is as follows, where the superscript denotes the layer these parameters belong to:
z^{l+1} = (M^l · W^l) a^l + b^l, a^{l+1} = f(z^{l+1})
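The masked forward pass z^{l+1} = (M^l · W^l) a^l + b^l can be sketched as below; `relu` stands in for whichever activation each capsule was given, and the `activations` parameter is an illustrative way to vary it per layer:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def forward(W, b, M, x, activations=None):
    """Forward pass where the mask M severs connections via the Hadamard product."""
    a = x
    for l, (Wl, bl, Ml) in enumerate(zip(W, b, M)):
        z = (Ml * Wl) @ a + bl          # masked weighted sum
        f = (activations or {}).get(l, relu)
        a = f(z)
    return a
```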
Step 3.2: execute the backward computation. Find the output capsule of the network, i.e. the last capsule in the computation graph, and use the predicted output and the true value of the corresponding label to compute the error value of the last layer first. The last layer is denoted n_l, y denotes the true value of the data sample, and a^{n_l} is in fact the network's prediction for the sample; computing δ amounts to apportioning the final overall error over the nodes of the network. With squared error as the loss function, the backward-computation formula for the last layer of the partially connected multilayer perceptron is:
δ^{n_l} = -(y - a^{n_l}) · f'(z^{n_l})
Then compute in reverse order from the second-to-last layer down to the second layer. W and b denote the parameters between adjacent layers of the partially connected perceptron and are exactly the parameters the whole network trains; the formula is:
δ^l = ((W^l)^T δ^{l+1}) · f'(z^l)
Step 3.3: compute the gradients. ΔW and Δb denote the gradients of the network's training parameters; each backpropagation pass updates the network according to them, and the gradients are computed as in the fully connected multilayer perceptron:
ΔW^l = δ^{l+1} (a^l)^T, Δb^l = δ^{l+1}
Step 3.4: update the parameter values, reusing the mask matrix generated earlier. The update formulas are as follows; η is the learning rate of the network, generally 0 < η ≤ 1, and controls the step size of the parameter updates:
W^l = W^l - η (M^l · ΔW^l)
b^l = b^l - η Δb^l
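The Step 3.4 update can be sketched as follows (the function name `update` is illustrative): masking the weight gradient with the same M used forward guarantees that severed connections stay at their initial value and are never trained:

```python
import numpy as np

def update(W, b, dW, db, M, eta=0.001):
    """W^l = W^l - eta (M^l · ΔW^l); b^l = b^l - eta Δb^l."""
    newW = [Wl - eta * (Ml * dWl) for Wl, dWl, Ml in zip(W, dW, M)]
    newb = [bl - eta * dbl for bl, dbl in zip(b, db)]
    return newW, newb
```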
Step 3.5: after training of the partially connected multilayer perceptron is complete, measure the generalization ability of the network on the test set. The computation on the test set is identical to the forward computation during training; finally, the accuracy of the trained network on the test set is output.
Compared with existing deep-learning frameworks, the invention has the following effects:
1) building a neural network model no longer requires programming from scratch; the required neural network is constructed graphically on the interface;
2) the partial connections between the layers of the multilayer perceptron effectively prune redundant parameters of a fully connected multilayer perceptron.
Detailed description of the drawings
Fig. 1 is a rendering, in the drawing interface, of a partially connected multilayer perceptron built from the invention's capsules.
Fig. 2 is a flow chart of the implementation of the invention.
Specific embodiment
Below, the invention is explained and illustrated, and the classification performance of multilayer perceptrons with various connection ratios, including full connection, is compared on different data sets.
The invention can be used for graph-based partially connected multilayer perceptrons, forming an artificial neural network in a short time, and can be used in pattern-recognition teaching and experiments to intuitively show the classification performance of a neural network. The specific embodiment is:
Step 1: the training set used is the diabetes data set downloaded from the UCI repository. The data set has 768 samples in total; the features are number of pregnancies, blood glucose, blood pressure, skin-fold thickness, insulin, BMI (body mass index), diabetes pedigree function, age, and outcome. The outcome is the feature to be predicted, with two classes: 0 represents not having diabetes and 1 represents having diabetes. Of the 768 samples, 500 are labeled 0 and 268 are labeled 1. When this data set is selected for an experiment, the graphical system automatically converts the 0/1 label values to one-hot form. A 7:3 ratio is used for the split into training and test sets.
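The 7:3 split can be sketched as below; the patent does not say whether the split is shuffled, so a seeded random permutation is assumed here, and `split_70_30` is an illustrative name:

```python
import numpy as np

def split_70_30(X, Y, seed=0):
    """Shuffle sample indices and split 70% train / 30% test."""
    idx = np.random.default_rng(seed).permutation(len(X))
    cut = int(0.7 * len(X))
    tr, te = idx[:cut], idx[cut:]
    return X[tr], Y[tr], X[te], Y[te]
```

On the 768-sample diabetes set this yields 537 training and 231 test samples.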
Step 2: using the graphical buttons provided by the front-end interface, form the artificial-neural-network structure of a multilayer perceptron with one input layer, one hidden layer, and one output layer. First click the capsule graphic of each part in the front-end interface, and set in the pop-up the number of neurons in the layer the capsule represents and the activation function used by the neurons of that layer.
Step 2.1: in this embodiment the training data set has eight features, so the neuron count of the input capsule must match the feature count and is set to 8; no activation function is used for the input capsule.
Step 2.2: the neuron count of the intermediate hidden-layer capsule is set to 100, and the relu function is selected as its activation function via the combo box.
Step 2.3: since the labels were converted to one-hot form and this is a two-class problem, the neuron count of the final output capsule is set to 2 and its activation function is set to softmax.
Step 2.4: click the directed-edge graphics to set the partial-connection ratio between adjacent layers.
Step 3: set the hyperparameters of the network training in the sidebar of the drawing board. The learning rate is set to 0.001, the task to classification, and the training method to stochastic gradient descent; clicking the training button starts training the network. At the end of training, the per-epoch accuracy and loss curves are displayed, and the accuracy on the test set is printed.
The embodiment compares 6 random connection ratios, setting the same connection ratio between all layers: 0.5, 0.6, 0.7, 0.8, 0.9, and 1 (full connection). To make the experimental results comparable, the same split of training- and test-set data is used throughout. The test accuracies on this data set under the same split are shown in Table 1:
Table 1: classification performance of the partially connected multilayer perceptron on UCI Diabetes
Connection ratio | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 | 1.0 |
---|---|---|---|---|---|---|
Test accuracy | 74.63% | 74.85% | 74.98% | 74.89% | 75.06% | 74.68% |
On small multilayer perceptrons handling small data sets, partial connection can even exceed the performance of full connection. The Iris data set from UCI was classified with the implementation steps of the embodiment above; to account for the different data set, the network structure was adjusted: the input layer has 4 nodes, the intermediate hidden layer 50 nodes, and the output layer 3 nodes, with the other parameters kept as in the embodiment. The classification performance on the test set is shown in Table 2:
Table 2: classification performance of the partially connected multilayer perceptron on UCI Iris
Connection ratio | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 | 1.0 |
---|---|---|---|---|---|---|
Test accuracy | 94.67% | 95.78% | 94.67% | 94.44% | 93.78% | 92.89% |
Claims (1)
1. A graphical design and implementation method for a partially connected multilayer perceptron, characterized in that the method comprises the following steps:
Step 1: select the training set. In the drawing-board interface, select a data set for the neural network; most of the optional data sets are public UCI data sets, and the invention ships with data sets such as Iris, Sonar, Diabetes, and Blood for training partially connected multilayer perceptron models. Taking the Iris data set as an example, it contains 150 samples divided into 3 classes of 50 samples each; each sample has 4 attributes and 1 label value, and the four attributes sepal length, sepal width, petal length, and petal width predict which of the three iris species a flower belongs to.
Step 1.1: to accelerate training convergence, each feature is normalized. Depending on the samples selected, batch gradient descent, stochastic gradient descent, or mini-batch gradient descent can be chosen; since the optional data sets are not large-scale, stochastic gradient descent is the default, processing one sample at a time. To keep a fixed sample order from biasing training, the order of the samples in the data set is shuffled randomly before training.
Step 1.2: after preprocessing, the feature values of the samples are put into an array X and the corresponding label values are stored, in order, in another array Y. For classification tasks the label values are additionally converted to one-hot encoding so that the error distance between label and prediction is more reasonable; for regression tasks the label values are left unchanged.
Step 2: construct the network structure. In the drawing-board interface, draw the neural network structure of the partially connected multilayer perceptron that meets the user's needs.
Step 2.1: in the button region of the drawing interface, select the "multilayer perceptron layer count" button; an input box for the number of layers appears, and the layer count is entered. The program then distributes the layer spacing evenly according to the width of the drawing interface and produces a reasonable initial layout according to its height. Each layer of the partially connected multilayer perceptron is represented by a capsule; the capsule structure computes and stores vector information of the layer (e.g. output values and error terms) together with the connection information related to the capsule.
Step 2.2: set the number of neurons in each layer's capsule; clicking a capsule pops up an input box requesting the neuron count of that capsule.
Step 2.3: set the activation function of each layer's capsule; by default a capsule has no activation function. The input capsule applies no activation; the capsules of the intermediate hidden layers may use activation functions such as relu, sigmoid, or tanh; the output-layer capsule's activation function is generally set to softmax for classification tasks and to none for regression tasks.
Step 2.4: once the layer count of the partially connected multilayer perceptron and the neuron counts of each layer's capsule are determined, clicking the "partial connection" button forms partial connections between adjacent capsules. The interface renders the connections between the layers of the partially connected multilayer perceptron, can display the order of the data-flow computation, and lets the user set the attributes of a connection by clicking its directed-edge graphic. Partial connection can be random connection between the neurons of two layers, or one of several specific connection rules provided by the system. For random partial connection, a mask matrix is generated according to the connection probability p set on the directed edge; the mask matrix follows a Bernoulli distribution with probability p and contains only the elements 0 and 1, where 0 severs the connection between two nodes and 1 keeps the original connection.
Step 2.5: determine the computation order of the capsules in the resulting network; the order is obtained by a topological sort of the dependency lists generated while drawing the partially connected multilayer perceptron.
Step 3: execute the computation of the partially connected multilayer perceptron, completing training and testing on the data set.
Step 3.1: execute the forward computation. Unlike the computation of a fully connected multilayer perceptron, a mask matrix of the same shape as the weight matrix is used, and the connection state between the nodes of two adjacent layers is controlled by this mask matrix. W and b denote the weights and biases between adjacent layers of the partially connected multilayer perceptron; both are parameters the network trains. z denotes the weighted sum of the previous layer's output taken as the input of the current layer, a denotes the activation value of z after the activation function, and the superscript denotes the layer of the partially connected multilayer perceptron an element belongs to. The forward computation of the fully connected multilayer perceptron is:
z^{l+1} = W^l a^l + b^l, a^{l+1} = f(z^{l+1})
The partially connected multilayer perceptron instead first multiplies the weight matrix element-wise (Hadamard product) with the corresponding mask matrix. Its mask matrix differs from the one used in the dropconnect method: in the partially connected multilayer perceptron the generated mask matrix is identical during training and testing, whereas dropconnect regenerates a random mask matrix for every training batch, uses no mask matrix at test time, and must therefore rescale the network parameters accordingly; the partially connected multilayer perceptron needs no such rescaling at test time. Denoting the mask matrix by M and the Hadamard product by ·, the forward computation of the partially connected multilayer perceptron is as follows, where the superscript denotes the layer these parameters belong to:
z^{l+1} = (M^l · W^l) a^l + b^l, a^{l+1} = f(z^{l+1})
Step 3.2: executing retrospectively calculate;The output capsule in network is found, i.e. the last one capsule in calculating figure, utilizes output
Predicted value and corresponding label true value, first find out the error amount of the last layer, the last layer nlIt indicates, y indicates data
The true value of sample,What is actually indicated is predicted value of the network to sample, and the process for calculating δ is exactly by last total mistake in fact
Into network on each node, the part using square error as loss function connects the last of multilayer perceptron at difference booth
One layer of retrospectively calculate formula is as follows:
The calculation then proceeds in reverse order from the second-to-last layer down to the second layer. W and b denote the parameters between adjacent layers of the partially connected perceptron; W and b are in fact exactly the parameters the whole network must train. The calculation formula is:

δ^l = ((W^l)^T δ^{l+1}) ⊙ f'(z^l)
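A minimal NumPy sketch of this backward pass, assuming a sigmoid activation and the squared-error last-layer error δ^{n_l} = -(y - a^{n_l}) ⊙ f'(z^{n_l}); the layer widths and random values are illustrative only:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    # derivative f'(z) of the assumed sigmoid activation
    s = sigmoid(z)
    return s * (1.0 - s)

def backward_deltas(zs, a_out, y, W):
    """Last layer: delta = -(y - a_out) ⊙ f'(z); earlier layers:
    delta^l = ((W^l)^T delta^{l+1}) ⊙ f'(z^l), back to the second layer."""
    deltas = [None] * len(zs)
    deltas[-1] = -(y - a_out) * sigmoid_prime(zs[-1])
    for l in range(len(zs) - 2, -1, -1):
        deltas[l] = (W[l + 1].T @ deltas[l + 1]) * sigmoid_prime(zs[l])
    return deltas

rng = np.random.default_rng(1)
sizes = [4, 3, 2]                      # assumed layer widths
W = [rng.standard_normal((m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
zs = [rng.standard_normal((m, 1)) for m in sizes[1:]]
y = np.array([[1.0], [0.0]])           # assumed one-hot true value
deltas = backward_deltas(zs, sigmoid(zs[-1]), y, W)
print([d.shape for d in deltas])       # [(3, 1), (2, 1)]
```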
Step 3.3: Compute the gradients. ΔW and Δb denote the gradients of the network's training parameters; in every backward-propagation pass the network is updated on the basis of these parameters. The gradient calculation is the same as for the ordinary multilayer perceptron:

ΔW^l = δ^{l+1} (a^l)^T,  Δb^l = δ^{l+1}
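The gradient step can be sketched as follows, assuming the standard multilayer-perceptron formulas ΔW^l = δ^{l+1} (a^l)^T and Δb^l = δ^{l+1} (the layer widths and random inputs are illustrative assumptions):

```python
import numpy as np

def gradients(deltas, activations):
    # ΔW^l = δ^{l+1} (a^l)^T and Δb^l = δ^{l+1}, same as an ordinary MLP
    dW = [d @ a.T for d, a in zip(deltas, activations[:-1])]
    db = [d.copy() for d in deltas]
    return dW, db

rng = np.random.default_rng(2)
sizes = [4, 3, 2]                                            # assumed widths
activations = [rng.standard_normal((n, 1)) for n in sizes]   # a^1..a^3
deltas = [rng.standard_normal((n, 1)) for n in sizes[1:]]    # δ^2, δ^3
dW, db = gradients(deltas, activations)
print([g.shape for g in dW])                                 # [(3, 4), (2, 3)]
```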
Step 3.4: Update the parameter values. The mask matrices generated earlier must be used again here. The update calculation formulas are shown below, where η is the learning rate of the network, generally 0 < η ≤ 1, used to control the step size of the training-parameter update:

W^l = W^l − η (M^l ⊙ ΔW^l)
b^l = b^l − η Δb^l
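A sketch of this masked update in NumPy. Masking the gradient with M^l ensures pruned connections never receive updates, so weights that start at zero stay at zero; the matrix sizes, mask density, and learning rate are illustrative assumptions:

```python
import numpy as np

def sgd_update(W, b, dW, db, M, eta=0.1):
    # W^l ← W^l − η (M^l ⊙ ΔW^l);  b^l ← b^l − η Δb^l
    for l in range(len(W)):
        W[l] -= eta * (M[l] * dW[l])   # Hadamard product masks the gradient
        b[l] -= eta * db[l]
    return W, b

rng = np.random.default_rng(3)
W = [rng.standard_normal((3, 4))]
b = [np.zeros((3, 1))]
M = [(rng.random((3, 4)) < 0.5).astype(float)]
W[0] *= M[0]                            # pruned weights start at zero
dW = [rng.standard_normal((3, 4))]
db = [rng.standard_normal((3, 1))]
sgd_update(W, b, dW, db, M, eta=0.1)
print(np.all(W[0][M[0] == 0] == 0))     # True: pruned weights stay zero
```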
Step 3.5: After the training of the partially connected multilayer perceptron is complete, measure the generalization ability of the network on the test set. The calculation on the test set is identical to the forward calculation during training; finally, output the accuracy of the trained network on the test set.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811538198.3A CN109635932A (en) | 2018-12-16 | 2018-12-16 | A kind of Graphic Design and implementation method of part connection multilayer perceptron |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109635932A true CN109635932A (en) | 2019-04-16 |
Family
ID=66074450
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811538198.3A Pending CN109635932A (en) | 2018-12-16 | 2018-12-16 | A kind of Graphic Design and implementation method of part connection multilayer perceptron |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109635932A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111370080A (en) * | 2020-03-05 | 2020-07-03 | 中国工程物理研究院上海激光等离子体研究所 | Radiation temperature inversion method based on artificial neural network algorithm |
CN113256593A (en) * | 2021-06-07 | 2021-08-13 | 四川国路安数据技术有限公司 | Tumor image detection method based on task self-adaptive neural network architecture search |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106022273A (en) * | 2016-05-24 | 2016-10-12 | 华东理工大学 | Handwritten form identification system of BP neural network based on dynamic sample selection strategy |
CN107678746A (en) * | 2017-09-30 | 2018-02-09 | 江西洪都航空工业集团有限责任公司 | A kind of pattern development platform based on ROS |
CN108319456A (en) * | 2018-01-29 | 2018-07-24 | 徐磊 | A kind of development approach for exempting to program deep learning application |
CN108563977A (en) * | 2017-12-18 | 2018-09-21 | 华南理工大学 | A kind of the pedestrian's method for early warning and system of expressway entrance and exit |
CN109002879A (en) * | 2018-07-23 | 2018-12-14 | 济南浪潮高新科技投资发展有限公司 | The visual modeling method and device of neural network model |
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20190416 |