CN106779062A - A multilayer perceptron artificial neural network based on a residual network - Google Patents

A multilayer perceptron artificial neural network based on a residual network

Info

Publication number
CN106779062A
CN106779062A (application CN201611035693.3A)
Authority
CN
China
Prior art keywords
residual network
MLP
artificial neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611035693.3A
Other languages
Chinese (zh)
Inventor
胡伏原
吕凡
谭明奎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou University of Science and Technology
Original Assignee
Suzhou University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou University of Science and Technology
Priority to CN201611035693.3A
Publication of CN106779062A
Legal status: Pending

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 — Computing arrangements based on biological models
    • G06N3/02 — Neural networks
    • G06N3/08 — Learning methods
    • G06N3/084 — Backpropagation, e.g. using gradient descent

Abstract

The invention discloses a multilayer perceptron artificial neural network based on a residual network. The network comprises several network module structures in which the convolutions of the residual neural network are replaced with fully connected layers; the neuron structure obtains the output of a complete residual module from the outputs of the hidden layers in each network module structure, where each hidden layer outputs s_i = ReLU[BN(net_i)] and the complete residual module outputs o_i = ReLU[BN(net_{i+1}) + net_i]. The invention provides a residual-network-based multilayer perceptron artificial neural network with a smaller amount of computation and higher accuracy, which can be better applied in many fields beyond images.

Description

A multilayer perceptron artificial neural network based on a residual network
Technical field
The present invention relates to the field of biologically inspired computing, and in particular to a multilayer perceptron artificial neural network based on a residual network.
Background technology
An Artificial Neural Network is a biologically inspired model that mimics the central nervous system of the brain, establishing mathematical or computational models for function estimation and analysis; it is commonly used in fields such as machine learning and cognitive learning. The Multilayer Perceptron, also called a feedforward network, was first proposed by Paul J. Werbos in his 1974 doctoral thesis and is a typical deep learning structure, comprising an input layer, an output layer, and hidden layers. The number and complexity of the hidden layers determine the capacity of the network, while an overly complex network is prone to overfitting; correctly designing the hidden layers is the difficult part of deep learning.
In recent years, deep learning network structures have continually added "depth": their hidden layers have grown more and more numerous. While reducing the error rate, this has also exposed another problem, degradation: stacking too many hidden layers easily causes the error rate to rise again. On this basis, the residual network is an effective neural network structure that connects layers through shortcuts, allowing part of the information to be transmitted directly and improving the accuracy of the network. Most importantly, this structure makes deeper neural networks feasible; in experiments, networks can even exceed a thousand layers.
The superior characteristics of the residual network have led to its increasing use in production, and its good performance has achieved considerable success in computer vision. In practice, however, the residual network is usually combined with a Convolutional Neural Network, and the convolution operations together with the chain-rule differentiation of back-propagation generate a large amount of computation during training. At the same time, the nature of convolution makes it better suited to operations on images, while for other fields such as speech signals and natural language processing its effect is somewhat weaker.
Content of the invention
The object of the present invention is to overcome the above problems of the prior art and to provide a multilayer perceptron artificial neural network based on a residual network. The present invention is a residual-network-based multilayer perceptron artificial neural network with a smaller amount of computation and higher accuracy, which can be better applied in many fields beyond images.
To achieve the above technical purpose and technical effect, the present invention is realized through the following technical solution:
A multilayer perceptron artificial neural network based on a residual network: the residual neural network builds an express channel for information transmission so that the training process preserves the original information. Internal covariate shift exists in the residual neural network, so the BN (Batch Normalization) method is introduced into it: for each neuron, parameters γ_l^(k) and β_l^(k) are added, and the input of each neuron is:

y_l^(k) = γ_l^(k) · x̂_l^(k) + β_l^(k)    (1)

where x̂_l^(k) is the linear dimensionless normalization of the input using the standard deviation, expressed as

x̂_l^(k) = (x_l^(k) − μ) / σ    (2)

μ and σ denote the expected value and standard deviation of the input distribution, respectively;
The multilayer perceptron artificial neural network based on the residual network comprises several network module structures; the convolutions of the residual neural network are replaced with fully connected layers, and the neuron structure obtains the output of a complete residual module from the outputs of the hidden layers in the network module structure,

where each hidden layer outputs

s_i = ReLU[BN(net_i)]    (3)

and the complete residual module outputs

o_i = ReLU[BN(net_{i+1}) + net_i]    (4)
Preferably, when the dimensions of the input and the output of the residual module differ, a fully connected layer adjusts the dimension of the input so that the residual module can operate.
Preferably, the residual module reaches an accuracy of 98% when trained on the MNIST dataset.
The beneficial effects of the invention are as follows:
The present invention is a residual-network-based multilayer perceptron artificial neural network with a smaller amount of computation and higher accuracy, which can be better applied in many fields beyond images.
The artificial neural network of the invention overcomes the high computational cost and narrow application range caused by conventional residual networks' dependence on convolutional neural networks, and proposes a residual model with the multilayer perceptron artificial neural network as its carrier. The model has broader applicability in deep learning and can be applied to many fields, not limited to the image domain. It reduces the amount of computation and accelerates the training of deep learning models, giving it an advantage in application; its range of application is wide, covering fields such as speech recognition, natural language processing, and electrocardiogram monitoring.
The above is only an overview of the technical solution of the present invention. In order to understand the technical means of the invention more clearly and to implement it according to the contents of the specification, the preferred embodiments of the invention are described in detail below with reference to the accompanying drawings.
Brief description of the drawings
In order to explain the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is the residual network structure of the present invention;
Fig. 2 is the neuron structure diagram of the present invention.
Specific embodiment
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the scope of protection of the invention.
Embodiment
This embodiment discloses a multilayer perceptron artificial neural network based on a residual network, which solves the degradation problem of traditional neural networks. Traditional neural networks, including the multilayer perceptron and the convolutional neural network, usually transmit information directly layer by layer.
A traditional multilayer perceptron is composed of one or more hidden layers, each containing multiple neurons. Suppose the input of neuron j in layer i is net_{i,j} and its output is s_{i,j}; then

net_{i,j} = Σ_{k=1}^{n} w_{i,j,k} · s_{i−1,k} + b_{i,j}    (5)

s_{i,j} = f(net_{i,j})    (6)

where n is the number of neurons in the previous layer directly connected to this neuron, and the constant b_{i,j} denotes the input bias. The activation function f gives each layer its linear or nonlinear character. For the output layer, the predicted result is f(x^(i); W) and the target label is y^(i); if the loss function is formed from the sum of squared errors, we obtain:

J(W) = (1/2) · Σ_i ‖ f(x^(i); W) − y^(i) ‖²
It then suffices to minimize this loss function. The back-propagation algorithm is adopted to train the weights and biases of each layer so as to minimize the loss function. To overcome the slow training caused by an excessive amount of data, the mini-batch gradient descent method is used: batches of fixed size are extracted from the original data for training, ensuring the efficiency of training.
The back-propagation algorithm can be expressed as the following process:
1. Excitation propagation: (1) the forward-propagation phase obtains the excitation response; (2) the back-propagation phase obtains the response error.
2. Weight update: (1) the input excitation is multiplied by the response error to obtain the gradient of the weights; (2) this gradient is multiplied by a ratio, negated, and added to the weights.
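As an illustration of the procedure above, the following is a minimal NumPy sketch of mini-batch gradient descent with back-propagation for a one-hidden-layer perceptron. The toy regression task, layer sizes, learning rate, and epoch count are arbitrary choices for the example, not values taken from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression task (illustrative): learn y = sin(x) on [-pi, pi].
X = rng.uniform(-np.pi, np.pi, size=(512, 1))
Y = np.sin(X)

# One hidden layer of 32 tanh units (sizes are arbitrary for the sketch).
W1 = rng.normal(0.0, 0.5, (1, 32)); b1 = np.zeros(32)
W2 = rng.normal(0.0, 0.5, (32, 1)); b2 = np.zeros(1)

def forward(x):
    net1 = x @ W1 + b1            # weighted sum plus bias, as in eq. (5)
    s1 = np.tanh(net1)            # activation f, as in eq. (6)
    return net1, s1, s1 @ W2 + b2

def loss(pred, y):                # sum-of-squared-error, averaged per sample
    return 0.5 * np.mean((pred - y) ** 2)

lr, batch_size = 0.1, 128         # mini-batch gradient descent
for epoch in range(500):
    order = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        x, y = X[idx], Y[idx]
        _, s1, pred = forward(x)
        # Back-propagation: the response error is propagated layer by layer.
        err = (pred - y) / len(x)              # dLoss/dpred (mean gradient)
        gW2 = s1.T @ err; gb2 = err.sum(0)
        err1 = (err @ W2.T) * (1.0 - s1 ** 2)  # error through the tanh layer
        gW1 = x.T @ err1; gb1 = err1.sum(0)
        # Weight update: each gradient is scaled by the learning rate,
        # negated, and added to the weights.
        W2 -= lr * gW2; b2 -= lr * gb2
        W1 -= lr * gW1; b1 -= lr * gb1

final_loss = loss(forward(X)[2], Y)
```

After training, the loss is far below its initial value, which is roughly 0.5 · E[sin²(x)] ≈ 0.25 for an untrained network.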
With the development of computing power, deeper and more complex neural networks began to show great potential. However, as network structures grow deeper, the degradation problem appears alongside: an excessive number of layers may cause the training effect to decline. On this basis, the residual network builds an identity mapping through shortcut connections, establishing an express channel for information transmission so that the training process better preserves the features of the original information.
Referring to Fig. 1, this embodiment discloses a multilayer perceptron artificial neural network based on a residual network: the residual neural network builds an express channel for information transmission so that the training process preserves the original information. Internal covariate shift exists in the residual neural network, so the BN method is introduced into it: for each neuron, parameters γ_l^(k) and β_l^(k) are added, and the input of each neuron is:

y_l^(k) = γ_l^(k) · x̂_l^(k) + β_l^(k)    (1)

where x̂_l^(k) is the linear dimensionless normalization of the input using the standard deviation, expressed as

x̂_l^(k) = (x_l^(k) − μ) / σ    (2)

μ and σ denote the expected value and standard deviation of the input distribution, respectively.
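The BN transform of equations (1) and (2) can be sketched as follows. The per-feature batch statistics and the small eps stability term are standard Batch Normalization conventions assumed for the example; the patent's formula omits eps.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Eq. (2): x_hat = (x - mu) / sigma, then eq. (1): y = gamma * x_hat + beta.

    x has shape (batch, features); mu and sigma are computed per feature
    over the batch. eps is a numerical-stability term (an assumption here,
    not part of the patent's formula).
    """
    mu = x.mean(axis=0)
    sigma = x.std(axis=0)
    x_hat = (x - mu) / (sigma + eps)
    return gamma * x_hat + beta

rng = np.random.default_rng(1)
x = rng.normal(loc=5.0, scale=3.0, size=(256, 4))  # shifted, scaled input
y = batch_norm(x, gamma=np.ones(4), beta=np.zeros(4))
# With gamma = 1 and beta = 0 the output is standardized per feature:
# mean approximately 0 and standard deviation approximately 1.
```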
In this embodiment, the network module structure of the residual network is redesigned: the convolution operation is avoided and the pooling operation is abandoned. Pooling usually follows the feature extraction of the convolution operation and aggregates features statistically, computing the mean or the maximum (as appropriate) of a feature over some region of the image. Pooling reduces the feature dimension of a convolutional neural network to some extent and effectively suppresses overfitting, but it also brings a certain loss of information.
In this embodiment, fully connected layers are used instead of the convolutions in the original residual network; the proposed neuron structure is shown in Fig. 2:
The multilayer perceptron artificial neural network based on the residual network comprises several network module structures; the convolutions of the residual neural network are replaced with fully connected layers, and the neuron structure obtains the output of a complete residual module from the outputs of the hidden layers in the network module structure,

where each hidden layer outputs

s_i = ReLU[BN(net_i)]    (3)

and the complete residual module outputs

o_i = ReLU[BN(net_{i+1}) + net_i]    (4)
When the dimensions of the input and the output of the residual module differ, a fully connected layer adjusts the dimension of the input so that the residual module can operate.
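Equations (3) and (4), together with the fully connected dimension adjustment just described, can be sketched as a single forward pass. The layer widths and random weights are illustrative assumptions, and BN is shown without its learnable γ and β for brevity.

```python
import numpy as np

rng = np.random.default_rng(2)

def bn(x, eps=1e-5):
    # Plain batch normalization over the batch axis (gamma = 1, beta = 0).
    return (x - x.mean(0)) / (x.std(0) + eps)

def relu(x):
    return np.maximum(x, 0.0)

def residual_module(net_i, W, b, W_proj=None):
    """One residual MLP module per eqs. (3) and (4).

    s_i = ReLU[BN(net_i)]                 hidden-layer output, eq. (3)
    o_i = ReLU[BN(net_{i+1}) + net_i]     module output, eq. (4)

    When input and output dimensions differ, a fully connected projection
    W_proj adjusts the dimension of the input on the shortcut path.
    """
    s_i = relu(bn(net_i))                  # eq. (3)
    net_next = s_i @ W + b                 # fully connected layer, no convolution
    shortcut = net_i if W_proj is None else net_i @ W_proj
    return relu(bn(net_next) + shortcut)   # eq. (4)

x = rng.normal(size=(64, 16))
W = rng.normal(0.0, 0.1, (16, 32)); b = np.zeros(32)
W_proj = rng.normal(0.0, 0.1, (16, 32))    # shortcut projection, 16 -> 32
out = residual_module(x, W, b, W_proj)
```

When the module's input and output widths match, `W_proj` can be omitted and the shortcut is the identity, which is the standard residual case.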
The artificial neural network overcomes the high computational cost and narrow application range caused by conventional residual networks' dependence on convolutional neural networks, and proposes a residual model with the multilayer perceptron artificial neural network as its carrier. The model has broader applicability in deep learning and can be applied to many fields, not limited to the image domain. It reduces the amount of computation and accelerates the training of deep learning models, giving it an advantage in application; its range of application is wide, covering fields such as speech recognition, natural language processing, and electrocardiogram monitoring.
To verify the effectiveness of the above structure, an experiment on the MNIST dataset was designed in this embodiment, stacking the residual model of the invention into a 5-layer structure; the training results are shown in Table 1.

Table 1. Training results
Units   Epochs   Batch size   Accuracy (%)
10      2        128          95.7031
20      2        128          97.2656
30      2        128          97.6562
40      2        128          98.8281
50      2        128          98.0469
The experimental results show that as the number of units increases, the accuracy of the system also rises step by step, reaching a good 98%.
This embodiment is a residual-network-based multilayer perceptron artificial neural network with a smaller amount of computation and higher accuracy, which can be better applied in many fields beyond images.
The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be realized in other embodiments without departing from the spirit or scope of the invention. Therefore, the present invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (3)

1. A multilayer perceptron artificial neural network based on a residual network, wherein the residual neural network builds an express channel for information transmission so that the training process preserves the original information; internal covariate shift exists in the residual neural network, so the BN method is introduced into it: for each neuron, parameters γ_l^(k) and β_l^(k) are added, and the input of each neuron is:

y_l^(k) = γ_l^(k) · x̂_l^(k) + β_l^(k)    (1)

where x̂_l^(k) is the linear dimensionless normalization of the input using the standard deviation, expressed as

x̂_l^(k) = (x_l^(k) − μ) / σ    (2)

μ and σ denote the expected value and standard deviation of the input distribution, respectively;
the multilayer perceptron artificial neural network based on the residual network comprises several network module structures; the convolutions of the residual neural network are replaced with fully connected layers, and the neuron structure obtains the output of a complete residual module from the outputs of the hidden layers in the network module structure,

where each hidden layer outputs

s_i = ReLU[BN(net_i)]    (3)

and the complete residual module outputs

o_i = ReLU[BN(net_{i+1}) + net_i]    (4)
2. The multilayer perceptron artificial neural network based on a residual network according to claim 1, characterized in that when the dimensions of the input and the output of the residual module differ, a fully connected layer adjusts the dimension of the input so that the residual module can operate.
3. The multilayer perceptron artificial neural network based on a residual network according to claim 1, characterized in that the residual module reaches an accuracy of 98% when trained on the MNIST dataset.
CN201611035693.3A 2016-11-23 2016-11-23 A multilayer perceptron artificial neural network based on a residual network Pending CN106779062A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611035693.3A CN106779062A (en) 2016-11-23 2016-11-23 A multilayer perceptron artificial neural network based on a residual network

Publications (1)

Publication Number Publication Date
CN106779062A true CN106779062A (en) 2017-05-31

Family

ID=58971277

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611035693.3A Pending CN106779062A (en) A multilayer perceptron artificial neural network based on a residual network

Country Status (1)

Country Link
CN (1) CN106779062A (en)


Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109064407A (en) * 2018-09-13 2018-12-21 武汉大学 Dense connection network image super-resolution method based on multilayer perceptron layers
CN109064407B (en) * 2018-09-13 2023-05-05 武汉大学 Dense connection network image super-resolution method based on multi-layer perceptron layers
CN110009097A (en) * 2019-04-17 2019-07-12 电子科技大学 The image classification method of capsule residual error neural network, capsule residual error neural network
CN110009097B (en) * 2019-04-17 2023-04-07 电子科技大学 Capsule residual error neural network and image classification method of capsule residual error neural network
WO2021012406A1 (en) * 2019-07-19 2021-01-28 深圳市商汤科技有限公司 Batch normalization data processing method and apparatus, electronic device, and storage medium
CN112161998A (en) * 2020-09-02 2021-01-01 国家气象信息中心 Soil moisture content measuring method and device, electronic equipment and storage medium
CN112161998B (en) * 2020-09-02 2023-12-05 国家气象信息中心 Soil water content measuring method and device, electronic equipment and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170531