CN108898213A - Adaptive activation function parameter adjusting method for deep neural network - Google Patents

Adaptive activation function parameter adjusting method for deep neural network

Info

Publication number
CN108898213A
CN108898213A
Authority
CN
China
Prior art keywords
activation function
network
parameter
adaptive
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810631395.3A
Other languages
Chinese (zh)
Other versions
CN108898213B (en)
Inventor
胡海根
周莉莉
罗诚
陈胜勇
管秋
周乾伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT
Priority to CN201810631395.3A
Publication of CN108898213A
Application granted
Publication of CN108898213B
Active legal status (Current)
Anticipated expiration legal status


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Abstract

An adaptive activation function parameter adjusting method for a deep neural network, the method comprising the following steps: Step 1, first give a mathematical definition of the adaptive activation function parameter adjusting method; Step 2, compare and analyze the experimental results of the adaptive activation function against other classical activation functions on the MNIST data set, the network used having three hidden layers with 50 neurons each, trained with gradient descent for 100 epochs, with the learning rate set to 0.01 and a mini-batch size of 100; Step 3, after Step 2 yields the optimal activation function version, apply it to the detection of bladder cancer cells. As the network trains, the present invention continually adjusts the activation function's own shape to find the activation function best suited to that network, improving network performance, reducing the total number of learnable parameters that adaptive activation functions add to the network, speeding up network learning, and improving the network's generalization.

Description

Adaptive activation function parameter adjusting method for deep neural network
Technical field
The invention belongs to the field of adaptive activation functions and provides an adaptive activation function parameter adjusting method for deep neural networks. Specifically, the adaptive activation function controls its own shape through added learnable parameters; these learnable parameters are updated together with the network during training via the backpropagation algorithm, which reduces the total number of learnable parameters that adaptive activation functions contribute to the network.
Background technique
Machine learning is now widely applied in everyday life. Traditional machine learning mostly uses shallow structures, such as Gaussian mixture models (GMM), conditional random fields (CRF), and support vector machines (SVM). These shallow structures have a limited capacity to represent complicated functions and extract only relatively primitive features from the raw input signal, so their generalization ability on complex classification problems is restricted, and they struggle with harder natural signal processing problems such as human speech and natural image recognition. Deep learning, which learns by loosely imitating the brain, has greatly advanced machine learning. Its defining feature is transforming raw data through simple but nonlinear models into higher-level, more abstract feature representations, learning a deep nonlinear network structure that can approximate complicated functions and capture the essential characteristics of a data set from a small number of samples. Practice has shown that deep learning excels at discovering intricate structure in high-dimensional data, and it is widely used in research fields such as computer vision, speech recognition, and natural language processing.
As deep learning is applied in more and more fields, a growing body of research focuses on innovating and optimizing deep learning algorithms, including the optimization of classifiers and loss functions, backpropagation-based gradient descent, network weight initialization, and the artificial neural network itself; among these, optimizing the artificial neural network is an important part of algorithmic innovation. Artificial neural networks can have different structures and neuron counts depending on the task, yet people usually use the same activation function in all of these networks, such as Sigmoid, Tanh, or ReLU. Adaptive activation functions proposed in recent years let network neurons take on different shapes, but as the network scale and the number of neurons grow, the learnable parameters used to adjust these neuron shapes increase linearly, significantly dragging down the network's learning efficiency. The basic structure of an artificial neural network can thus be regarded as neurons interconnected with one another, and the activation function plays a very important role within it.
The main role of the activation function in an artificial neural network is to provide the network's nonlinear expressive power. If the neurons in a neural network perform only linear operations, the network can express only a simple linear mapping; even increasing its depth and width still yields a linear mapping, which can hardly model the nonlinearly distributed data of real environments effectively. Once nonlinear activation functions are added, a deep neural network gains a layered nonlinear mapping ability. The present invention mainly improves the activation function and thereby optimizes the connections between neurons in the network, further improving network performance.
Summary of the invention
In order to reduce the total number of learnable parameters that adaptive activation functions add to a network, speed up network learning, and improve the network's generalization ability, the invention proposes an adaptive activation function parameter adjusting method for deep neural networks that, as the network trains, continually adjusts the function's own shape to find the optimal activation function for that network, improving the performance of the network.
The technical solution adopted by the present invention to solve the technical problems is:
An adaptive activation function parameter adjusting method for a deep neural network, the method comprising the following steps:
Step 1, first give a mathematical definition of the adaptive activation function parameter adjusting method. The process is as follows:
Let the number of adjustable parameters of the adaptive activation function be N; the adaptive activation function is then defined as:
f(x) = f(a*x + c)
where a and c are both learnable parameters that control the shape of the activation function. A so-called neural network can be regarded as a combination of many single neurons, so the output of the neural network is defined as a compound function of the weights, the biases, and the learnable neuron parameters:
h(w,b,a,c) = h(f(a*x + c))
where h represents the output of the neural network and w and b represent the weights and biases of the network. This formulation can also be viewed as all neurons in the neural network sharing the same group of learnable parameters; a more general definition is that each neuron in the neural network uses its own parameters, as follows:
fn(x) = f(an*x + cn)
where fn represents each individual neuron of a layer in the network; when the neurons of each layer share the same parameters, the definition becomes:
fl(x) = f(al*x + cl), with one pair (al, cl) per layer l.
The adaptive activation functions in the neural network are trained with the backpropagation algorithm, where the learnable parameters are optimized together with the weights and biases as network training proceeds. The parameters {a1, ..., an, b1, ..., bn} are updated according to the chain rule, as follows:
ai ← ai - η · ∂L/∂ai
where ai ∈ {a1, ..., an, b1, ..., bn}, η is the learning rate, and L denotes the cost function. The factor ∂L/∂f(xi) is obtained from the deeper layer by backpropagation, and the weighted term sums over all positions xi of the feature map or neural network layer. For a variable shared within one layer, the gradient with respect to ai is obtained by summing over the neurons of all channels or of the whole layer, as follows:
∂L/∂ai = Σi (∂L/∂f(xi)) · (∂f(xi)/∂ai)
Step 2, compare and analyze the experimental results of the adaptive activation function against other classical activation functions on the MNIST data set, obtaining the final activation function version. The process is as follows:
The network used has three hidden layers, each with 50 neurons; it is trained with gradient descent for 100 epochs, the learning rate is set to 0.01, and the mini-batch size is 100.
Step 3, after Step 2 yields the optimal activation function version, apply it to the detection of bladder cancer cells. The process is as follows:
3.1, produce a data set from the bladder cancer images;
3.2, select a suitable algorithm and model to initialize the parameters;
3.3, compare and analyze the experimental results of the optimal activation function and the conventional activation function.
Further, in Step 2, the activation functions compared are the traditional Sigmoid function, the traditional ReLU activation function, the unified version of the adaptive activation function, the individual version of the adaptive activation function, and the layer version of the adaptive activation function.
Further, in 3.1, the bladder cancer cell data set is made into pascal_voc2007 format, mainly using generated xml files to save the cells' label information.
In 3.2, the Faster R-CNN algorithm is selected and the vgg16 model is used to initialize the network parameters, i.e. the network parameters are initialized from the vgg16 pre-trained model.
In 3.3, the optimal activation function version generated in Step 2 replaces the conventional activation functions in the Faster R-CNN algorithm, and finally the experimental results are analyzed and compared.
The beneficial effects of the present invention are mainly: theory and experiment demonstrate the validity of the adaptive activation function parameter adjusting method, which provides the best activation function for the network, avoids problems such as the vanishing gradient of conventional activation functions, and improves the fitting ability of the network.
Detailed description of the invention
Fig. 1 is the convergence curve of the activation function AS of the invention;
Fig. 2 shows the adjustment of the learnable parameters of the AS activation function of the invention;
Fig. 3 shows the original Sigmoid activation function and the final AS activation function of the invention;
Fig. 4 shows the experimental comparison between the final AS activation function of the invention and other activation functions;
Fig. 5 is a plot of the Sigmoid activation function;
Fig. 6 is a plot of the Tanh activation function;
Fig. 7 is a plot of the ReLU activation function.
Specific embodiment
The invention is further described below with reference to the accompanying drawings.
Referring to Figs. 1 to 7, an adaptive activation function parameter adjusting method for a deep neural network comprises the following steps:
Step 1, first give a mathematical definition of the adaptive activation function parameter adjusting method. The process is as follows:
Let the number of adjustable parameters of the adaptive activation function be N; the adaptive activation function is then defined as:
f(x) = f(a*x + c)
where a and c are both learnable parameters that control the shape of the activation function. A so-called neural network can be regarded as a combination of many single neurons, so the output of the neural network is defined as a compound function of the weights, the biases, and the learnable neuron parameters:
h(w,b,a,c) = h(f(a*x + c))
where h represents the output of the neural network and w and b represent the weights and biases of the network. This formulation can also be viewed as all neurons in the neural network sharing the same group of learnable parameters; a more general definition is that each neuron in the neural network uses its own parameters, as follows:
fn(x) = f(an*x + cn)
where fn represents each individual neuron of a layer in the network; when the neurons of each layer share the same parameters, the definition becomes:
fl(x) = f(al*x + cl), with one pair (al, cl) per layer l.
The adaptive activation functions in the neural network are trained with the backpropagation algorithm, where the learnable parameters are optimized together with the weights and biases as network training proceeds. The parameters {a1, ..., an, b1, ..., bn} are updated according to the chain rule, as follows:
ai ← ai - η · ∂L/∂ai
where ai ∈ {a1, ..., an, b1, ..., bn}, η is the learning rate, and L denotes the cost function. The factor ∂L/∂f(xi) is obtained from the deeper layer by backpropagation, and the weighted term sums over all positions xi of the feature map or neural network layer. For a variable shared within one layer, the gradient with respect to ai is obtained by summing over the neurons of all channels or of the whole layer, as follows:
∂L/∂ai = Σi (∂L/∂f(xi)) · (∂f(xi)/∂ai)
Step 2, compare and analyze the experimental results of the adaptive activation function against other classical activation functions on the MNIST data set. The process is as follows:
The network used has three hidden layers, each with 50 neurons; it is trained with gradient descent for 100 epochs, the learning rate is set to 0.01, and the mini-batch size is 100.
Further, in Step 2, the activation functions compared are the traditional Sigmoid function, the traditional ReLU activation function, the unified version of the adaptive activation function, the individual version of the adaptive activation function, and the layer version of the adaptive activation function.
Based on the MNIST data set, the present invention adds learnable parameters to the classical sigmoid activation function to obtain an adaptive activation function, Adaptive Sigmoid (AS), and then compares the test results of each version of the adaptive activation function against the two classical activation functions Sigmoid and ReLU.
MNIST is a handwritten digit recognition data set, often called the Drosophila of deep learning experiments. It contains 60000 pictures as training data and 10000 pictures as the test set. Each grayscale picture in the MNIST data set represents a digit from 0 to 9; the pictures are 28*28 in size, with the handwritten digit appearing in the middle of the picture. The activation function AS is defined as:
F = b0*sigmoid(a0*x + a1) + b1
a0, a1, b0 and b1 are four learnable parameters; they control the shape of the function and can be trained together with the network's weights and biases.
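As an illustration only, the AS function can be written as a PyTorch module whose four shape parameters are registered as learnable. The following minimal sketch is ours, not the patent's (the patent gives no code); the initial values are those reported for the experiment below:

    import torch
    import torch.nn as nn

    class ASigmoid(nn.Module):
        # Adaptive Sigmoid AS: F = b0*sigmoid(a0*x + a1) + b1,
        # with a0, a1, b0, b1 learned by backpropagation together
        # with the network's weights and biases.
        def __init__(self):
            super().__init__()
            # initialized as in the experiment (a0=1.0, a1=0.0, b0=1.0, b1=0.0),
            # i.e. the function starts out as a plain sigmoid
            self.a0 = nn.Parameter(torch.tensor(1.0))
            self.a1 = nn.Parameter(torch.tensor(0.0))
            self.b0 = nn.Parameter(torch.tensor(1.0))
            self.b1 = nn.Parameter(torch.tensor(0.0))

        def forward(self, x):
            return self.b0 * torch.sigmoid(self.a0 * x + self.a1) + self.b1

Because the four parameters are registered with the module, any optimizer built over model.parameters() updates them alongside the weights, which is exactly the shared-parameter training the method describes.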
The present invention mainly adds learnable parameters to the classical sigmoid activation function to obtain the adaptive activation function AS. The mathematical definition of the function is as follows:
Let the number of adjustable parameters of the adaptive activation function be N, and assume here that N = 2. The adaptive activation function can then be defined as:
f(x) = f(a*x + c)
where a and c are both learnable parameters that control the shape of the activation function. A so-called neural network can be regarded as a combination of many single neurons, so the output of the neural network is defined as a compound function of the weights, the biases, and the learnable neuron parameters:
h(w,b,a,c) = h(f(a*x + c))
where h represents the output of the neural network, and w and b represent the weights and biases of the network. This formulation can also be viewed as all neurons in the neural network sharing the same group of learnable parameters. A more general definition is that each neuron in the neural network uses its own parameters, as follows:
fn(x) = f(an*x + cn)
where fn represents each individual neuron of a layer in the network. When the neurons of each layer share the same parameters, the definition becomes:
fl(x) = f(al*x + cl), with one pair (al, cl) per layer l.
The present invention trains the adaptive activation functions in the neural network with the backpropagation algorithm, where the learnable parameters are optimized together with the weights and biases as network training proceeds. The parameters {a1, ..., an, b1, ..., bn} can be updated according to the chain rule, as follows:
ai ← ai - η · ∂L/∂ai
where ai ∈ {a1, ..., an, b1, ..., bn}, η is the learning rate, and L denotes the cost function. The factor ∂L/∂f(xi) is obtained from the deeper layer by backpropagation, and the weighted term sums over all positions xi of the feature map or neural network layer. For a variable shared within one layer, the gradient with respect to ai is obtained by summing over the neurons of all channels or of the whole layer, as follows:
∂L/∂ai = Σi (∂L/∂f(xi)) · (∂f(xi)/∂ai)
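As a toy check of this shared-parameter update (our own sketch, assuming PyTorch; not part of the patent), automatic differentiation reproduces the summed chain-rule gradient for a parameter shared across all positions of a layer:

    import torch

    a = torch.tensor(1.0, requires_grad=True)   # shared shape parameter
    c = torch.tensor(0.0, requires_grad=True)
    x = torch.randn(100)                        # all positions of one layer
    f = torch.sigmoid(a * x + c)                # f(x) = f(a*x + c) with shared a, c
    L = f.sum()                                 # stand-in cost function
    L.backward()                                # dL/da is summed over all positions
    with torch.no_grad():
        # manual chain rule: sum_i dL/df(xi) * df(xi)/da, with sigmoid' = f*(1-f)
        manual = (f * (1 - f) * x).sum()
    print(torch.allclose(a.grad, manual))       # True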
In Step 3 the adaptive activation function method is applied to deep learning: the present invention applies the optimal activation function obtained in Step 2 to the detection of bladder cancer cells. The process is as follows:
3.1, produce the data set. The invention makes the bladder cancer cell data set into pascal_voc2007 format, mainly using generated xml files to save the cells' label information.
3.2, select a suitable algorithm and model to initialize the parameters. The invention selects the Faster R-CNN algorithm and uses the vgg16 model to initialize the network parameters; initializing mainly from the vgg16 pre-trained model shortens the training time while reducing the risk of underfitting or overfitting (a sketch of such an initialization follows after 3.3).
3.3, compare and analyze the experimental results of the optimal activation function and the conventional activation function, mainly by replacing the conventional activation functions in the Faster R-CNN algorithm with the optimal activation function version generated in Step 2 and finally analyzing and comparing the experimental results.
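For illustration, a minimal sketch of initializing a detector backbone from vgg16 pre-trained weights, assuming a recent torchvision; the anchor sizes, RoI pooler settings, and class count are our own illustrative choices, since the patent specifies only Faster R-CNN initialized from a vgg16 pre-trained model:

    import torchvision
    from torchvision.models.detection import FasterRCNN
    from torchvision.models.detection.anchor_utils import AnchorGenerator

    # VGG16 feature extractor initialized from pre-trained weights,
    # used as the detection backbone
    backbone = torchvision.models.vgg16(weights="IMAGENET1K_V1").features
    backbone.out_channels = 512  # channel count of VGG16's last conv block

    anchor_generator = AnchorGenerator(sizes=((32, 64, 128, 256, 512),),
                                       aspect_ratios=((0.5, 1.0, 2.0),))
    roi_pooler = torchvision.ops.MultiScaleRoIAlign(featmap_names=["0"],
                                                    output_size=7,
                                                    sampling_ratio=2)
    # two classes assumed: background and bladder cancer cell
    model = FasterRCNN(backbone, num_classes=2,
                       rpn_anchor_generator=anchor_generator,
                       box_roi_pool=roi_pooler)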
Finally, with the method proposed by the invention, the same adjustable activation function is used throughout the whole network, so no matter how many neurons the neural network contains, the total number of added parameters is just the number of learnable parameters of the adaptive activation function (the parameters that control the shape of the function). The whole network uses the same variant sigmoid function; much like the superposition of the terms of a compound function, this enhances the nonlinearity of the network, improves its fitting ability, and accelerates its learning.
The present invention is described in detail below with reference to the drawings.
As shown in Fig. 1, the network used has three hidden layers, each with 50 neurons, and is trained with gradient descent for 100 epochs; the learning rate is set to 0.01 and the mini-batch size is 100. The activation functions compared are the traditional Sigmoid function, the traditional ReLU activation function, the unified version of the adaptive activation function, the individual version of the adaptive activation function, and the layer version of the adaptive activation function. Fig. 1 shows the convergence curves of ReLU, Sigmoid, and the three versions based on the adaptive activation function AS. "relu_train" denotes the classification error rate on the training set with the Relu activation function; "relu_test" denotes the classification error rate on the test set with the Relu activation function. "AUsigmoid" denotes the unified version (Unified Version, UV) of the adaptive activation function AS, in which every neuron uses the same activation function. "ALsigmoid" denotes the individual version (Individual Version, IV) of the adaptive activation function, in which each neuron uses its own activation function. "AIsigmoid" denotes the layer version (Layer Version, LV), in which all neurons of a layer use the same activation function, but the activation functions of different layers are not necessarily identical.
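For concreteness, a sketch of this experimental network (our own construction, reusing the ASigmoid module sketched earlier), showing how the unified (UV) and layer (LV) versions differ in the number of parameters they add:

    import torch
    import torch.nn as nn

    def make_mlp(acts):
        # three hidden layers of 50 neurons each, as in the experiment
        return nn.Sequential(
            nn.Flatten(),
            nn.Linear(28 * 28, 50), acts[0],
            nn.Linear(50, 50), acts[1],
            nn.Linear(50, 50), acts[2],
            nn.Linear(50, 10),
        )

    shared = ASigmoid()                     # from the earlier sketch
    uv_model = make_mlp([shared] * 3)       # UV: one AS shared everywhere, +4 parameters
    lv_model = make_mlp([ASigmoid() for _ in range(3)])  # LV: one AS per layer, +12
    # IV would give each of the 150 hidden neurons its own (a0, a1, b0, b1), +600

    optimizer = torch.optim.SGD(uv_model.parameters(), lr=0.01)
    # trained for 100 epochs with mini-batches of 100, as stated above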
The expression of the traditional Sigmoid activation function is as follows:
sigmoid(x) = 1 / (1 + e^(-x))
See Fig. 5 for a plot of the Sigmoid activation function.
The Sigmoid activation function is a commonly used activation method, because this activation function has a good interpretation in terms of a neuron's firing rate: from not firing at all at 0 to fully saturated firing at the maximum of 1. But the Sigmoid function is now seldom used; a major reason is that the saturation of the Sigmoid function makes the gradient vanish. The Sigmoid neuron has an unfortunate property: when the neuron's activation saturates near 0 or 1, the gradient of the function in those regions is almost 0. During backpropagation this (local) gradient is multiplied by the gradient of the whole loss function with respect to the gate's output, and the product is likewise close to zero. This effectively kills the gradient: almost no signal passes through the neuron to the weights and back to the data, which leads to the vanishing gradient problem.
Another classical activation function is the Tanh nonlinearity, whose expression is as follows:
tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))
See Fig. 6 for a plot of the Tanh activation function.
As the figure shows, compared with the Sigmoid function, Tanh compresses a real value into the interval [-1, 1] and, like Sigmoid, suffers from saturation. Unlike the Sigmoid neuron, however, its output is zero-centered. In practice the Tanh nonlinearity is preferable to the Sigmoid nonlinearity; one may say that a Tanh neuron is simply a scaled Sigmoid neuron.
Compared with the two classical activation functions above, ReLU is the activation function most widely used today. Its mathematical formula is as follows:
f(x) = max(0, x)
See Fig. 7 for a plot of the ReLU activation function.
Compared with the Sigmoid and Tanh functions, ReLU greatly accelerates the convergence of gradient descent, owing to its linear, non-saturating form. The ReLU activation function has no gradient saturation problem when the input is positive; when the input is negative, ReLU is not activated at all, which means that once the input becomes negative, ReLU dies. For example, a very large gradient flowing backward through a ReLU neuron may push its weights into a special state in which the neuron is unlikely ever to be activated again by any data point. If this happens, every gradient that subsequently backpropagates through this neuron becomes 0; in other words, this ReLU unit is irreversibly dead during training, which causes a loss of diversity in the data.
As shown in Fig. 1, on the MNIST training set the unified version of activation function AS achieves a lower classification error rate than the Relu activation function; compared with the original Sigmoid activation function, the network also has a stronger fitting ability.
Fig. 2 shows the parameter adjustment process of the unified version of the adaptive activation function. The learnable parameters of the adaptive activation function are initialized as a0 = 1.0, a1 = 0.0, b0 = 1.0, b1 = 0.0; after the training iterations, the final parameters become a0 = 3.87, a1 = 0.07, b0 = 5.89, b1 = -0.51, and thereafter they hardly change.
As shown in Fig. 3, the final unified version of the adaptive activation function has a larger range than the traditional Sigmoid activation function, which largely resolves the vanishing gradient problem of the traditional Sigmoid activation function, so the classification accuracy rises.
As shown in Fig. 4, the present invention compares the experimental results of the final adaptive activation function version (RAS) with those of several other activation functions. The final adaptive activation function formula is:
F = 5.89*sigmoid(3.87*x + 0.07) - 0.51
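Frozen at these learned values, the function becomes an ordinary fixed activation. A small NumPy sketch (the name ras is ours):

    import numpy as np

    def ras(x):
        # final learned AS: F = 5.89*sigmoid(3.87*x + 0.07) - 0.51
        return 5.89 / (1.0 + np.exp(-(3.87 * x + 0.07))) - 0.51

    # the range is roughly (-0.51, 5.38), far wider than sigmoid's (0, 1);
    # this is the larger codomain that Fig. 3 refers to
    print(ras(np.array([-5.0, 0.0, 5.0])))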
The experimental comparison shows that the unified adaptive activation function achieves the best experimental results. In the bladder cancer cell detection experiment, both the detection results and the speed with the unified adaptive activation function are better than with the traditional activation functions, further demonstrating that each network can be trained to obtain the activation function that suits it best.

Claims (5)

1. An adaptive activation function parameter adjusting method for a deep neural network, characterized in that the method comprises the following steps:
Step 1, first give a mathematical definition of the adaptive activation function parameter adjusting method. The process is as follows:
Let the number of adjustable parameters of the adaptive activation function be N; the adaptive activation function is then defined as:
f(x) = f(a*x + c)
where a and c are both learnable parameters that control the shape of the activation function; a so-called neural network is regarded as a combination of many single neurons, so the output of the neural network is defined as a compound function of the weights, the biases, and the learnable neuron parameters:
h(w,b,a,c) = h(f(a*x + c))
where h represents the output of the neural network and w and b represent the weights and biases of the network; this formulation is also viewed as all neurons in the neural network sharing the same group of learnable parameters, and a more general definition is that each neuron in the neural network uses its own parameters, as follows:
fn(x) = f(an*x + cn)
where fn represents each individual neuron of a layer in the network; when the neurons of each layer share the same parameters, the definition becomes:
fl(x) = f(al*x + cl), with one pair (al, cl) per layer l;
the adaptive activation functions in the neural network are trained with the backpropagation algorithm, the learnable parameters are optimized together with the weights and biases as network training proceeds, and the parameters {a1, ..., an, b1, ..., bn} are updated according to the chain rule, as follows:
ai ← ai - η · ∂L/∂ai
where ai ∈ {a1, ..., an, b1, ..., bn}, η is the learning rate, and L denotes the cost function; the factor ∂L/∂f(xi) is obtained from the deeper layer by backpropagation, the weighted term sums over all positions xi of the feature map or neural network layer, and for a variable shared within one layer the gradient with respect to ai is obtained by summing over the neurons of all channels or of the whole layer, as follows:
∂L/∂ai = Σi (∂L/∂f(xi)) · (∂f(xi)/∂ai)
Step 2, compare and analyze the experimental results of the adaptive activation function against other classical activation functions on the MNIST data set. The process is as follows:
the network used has three hidden layers, each with 50 neurons; it is trained with gradient descent for 100 epochs, the learning rate is set to 0.01, and the mini-batch size is 100.
Step 3, after Step 2 yields the optimal activation function version, apply it to the detection of bladder cancer cells. The process is as follows:
3.1, produce a data set from the bladder cancer images;
3.2, select an algorithm and a model to initialize the parameters;
3.3, compare and analyze the experimental results of the optimal activation function and the conventional activation function.
2. The adaptive activation function parameter adjusting method for a deep neural network according to claim 1, characterized in that in Step 2 the activation functions compared are the traditional Sigmoid function, the traditional ReLU activation function, the unified version of the adaptive activation function, the individual version of the adaptive activation function, and the layer version of the adaptive activation function.
3. The adaptive activation function parameter adjusting method for a deep neural network according to claim 1 or 2, characterized in that in 3.1 the bladder cancer cell data set is made into pascal_voc2007 format, mainly using generated xml files to save the cells' label information.
4. The adaptive activation function parameter adjusting method for a deep neural network according to claim 1 or 2, characterized in that in 3.2 the Faster R-CNN algorithm is selected and the vgg16 model is used to initialize the network parameters, i.e. the network parameters are initialized from the vgg16 pre-trained model.
5. The adaptive activation function parameter adjusting method for a deep neural network according to claim 4, characterized in that in 3.3 the optimal activation function version generated in Step 2 replaces the conventional activation functions in the Faster R-CNN algorithm, and finally the experimental results are analyzed and compared.
CN201810631395.3A 2018-06-19 2018-06-19 Adaptive activation function parameter adjusting method for deep neural network Active CN108898213B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810631395.3A CN108898213B (en) 2018-06-19 2018-06-19 Adaptive activation function parameter adjusting method for deep neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810631395.3A CN108898213B (en) 2018-06-19 2018-06-19 Adaptive activation function parameter adjusting method for deep neural network

Publications (2)

Publication Number Publication Date
CN108898213A true CN108898213A (en) 2018-11-27
CN108898213B CN108898213B (en) 2021-12-17

Family

ID=64345490

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810631395.3A Active CN108898213B (en) 2018-06-19 2018-06-19 Adaptive activation function parameter adjusting method for deep neural network

Country Status (1)

Country Link
CN (1) CN108898213B (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5113483A (en) * 1990-06-15 1992-05-12 Microelectronics And Computer Technology Corporation Neural network with semi-localized non-linear mapping of the input space
CN104951836A * 2014-03-25 2015-09-30 上海市玻森数据科技有限公司 Posting prediction system based on neural network technique
CN105654136A (en) * 2015-12-31 2016-06-08 中国科学院电子学研究所 Deep learning based automatic target identification method for large-scale remote sensing images
CN105891215A (en) * 2016-03-31 2016-08-24 浙江工业大学 Welding visual detection method and device based on convolutional neural network
CN107122825A * 2017-03-09 2017-09-01 华南理工大学 Activation function generation method for neural network model

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109934222A * 2019-03-01 2019-06-25 长沙理工大学 Insulator string self-explosion recognition method based on transfer learning
CN110084380A * 2019-05-10 2019-08-02 深圳市网心科技有限公司 Iterative training method, device, system and medium
CN110222173A * 2019-05-16 2019-09-10 吉林大学 Short text emotion classification method and device based on neural network
CN110222173B * 2019-05-16 2022-11-04 吉林大学 Short text emotion classification method and device based on neural network
CN110443296A * 2019-07-30 2019-11-12 西北工业大学 Hyperspectral image classification-oriented data adaptive activation function learning method
CN110443296B * 2019-07-30 2022-05-06 西北工业大学 Hyperspectral image classification-oriented data adaptive activation function learning method
CN110570048A * 2019-09-19 2019-12-13 深圳市物语智联科技有限公司 User demand prediction method based on improved online deep learning
CN111860460A (en) * 2020-08-05 2020-10-30 江苏新安电器股份有限公司 Application method of improved LSTM model in human behavior recognition
CN115204352A (en) * 2021-04-12 2022-10-18 洼田望 Information processing apparatus, information processing method, and storage medium
CN115204352B (en) * 2021-04-12 2024-03-12 洼田望 Information processing apparatus, information processing method, and storage medium
WO2023092938A1 (en) * 2021-11-24 2023-06-01 苏州浪潮智能科技有限公司 Image recognition method and apparatus, and device and medium
CN114708460A (en) * 2022-04-12 2022-07-05 济南博观智能科技有限公司 Image classification method, system, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN108898213B (en) 2021-12-17

Similar Documents

Publication Publication Date Title
CN108898213A (en) A kind of adaptive activation primitive parameter adjusting method towards deep neural network
Munakata Fundamentals of the new artificial intelligence: neural, evolutionary, fuzzy and more
CN109992779B (en) Emotion analysis method, device, equipment and storage medium based on CNN
CN103345656B Data identification method and device based on multitask deep neural network
CN109829541A (en) Deep neural network incremental training method and system based on learning automaton
CN106560848B (en) Novel neural network model for simulating biological bidirectional cognitive ability and training method
CN108304826A Facial expression recognition method based on convolutional neural networks
CN106503654A Face emotion recognition method based on deep sparse autoencoder network
CN114049513A (en) Knowledge distillation method and system based on multi-student discussion
CN111858989A (en) Image classification method of pulse convolution neural network based on attention mechanism
CN109255340A Face recognition method fusing multiple improved VGG networks
CN106709482A Method for identifying genetic relationship of persons based on autoencoder
CN108121975A Face recognition method combining original data and generated data
CN107657313B (en) System and method for transfer learning of natural language processing task based on field adaptation
CN108256630A Overfitting solution based on low-dimensional manifold regularization neural network
CN106980830A Affiliation recognition method and device based on deep convolutional network
CN106980831A Affiliation recognition method based on autoencoder
Golovko et al. A new technique for restricted Boltzmann machine learning
CN108154156A Image ensemble classification method and device based on neural topic model
Huang et al. Design and Application of Face Recognition Algorithm Based on Improved Backpropagation Neural Network.
Wu et al. Damage identification of low emissivity coating based on convolution neural network
Ding et al. College English online teaching model based on deep learning
CN110188621A Three-dimensional facial expression recognition method based on SSF-IL-CNN
Kozlova et al. The use of neural networks for planning the behavior of complex systems
CN106096543A Handwritten digit recognition method based on modified extreme learning machine

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant