CN109063765B - Image classification method based on gated neural network information fusion - Google Patents
- Publication number
- CN109063765B
- Authority
- CN
- China
- Prior art keywords
- fusion
- input
- neural network
- evidence
- weight
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
- G06F18/257—Belief theory, e.g. Dempster-Shafer
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Image Analysis (AREA)
Abstract
The invention relates to a gated neural network information fusion method comprising the following steps: giving the neural network feature tensors to be fused, referred to as fusion inputs; determining the dimensionality of the output tensor, denoted O; applying neural network computation, including transformation and activation, to each input so that it has the same dimensionality as the output tensor O; selecting suitable fusion evidence, where the fusion evidence is a feature tensor used to compute the fusion weight that controls each fusion input, and the fusion evidence of the i-th input is denoted E_i; for the i-th fusion path, performing neural network computation on E_i; activating the computed fusion evidence E_i to obtain the fusion weight α_i; multiplying the fusion weight α_i by the input I_i; and combining the paths linearly or nonlinearly into the output tensor O.
Description
Technical Field
The invention belongs to the field of machine learning and neural networks, and particularly relates to a gated information fusion method for a neural network.
Background
Neural Networks (NN) have achieved good results in many fields such as speech recognition, natural language recognition, image processing, and pattern recognition.
Multi-task, multi-branch neural networks are becoming more popular, and several widely used neural network models, such as ResNet (He et al., 2016), DenseNet (Huang et al., 2017) and GRU (Cho et al., 2014), all introduce an operation that fuses the information of two branches into one. In integrated systems such as robots, unmanned aerial vehicles and autonomous driving systems, complex multi-branch, multi-task neural network models are increasingly common, and information fusion within these models is particularly important. Most existing information fusion methods use concatenation or weighted averaging as the fusion strategy: concatenation greatly increases the feature dimension and therefore requires a large amount of computing resources, while weighted averaging, as a simple linear combination, cannot fit a nonlinear fusion function.
He K, Zhang X, Ren S, et al. Deep Residual Learning for Image Recognition[A]. IEEE Conference on Computer Vision and Pattern Recognition[C]. Las Vegas, NV, United States: IEEE, 2016: 770–778.
Huang G, Liu Z, van der Maaten L, et al. Densely Connected Convolutional Networks[A]. IEEE Conference on Computer Vision and Pattern Recognition[C]. Honolulu, HI, USA: IEEE, 2017: 2261–2269.
Cho K, van Merrienboer B, Bahdanau D, et al. On the Properties of Neural Machine Translation: Encoder-Decoder Approaches[A]. Workshop on Syntax, Semantics and Structure in Statistical Translation[C]. Doha, Qatar: 2014.
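The trade-off described in the Background can be made concrete with a small sketch. The following Python snippet is illustrative only: plain lists stand in for feature tensors, and the helper names `concatenate` and `weighted_average` are hypothetical, not taken from the patent. It shows that concatenation grows the feature dimension, while a weighted average keeps the dimension but is a linear map for fixed weights.

```python
# Illustrative only: short lists stand in for neural network feature tensors.

def concatenate(features):
    """Fusion by concatenation: output length is the sum of the input lengths."""
    fused = []
    for f in features:
        fused.extend(f)
    return fused


def weighted_average(features, weights):
    """Fusion by weighted average: output keeps the input length, but the
    mapping from inputs to output is linear once the weights are fixed."""
    length = len(features[0])
    return [
        sum(w * f[k] for w, f in zip(weights, features))
        for k in range(length)
    ]


a = [1.0, 2.0, 3.0]
b = [3.0, 2.0, 1.0]

concat_out = concatenate([a, b])                     # 6 elements
average_out = weighted_average([a, b], [0.5, 0.5])   # 3 elements
```

With n inputs of equal size, the concatenated output is n times larger, which is the cost the Background refers to.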
Disclosure of Invention
The invention aims to provide a neural network information fusion method with a small computational cost and strong fitting capability. The technical scheme of the invention is as follows:
A gated neural network information fusion method comprises the following steps:
1) giving the neural network feature tensors to be fused, I_1, I_2, …, I_n, n in total; these tensors are called fusion inputs;
2) determining the dimensionality of the output tensor, and denoting the output tensor as O;
3) applying neural network computation, including transformation and activation, to each input so that the input has the same dimensionality as the output tensor O;
4) selecting suitable fusion evidence: the fusion evidence is a feature tensor used to compute the fusion weight that controls each fusion input; the fusion evidence of the i-th input is denoted E_i, so that the inputs I_1, I_2, …, I_n have corresponding evidence E_1, E_2, …, E_n;
5) for the i-th fusion path, performing neural network computation on E_i;
6) activating the computed fusion evidence E_i to obtain the fusion weight α_i;
7) multiplying the fusion weight α_i by the input I_i;
8) combining the paths linearly or nonlinearly into the output tensor O.
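Steps 1) to 8) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: plain lists stand in for feature tensors that already have the shape of O (step 3), the callables in `gate_fns` are hypothetical placeholders for the "neural network computation" on each E_i, Sigmoid is picked as the activation, and the final combination is a plain sum (the linear case of step 8).

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(inputs, evidences, gate_fns, activation=sigmoid):
    """Fuse n same-sized feature vectors I_1..I_n element by element.

    inputs     : n vectors, assumed already transformed to the size of O (step 3)
    evidences  : n vectors E_1..E_n, one per input (step 4)
    gate_fns   : n callables standing in for the neural network computation
                 applied to each E_i (step 5); hypothetical placeholders
    activation : scalar function that yields each element of alpha_i (step 6)
    """
    size = len(inputs[0])
    output = [0.0] * size
    for I_i, E_i, g_i in zip(inputs, evidences, gate_fns):
        z_i = g_i(E_i)                               # step 5
        alpha_i = [activation(z) for z in z_i]       # step 6
        for k in range(size):
            output[k] += alpha_i[k] * I_i[k]         # steps 7 and 8 (sum)
    return output


def identity(e):
    return e


I = [[1.0, 2.0], [10.0, 20.0]]
E = [[0.0, 0.0], [0.0, 0.0]]      # zero evidence gives alpha = 0.5 everywhere
O = gated_fusion(I, E, [identity, identity])
```

Because each α_i is computed per element from its own evidence, the weights can differ at every position of the tensor, which is what makes the fusion fine-grained and nonlinear in the evidence.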
The substantive features of the invention are as follows: by introducing fusion evidence and nonlinear operations, it provides a nonlinear, intelligent fusion method that can fuse any number of information paths as well as heterogeneous multi-source information, can robustly perform fine-grained fusion of arbitrary feature tensors, and can be used to improve any neural network model and some non-neural-network models. The beneficial effects are as follows:
1. It is applicable to all neural networks and to some non-neural-network methods.
2. Compared with existing fusion methods, the invention achieves better fusion performance and provides a multi-path, nonlinear fusion strategy with fine-grained control.
3. The method is simple to implement and has little impact on the existing network structure.
Drawings
FIG. 1 Structure of the invention
FIG. 2 Fusion structure of the embodiment
Detailed Description
The method of the invention can fuse feature tensors element by element; the size and number of the input and output tensors are flexible; the fusion strategy is learned and has strong expressive power; and the method is not limited to a particular neural network or network structure, so it has strong generality and practicality. In order to solve the above problems and achieve the above object, the technical solution of the invention is as follows:
1) given any number of neural network feature tensors to be fused, denote these inputs as I_1, I_2, …, I_n, n in total (yellow arrows in FIG. 1); these inputs are referred to as fusion inputs;
2) determining the dimensionality of the output tensor, and denoting the output tensor as O;
3) applying neural network computation (including transformation, activation and the like) to each input so that the input has the same dimensionality as the output tensor O;
4) selecting suitable fusion evidence: fusion evidence refers to the feature tensor used to compute the fusion weight that controls a fusion input (green arrows in FIG. 1); the fusion evidence of the i-th input is denoted E_i, so that the inputs I_1, I_2, …, I_n have corresponding evidence E_1, E_2, …, E_n;
5) for the i-th fusion path, optionally performing neural network computation on E_i (the "arbitrary function" in FIG. 1);
6) activating the computed fusion evidence E_i (e.g. with Sigmoid, tanh, ReLU, etc.; the "activation function" in FIG. 1) to obtain the fusion weight α_i;
7) multiplying the fusion weight α_i by the input I_i (the "×" symbol in FIG. 1);
8) combining the paths linearly or nonlinearly into the output tensor O (the "+" symbol in FIG. 1);
9) in particular, when there are only two paths of data to be fused, the two paths can share the same fusion evidence, and α and 1 - α can be used as the weights of the two paths (as shown in FIG. 1).
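The two-path special case in step 9) can be sketched as follows. This is an illustrative Python snippet under assumptions: lists stand in for tensors, `gate_fn` is a hypothetical placeholder for the optional neural network computation on the shared evidence E, and Sigmoid is chosen as the activation so that both α and 1 - α lie between 0 and 1.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def two_way_gated_fusion(I_1, I_2, E, gate_fn=lambda e: e):
    """Fuse two same-sized vectors with one shared evidence vector E.

    alpha = sigmoid(gate_fn(E)) weights I_1 and (1 - alpha) weights I_2,
    element by element. gate_fn is a hypothetical stand-in for the
    optional neural network computation of step 5.
    """
    z = gate_fn(E)
    alpha = [sigmoid(v) for v in z]
    return [a * x + (1.0 - a) * y for a, x, y in zip(alpha, I_1, I_2)]


# Zero evidence averages the two inputs; large positive evidence selects I_1.
O = two_way_gated_fusion([1.0, 1.0], [3.0, 3.0], [0.0, 50.0])
```

Sharing the evidence halves the gating computation, and because the two weights sum to 1 the output stays on the same scale as the inputs. This is the same form of gate that GRU uses to blend its previous state with a candidate state.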
This section is based on the Inception-v4 network architecture proposed by Szegedy et al. (2016), which features multiple parallel branches. The invention is clearly not limited to this base architecture, which serves only as an example.
Szegedy C, Ioffe S, Vanhoucke V, et al. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning[C]//AAAI Conference on Artificial Intelligence. San Francisco, CA, USA: AAAI, 2017.
(1) Prepare suitable training data; the training data of this example comprises training images and class labels.
(2) Build the Inception-v4 base network.
(3) Determine the data to be fused (FIG. 2); this example uses a three-input fusion unit. Specifically, the parallel branches of each unit in Inception-v4 are used as fusion inputs.
(4) Select the fusion evidence. In this example, the output of the previous unit in Inception-v4 is taken as the fusion evidence of the unit, and a convolution followed by a rectified linear unit (ReLU) is appended as the fusion control branch.
(5) In order to obtain different receptive fields, the convolutions added for the large-, medium- and small-scale convolution branches in Inception-v4 are, respectively, a 3 × 3 convolution, a 3 × 3 dilated convolution with a dilation rate of 2, and a 3 × 3 dilated convolution with a dilation rate of 4.
(6) Activate the fusion evidence. At the end of the fusion control branch of the previous step, tanh is added as the activation function.
(7) Multiply each fusion input by its corresponding fusion weight.
(8) Add the resulting data of all paths.
(9) Input the training data from step (1) into the resulting neural network and train with mini-batch stochastic gradient descent (mini-batch SGD), using the sum of the cross-entropy loss and the weight decay loss as the loss term, with 32 training images per batch and a weight decay coefficient of 0.01. The learning rate starts at 0.001 and decays exponentially by a factor of 0.95 every epoch, and training continues until the loss function value converges.
(10) Save the neural network weights obtained by training in step (9).
(11) Input the image to be classified into the neural network model obtained in step (10); the prediction obtained is the classification result of that image.
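The numeric settings of this example can be checked with a short Python sketch. It is illustrative only and rests on assumptions: the function names are hypothetical; the effective kernel size uses the standard dilated-convolution relation k + (k - 1)(d - 1); the learning rate is read as 0.001 multiplied by 0.95 once per epoch; and the weight decay loss is assumed to be the coefficient times the sum of squared weights, a form the text does not spell out.

```python
def effective_kernel_size(kernel_size, dilation):
    """Extent covered by a dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def learning_rate(epoch, base_lr=0.001, decay=0.95):
    """Exponential decay: base_lr * decay**epoch (epoch counted from 0)."""
    return base_lr * decay ** epoch


def total_loss(cross_entropy, weights, weight_decay=0.01):
    """Cross-entropy plus an assumed L2 weight decay term."""
    return cross_entropy + weight_decay * sum(w * w for w in weights)


# Step (5): 3 x 3 kernels with dilation rates 1, 2 and 4.
kernel_extents = [effective_kernel_size(3, d) for d in (1, 2, 4)]

# Step (9): learning rate at the start and after ten epochs.
lr_start = learning_rate(0)
lr_after_10 = learning_rate(10)

# Step (9): loss term for a toy cross-entropy value and two toy weights.
loss = total_loss(1.0, [1.0, 2.0])
```

The three dilation rates give the control branches single-layer extents of 3, 5 and 9 pixels with the same number of kernel parameters, which is how the example obtains different receptive fields without larger kernels.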
Claims (1)
1. An image classification method based on gated neural network information fusion, comprising the following steps:
(1) preparing training data, wherein the training data comprises training images and class labels;
(2) building an Inception-v4 base network;
(3) taking the parallel branches of each unit in Inception-v4 as fusion inputs;
(4) selecting fusion evidence: taking the output of the previous unit in Inception-v4 as the fusion evidence of the unit, and appending a convolution followed by a rectified linear unit (ReLU) as a fusion control branch;
(5) in order to obtain different receptive fields, the convolutions added for the large-, medium- and small-scale convolution branches in Inception-v4 are, respectively, a 3 × 3 convolution, a 3 × 3 dilated convolution with a dilation rate of 2, and a 3 × 3 dilated convolution with a dilation rate of 4;
(6) activating the fusion evidence: at the end of the fusion control branch of step (4), adding tanh as the activation function;
(7) multiplying each fusion input by the corresponding fusion weight;
(8) adding the resulting data of all paths;
(9) inputting the training data obtained in step (1) into the resulting neural network, using mini-batch stochastic gradient descent (mini-batch SGD) as the optimization method, selecting the sum of the cross-entropy loss and the weight decay loss as the loss term, setting 32 training images per batch and a weight decay coefficient of 0.01, and training until the loss function value converges, wherein the learning rate starts from 0.001 and decays exponentially by a factor of 0.95 every epoch;
(10) saving the neural network weights obtained by training in step (9);
(11) inputting the image to be classified into the neural network model obtained in step (10), wherein the prediction obtained is the classification result of the image to be classified;
the image classification method adopts a gated neural network information fusion method comprising the following steps:
1) giving the neural network feature tensors to be fused, I_1, I_2, …, I_n, n in total; these tensors are called fusion inputs;
2) determining the dimensionality of the output tensor, and denoting the output tensor as O;
3) applying neural network computation, including transformation and activation, to each input so that the input has the same dimensionality as the output tensor O;
4) selecting suitable fusion evidence: the fusion evidence is a feature tensor used to compute the fusion weight that controls each fusion input; the fusion evidence of the i-th input is denoted E_i, so that the inputs I_1, I_2, …, I_n have corresponding evidence E_1, E_2, …, E_n;
5) for the i-th fusion path, performing neural network computation on E_i;
6) activating the computed fusion evidence E_i to obtain the fusion weight α_i;
7) multiplying the fusion weight α_i by the input I_i;
8) combining the paths linearly or nonlinearly into the output tensor O.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810835370.5A CN109063765B (en) | 2018-07-26 | 2018-07-26 | Image classification method based on gated neural network information fusion |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810835370.5A CN109063765B (en) | 2018-07-26 | 2018-07-26 | Image classification method based on gated neural network information fusion |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109063765A (en) | 2018-12-21 |
CN109063765B (en) | 2020-03-27 |
Family
ID=64836583
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810835370.5A Expired - Fee Related CN109063765B (en) | 2018-07-26 | 2018-07-26 | Image classification method based on gated neural network information fusion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109063765B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107622275A (en) * | 2017-08-21 | 2018-01-23 | Xidian University | Data fusion target recognition method based on evidence combination |
CN107729991A (en) * | 2017-10-19 | 2018-02-23 | Tianjin University | Neural network neuron selective activation method with learnable position |
Non-Patent Citations (5)
Title |
---|
Deep Residual Learning for Image Recognition; Kaiming He et al.; arXiv.org; 20151231; pp. 1-12 * |
Densely Connected Convolutional Networks; Gao Huang et al.; arXiv.org; 20180128; pp. 1-9 * |
Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning; Christian Szegedy et al.; Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence; 20171231; pp. 4278-4284 * |
Information fusion method based on D-S evidence theory and neural networks and its application; Zhang Youchun et al.; Proceedings of the 27th Chinese Control Conference; 20080718; pp. 623-626 * |
Face verification based on information fusion of binary-tree convolutional neural networks; Yang Ziwen et al.; Journal of Computer Applications; 20171231; Vol. 37 (No. S2); pp. 155-159 * |
Also Published As
Publication number | Publication date |
---|---|
CN109063765A (en) | 2018-12-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110223517B (en) | Short-term traffic flow prediction method based on space-time correlation | |
CN109816095B (en) | Network flow prediction method based on improved gated cyclic neural network | |
CN108205889B (en) | Method for predicting highway traffic flow based on convolutional neural network | |
Liang et al. | A fast and accurate online sequential learning algorithm for feedforward networks | |
CN110851782A (en) | Network flow prediction method based on lightweight spatiotemporal deep learning model | |
Shi et al. | AdaSGN: Adapting joint number and model size for efficient skeleton-based action recognition | |
Li et al. | DS-Net++: Dynamic weight slicing for efficient inference in CNNs and vision transformers | |
CN109543502A (en) | A kind of semantic segmentation method based on the multiple dimensioned neural network of depth | |
Xia et al. | Fully dynamic inference with deep neural networks | |
US11551076B2 (en) | Event-driven temporal convolution for asynchronous pulse-modulated sampled signals | |
CN110852295B (en) | Video behavior recognition method based on multitasking supervised learning | |
CN111445498A (en) | Target tracking method adopting Bi-LSTM neural network | |
CN112766603B (en) | Traffic flow prediction method, system, computer equipment and storage medium | |
CN114362859B (en) | Adaptive channel modeling method and system for enhanced condition generation countermeasure network | |
CN109583659A (en) | User's operation behavior prediction method and system based on deep learning | |
Chang et al. | Differentiable architecture search with ensemble gumbel-softmax | |
Jing et al. | Task transfer by preference-based cost learning | |
CN109063765B (en) | Image classification method based on gated neural network information fusion | |
Li et al. | Ds-net++: Dynamic weight slicing for efficient inference in cnns and transformers | |
Zand et al. | Flow-based Spatio-Temporal Structured Prediction of Dynamics | |
CN117808054A (en) | Complex system identification and reconstruction method based on machine learning | |
CN117131979A (en) | Traffic flow speed prediction method and system based on directed hypergraph and attention mechanism | |
CN117392846A (en) | Traffic flow prediction method for space-time self-adaptive graph learning fusion dynamic graph convolution | |
CN115953839B (en) | Real-time 2D gesture estimation method based on loop architecture and key point regression | |
CN114338416A (en) | Space-time multi-index prediction method and device and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20200327; Termination date: 20210726 |