CN112257796A - Image integration method of convolutional neural network based on selective characteristic connection - Google Patents
Image integration method of convolutional neural network based on selective characteristic connection
- Publication number
- CN112257796A CN112257796A CN202011174153.XA CN202011174153A CN112257796A CN 112257796 A CN112257796 A CN 112257796A CN 202011174153 A CN202011174153 A CN 202011174153A CN 112257796 A CN112257796 A CN 112257796A
- Authority
- CN
- China
- Prior art keywords
- characteristic
- features
- convolutional neural
- level
- neural network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformation in the plane of the image
- G06T3/40—Scaling the whole image or part thereof
- G06T3/4084—Transform-based scaling, e.g. FFT domain scaling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
Abstract
The invention discloses an image integration method for a convolutional neural network based on selective feature connection, comprising the following steps: computing the average features of the low-level features and of the high-level features respectively; subtracting the average feature of the low-level features from the average feature of the high-level features to obtain the score of the key feature map; scaling the average feature of the high-level features; performing Softmax normalization to obtain a feature Z; and performing maximum-value normalization on the feature Z to obtain the attention score. This high-/low-level feature fusion scheme based on selective feature connection integrates feature-map information more effectively, makes better use of the learned features, and adds no extra parameters. It optimizes the structure of the convolutional neural network and improves network performance; the gains are especially significant for shallow convolutional neural networks, allowing them to be applied in more fields.
Description
Technical Field
The invention belongs to the technical field of convolutional neural networks, and in particular relates to an image integration method for a convolutional neural network based on selective feature connection.
Background
In recent years, research on network architectures has attracted much attention, and many excellent architectures have been proposed in succession. GoogLeNet constructs a 22-layer convolutional neural network, yet reduces the number of parameters from 60 million to 4 million by using the Inception module. VGGNet demonstrates that increasing network depth with very small convolution filters can effectively improve model performance. However, increasing depth is not simply a matter of stacking more layers: adding layers to a model of suitable depth can lead to higher training error because of vanishing and exploding gradients, which make deep networks difficult to train. Highway Networks propose an effective method that uses bypasses and gating units to train end-to-end networks with more than 100 layers; the bypass is considered a key factor in training these very deep networks. ResNet further confirms this view: it adds identity mappings as bypasses and, by using residual blocks, achieves breakthrough advances in many challenging tasks (image recognition, localization, detection, etc.).
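As an illustrative aside (not part of the patent), the identity-mapping bypass described above can be sketched in NumPy: the block output is F(x) + x, so a direct path always exists through the identity term. The fully connected form and the weight names w1/w2 are assumptions for the sketch.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """A minimal fully-connected residual block: y = relu(x @ w1) @ w2 + x.

    The `+ x` identity bypass is the key idea the text attributes to ResNet.
    Weight names w1/w2 are illustrative, not from the patent.
    """
    return relu(x @ w1) @ w2 + x

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
# With zero weights the block reduces to the identity map, as the bypass guarantees.
w_zero = np.zeros((8, 8))
y = residual_block(x, w_zero, w_zero)
assert np.allclose(y, x)
```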
Novel visualization techniques enable an in-depth understanding of the features in the intermediate layers of a convolutional neural network and of the operation of the classifier. In fact, feature maps at different levels extract different kinds of information from the input image: low-level features capture more detailed information, while high-level features capture more semantic information, which is closer to the final layer carrying the class labels. In many computer vision tasks, combining high-level and low-level information can effectively improve experimental performance.
At present, the convolutional neural network (CNN) is an important branch of deep learning, and the hardware foundation it requires as a major research direction has gradually matured. As hardware has improved, deep learning algorithms have diversified; low-level languages such as C and C++ can no longer satisfy many deep learning research needs, and more convenient and flexible development frameworks such as TensorFlow, Caffe, Theano, Keras, and Torch have emerged. Visualization techniques make it possible to analyze the features of each layer of a convolutional neural network in depth: high-level features contain more semantic information, while low-level features contain more detailed information. Integrating high-level and low-level information to improve experimental performance is therefore an important research direction for convolutional neural networks in many computer vision tasks.
In a convolutional neural network, fusing high-level and low-level features is an effective way to improve network performance. However, low-level features suffer from background clutter and semantic ambiguity, so fusing high-level and low-level features directly may propagate that clutter and ambiguity into the fused features and degrade network performance.
Disclosure of Invention
In view of the shortcomings of the prior art, the technical problem to be solved by the invention is to provide an image integration method for a convolutional neural network based on selective feature connection, in which the low-level features are first processed by a selective feature connection mechanism and then fused with the high-level features, thereby improving network performance.
In order to solve the above technical problem, the present invention provides an image integration method based on a convolutional neural network with selective feature connection, which includes the following steps:
step 1: compute the average features of the low-level and high-level features respectively;
step 2: subtract the average feature of the low-level features from the average feature of the high-level features obtained in step 1 to obtain the score of the key feature map;
step 3: scale the average feature of the high-level features;
step 4: apply Softmax normalization separately to the key-feature-map score obtained in step 2 and to the scaled result of step 3 to obtain the feature Z;
step 5: apply maximum-value normalization to the feature Z to obtain the attention score.
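A minimal NumPy sketch (not from the patent) of steps 1-5, assuming the low-level feature A has shape (F, G, C1), the high-level feature B has shape (F, G, C2), the Softmax runs over all spatial positions, and maximum-value normalization divides by the largest element; the scale factor n is a hypothetical choice, since its value is not given in the text:

```python
import numpy as np

def spatial_softmax(x):
    """Softmax over all spatial positions of a 2-D map."""
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_score(A, B, n=0.5):
    """Steps 1-5: average features -> key score P -> scaled D -> Z -> M.

    A: low-level features, shape (F, G, C1); B: high-level, (F, G, C2).
    n is the scaling factor of step 3 (its value is not given in the text).
    """
    Am = A.mean(axis=2)          # step 1: average over the C1 channels
    Bm = B.mean(axis=2)          #         and over the C2 channels
    P = Bm - Am                  # step 2: score of the key feature map
    D = Bm * n                   # step 3: scaled high-level average
    Z = spatial_softmax(P) - spatial_softmax(D)   # step 4
    return Z / Z.max()           # step 5: maximum-value normalization

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6, 16))
B = rng.standard_normal((6, 6, 32))
M = attention_score(A, B)
assert M.shape == (6, 6)
```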
Optionally, in step 1, the average feature of the low-level features is computed as:

Am_{i,j} = (1/C1) Σ_{c=1}^{C1} A_{i,j,c}

and the average feature of the high-level features as:

Bm_{i,j} = (1/C2) Σ_{c=1}^{C2} B_{i,j,c}

where Am ∈ R^(F×G×1), Bm ∈ R^(F×G×1), A_{i,j,c} denotes the value of A at spatial location (i, j, c), B_{i,j,c} denotes the value of B at spatial location (i, j, c), C1 is the number of low-level feature maps, and C2 is the number of high-level feature maps.
Optionally, in step 2, the score of the key feature map is as follows:

P = Bm − Am.
Further, in step 3, the average feature of the high-level features is scaled as follows:

D = Bm · n

where n is a scaling factor.
Further, in step 4, Softmax normalization is applied separately to the key-feature-map score P obtained in step 2 and to the scaled result D of step 3:

SP = Softmax(P), SD = Softmax(D)

where the Softmax is taken over all spatial positions. The resulting feature Z is as follows:

Z = SP − SD.
Therefore, the image integration method for a convolutional neural network based on selective feature connection has the following beneficial effects:
the high-/low-level feature fusion scheme based on selective feature connection integrates feature-map information more effectively, makes better use of the learned features, and adds no extra parameters. It optimizes the structure of the convolutional neural network and improves network performance; the gains are especially significant for shallow convolutional neural networks, allowing them to be applied in more fields.
The foregoing is only an overview of the technical solution of the present invention. In order that the technical means of the invention may be understood more clearly and implemented in accordance with the contents of the description, and in order that the above and other objects, features, and advantages of the invention may be more readily apparent, preferred embodiments are described in detail below with reference to the accompanying drawings.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings of the embodiments will be briefly described below.
Fig. 1 is a diagram of a CNN network architecture for selective feature connection;
FIG. 2 is a high-low level feature direct fusion map;
FIG. 3 is a diagram of high-level and low-level feature additive fusion;
FIG. 4 is a diagram of a process for selective feature computation.
Detailed Description
Other aspects, features and advantages of the present invention will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, which form a part of this specification, and which illustrate, by way of example, the principles of the invention. In the referenced drawings, the same or similar components in different drawings are denoted by the same reference numerals.
The invention applies a general network architecture, the Selective Feature Connection Mechanism (SFCM), to connect convolutional neural network features from different layers. Features at different layers contain different information: high-level features always contain more semantic information, while low-level features contain more detailed information. The low-level features, however, are affected by the background, which causes background clutter and semantic ambiguity, and combining high-level and low-level features directly propagates these problems. The SFCM effectively overcomes this drawback: borrowing from the human visual recognition mechanism, the low-level features are selectively connected to the high-level features through a feature selector generated from the high-level features. This mechanism can be employed in many network architectures.
A classical convolutional neural network consists of an input layer, convolutional layers, pooling layers, a fully connected layer, and an output layer; the extracted convolutional features progress from low level to high level, and the final output is obtained from the high-level features. To improve network performance, the invention draws on the residual structure of the ResNet model and obtains the final output after fusing the high-level and low-level features through the selective feature connection mechanism. The network structure is shown in Fig. 1.
the selective characteristic operation process will be described in detail below.
Existing methods for fusing high-level and low-level features generally concatenate the feature maps of different layers directly; the combined feature is shown in formula (1):

O = [A, B] (1)

where A ∈ R^(F×G×C1) denotes the low-level features of the convolutional neural network, C1 is the number of low-level feature maps, and G and F are the width and height of the feature maps; B ∈ R^(F×G×C2) denotes the high-level features and C2 the number of high-level feature maps; O ∈ R^(F×G×(C1+C2)) is the combined feature. The whole process is shown in Fig. 2.
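To illustrate (outside the patent text) why direct concatenation inflates the subsequent fully connected layer, here is a hedged NumPy sketch: the combined feature has C1 + C2 channels, so a fully connected layer over it needs F·G·(C1+C2) weights per output unit instead of F·G·C2. The sizes below are arbitrary toy values.

```python
import numpy as np

F, G, C1, C2 = 8, 8, 16, 32
A = np.ones((F, G, C1))   # low-level features (toy values)
B = np.ones((F, G, C2))   # high-level features

# Formula (1): channel-wise concatenation O = [A, B].
O = np.concatenate([A, B], axis=2)
assert O.shape == (F, G, C1 + C2)

# Weights per output unit of a fully connected layer placed after the fusion:
params_concat = F * G * (C1 + C2)   # after concatenation
params_high = F * G * C2            # after using only the high-level features
print(params_concat, params_high)   # 3072 vs 2048 for these toy sizes
```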
However, the combined features obtained by direct concatenation sharply increase the number of parameters of the fully connected layer, so the invention fuses the high-level and low-level features by addition, as shown in formula (2):

O = A1 + B (2)

where A ∈ R^(F×G×C1) denotes the low-level features of the convolutional neural network, A1 ∈ R^(F×G×C2) denotes the low-level features after transformation to the same number of channels as the high-level features, B ∈ R^(F×G×C2) denotes the high-level features, and O ∈ R^(F×G×C2) is the combined feature. The process is shown in Fig. 3.
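A minimal sketch (not from the patent) of the additive fusion in formula (2); the 1×1-convolution-like channel projection used to turn A into A1 is an assumption, since the text only says the low-level features are transformed to match the high-level channel count:

```python
import numpy as np

rng = np.random.default_rng(2)
F, G, C1, C2 = 8, 8, 16, 32
A = rng.standard_normal((F, G, C1))   # low-level features
B = rng.standard_normal((F, G, C2))   # high-level features

# Assumed channel transform: a 1x1-convolution-like linear map C1 -> C2.
W = rng.standard_normal((C1, C2))
A1 = A @ W                            # shape (F, G, C2)

# Formula (2): additive fusion keeps the channel count at C2,
# so the following fully connected layer grows no larger.
O = A1 + B
assert O.shape == (F, G, C2)
```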
However, directly connecting the low-level and high-level features does not fully exploit their complementary information. High-level features contain more semantic information, and low-level features contain more detailed information; combining them directly may cause background clutter and semantic ambiguity because too much detailed information is introduced. The invention therefore proposes a Selective Feature Connection Mechanism (SFCM) inspired by the human visual recognition mechanism: an attention score is assigned to each element of the low-level feature map, representing the importance of that element.
First, the average features Am and Bm of the low-level and high-level features are computed. The average feature of the low-level features is shown in formula (3):

Am_{i,j} = (1/C1) Σ_{c=1}^{C1} A_{i,j,c} (3)

and the average feature of the high-level features in formula (4):

Bm_{i,j} = (1/C2) Σ_{c=1}^{C2} B_{i,j,c} (4)

where Am ∈ R^(F×G×1), Bm ∈ R^(F×G×1), A_{i,j,c} denotes the value of A at spatial location (i, j, c), and B_{i,j,c} denotes the value of B at spatial location (i, j, c).
The shallow layers of a network extract texture and detail features, while the deep layers extract contour, shape, and the strongest features. The shallow layers contain more features, including the key ones; however, the deeper the layer, the more representative the extracted features and the more prominent the key features. Therefore, the average feature of the low level is subtracted from the average feature of the high level to obtain the score P of the key feature map, as shown in formula (5):
P=Bm-Am (5)
the average feature Bm of the high-level features is scaled to obtain D, as shown in equation (6):
D=Bm*n (6)
Softmax normalization is performed on P and D respectively, as shown in formulas (7) and (8):

SP_{i,j} = exp(P_{i,j}) / Σ_{i',j'} exp(P_{i',j'}) (7)

SD_{i,j} = exp(D_{i,j}) / Σ_{i',j'} exp(D_{i',j'}) (8)

The feature Z is then obtained as shown in formula (9); it represents the importance of the corresponding position of each element of the low-level features:

Z = SP − SD (9)
The attention score M, i.e. the feature selector, is obtained by maximum-value normalization of the feature Z, as shown in formula (10):

M_{i,j} = Z_{i,j} / max_{i',j'} Z_{i',j'} (10)

where M ∈ R^(F×G×1) and M_{i,j} is the final score at position (i, j). The learned attention score represents the importance of the corresponding position of each element of the low-level features, so multiplying the low-level features by the attention score screens out the important low-level features. The new low-level feature As is obtained from formula (11):

As_{i,j,c} = M_{i,j} · A_{i,j,c} (11)
the new low-level features are augmented to a1 for fusing the high-level features. And (4) fusing the high-low layer features, and calculating a fusion coefficient L. If the average score of each of the feature maps a1 and B is E and F, it is determined from equation (12) and equation (13).
From this, a fusion coefficient L can be calculated as shown in equation (14)
The final combined features are then as shown in equation (15):
O=L*A1+B (15)
the whole selective characteristic operation process is shown in fig. 4.
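A hedged end-to-end NumPy sketch (outside the patent text) of the selective feature connection operation above. The C1→C2 projection standing in for the channel expansion of As to A1, the scale factor n, the divide-by-max normalization, and the fusion coefficient L = mean(B)/mean(A1) are assumptions, since the formula bodies for equations (10) and (12)-(14) are missing from this text:

```python
import numpy as np

def spatial_softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def sfcm_fuse(A, B, W, n=0.5):
    """Selective feature connection: A (F,G,C1) low-level, B (F,G,C2) high-level.

    W is an assumed C1->C2 projection standing in for the channel transform;
    n, the divide-by-max normalization, and L = mean(B)/mean(A1) are
    reconstructions -- the source omits these formula bodies.
    """
    Am, Bm = A.mean(axis=2), B.mean(axis=2)         # formulas (3), (4)
    P = Bm - Am                                     # (5) key-feature score
    D = Bm * n                                      # (6) scaled average
    Z = spatial_softmax(P) - spatial_softmax(D)     # (7)-(9)
    M = Z / Z.max()                                 # (10) feature selector
    As = A * M[..., None]                           # (11) select low-level info
    A1 = As @ W                                     # expand to C2 channels
    L = B.mean() / A1.mean()                        # (12)-(14) fusion coefficient
    return L * A1 + B                               # (15) combined feature

rng = np.random.default_rng(3)
A = rng.standard_normal((6, 6, 16))
B = rng.standard_normal((6, 6, 32))
W = rng.standard_normal((16, 32))
O = sfcm_fuse(A, B, W)
assert O.shape == (6, 6, 32)
```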
As can be seen from the feature selector M, it enhances the salient regions of the low-level features and suppresses their background regions; with the SFCM, most pixels in the low-level feature map are suppressed. Therefore, without damaging the semantic expression of the high-level features, more detailed information from the salient regions of the low-level feature map is added, the expressive power of the features is further enhanced, and better performance is obtained.
Two convolutional neural networks were built to carry out image classification experiments on the cifar10 and cifar100 datasets. First, a 9-layer convolutional neural network model was built, comprising an input layer, 3 convolutional layers, 3 pooling layers, 1 fully connected layer (the feature extraction layer), and an output layer (the Softmax layer). An 11-layer convolutional neural network model was also built, comprising an input layer, 5 convolutional layers, 3 pooling layers, 1 fully connected layer (the feature extraction layer), and an output layer.
The results of the experiment are shown in Table 1.
Table 1: image recognition rates on datasets cifar10 and cifar100
As can be seen from Table 1, direct fusion of the high-level and low-level features can reduce the image recognition rate of the network, whereas the convolutional neural network based on the selective feature connection mechanism increases it: compared with the conventional convolutional neural network, accuracy improves by 0.9% on the cifar10 dataset and by 1.4% on the cifar100 dataset, demonstrating the effectiveness and superiority of the selective feature connection mechanism.
While the foregoing is directed to the preferred embodiment of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow.
Claims (6)
1. An image integration method based on a convolutional neural network with selective feature connection is characterized by comprising the following steps:
step 1: compute the average features of the low-level and high-level features respectively;
step 2: subtract the average feature of the low-level features from the average feature of the high-level features obtained in step 1 to obtain the score of the key feature map;
step 3: scale the average feature of the high-level features;
step 4: apply Softmax normalization separately to the key-feature-map score obtained in step 2 and to the scaled result of step 3 to obtain the feature Z;
step 5: apply maximum-value normalization to the feature Z to obtain the attention score.
2. The method of claim 1, wherein in step 1, the average feature of the low-level features is as follows:

Am_{i,j} = (1/C1) Σ_{c=1}^{C1} A_{i,j,c}

and the average feature of the high-level features is as follows:

Bm_{i,j} = (1/C2) Σ_{c=1}^{C2} B_{i,j,c}

where Am ∈ R^(F×G×1), Bm ∈ R^(F×G×1), A_{i,j,c} denotes the value of A at spatial location (i, j, c), B_{i,j,c} denotes the value of B at spatial location (i, j, c), C1 is the number of low-level feature maps, and C2 is the number of high-level feature maps.
3. The method of claim 2, wherein in step 2, the scores of the key feature maps are as follows:
P = Bm − Am.
5. The method for integrating images based on a convolutional neural network with selective feature connection as claimed in claim 1, wherein in step 4, Softmax normalization is applied separately to the key-feature-map score P obtained in step 2 and to the scaled result D of step 3:

SP = Softmax(P), SD = Softmax(D)

where the Softmax is taken over all spatial positions; the resulting feature Z is as follows:

Z = SP − SD.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011174153.XA CN112257796A (en) | 2020-10-28 | 2020-10-28 | Image integration method of convolutional neural network based on selective characteristic connection |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011174153.XA CN112257796A (en) | 2020-10-28 | 2020-10-28 | Image integration method of convolutional neural network based on selective characteristic connection |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112257796A true CN112257796A (en) | 2021-01-22 |
Family
ID=74261703
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011174153.XA Pending CN112257796A (en) | 2020-10-28 | 2020-10-28 | Image integration method of convolutional neural network based on selective characteristic connection |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112257796A (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109902748A (en) * | 2019-03-04 | 2019-06-18 | 中国计量大学 | A kind of image, semantic dividing method based on the full convolutional neural networks of fusion of multi-layer information |
WO2019144575A1 (en) * | 2018-01-24 | 2019-08-01 | 中山大学 | Fast pedestrian detection method and device |
CN110084794A (en) * | 2019-04-22 | 2019-08-02 | 华南理工大学 | A kind of cutaneum carcinoma image identification method based on attention convolutional neural networks |
CN110097145A (en) * | 2019-06-20 | 2019-08-06 | 江苏德劭信息科技有限公司 | One kind being based on CNN and the pyramidal traffic contraband recognition methods of feature |
CN110728192A (en) * | 2019-09-16 | 2020-01-24 | 河海大学 | High-resolution remote sensing image classification method based on novel characteristic pyramid depth network |
CN111553289A (en) * | 2020-04-29 | 2020-08-18 | 中国科学院空天信息创新研究院 | Remote sensing image cloud detection method and system |
CN111753752A (en) * | 2020-06-28 | 2020-10-09 | 重庆邮电大学 | Robot closed loop detection method based on convolutional neural network multi-layer feature fusion |
- 2020-10-28: CN application CN202011174153.XA filed; publication CN112257796A, status Pending
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019144575A1 (en) * | 2018-01-24 | 2019-08-01 | 中山大学 | Fast pedestrian detection method and device |
CN109902748A (en) * | 2019-03-04 | 2019-06-18 | 中国计量大学 | A kind of image, semantic dividing method based on the full convolutional neural networks of fusion of multi-layer information |
CN110084794A (en) * | 2019-04-22 | 2019-08-02 | 华南理工大学 | A kind of cutaneum carcinoma image identification method based on attention convolutional neural networks |
CN110097145A (en) * | 2019-06-20 | 2019-08-06 | 江苏德劭信息科技有限公司 | One kind being based on CNN and the pyramidal traffic contraband recognition methods of feature |
CN110728192A (en) * | 2019-09-16 | 2020-01-24 | 河海大学 | High-resolution remote sensing image classification method based on novel characteristic pyramid depth network |
CN111553289A (en) * | 2020-04-29 | 2020-08-18 | 中国科学院空天信息创新研究院 | Remote sensing image cloud detection method and system |
CN111753752A (en) * | 2020-06-28 | 2020-10-09 | 重庆邮电大学 | Robot closed loop detection method based on convolutional neural network multi-layer feature fusion |
Non-Patent Citations (1)
Title |
---|
CHEN DU et al.: "Selective Feature Connection Mechanism: Concatenating Multi-layer CNN Features with a Feature Selector", arXiv, pages 1-8 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109191491B (en) | Target tracking method and system of full convolution twin network based on multi-layer feature fusion | |
CN111354017B (en) | Target tracking method based on twin neural network and parallel attention module | |
CN110570458B (en) | Target tracking method based on internal cutting and multi-layer characteristic information fusion | |
CN109299274B (en) | Natural scene text detection method based on full convolution neural network | |
CN110210539B (en) | RGB-T image saliency target detection method based on multi-level depth feature fusion | |
Cong et al. | Global-and-local collaborative learning for co-salient object detection | |
CN107844795B (en) | Convolutional neural networks feature extracting method based on principal component analysis | |
CN111368673B (en) | Method for quickly extracting human body key points based on neural network | |
CN111325165B (en) | Urban remote sensing image scene classification method considering spatial relationship information | |
CN111612008A (en) | Image segmentation method based on convolution network | |
Huang et al. | Hand gesture recognition with skin detection and deep learning method | |
CN112232134B (en) | Human body posture estimation method based on hourglass network and attention mechanism | |
CN112163498B (en) | Method for establishing pedestrian re-identification model with foreground guiding and texture focusing functions and application of method | |
CN112164077B (en) | Cell instance segmentation method based on bottom-up path enhancement | |
CN113744311A (en) | Twin neural network moving target tracking method based on full-connection attention module | |
CN114972312A (en) | Improved insulator defect detection method based on YOLOv4-Tiny | |
CN116129289A (en) | Attention edge interaction optical remote sensing image saliency target detection method | |
CN113011253A (en) | Face expression recognition method, device, equipment and storage medium based on ResNeXt network | |
Jian et al. | Dual-Branch-UNet: A Dual-Branch Convolutional Neural Network for Medical Image Segmentation. | |
Wei et al. | A survey of facial expression recognition based on deep learning | |
CN114511895B (en) | Natural scene emotion recognition method based on attention mechanism multi-scale network | |
CN112257796A (en) | Image integration method of convolutional neural network based on selective characteristic connection | |
CN115661858A (en) | 2D human body posture estimation method based on coupling of local features and global characterization | |
CN115578722A (en) | License plate detection method based on cooperative learning mechanism between license plates | |
Song et al. | Lightweight multi-level feature difference fusion network for RGB-DT salient object detection |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||