CN110490813B - Feature map enhancement method, device, equipment and medium for convolutional neural network

Info

Publication number: CN110490813B (grant of application publication CN110490813A)
Application number: CN201910605387.6A
Authority: CN (China)
Prior art keywords: sub-feature map, feature map, channel, enhancement
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Inventors: 贾琳, 赵磊
Current and original assignee: Terminus Beijing Technology Co Ltd (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Application filed by Terminus Beijing Technology Co Ltd, with priority to CN201910605387.6A

Classifications

    • G06N 3/045: Combinations of networks (computing arrangements based on biological models; neural networks; architecture, e.g. interconnection topology)
    • G06T 5/00: Image enhancement or restoration
    • G06T 2207/10004: Still image; photographic image (image acquisition modality)
    • G06T 2207/20081: Training; learning (special algorithmic details)


Abstract

The application discloses a feature map enhancement method, apparatus, device, and medium for a convolutional neural network. A convolution operation is performed on an input original image to obtain a corresponding multilayer feature map; a feature map of a given layer is grouped along the channel dimension to obtain a plurality of sub-feature maps; for each sub-feature map, an embedded spatial-domain group-wise enhancement (SGE) module performs global average pooling and global maximum pooling in parallel to obtain two corresponding channel-dimension vectors; an attention enhancement factor for each channel of the sub-feature map is obtained from these two vectors; an enhanced sub-feature map is obtained from the attention enhancement factor and the corresponding sub-feature map; and the enhanced feature map of the layer is obtained from all the enhanced sub-feature maps. The semantic information describing the relative importance of sub-feature-map channels is thereby expressed better, improving the performance of the convolutional neural network on tasks such as image classification, segmentation, and detection.

Description

Feature map enhancement method, device, equipment and medium for convolutional neural network
Technical Field
The present application relates to the field of computer vision technologies, and in particular, to a method, an apparatus, a device, and a medium for enhancing a feature map of a convolutional neural network.
Background
With the rise of deep learning, CNNs (Convolutional Neural Networks), as one family of deep learning techniques, have seen ever wider development and application in computer vision, and researchers have proposed many convolution operations, such as transposed convolution, dilated convolution, grouped convolution, depthwise separable convolution, pointwise convolution, deformable convolution, and the like. Grouped convolution has clear advantages in reducing computation and parameter count and in preventing overfitting, and it echoes the grouping ideas used by hand-designed features in early computer vision, such as HOG (Histogram of Oriented Gradients), SIFT (Scale-Invariant Feature Transform), and LBP (Local Binary Patterns). Many classical networks, including AlexNet, ResNeXt, MobileNet, ShuffleNet, and CapsNet, adopt this grouping idea: the feature maps are grouped along the channel dimension, and convolution or regularization is then applied to each group of sub-feature maps, so that the semantic feature information of specific regions is expressed better, yielding notable performance gains across the computer vision field.
At present, most CNN architectures improve the feature expression capability of a model by introducing an attention mechanism, which has become very popular in computer vision and related fields. Various network structures introduce channel-dimension or spatial-dimension attention to enhance useful channel information and suppress useless channel information; they may also fuse multi-scale features or global context information to further strengthen specific regions of the feature map, giving the neural network a more interpretable mechanism. It follows that applying an attention mechanism to the grouped sub-feature maps can further enhance the ability to learn and express region-specific semantic feature information while suppressing noise and interference.
However, the semantic feature information extracted by the SGE (Spatial Group-wise Enhance) module embedded in a conventional CNN structure is not expressed sufficiently, which degrades the performance of the CNN on tasks such as image classification, segmentation, and detection.
Disclosure of Invention
The application aims to provide a feature map enhancement method, apparatus, device, and medium that improve the performance of a convolutional neural network on tasks such as image classification, segmentation, and detection.
In a first aspect, an embodiment of the present application provides a feature map enhancement method for a convolutional neural network, including:
performing convolution operation on an input original image to obtain a corresponding multilayer characteristic diagram;
grouping feature graphs of a certain layer according to channel dimensions to obtain a plurality of sub-feature graphs;
for each sub-feature map, performing global average pooling and global maximum pooling in parallel by using an embedded spatial-domain group-wise enhancement (SGE) module to obtain two corresponding channel-dimension vectors;
obtaining an attention enhancement factor of each channel in the corresponding sub-feature graph according to the corresponding two channel dimension vectors;
obtaining a corresponding enhanced sub-feature map according to the attention enhancement factor and the corresponding sub-feature map;
and obtaining an enhanced feature map corresponding to the feature map of the certain layer according to all the enhanced sub-feature maps.
In a possible implementation manner, in the foregoing method provided in this embodiment of the present application, the obtaining, according to the two corresponding channel dimension vectors, an attention enhancement factor of each channel in a corresponding sub-feature map includes:
reducing the dimension of the corresponding two channel dimension vectors by using 1 × 1 convolution;
and activating and adding the two channel dimension vectors after dimension reduction by using a ReLU activation function to obtain the attention enhancement factor of each channel in the corresponding sub-feature map.
In one possible implementation manner, in the foregoing method provided in this embodiment of the present application, the obtaining a corresponding enhanced sub-feature map according to the attention enhancement factor and the corresponding sub-feature map includes:
using 1 × 1 convolution to raise the attention enhancement factor to the number of channels corresponding to the sub-feature map;
normalizing the attention enhancement factor using a SoftMax function;
multiplying the normalized attention enhancement factor by the corresponding sub-feature map to obtain an enhanced first sub-feature map;
regularizing the first sub-feature graph to obtain a second sub-feature graph;
activating the second sub-feature graph by using a Sigmoid activation function to obtain an enhanced third sub-feature graph;
and taking the third sub-feature map as the enhanced sub-feature map.
In a possible implementation manner, in the foregoing method provided in this embodiment of the present application, the activating the second sub-feature map by using a Sigmoid activation function to obtain an enhanced third sub-feature map includes:
obtaining an importance coefficient of the second sub-feature map channel by using a Sigmoid activation function;
and scaling the second sub-feature map by using the importance coefficient to recalibrate the importance of the spatial domain feature on the channel of the second sub-feature map to obtain an enhanced third sub-feature map.
In a second aspect, an embodiment of the present application provides a feature map enhancement apparatus for a convolutional neural network, including:
the convolution module is used for carrying out convolution operation on the input original image to obtain a corresponding multilayer characteristic diagram;
the grouping module is used for grouping the characteristic diagrams of a certain layer according to the channel dimension to obtain a plurality of sub-characteristic diagrams;
the enhancement module is used for performing, for each sub-feature map, global average pooling and global maximum pooling in parallel by using the embedded spatial-domain group-wise enhancement (SGE) module to obtain two corresponding channel-dimension vectors; obtaining an attention enhancement factor of each channel in the corresponding sub-feature map according to the corresponding two channel-dimension vectors; and obtaining a corresponding enhanced sub-feature map according to the attention enhancement factor and the corresponding sub-feature map;
and the output module is used for obtaining an enhanced feature map corresponding to the feature map of the certain layer according to all the enhanced sub-feature maps.
In a possible implementation manner, in the apparatus provided in this embodiment of the present application, the enhancing module is specifically configured to:
reducing the dimension of the corresponding two channel dimension vectors by using 1 × 1 convolution;
and activating and adding the two channel dimension vectors after dimension reduction by using a ReLU activation function to obtain the attention enhancement factor of each channel in the corresponding sub-feature map.
In a possible implementation manner, in the apparatus provided in this embodiment of the present application, the enhancing module is specifically configured to:
using 1 × 1 convolution to raise the attention enhancement factor to the number of channels corresponding to the sub-feature map;
normalizing the attention enhancement factor using a SoftMax function;
multiplying the normalized attention enhancement factor by the corresponding sub-feature map to obtain an enhanced first sub-feature map;
regularizing the first sub-feature graph to obtain a second sub-feature graph;
activating the second sub-feature graph by using a Sigmoid activation function to obtain an enhanced third sub-feature graph;
and taking the third sub-feature map as the enhanced sub-feature map.
In a possible implementation manner, in the apparatus provided in this embodiment of the present application, the enhancing module is specifically configured to:
obtaining an importance coefficient of the second sub-feature map channel by using a Sigmoid activation function;
and scaling the second sub-feature map by using the importance coefficient to recalibrate the importance of the spatial domain feature on the channel of the second sub-feature map to obtain an enhanced third sub-feature map.
In a third aspect, an embodiment of the present application provides an electronic device, including: a memory and a processor;
the memory for storing a computer program;
wherein the processor executes the computer program in the memory to implement the method described in the first aspect and the various embodiments of the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium, in which a computer program is stored, and the computer program is used for implementing the method described in the first aspect and the implementation manners of the first aspect when executed by a processor.
Compared with the prior art, the feature map enhancement method, apparatus, device, and medium for a convolutional neural network provided by the application perform a convolution operation on an input original image to obtain a corresponding multilayer feature map, group a certain layer of the feature map along the channel dimension to obtain a plurality of sub-feature maps, and use an embedded spatial-domain group-wise enhancement (SGE) module to perform global average pooling and global maximum pooling in parallel for each sub-feature map, obtaining two corresponding channel-dimension vectors. From these two vectors an attention enhancement factor is obtained for each channel of the sub-feature map; an enhanced sub-feature map is obtained from the attention enhancement factor and the corresponding sub-feature map; and the enhanced feature map of that layer is obtained from all the enhanced sub-feature maps. Because the channel-dimension attention enhancement factor is extracted through both global average pooling and global maximum pooling, the semantic information describing the relative importance of sub-feature-map channels is expressed better; at the same time, the spatial-domain group-wise enhancement module structure is redesigned so that the computation of the channel-dimension attention enhancement factors is more effective, further improving the performance of the convolutional neural network on tasks such as image classification, segmentation, and detection.
Drawings
Fig. 1 is a schematic flowchart of a feature map enhancement method for a convolutional neural network according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram illustrating a flow of an algorithm for enhancing a sub-feature map according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of a feature map enhancing apparatus of a convolutional neural network according to a second embodiment of the present application;
fig. 4 is a schematic structural diagram of an electronic device according to a third embodiment of the present application.
Detailed Description
The following detailed description of embodiments of the present application is provided in conjunction with the accompanying drawings, but it should be understood that the scope of the present application is not limited to the specific embodiments.
Throughout the specification and claims, unless explicitly stated otherwise, the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated element or component but not the exclusion of any other element or component.
Fig. 1 is a schematic flowchart of a feature map enhancement method for a convolutional neural network according to an embodiment of the present disclosure. In practical applications, the execution subject of this embodiment may be a feature map enhancement apparatus of a convolutional neural network, which may be implemented as a virtual apparatus, such as software code; as a physical apparatus carrying the relevant execution code, such as a USB flash drive; or as a physical apparatus integrating the relevant execution code, such as a chip or a computer.
As shown in fig. 1, the method includes the following steps S101 to S106:
s101, performing convolution operation on the input original image to obtain a corresponding multilayer characteristic diagram.
In this embodiment, a pre-constructed convolutional neural network performs a multilayer convolution operation on an input original image, obtaining a corresponding multilayer feature map. It is understood that each layer of the feature map contains a certain number of channels.
And S102, grouping the feature graphs of a certain layer according to channel dimensions to obtain a plurality of sub-feature graphs.
In this embodiment, during convolutional neural network learning, grouped convolution can gradually capture specific semantic responses, so that the response at a position of interest becomes larger while other positions remain inactive or unresponsive; using grouped convolution also reduces computation and parameter count. The feature maps are therefore grouped first, to better enhance the learning of region-specific semantic feature information: specifically, the channels of the feature map are partitioned into groups, giving a number of sub-feature maps equal to the number of groups.
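As a minimal sketch, the channel-dimension grouping of step S102 can be illustrated in NumPy; the sizes C, H, W and the group count G below are illustrative assumptions, not values fixed by this application:

```python
import numpy as np

# Illustrative sizes: a feature map with C channels and spatial size H x W,
# split into G groups along the channel dimension.
C, H, W, G = 64, 8, 8, 4

feature_map = np.random.randn(C, H, W).astype(np.float32)

# Each of the G sub-feature maps keeps C // G channels.
sub_feature_maps = feature_map.reshape(G, C // G, H, W)

print(sub_feature_maps.shape)  # (4, 16, 8, 8)
```

Regrouping is a pure reshape, so the original feature map can always be recovered by concatenating the groups back along the channel dimension.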
Suppose the feature map output after the multilayer convolution is X ∈ R^(C×H×W), where C is the number of channels of the feature map and H and W are its height and width. The feature map is first divided into G groups along the channel dimension; the vectors at the spatial positions of a resulting sub-feature map are then denoted X = {x_1, ..., x_{H×W}}, where each element x_i ∈ R^(C/G).
S103, for each sub-feature map, performing global average pooling and global maximum pooling in parallel by using the embedded spatial-domain group-wise enhancement (SGE) module to obtain two corresponding channel-dimension vectors.
And S104, obtaining the attention enhancement factor of each channel in the corresponding sub-feature graph according to the corresponding two channel dimension vectors.
In this embodiment, step S104 may be implemented as: reducing the dimension of the two corresponding channel-dimension vectors by using 1 × 1 convolution; and activating and adding the two reduced channel-dimension vectors by using a ReLU activation function to obtain the attention enhancement factor of each channel in the corresponding sub-feature map.
In practical application, some CNN network structures improve the feature expression capability of the model by introducing an attention mechanism, which not only tells the network model which important features to pay attention to, but also can enhance the expression capability of a specific region. But the way of cascading the attention enhancement modules in channel dimension and spatial dimension also increases the amount of computation and parameters of the network model.
Fig. 2 is a schematic flowchart of an algorithm for enhancing a sub-feature map according to an embodiment of the present disclosure. As shown in fig. 2, the X column on the left represents the feature map grouped into 3 sub-feature maps; each sub-feature map is enhanced through the algorithm flow below the diagram to obtain the corresponding enhanced sub-feature map in the V column on the right.
The above algorithm flow is described in detail below. Considering computation and model size, the method uses only a channel-dimension attention mechanism: global average pooling aggregates the response at every spatial position of the sub-feature map, and global maximum pooling is combined with it so that, during back-propagation, gradient feedback flows only to the position with the maximum feature response. The spatial-domain group-wise enhancement module structure is redesigned at the same time: the sub-feature maps are processed in parallel using global statistical feature information, and the channel-dimension attention enhancement factors are extracted by global average pooling and global maximum pooling respectively, to express the semantic information of the relative importance among sub-feature-map channels.
The extraction process of the global statistical characteristic information is as follows:
a = F_gap(X) = (1/(H·W)) Σ_{i=1..H×W} x_i   (1)
b = F_gmp(X) = max_{i=1..H×W} x_i   (2)
where a and b denote the channel-dimension vectors produced by the parallel global average pooling and global maximum pooling, respectively, and max(·) takes the maximum response over all spatial positions of the channel-dimension vector, yielding the maximum activation of each channel. This compresses each grouped sub-feature map from (C/G) × H × W down to (C/G) × 1 × 1.
Meanwhile, each value in a channel-dimension vector uses global statistical feature information to represent the importance of the grouped sub-feature map across its channels.
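A small NumPy sketch of the parallel pooling described by equations (1) and (2); the sizes are illustrative assumptions, not values prescribed by this application:

```python
import numpy as np

# One sub-feature map with C/G channels and H x W spatial positions
# (illustrative sizes).
Cg, H, W = 16, 8, 8
x = np.random.randn(Cg, H, W).astype(np.float32)

# Eq. (1): global average pooling over all spatial positions -> a in R^(C/G)
a = x.mean(axis=(1, 2))
# Eq. (2): global max pooling over all spatial positions -> b in R^(C/G)
b = x.max(axis=(1, 2))

print(a.shape, b.shape)  # (16,) (16,)
```

Each sub-feature map of shape (C/G) × H × W is thereby compressed to two channel-dimension vectors of length C/G, as the text states.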
It can be understood that in this embodiment, the global maximum pooling information is effectively incorporated into the attention enhancement factor calculation, and the semantic feature information expression capability of the spatial domain grouping enhancement module is enhanced.
In order to model the interdependency among the sub-feature diagram channels, the dimension of a channel dimension vector is reduced by using 1 × 1 convolution with a ReLU activation function, the nonlinear interaction capability of information among the channels is increased, and the calculation amount is reduced at the same time, wherein the expression is as follows:
e = ReLU(W_1 a)   (3)
f = ReLU(W_2 b)   (4)
where W_1 and W_2 denote the weight matrices of the two 1 × 1 convolution dimension-reduction operations, W_1, W_2 ∈ R^((C/(G·r)) × (C/G)) for a reduction ratio r, so that the reduced channel dimension C/(G·r) is smaller than C/G.
Adding the two channel-dimension vectors then gives the attention enhancement factor corresponding to each channel of the sub-feature map:
g = e + f   (5)
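The dimension reduction, activation, and addition of equations (3) to (5) can be sketched as follows. Since a 1 × 1 convolution applied to a 1 × 1 spatial map is simply a matrix multiply, random matrices stand in for the learned weights W_1 and W_2; all sizes and the reduction ratio are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
Cg, r = 16, 4  # C/G channels and reduction ratio r (illustrative)
a = rng.standard_normal(Cg).astype(np.float32)  # from global average pooling
b = rng.standard_normal(Cg).astype(np.float32)  # from global max pooling

# Stand-ins for the learned 1x1 convolution weights, reducing C/G -> C/(G*r).
W1 = rng.standard_normal((Cg // r, Cg)).astype(np.float32)
W2 = rng.standard_normal((Cg // r, Cg)).astype(np.float32)

relu = lambda v: np.maximum(v, 0.0)
e = relu(W1 @ a)  # Eq. (3)
f = relu(W2 @ b)  # Eq. (4)
g = e + f         # Eq. (5): attention enhancement factor in reduced dimension

print(g.shape)  # (4,)
```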
and S105, obtaining a corresponding enhancer characteristic map according to the attention enhancement factor and the corresponding sub characteristic map.
In this embodiment, the step S105 may be implemented as: using 1 × 1 convolution to raise the attention enhancement factor to the number of channels corresponding to the sub-feature map; normalizing the attention enhancement factor using a SoftMax function; multiplying the normalized attention enhancement factor by the corresponding sub-feature map to obtain an enhanced first sub-feature map; regularizing the first sub-feature graph to obtain a second sub-feature graph; activating the second sub-feature graph by using a Sigmoid activation function to obtain an enhanced third sub-feature graph; and taking the third sub-feature map as an enhancer feature map.
To add more non-linearity and better fit the complex relationships between channels, a 1 × 1 convolution dimension-raising operation first restores the attention enhancement factor to the number of channels of the sub-feature map, so that it matches the sub-feature map during the subsequent weighting. The attention enhancement factor after dimension raising and normalization is:
u = SoftMax(W_3 g)   (6)
where W_3 ∈ R^((C/G) × (C/(G·r))) denotes the weight matrix of the 1 × 1 convolution dimension-raising operation.
For each spatial position in the sub-feature map, the normalized attention enhancement factor u described above weights x_i, giving the sub-feature map enhanced by the channel attention mechanism:
y_i = u · x_i   (7)
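A sketch of equations (6) and (7), again with a random matrix standing in for the learned weight W_3 and with illustrative sizes:

```python
import numpy as np

rng = np.random.default_rng(1)
Cg, r, H, W = 16, 4, 8, 8
g = rng.standard_normal(Cg // r).astype(np.float32)     # reduced attention factor
x = rng.standard_normal((Cg, H, W)).astype(np.float32)  # sub-feature map

# Eq. (6): 1x1 convolution restores the channel dimension, then SoftMax
# normalizes across the C/G channels (numerically stabilized).
W3 = rng.standard_normal((Cg, Cg // r)).astype(np.float32)
logits = W3 @ g
u = np.exp(logits - logits.max())
u /= u.sum()

# Eq. (7): every spatial position x_i is weighted channel-wise by u.
y = u[:, None, None] * x

print(y.shape)  # (16, 8, 8)
```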
further, the second sub-feature map is activated by using a Sigmoid activation function to obtain an enhanced third sub-feature map, which can be specifically realized as follows: obtaining an importance coefficient of the second sub-feature map channel by using a Sigmoid activation function; and scaling the second sub-feature map by using the importance coefficient to recalibrate the importance of the spatial domain feature on the channel of the second sub-feature map to obtain an enhanced third sub-feature map.
In this embodiment, in order to eliminate the interference of the amplitude difference between the samples to the result, the sub-feature map enhanced by the attention mechanism channel is processed by using regularization. For each spatial position of the sub-feature map, the expression of the regularization operation is as follows:
μ = (1/(H·W)) Σ_{i=1..H×W} y_i   (8)
σ² = (1/(H·W)) Σ_{i=1..H×W} (y_i - μ)²   (9)
z_i = (y_i - μ) / sqrt(σ² + ε)   (10)
where z_i denotes the sub-feature map after regularization along the channel dimension, μ denotes the mean of the sub-feature map, σ² denotes its variance, and ε is a small constant for numerical stability.
And then, obtaining an importance coefficient of the regularized sub-feature graph channel dimension by using a Sigmoid activation function, scaling the regularized sub-feature graph by using the importance coefficient, and re-calibrating the spatial domain feature importance of the sub-feature graph channel dimension. The expression describing this process is as follows:
v_i = x_i · Sigmoid(α z_i + β)   (11)
where α and β denote the parameters of the scaling and shifting operations applied to the regularized sub-feature map; each of them takes a single shared value per group.
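Equations (8) to (11) can be sketched as follows; the attention-weighted map y, the parameters α and β, and ε below are stand-in values for illustration only:

```python
import numpy as np

rng = np.random.default_rng(2)
Cg, H, W = 16, 8, 8
x = rng.standard_normal((Cg, H, W)).astype(np.float32)  # original sub-feature map
y = 0.5 * x        # stand-in for the attention-weighted map from Eq. (7)
eps = 1e-5         # small constant for numerical stability

# Eqs. (8)-(10): regularize y over its spatial positions, per channel.
mu = y.mean(axis=(1, 2), keepdims=True)
var = y.var(axis=(1, 2), keepdims=True)
z = (y - mu) / np.sqrt(var + eps)

# Eq. (11): the sigmoid importance coefficient rescales the sub-feature map;
# alpha and beta are learned per group (placeholder values here).
alpha, beta = 1.0, 0.0
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))
v = x * sigmoid(alpha * z + beta)

print(v.shape)  # (16, 8, 8)
```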
It can be understood that, in this embodiment, the spatial-domain group-wise enhancement module structure is redesigned: a 1 × 1 convolution first reduces the channel dimension before fusion and then restores it afterwards, which not only reduces computation but also strengthens the information interaction along the sub-feature-map channel dimension; and the SoftMax activation function turns the channel-dimension spatial feature importance into a probability that represents the attention enhancement factor.
And S106, obtaining an enhanced feature map corresponding to the feature map of the certain layer according to all the enhanced sub-feature maps.
According to the feature map enhancement method for a convolutional neural network described above, the channel-dimension attention enhancement factors are extracted through global average pooling and global maximum pooling, so the semantic information of the relative importance among sub-feature-map channels is expressed better; at the same time, the redesigned spatial-domain group-wise enhancement module structure makes the computation of the channel-dimension attention enhancement factors more effective, further improving the performance of the convolutional neural network on tasks such as image classification, segmentation, and detection.
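Under the same stand-in assumptions used above (random matrices in place of the learned 1 × 1 convolutions, illustrative hyperparameters G, r, α, β, ε), the whole pipeline S102 to S106 for a single layer can be sketched end to end:

```python
import numpy as np

def enhance_feature_map(x, G=4, r=4, alpha=1.0, beta=0.0, eps=1e-5, seed=0):
    """Sketch of the modified SGE enhancement for one feature map x of shape
    (C, H, W). Random matrices stand in for the learned 1x1 convolutions;
    G, r, alpha, beta, eps are illustrative, not values fixed by the patent."""
    rng = np.random.default_rng(seed)
    C, H, W = x.shape
    Cg = C // G
    W1 = rng.standard_normal((Cg // r, Cg)).astype(np.float32)
    W2 = rng.standard_normal((Cg // r, Cg)).astype(np.float32)
    W3 = rng.standard_normal((Cg, Cg // r)).astype(np.float32)
    relu = lambda t: np.maximum(t, 0.0)
    sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

    out = np.empty_like(x)
    for gi in range(G):                     # S102: group along channels
        sub = x[gi * Cg:(gi + 1) * Cg]
        a = sub.mean(axis=(1, 2))           # S103: global average pooling
        b = sub.max(axis=(1, 2))            #       global max pooling
        g = relu(W1 @ a) + relu(W2 @ b)     # S104: Eqs. (3)-(5)
        logits = W3 @ g                     # S105: Eq. (6), dimension raising
        u = np.exp(logits - logits.max())
        u /= u.sum()
        y = u[:, None, None] * sub          # Eq. (7)
        mu = y.mean(axis=(1, 2), keepdims=True)
        var = y.var(axis=(1, 2), keepdims=True)
        z = (y - mu) / np.sqrt(var + eps)   # Eqs. (8)-(10)
        out[gi * Cg:(gi + 1) * Cg] = sub * sigmoid(alpha * z + beta)  # Eq. (11)
    return out                              # S106: concatenated enhanced map

x = np.random.randn(64, 8, 8).astype(np.float32)
print(enhance_feature_map(x).shape)  # (64, 8, 8)
```

In a trained network the three weight matrices and the per-group α, β would be learned parameters; here they only make the data flow of the module concrete.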
The following are embodiments of the apparatus of the present application that may be used to perform embodiments of the method of the present application. For details which are not disclosed in the embodiments of the apparatus of the present application, reference is made to the embodiments of the method of the present application.
Fig. 3 is a schematic structural diagram of a feature map enhancing apparatus of a convolutional neural network according to a second embodiment of the present application, and as shown in fig. 3, the apparatus may include:
a convolution module 310, configured to perform convolution operation on an input original image to obtain a corresponding multi-layer feature map;
the grouping module 320 is configured to group feature maps of a certain layer according to channel dimensions to obtain a plurality of sub-feature maps;
the enhancement module 330 is configured to perform, for each sub-feature map, global average pooling and global maximum pooling in parallel by using the embedded spatial-domain group-wise enhancement (SGE) module to obtain two corresponding channel-dimension vectors; obtain an attention enhancement factor of each channel in the corresponding sub-feature map according to the two corresponding channel-dimension vectors; and obtain a corresponding enhanced sub-feature map according to the attention enhancement factor and the corresponding sub-feature map;
and the output module 340 is configured to obtain an enhanced feature map corresponding to the certain layer of feature map according to all the enhanced sub-feature maps.
According to the feature map enhancement apparatus for a convolutional neural network described above, the channel-dimension attention enhancement factors are extracted through global average pooling and global maximum pooling, so the semantic information of the relative importance among sub-feature-map channels is expressed better; at the same time, the redesigned spatial-domain group-wise enhancement module structure makes the computation of the channel-dimension attention enhancement factors more effective, further improving the performance of the convolutional neural network on tasks such as image classification, segmentation, and detection.
In some embodiments, the enhancement module 330 is specifically configured to:
reducing the dimension of the corresponding two channel dimension vectors by using 1 × 1 convolution;
and activating and adding the two channel dimension vectors after dimension reduction by using a ReLU activation function to obtain the attention enhancement factor of each channel in the corresponding sub-feature map.
In some embodiments, the enhancement module 330 is specifically configured to:
raise the attention enhancement factor back to the channel count of the corresponding sub-feature map by using a 1 × 1 convolution;
normalize the attention enhancement factor by using a SoftMax function;
multiply the normalized attention enhancement factor by the corresponding sub-feature map to obtain an enhanced first sub-feature map;
regularize the first sub-feature map to obtain a second sub-feature map;
activate the second sub-feature map by using a Sigmoid activation function to obtain an enhanced third sub-feature map;
and take the third sub-feature map as the enhanced sub-feature map.
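The chain of steps above can be sketched end to end as follows. Again this is only an illustrative approximation: the expansion weights `w_expand` are random placeholders, and the "regularize" step is modelled here as per-channel standardization, which is one plausible reading of that step rather than the patent's definitive formulation.

```python
import numpy as np

def enhance_sub_map(sub_map, factor, w_expand, eps=1e-5):
    """Sketch of producing the enhanced sub-feature map: raise the attention
    factor back to the sub-map's channel count (a 1x1 conv on a vector is a
    matrix multiply), normalize it with SoftMax, multiply it onto the
    sub-feature map channel-wise, standardize the result per channel, and
    apply a Sigmoid to obtain the enhanced third sub-feature map."""
    c, H, W = sub_map.shape
    expanded = w_expand @ factor                 # raise factor to c channels
    exp = np.exp(expanded - expanded.max())
    weights = exp / exp.sum()                    # SoftMax over channels
    first = sub_map * weights[:, None, None]     # enhanced first sub-feature map
    mu = first.mean(axis=(1, 2), keepdims=True)
    sigma = first.std(axis=(1, 2), keepdims=True)
    second = (first - mu) / (sigma + eps)        # second sub-feature map
    third = 1.0 / (1.0 + np.exp(-second))        # Sigmoid -> third sub-feature map
    return third

# Hypothetical example with a 4-channel sub-feature map
c = 4
sub = np.random.rand(c, 8, 8).astype(np.float32)
fac = np.random.rand(c // 2)
w_up = np.random.randn(c, c // 2)
third = enhance_sub_map(sub, fac, w_up)
```

The Sigmoid at the end bounds every activation of the third sub-feature map to the open interval (0, 1).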
In some embodiments, the enhancement module 330 is specifically configured to:
obtain importance coefficients for the channels of the second sub-feature map by using a Sigmoid activation function;
and scale the second sub-feature map with the importance coefficients to recalibrate the importance of the spatial domain features on the channels of the second sub-feature map, obtaining an enhanced third sub-feature map.
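This recalibration variant can be sketched as below; it is a hedged illustration of the idea, treating the Sigmoid output as per-position importance coefficients that rescale the second sub-feature map itself:

```python
import numpy as np

def recalibrate(second_map):
    """Sketch of the recalibration variant: a Sigmoid turns the second
    sub-feature map into importance coefficients in (0, 1), which then
    scale the second sub-feature map, re-weighting the spatial domain
    features on each channel."""
    importance = 1.0 / (1.0 + np.exp(-second_map))  # importance coefficients
    return second_map * importance                   # enhanced third sub-feature map

# Hypothetical normalized second sub-feature map
inp = np.random.randn(4, 8, 8)
out = recalibrate(inp)
```

Because each coefficient lies strictly between 0 and 1, the scaling can only attenuate a feature's magnitude, never amplify it, in this sketch.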
Fig. 4 is a schematic structural diagram of an electronic device according to a third embodiment of the present application, and as shown in fig. 4, the electronic device includes: a memory 401 and a processor 402;
a memory 401 for storing a computer program;
wherein the processor 402 executes the computer program in the memory 401 to implement the methods provided by the method embodiments as described above.
In an embodiment, the feature map enhancement apparatus of a convolutional neural network provided in the present application is described by taking an electronic device as an example. The processor may be a Central Processing Unit (CPU) or another form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device to perform desired functions.
The memory may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM) and/or cache memory, or the like. The non-volatile memory may include, for example, Read-Only Memory (ROM), a hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor to implement the methods of the various embodiments of the present application described above and/or other desired functions. Various contents such as an input signal, a signal component, a noise component, and the like may also be stored in the computer-readable storage medium.
An embodiment of the present application provides a computer-readable storage medium, in which a computer program is stored, and the computer program is used for implementing the methods provided by the method embodiments described above when being executed by a processor.
In practice, the computer program in this embodiment may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages, for performing the operations of the embodiments of the present application. The program code may execute entirely on the user's computing device, partly on the user's device as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
In practice, the computer-readable storage medium may take any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing descriptions of specific exemplary embodiments of the present application have been presented for purposes of illustration and description. It is not intended to limit the application to the precise form disclosed, and obviously many modifications and variations are possible in light of the above teaching. The exemplary embodiments were chosen and described in order to explain certain principles of the present application and its practical application to enable one skilled in the art to make and use various exemplary embodiments of the present application and various alternatives and modifications thereof. It is intended that the scope of the application be defined by the claims and their equivalents.

Claims (6)

1. A feature map enhancement method for a convolutional neural network, comprising:
performing a convolution operation on an input original image to obtain a corresponding multilayer feature map;
grouping a feature map of a certain layer according to the channel dimension to obtain a plurality of sub-feature maps;
for each sub-feature map, performing global average pooling and global max pooling in parallel by using an embedded spatial group-wise enhancement (SGE) module to obtain two corresponding channel dimension vectors;
obtaining an attention enhancement factor of each channel in the corresponding sub-feature map according to the two corresponding channel dimension vectors;
obtaining a corresponding enhanced sub-feature map according to the attention enhancement factor and the corresponding sub-feature map;
obtaining an enhanced feature map corresponding to the feature map of the certain layer according to all the enhanced sub-feature maps;
wherein obtaining the attention enhancement factor of each channel in the corresponding sub-feature map according to the two corresponding channel dimension vectors comprises:
reducing the dimensionality of the two corresponding channel dimension vectors by using a 1 × 1 convolution;
activating the two dimension-reduced channel dimension vectors with a ReLU activation function and adding them to obtain the attention enhancement factor of each channel in the corresponding sub-feature map;
wherein obtaining a corresponding enhanced sub-feature map according to the attention enhancement factor and the corresponding sub-feature map comprises:
raising the attention enhancement factor back to the channel count of the corresponding sub-feature map by using a 1 × 1 convolution;
normalizing the attention enhancement factor by using a SoftMax function;
multiplying the normalized attention enhancement factor by the corresponding sub-feature map to obtain an enhanced first sub-feature map;
regularizing the first sub-feature map to obtain a second sub-feature map;
activating the second sub-feature map by using a Sigmoid activation function to obtain an enhanced third sub-feature map;
and taking the third sub-feature map as the enhanced sub-feature map.
2. The method according to claim 1, wherein said activating the second sub-feature map using a Sigmoid activation function, resulting in an enhanced third sub-feature map, comprises:
obtaining importance coefficients for the channels of the second sub-feature map by using a Sigmoid activation function;
and scaling the second sub-feature map with the importance coefficients to recalibrate the importance of the spatial domain features on the channels of the second sub-feature map, obtaining an enhanced third sub-feature map.
3. A feature map enhancement apparatus for a convolutional neural network, comprising:
the convolution module is used for performing a convolution operation on an input original image to obtain a corresponding multilayer feature map;
the grouping module is used for grouping a feature map of a certain layer according to the channel dimension to obtain a plurality of sub-feature maps;
the enhancement module is used for performing, for each sub-feature map, global average pooling and global max pooling in parallel by using an embedded spatial group-wise enhancement (SGE) module to obtain two corresponding channel dimension vectors; obtaining an attention enhancement factor of each channel in the corresponding sub-feature map according to the two corresponding channel dimension vectors; and obtaining a corresponding enhanced sub-feature map according to the attention enhancement factor and the corresponding sub-feature map;
the output module is used for obtaining an enhanced feature map corresponding to the feature map of the certain layer according to all the enhanced sub-feature maps;
wherein the enhancement module is specifically configured to:
reduce the dimensionality of the two corresponding channel dimension vectors by using a 1 × 1 convolution;
activate the two dimension-reduced channel dimension vectors with a ReLU activation function and add them to obtain the attention enhancement factor of each channel in the corresponding sub-feature map;
wherein the enhancement module is further specifically configured to:
raise the attention enhancement factor back to the channel count of the corresponding sub-feature map by using a 1 × 1 convolution;
normalize the attention enhancement factor by using a SoftMax function;
multiply the normalized attention enhancement factor by the corresponding sub-feature map to obtain an enhanced first sub-feature map;
regularize the first sub-feature map to obtain a second sub-feature map;
activate the second sub-feature map by using a Sigmoid activation function to obtain an enhanced third sub-feature map;
and take the third sub-feature map as the enhanced sub-feature map.
4. The apparatus according to claim 3, wherein the enhancement module is specifically configured to:
obtain importance coefficients for the channels of the second sub-feature map by using a Sigmoid activation function;
and scale the second sub-feature map with the importance coefficients to recalibrate the importance of the spatial domain features on the channels of the second sub-feature map, obtaining an enhanced third sub-feature map.
5. An electronic device, comprising: a memory and a processor;
the memory for storing a computer program;
wherein the processor executes the computer program in the memory to implement the method of any one of claims 1-2.
6. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, is adapted to carry out the method of any one of claims 1-2.
CN201910605387.6A 2019-07-05 2019-07-05 Feature map enhancement method, device, equipment and medium for convolutional neural network Active CN110490813B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910605387.6A CN110490813B (en) 2019-07-05 2019-07-05 Feature map enhancement method, device, equipment and medium for convolutional neural network


Publications (2)

Publication Number Publication Date
CN110490813A CN110490813A (en) 2019-11-22
CN110490813B true CN110490813B (en) 2021-12-17

Family

ID=68546677

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910605387.6A Active CN110490813B (en) 2019-07-05 2019-07-05 Feature map enhancement method, device, equipment and medium for convolutional neural network

Country Status (1)

Country Link
CN (1) CN110490813B (en)

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111027670B (en) * 2019-11-04 2022-07-22 重庆特斯联智慧科技股份有限公司 Feature map processing method and device, electronic equipment and storage medium
CN111161195B (en) * 2020-01-02 2023-10-13 重庆特斯联智慧科技股份有限公司 Feature map processing method and device, storage medium and terminal
CN113222827A (en) * 2020-01-21 2021-08-06 北京三星通信技术研究有限公司 Image processing method, image processing device, electronic equipment and computer readable storage medium
CN111274999B (en) * 2020-02-17 2024-04-19 北京迈格威科技有限公司 Data processing method, image processing device and electronic equipment
CN113361529B (en) * 2020-03-03 2024-05-10 北京四维图新科技股份有限公司 Image semantic segmentation method and device, electronic equipment and storage medium
CN111325751B (en) * 2020-03-18 2022-05-27 重庆理工大学 CT image segmentation system based on attention convolution neural network
CN111444957B (en) * 2020-03-25 2023-11-07 腾讯科技(深圳)有限公司 Image data processing method, device, computer equipment and storage medium
CN111539325A (en) * 2020-04-23 2020-08-14 四川旅游学院 Forest fire detection method based on deep learning
CN111967478B (en) * 2020-07-08 2023-09-05 特斯联科技集团有限公司 Feature map reconstruction method, system, storage medium and terminal based on weight overturn
CN112001248B (en) * 2020-07-20 2024-03-01 北京百度网讯科技有限公司 Active interaction method, device, electronic equipment and readable storage medium
CN112149694B (en) * 2020-08-28 2024-04-05 特斯联科技集团有限公司 Image processing method, system, storage medium and terminal based on convolutional neural network pooling module
CN112183645B (en) * 2020-09-30 2022-09-09 深圳龙岗智能视听研究院 Image aesthetic quality evaluation method based on context-aware attention mechanism
CN112465828B (en) * 2020-12-15 2024-05-31 益升益恒(北京)医学技术股份公司 Image semantic segmentation method and device, electronic equipment and storage medium
CN112668656B (en) * 2020-12-30 2023-10-13 深圳市优必选科技股份有限公司 Image classification method, device, computer equipment and storage medium
CN112862667A (en) * 2021-01-29 2021-05-28 成都商汤科技有限公司 Pooling method, chip, equipment and storage medium
CN112767406B (en) * 2021-02-02 2023-12-12 苏州大学 Deep convolution neural network training method for corneal ulcer segmentation and segmentation method
CN113011465B (en) * 2021-02-25 2021-09-03 浙江净禾智慧科技有限公司 Household garbage throwing intelligent supervision method based on grouping multi-stage fusion
CN113052173B (en) * 2021-03-25 2024-07-19 岳阳市金霖昇行科技有限公司 Sample data characteristic enhancement method and device
CN113486898B (en) * 2021-07-08 2024-05-31 西安电子科技大学 Radar signal RD image interference identification method and system based on improvement ShuffleNet
CN113920099B (en) * 2021-10-15 2022-08-30 深圳大学 Polyp segmentation method based on non-local information extraction and related components

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9953437B1 (en) * 2017-10-18 2018-04-24 StradVision, Inc. Method and device for constructing a table including information on a pooling type and testing method and testing device using the same
CN109299268A (en) * 2018-10-24 2019-02-01 河南理工大学 A kind of text emotion analysis method based on dual channel model
CN109376804B (en) * 2018-12-19 2020-10-30 中国地质大学(武汉) Hyperspectral remote sensing image classification method based on attention mechanism and convolutional neural network

Also Published As

Publication number Publication date
CN110490813A (en) 2019-11-22


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant