CN110781923B - Feature extraction method and device

Info

Publication number
CN110781923B
Authority
CN
China
Prior art keywords
feature
subsets
processing
groups
network
Prior art date
2019-09-27
Legal status
Active
Application number
CN201910927813.8A
Other languages
Chinese (zh)
Other versions
CN110781923A (en)
Inventor
贾琳
赵磊
Current Assignee
Chongqing Terminus Technology Co Ltd
Original Assignee
Chongqing Terminus Technology Co Ltd
Priority date
2019-09-27
Filing date
2019-09-27
Publication date
2023-02-07
Application filed by Chongqing Terminus Technology Co Ltd filed Critical Chongqing Terminus Technology Co Ltd
Priority to CN201910927813.8A
Publication of CN110781923A
Application granted
Publication of CN110781923B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods

Abstract

The invention discloses a feature extraction method comprising the following steps: an original feature map is input into a trained feature extraction model; the model's grouping network groups the original feature map by channel into G groups of feature subsets and outputs them to a multi-scale enhancement network in the model; the multi-scale enhancement network performs multi-scale enhancement processing on each of the G groups to obtain G groups of processed feature subsets and outputs them to a post-processing network in the model; the post-processing network concatenates the G groups of processed feature subsets by channel and adds the concatenated feature map to the original feature map. The multi-scale enhancement processing comprises pooling, convolution, upsampling and accumulation. Pooling reduces the resolution of the features, and with it the computation and parameter counts; upsampling after the convolution restores the resolution, and accumulation with the pre-pooling features restores feature detail, so computation and parameters are reduced while feature effectiveness is preserved.

Description

Feature extraction method and device
Technical Field
The invention relates to the technical field of computer vision, in particular to a feature extraction method and device.
Background
In the field of computer vision, feature information extraction is a necessary step for realizing various types of network models.
In the prior art, feature information is usually extracted with the deep residual network Res2Net, which enhances multi-scale feature extraction and keeps the convolutional neural network from being affected by vanishing gradients. However, in Res2Net, after the convolved input features are grouped, each group of features must still be processed by its own convolution group, so the computation and parameter counts are large.
Disclosure of Invention
The present invention provides a feature extraction method and apparatus for overcoming the above-mentioned deficiencies in the prior art, and the object is achieved by the following technical solutions.
A first aspect of the present invention provides a feature extraction method, including:
inputting an original feature map into a trained feature extraction model, the feature extraction model grouping the original feature map by channel through a grouping network to obtain G groups of feature subsets and outputting them to a multi-scale enhancement network in the feature extraction model, the multi-scale enhancement network performing multi-scale enhancement processing on the G groups of feature subsets respectively to obtain G groups of processed feature subsets and outputting them to a post-processing network in the feature extraction model, and the post-processing network concatenating the G groups of processed feature subsets by channel and adding the concatenated feature map to the original feature map to obtain an output feature map;
acquiring the output feature map output by the feature extraction model;
wherein the multi-scale enhancement processing comprises pooling processing, convolution processing, upsampling processing and accumulation processing.
A second aspect of the present invention provides a feature extraction apparatus, comprising:
a feature extraction module, configured to input an original feature map into a trained feature extraction model, the feature extraction model grouping the original feature map by channel through a grouping network to obtain G groups of feature subsets and outputting them to a multi-scale enhancement network in the feature extraction model, the multi-scale enhancement network performing multi-scale enhancement processing on the G groups of feature subsets respectively to obtain G groups of processed feature subsets and outputting them to a post-processing network in the feature extraction model, and the post-processing network concatenating the G groups of processed feature subsets by channel and adding the concatenated feature map to the original feature map to obtain an output feature map;
an acquisition module, configured to acquire the output feature map output by the feature extraction model;
wherein the multi-scale enhancement processing comprises pooling processing, convolution processing, upsampling processing and accumulation processing.
In the embodiment of the application, after the original feature map is input into the feature extraction model, it is divided into G groups of feature subsets by the grouping network; the multi-scale enhancement network performs multi-scale enhancement processing on each group of feature subsets; the post-processing network then concatenates the G groups of processed feature subsets and adds the concatenated feature map to the original feature map to obtain the output feature map. The multi-scale enhancement processing applied to each group of feature subsets comprises pooling, convolution, upsampling and accumulation.
As can be seen from the above, the multi-scale enhancement network replaces the 3 × 3 convolution groups used in the existing Res2Net network. Because the multi-scale enhancement network pools each group of feature subsets to reduce their resolution before convolving them, the computation and parameter counts drop; upsampling after the convolution restores the subsets to their pre-pooling resolution, and accumulating them with the pre-pooling subsets restores the feature detail lost to pooling. Computation and parameters are therefore reduced while the effectiveness of the extracted feature information is preserved.
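As a rough illustrative calculation (not a figure from the patent): a 3 × 3 convolution with C input and C output channels over an H × W map costs about 9·C²·H·W multiply-accumulates, whereas the same convolution applied after 2 × 2 pooling costs 9·C²·(H/2)·(W/2), a four-fold reduction in computation; the upsample-and-accumulate steps then compensate for the spatial detail the pooling discards.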
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention and not to limit the invention. In the drawings:
Fig. 1 is a schematic diagram of a Res2Net network structure according to an exemplary embodiment of the present invention;
Fig. 2 is a schematic diagram of a feature extraction model according to an exemplary embodiment of the present invention;
Fig. 3A is a flowchart of an embodiment of a feature extraction method according to an exemplary embodiment of the present invention;
Fig. 3B is a schematic diagram of the grouping network structure according to the embodiment shown in Fig. 3A;
Fig. 3C is a schematic diagram of the post-processing network according to the embodiment shown in Fig. 3A;
Fig. 4 is a hardware block diagram of an electronic device according to an exemplary embodiment of the present application;
Fig. 5 is a block diagram of an embodiment of a feature extraction apparatus according to an exemplary embodiment of the present invention.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present invention; rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this disclosure and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information, the information should not be limited by these terms; the terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information and, similarly, second information as first information, without departing from the scope of the present invention. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".
With the development of deep learning, convolutional neural networks (CNNs) are used more and more widely in the field of computer vision. In particular, the deep residual network ResNet keeps CNN design from being hampered by vanishing gradients, allows very deep CNNs to be trained, and extracts effective convolutional feature information to the greatest extent. Many backbone networks in computer vision therefore use ResNet to extract image features for subsequent classification, detection, segmentation and other tasks.
To further improve the effectiveness of the extracted feature information, Res2Net was proposed on the basis of the ResNet network. As shown in Fig. 1, in an exemplary Res2Net network structure the input feature map passes through a 1 × 1 convolution kernel and is grouped into four feature subsets X1, X2, X3 and X4. The first subset X1 is processed by a 3 × 3 convolution group to obtain the feature subset Y1; from the second subset onward, each subset is first combined with the previous group's output and then fed into its own 3 × 3 convolution group. Although Res2Net improves the effectiveness of the convolutional feature information by exploiting multi-scale information in this way, every group of feature subsets requires its own 3 × 3 convolution group, so the computational burden and parameter count are large. A contrasting sketch follows.
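As a point of reference, the following is a minimal sketch of the grouped processing just described (illustrative only, not code from the patent; all names are assumptions). It follows this paragraph's wording, in which each group after the first is concatenated with the previous output before its own 3 × 3 convolution group; the published Res2Net instead passes the first subset through unchanged and adds rather than concatenates.

```python
import torch
import torch.nn as nn

class Res2NetStyleBlock(nn.Module):
    """Baseline sketch: every group of feature subsets gets its own 3x3
    convolution group, which is the per-group cost the patent aims to cut."""
    def __init__(self, channels: int, groups: int = 4):
        super().__init__()
        assert channels % groups == 0
        width = channels // groups
        self.groups = groups
        self.first_conv = nn.Conv2d(width, width, kernel_size=3, padding=1)
        # From the second group onward the input is the current subset
        # concatenated with the previous output, hence 2 * width input channels.
        self.convs = nn.ModuleList(
            nn.Conv2d(2 * width, width, kernel_size=3, padding=1)
            for _ in range(groups - 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        xs = torch.chunk(x, self.groups, dim=1)    # X1..XG, split by channel
        ys = [self.first_conv(xs[0])]               # Y1
        for i in range(1, self.groups):
            z = torch.cat([xs[i], ys[-1]], dim=1)   # combine with previous output
            ys.append(self.convs[i - 1](z))         # Yi from its own 3x3 conv group
        return torch.cat(ys, dim=1)
```

The point to notice is the full-resolution 3 × 3 convolution run for every group, which is exactly the cost the patent's multi-scale enhancement network is designed to reduce.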
To solve the above technical problem, the invention provides a feature extraction model. As shown in Fig. 2, the model comprises a grouping network, a multi-scale enhancement network and a post-processing network. After the grouping network divides the original feature map into G groups of feature subsets, the multi-scale enhancement network performs multi-scale enhancement processing on each group; the post-processing network then concatenates the G groups of processed feature subsets and adds the concatenated feature map to the original feature map to obtain the output feature map.
The multi-scale enhancement processing applied to each group of feature subsets comprises pooling, convolution, upsampling and accumulation.
As can be seen from the above, the multi-scale enhancement network replaces the 3 × 3 convolution groups used in the existing Res2Net network. Because the multi-scale enhancement network pools each group of feature subsets to reduce their resolution before convolving them, the computation and parameter counts drop; upsampling after the convolution restores the subsets to their pre-pooling resolution, and accumulating them with the pre-pooling subsets restores the feature detail lost to pooling. Computation and parameters are therefore reduced while the effectiveness of the extracted feature information is preserved.
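The following is a minimal sketch of one reading of this enhancement operation (illustrative, not from the patent; the class name, the pooling factor of 2, and the nearest-neighbour upsampling mode are assumptions the patent does not fix):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleEnhance(nn.Module):
    """Pool -> 3x3 conv -> upsample -> accumulate with the pre-pooling input."""
    def __init__(self, channels: int, pool_factor: int = 2):
        super().__init__()
        self.pool = nn.MaxPool2d(pool_factor)  # pooling: reduce the resolution
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.conv(self.pool(x))            # convolution at reduced resolution
        y = F.interpolate(y, size=x.shape[-2:], mode="nearest")  # upsampling
        return x + y                           # accumulation restores lost detail
```

With a pooling factor of 2 the 3 × 3 convolution runs over a quarter of the original area, which is where the computation savings described above come from; the final addition reinjects the full-resolution detail.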
The feature extraction method implemented by the feature extraction model described above is explained in detail below with specific embodiments.
Fig. 3A is a flowchart of an embodiment of a feature extraction method according to an exemplary embodiment of the present invention. The method performs feature extraction with the feature extraction model shown in Fig. 2 and can be applied to electronic devices (such as a PC, a mobile phone terminal, and the like). As shown in Fig. 3A, the feature extraction method includes the following steps:
step 301: inputting the original characteristic diagram into a trained characteristic extraction model, grouping the original characteristic diagram according to channels by the characteristic extraction model through a packet network to obtain G groups of characteristic subsets, outputting the G groups of characteristic subsets to a multi-scale enhancement network in the characteristic extraction model, respectively carrying out multi-scale enhancement processing on the G groups of characteristic subsets by the multi-scale enhancement network to obtain G groups of processed characteristic subsets, outputting the G groups of processed characteristic subsets to a post-processing network in the characteristic extraction model, splicing the G groups of processed characteristic subsets according to the channels by the post-processing network, and adding the spliced characteristic diagram and the original characteristic diagram to obtain an output characteristic diagram.
In an embodiment, for the processing of the grouping network, whose structure is shown in Fig. 3B, the original feature map is first passed through a first convolution layer in the grouping network for dimension reduction, and the dimension-reduced feature map is output to a grouping layer in the grouping network; the grouping layer then groups the dimension-reduced feature map by channel to obtain the G groups of feature subsets.
Because the original feature map is composed of a plurality of per-channel feature maps, after grouping every group of feature subsets has the same spatial size, but the number of channels in each group is 1/G of the number of channels of the dimension-reduced feature map.
Illustratively, the first convolution layer that performs the dimension reduction may use a 1 × 1 convolution kernel to reduce the number of channels of the input feature map.
In the present invention, the grouping policy of the grouping layer may be set according to practical experience; for example, each channel of the dimension-reduced feature map may be treated as one feature subset.
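A minimal sketch of this grouping step (illustrative, not from the patent; the channel counts and names are assumptions):

```python
import torch
import torch.nn as nn

class GroupingNetwork(nn.Module):
    """1x1 convolution for dimension reduction, then a channel-wise split."""
    def __init__(self, in_channels: int, reduced_channels: int, groups: int):
        super().__init__()
        assert reduced_channels % groups == 0
        self.reduce = nn.Conv2d(in_channels, reduced_channels, kernel_size=1)
        self.groups = groups

    def forward(self, x: torch.Tensor):
        z = self.reduce(x)                      # dimension-reduced feature map
        # Each subset keeps the full spatial size but only 1/G of the channels.
        return torch.chunk(z, self.groups, dim=1)
```

For instance, GroupingNetwork(256, 128, groups=4) would turn a 256-channel input map into four subsets of 32 channels each, all of the same spatial size.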
In an embodiment, for the processing of the multi-scale enhancement network, the first group of processed feature subsets is obtained by performing multi-scale enhancement processing on the first group of feature subsets; then, starting from the second group, each group of feature subsets is concatenated by channel with the previous group's processed feature subsets, and multi-scale enhancement processing is performed on the concatenated feature subsets to obtain that group's processed feature subsets.
The multi-scale enhancement processing applied to each group of feature subsets includes pooling, convolution, upsampling and accumulation.
Illustratively, the pooling may be implemented with a max-pooling layer, the convolution with a 3 × 3 convolution group, and the upsampling with an upsampling layer. The accumulation adds pixels along the channel dimension: for each channel, the concatenated feature subset is added to the corresponding pixels of the upsampled feature subset, aggregating information and further enhancing the effectiveness of the extracted feature information.
Denoting the multi-scale enhancement processing by K(·): the first group of processed feature subsets is Y1 = K(X1), and the i-th group is Yi = K(Xi + Y(i-1)) for 1 < i ≤ G, where the "+" in the formula denotes channel-wise concatenation.
Accordingly, in the multi-scale enhancement network each group of feature subsets has its own multi-scale enhancement processing module, and, from the second group onward, each group additionally has its own concatenation layer.
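In code, the sequential pass of the multi-scale enhancement network could look like the following hedged sketch (illustrative; with channel-preserving enhancement modules such as the MultiScaleEnhance sketch above, the i-th concatenated input grows to i times the per-group width, so each module must be sized for its own input, a piece of bookkeeping the patent leaves open):

```python
import torch

def multi_scale_network(subsets, enhance_modules):
    """Y1 = K(X1); Yi = K(concat(Xi, Y_{i-1})) for 2 <= i <= G."""
    ys = [enhance_modules[0](subsets[0])]
    for i in range(1, len(subsets)):
        z = torch.cat([subsets[i], ys[-1]], dim=1)  # concatenation layer
        ys.append(enhance_modules[i](z))            # multi-scale enhancement K
    return ys
```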
In an embodiment, for the processing of the post-processing network, whose structure is shown in Fig. 3C, a concatenation layer in the post-processing network concatenates the G groups of processed feature subsets by channel to obtain a concatenated feature map and outputs it to a second convolution layer in the post-processing network; the second convolution layer performs dimension-raising processing on the concatenated feature map to obtain the dimension-raised concatenated feature map and outputs it to an SE (Squeeze-and-Excitation) layer in the post-processing network; the SE layer enhances the dimension-raised concatenated feature map to obtain an enhanced feature map and outputs it to an accumulation layer of the post-processing network; and the accumulation layer adds the original feature map and the enhanced feature map to obtain the output feature map.
The dimension-raised concatenated feature map has the same number of channels and the same size as the original feature map.
For example, the second convolution layer may also use a 1 × 1 convolution kernel, restoring the number of channels of the concatenated feature map to that of the input so that the dimension-raised concatenated feature map matches the original feature map in channel count. The accumulation layer likewise adds pixels along the channel dimension: for each channel, the channel's feature map is added to the corresponding pixels of the channel's enhanced feature map.
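A minimal sketch of this post-processing path (illustrative, not from the patent; the SE layer follows the standard Squeeze-and-Excitation design, and the reduction ratio of 16 is an assumption):

```python
import torch
import torch.nn as nn

class SELayer(nn.Module):
    """Squeeze-and-Excitation: a learned per-channel gate."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = x.mean(dim=(2, 3))                      # squeeze: global average pool
        w = self.fc(w).unsqueeze(-1).unsqueeze(-1)  # excitation weights
        return x * w                                # enhance channel responses

class PostProcessing(nn.Module):
    """Concatenate -> 1x1 conv (raise dimension) -> SE -> add original map."""
    def __init__(self, concat_channels: int, out_channels: int):
        super().__init__()
        self.expand = nn.Conv2d(concat_channels, out_channels, kernel_size=1)
        self.se = SELayer(out_channels)

    def forward(self, processed_subsets, original: torch.Tensor) -> torch.Tensor:
        z = torch.cat(processed_subsets, dim=1)  # concatenate by channel
        z = self.se(self.expand(z))              # restore channels, then enhance
        return original + z                      # accumulation layer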
Step 302: acquire the output feature map output by the feature extraction model.
For example, the obtained output feature map can be applied to classification, detection, segmentation and other tasks.
In the embodiment of the application, after the original feature map is input into the feature extraction model, it is divided into G groups of feature subsets by the grouping network; the multi-scale enhancement network performs multi-scale enhancement processing on each group of feature subsets; the post-processing network then concatenates the G groups of processed feature subsets and adds the concatenated feature map to the original feature map to obtain the output feature map. The multi-scale enhancement processing applied to each group of feature subsets includes pooling, convolution, upsampling and accumulation.
As can be seen from the above, the multi-scale enhancement network replaces the 3 × 3 convolution groups used in the existing Res2Net network. Because the multi-scale enhancement network pools each group of feature subsets to reduce their resolution before convolving them, the computation and parameter counts drop; upsampling after the convolution restores the subsets to their pre-pooling resolution, and accumulating them with the pre-pooling subsets restores the feature detail lost to pooling. Computation and parameters are therefore reduced while the effectiveness of the extracted feature information is preserved.
Fig. 4 is a hardware block diagram of an electronic device according to an exemplary embodiment of the present application. The electronic device includes: a communication interface 401, a processor 402, a machine-readable storage medium 403, and a bus 404, where the communication interface 401, the processor 402 and the machine-readable storage medium 403 communicate with each other via the bus 404. The processor 402 can execute the feature extraction method described above by reading and executing, from the machine-readable storage medium 403, machine-executable instructions corresponding to the control logic of the method; for the specifics of the method, refer to the embodiments above, which are not repeated here.
The machine-readable storage medium 403 referred to herein may be any electronic, magnetic, optical, or other physical storage device that can contain or store information such as executable instructions and data. For example, the machine-readable storage medium may be volatile memory, non-volatile memory, or a similar storage medium. In particular, the machine-readable storage medium 403 may be a RAM (Random Access Memory), a flash memory, a storage drive (such as a hard disk drive), any type of storage disk (such as an optical disk or a DVD), a similar storage medium, or a combination thereof.
Fig. 5 is a block diagram of an embodiment of a feature extraction apparatus according to an exemplary embodiment of the present invention. The apparatus performs feature extraction with the feature extraction model shown in Fig. 2 and can be applied to an electronic device. As shown in Fig. 5, the apparatus includes:
a feature extraction module 510, configured to input an original feature map into a trained feature extraction model, the feature extraction model grouping the original feature map by channel through a grouping network to obtain G groups of feature subsets and outputting them to a multi-scale enhancement network in the feature extraction model, the multi-scale enhancement network performing multi-scale enhancement processing on the G groups of feature subsets to obtain G groups of processed feature subsets and outputting them to a post-processing network in the feature extraction model, and the post-processing network concatenating the G groups of processed feature subsets by channel and adding the concatenated feature map to the original feature map to obtain an output feature map;
an obtaining module 520, configured to obtain an output feature map output by the feature extraction model;
wherein the multi-scale enhancement processing comprises pooling processing, convolution processing, upsampling processing, and accumulation processing.
In an optional implementation, when the grouping network groups the original feature map by channel into the G groups of feature subsets, the feature extraction module 510 is specifically configured to: perform dimension-reduction processing on the original feature map through a first convolution layer in the grouping network to obtain a dimension-reduced feature map and output it to a grouping layer in the grouping network; the grouping layer groups the dimension-reduced feature map by channel to obtain the G groups of feature subsets. All groups of feature subsets have the same size, but each group has 1/G of the channels of the dimension-reduced feature map.
In an optional implementation, when the multi-scale enhancement network performs multi-scale enhancement processing on the G groups of feature subsets respectively to obtain the G groups of processed feature subsets, the feature extraction module 510 is specifically configured to: perform multi-scale enhancement processing on the first group of feature subsets to obtain the first group of processed feature subsets; and, starting from the second group of feature subsets, concatenate each group by channel with the previous group's processed feature subsets and perform multi-scale enhancement processing on the concatenated feature subsets to obtain that group's processed feature subsets.
In an optional implementation, when the post-processing network concatenates the G groups of processed feature subsets by channel and adds the concatenated feature map to the original feature map to obtain the output feature map, the feature extraction module 510 is specifically configured to: concatenate the G groups of processed feature subsets by channel through a concatenation layer in the post-processing network to obtain a concatenated feature map and output it to a second convolution layer in the post-processing network; the second convolution layer performs dimension-raising processing on the concatenated feature map to obtain the dimension-raised concatenated feature map and outputs it to a Squeeze-and-Excitation (SE) layer in the post-processing network; the SE layer enhances the dimension-raised concatenated feature map to obtain an enhanced feature map and outputs it to an accumulation layer of the post-processing network; and the accumulation layer adds the original feature map and the enhanced feature map to obtain the output feature map. The dimension-raised concatenated feature map has the same number of channels and the same size as the original feature map.
For details of how each unit in the above apparatus implements its functions, see the corresponding steps of the method above; they are not repeated here.
For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the scheme of the invention. One of ordinary skill in the art can understand and implement it without inventive effort.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. The invention is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element preceded by "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus comprising that element.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (7)

1. A method for extracting features of an image, the method comprising:
inputting an original feature map of an input image into a trained feature extraction model, the feature extraction model grouping the original feature map by channel through a grouping network to obtain G groups of feature subsets and outputting them to a multi-scale enhancement network in the feature extraction model, the multi-scale enhancement network performing multi-scale enhancement processing on the G groups of feature subsets respectively to obtain G groups of processed feature subsets and outputting them to a post-processing network in the feature extraction model, and the post-processing network concatenating the G groups of processed feature subsets by channel and adding the concatenated feature map to the original feature map to obtain an output feature map;
acquiring the output feature map output by the feature extraction model, wherein the output feature map is used for any one of a classification task, a detection task and a segmentation task;
the multi-scale enhancement network respectively performs multi-scale enhancement processing on the G groups of feature subsets to obtain the G groups of processed feature subsets, and the method comprises the following steps: performing multi-scale enhancement processing on the first group of feature subsets to obtain a first group of processed feature subsets; starting from the second group of feature subsets, splicing the processed feature subsets of the previous group with the feature subsets of each group according to a channel, and performing multi-scale enhancement processing on the spliced feature subsets of the group to obtain the processed feature subsets of the group;
the processing sequence of the multi-scale enhancement processing is as follows in sequence: pooling, convolution, up-sampling and accumulation.
2. The method of claim 1, wherein the grouping network grouping the original feature map by channel into the G groups of feature subsets comprises:
performing dimension-reduction processing on the original feature map through a first convolution layer in the grouping network to obtain a dimension-reduced feature map, and outputting the dimension-reduced feature map to a grouping layer in the grouping network;
the grouping layer grouping the dimension-reduced feature map by channel to obtain the G groups of feature subsets;
wherein all groups of feature subsets have the same size, but each group has 1/G of the channels of the dimension-reduced feature map.
3. The method of claim 1, wherein the post-processing network concatenating the G groups of processed feature subsets by channel and adding the concatenated feature map to the original feature map to obtain the output feature map comprises:
concatenating the G groups of processed feature subsets by channel through a concatenation layer in the post-processing network to obtain a concatenated feature map, and outputting the concatenated feature map to a second convolution layer in the post-processing network;
the second convolution layer performing dimension-raising processing on the concatenated feature map to obtain the dimension-raised concatenated feature map, and outputting it to a Squeeze-and-Excitation (SE) layer in the post-processing network;
the SE layer enhancing the dimension-raised concatenated feature map to obtain an enhanced feature map, and outputting it to an accumulation layer of the post-processing network;
the accumulation layer adding the original feature map and the enhanced feature map to obtain the output feature map;
wherein the dimension-raised concatenated feature map has the same number of channels and the same size as the original feature map.
4. An apparatus for extracting features of an image, the apparatus comprising:
a feature extraction module, configured to input an original feature map of an input image into a trained feature extraction model, the feature extraction model grouping the original feature map by channel through a grouping network to obtain G groups of feature subsets and outputting them to a multi-scale enhancement network in the feature extraction model, the multi-scale enhancement network performing multi-scale enhancement processing on the G groups of feature subsets respectively to obtain G groups of processed feature subsets and outputting them to a post-processing network in the feature extraction model, and the post-processing network concatenating the G groups of processed feature subsets by channel and adding the concatenated feature map to the original feature map to obtain an output feature map; wherein the multi-scale enhancement network performing multi-scale enhancement processing on the G groups of feature subsets respectively to obtain the G groups of processed feature subsets comprises: performing multi-scale enhancement processing on the first group of feature subsets to obtain the first group of processed feature subsets, and, starting from the second group of feature subsets, concatenating each group by channel with the previous group's processed feature subsets and performing multi-scale enhancement processing on the concatenated feature subsets to obtain that group's processed feature subsets;
an acquisition module, configured to acquire the output feature map output by the feature extraction model, wherein the output feature map is used for any one of a classification task, a detection task and a segmentation task;
wherein the multi-scale enhancement processing is performed in the following order: pooling, convolution, upsampling and accumulation.
5. The apparatus according to claim 4, wherein the feature extraction module is specifically configured to, when the grouping network groups the original feature map by channel into the G groups of feature subsets: perform dimension-reduction processing on the original feature map through a first convolution layer in the grouping network to obtain a dimension-reduced feature map and output it to a grouping layer in the grouping network; the grouping layer groups the dimension-reduced feature map by channel to obtain the G groups of feature subsets; wherein all groups of feature subsets have the same size, but each group has 1/G of the channels of the dimension-reduced feature map.
6. The apparatus according to claim 4, wherein the feature extraction module is specifically configured to, when the multi-scale enhancement network performs multi-scale enhancement processing on the G groups of feature subsets respectively to obtain the G groups of processed feature subsets: perform multi-scale enhancement processing on the first group of feature subsets to obtain the first group of processed feature subsets; and, starting from the second group of feature subsets, concatenate each group by channel with the previous group's processed feature subsets and perform multi-scale enhancement processing on the concatenated feature subsets to obtain that group's processed feature subsets.
7. The apparatus according to claim 4, wherein the feature extraction module is specifically configured to, when the post-processing network concatenates the G groups of processed feature subsets by channel and adds the concatenated feature map to the original feature map to obtain the output feature map: concatenate the G groups of processed feature subsets by channel through a concatenation layer in the post-processing network to obtain a concatenated feature map and output it to a second convolution layer in the post-processing network; the second convolution layer performs dimension-raising processing on the concatenated feature map to obtain the dimension-raised concatenated feature map and outputs it to a Squeeze-and-Excitation (SE) layer in the post-processing network; the SE layer enhances the dimension-raised concatenated feature map to obtain an enhanced feature map and outputs it to an accumulation layer of the post-processing network; the accumulation layer adds the original feature map and the enhanced feature map to obtain the output feature map; wherein the dimension-raised concatenated feature map has the same number of channels and the same size as the original feature map.

Priority Applications (1)

Application Number: CN201910927813.8A
Priority Date: 2019-09-27
Filing Date: 2019-09-27
Title: Feature extraction method and device

Publications (2)

Publication Number Publication Date
CN110781923A CN110781923A (en) 2020-02-11
CN110781923B (en) 2023-02-07

Family

ID=69384601

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910927813.8A Active CN110781923B (en) 2019-09-27 2019-09-27 Feature extraction method and device

Country Status (1)

Country Link
CN (1) CN110781923B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111612751B (en) * 2020-05-13 2022-11-15 河北工业大学 Lithium battery defect detection method based on Tiny-yolov3 network embedded with grouping attention module
CN111553321A (en) * 2020-05-18 2020-08-18 城云科技(中国)有限公司 Mobile vendor target detection model, detection method and management method thereof
CN112489001B (en) * 2020-11-23 2023-07-25 石家庄铁路职业技术学院 Tunnel water seepage detection method based on improved deep learning
CN112633077A (en) * 2020-12-02 2021-04-09 特斯联科技集团有限公司 Face detection method, system, storage medium and terminal based on intra-layer multi-scale feature enhancement
CN112580453A (en) * 2020-12-08 2021-03-30 成都数之联科技有限公司 Land use classification method and system based on remote sensing image and deep learning
CN112507888A (en) * 2020-12-11 2021-03-16 北京建筑大学 Building identification method and device
CN112686297B (en) * 2020-12-29 2023-04-14 中国人民解放军海军航空大学 Radar target motion state classification method and system
CN113643261B (en) * 2021-08-13 2023-04-18 江南大学 Lung disease diagnosis method based on frequency attention network
CN114092813B (en) * 2021-11-25 2022-08-05 中国科学院空天信息创新研究院 Industrial park image extraction method and system, electronic equipment and storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108615027A (en) * 2018-05-11 2018-10-02 常州大学 A method of video crowd is counted based on shot and long term memory-Weighted Neural Network
CN109344883A (en) * 2018-09-13 2019-02-15 西京学院 Fruit tree diseases and pests recognition methods under a kind of complex background based on empty convolution
CN109934241A (en) * 2019-03-28 2019-06-25 南开大学 It can be integrated into Image Multiscale information extracting method and the application in neural network framework
CN110059772A (en) * 2019-05-14 2019-07-26 温州大学 Remote sensing images semantic segmentation method based on migration VGG network
CN110232693A (en) * 2019-06-12 2019-09-13 桂林电子科技大学 A kind of combination thermodynamic chart channel and the image partition method for improving U-Net

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Samatha P Salim et al.; "Proposed method to Malayalam Handwritten Character Recognition using Residual Network enhanced by multi-scaled features"; 2019 1st International Conference on Innovations in Information and Communication Technology (ICIICT); 2019-06-20; entire document *
Shang-Hua Gao et al.; "Res2Net: A New Multi-Scale Backbone Architecture"; IEEE Transactions on Pattern Analysis and Machine Intelligence; 2019-08-30; vol. 43, no. 2; entire document *
赵丹新 (Zhao Danxin) et al.; "基于ResNet的遥感图像飞机目标检测新方法" [A new ResNet-based method for aircraft target detection in remote sensing images]; 《电子设计工程》 (Electronic Design Engineering); 2018-11-20; vol. 26, no. 22; entire document *

Similar Documents

Publication Publication Date Title
CN110781923B (en) Feature extraction method and device
CN108010031B (en) Portrait segmentation method and mobile terminal
CN108664981B (en) Salient image extraction method and device
US9239948B2 (en) Feature descriptor for robust facial expression recognition
JP7045483B2 (en) Coding pattern processing methods and devices, electronic devices, and computer programs
US11354797B2 (en) Method, device, and system for testing an image
CN109919110B (en) Video attention area detection method, device and equipment
CN111476719A (en) Image processing method, image processing device, computer equipment and storage medium
JP7026165B2 (en) Text recognition method and text recognition device, electronic equipment, storage medium
CN112308866A (en) Image processing method, image processing device, electronic equipment and storage medium
CN114238904B (en) Identity recognition method, and training method and device of dual-channel hyper-resolution model
CN112001923B (en) Retina image segmentation method and device
CN111553290A (en) Text recognition method, device, equipment and storage medium
CN110619334A (en) Portrait segmentation method based on deep learning, architecture and related device
CN113256643A (en) Portrait segmentation model training method, storage medium and terminal equipment
CN111967478A (en) Feature map reconstruction method and system based on weight inversion, storage medium and terminal
CN108810319B (en) Image processing apparatus, image processing method, and program
Zheng et al. Joint residual pyramid for joint image super-resolution
CN115187456A (en) Text recognition method, device, equipment and medium based on image enhancement processing
CN112288748B (en) Semantic segmentation network training and image semantic segmentation method and device
CN114511702A (en) Remote sensing image segmentation method and system based on multi-scale weighted attention
US20200372280A1 (en) Apparatus and method for image processing for machine learning
CN115393868B (en) Text detection method, device, electronic equipment and storage medium
CN113963282A (en) Video replacement detection and training method and device of video replacement detection model
CN111831207A (en) Data processing method, device and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant