CN112232361A - Image processing method and device, electronic equipment and computer readable storage medium - Google Patents
- Publication number
- CN112232361A (application number CN202011091051.1A)
- Authority
- CN
- China
- Prior art keywords
- feature map
- feature
- convolution
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/211—Selection of the most significant subset of features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The application provides an image processing method and apparatus, an electronic device, and a computer-readable storage medium. The method acquires a multi-scale feature map of an image to be detected as a first feature map. For each first feature map, the position features and non-position features in it are identified, and a convolution operation is performed on the first feature map to obtain a second feature map; in this convolution operation, the weight of the convolution kernel corresponding to the position features is greater than the weight of the kernel corresponding to the non-position features, so the position features in the second feature map are enhanced. The third feature map, obtained by multiplying the second feature map with the first feature map, therefore carries enhanced position features. A fourth feature map for detecting the image to be detected is then generated from the third feature map, which improves the accuracy of detecting the image.
Description
Technical Field
The present application relates to the field of data processing, and in particular, to a method and an apparatus for image processing, an electronic device, and a computer-readable storage medium.
Background
In image processing, an FPN (Feature Pyramid Network) is often used to detect and process images. An FPN mainly comprises convolution operations and feature fusion processing: the convolution operations produce feature maps at different scales, and the fusion processing performs feature fusion calculations on the convolved feature maps to obtain a new feature map, which is then used for image processing tasks such as image detection.
Because the position features in a feature map obtained by downsampling (i.e., reducing the number of sampling points of the feature map through convolution operations in the FPN) are weakened, the new feature map obtained from the subsequent fusion calculation contains fewer position features, which affects the accuracy of image detection.
Disclosure of Invention
The inventors have found through research that the position features in a feature map obtained by downsampling are weakened because they become almost indistinguishable from the non-position features; that is, the position-feature information is no longer prominent in the downsampled feature map. The present application therefore provides an image processing method and apparatus that enhance the position features in a feature map before the feature fusion calculation, so as to solve the problem that the fused feature map contains too few position features and degrades image detection accuracy.
In order to achieve the above object, the present application provides the following technical solutions:
a method of image processing, comprising:
receiving an image to be detected;
acquiring a multi-scale feature map of the image to be detected as a first feature map;
for each first feature map, identifying the position features and non-position features in the first feature map;
performing a convolution operation on the first feature map to obtain a second feature map, wherein in the convolution operation the weight of the convolution kernel corresponding to the position features is greater than the weight of the convolution kernel corresponding to the non-position features;
multiplying the second feature map with the first feature map to obtain a third feature map; and
generating, according to the third feature map, a fourth feature map for detecting the image to be detected.
In the foregoing method, optionally, performing the convolution operation on the first feature map to obtain the second feature map includes:
performing the convolution operation on the first feature map multiple times to obtain the second feature map.
In the foregoing method, optionally, in the first convolution operation on the first feature map, the size of each convolution kernel is 1 × 1 × M, where M is the number of channels of the first feature map.
In the foregoing method, optionally, in any convolution operation after the first one, the size of each convolution kernel is 1 × 1 × N, where N is the number of convolution kernels in the preceding convolution operation.
In the above method, optionally, identifying the position features and non-position features in the first feature map and performing the convolution operation on the first feature map to obtain the second feature map includes:
inputting the first feature map into a pre-constructed spatial weight model to obtain the second feature map output by the model; the spatial weight model identifies the position features and non-position features in the first feature map and performs the convolution operation on it, with the weight of the convolution kernel corresponding to the position features greater than the weight of the kernel corresponding to the non-position features.
Optionally, in the method, generating the fourth feature map for detecting the image to be detected according to the third feature map includes:
taking the third feature map with the smallest scale as the first fourth feature map; and
for any third feature map other than the one with the smallest scale, performing a convolution calculation on it with a 1 × 1 convolution layer and performing a feature concatenation operation with its upsampled target fourth feature map to obtain the fourth feature map corresponding to that third feature map.
The target fourth feature map of a given third feature map is the fourth feature map corresponding to the third feature map whose scale is adjacent to and smaller than the given one among the multi-scale third feature maps; each fourth feature map is calculated at least from its corresponding third feature map.
Optionally, acquiring the multi-scale feature map of the image to be detected includes:
performing bottom-up path downsampling on the image to be detected with an FPN to obtain feature maps of the image at different scales.
An apparatus for image processing, comprising:
the receiving unit is used for receiving an image to be detected;
the acquisition unit is used for acquiring the multi-scale feature map of the image to be detected as a first feature map;
the identification unit is used for identifying, for each first feature map, the position features and non-position features in the first feature map;
the first operation unit is used for performing a convolution operation on the first feature map to obtain a second feature map, wherein in the convolution operation the weight of the convolution kernel corresponding to the position features is greater than the weight of the convolution kernel corresponding to the non-position features;
the second operation unit is used for multiplying the second feature map with the first feature map to obtain a third feature map;
and the generating unit is used for generating a fourth feature map for detecting the image to be detected according to the third feature map.
An electronic device, comprising: a processor and a memory for storing a program; the processor is used for running the program to realize the image processing method.
A computer-readable storage medium having stored therein instructions which, when run on a computer, cause the computer to perform the method of image processing described above.
According to the method and apparatus, the multi-scale feature map of the image to be detected is acquired as the first feature map; for each first feature map, its position features and non-position features are identified, and a convolution operation is performed on it to obtain a second feature map. Because the weight of the convolution kernel corresponding to the position features is greater than the weight of the kernel corresponding to the non-position features in this convolution operation, the position features in the second feature map are enhanced, and the third feature map obtained by multiplying the second feature map with the first feature map carries enhanced position features. The fourth feature map generated from the third feature map for detecting the image to be detected therefore improves detection accuracy.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings in the following description are only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a method for image processing according to an embodiment of the present disclosure;
fig. 2 is a schematic diagram of image processing of an image processing model according to an embodiment of the present application;
fig. 3 is a schematic diagram of image processing of a spatial weight model according to an embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In the field of image processing, particularly image detection, it is common to extract image features with an FPN and detect the image based on the extracted features. However, because the FPN performs bottom-up path downsampling on the image to obtain the multi-scale feature map, the original position features of the image may be weakened, which may reduce the accuracy of subsequent image detection based on the multi-scale feature map.
The inventors have found that the position features in a feature map obtained by downsampling are weakened because they become almost indistinguishable from the non-position features; that is, the downsampled feature map weakens the position-feature information.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, the present application is described in further detail with reference to the accompanying drawings and the detailed description. It is to be understood that the embodiments described are only a few embodiments of the present application and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The method provided by the embodiments of the application is executed by a server with image processing capability.
Fig. 1 is a flowchart of an image processing method according to an embodiment of the present application, and the method may include the following steps:
s101, receiving an image to be detected.
S102, acquiring a multi-scale characteristic diagram of the image to be detected as a first characteristic diagram.
A specific implementation of this step is: use the FPN to perform bottom-up path downsampling on the image to be detected, obtaining feature maps of the image at different scales, and take these feature maps as the first feature maps.
The number of multi-scale feature maps obtained by the FPN processing, and the scale of each feature map, can be set as required. For a specific implementation of downsampling the image with an FPN, refer to the prior art.
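As a non-authoritative sketch of this step, the bottom-up pyramid can be illustrated with plain NumPy. The 2 × 2 max pooling here is only a stand-in for the FPN's learned stride-2 convolution stages; the pooling choice, pyramid depth, and shapes are assumptions for illustration:

```python
import numpy as np

def downsample2x(fmap):
    """Halve spatial resolution with 2x2 max pooling (a stand-in for the
    FPN's learned stride-2 stages; layer details are assumptions)."""
    h, w, c = fmap.shape
    trimmed = fmap[:h - h % 2, :w - w % 2]
    return trimmed.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))

def bottom_up_pyramid(image, levels=4):
    """Return multi-scale first feature maps (C2..C5), coarsest last."""
    fmap, pyramid = image, []
    for _ in range(levels):
        fmap = downsample2x(fmap)
        pyramid.append(fmap)
    return pyramid

image = np.random.rand(64, 64, 3)
c2, c3, c4, c5 = bottom_up_pyramid(image)
print([f.shape for f in (c2, c3, c4, c5)])
# [(32, 32, 3), (16, 16, 3), (8, 8, 3), (4, 4, 3)] -- scales shrink bottom-up
```

C5, the coarsest map, corresponds to the smallest-scale first feature map referred to later in the description.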
S103, for each first feature map, identifying the position features and non-position features in the first feature map.
A position feature describes the location information of a target object in the image, such as the edges of its shape contour.
In this step, the position features and non-position features in the first feature map may be identified using a pre-trained position recognition model. The model may be obtained by training on samples carrying position-feature and non-position-feature labels; for the training procedure, refer to existing neural-network model training methods.
Alternatively, the position features and non-position features in the first feature map may be determined by existing methods for identifying the shape contour edges of a target object.
S104, performing a convolution operation on the first feature map to obtain a second feature map.
In the convolution operation, the weight of the convolution kernel corresponding to the position features is greater than the weight of the convolution kernel corresponding to the non-position features.
It should be noted that making the weights of the kernels corresponding to the position features larger than those corresponding to the non-position features differentiates the two kinds of features in the feature map, thereby highlighting and enhancing the position features.
This step may be implemented as follows: perform the convolution operation on the first feature map multiple times to obtain the second feature map, where in every convolution operation the weight of the kernel corresponding to the position features is greater than that corresponding to the non-position features. The number of convolution operations can be set as required; the purpose of each operation is to further enhance the position-feature information in the feature map.
The multiple convolution operations proceed as follows: the first convolution operation takes the first feature map as its input, and each subsequent convolution operation takes the feature map produced by the previous operation.
In the first convolution operation on the first feature map, the convolution kernel size is 1 × 1 × M, where M is the number of channels of the first feature map. In any subsequent convolution operation, the kernel size is 1 × 1 × N, where N is the number of convolution kernels in the previous operation.
For example, if the kernels of the first convolution operation have size 1 × 1 × M and there are 512 of them, the kernels of the second convolution operation have size 1 × 1 × 512.
Note that the convolution operations in this step do not change the scale of the feature map; the final second feature map has the same scale as the first feature map.
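The kernel-size rule above (1 × 1 × M first, then 1 × 1 × N) can be sketched with NumPy, since a 1 × 1 convolution is simply a per-position matrix product over channels. The kernel counts and random weights are illustrative only; in the patent the weights are learned so that position features receive larger values:

```python
import numpy as np

def conv1x1(fmap, kernels):
    """Apply N 1x1xM kernels to an HxWxM map: mixes channels at each
    position, so the output keeps the input's spatial scale."""
    return fmap @ kernels  # (H, W, M) @ (M, N) -> (H, W, N)

first_map = np.random.rand(8, 8, 16)   # first feature map, M = 16 channels
w1 = np.random.rand(16, 512)           # first operation: 512 kernels of size 1x1x16
w2 = np.random.rand(512, 512)          # second operation: kernels of size 1x1x512
second_map = conv1x1(conv1x1(first_map, w1), w2)
print(second_map.shape)                # (8, 8, 512) -- still the 8x8 scale
```

As the step requires, the spatial scale (8 × 8) never changes; only the channel dimension does.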
S105, multiplying the second feature map with the first feature map to obtain a third feature map.
In this embodiment, the second feature map and the first feature map have the same scale; the multiplication multiplies each feature value in the second feature map by the feature value at the same position in the first feature map to obtain the third feature map.
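A minimal NumPy illustration of this element-wise multiplication; the numbers are toy values, not learned weights:

```python
import numpy as np

# The second feature map acts as a per-position weight map: multiplying it
# element-wise into the first feature map scales each position individually,
# amplifying the positions that (by assumption) carry position features.
first_map = np.array([[1.0, 2.0],
                      [3.0, 4.0]])
second_map = np.array([[2.0, 1.0],
                       [1.0, 2.0]])
third_map = second_map * first_map     # same scale as both inputs
print(third_map)                       # [[2. 2.] [3. 8.]]
```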
S106, generating, according to the third feature map, a fourth feature map for detecting the image to be detected.
A specific implementation of this step may include steps A1 and A2:
Step A1, take the third feature map with the smallest scale as the first fourth feature map.
In this embodiment, for a first feature map of each scale, the second feature map obtained by the convolution operations has the same scale as the first feature map, so the third feature map obtained by multiplying the two also has that scale. The third feature map with the smallest scale is therefore the one corresponding to the smallest-scale first feature map among the multi-scale first feature maps.
Step A2, for any third feature map other than the one with the smallest scale, perform a convolution calculation on it with a 1 × 1 convolution layer, then perform a feature concatenation operation with the upsampled target fourth feature map to obtain the fourth feature map corresponding to that third feature map.
The target fourth feature map of a given third feature map is the fourth feature map corresponding to the third feature map whose scale is adjacent to and smaller than the given one among the multi-scale third feature maps; each fourth feature map is calculated at least from its corresponding third feature map.
The 1 × 1 convolution makes the channel count of the resulting feature map match that of the target fourth feature map, and the upsampling makes the scale of the target fourth feature map match that of the convolved third feature map.
Performing the feature concatenation operation on the 1 × 1-convolved third feature map and the upsampled target fourth feature map, rather than adding them, avoids the aliasing effect.
In summary, the method acquires the multi-scale feature map of the image to be detected as the first feature map and, for each first feature map, identifies its position features and non-position features before convolving it into a second feature map. In the convolution operation, the weight of the convolution kernel corresponding to the position features is greater than the weight of the kernel corresponding to the non-position features, so the position features in the second feature map are enhanced; the third feature map obtained by multiplying the second feature map with the first feature map then carries enhanced position features, and the fourth feature map generated from it for detecting the image improves detection accuracy.
In the above embodiment, identifying the position features and non-position features in the first feature map and performing the convolution operation on the first feature map to obtain the second feature map may be completed by a pre-established spatial weight model.
The spatial weight model identifies the position features and non-position features in the first feature map and performs the convolution operation on it; in the convolution operation, the weight of the convolution kernel corresponding to the position features is greater than that of the kernel corresponding to the non-position features, so the position-feature information in the second feature map is enhanced.
In this embodiment, an image to be detected is processed by using an image processing model obtained by combining the FPN model and the spatial weight model. Fig. 2 is a schematic diagram of a process of processing a picture by an image processing model, and as shown in fig. 2, the process of processing the picture by the image processing model is divided into 3 parts:
the first part is a multi-scale first feature map generated by performing bottom-up path downsampling processing on a picture to be detected by using FPN, and C2, C3, C4 and C5 respectively represent first feature maps with different scales. The scale of the first feature map obtained by bottom-up path down-sampling is continuously reduced, and C5 is the first feature map with the smallest scale.
The second part calculates each first feature map Cn (n = 2, 3, 4, 5) with the spatial weight model to obtain the corresponding second feature map Rn (n = 2, 3, 4, 5). In Fig. 2, the horizontal line connecting Cn and Rn indicates that Cn is input to the spatial weight model, which outputs Rn after operating on Cn. Rn has the same size as Cn; for example, C2 and R2 have the same dimensions. Therefore, among Rn (n = 2, 3, 4, 5), R5 has the smallest scale and R2 the largest.
Fig. 3 is a schematic diagram of a specific processing procedure of the spatial weight model on the first feature map Cn.
The third part performs a feature fusion calculation on each Rn to obtain the final fourth feature maps Pn (n = 2, 3, 4, 5) for detecting the image. P5 is the R5 generated in the second part; the fusion process for P2, P3, and P4, shown in Fig. 2, is:
(1) upsample Pm+1 by a factor of 2 to obtain a feature map with the same scale as Rm, where m = 2, 3, 4;
(2) apply a 1 × 1 convolution to Rm to obtain a feature map with the same channel count as Pm+1, where m = 2, 3, 4;
(3) perform a feature concatenation operation on the feature maps obtained in (1) and (2) to obtain Pm, where m = 2, 3, 4. The concatenation replaces the addition in the traditional FPN structure, avoiding the aliasing effect caused by addition.
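Steps (1)–(3) can be sketched in NumPy as follows. Nearest-neighbour upsampling and a random projection matrix stand in for the learned 2× upsampling and 1 × 1 convolution; all shapes and channel counts are illustrative assumptions:

```python
import numpy as np

def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of an HxWxC feature map."""
    return fmap.repeat(2, axis=0).repeat(2, axis=1)

def fuse(r_m, p_next, proj):
    """P_m = concat(1x1-conv(R_m), upsample(P_{m+1})).
    `proj` plays the role of the 1x1 convolution that matches channel
    counts; concatenation replaces FPN's addition to avoid aliasing."""
    projected = r_m @ proj               # step (2): 1x1 convolution on R_m
    upsampled = upsample2x(p_next)       # step (1): 2x upsampling of P_{m+1}
    return np.concatenate([projected, upsampled], axis=2)  # step (3)

r3 = np.random.rand(16, 16, 32)          # R3: one level finer than P4
p4 = np.random.rand(8, 8, 8)             # P4: the target fourth feature map
p3 = fuse(r3, p4, np.random.rand(32, 8)) # projection to P4's channel count
print(p3.shape)                          # (16, 16, 16)
```

The concatenation doubles the channel count relative to addition, which is one practical difference between the two fusion choices.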
In the method provided by this embodiment, an image processing model combining the FPN model and the spatial weight model processes the image to be detected. The spatial weight model identifies the position features and non-position features in the first feature map and performs convolution operations on it; because the weight of the convolution kernel corresponding to the position features is greater than that corresponding to the non-position features, the position features in the second feature map output by the model are enhanced, which improves the accuracy of detecting the image.
Fig. 3 is a schematic diagram of the spatial weight model's specific processing of a first feature map Cn, in which the feature values of Cn are denoted Cn(p,q). As shown in Fig. 3, a small cube in Cn indicates that the feature values of all channels along the same line of Cn form a numerical sequence; the values Cn(p,q) may differ across the channels on the same line.
The specific processing process of the spatial weight model on the first feature map Cn comprises the following steps:
(1) Perform the first convolution operation on the first feature map with the first convolution layer to obtain feature map 1.
In Fig. 3, the small cubes in feature map 1 represent the numerical sequences produced by applying the first convolution layer to the corresponding numerical sequences of the small cubes in Cn.
In the first convolution operation, the size of the convolution kernel in the first convolution layer is 1 × 1 × M, where M is the number of channels in the first feature map Cn. The total number of convolution kernels in the first convolutional layer is 512 (the total number of convolution kernels may be other values, which is only an example).
In the first convolution operation, the weight of the convolution kernel corresponding to the position features is greater than the weight corresponding to the non-position features. During training, the spatial weight model can learn to distinguish the position-feature and non-position-feature information in Cn through the loss function and negative-feedback (backpropagation) calculations of the convolution process.
(2) Apply the ReLU activation function to feature map 1 to obtain feature map 2. The calculation is:
feature map 2 = ReLU(W1 ∗ Cn) = max(0, W1 ∗ Cn),
where the left-hand side denotes the feature values in feature map 2, W1 denotes the weights of the convolution kernels in the first convolution layer, and ∗ denotes the convolution operation.
(3) Perform the second convolution calculation on feature map 2 to obtain feature map 3. In Fig. 3, the small cubes in feature map 3 represent the numerical sequences produced by applying the second convolution layer to the corresponding numerical sequences of the small cubes in feature map 2.
Since the feature map obtained by the first convolution operation has 512 channels, the convolution kernels of the second convolution layer have size 1 × 1 × 512, and there are 512 of them (the total number may be other values; this is only an example). As in the first convolution layer, the weight of the kernels used for the position features of the feature map is greater than the weight of the kernels used for the non-position features.
(4) Normalize feature map 3 to obtain feature map 4. Feature map 4 is the second feature map of the above embodiment.
(5) Multiply feature map 4 by the first feature map Cn to obtain feature map 5. Feature map 5 is the third feature map of the above embodiment. In this embodiment, optionally, the spatial weight model may be pre-constructed as a model architecture that both produces the second feature map and multiplies it with the first feature map.
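Steps (1)-(5) can be collected into one small module. The sketch below is a hedged NumPy reading of the text, with two labeled assumptions: the normalization in step (4) is taken to be a sigmoid (the patent only says "normalize"), and the second layer is given M output channels so that feature map 4 can be multiplied element-wise with Cn (the patent states 512 kernels for that layer).

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def spatial_weight_module(cn, w1, w2):
    # cn: first feature map, (H, W, M)
    # w1: first 1x1 conv weights, (M, 512)
    # w2: second 1x1 conv weights, (512, M) -- assumption: M output channels
    #     so the product with cn is well defined.
    fm1 = cn @ w1        # (1) first 1x1 convolution
    fm2 = relu(fm1)      # (2) ReLU activation
    fm3 = fm2 @ w2       # (3) second 1x1 convolution
    fm4 = sigmoid(fm3)   # (4) normalization into (0, 1) -> second feature map
    return fm4 * cn      # (5) element-wise product with Cn -> third feature map

cn = np.random.rand(8, 8, 64)          # illustrative sizes, not from the patent
w1 = np.random.rand(64, 512) * 0.01
w2 = np.random.rand(512, 64) * 0.01
fm5 = spatial_weight_module(cn, w1, w2)
```

Because the normalized map lies in (0, 1), the product attenuates non-position features while passing position features through largely unchanged, which matches the enhancement effect described above.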
It should be noted that performing two convolution operations in the spatial weight model is only an example. Optionally, the spatial weight model may be configured to perform more convolution operations, provided that in each convolutional layer the weight of the convolution kernel used to calculate the position features of the feature map is greater than the weight of the kernel used to calculate the non-position features.
In the method provided by this embodiment, because the weight of the convolution kernel corresponding to the position features is greater than that corresponding to the non-position features in the convolution operations of the spatial weight model, the position features in the output second feature map are enhanced, which improves the accuracy of detecting the image to be detected.
Fig. 4 is a schematic structural diagram of an apparatus 400 for image processing according to the present application, which includes:
a receiving unit 401, configured to receive an image to be detected;
an obtaining unit 402, configured to obtain multi-scale feature maps of the image to be detected as first feature maps;
an identifying unit 403, configured to identify, for each first feature map, the position features and non-position features in the first feature map;
a first operation unit 404, configured to perform a convolution operation on the first feature map to obtain a second feature map, wherein in the convolution operation the weight of the convolution kernel corresponding to the position features is greater than that corresponding to the non-position features;
a second operation unit 405, configured to multiply the second feature map by the first feature map to obtain a third feature map;
and a generating unit 406, configured to generate, according to the third feature map, a fourth feature map for detecting the image to be detected.
Optionally, the specific implementation manner in which the first operation unit 404 performs the convolution operation on the first feature map to obtain the second feature map is as follows: the convolution operation is performed on the first feature map multiple times to obtain the second feature map, wherein in any one convolution operation, the weight of the convolution kernel corresponding to the position features is greater than that corresponding to the non-position features.
Optionally, in the convolution operation performed on the first feature map by the first operation unit 404, the size of the convolution kernel is 1 × 1 × M, where M is the number of channels of the first feature map.
Optionally, in any convolution operation after the first convolution operation performed by the first operation unit 404 on the first feature map, the size of the convolution kernel is 1 × 1 × N, where N is the number of convolution kernels of the previous convolution operation.
Optionally, the specific implementation manner in which the identifying unit 403 identifies the position features and non-position features in the first feature map and the first operation unit 404 performs the convolution operation on the first feature map to obtain the second feature map is as follows:
the first feature map is input into a pre-constructed spatial weight model, and the second feature map output by the spatial weight model is obtained. The spatial weight model is used to identify the position features and non-position features in the first feature map and to perform the convolution operation on the first feature map, wherein in the convolution operation, the weight of the convolution kernel corresponding to the position features is greater than the weight of the convolution kernel corresponding to the non-position features.
Optionally, the specific implementation manner of generating, by the generating unit 406, the fourth feature map for detecting the image to be detected according to the third feature map is as follows:
taking the third feature map with the smallest scale as the first fourth feature map;
for any third feature map other than the one with the smallest scale, performing a convolution calculation on it using a convolutional layer of size 1 × 1, and performing a feature connection operation between the result and the upsampled target fourth feature map, to obtain the fourth feature map corresponding to that third feature map;
here, the target fourth feature map of a given third feature map is the fourth feature map corresponding to the third feature map whose scale, among the multi-scale third feature maps, is adjacent to and smaller than that of the given third feature map; a third feature map and a fourth feature map correspond to each other when the fourth feature map is calculated at least from that third feature map.
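This top-down generation of the fourth feature maps can be sketched as follows. This is an assumption-laden reading: "feature connection" is interpreted as channel concatenation, the upsampling is nearest-neighbour, and all names and sizes are illustrative rather than from the patent.

```python
import numpy as np

def upsample2x(x):
    # Nearest-neighbour 2x upsampling of an (H, W, C) map.
    return x.repeat(2, axis=0).repeat(2, axis=1)

def fuse_top_down(third_maps, lateral_weights):
    # third_maps: ordered from smallest scale to largest, each (H, W, C).
    # lateral_weights: one (C, C) 1x1-conv weight matrix per larger level.
    fourth = [third_maps[0]]              # smallest third map = first fourth map
    for fm, w in zip(third_maps[1:], lateral_weights):
        lateral = fm @ w                  # 1x1 convolution on the third map
        up = upsample2x(fourth[-1])       # upsample the target fourth map
        # "feature connection" read here as channel concatenation (assumption)
        fourth.append(np.concatenate([lateral, up], axis=-1))
    return fourth

maps = [np.random.rand(4 * 2**i, 4 * 2**i, 8) for i in range(3)]  # small -> large
weights = [np.random.rand(8, 8) for _ in range(2)]
fourth_maps = fuse_top_down(maps, weights)
```

With concatenation the channel count grows at each level; an implementation could equally add a 1 × 1 convolution after the connection to restore a fixed channel count.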
Optionally, the specific implementation manner in which the obtaining unit 402 obtains the multi-scale feature maps of the image to be detected is as follows: bottom-up path down-sampling processing is performed on the image to be detected using an FPN (feature pyramid network), so as to obtain feature maps of the image to be detected at different scales.
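The bottom-up path can be illustrated with a toy pyramid builder. This is only a sketch: a real FPN backbone obtains each scale with strided convolutions and residual blocks, whereas here max-pooling merely stands in for the halving of spatial resolution; the level count and sizes are assumptions.

```python
import numpy as np

def downsample2x(x):
    # 2x2 max-pooling with stride 2 on an (H, W, C) map (H and W even).
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))

def bottom_up_pyramid(features, levels=3):
    # Repeatedly halve the spatial size to obtain first feature maps
    # at several scales (the bottom-up path of the pyramid).
    maps = [features]
    for _ in range(levels - 1):
        maps.append(downsample2x(maps[-1]))
    return maps

pyramid = bottom_up_pyramid(np.random.rand(32, 32, 8), levels=3)
```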
The present application further provides an electronic device 500, whose schematic structural diagram is shown in Fig. 5, including a processor 501 and a memory 502. The memory 502 is used for storing an application program, and the processor 501 is used for executing the application program to implement the image processing method of the present application, that is, to perform the following steps:
receiving an image to be detected;
acquiring multi-scale feature maps of the image to be detected as first feature maps;
for each first feature map, identifying the position features and non-position features in the first feature map;
performing a convolution operation on the first feature map to obtain a second feature map, wherein in the convolution operation the weight of the convolution kernel corresponding to the position features is greater than that corresponding to the non-position features;
multiplying the second feature map by the first feature map to obtain a third feature map;
and generating, according to the third feature map, a fourth feature map for detecting the image to be detected.
The present application also provides a computer-readable storage medium having instructions stored therein which, when run on a computer, cause the computer to perform the image processing method of the present application, that is, to perform the following steps:
acquiring multi-scale feature maps of the image to be detected as first feature maps;
for each first feature map, identifying the position features and non-position features in the first feature map;
performing a convolution operation on the first feature map to obtain a second feature map, wherein in the convolution operation the weight of the convolution kernel corresponding to the position features is greater than that corresponding to the non-position features;
multiplying the second feature map by the first feature map to obtain a third feature map;
and generating, according to the third feature map, a fourth feature map for detecting the image to be detected.
If the functions described in the methods of the embodiments of the present application are implemented in the form of software functional units and sold or used as independent products, they may be stored in a storage medium readable by a computing device. Based on such understanding, the part of the technical solutions of the embodiments of the present application that contributes to the prior art, or a part of those technical solutions, may be embodied in the form of a software product stored in a storage medium and including several instructions for causing a computing device (which may be a personal computer, a server, a mobile computing device or a network device) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk.
The embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same or similar parts among the embodiments are referred to each other.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (10)
1. A method of image processing, comprising:
receiving an image to be detected;
acquiring multi-scale feature maps of the image to be detected as first feature maps;
for each first feature map, identifying the position features and non-position features in the first feature map;
performing a convolution operation on the first feature map to obtain a second feature map, wherein in the convolution operation the weight of the convolution kernel corresponding to the position features is greater than that corresponding to the non-position features;
multiplying the second feature map by the first feature map to obtain a third feature map;
and generating, according to the third feature map, a fourth feature map for detecting the image to be detected.
2. The method of claim 1, wherein the convolving the first feature map to obtain a second feature map comprises:
and performing the convolution operation on the first characteristic diagram for multiple times to obtain the second characteristic diagram.
3. The method according to claim 2, wherein in the convolution operation performed on the first feature map, a convolution kernel size is 1 × 1 × M, where M is the number of channels of the first feature map.
4. The method according to claim 3, wherein in any one of the convolution operations after the first convolution operation performed on the first feature map, the convolution kernel size is 1 × 1 × N, where N is the number of convolution kernels of the previous convolution operation.
5. The method according to any one of claims 1-3, wherein said identifying the position features and non-position features in the first feature map and said performing the convolution operation on the first feature map to obtain a second feature map comprise:
inputting the first characteristic diagram into a pre-constructed space weight model to obtain a second characteristic diagram output by the space weight model; the spatial weight model is used for identifying position features and non-position features in the first feature map and performing convolution operation on the first feature map, wherein in the convolution operation, the weight of a convolution kernel corresponding to the position features is larger than the weight of a convolution kernel corresponding to the non-position features.
6. The method according to claim 1, wherein generating a fourth feature map for detecting the image to be detected according to the third feature map comprises:
taking the third feature map with the smallest scale as a first fourth feature map;
performing a convolution calculation, using a convolutional layer of size 1 × 1, on any third feature map other than the third feature map with the smallest scale, and performing a feature connection operation between the result and the upsampled target fourth feature map, to obtain the fourth feature map corresponding to that third feature map;
wherein the target fourth feature map of a given third feature map is the fourth feature map corresponding to the third feature map whose scale, among the multi-scale third feature maps, is adjacent to and smaller than that of the given third feature map, a third feature map and a fourth feature map corresponding to each other when the fourth feature map is calculated at least from that third feature map.
7. The method of claim 1, wherein the obtaining of the multi-scale feature map of the image to be detected comprises:
and performing bottom-up path down-sampling processing on the image to be detected by using the FPN to obtain characteristic maps of the image to be detected with different scales.
8. An apparatus for image processing, comprising:
a receiving unit, used for receiving an image to be detected;
an acquisition unit, used for acquiring multi-scale feature maps of the image to be detected as first feature maps;
an identification unit, used for identifying, for each first feature map, the position features and non-position features in the first feature map;
a first operation unit, used for performing a convolution operation on the first feature map to obtain a second feature map, wherein in the convolution operation the weight of the convolution kernel corresponding to the position features is greater than that corresponding to the non-position features;
a second operation unit, used for multiplying the second feature map by the first feature map to obtain a third feature map;
and a generating unit, used for generating, according to the third feature map, a fourth feature map for detecting the image to be detected.
9. An electronic device, comprising: a processor and a memory for storing a program; the processor is configured to execute the program to implement the method of image processing according to any one of claims 1 to 7.
10. A computer-readable storage medium having stored therein instructions which, when run on a computer, cause the computer to perform the method of image processing according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011091051.1A CN112232361B (en) | 2020-10-13 | 2020-10-13 | Image processing method and device, electronic equipment and computer readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112232361A true CN112232361A (en) | 2021-01-15 |
CN112232361B CN112232361B (en) | 2021-09-21 |
Family
ID=74112459
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011091051.1A Active CN112232361B (en) | 2020-10-13 | 2020-10-13 | Image processing method and device, electronic equipment and computer readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112232361B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113486908A (en) * | 2021-07-13 | 2021-10-08 | 杭州海康威视数字技术股份有限公司 | Target detection method and device, electronic equipment and readable storage medium |
Citations (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1738192A1 (en) * | 2004-04-08 | 2007-01-03 | Raytheon Company | System and method for dynamic weight processing |
CN106886023A (en) * | 2017-02-27 | 2017-06-23 | 中国人民解放军理工大学 | A kind of Radar Echo Extrapolation method based on dynamic convolutional neural networks |
CN108090836A (en) * | 2018-01-30 | 2018-05-29 | 南京信息工程大学 | Based on the equity investment method for weighting intensive connection convolutional neural networks deep learning |
CN108229497A (en) * | 2017-07-28 | 2018-06-29 | 北京市商汤科技开发有限公司 | Image processing method, device, storage medium, computer program and electronic equipment |
EP3385887A1 (en) * | 2017-04-08 | 2018-10-10 | INTEL Corporation | Sub-graph in frequency domain and dynamic selection of convolution implementation on a gpu |
CN109614876A (en) * | 2018-11-16 | 2019-04-12 | 北京市商汤科技开发有限公司 | Critical point detection method and device, electronic equipment and storage medium |
CN109996023A (en) * | 2017-12-29 | 2019-07-09 | 华为技术有限公司 | Image processing method and device |
CN110210571A (en) * | 2019-06-10 | 2019-09-06 | 腾讯科技(深圳)有限公司 | Image-recognizing method, device, computer equipment and computer readable storage medium |
CN110309876A (en) * | 2019-06-28 | 2019-10-08 | 腾讯科技(深圳)有限公司 | Object detection method, device, computer readable storage medium and computer equipment |
CN110807362A (en) * | 2019-09-23 | 2020-02-18 | 腾讯科技(深圳)有限公司 | Image detection method and device and computer readable storage medium |
CN110852349A (en) * | 2019-10-21 | 2020-02-28 | 上海联影智能医疗科技有限公司 | Image processing method, detection method, related equipment and storage medium |
CN111476252A (en) * | 2020-04-03 | 2020-07-31 | 南京邮电大学 | Computer vision application-oriented lightweight anchor-frame-free target detection method |
CN111507408A (en) * | 2020-04-17 | 2020-08-07 | 深圳市商汤科技有限公司 | Image processing method and device, electronic equipment and storage medium |
CN111612075A (en) * | 2020-05-22 | 2020-09-01 | 中国科学院自动化研究所 | Interest point and descriptor extraction method based on joint feature recombination and feature mixing |
CN111738069A (en) * | 2020-05-13 | 2020-10-02 | 北京三快在线科技有限公司 | Face detection method and device, electronic equipment and storage medium |
CN111753730A (en) * | 2020-06-24 | 2020-10-09 | 国网电子商务有限公司 | Image examination method and device |
CN111754491A (en) * | 2020-06-28 | 2020-10-09 | 国网电子商务有限公司 | Picture definition judging method and device |
Non-Patent Citations (3)
Title |
---|
JINCHEN等: "Dynamic Region-Aware Convolution", 《HTTPS://ARXIV.ORG/ABS/2003.12243V1》 * |
朱威等: "基于动态图卷积和空间金字塔池化的点云深度学习网络", 《计算机科学》 * |
陈朋等: "基于改进动态配置的FPGA卷积神经网络加速器的优化方法", 《高技术通讯》 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113486908A (en) * | 2021-07-13 | 2021-10-08 | 杭州海康威视数字技术股份有限公司 | Target detection method and device, electronic equipment and readable storage medium |
CN113486908B (en) * | 2021-07-13 | 2023-08-29 | 杭州海康威视数字技术股份有限公司 | Target detection method, target detection device, electronic equipment and readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN112232361B (en) | 2021-09-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Shen et al. | Deep semantic face deblurring | |
CN108875523B (en) | Human body joint point detection method, device, system and storage medium | |
CN109376631B (en) | Loop detection method and device based on neural network | |
CN108875533B (en) | Face recognition method, device, system and computer storage medium | |
CN110610154A (en) | Behavior recognition method and apparatus, computer device, and storage medium | |
CN111160229B (en) | SSD network-based video target detection method and device | |
CN113837257B (en) | Target detection method and device | |
CN107730514A (en) | Scene cut network training method, device, computing device and storage medium | |
CN111275066A (en) | Image feature fusion method and device and electronic equipment | |
WO2017070923A1 (en) | Human face recognition method and apparatus | |
CN113781164B (en) | Virtual fitting model training method, virtual fitting method and related devices | |
CN114266894A (en) | Image segmentation method and device, electronic equipment and storage medium | |
CN113569070A (en) | Image detection method and device, electronic equipment and storage medium | |
CN112528318A (en) | Image desensitization method and device and electronic equipment | |
CN110782430A (en) | Small target detection method and device, electronic equipment and storage medium | |
CN112232361B (en) | Image processing method and device, electronic equipment and computer readable storage medium | |
US20220189151A1 (en) | Processing system, estimation apparatus, processing method, and non-transitory storage medium | |
CN111027670B (en) | Feature map processing method and device, electronic equipment and storage medium | |
CN111079643B (en) | Face detection method and device based on neural network and electronic equipment | |
CN110210279A (en) | Object detection method, device and computer readable storage medium | |
CN115546271B (en) | Visual analysis method, device, equipment and medium based on depth joint characterization | |
CN111353577B (en) | Multi-task-based cascade combination model optimization method and device and terminal equipment | |
CN112801045B (en) | Text region detection method, electronic equipment and computer storage medium | |
CN109978043A (en) | A kind of object detection method and device | |
US20230035671A1 (en) | Generating stereo-based dense depth images |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CP03 | Change of name, title or address |
Address after: 100032 room 8018, 8 / F, building 7, Guangyi street, Xicheng District, Beijing Patentee after: State Grid Digital Technology Holdings Co.,Ltd. Patentee after: State Grid E-Commerce Technology Co., Ltd Address before: 311 guanganmennei street, Xicheng District, Beijing 100053 Patentee before: STATE GRID ELECTRONIC COMMERCE Co.,Ltd. Patentee before: State Grid E-Commerce Technology Co., Ltd |