CN112232361A - Image processing method and device, electronic equipment and computer readable storage medium - Google Patents


Info

Publication number
CN112232361A
Authority
CN
China
Prior art keywords
feature map
feature
characteristic diagram
convolution
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011091051.1A
Other languages
Chinese (zh)
Other versions
CN112232361B (en)
Inventor
张宾
孙喜民
周晶
刘丹
李晓明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Digital Technology Holdings Co ltd
State Grid E Commerce Technology Co Ltd
Original Assignee
State Grid E Commerce Co Ltd
State Grid E Commerce Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid E Commerce Co Ltd and State Grid E Commerce Technology Co Ltd
Priority to CN202011091051.1A
Publication of CN112232361A
Application granted
Publication of CN112232361B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211 Selection of the most significant subset of features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The application provides an image processing method and device, an electronic device and a computer-readable storage medium. The method comprises: obtaining multi-scale feature maps of an image to be detected as first feature maps; for each first feature map, identifying position features and non-position features in the first feature map; and performing a convolution operation on the first feature map to obtain a second feature map, wherein, in the convolution operation, the weight of the convolution kernel corresponding to the position features is greater than the weight of the convolution kernel corresponding to the non-position features, so that the position features in the second feature map are enhanced. A third feature map, obtained by multiplying the second feature map by the first feature map, therefore has enhanced position features, and a fourth feature map for detecting the image to be detected is generated from the third feature map, which can improve the accuracy of detecting the image to be detected.

Description

Image processing method and device, electronic equipment and computer readable storage medium
Technical Field
The present application relates to the field of data processing, and in particular, to a method and an apparatus for image processing, an electronic device, and a computer-readable storage medium.
Background
In image processing, an FPN (Feature Pyramid Network) is often used to detect and process images. The FPN mainly comprises convolution operations and feature fusion processing. Feature maps of different scales are obtained through the convolution operations, and the feature fusion processing performs a feature fusion calculation on the convolved feature maps to obtain new feature maps, which are then used for image processing such as image detection.
Because the position features contained in a feature map obtained by down-sampling (that is, by reducing the number of sampling points of the feature map through the convolution operations in the FPN) are weakened, a new feature map obtained by subsequently fusing such feature maps contains fewer position features, which affects the accuracy of image detection.
Disclosure of Invention
The inventors have found, through research, that the position features in a feature map obtained by down-sampling are weakened because they are almost indistinguishable from the non-position features; that is, the information of the position features in such a feature map is not prominent. The present application therefore provides an image processing method and apparatus that enhance and highlight the position features in the feature map before the feature fusion calculation is performed on the down-sampled feature maps, so as to solve the problem that the new feature map obtained by the feature fusion calculation contains fewer position features, which affects image detection accuracy.
In order to achieve the above object, the present application provides the following technical solutions:
a method of image processing, comprising:
receiving an image to be detected;
acquiring multi-scale feature maps of the image to be detected as first feature maps;
for each of the first feature maps, identifying position features and non-position features in the first feature map;
performing a convolution operation on the first feature map to obtain a second feature map, wherein, in the convolution operation, the weight of the convolution kernel corresponding to the position features is greater than the weight of the convolution kernel corresponding to the non-position features;
multiplying the second feature map by the first feature map to obtain a third feature map;
and generating, according to the third feature map, a fourth feature map for detecting the image to be detected.
In the foregoing method, optionally, performing the convolution operation on the first feature map to obtain the second feature map includes:
performing the convolution operation multiple times, starting from the first feature map, to obtain the second feature map.
In the foregoing method, optionally, in the convolution operation performed on the first feature map, the size of each convolution kernel is 1 × 1 × M, where M is the number of channels of the first feature map.
In the foregoing method, optionally, in any convolution operation after the one performed on the first feature map, the size of each convolution kernel is 1 × 1 × N, where N is the number of convolution kernels in the previous convolution operation.
In the above method, optionally, the identifying the position feature and the non-position feature in the first feature map, and performing convolution operation on the first feature map to obtain a second feature map includes:
inputting the first feature map into a pre-constructed spatial weight model to obtain the second feature map output by the spatial weight model; the spatial weight model is used for identifying the position features and non-position features in the first feature map and performing the convolution operation on the first feature map, wherein, in the convolution operation, the weight of the convolution kernel corresponding to the position features is greater than the weight of the convolution kernel corresponding to the non-position features.
Optionally, in the method, generating a fourth feature map for detecting the image to be detected according to the third feature map includes:
taking the third feature map with the smallest scale as the first fourth feature map;
for any third feature map other than the one with the smallest scale, performing a convolution calculation on the third feature map using a convolution layer of size 1 × 1, and performing a feature connection operation on the result and the up-sampled target fourth feature map to obtain the fourth feature map corresponding to the third feature map;
wherein the target fourth feature map of a third feature map is the fourth feature map corresponding to the third feature map whose scale, among the multi-scale third feature maps, is adjacent to and smaller than that of the third feature map, and each fourth feature map is calculated at least from its corresponding third feature map.
Optionally, acquiring the multi-scale feature maps of the image to be detected includes:
performing bottom-up path down-sampling on the image to be detected using the FPN to obtain feature maps of the image to be detected at different scales.
An apparatus for image processing, comprising:
a receiving unit, configured to receive an image to be detected;
an acquisition unit, configured to acquire multi-scale feature maps of the image to be detected as first feature maps;
an identification unit, configured to identify, for each first feature map, the position features and non-position features in the first feature map;
a first operation unit, configured to perform a convolution operation on the first feature map to obtain a second feature map, wherein, in the convolution operation, the weight of the convolution kernel corresponding to the position features is greater than the weight of the convolution kernel corresponding to the non-position features;
a second operation unit, configured to multiply the second feature map by the first feature map to obtain a third feature map;
and a generating unit, configured to generate, according to the third feature map, a fourth feature map for detecting the image to be detected.
An electronic device, comprising: a processor and a memory for storing a program; the processor is used for running the program to realize the image processing method.
A computer-readable storage medium having stored therein instructions which, when run on a computer, cause the computer to perform the method of image processing described above.
In the method and device of the present application, multi-scale feature maps of the image to be detected are obtained as first feature maps; for each first feature map, the position features and non-position features in it are identified, and a convolution operation is performed on it to obtain a second feature map. Because the weight of the convolution kernel corresponding to the position features is greater than that corresponding to the non-position features in the convolution operation, the position features in the second feature map are enhanced, and the third feature map obtained by multiplying the second feature map by the first feature map has enhanced position features. Generating the fourth feature map used for detecting the image to be detected from the third feature map can therefore improve the accuracy of detecting the image to be detected.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a flowchart of a method for image processing according to an embodiment of the present disclosure;
fig. 2 is a schematic diagram of image processing of an image processing model according to an embodiment of the present application;
fig. 3 is a schematic diagram of image processing of a spatial weight model according to an embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In the field of image processing, particularly image detection processing, it is common to perform image feature extraction on an image using FPN and detect the image based on the extracted image features. However, because the FPN performs bottom-up path down-sampling on the picture to obtain the multi-scale feature map, the original position features of the image may be weakened, which may possibly cause the accuracy of the subsequent image detection based on the multi-scale feature map to be reduced.
The inventors have found that the reason why the feature map obtained by the down-sampling process includes the weakened position features is that the position features in the feature map obtained by the down-sampling process are almost the same as those of the non-position features, that is, the feature map obtained by the down-sampling process weakens the information of the position features.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, the present application is described in further detail with reference to the accompanying drawings and the detailed description. It is to be understood that the embodiments described are only a few embodiments of the present application and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The method provided by the embodiments of the present application is executed by a server with an image processing function.
Fig. 1 is a flowchart of an image processing method according to an embodiment of the present application, and the method may include the following steps:
S101, receiving an image to be detected.
S102, acquiring multi-scale feature maps of the image to be detected as first feature maps.
A specific implementation of this step is: performing bottom-up path down-sampling on the image to be detected using the FPN to obtain feature maps of the image to be detected at different scales, and taking these feature maps as the first feature maps.
The number of multi-scale feature maps obtained by FPN processing, and the scale of each, can be set as required. For the specific implementation of down-sampling the image to be detected using the FPN, reference may be made to the prior art.
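As a rough illustration of what "feature maps at different scales" means in practice, the sketch below lists the spatial size of each bottom-up level. It assumes a common FPN convention that the text does not state: the first level (called C2 later in the description) has stride 4 relative to the input and each further level halves the size. The function name and parameters are hypothetical.

```python
def pyramid_scales(height, width, num_levels=4, first_stride=4):
    """Return (name, height, width) for each bottom-up level C2, C3, ...

    Assumption (not from the source): the first level has stride
    `first_stride` and every further level doubles the stride.
    """
    scales = []
    stride = first_stride
    for level in range(2, 2 + num_levels):
        h = -(-height // stride)  # ceiling division
        w = -(-width // stride)
        scales.append((f"C{level}", h, w))
        stride *= 2
    return scales


for name, h, w in pyramid_scales(512, 512):
    print(name, h, w)
```

Since the number of levels and their scales are configurable, `num_levels` and `first_stride` are left as parameters rather than fixed.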
S103, for each first feature map, identifying the position features and non-position features in the first feature map.
A position feature is information describing the position of a target image contained in the picture, such as the shape contour edge of the target image.
In this step, the position features and non-position features in the first feature map may be identified using a pre-trained position identification model. The position identification model may be obtained by training on training samples carrying position feature labels and non-position feature labels; for the specific training procedure, reference may be made to existing neural network model training methods.
Of course, the position features and non-position features in the first feature map may also be determined by an existing method for identifying the shape contour edge of a target image.
S104, performing a convolution operation on the first feature map to obtain a second feature map.
In the convolution operation, the weight of the convolution kernel corresponding to the position features is greater than the weight of the convolution kernel corresponding to the non-position features.
It should be noted that, because the weight of the convolution kernel corresponding to the position features is greater than that corresponding to the non-position features, the position features and non-position features in the resulting feature map are differentiated, which enhances and highlights the position features in the feature map.
An implementation of this step may be: performing the convolution operation multiple times, starting from the first feature map, to obtain the second feature map, where in every convolution operation the weight of the convolution kernel corresponding to the position features is greater than the weight of the convolution kernel corresponding to the non-position features. The specific number of convolution operations can be set as required. In this step, the purpose of each convolution operation is to further enhance the information of the position features in the feature map.
The multiple convolution operations proceed as follows: the first convolution operation takes the first feature map as its input, and each subsequent convolution operation is performed on the feature map produced by the preceding convolution operation.
In the convolution operation performed on the first feature map, the convolution kernel size is 1 × 1 × M, where M is the number of channels of the first feature map. In any subsequent convolution operation, the convolution kernel size is 1 × 1 × N, where N is the number of convolution kernels in the previous convolution operation.
For example, if the convolution kernels of the first convolution operation have size 1 × 1 × M and there are 512 of them, the convolution kernels of the second convolution operation have size 1 × 1 × 512.
Note that the convolution operations performed in this step do not change the scale of the feature map; that is, the scale of the final second feature map is the same as that of the first feature map.
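The kernel-size rule and the scale-preserving property can be checked with a small pure-Python sketch of a 1 × 1 convolution. Feature maps are stored as nested lists of shape H × W × C; the toy sizes and weights are hypothetical, and bias terms are omitted because the text does not mention any.

```python
def conv1x1(feature_map, kernels):
    """Apply a 1x1 convolution to a feature map stored as H x W x C lists.

    `kernels` holds N kernels, each a list of C weights (a 1 x 1 x C kernel).
    The result is an H x W x N feature map; H and W are unchanged.
    """
    return [
        [
            [sum(w * x for w, x in zip(kernel, pixel)) for kernel in kernels]
            for pixel in row
        ]
        for row in feature_map
    ]


def shape(feature_map):
    return (len(feature_map), len(feature_map[0]), len(feature_map[0][0]))


# Hypothetical toy sizes: a 2 x 3 first feature map with M = 4 channels.
first = [[[1.0, 2.0, 3.0, 4.0] for _ in range(3)] for _ in range(2)]
layer1 = [[0.1] * 4 for _ in range(5)]  # 5 kernels of size 1 x 1 x 4 (M = 4)
layer2 = [[0.2] * 5 for _ in range(4)]  # 4 kernels of size 1 x 1 x 5 (N = 5)

out1 = conv1x1(first, layer1)
out2 = conv1x1(out1, layer2)
print(shape(first), shape(out1), shape(out2))
```

The depth of each kernel in the second layer equals the number of kernels in the first layer, and only the channel count changes from layer to layer; the height and width stay fixed.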
S105, multiplying the second feature map by the first feature map to obtain a third feature map.
In this embodiment, the second feature map and the first feature map have the same scale, and the multiplication multiplies each feature value in the second feature map by the feature value at the same position in the first feature map to obtain the third feature map.
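A minimal sketch of this position-wise multiplication, assuming both feature maps are stored as H × W × C nested lists with the same number of channels (the text only states that the scales match; equal channel counts are an assumption here). The values are made up for illustration.

```python
def multiply_feature_maps(second, first):
    """Multiply two H x W x C feature maps value by value."""
    if len(second) != len(first) or len(second[0]) != len(first[0]):
        raise ValueError("feature maps must have the same scale")
    return [
        [
            [s * f for s, f in zip(s_pixel, f_pixel)]
            for s_pixel, f_pixel in zip(s_row, f_row)
        ]
        for s_row, f_row in zip(second, first)
    ]


first = [[[2.0, 4.0], [6.0, 8.0]]]   # 1 x 2 x 2 first feature map
second = [[[0.9, 0.1], [0.5, 1.0]]]  # hypothetical weights from the convolutions
third = multiply_feature_maps(second, first)
print(third)
```

Values of the first feature map that meet a large weight in the second feature map are kept almost unchanged, while those that meet a small weight are suppressed.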
S106, generating a fourth feature map for detecting the image to be detected according to the third feature map.
A specific implementation of this step may include steps A1 to A2:
Step A1, taking the third feature map with the smallest scale as the first fourth feature map.
In this embodiment, for the first feature map at each scale, the second feature map obtained by the convolution operations has the same scale as the first feature map, so the third feature map obtained by multiplying the second feature map by the first feature map also has the same scale as the first feature map. Therefore, the third feature map with the smallest scale is the one corresponding to the smallest-scale first feature map among the multi-scale first feature maps.
Step A2, for any third feature map other than the one with the smallest scale, performing a convolution calculation on the third feature map using a convolution layer of size 1 × 1, and performing a feature connection operation on the result and the up-sampled target fourth feature map to obtain the fourth feature map corresponding to the third feature map.
The target fourth feature map of a third feature map is the fourth feature map corresponding to the third feature map whose scale, among the multi-scale third feature maps, is adjacent to and smaller than that of the third feature map; each fourth feature map is calculated at least from its corresponding third feature map.
The convolution with the 1 × 1 convolution layer makes the number of channels of the resulting feature map consistent with that of the target fourth feature map, and the up-sampling makes the scale of the target fourth feature map consistent with that of the feature map obtained by the convolution.
Performing the feature connection operation between the result of the 1 × 1 convolution of the third feature map and the up-sampled target fourth feature map can avoid the aliasing effect.
In the method provided by this embodiment, multi-scale feature maps of the image to be detected are obtained as first feature maps; for each first feature map, the position features and non-position features in it are identified, and a convolution operation is performed on it to obtain a second feature map. In the convolution operation, the weight of the convolution kernel corresponding to the position features is greater than that corresponding to the non-position features, so the position features in the second feature map are enhanced, and the third feature map obtained by multiplying the second feature map by the first feature map has enhanced position features. Generating the fourth feature map used for detecting the image to be detected from the third feature map can therefore improve the accuracy of detecting the image to be detected.
In the above embodiment, identifying the position features and non-position features in the first feature map and performing the convolution operation on the first feature map to obtain the second feature map may be completed by a pre-constructed spatial weight model.
The spatial weight model is used for identifying the position features and the non-position features in the first feature map and carrying out convolution operation on the first feature map, wherein in the convolution operation, the weight of a convolution kernel corresponding to the position features is larger than that of a convolution kernel corresponding to the non-position features, so that the information of the position features in the second feature map is enhanced.
In this embodiment, the image to be detected is processed using an image processing model obtained by combining the FPN model and the spatial weight model. Fig. 2 is a schematic diagram of how the image processing model processes a picture. As shown in Fig. 2, the processing is divided into 3 parts:
The first part generates the multi-scale first feature maps by performing bottom-up path down-sampling on the picture to be detected using the FPN; C2, C3, C4 and C5 denote first feature maps of different scales. The scale of the first feature maps obtained by bottom-up path down-sampling decreases continuously, and C5 is the first feature map with the smallest scale.
The second part computes each first feature map Cn (n = 2, 3, 4, 5) with the spatial weight model to obtain the corresponding feature map Rn (n = 2, 3, 4, 5). In Fig. 2, the horizontal line connecting Cn (n = 2, 3, 4, 5) and the second feature map Rn (n = 2, 3, 4, 5) indicates that the first feature map Cn is input into the spatial weight model, which operates on it and outputs the second feature map Rn. The second feature map Rn obtained in this way has the same scale as the first feature map Cn; for example, the scale of C2 is the same as that of R2. Therefore, among Rn (n = 2, 3, 4, 5), R5 has the smallest scale and R2 the largest.
Fig. 3 is a schematic diagram of the specific processing of the first feature map Cn by the spatial weight model.
The third part performs feature fusion calculation on the Rn to obtain the final fourth feature maps Pn (n = 2, 3, 4, 5) used for detecting the image to be detected. P5 is the R5 generated in the second part, and the specific fusion process for P2, P3 and P4, shown in Fig. 2, is:
(1) performing 2× up-sampling on P(m+1) to obtain a feature map of the same size as Rm, where m = 2, 3, 4;
(2) performing a 1 × 1 convolution operation on Rm to obtain a feature map with the same number of channels as P(m+1), where m = 2, 3, 4;
(3) performing a feature connection operation on the feature maps obtained in steps (1) and (2) to obtain Pm, where m = 2, 3, 4. The feature connection operation replaces the addition operation of the traditional FPN structure, which can avoid the aliasing effect caused by the addition operation.
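The three fusion steps can be sketched in pure Python as follows. Several choices are assumptions not fixed by the text: nearest-neighbour interpolation for the 2× up-sampling, reading the "feature connection operation" as concatenation along the channel axis, and the toy sizes and weights. All function names are hypothetical.

```python
def upsample2x(feature_map):
    """Nearest-neighbour 2x up-sampling of an H x W x C feature map."""
    out = []
    for row in feature_map:
        wide = [list(pixel) for pixel in row for _ in range(2)]
        out.append(wide)
        out.append([list(pixel) for pixel in wide])
    return out


def conv1x1(feature_map, kernels):
    """1x1 convolution: each kernel is a list of C weights."""
    return [
        [[sum(w * x for w, x in zip(k, pixel)) for k in kernels] for pixel in row]
        for row in feature_map
    ]


def concat_channels(a, b):
    """Feature connection, read here as channel-wise concatenation."""
    return [[pa + pb for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]


def fuse(r_maps, kernels):
    """r_maps: [R2, R3, R4, R5]; kernels: 1x1 kernel sets for R2, R3, R4.

    Returns [P2, P3, P4, P5].
    """
    p_maps = [r_maps[-1]]  # P5 is R5
    for r, k in zip(reversed(r_maps[:-1]), reversed(kernels)):
        top = upsample2x(p_maps[0])   # step (1): up-sample P(m+1)
        lateral = conv1x1(r, k)       # step (2): match channels of P(m+1)
        p_maps.insert(0, concat_channels(lateral, top))  # step (3)
    return p_maps


def make_map(h, w, c, value=1.0):
    return [[[value] * c for _ in range(w)] for _ in range(h)]


r_maps = [make_map(8, 8, 3), make_map(4, 4, 3), make_map(2, 2, 3), make_map(1, 1, 2)]
kernels = [
    [[0.5] * 3 for _ in range(8)],  # R2 -> 8 channels, matching P3
    [[0.5] * 3 for _ in range(4)],  # R3 -> 4 channels, matching P4
    [[0.5] * 3 for _ in range(2)],  # R4 -> 2 channels, matching P5
]
p_maps = fuse(r_maps, kernels)
shapes = [(len(p), len(p[0]), len(p[0][0])) for p in p_maps]
print(shapes)
```

Under the concatenation reading, each Pm has twice as many channels as P(m+1), which is why the hypothetical kernel sets above grow from 2 to 8 kernels. The text does not say whether a further convolution reduces the channel count afterwards, so none is included.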
In the method provided by this embodiment, the image to be detected is processed using an image processing model obtained by combining the FPN model and the spatial weight model. The spatial weight model identifies the position features and non-position features in the first feature map and performs convolution operations on the first feature map. In the convolution operations, the weight of the convolution kernel corresponding to the position features is greater than the weight of the convolution kernel corresponding to the non-position features, so the position features in the second feature map output by the spatial weight model are enhanced, which can improve the accuracy of detecting the image to be detected.
Fig. 3 is a schematic diagram of the specific processing of the first feature map Cn by the spatial weight model, where a feature value of the first feature map Cn is denoted Cn(p,q). As shown in Fig. 3, a small cube in the first feature map Cn indicates that the feature values of all channels lying on the same straight line of the first feature map Cn form a numerical sequence. On different channels along the same straight line, the value of Cn(p,q) may differ.
The spatial weight model processes the first feature map Cn as follows:
(1) Performing a first convolution operation on the first feature map using the first convolution layer to obtain feature map 1.
In Fig. 3, the small cubes in feature map 1 represent the numerical sequences obtained from the numerical sequences of the small cubes in the first feature map Cn after the convolution operation of the first convolution layer.
In the first convolution operation, the size of each convolution kernel in the first convolution layer is 1 × 1 × M, where M is the number of channels of the first feature map Cn. The total number of convolution kernels in the first convolution layer is 512 (this is only an example; the total number of convolution kernels may take other values).
In the first convolution operation, the weight of the convolution kernel corresponding to the position features is greater than the weight of the convolution kernel corresponding to the non-position features. The spatial weight model can determine the information of the position features and non-position features included in the first feature map Cn through loss function calculation and negative feedback calculation during the convolution operation.
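Written out for a single position (p, q), the first convolution can be expressed as below. The symbols W_1 (kernel weights of the first convolution layer), F_1 (feature map 1) and the channel index c are illustrative and not taken from the original publication, and no bias term is included because the text mentions none.

```latex
F_1^{(p,q)}[k] \;=\; \sum_{c=1}^{M} W_1[k,c]\; C_n^{(p,q)}[c],
\qquad k = 1, \dots, 512
```

Each of the 512 kernels reduces the M-element numerical sequence at one position to a single value, so the small cube at that position in feature map 1 holds 512 values.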
(2) Performing an activation calculation on feature map 1 using the ReLU activation function to obtain feature map 2. The calculation has the form:
$$F_2^{(p,q)} = \mathrm{ReLU}\left(W_1 \, C_n^{(p,q)}\right) = \max\left(0,\; W_1 \, C_n^{(p,q)}\right)$$
where $F_2^{(p,q)}$ represents the feature values in feature map 2 and $W_1$ represents the weights of the convolution kernels in the first convolution layer.
(3) Performing the second convolution calculation on feature map 2 to obtain feature map 3. In Fig. 3, the small cubes in feature map 3 represent the numerical sequences obtained from the numerical sequences of the small cubes in feature map 2 after the convolution operation of the second convolution layer.
Since the number of channels of the feature map obtained by the first convolution operation is 512, the size of each convolution kernel of the second convolution layer is 1 × 1 × 512, and the total number of convolution kernels is 512 (this is only an example; the total number of convolution kernels may take other values). Similarly, in the second convolution layer, the weight of the convolution kernel used for calculating the position features of the feature map is greater than the weight of the convolution kernel used for calculating the non-position features.
(4) And normalizing the characteristic diagram 3 to obtain a characteristic diagram 4. Fig. 4 is a second characteristic diagram of the above embodiment.
(5) And multiplying the feature map 4 by the first feature map Cn to obtain a feature map 5. Fig. 5 is a third characteristic diagram of the above embodiment. In this embodiment, optionally, the spatial weight model may be pre-constructed into a model architecture that obtains the second feature map and performs multiplication operation on the second feature map and the first feature map.
It should be noted that, in this embodiment, performing convolution operation twice by the spatial weight model is only an example, and optionally, the spatial weight model may also be configured to perform convolution operation more times, and the weight of the convolution kernel in each convolution layer for calculating the position feature of the feature map is greater than the weight of the convolution kernel for calculating the non-position feature.
In the method provided by this embodiment, in the convolution operation of the spatial weight model, the weight of the convolution kernel corresponding to the position feature is greater than the weight of the convolution kernel corresponding to the non-position feature, so that the position feature in the output second feature map is enhanced, and the accuracy of detecting the image to be detected can be improved.
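Steps (1) to (5) of the spatial weight model can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the implementation of the application: the function names, the random weights, and the sizes H, W and M are hypothetical; the normalization in step (4) is assumed to be a sigmoid, since the text does not name one; and the second layer is given M kernels (rather than the example value 512) so that the element-wise multiplication in step (5) has matching shapes. The learned property that position-feature kernels carry larger weights comes from training and is not modeled here.

```python
import numpy as np

def conv1x1(x, w):
    """1x1 convolution: x has shape (H, W, C_in), w has shape (C_in, C_out)."""
    return np.einsum("hwc,cd->hwd", x, w)

def relu(x):
    """ReLU activation: element-wise max(x, 0)."""
    return np.maximum(x, 0.0)

def sigmoid(x):
    """Assumed normalization for step (4); maps values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def spatial_weight(cn, w1, w2):
    fm1 = conv1x1(cn, w1)   # step (1): first convolution -> feature map 1
    fm2 = relu(fm1)         # step (2): ReLU activation  -> feature map 2
    fm3 = conv1x1(fm2, w2)  # step (3): second convolution -> feature map 3
    fm4 = sigmoid(fm3)      # step (4): normalization -> feature map 4 (second feature map)
    fm5 = fm4 * cn          # step (5): multiply by Cn -> feature map 5 (third feature map)
    return fm4, fm5

# Hypothetical sizes and random (untrained) weights, for illustration only.
rng = np.random.default_rng(0)
H, W, M = 8, 8, 16
cn = rng.standard_normal((H, W, M))
w1 = rng.standard_normal((M, 512)) * 0.05   # 512 kernels of size 1 x 1 x M
w2 = rng.standard_normal((512, M)) * 0.05   # M kernels of size 1 x 1 x 512 (assumption)
fm4, fm5 = spatial_weight(cn, w1, w2)
```

With these choices, feature map 4 acts as a per-position, per-channel weight between 0 and 1 that rescales the first feature map Cn.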
Fig. 4 is a schematic structural diagram of an apparatus 400 for processing pictures according to the present application, including:
a receiving unit 401, configured to receive an image to be detected;
an obtaining unit 402, configured to obtain a multi-scale feature map of the image to be detected as a first feature map;
an identifying unit 403, configured to identify, for each first feature map, position features and non-position features in the first feature map;
a first operation unit 404, configured to perform a convolution operation on the first feature map to obtain a second feature map, where in the convolution operation the weight of the convolution kernel corresponding to the position feature is greater than the weight of the convolution kernel corresponding to the non-position feature;
a second operation unit 405, configured to multiply the second feature map by the first feature map to obtain a third feature map; and
a generating unit 406, configured to generate, according to the third feature map, a fourth feature map for detecting the image to be detected.
Optionally, the first operation unit 404 performs the convolution operation on the first feature map to obtain the second feature map as follows: the convolution operation is performed on the first feature map multiple times to obtain the second feature map, where in each convolution operation the weight of the convolution kernel corresponding to the position feature is greater than the weight of the convolution kernel corresponding to the non-position feature.
Optionally, in the convolution operation performed on the first feature map by the first operation unit 404, the size of the convolution kernel is 1 × 1 × M, where M is the number of channels of the first feature map.
Optionally, in any convolution operation that the first operation unit 404 performs after the first convolution operation on the first feature map, the size of the convolution kernel is 1 × 1 × N, where N is the number of convolution kernels in the previous convolution operation.
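The way the kernel sizes chain from one convolution operation to the next can be illustrated with a short sketch. The helper name `kernel_shapes` and the value M = 256 are hypothetical and chosen only for illustration; the kernel counts of 512 follow the example values given in the description.

```python
def kernel_shapes(m, kernel_counts):
    """Return (kernel_size, kernel_count) for each successive 1x1 convolution.

    m is the channel count of the first feature map; kernel_counts lists the
    number of kernels in each convolution operation, in order.
    """
    shapes = []
    in_channels = m
    for count in kernel_counts:
        # Kernel depth equals the channel count of the incoming feature map.
        shapes.append(((1, 1, in_channels), count))
        # The output has one channel per kernel, which sets N for the next layer.
        in_channels = count
    return shapes

# Hypothetical M = 256 with two convolution operations of 512 kernels each.
shapes = kernel_shapes(256, [512, 512])
```

Here the first operation uses kernels of size 1 × 1 × 256 and the second uses kernels of size 1 × 1 × 512, because the first operation produced 512 channels.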
Optionally, the identifying unit 403 identifies the position features and the non-position features in the first feature map, and the first operation unit 404 performs the convolution operation on the first feature map to obtain the second feature map, as follows:
the first feature map is input into a pre-constructed spatial weight model, and the second feature map output by the spatial weight model is obtained. The spatial weight model is used for identifying the position features and non-position features in the first feature map and for performing the convolution operation on the first feature map, where in the convolution operation the weight of the convolution kernel corresponding to the position feature is greater than the weight of the convolution kernel corresponding to the non-position feature.
Optionally, the generating unit 406 generates the fourth feature map for detecting the image to be detected according to the third feature map as follows:
taking the third feature map with the smallest scale as the first fourth feature map;
for any third feature map other than the third feature map with the smallest scale, performing convolution calculation on the third feature map using a convolutional layer of size 1 × 1, and performing a feature connection operation on the third feature map and the upsampled target fourth feature map to obtain the fourth feature map corresponding to the third feature map;
where the target fourth feature map of a third feature map is the fourth feature map corresponding to the third feature map that, among the multi-scale third feature maps, is adjacent in size to and smaller than that third feature map, and a fourth feature map corresponds to a third feature map when the fourth feature map is calculated at least from that third feature map.
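The generation of the fourth feature maps described above can be sketched as follows. This is an illustrative NumPy sketch under several assumptions that the text does not state: adjacent scales differ by a factor of 2, upsampling is nearest-neighbour, the "feature connection operation" is channel-wise concatenation, and the concatenation is applied to the output of the 1 × 1 convolution. All function names, shapes and weights are hypothetical.

```python
import numpy as np

def conv1x1(x, w):
    """1x1 convolution: x has shape (H, W, C_in), w has shape (C_in, C_out)."""
    return np.einsum("hwc,cd->hwd", x, w)

def upsample2x(x):
    """Nearest-neighbour upsampling by a factor of 2 along height and width."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def build_fourth_maps(third_maps, weights):
    """third_maps are ordered from smallest to largest scale.

    weights holds one (C_in, C_out) matrix per third feature map other than
    the smallest one.
    """
    # The smallest-scale third feature map is used directly as the first fourth map.
    fourth = [third_maps[0]]
    for t, w in zip(third_maps[1:], weights):
        lateral = conv1x1(t, w)           # 1x1 convolution on the third feature map
        target = upsample2x(fourth[-1])   # upsampled target fourth feature map
        fourth.append(np.concatenate([lateral, target], axis=-1))
    return fourth

# Hypothetical three-level pyramid with 8 channels per level.
rng = np.random.default_rng(0)
third_maps = [rng.standard_normal((s, s, 8)) for s in (4, 8, 16)]
weights = [rng.standard_normal((8, 8)) for _ in range(2)]
fourth_maps = build_fourth_maps(third_maps, weights)
```

Under the concatenation assumption the channel count grows at each level (8, 16, 24 here); an implementation that needs a fixed channel count would have to add a further projection, which the text does not describe.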
Optionally, the obtaining unit 402 obtains the multi-scale feature map of the image to be detected as follows: bottom-up path down-sampling is performed on the image to be detected using an FPN (feature pyramid network) to obtain feature maps of the image to be detected at different scales.
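To make the idea of feature maps at different scales concrete, the sketch below builds a small pyramid by repeated 2× down-sampling. It is only a stand-in: in an FPN the bottom-up path is the forward pass of a convolutional backbone, whereas this sketch substitutes plain 2 × 2 average pooling so that it stays self-contained. The function names and the input size are hypothetical.

```python
import numpy as np

def downsample2x(x):
    """2x2 average pooling with stride 2 on an (H, W, C) array with even H and W."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def bottom_up_pyramid(image, levels):
    """Return `levels` maps, each half the height and width of the previous one."""
    maps = [image]
    for _ in range(levels - 1):
        maps.append(downsample2x(maps[-1]))
    return maps

# Hypothetical 16 x 16 input with 3 channels and three pyramid levels.
image = np.arange(16 * 16 * 3, dtype=float).reshape(16, 16, 3)
pyramid = bottom_up_pyramid(image, 3)
```

Each map in `pyramid` would play the role of one first feature map Cn that is then processed separately by the spatial weight model.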
The present application further provides an electronic device 500, a schematic structural diagram of which is shown in fig. 5. The electronic device includes a processor 501 and a memory 502. The memory 502 is used for storing an application program, and the processor 501 is used for executing the application program to implement the image processing method of the present application, that is, to perform the following steps:
receiving an image to be detected;
acquiring a multi-scale feature map of the image to be detected as a first feature map;
for each first feature map, identifying position features and non-position features in the first feature map;
performing a convolution operation on the first feature map to obtain a second feature map, where in the convolution operation the weight of the convolution kernel corresponding to the position feature is greater than the weight of the convolution kernel corresponding to the non-position feature;
multiplying the second feature map by the first feature map to obtain a third feature map;
and generating, according to the third feature map, a fourth feature map for detecting the image to be detected.
The present application also provides a computer-readable storage medium having instructions stored therein which, when run on a computer, cause the computer to perform the image processing method of the present application, that is, to perform the following steps:
acquiring a multi-scale feature map of the image to be detected as a first feature map;
for each first feature map, identifying position features and non-position features in the first feature map;
performing a convolution operation on the first feature map to obtain a second feature map, where in the convolution operation the weight of the convolution kernel corresponding to the position feature is greater than the weight of the convolution kernel corresponding to the non-position feature;
multiplying the second feature map by the first feature map to obtain a third feature map;
and generating, according to the third feature map, a fourth feature map for detecting the image to be detected.
The functions described in the method of the embodiments of the present application, if implemented in the form of software functional units and sold or used as independent products, may be stored in a storage medium readable by a computing device. Based on this understanding, the part of the embodiments of the present application that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product that is stored in a storage medium and includes several instructions for causing a computing device (which may be a personal computer, a server, a mobile computing device or a network device) to execute all or part of the steps of the method described in the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.
The embodiments are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar, the embodiments may be referred to one another.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method of image processing, comprising:
receiving an image to be detected;
acquiring a multi-scale feature map of the image to be detected as a first feature map;
for each first feature map, identifying position features and non-position features in the first feature map;
performing a convolution operation on the first feature map to obtain a second feature map; in the convolution operation, the weight of the convolution kernel corresponding to the position feature is greater than the weight of the convolution kernel corresponding to the non-position feature;
multiplying the second feature map by the first feature map to obtain a third feature map;
and generating, according to the third feature map, a fourth feature map for detecting the image to be detected.
2. The method of claim 1, wherein the convolving the first feature map to obtain a second feature map comprises:
performing the convolution operation on the first feature map multiple times to obtain the second feature map.
3. The method according to claim 2, wherein in the convolution operation performed on the first feature map, a convolution kernel size is 1 × 1 × M, where M is the number of channels of the first feature map.
4. The method according to claim 3, wherein in any one of the convolution operations after the first convolution operation performed on the first feature map, a convolution kernel size is 1 × 1 × N, where N is the number of convolution kernels of the previous convolution operation.
5. The method according to any one of claims 1-3, wherein said identifying the location features and non-location features in the first feature map and said convolving the first feature map to obtain a second feature map comprises:
inputting the first feature map into a pre-constructed spatial weight model to obtain the second feature map output by the spatial weight model; the spatial weight model is used for identifying position features and non-position features in the first feature map and performing the convolution operation on the first feature map, wherein in the convolution operation, the weight of the convolution kernel corresponding to the position feature is greater than the weight of the convolution kernel corresponding to the non-position feature.
6. The method according to claim 1, wherein generating a fourth feature map for detecting the image to be detected according to the third feature map comprises:
taking the third feature map with the smallest scale as a first fourth feature map;
performing convolution calculation on any one third feature map except for the third feature map with the minimum scale by using a convolution layer with the size of 1 × 1, and performing feature connection operation on the third feature map and a target fourth feature map subjected to upsampling processing to obtain a fourth feature map corresponding to the third feature map;
the target fourth feature map of the third feature map is a fourth feature map corresponding to a third feature map having a size adjacent to and smaller than that of the third feature map in a multi-scale third feature map, and the fourth feature map is calculated at least based on the third feature map.
7. The method of claim 1, wherein the obtaining of the multi-scale feature map of the image to be detected comprises:
performing bottom-up path down-sampling processing on the image to be detected by using an FPN to obtain feature maps of the image to be detected at different scales.
8. An apparatus for image processing, comprising:
a receiving unit, configured to receive an image to be detected;
an acquisition unit, configured to acquire a multi-scale feature map of the image to be detected as a first feature map;
an identification unit, configured to identify, for each first feature map, position features and non-position features in the first feature map;
a first operation unit, configured to perform a convolution operation on the first feature map to obtain a second feature map; in the convolution operation, the weight of the convolution kernel corresponding to the position feature is greater than the weight of the convolution kernel corresponding to the non-position feature;
a second operation unit, configured to multiply the second feature map by the first feature map to obtain a third feature map;
and a generating unit, configured to generate, according to the third feature map, a fourth feature map for detecting the image to be detected.
9. An electronic device, comprising: a processor and a memory for storing a program; the processor is configured to execute the program to implement the method of image processing according to any one of claims 1 to 7.
10. A computer-readable storage medium having stored therein instructions which, when run on a computer, cause the computer to perform the method of image processing according to any one of claims 1-7.
CN202011091051.1A 2020-10-13 2020-10-13 Image processing method and device, electronic equipment and computer readable storage medium Active CN112232361B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011091051.1A CN112232361B (en) 2020-10-13 2020-10-13 Image processing method and device, electronic equipment and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN112232361A true CN112232361A (en) 2021-01-15
CN112232361B CN112232361B (en) 2021-09-21

Family

ID=74112459

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011091051.1A Active CN112232361B (en) 2020-10-13 2020-10-13 Image processing method and device, electronic equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN112232361B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: 100032 room 8018, 8 / F, building 7, Guangyi street, Xicheng District, Beijing

Patentee after: State Grid Digital Technology Holdings Co.,Ltd.

Patentee after: State Grid E-Commerce Technology Co., Ltd

Address before: 311 guanganmennei street, Xicheng District, Beijing 100053

Patentee before: STATE GRID ELECTRONIC COMMERCE Co.,Ltd.

Patentee before: State Grid E-Commerce Technology Co., Ltd
