CN110782420A - Small target feature representation enhancement method based on deep learning - Google Patents

Small target feature representation enhancement method based on deep learning

Info

Publication number
CN110782420A
CN110782420A
Authority
CN
China
Prior art keywords
feature map
feature
size
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910886472.4A
Other languages
Chinese (zh)
Inventor
姜明
何利飞
张旻
李鹏飞
汤景凡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Electronic Science and Technology University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Electronic Science and Technology University filed Critical Hangzhou Electronic Science and Technology University
Priority to CN201910886472.4A
Publication of CN110782420A
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20212 Image combination
    • G06T2207/20221 Image fusion; Image merging

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a small target feature representation enhancement method based on deep learning. The method comprises the following steps: step 1, pre-train the neural network model Faster R-CNN on a very large-scale database containing more than 14 million images covering more than 20,000 categories; step 2, read the input image data; step 3, generate feature maps with a convolutional neural network and build a feature map spatial pyramid; step 4, obtain feature map weights from an attention mechanism module; step 5, fuse feature maps from different levels according to the obtained weights; step 6, detect and localize on the feature maps; and step 7, repeat steps 3 to 6 for the specified task and continue training the neural network model until the network reaches an optimum. The method strengthens the influence of salient features and effectively combines deep semantic features with shallow high-resolution convolutional features, thereby improving overall target detection accuracy.

Description

Small target feature representation enhancement method based on deep learning
Technical Field
The invention relates to a target detection method, in particular to a small target feature representation enhancement method based on deep learning, and belongs to the technical field of computer vision image processing.
Background
Object detection, one of the fundamental problems of computer vision, underlies many other computer vision tasks such as instance segmentation, image captioning, and object tracking. From an application standpoint, object detection divides into two research topics, "generic object detection" and "specific object detection". The former explores methods for detecting different types of objects under a unified framework to simulate human vision and cognition; the latter targets detection in specific application scenarios, such as pedestrian detection, face detection, and text detection. In recent years, the rapid development of deep learning has injected new vitality into object detection and driven remarkable breakthroughs, making it an unprecedented research hotspot. Object detection is now widely applied in autonomous driving, robot vision, video surveillance, and other fields.
Disclosure of Invention
The invention provides a small target feature representation enhancement method based on deep learning, mainly aimed at resolving the conflict between detail information and abstract semantic information in single-layer convolutional features.
To achieve the above technical objectives, the present invention adopts the following technical solutions:
A small target feature representation enhancement method based on deep learning is realized through the following steps:
Step (1): pre-train the neural network model Faster R-CNN on a very large-scale database containing more than 14 million images covering more than 20,000 categories;
Step (2): read the input image data;
Step (3): generate feature maps with a convolutional neural network and build a feature map spatial pyramid;
Step (4): obtain feature map weights from the attention mechanism module;
Step (5): fuse feature maps from different levels according to the obtained weights;
Step (6): detect and localize on the feature maps;
Step (7): repeat steps (3) to (6) for a specific task and continue training the neural network model until the network reaches an optimum.
Further, the deep convolutional neural network framework adopted in step (1) is Faster R-CNN, in which:
Faster R-CNN supports input pictures of arbitrary size; pictures are rescaled to a normalized size before entering the network, for example with the short edge no larger than 600 and the long edge no larger than 1000, so we may take M × N = 1000 × 600 (pictures smaller than this size are zero-padded at the edges, i.e., the picture gets black borders).
Further, the Faster R-CNN network framework includes:
13 convolution (conv) layers: kernel_size = 3, pad = 1, stride = 1;
using the convolution formula:
output_size = (input_size + 2 × pad − kernel_size) / stride + 1
where kernel_size indicates that a 3 × 3 convolution kernel is used, pad indicates that the edges are padded by 1 pixel, and stride indicates that the kernel moves 1 pixel at a time. The formula shows that the conv layers do not change the picture size, that is, the output picture has the same size as the input picture;
Further, the Faster R-CNN also includes 13 activation (ReLU) layers: the activation function does not change the picture size;
Further, Faster R-CNN also comprises 4 pooling layers: kernel_size = 2, stride = 2; each pooling layer halves the width and height of its input;
Further, after feature extraction the picture size becomes (M/16) × (N/16), that is, 60 × 40 (1000/16 ≈ 60, 600/16 ≈ 40); the feature map is 60 × 40 × 512, meaning its spatial size is 60 × 40 with 512 channels;
Further, step (3), building a feature map spatial pyramid from the feature maps generated by the convolutional neural network, is implemented as follows:
The activation output of the last convolutional layer of each stage is used. For the conv2, conv3, conv4 and conv5 outputs, these final outputs are denoted {C2, C3, C4, C5}; they have strides of {4, 8, 16, 32} relative to the input image. conv1 is not incorporated into the pyramid because of its large memory footprint. The lower-resolution feature map is then upsampled by a factor of 2 so that each pair of laterally connected bottom-up and top-down feature maps has the same size. Before fusion, an attention mechanism module automatically learns the weights of the feature maps at different scales, and fusion is then carried out. This process iterates until the final feature maps are generated. The final set of feature maps is called {P2, P3, P4, P5}, corresponding to {C2, C3, C4, C5} respectively.
Further, the step (4) obtains the feature graph weight from the attention mechanism module, and is specifically realized as follows:
the characteristic diagram A is subjected to matrix multiplication with the transposed AT of the characteristic diagram, because the characteristic diagram has channel dimensions, each pixel and every other pixel are equivalently subjected to point multiplication, the point multiplication geometrical meaning of the vectors is to calculate the similarity of two vectors, and the more similar the two vectors are, the larger the point multiplication is. And multiplying the characteristic diagram transpose matrix and the characteristic diagram matrix, and then normalizing by softmax to obtain the attention weight wi. The attention weight wi is multiplied by the transpose of the feature graph through a matrix, the correlation information is redistributed to the original feature graph, and the fusion mode is expressed by a formula as follows:
wi=softmax(matmul(Ai,AiT))
wherein matmul represents the matrix product and the softmax function represents the value that maps the matrix product to (0, 1);
Further, step (5) fuses the feature maps from different levels according to the obtained attention weight w_i; the fusion formula is:
E_i = w_i * A_i + A_i;
where E_i denotes the i-th new feature map, w_i the attention weight of the i-th level, and A_i the i-th original feature map;
Further, the detection and localization on the feature maps in step (6) is implemented as follows: for the features extracted from the fused feature maps, a classifier determines whether each feature belongs to a specific class, and a localizer then further adjusts the positions of the candidate boxes assigned to that class.
The invention has the following advantages:
the method utilizes the deep learning technology to detect the image content, automatically learns the characteristics of the target types by means of non-manual intervention, and has good robustness and self-adaptive capacity for detecting the small targets during identification, classification and positioning.
The method strengthens the influence of salient features and effectively combines deep semantic features with shallow high-resolution convolutional features, thereby improving overall target detection accuracy.
Drawings
FIG. 1 is a flow chart of the overall implementation of the present invention;
FIG. 2 is a diagram of a convolutional neural network architecture used in the present invention;
FIG. 3 is a diagram of a weight assignment method used by the present invention;
FIG. 4 is a picture to be detected;
FIG. 5 is the same picture after detection using the present invention.
Detailed Description
The accompanying drawings disclose, in a non-limiting way, a flow chart of a preferred embodiment of the invention; the technical solution of the present invention is described in detail below with reference to these drawings.
As shown in FIG. 1, FIG. 2 and FIG. 3, a small target feature representation enhancement method based on deep learning is implemented through the following steps:
Step (1): pre-train the neural network model Faster R-CNN on a very large-scale database containing more than 14 million images covering more than 20,000 categories;
Step (2): read the input image data;
Step (3): generate feature maps with a convolutional neural network and build a feature map spatial pyramid;
Step (4): obtain feature map weights from the attention mechanism module;
Step (5): fuse feature maps from different levels according to the obtained weights;
Step (6): detect and localize on the feature maps;
Step (7): repeat steps (3) to (6) for a specific task and continue training the neural network model until the network reaches an optimum.
Further, the deep convolutional neural network framework adopted in step (1) is Faster R-CNN, in which:
Faster R-CNN supports input pictures of arbitrary size; pictures are rescaled to a normalized size before entering the network, for example with the short edge no larger than 600 and the long edge no larger than 1000, so we may take M × N = 1000 × 600 (pictures smaller than this size are zero-padded at the edges, i.e., the picture gets black borders).
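The preprocessing just described can be sketched in a few lines of Python. This is an illustration only: the function name normalize_size and the use of OpenCV are assumptions, not part of the patent.

    import cv2  # assumed available; any image library with a resize routine works
    import numpy as np

    def normalize_size(img, short_max=600, long_max=1000):
        # Scale so the short edge is at most short_max and the long edge
        # is at most long_max, preserving the aspect ratio.
        h, w = img.shape[:2]
        scale = min(short_max / min(h, w), long_max / max(h, w))
        nh, nw = int(round(h * scale)), int(round(w * scale))
        resized = cv2.resize(img, (nw, nh))  # cv2.resize takes (width, height)
        # Zero-pad up to the fixed M x N canvas, producing black borders.
        out_h, out_w = (short_max, long_max) if w >= h else (long_max, short_max)
        canvas = np.zeros((out_h, out_w, img.shape[2]), dtype=img.dtype)
        canvas[:nh, :nw] = resized
        return canvas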
Further, the Faster R-CNN network framework includes:
13 convolution (conv) layers: kernel_size = 3, pad = 1, stride = 1;
using the convolution formula:
output_size = (input_size + 2 × pad − kernel_size) / stride + 1
where kernel_size indicates that a 3 × 3 convolution kernel is used, pad indicates that the edges are padded by 1 pixel, and stride indicates that the kernel moves 1 pixel at a time. The formula shows that the conv layers do not change the picture size, that is, the output picture has the same size as the input picture;
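As a quick check of the formula, a trivial helper (illustrative, not from the patent) evaluates it for the layer settings used here:

    def conv_output_size(input_size, kernel_size=3, pad=1, stride=1):
        # Output-size formula for a convolution or pooling layer.
        return (input_size + 2 * pad - kernel_size) // stride + 1

    # conv layers (3x3 kernel, pad 1, stride 1) preserve the size:
    assert conv_output_size(600) == 600
    # pooling layers (2x2 kernel, pad 0, stride 2) halve it:
    assert conv_output_size(600, kernel_size=2, pad=0, stride=2) == 300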
Further, the Faster R-CNN also includes 13 activation (ReLU) layers: the activation function does not change the picture size;
Further, Faster R-CNN also comprises 4 pooling layers: kernel_size = 2, stride = 2; each pooling layer halves the width and height of its input;
Further, after feature extraction the picture size becomes (M/16) × (N/16), that is, 60 × 40 (1000/16 ≈ 60, 600/16 ≈ 40); the feature map is 60 × 40 × 512, meaning its spatial size is 60 × 40 with 512 channels;
Further, step (3), building a feature map spatial pyramid from the feature maps generated by the convolutional neural network, is implemented as follows:
The activation output of the last convolutional layer of each stage is used. For the conv2, conv3, conv4 and conv5 outputs, these final outputs are denoted {C2, C3, C4, C5}; they have strides of {4, 8, 16, 32} relative to the input image. conv1 is not incorporated into the pyramid because of its large memory footprint. The lower-resolution feature map is then upsampled by a factor of 2 so that each pair of laterally connected bottom-up and top-down feature maps has the same size. Before fusion, an attention mechanism module automatically learns the weights of the feature maps at different scales, and fusion is then carried out. This process iterates until the final feature maps are generated. The final set of feature maps is called {P2, P3, P4, P5}, corresponding to {C2, C3, C4, C5} respectively.
Further, the step (4) obtains the feature graph weight from the attention mechanism module, and is specifically realized as follows:
the characteristic diagram A is subjected to matrix multiplication with the transposed AT of the characteristic diagram, because the characteristic diagram has channel dimensions, each pixel and every other pixel are equivalently subjected to point multiplication, the point multiplication geometrical meaning of the vectors is to calculate the similarity of two vectors, and the more similar the two vectors are, the larger the point multiplication is. And multiplying the characteristic diagram transpose matrix and the characteristic diagram matrix, and then normalizing by softmax to obtain the attention weight wi. The attention weight wi is multiplied by the transpose of the feature graph through a matrix, the correlation information is redistributed to the original feature graph, and the fusion mode is expressed by a formula as follows:
wi=softmax(matmul(Ai,AiT))
wherein matmul represents the matrix product and the softmax function represents the value that maps the matrix product to (0, 1);
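A minimal PyTorch sketch of this weighting, reading each A_i as the C × (H·W) matrix of a feature map (this layout, and folding in the residual of step (5) below, are assumptions):

    import torch
    import torch.nn.functional as F

    def attention_reweight(a):
        # a: (N, C, H, W); each spatial position is a C-dimensional vector.
        n, c, h, w = a.shape
        flat = a.view(n, c, h * w)                    # A_i as C x (H*W)
        sim = torch.bmm(flat.transpose(1, 2), flat)   # pixel-pixel dot products
        w_att = F.softmax(sim, dim=-1)                # attention weight w_i
        out = torch.bmm(flat, w_att.transpose(1, 2))  # redistribute correlations
        return (out + flat).view(n, c, h, w)          # E_i = w_i*A_i + A_i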
Further, step (5) fuses the feature maps from different levels according to the obtained attention weight w_i; the fusion formula is:
E_i = w_i * A_i + A_i;
where E_i denotes the i-th new feature map, w_i the attention weight of the i-th level, and A_i the i-th original feature map;
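Combining this with the pyramid sketch above, one plausible fusion of a lateral map and its upsampled top-down counterpart (the patent only states the per-map formula, so the summation is an assumption) is:

    def attention_fuse(lateral, top_down):
        # Reweight each map with its own attention, then sum the results.
        return attention_reweight(lateral) + attention_reweight(top_down)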
Further, the detection and localization on the feature maps in step (6) is implemented as follows: for the features extracted from the fused feature maps, a classifier determines whether each feature belongs to a specific class, and a localizer then further adjusts the positions of the candidate boxes assigned to that class.
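A sketch of such a detection head in PyTorch (all layer sizes are illustrative assumptions; num_classes = 21 corresponds to a PASCAL-VOC-style setting):

    import torch.nn as nn

    class DetectionHead(nn.Module):
        def __init__(self, feat_dim=256 * 7 * 7, num_classes=21, hidden=1024):
            super().__init__()
            self.trunk = nn.Sequential(
                nn.Flatten(),
                nn.Linear(feat_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU())
            self.cls_score = nn.Linear(hidden, num_classes)      # class scores
            self.bbox_pred = nn.Linear(hidden, num_classes * 4)  # box refinements

        def forward(self, roi_feats):  # roi_feats: (R, 256, 7, 7) RoI features
            x = self.trunk(roi_feats)
            return self.cls_score(x), self.bbox_pred(x)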
Comparing FIG. 4 with FIG. 5 shows a clear improvement in detecting the small targets in the picture.

Claims (7)

1. A small target feature representation enhancement method based on deep learning, characterized by comprising the following implementation steps:
step (1): pre-training the neural network model Faster R-CNN on a very large-scale database containing more than 14 million images covering more than 20,000 categories;
step (2): reading the input image data;
step (3): generating feature maps with a convolutional neural network and building a feature map spatial pyramid;
step (4): obtaining feature map weights from the attention mechanism module;
step (5): fusing feature maps from different levels according to the obtained weights;
step (6): detecting and localizing on the feature maps;
step (7): repeating steps (3) to (6) for the specified task and continuing to train the neural network model until the network reaches an optimum.
2. The small target feature representation enhancement method based on deep learning according to claim 1, wherein the deep convolutional neural network framework adopted in step (1) is Faster R-CNN, in which:
Faster R-CNN supports input pictures of arbitrary size; pictures are rescaled to a normalized size before entering the network, with the short edge of the picture no larger than N and the long edge no larger than M; if a picture is smaller than this size, its edges are padded with 0, i.e., the picture has black borders;
the Faster R-CNN network framework includes:
13 convolution (conv) layers: kernel_size = 3, pad = 1, stride = 1;
using the convolution formula:
output_size = (input_size + 2 × pad − kernel_size) / stride + 1
wherein kernel_size indicates that a 3 × 3 convolution kernel is used, pad indicates that the edges are padded by 1 pixel, and stride indicates that the kernel moves 1 pixel at a time; the formula shows that the conv layers do not change the picture size;
the Faster R-CNN also includes 13 activation (ReLU) layers;
Faster R-CNN also comprises 4 pooling layers: kernel_size = 2, stride = 2; each pooling layer halves the width and height of its input.
3. The small target feature representation enhancement method based on deep learning according to claim 2, wherein after step (2) reads the input image data and feature extraction is performed, the picture size becomes (M/16) × (N/16) and the feature map is (M/16) × (N/16) × 512, meaning the feature map has a spatial size of (M/16) × (N/16) and 512 channels.
4. The small target feature representation enhancement method based on deep learning according to claim 3, wherein step (3) builds a feature map spatial pyramid from the feature maps generated by the convolutional neural network, implemented as follows:
the activation output of the last convolutional layer of each stage is used; for the conv2, conv3, conv4 and conv5 outputs, the final outputs are denoted {C2, C3, C4, C5}, with strides of {4, 8, 16, 32} relative to the input image; the lower-resolution feature map is then upsampled by a factor of 2 so that each pair of laterally connected bottom-up and top-down feature maps has the same size; before fusion, an attention mechanism module automatically learns the weights of the feature maps at different scales, and fusion is then carried out until the final feature maps are generated; the final set of feature maps is called {P2, P3, P4, P5}, corresponding to {C2, C3, C4, C5} respectively.
5. The small target feature representation enhancement method based on deep learning according to claim 4, wherein step (4) obtains the feature map weights from the attention mechanism module, implemented as follows:
matrix multiplication is performed between the feature map A and its transpose A^T; because the feature map has a channel dimension, this is equivalent to taking the dot product of each pixel with every other pixel; the product of the feature map matrix and its transpose is normalized with softmax to obtain the attention weight w_i; the attention weight w_i is then matrix-multiplied with the feature map transpose, redistributing the correlation information onto the original feature map, expressed by the formula:
w_i = softmax(matmul(A_i, A_i^T))
where matmul denotes the matrix product and the softmax function maps the matrix product values into (0, 1).
6. The small target feature representation enhancement method based on deep learning according to claim 5, wherein step (5) fuses feature maps from different levels according to the obtained attention weight w_i, with the fusion formula:
E_i = w_i * A_i + A_i;
where E_i denotes the i-th new feature map, w_i the attention weight of the i-th level, and A_i the i-th original feature map.
7. The small target feature representation enhancement method based on deep learning according to claim 6, wherein the detection and localization on the feature maps in step (6) is implemented as follows: for the features extracted from the fused feature maps, a classifier determines whether each feature belongs to a specific class, and a localizer then further adjusts the positions of the candidate boxes assigned to that class.
CN201910886472.4A 2019-09-19 2019-09-19 Small target feature representation enhancement method based on deep learning Pending CN110782420A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910886472.4A CN110782420A (en) 2019-09-19 2019-09-19 Small target feature representation enhancement method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910886472.4A CN110782420A (en) 2019-09-19 2019-09-19 Small target feature representation enhancement method based on deep learning

Publications (1)

Publication Number Publication Date
CN110782420A (en) 2020-02-11

Family

ID=69383587

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910886472.4A Pending CN110782420A (en) 2019-09-19 2019-09-19 Small target feature representation enhancement method based on deep learning

Country Status (1)

Country Link
CN (1) CN110782420A (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109242032A (en) * 2018-09-21 2019-01-18 桂林电子科技大学 A kind of object detection method based on deep learning
CN110163836A (en) * 2018-11-14 2019-08-23 宁波大学 Based on deep learning for the excavator detection method under the inspection of high-altitude
CN109658387A (en) * 2018-11-27 2019-04-19 北京交通大学 The detection method of the pantograph carbon slide defect of power train
CN109816037A (en) * 2019-01-31 2019-05-28 北京字节跳动网络技术有限公司 The method and apparatus for extracting the characteristic pattern of image
CN109858451A (en) * 2019-02-14 2019-06-07 清华大学深圳研究生院 A kind of non-cooperation hand detection method
CN109948658A (en) * 2019-02-25 2019-06-28 浙江工业大学 The confrontation attack defense method of Feature Oriented figure attention mechanism and application
CN109902399A (en) * 2019-03-01 2019-06-18 哈尔滨理工大学 Rolling bearing fault recognition methods under a kind of variable working condition based on ATT-CNN
CN110084210A (en) * 2019-04-30 2019-08-02 电子科技大学 The multiple dimensioned Ship Detection of SAR image based on attention pyramid network
CN110245665A (en) * 2019-05-13 2019-09-17 天津大学 Image, semantic dividing method based on attention mechanism

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
SHAOQING REN et al.: "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks", arXiv *
TSUNG-YI LIN et al.: "Feature Pyramid Networks for Object Detection", arXiv *
DONG Leigang: "Application of feature pyramid networks in image detection", Science and Technology Innovation *
CHEN Fei et al.: "Faster R-CNN road object detection based on multi-scale feature fusion", Journal of China Jiliang University *

Cited By (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111507359A (en) * 2020-03-09 2020-08-07 杭州电子科技大学 Self-adaptive weighting fusion method of image feature pyramid
CN111507183A (en) * 2020-03-11 2020-08-07 杭州电子科技大学 Crowd counting method based on multi-scale density map fusion cavity convolution
CN111539458A (en) * 2020-04-02 2020-08-14 咪咕文化科技有限公司 Feature map processing method and device, electronic equipment and storage medium
CN111539458B (en) * 2020-04-02 2024-02-27 咪咕文化科技有限公司 Feature map processing method and device, electronic equipment and storage medium
CN111563414A (en) * 2020-04-08 2020-08-21 西北工业大学 SAR image ship target detection method based on non-local feature enhancement
CN111563414B (en) * 2020-04-08 2022-03-01 西北工业大学 SAR image ship target detection method based on non-local feature enhancement
CN111723841A (en) * 2020-05-09 2020-09-29 北京捷通华声科技股份有限公司 Text detection method and device, electronic equipment and storage medium
CN111709294A (en) * 2020-05-18 2020-09-25 杭州电子科技大学 Express delivery personnel identity identification method based on multi-feature information
CN111709291A (en) * 2020-05-18 2020-09-25 杭州电子科技大学 Takeaway personnel identity identification method based on fusion information
CN111709294B (en) * 2020-05-18 2023-07-14 杭州电子科技大学 Express delivery personnel identity recognition method based on multi-feature information
CN111709291B (en) * 2020-05-18 2023-05-26 杭州电子科技大学 Takeaway personnel identity recognition method based on fusion information
US11436447B2 (en) 2020-06-29 2022-09-06 Beijing Baidu Netcom Science And Technology Co., Ltd. Target detection
US11521603B2 (en) 2020-06-30 2022-12-06 Beijing Baidu Netcom Science And Technology Co., Ltd. Automatically generating conference minutes
CN112131925B (en) * 2020-07-22 2024-06-07 随锐科技集团股份有限公司 Construction method of multichannel feature space pyramid
CN112131925A (en) * 2020-07-22 2020-12-25 浙江元亨通信技术股份有限公司 Construction method of multi-channel characteristic space pyramid
CN112131935A (en) * 2020-08-13 2020-12-25 浙江大华技术股份有限公司 Motor vehicle carriage manned identification method and device and computer equipment
WO2021208726A1 (en) * 2020-11-23 2021-10-21 平安科技(深圳)有限公司 Target detection method and apparatus based on attention mechanism, and computer device
CN112396115A (en) * 2020-11-23 2021-02-23 平安科技(深圳)有限公司 Target detection method and device based on attention mechanism and computer equipment
CN112396115B (en) * 2020-11-23 2023-12-22 平安科技(深圳)有限公司 Attention mechanism-based target detection method and device and computer equipment
CN113327253B (en) * 2021-05-24 2024-05-24 北京市遥感信息研究所 Weak and small target detection method based on satellite-borne infrared remote sensing image
CN113327253A (en) * 2021-05-24 2021-08-31 北京市遥感信息研究所 Weak and small target detection method based on satellite-borne infrared remote sensing image
CN113591593A (en) * 2021-07-06 2021-11-02 厦门路桥信息股份有限公司 Method, equipment and medium for detecting target under abnormal weather based on causal intervention
CN113591593B (en) * 2021-07-06 2023-08-15 厦门路桥信息股份有限公司 Method, equipment and medium for detecting target in abnormal weather based on causal intervention
CN113570003A (en) * 2021-09-23 2021-10-29 深圳新视智科技术有限公司 Feature fusion defect detection method and device based on attention mechanism
CN113570003B (en) * 2021-09-23 2022-01-07 深圳新视智科技术有限公司 Feature fusion defect detection method and device based on attention mechanism
CN115482395B (en) * 2022-09-30 2024-02-20 北京百度网讯科技有限公司 Model training method, image classification device, electronic equipment and medium
CN115482395A (en) * 2022-09-30 2022-12-16 北京百度网讯科技有限公司 Model training method, image classification method, device, electronic equipment and medium

Similar Documents

Publication Publication Date Title
CN110782420A (en) Small target feature representation enhancement method based on deep learning
CN111210443B (en) Deformable convolution mixing task cascading semantic segmentation method based on embedding balance
Gao et al. Reading scene text with fully convolutional sequence modeling
CN114202672A (en) Small target detection method based on attention mechanism
US20180114071A1 (en) Method for analysing media content
CN108734210B (en) Object detection method based on cross-modal multi-scale feature fusion
JP2017062781A (en) Similarity-based detection of prominent objects using deep cnn pooling layers as features
CN111027576B (en) Cooperative significance detection method based on cooperative significance generation type countermeasure network
CN111353544B (en) Improved Mixed Pooling-YOLOV 3-based target detection method
CN110781744A (en) Small-scale pedestrian detection method based on multi-level feature fusion
CN110781980B (en) Training method of target detection model, target detection method and device
CN112434618B (en) Video target detection method, storage medium and device based on sparse foreground priori
CN113487610B (en) Herpes image recognition method and device, computer equipment and storage medium
CN116758130A (en) Monocular depth prediction method based on multipath feature extraction and multi-scale feature fusion
CN111507359A (en) Self-adaptive weighting fusion method of image feature pyramid
CN112037239B (en) Text guidance image segmentation method based on multi-level explicit relation selection
Fan et al. A novel sonar target detection and classification algorithm
CN112149526A (en) Lane line detection method and system based on long-distance information fusion
Zhao et al. BiTNet: a lightweight object detection network for real-time classroom behavior recognition with transformer and bi-directional pyramid network
US20230072445A1 (en) Self-supervised video representation learning by exploring spatiotemporal continuity
CN109284752A (en) A kind of rapid detection method of vehicle
Cai et al. Vehicle detection based on visual saliency and deep sparse convolution hierarchical model
TWI809957B (en) Object detection method and electronic apparatus
CN113688864B (en) Human-object interaction relation classification method based on split attention
CN114387489A (en) Power equipment identification method and device and terminal equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200211
