CN111767799A - Pedestrian target detection algorithm for tunnel environments based on an improved Faster R-CNN - Google Patents

Publication number
CN111767799A
Authority
CN
China
Prior art keywords
pedestrian
convolution
processing
tunnel environment
cnn
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010484802.XA
Other languages
Chinese (zh)
Inventor
赵敏
唐毅
王卫平
孙棣华
王世森
陈星州
李莹英
杨国峰
何雪宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Expressway Group Co ltd
Chongqing University
Original Assignee
Chongqing Expressway Group Co ltd
Chongqing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Expressway Group Co ltd, Chongqing University
Priority to CN202010484802.XA
Publication of CN111767799A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Abstract

The invention discloses a pedestrian target detection algorithm for tunnel environments based on an improved Faster R-CNN, comprising the following steps: establishing a pedestrian target data set for the expressway tunnel environment and randomly dividing it into a training set and a test set; optimizing the anchors of the Faster R-CNN network with an unsupervised learning algorithm applied to that training set, to obtain the anchor settings; building a dilated (atrous) convolution pyramid structure; designing an attention mechanism that processes the feature information and strengthens the expressive power of the features; and building a pedestrian detection framework for the expressway tunnel environment. The method improves pedestrian feature extraction when the image is dark, the target is relatively small, or vehicle headlights interfere, and thereby raises the pedestrian detection rate in tunnel environments.

Description

Pedestrian target detection algorithm for tunnel environments based on an improved Faster R-CNN
Technical Field
The invention relates to the technical field of pedestrian detection in computer vision, and in particular to a pedestrian target detection algorithm for tunnel environments based on an improved Faster R-CNN.
Background
Laws and regulations prohibit pedestrians from entering expressways, yet some people still cross them on the assumption that nothing will happen, which creates a serious hazard for safe expressway operation. Automatic detection of pedestrian targets is therefore of real practical value. Analysis of surveillance video from expressway tunnels shows that the picture is generally dim, pedestrian targets are relatively small, and vehicle headlights interfere with the image. These factors make pedestrian features hard to extract, so pedestrian detection performs poorly in tunnel environments. A technique for detecting pedestrian targets in tunnels therefore has both theoretical and practical significance.
A review of existing patents and papers shows that the pedestrian detection algorithms with good detection performance mostly use deep learning. In the multi-scale pedestrian target detection neural network based on feature fusion (CN110490174A), filed by the University of Electronic Science and Technology, feature information from different layers of the network is fused so that the information in every feature layer is fully used; the network extracts target features more effectively and the pedestrian detection rate improves. The small-scale dense pedestrian detection method (CN110414464A), filed by Beijing Deep-waken Technology Co., Ltd., starts from the observation that the lower layers of a deep network carry more position information and the higher layers more semantic information; it fuses lower-layer information into the higher-layer semantic information to detect pedestrians while also extracting a mask, which effectively raises the detection rate for small pedestrian targets. The pedestrian detection method based on inter-frame information of a video sequence (CN110348329A), filed by the University of Electronic Science and Technology, uses Faster R-CNN as the base framework, adds the detection result of the previous video frame to the detection boxes of the current frame, and then processes the candidate boxes with soft non-maximum suppression (Soft-NMS); a pedestrian feature extraction module is also built in which high-level semantic information is upsampled and then fused with low-level information, so that high-level semantic information and low-level position information are fully combined, the feature extraction capability of the network is strengthened, and pedestrian detection accuracy improves. The pedestrian detection method and system based on a two-stage attention mechanism (CN110135243A), filed by Shanghai Jiao Tong University, adds two different attention mechanisms to the RPN of the Faster R-CNN network, used to weight object localization and feature information respectively, which effectively improves both the localization accuracy and the detection accuracy for pedestrian targets.
To address the dim video image, the small pedestrian targets, and the interference from vehicle headlights and similar factors in tunnel environments, the invention improves on the Faster R-CNN algorithm, which offers high detection accuracy. First, because the data sets differ, the anchors of Faster R-CNN are redesigned so that they suit the extraction of pedestrian candidate boxes in tunnels. Next, because pedestrian targets are small in the image, the invention designs a dilated (atrous) convolution pyramid module that uses dilated convolutions to gather feature information from different sub-regions of the feature map and strengthens the connections among the network's feature information. Finally, because adding the dilated convolution pyramid to the Faster R-CNN network disperses the feature information, the invention designs an attention mechanism that processes the feature information and strengthens the expressive power of the features.
In summary, starting from the actual conditions of expressway tunnels, the method first optimizes the anchors of the Faster R-CNN network with the K-means clustering algorithm to suit the sample data set, then designs a dilated convolution pyramid module that gathers and fuses information from different sub-regions of the feature map, strengthening the connections among feature information. To counter the dispersion of feature information, an attention mechanism then enhances it. The result is a pedestrian target detection algorithm suited to tunnel environments. The method effectively improves pedestrian feature extraction when the image is dark, the target is relatively small, or headlights interfere, and raises the pedestrian detection rate in tunnel environments.
Disclosure of Invention
In view of the above, the present invention provides a detection algorithm that improves the pedestrian target detection rate in tunnel environments.
The purpose of the invention is achieved by the following technical solution:
A pedestrian target detection algorithm for tunnel environments based on an improved Faster R-CNN comprises the following steps:
Step one: establishing a pedestrian target data set for the expressway tunnel environment, and randomly dividing it into a training set and a test set;
Step two: optimizing the anchors of the Faster R-CNN network with an unsupervised learning algorithm, based on the training set obtained in step one, to obtain the anchor settings;
Step three: building a dilated convolution pyramid structure;
Step four: designing an attention mechanism that processes the feature information and strengthens the expressive power of the features;
Step five: building a pedestrian detection framework for the expressway tunnel environment, as follows:
1) a dilated convolution pyramid module is added after the Faster R-CNN feature extraction layer;
2) the output of the framework with the dilated convolution pyramid module is processed by convolution, dimensionality reduction, and an activation function;
3) an attention mechanism module is added for further processing.
Further, the specific acquisition process of the pedestrian target data set in the step one is as follows:
1) Video containing pedestrian targets in a tunnel environment is obtained from the expressway monitoring center and converted into still images.
2) A pedestrian data set in PASCAL VOC format is produced with the LabelImg tool, and the data set is randomly divided into a training set and a test set at a ratio of 9:1.
Further, the specific process of the second step is as follows:
1) the training set data from step one is input into K-means for clustering;
2) the clustering results are further processed to obtain the anchor settings.
Further, the specific process of the third step is as follows:
1) The feature map is first convolved, and the convolved feature map is then processed by dilated convolution layers with four different dilation rates.
2) The feature maps output by the dilated convolution layers are convolved for dimensionality reduction and then merged with the feature map from the initial convolution.
Further, the specific process of the step four is as follows:
1) the feature map is compressed along its height and width to obtain one real number per channel with a global receptive field;
2) the feature map is convolved, and the result is multiplied by the per-channel real numbers obtained in the previous step to give the final feature map.
Due to the adoption of the technical scheme, the invention has the following beneficial effects:
The invention optimizes the anchors of the Faster R-CNN network with an unsupervised learning method, so that they suit the extraction of pedestrian candidate boxes in tunnel environments; it designs a dilated convolution pyramid module that uses dilated convolutions to gather feature information from different sub-regions of the feature map, strengthening the connections among the network's feature information; and it designs an attention mechanism that processes the feature information and strengthens the expressive power of the features. Together, these three designs improve pedestrian feature extraction when the image is dark, the target is relatively small, or headlights interfere, and raise the pedestrian detection rate in tunnel environments.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention may be realized and attained by the means of the instrumentalities and combinations particularly pointed out hereinafter.
Drawings
The drawings of the present invention are described below.
FIG. 1 is a schematic diagram of the dilated convolution pyramid structure;
FIG. 2 is a schematic illustration of an attention mechanism;
FIG. 3 is a schematic diagram of a partial pedestrian object detection framework.
Detailed Description
The invention is further illustrated by the following figures and examples.
Example 1
As shown in FIGS. 1-3, the pedestrian target detection algorithm for tunnel environments based on an improved Faster R-CNN provided by this embodiment comprises the following steps:
Step one: establishing a pedestrian target data set for the expressway tunnel environment, and randomly dividing it into a training set and a test set;
Step two: optimizing the anchors of the Faster R-CNN network with an unsupervised learning algorithm, based on the training set obtained in step one, to obtain the anchor settings;
Step three: building a dilated convolution pyramid structure;
Step four: designing an attention mechanism that processes the feature information and strengthens the expressive power of the features;
Step five: building a pedestrian detection framework for the expressway tunnel environment, as follows:
1) a dilated convolution pyramid module is added after the Faster R-CNN feature extraction layer;
2) the output of the framework with the dilated convolution pyramid module is processed by convolution, dimensionality reduction, and an activation function;
3) an attention mechanism module is added for further processing.
In this embodiment, the pedestrian target data set of step one is obtained as follows:
1) Video containing pedestrian targets in a tunnel environment is obtained from the expressway monitoring center, and one frame out of every 30 is saved as a JPG image of size 704 × 576.
2) The JPG images are annotated with the LabelImg tool: each pedestrian target is fully enclosed in a rectangular box and given the label "person", and the annotated image is saved as PASCAL VOC format data. This is repeated until all images are annotated, forming the pedestrian target data set for the expressway tunnel environment. The annotated data set is then randomly divided into a training set and a test set at a ratio of 9:1.
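The sampling and splitting in step one can be sketched as follows. This is only an illustration using the Python standard library: the function names, the file-name pattern, the total frame count, and the fixed random seed are all assumptions and do not come from the patent.

```python
import random

def sample_frame_indices(total_frames, step=30):
    """Indices of the frames kept when one frame out of every `step` is saved."""
    return list(range(0, total_frames, step))

def split_dataset(items, train_ratio=0.9, seed=0):
    """Shuffle a copy of `items` and split it into (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded only to make the sketch repeatable
    cut = int(round(len(shuffled) * train_ratio))
    return shuffled[:cut], shuffled[cut:]

# Hypothetical video of 3000 frames; file names are illustrative only.
frames = sample_frame_indices(total_frames=3000, step=30)
image_names = [f"frame_{i:06d}.jpg" for i in frames]
train_set, test_set = split_dataset(image_names)
print(len(image_names), len(train_set), len(test_set))
```

Shuffling before cutting is what makes the 9:1 split random; without it, the test set would consist only of the last frames of the video.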
In this embodiment, the specific process of step two is as follows:
1) The training set data from step one is input into K-means for clustering. A comparison of the VOC07 data set used by the original Faster R-CNN network with the self-made data set shows that the two differ substantially, so the default anchor settings of the Faster R-CNN network cannot be used directly to detect pedestrian targets in the expressway tunnel environment. The rectangular boxes in the annotated pedestrian data set are first scaled according to the image size used inside the Faster R-CNN algorithm; the aspect-ratio (width-to-height) data of the scaled annotation boxes is then input into K-means as the sample set, yielding clustering results for the height and the aspect ratio of the annotation boxes.
2) The clustering results are then processed. They show pedestrian target boxes with heights of 30-200 and an aspect ratio of 0.42. The box height is increased by a factor of 2 from one anchor to the next, and 4 anchor boxes are set in total. The clustering results thus give the new anchor settings, which use fewer candidate boxes than the original ones, speed up detection to some extent, and are better suited to pedestrian detection in tunnel environments.
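A minimal sketch of step two, assuming plain Lloyd's K-means over (height, aspect ratio) pairs and anchors built from one base height. The sample boxes, the base height of 30, and both function names are hypothetical; the patent gives only the height range, the aspect ratio of 0.42, the factor of 2, and the count of 4.

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on a list of equal-length tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to the nearest center (squared Euclidean distance).
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[j].append(p)
        # Recompute each center as the mean of its cluster; keep it if the cluster is empty.
        new_centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers

def make_anchors(base_height, aspect_ratio, scale=2.0, count=4):
    """Anchor (width, height) pairs whose heights grow by `scale` each step."""
    anchors = []
    for i in range(count):
        h = base_height * scale ** i
        anchors.append((aspect_ratio * h, h))  # aspect ratio = width / height
    return anchors

# Made-up (height, width/height) samples standing in for scaled annotation boxes.
boxes = [(40.0, 0.40), (42.0, 0.44), (150.0, 0.41), (160.0, 0.43)]
centers = sorted(kmeans(boxes, k=2))
anchors = make_anchors(base_height=30.0, aspect_ratio=0.42)
for width, height in anchors:
    print(f"anchor {width:.1f} x {height:.1f}")
```

Clustering on unnormalized pairs lets the height dominate the distance; whether the patent normalizes the inputs, or clusters the two quantities separately, is not stated.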
In this embodiment, the specific process of the third step is as follows:
1) The dilated convolution pyramid structure is described in detail with reference to FIG. 1. The feature map from the feature extraction layer of the Faster R-CNN network is first processed by a convolution layer with a 3 × 3 kernel, followed by a ReLU activation function that further strengthens the nonlinearity of the network. The resulting feature map is then passed through dilated convolution layers with dilation rates of 1, 6, 12, and 18. Dilated convolution extracts features without changing the resolution of the feature map, and convolution layers with different dilation rates gather information from regions of different extent in the feature map, which strengthens the connections among feature information in the network and improves the localization accuracy of the target.
2) To preserve the weight of the original feature map within the dilated convolution pyramid structure, each feature map output by a dilated convolution layer is reduced in dimension by a convolution layer with a 1 × 1 kernel, so that its channel count becomes one quarter of the original, and is then passed through an activation function to further strengthen the nonlinearity of the network. Finally, the resulting feature maps are merged with the feature map produced by the convolution and activation function of the previous step to form a new feature map.
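The two properties this step relies on, unchanged resolution and a field of view that grows with the dilation rate, can be checked with a small single-channel sketch. It is written in plain Python for illustration; the function names are hypothetical, zero padding equal to the dilation rate is an assumption, and the channel arithmetic assumes the merge is a channel-wise concatenation.

```python
def dilated_conv3x3(fmap, kernel, rate):
    """3x3 convolution with dilation `rate` and zero padding of `rate` pixels,
    so the output has the same height and width as the input."""
    h, w = len(fmap), len(fmap[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    yy = y + (ky - 1) * rate  # taps are `rate` pixels apart
                    xx = x + (kx - 1) * rate
                    if 0 <= yy < h and 0 <= xx < w:  # outside the map counts as zero
                        acc += kernel[ky][kx] * fmap[yy][xx]
            out[y][x] = acc
    return out

def effective_kernel(rate):
    """Side length of the region covered by a 3x3 kernel with the given dilation."""
    return 2 * rate + 1

def pyramid_channels(in_channels, rates=(1, 6, 12, 18)):
    """Channels after concatenating the input map with one quarter-width branch per rate."""
    return in_channels + (in_channels // 4) * len(rates)

feature_map = [[1.0] * 8 for _ in range(8)]
ones_kernel = [[1.0] * 3 for _ in range(3)]
for rate in (1, 6, 12, 18):
    out = dilated_conv3x3(feature_map, ones_kernel, rate)
    print(rate, effective_kernel(rate), len(out), len(out[0]))
print(pyramid_channels(512))
```

With four branches at one quarter of the input width each, the concatenation doubles the channel count, which matches the change from 512 to 1024 channels described for step five.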
In this embodiment, the specific process of the step four is as follows:
1) The design of the attention mechanism is described in detail with reference to FIG. 2. Let the input feature map be X. X is first compressed along its width and height to obtain one real number per channel, which to some extent has a global receptive field. The compression is computed by the following formula:
Figure BDA0002518745290000051
where F_sq(·) denotes the compression of the feature map, X is the feature map, W is its width, H is its height, and (m, n) indexes a point (pixel) in the feature map.
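The formula image is not reproduced in this text. Assuming the compression is the global average pooling used in squeeze-and-excitation style attention, which is consistent with the symbols defined here but is not confirmed by the source, it would read:

```latex
F_{sq}(X) = \frac{1}{W \times H} \sum_{m=1}^{W} \sum_{n=1}^{H} X(m, n)
```

Under that assumption, each channel of X is reduced to the mean of its W × H values.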
2) In parallel with the compression, the feature map X is convolved to extract target features further, and a suitable activation function is added after the convolution to improve how well the network fits the feature information. Finally, the convolved feature information is multiplied by the channel information obtained in the previous step to give the final feature map.
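The two branches of step four can be sketched in plain Python on a tiny feature map stored as nested lists indexed [channel][row][column]. Several choices here are assumptions: the compression is taken to be a global average, the convolution branch is a 1 × 1 convolution, and the activation is ReLU. The patent specifies neither the kernel size nor the activation function of this branch.

```python
def squeeze(x):
    """Global average over height and width: one real number per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]

def conv1x1_relu(x, weights):
    """1x1 convolution followed by ReLU; weights[o][i] maps input channel i to output o."""
    height, width = len(x[0]), len(x[0][0])
    out = []
    for w_row in weights:
        channel = [
            [max(0.0, sum(wi * x[i][r][c] for i, wi in enumerate(w_row)))
             for c in range(width)]
            for r in range(height)
        ]
        out.append(channel)
    return out

def attention(x, weights):
    """Scale each convolved channel by the squeezed value of the matching input channel.
    Requires as many output channels as input channels."""
    z = squeeze(x)
    y = conv1x1_relu(x, weights)
    return [[[z[k] * v for v in row] for row in y[k]] for k in range(len(y))]

# Two-channel 2x2 example with identity weights, so the conv branch returns x unchanged.
x = [[[1.0, 2.0], [3.0, 4.0]],
     [[0.0, 0.0], [0.0, 8.0]]]
identity = [[1.0, 0.0], [0.0, 1.0]]
result = attention(x, identity)
print(squeeze(x))
print(result)
```

Because the squeezed value summarizes the whole channel, every position in that channel is rescaled by a quantity that depends on the entire feature map, which is the global receptive field mentioned in the text.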
In this embodiment, the specific process of step five is as follows:
1) The pedestrian detection framework for the expressway tunnel environment is described in detail with reference to FIG. 3. First, the dilated convolution pyramid module is added after the Faster R-CNN feature extraction layer, which strengthens the association among feature information and improves the feature extraction capability of the network.
2) After the dilated convolution pyramid is added to the feature extraction layer of the Faster R-CNN network, the channel count of the feature map grows from 512 to 1024 and has to be brought back to 512. A 1 × 1 convolution layer first extracts the feature information of the target, and a 3 × 3 convolution layer then performs dimensionality reduction so that the channel count becomes 512; a ReLU activation function is added to strengthen the nonlinearity of the network and improve its feature extraction capability.
3) The dimensionality reduction performed by these convolutions can disperse the network's feature information. The attention mechanism designed above is therefore added after the convolution operations to process the feature information further and strengthen the feature extraction capability of the network, which completes the pedestrian target detection algorithm for the expressway tunnel environment.
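The channel counts along step five can be traced with a short bookkeeping sketch. The stage names and the function are illustrative; the assumption that the 1 × 1 convolution keeps 1024 channels is not stated in the patent, which only says that the 3 × 3 convolution brings the count to 512.

```python
def trace_channels(backbone_channels=512, rates=(1, 6, 12, 18)):
    """Return (stage name, channel count) pairs for the modified detection framework."""
    stages = [("feature extraction layer", backbone_channels)]
    # Pyramid output: input map plus one quarter-width branch per dilation rate.
    pyramid_out = backbone_channels + (backbone_channels // 4) * len(rates)
    stages.append(("dilated convolution pyramid", pyramid_out))
    # Assumption: the 1x1 convolution leaves the channel count unchanged.
    stages.append(("1x1 convolution", pyramid_out))
    stages.append(("3x3 convolution + ReLU", backbone_channels))
    # The attention module rescales channels and does not change their number.
    stages.append(("attention module", backbone_channels))
    return stages

stages = trace_channels()
for name, channels in stages:
    print(f"{name}: {channels}")
```

Returning to 512 channels means the layers that follow in Faster R-CNN see a feature map of the same shape as before the modification.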
Example 2
This embodiment differs from Example 1 in that, in step two, the mean shift algorithm is used to process the aspect ratios of the pedestrian annotation boxes to obtain the clustering result.
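A minimal one-dimensional mean shift over aspect ratios might look like the sketch below. The flat (uniform) kernel, the bandwidth value, the mode-merging rule, and the sample ratios are all assumptions made for illustration; the patent names only the algorithm.

```python
def mean_shift_1d(values, bandwidth, iters=100, tol=1e-6):
    """Flat-kernel mean shift on scalars; returns the sorted list of modes found."""
    modes = []
    for v in values:
        x = v
        for _ in range(iters):
            # Move to the mean of all samples within `bandwidth` of the current position.
            window = [u for u in values if abs(u - x) <= bandwidth]
            new_x = sum(window) / len(window)
            if abs(new_x - x) < tol:
                break
            x = new_x
        # Keep the converged point only if no nearby mode was already recorded.
        if all(abs(m - x) > bandwidth / 2 for m in modes):
            modes.append(x)
    return sorted(modes)

# Made-up width/height ratios of annotation boxes.
ratios = [0.40, 0.41, 0.42, 0.43, 0.44, 0.80, 0.82]
modes = mean_shift_1d(ratios, bandwidth=0.05)
print(modes)
```

Unlike K-means, mean shift does not need the number of clusters in advance; the bandwidth decides how many modes are found.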
Finally, the above embodiments only illustrate the technical solutions of the present invention and do not limit it. Although the invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that the technical solutions may be modified or replaced by equivalents without departing from their spirit and scope, and all such modifications and substitutions fall within the scope of protection of the present invention.

Claims (5)

1. A pedestrian target detection algorithm for tunnel environments based on an improved Faster R-CNN, characterized by comprising the following steps:
Step one: establishing a pedestrian target data set for the expressway tunnel environment, and randomly dividing it into a training set and a test set;
Step two: optimizing the anchors of the Faster R-CNN network with an unsupervised learning algorithm, based on the training set obtained in step one, to obtain the anchor settings;
Step three: building a dilated convolution pyramid structure;
Step four: designing an attention mechanism that processes the feature information and strengthens the expressive power of the features;
Step five: building a pedestrian detection framework for the expressway tunnel environment, as follows:
1) a dilated convolution pyramid module is added after the Faster R-CNN feature extraction layer;
2) the output of the framework with the dilated convolution pyramid module is processed by convolution, dimensionality reduction, and an activation function;
3) an attention mechanism module is added for further processing.
2. The pedestrian target detection algorithm of claim 1, wherein the pedestrian target data set in step one is obtained as follows:
1) video containing pedestrian targets in a tunnel environment is obtained from the expressway monitoring center and converted into still images;
2) a pedestrian data set in PASCAL VOC format is produced with the LabelImg tool, and the data set is randomly divided into a training set and a test set at a ratio of 9:1.
3. The pedestrian target detection algorithm of claim 1, wherein the specific process of step two is as follows:
1) the training set data from step one is input into K-means for clustering;
2) the clustering results are further processed to obtain the anchor settings.
4. The pedestrian target detection algorithm of claim 1, wherein the specific process of step three is as follows:
1) the feature map is first convolved, and the convolved feature map is then processed by dilated convolution layers with four different dilation rates;
2) the feature maps output by the dilated convolution layers are convolved for dimensionality reduction and then merged with the feature map from the initial convolution.
5. The pedestrian target detection algorithm of claim 1, wherein the specific process of step four is as follows:
1) the feature map is compressed along its height and width to obtain one real number per channel with a global receptive field;
2) the feature map is convolved, and the result is multiplied by the per-channel real numbers obtained in the previous step to give the final feature map.
CN202010484802.XA 2020-06-01 2020-06-01 Pedestrian target detection algorithm for tunnel environments based on an improved Faster R-CNN Pending CN111767799A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010484802.XA CN111767799A (en) 2020-06-01 2020-06-01 Pedestrian target detection algorithm for tunnel environments based on an improved Faster R-CNN

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010484802.XA CN111767799A (en) 2020-06-01 2020-06-01 Pedestrian target detection algorithm for tunnel environments based on an improved Faster R-CNN

Publications (1)

Publication Number Publication Date
CN111767799A true CN111767799A (en) 2020-10-13

Family

ID=72719853

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010484802.XA Pending CN111767799A (en) 2020-06-01 2020-06-01 Pedestrian target detection algorithm for tunnel environments based on an improved Faster R-CNN

Country Status (1)

Country Link
CN (1) CN111767799A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180268284A1 (en) * 2017-03-15 2018-09-20 Samsung Electronics Co., Ltd. System and method for designing efficient super resolution deep convolutional neural networks by cascade network training, cascade network trimming, and dilated convolutions
CN110378484A (en) * 2019-04-28 2019-10-25 清华大学 An atrous spatial pyramid pooling context learning method based on an attention mechanism
CN110188807A (en) * 2019-05-21 2019-08-30 重庆大学 Tunnel pedestrian target detection method based on a cascaded super-resolution network and improved Faster R-CNN
CN110647817A (en) * 2019-08-27 2020-01-03 江南大学 Real-time face detection method based on MobileNet V3
CN110717527A (en) * 2019-09-24 2020-01-21 东南大学 Method for determining a target detection model incorporating an atrous spatial pyramid structure
CN110852383A (en) * 2019-11-12 2020-02-28 复旦大学 Target detection method and device based on attention mechanism deep learning network

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Hengshuang Zhao et al.: "Pyramid Scene Parsing Network", arXiv, pages 1 - 11 *
Ya-Li Li et al.: "HAR-Net: Joint Learning of Hybrid Attention for Single-stage Object Detection", arXiv, pages 1 - 10 *
Li Zuolong et al.: "Pedestrian detection method based on multi-scale feature fusion and reconstruction", Computer Engineering and Applications, vol. 57, no. 4, pages 176 - 182 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112949635A (en) * 2021-03-12 2021-06-11 北京理工大学 Target detection method based on feature enhancement and IoU perception
CN112949635B (en) * 2021-03-12 2022-09-16 北京理工大学 Target detection method based on feature enhancement and IoU perception

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination