CN115063691B - Feature enhancement-based small target detection method in complex scene


Info

Publication number
CN115063691B
Authority
CN
China
Prior art keywords
feature
network
prediction
targets
scale
Prior art date
Legal status
Active
Application number
CN202210780211.6A
Other languages
Chinese (zh)
Other versions
CN115063691A (en)
Inventor
潘晓英
贾凝心
王昊
丁雅眉
Current Assignee
Xian University of Posts and Telecommunications
Original Assignee
Xian University of Posts and Telecommunications
Priority date
2022-07-04
Filing date
2022-07-04
Publication date
2024-04-12
Application filed by Xian University of Posts and Telecommunications
Priority to CN202210780211.6A
Publication of application CN115063691A
Application granted
Publication of granted patent CN115063691B
Legal status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/13 Scenes; scene-specific elements; terrestrial scenes; satellite images
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; coarse-fine approaches, e.g. multi-scale approaches; using context analysis; selection of dictionaries
    • G06V10/776 Validation; performance evaluation
    • G06V10/806 Fusion of extracted features, i.e. combining data from various sources at the feature extraction level
    • G06V10/82 Arrangements for image or video recognition or understanding using neural networks
    • G06V2201/07 Target detection

Abstract

The invention belongs to the fields of computer vision and object detection, and specifically relates to a feature-enhancement-based method for detecting small targets in complex scenes. The technical scheme of the invention is as follows: first, a Cutout-DA data augmentation method is proposed, which generates new occlusion data and adds it to the VisDrone2021 dataset; next, a multi-scale fused feature-enhancement path aggregation network, MSFE-PANet, is designed, which obtains richer and finer semantic and spatial features through an integrated attention mechanism, feature fusion, and a network prediction-scale strategy tailored to small targets; a prediction-box rejection loss function, RB_Loss, is then designed; finally, the model is trained. The invention strengthens the mutual fusion of the strong positioning information of the deep feature maps and the strong semantic information of the shallow feature maps, helps the network find regions of interest in complex scenes, and improves its sensitivity to small targets. The RB_Loss rejection loss function and the network prediction scale are designed to address overlap, missed detection of occluded small targets, and false detection against complex backgrounds.

Description

Feature enhancement-based small target detection method in complex scene
Technical Field
The invention belongs to the fields of computer vision and object detection, and specifically relates to a feature-enhancement-based method for detecting small targets in complex scenes.
Background
In recent years, the rapid development of deep learning has driven remarkable breakthroughs in computer vision and made it an unprecedented research hotspot. The main task of computer vision is to parse images, including image classification, detection, and segmentation. Object detection, one of the core research directions in the field, uses algorithms to localize targets precisely and identify their classes. Small target detection is both a difficulty of object detection and a problem of great practical value, playing an important role in autonomous driving, intelligent healthcare, defect detection, aerial image analysis, and other fields. Detecting small, distant objects in high-resolution street scenes is a prerequisite for the safe deployment of autonomous vehicles; in medical imaging, the early discovery of masses and tumors only a few pixels in size is crucial for accurate early diagnosis; automated industrial inspection also benefits from small target detection by locating small defects visible on material surfaces. In summary, small target detection has broad application value and important research significance.
Although object detection algorithms have made major breakthroughs, performance on small targets still lags far behind performance on large targets, and research on small targets remains less than ideal. Existing small target detectors do not transfer well to real complex scenes, mainly for the following reasons. 1. Inconspicuous visual features: small targets offer little usable information; at low image resolution a small target may occupy only a few pixels, and accurately detecting it under such weak visual evidence remains a major challenge. 2. Feature extraction: the quality of feature extraction directly affects final detection performance, and features of small targets are much harder to extract than those of large targets; most computer vision architectures use pooling layers, and pooling discards part of the small targets' features, so extracting effective small target features in deep neural networks is an open problem. 3. Background interference: small target detection in complex environments suffers from illumination, complex geographic elements, occlusion, aggregation, and similar factors, making small targets hard to distinguish from the background or from similar objects; effectively suppressing complex background interference is also a current challenge.
Disclosure of Invention
Aiming at the problems in the prior art that small targets cannot be detected accurately, that their features are difficult to extract, and that detection does not hold up in real complex scenes, the invention provides a feature-enhancement-based small target detection method for complex scenes.
In order to achieve the above purpose, the technical scheme of the invention is as follows: a feature-enhancement-based small target detection method in complex scenes, comprising the following steps.
Step 1, data preparation: the dataset is derived from aerial images;
Step 2, data augmentation: images are selected at random from the dataset, and the visible parts of some targets, or entire targets, are then randomly occluded at 0.2, 0.4, 0.6, and 0.8 of the target's size; the newly generated occlusion data are added to the VisDrone2021 dataset;
Step 3, designing the multi-scale fused feature-enhancement path aggregation network MSFE-PANet;
Step 3.1: improving the network prediction scale.
The prediction head YOLO head3 in YOLOv4, which targets large objects, is removed, but the corresponding 13×13 feature map is retained; meanwhile, a prediction head YOLO head0 for detecting small-scale targets, generated from the shallow 104×104 high-resolution feature map, is added to the prediction network, producing a new network prediction-scale structure.
Step 3.2: feature layer fusion.
On the new network prediction-scale structure, the feature map extracted by each feature network layer is upsampled by the corresponding factor and added element-wise to the first-layer feature map to obtain new feature maps;
Step 3.3: an attention module;
Step 4: designing the prediction-box rejection loss function RB_Loss;
Step 5: training the model.
The above step 3.3: attention mechanisms are integrated into PANet.
Step 3.3.1: a CBAM attention module is added. Given an input feature map F, channel attention and spatial attention are applied in sequence, as shown in equation (1):

F′ = M_c(F) ⊗ F,  F″ = M_s(F′) ⊗ F′ (1)

The channel attention is computed as in (2), where σ is the Sigmoid activation function and the MLP weights W₀ and W₁ are shared:

M_c(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F))) = σ(W₁(W₀(F_avg)) + W₁(W₀(F_max))) (2)

The spatial attention is computed as in (3), where σ is the Sigmoid activation function and f^(7×7) denotes a 7×7 convolution filter:

M_s(F) = σ(f^(7×7)([AvgPool(F); MaxPool(F)])) (3)
Step 3.3.2: improving the channel attention module of the CBAM;
step 3.3.3: introducing an SE-attention module;
step 3.3.4: improving the SPP module;
step 3.3.5: the SE-attention module is optimized.
In step 3.3.2 above, the fully connected layers of the channel attention are replaced by 1×1 convolutions, and the calculation formula is defined as (4):

M_c(F) = σ(f^(1×1)(AvgPool(F)) + f^(1×1)(MaxPool(F))) (4)
In step 3.3.3, given an input X with C₁ channels, a series of convolution and pooling operations F_tr yields a feature map U with C₂ channels; F_sq is the feature compression (squeeze) operation, which compresses the features along the spatial dimensions so that each two-dimensional feature channel becomes a single value; this is followed by the excitation operation F_ex, whose output weights are then multiplied onto the previous features.
The compression is computed as in (5), where U_C denotes the C-th channel of the feature map and Z_C is the output of the compression operation:

Z_C = F_sq(U_C) = (1/(H×W)) Σ_{i=1..H} Σ_{j=1..W} U_C(i, j) (5)

In (6), σ is the Sigmoid activation function, W₁ and W₂ are fully connected operations, and δ is the ReLU activation function; in (7), S_C is the C-th weight of S:

S = F_ex(Z, W) = σ(g(Z, W)) = σ(W₂ δ(W₁ Z)) (6)

F_scale(U_C, S_C) = S_C · U_C (7)
The step 3.3.4 specifically comprises: the pooling layers with kernels of size 1, 5, 9, and 13 in the SPP are replaced by a 1×1 convolution and 3×3 hole (dilated) convolutions; the improved SPP module does not change the size of the feature map. The output feature map size follows the convolution relation (8), where W_in is the input size, K the kernel size, P the padding, D the dilation rate, and S the stride:

W_out = (W_in + 2P - K - (K-1)(D-1))/S + 1 (8)
the step 3.3.5 specifically comprises the following steps: an improved SPP module is added in the SE-attention to obtain an SSE-attention module.
The step 4 specifically comprises: the overlap IOU between the prior prediction boxes of two overlapping targets is taken as the loss value; the back-propagation network is optimized along the gradient direction, separating the overlapping prior prediction boxes of the two targets. The loss is defined as (9), where B_i and B_j denote prior prediction boxes matched to different target boxes:

RB_Loss = Σ_{i≠j} IOU(B_i, B_j) (9)
Compared with the prior art, the invention has the beneficial effects that:
1. Compared with the Mosaic and CutMix data augmentation of the baseline network YOLOv4, the Cutout-DA augmentation strategy designed here improves mAP on the VisDrone2021 dataset by 3.57 points, which fully demonstrates the effectiveness of the Cutout-DA strategy for small target detection. In the YOLOv4 prediction network, the output prior prediction boxes must be judged and filtered by NMS, so mutually occluding and overlapping targets disturb target-box matching and cause many missed and false detections. The proposed RB_Loss further reduces the mutual interference between occluded targets through the IOU, improving mAP by 2.8 points.
2. The multi-scale fused feature-enhancement path aggregation network MSFE-PANet obtains richer and finer semantic and spatial features through the small-target-oriented network prediction-scale strategy and multi-scale feature fusion; on top of the Cutout-DA and RB_Loss strategies, it improves mAP by a further 9.47 points, greatly raising small target detection accuracy. Adding the LW-CBAM and SSE-Attention mechanisms further extracts attention regions and helps the network focus on useful small objects, improving mAP by 6.63 points and alleviating missed and false detections of overlapping and occluded small targets against complex backgrounds.
3. The invention detects small targets accurately, extracts their features readily, and adapts to a variety of real complex scenes, with a wide application range and strong adaptability.
Drawings
FIG. 1 is a diagram of a multi-scale converged feature enhanced path aggregation network MSFE-PANet structure in the present invention;
FIG. 2 is a detailed structure diagram of MSFE-PANet in the present invention;
FIG. 3 is a predicted scale improvement architecture in accordance with the present invention;
FIG. 4 is a channel attention structure of a CBAM according to the present invention;
FIG. 5 is a spatial attention structure of a CBAM according to the present invention;
FIG. 6 is a comparison of results from different modules in the present invention;
FIG. 7 illustrates various embedding patterns of the attention module according to the present invention;
FIG. 8 is a detailed result image of MSFE-PANet in an embodiment of the present invention;
FIG. 9 is a visual result image of MSFE-PANet in an embodiment of the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings. The following examples are only for more clearly illustrating the technical aspects of the present invention, and are not intended to limit the scope of the present invention.
The multi-scale fused feature-enhancement path aggregation network MSFE-PANet strengthens the mutual fusion of the strong positioning information of the deep feature maps and the strong semantic information of the shallow feature maps, helps the network find regions of interest in complex scenes, and improves its sensitivity to small targets. The RB_Loss rejection loss function and the network prediction scale are designed to address overlap, missed detection of occluded small targets, and false detection against complex backgrounds.
Referring to FIG. 1, the feature-enhancement-based small target detection method for complex scenes provided by the invention comprises the following steps:
step 1: data preparation. The method comprises the following steps: a large aerial dataset visclone 2021 was used, the image size of which was approximately 2000 x 1500, containing a variety of scenes from country to city, and containing various climate changes, light and shade changes, and shooting angle changes, etc., while including 10 categories of pedestrians, automobiles, bicycles, and tricycles, 6471 images in the dataset were used for training, 548 images were used for verification, and 1610 images were used for testing.
Step 2: data augmentation. Specifically: images are selected at random from the dataset, and the visible parts of some targets, or entire targets, are then randomly occluded at 0.2, 0.4, 0.6, and 0.8 of the target's size. The newly generated occlusion data are added to the dataset, strengthening the model's robustness to occluded targets and improving the accuracy of their detection.
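For illustration, the random-occlusion step can be sketched as follows. This is a minimal rendering of the idea rather than the patented implementation: the [x1, y1, x2, y2] pixel box format, the zero-fill value, and the name cutout_da are assumptions.

```python
import random
import numpy as np

def cutout_da(image, boxes, ratios=(0.2, 0.4, 0.6, 0.8)):
    """Randomly occlude part of each target box (Cutout-DA sketch).
    image: HxWxC uint8 array; boxes: (N, 4) array in [x1, y1, x2, y2]."""
    out = image.copy()
    for x1, y1, x2, y2 in np.asarray(boxes, dtype=int):
        r = random.choice(ratios)                  # occlusion size ratio
        ow = max(1, int((x2 - x1) * r))            # occluder width
        oh = max(1, int((y2 - y1) * r))            # occluder height
        ox = random.randint(x1, max(x1, x2 - ow))  # random position inside box
        oy = random.randint(y1, max(y1, y2 - oh))
        out[oy:oy + oh, ox:ox + ow] = 0            # zero-fill, Cutout-style
    return out
```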
Step 3: algorithm design. Specifically: the multi-scale fused feature-enhancement path aggregation network MSFE-PANet is designed.
Step 3.1: the network prediction scale is improved, see FIG. 3. The prediction head YOLO head3 in YOLOv4, which targets large objects, is removed, but the corresponding 13×13 feature map is retained; meanwhile, a prediction head YOLO head0 for detecting small-scale targets, generated from the shallow 104×104 high-resolution feature map, is added to the prediction network, yielding a new network prediction-scale structure.
Step 3.2: referring to FIG. 3, feature layer fusion is performed. On the new network prediction-scale structure, the feature map extracted by each feature network layer is upsampled by the corresponding factor and added element-wise to the first-layer feature map to obtain new feature maps, making the feature prediction network finer and improving small target detection accuracy;
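A minimal PyTorch sketch of this upsample-and-add fusion follows; it assumes the maps already share one channel count (in the real network a convolution would align channels), and the function name is illustrative.

```python
import torch.nn.functional as F

def fuse_to_shallow(feats):
    """Upsample each deeper feature map to the first (shallowest) map's
    spatial size and add them element-wise (step 3.2 sketch).
    feats: list of (B, C, H, W) tensors, shallowest first, equal C."""
    base = feats[0]                  # e.g. the 104x104 shallow map
    fused = base.clone()
    for f in feats[1:]:              # e.g. the 52x52, 26x26, 13x13 maps
        fused = fused + F.interpolate(f, size=base.shape[-2:], mode="nearest")
    return fused
```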
Step 3.3: the attention module is introduced. Specifically: attention mechanisms are integrated into PANet.
Step 3.3.1: a CBAM attention module is added. CBAM can be integrated into most CNN frameworks and trained end to end. Given an intermediate feature map as input, CBAM sequentially infers attention maps along the two independent dimensions of channel and space, as shown in equation (1):

F′ = M_c(F) ⊗ F,  F″ = M_s(F′) ⊗ F′ (1)

Referring to FIG. 4, the channel attention is computed as in (2), where σ is the Sigmoid activation function and the MLP weights W₀ and W₁ are shared:

M_c(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F))) = σ(W₁(W₀(F_avg)) + W₁(W₀(F_max))) (2)

Referring to FIG. 5, the spatial attention module generates a spatial attention map from the spatial relationships of the features. It is computed as in (3), where σ is the Sigmoid activation function and f^(7×7) denotes a 7×7 convolution filter:

M_s(F) = σ(f^(7×7)([AvgPool(F); MaxPool(F)])) (3)
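Equations (1) to (3) correspond to the standard CBAM computation and can be realized as the following PyTorch sketch; the reduction ratio of 16 is an assumption not stated in the text.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Channel attention of eq. (2): a shared MLP applied to the
    average-pooled and max-pooled descriptors, summed, then Sigmoid."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),  # W0
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),  # W1, shared across branches
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))   # MLP(AvgPool(F))
        mx = self.mlp(x.amax(dim=(2, 3)))    # MLP(MaxPool(F))
        return torch.sigmoid(avg + mx).view(b, c, 1, 1)

class SpatialAttention(nn.Module):
    """Spatial attention of eq. (3): 7x7 convolution over the
    concatenated channel-wise average and max maps."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)
        mx = x.amax(dim=1, keepdim=True)
        return torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

class CBAM(nn.Module):
    """Sequential channel-then-spatial refinement of eq. (1)."""
    def __init__(self, channels):
        super().__init__()
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, x):
        x = x * self.ca(x)     # F' = Mc(F) ⊗ F
        return x * self.sa(x)  # F'' = Ms(F') ⊗ F'
```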
Step 3.3.2: the channel attention module of the CBAM is improved. The invention replaces the fully connected layers in the channel attention module with 1×1 convolutions, yielding a lighter-weight convolutional attention module, LW-CBAM. The calculation formula is defined as (4):

M_c(F) = σ(f^(1×1)(AvgPool(F)) + f^(1×1)(MaxPool(F))) (4)
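A sketch of the LW-CBAM channel attention, with the shared fully connected layers of equation (2) replaced by 1×1 convolutions as equation (4) describes; the reduction ratio and the bias-free choice are assumptions.

```python
import torch
import torch.nn as nn

class LWChannelAttention(nn.Module):
    """Channel attention with 1x1 convolutions in place of the fully
    connected layers (LW-CBAM sketch; reduction ratio assumed)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1, bias=False),
        )

    def forward(self, x):
        avg = self.conv(x.mean(dim=(2, 3), keepdim=True))  # (B, C, 1, 1)
        mx = self.conv(x.amax(dim=(2, 3), keepdim=True))
        return torch.sigmoid(avg + mx)                     # channel weights
```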
Step 3.3.3: an SE-attention module is introduced. Specifically: given an input X with C₁ channels, a series of convolution and pooling operations F_tr yields a feature map U with C₂ channels. F_sq is the feature compression (squeeze) operation: features are compressed along the spatial dimensions, turning each two-dimensional feature channel into a single value with a global receptive field, so that the output dimension matches the number of input feature channels. This is followed by the excitation operation F_ex: based on the correlations between feature channels, a weight is generated for each channel to represent its importance, and these weights are multiplied onto the previous features to complete the recalibration of the important features. The compression is computed as in (5), where U_C denotes the C-th channel of the feature map and Z_C is the output of the compression operation:

Z_C = F_sq(U_C) = (1/(H×W)) Σ_{i=1..H} Σ_{j=1..W} U_C(i, j) (5)

In (6), σ is the Sigmoid activation function, W₁ and W₂ are fully connected operations, and δ is the ReLU activation function; in (7), S_C is the C-th weight of S:

S = F_ex(Z, W) = σ(g(Z, W)) = σ(W₂ δ(W₁ Z)) (6)

F_scale(U_C, S_C) = S_C · U_C (7)
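Equations (5) to (7) correspond to the standard SE computation; a compact PyTorch sketch follows, with the reduction ratio of 16 again an assumption.

```python
import torch.nn as nn

class SEAttention(nn.Module):
    """SE block sketch: squeeze (eq. 5), excite (eq. 6), rescale (eq. 7)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),  # W1
            nn.ReLU(inplace=True),                       # delta
            nn.Linear(channels // reduction, channels),  # W2
            nn.Sigmoid(),                                # sigma
        )

    def forward(self, u):
        b, c, _, _ = u.shape
        z = u.mean(dim=(2, 3))            # F_sq: global average pool per channel
        s = self.fc(z).view(b, c, 1, 1)   # F_ex: channel importance weights
        return u * s                      # F_scale: channel-wise reweighting
```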
Step 3.3.4: referring to FIG. 2, the SPP module is improved. Specifically: the pooling layers with 1×1, 5×5, 9×9, and 13×13 kernels in the SPP are replaced by a 1×1 convolution and 3×3 hole (dilated) convolutions, and the improved SPP module does not change the size of the feature map. The output feature map size follows the convolution relation (8), where W_in is the input size, K the kernel size, P the padding, D the dilation rate, and S the stride:

W_out = (W_in + 2P - K - (K-1)(D-1))/S + 1 (8)
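The size-preserving replacement of the pooling branches can be sketched as below. The text states only that 1×1 and 3×3 dilated convolutions are used; the specific dilation rates (2, 4, 6), the number of branches, and the 1×1 projection are assumptions, chosen so that with padding P = D, equation (8) leaves the spatial size unchanged.

```python
import torch
import torch.nn as nn

class ImprovedSPP(nn.Module):
    """Size-preserving SPP sketch: the max-pooling branches are replaced
    by a 1x1 convolution and 3x3 dilated ("hole") convolutions."""
    def __init__(self, channels):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Conv2d(channels, channels, kernel_size=3, padding=2, dilation=2),
            nn.Conv2d(channels, channels, kernel_size=3, padding=4, dilation=4),
            nn.Conv2d(channels, channels, kernel_size=3, padding=6, dilation=6),
        ])
        self.project = nn.Conv2d(4 * channels, channels, kernel_size=1)  # fuse back

    def forward(self, x):
        return self.project(torch.cat([b(x) for b in self.branches], dim=1))
```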
Step 3.3.5: referring to FIG. 2, the optimized SE-attention module is integrated: the improved SPP module is added inside the SE-attention to strengthen the expressiveness of the feature map fed into the SE-attention and thereby achieve a better classification effect, yielding the SSE-attention module.
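Composing the two pieces gives a sketch of the SSE-attention module, reusing the SEAttention and ImprovedSPP sketches above; the exact composition order inside the module is an assumption.

```python
import torch.nn as nn

class SSEAttention(nn.Module):
    """SSE-attention sketch: the improved SPP enriches the feature map
    before the SE recalibration (composition order assumed)."""
    def __init__(self, channels):
        super().__init__()
        self.spp = ImprovedSPP(channels)  # sketch defined above
        self.se = SEAttention(channels)   # sketch defined above

    def forward(self, x):
        return self.se(self.spp(x))
```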
Referring to FIG. 7, the invention embeds the LW-CBAM and SSE-Attention modules in two different regions of the network, the neck and the detection head, according to the new network prediction-scale structure, to strengthen important channel and spatial features. Four embedding schemes are verified experimentally, yielding the optimal MSFE-PANet network model and improving small target detection performance.
Referring to FIG. 8, after the optimal attention modules are embedded, the figure shows the method's detailed detection results on overlapping, aggregated, and occluded small targets against a complex background.
Step 4: in the model training scheme, the prediction-box rejection loss function RB_Loss is designed. Specifically: the overlap IOU between the prior prediction boxes of two overlapping targets is taken as the loss value; the larger the overlap, the larger the loss. During training, the back-propagation network is optimized along the gradient direction, separating the overlapping prior prediction boxes of the two targets. Combining the rejection loss function with the YOLOv4 model suits small target detection in complex application scenes and effectively alleviates mutual occlusion and overlap between targets in the image. The loss is defined as (9), where B_i and B_j denote prior prediction boxes matched to different target boxes:

RB_Loss = Σ_{i≠j} IOU(B_i, B_j) (9)
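A sketch of the rejection term as reconstructed in equation (9) follows: it computes the pairwise IoU between prior prediction boxes matched to different targets, so larger overlaps produce larger loss and the gradient pushes the boxes apart. The [x1, y1, x2, y2] box format and the row-wise pairing of the two inputs are assumptions.

```python
import torch

def rb_loss(boxes_a, boxes_b, eps=1e-7):
    """Rejection loss sketch (eq. 9): mean IoU between paired prior
    prediction boxes that are matched to different targets.
    boxes_a, boxes_b: (N, 4) tensors in [x1, y1, x2, y2] format."""
    x1 = torch.max(boxes_a[:, 0], boxes_b[:, 0])
    y1 = torch.max(boxes_a[:, 1], boxes_b[:, 1])
    x2 = torch.min(boxes_a[:, 2], boxes_b[:, 2])
    y2 = torch.min(boxes_a[:, 3], boxes_b[:, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    area_a = (boxes_a[:, 2] - boxes_a[:, 0]) * (boxes_a[:, 3] - boxes_a[:, 1])
    area_b = (boxes_b[:, 2] - boxes_b[:, 0]) * (boxes_b[:, 3] - boxes_b[:, 1])
    iou = inter / (area_a + area_b - inter + eps)
    return iou.mean()  # larger overlap -> larger loss -> boxes repelled
```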
Step 5: model training. The network is trained on the VisDrone2021 dataset for 200 epochs; the experiments set the input image size to 416×416, the batch size to 4 for the first 100 epochs, and to 8 for the last 100 epochs.
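The schedule can be summarized as a small configuration sketch; the dataset name, input size, epoch count, and batch sizes come from the text, while any other training settings (optimizer, learning rate) would be assumptions and are omitted.

```python
# Two-phase training schedule as described in the text.
train_config = {
    "dataset": "VisDrone2021",
    "input_size": (416, 416),
    "total_epochs": 200,
}

def batch_size_for(epoch):
    """Batch size 4 for the first 100 epochs, 8 for the last 100."""
    return 4 if epoch < 100 else 8
```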
Referring to FIG. 6, the method is validated on the VisDrone2021 dataset and compared with the detection performance of the baseline network YOLOv4. By gradually adding the corresponding modules, such as the Cutout-DA data augmentation, the attention modules, and RB_Loss, the effectiveness of the method for small target detection in complex scenes is verified.
Referring to FIG. 9, the results of the method are compared with other methods; it accurately detects small targets that other methods miss or misdetect, and adapts to small target detection tasks in complex scenes.

Claims (1)

1. A feature-enhancement-based small target detection method in complex scenes, comprising the following steps:
Step 1, data preparation: the dataset is derived from aerial images;
Step 2, data augmentation: images are selected at random from the dataset, and the visible parts of some targets, or entire targets, are then randomly occluded at 0.2, 0.4, 0.6, and 0.8 of the target's size; the newly generated occlusion data are added to the VisDrone2021 dataset;
Step 3, designing the multi-scale fused feature-enhancement path aggregation network MSFE-PANet;
Step 3.1: improving the network prediction scale.
The prediction head YOLO head3 in YOLOv4, which targets large objects, is removed, but the corresponding 13×13 feature map is retained; meanwhile, a prediction head YOLO head0 for detecting small-scale targets, generated from the shallow 104×104 high-resolution feature map, is added to the prediction network, producing a new network prediction-scale structure;
Step 3.2: feature layer fusion.
On the new network prediction-scale structure, the feature map extracted by each feature network layer is upsampled by the corresponding factor and added element-wise to the first-layer feature map to obtain new feature maps;
Step 3.3: an attention module;
Step 4: designing the prediction-box rejection loss function RB_Loss;
Step 5: training the model;
the step 3.3 specifically comprises:
Step 3.3.1: adding a CBAM attention module; given an input feature map F, channel attention and spatial attention are applied in sequence, as shown in equation (1):

F′ = M_c(F) ⊗ F,  F″ = M_s(F′) ⊗ F′ (1)

the channel attention is computed as in (2), where σ is the Sigmoid activation function and the MLP weights W₀ and W₁ are shared:

M_c(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F))) = σ(W₁(W₀(F_avg)) + W₁(W₀(F_max))) (2)

the spatial attention is computed as in (3), where σ is the Sigmoid activation function and f^(7×7) denotes a 7×7 convolution filter:

M_s(F) = σ(f^(7×7)([AvgPool(F); MaxPool(F)])) (3)
step 3.3.2: improving the channel attention module of the CBAM;
step 3.3.3: introducing an SE-attention module;
step 3.3.4: improving the SPP module;
step 3.3.5: optimizing the SE-attention module;
in said step 3.3.2, the fully connected layers of the channel attention are replaced by 1×1 convolutions, and the calculation formula is defined as (4):

M_c(F) = σ(f^(1×1)(AvgPool(F)) + f^(1×1)(MaxPool(F))) (4)
Step 3.3.3, giving an input X, the channel number is C 1 Through F tr Is subjected to a series of convolution and pooling operations to obtain a channel number C 2 Is characterized by U; f (F) sq For feature compression operation, feature compression is carried out along the space dimension, and each two-dimensional feature channel is changed into one pixel; then F is carried out ex Excitation operations, then weighted by multiplication onto previous features
Calculation formula (5):
wherein: u (U) C Representing the C-th channel in the feature map; z is Z C Is the output of the compression operation; calculation formula (6): sigma is a Sigmoid activation function; w (W) 1 ,W 2 All are all fully connected operation; delta is a ReLU activation function; calculation formula (7): s is S C Is the C weight in the step S;
S=F ex (Z,W)=σ(g(Z,W))=σ(W 2 δ(W 1 Z)) (6)
F scale =(U C ,S C )=S C ·U C (7);
the step 3.3.4 is specifically that
The pooling layer of the kernels with the sizes of 1,5, 9 and 13 in the SPP is changed into 1*1 convolution and 3*3 cavity convolution, the improved SPP module does not change the size of the feature map, and the size calculation formula of the output feature map is (8)
The step 3.3.5 is specifically
Adding an improved SPP module into the SE-attention to obtain an SSE-attention module;
the step 4 is specifically that
Taking the degree of overlap IOU between the prior prediction frames of the two overlapped targets as the loss value, optimizing a back propagation network according to the gradient direction, separating the overlapped prior prediction frames of the two targets, defining the prior prediction frames as (9) expressing the matching of different target frames in the formula

Priority Applications (1)

CN202210780211.6A, priority 2022-07-04, filed 2022-07-04: Feature enhancement-based small target detection method in complex scene


Publications (2)

CN115063691A (application publication), published 2022-09-16
CN115063691B (granted publication), published 2024-04-12

Family

ID: 83204087

Family Applications (1)

CN202210780211.6A (Active), priority 2022-07-04, filed 2022-07-04: Feature enhancement-based small target detection method in complex scene

Country Status (1)

CN: CN115063691B

Citations (3)

* Cited by examiner, † Cited by third party

CN112733749A *, priority 2021-01-14, published 2021-04-30: Real-time pedestrian detection method integrating attention mechanism
WO2021139069A1 *, priority 2020-01-09, published 2021-07-15: General target detection method for adaptive attention guidance mechanism
CN114565900A *, priority 2022-01-18, published 2022-05-31: Target detection method based on improved YOLOv5 and binocular stereo vision

Family Cites Families (1)

US11256960B2 *, priority 2020-04-15, published 2022-02-22, Adobe Inc.: Panoptic segmentation

Also Published As

CN115063691A, published 2022-09-16


Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant