CN116994034A - Small target detection algorithm based on feature pyramid - Google Patents


Info

Publication number
CN116994034A
CN116994034A (application number CN202310803480.4A)
Authority
CN
China
Prior art keywords
attention
nwd
module
convolution
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310803480.4A
Other languages
Chinese (zh)
Inventor
张丽娟
王敏慧
姜雨彤
周悦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changchun University of Technology
Original Assignee
Changchun University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changchun University of Technology
Priority to CN202310803480.4A
Publication of CN116994034A
Legal status: Pending

Links

Classifications

    • G06V 10/764 — image or video recognition using pattern recognition or machine learning: classification, e.g. of video objects
    • G06N 3/0464 — convolutional networks [CNN, ConvNet]
    • G06N 3/048 — activation functions
    • G06N 3/08 — learning methods
    • G06V 10/44 — local feature extraction, e.g. edges, contours, corners; connectivity analysis
    • G06V 10/52 — scale-space analysis, e.g. wavelet analysis
    • G06V 10/766 — recognition using regression, e.g. by projecting features on hyperplanes
    • G06V 10/806 — fusion of extracted features at the feature extraction level
    • G06V 10/82 — recognition using neural networks
    • G06V 2201/07 — target detection

Abstract

The invention provides a small target detection algorithm based on a feature pyramid, comprising the following steps. First, a bottom-up path is added after the FPN structure, improving performance through path enhancement and aggregation. Second, attention is added to the fusion module: an attention weight map is used to screen the important information in the original feature map, reducing the interference of redundant information with small target prediction. Third, because IoU is very sensitive to the positional deviations of small targets, which greatly degrades the performance of anchor-based detectors, an NWD metric is introduced. The results show that, under the same training parameters, the improved model achieves a higher mAP value, and the mAP_s for small targets improves markedly.

Description

Small target detection algorithm based on feature pyramid
Technical Field
The invention belongs to the field of target detection and designs a novel small target detection algorithm based on a feature pyramid, aimed at small target objects. The invention can effectively improve the detection accuracy of small targets, accurately produce target classifications, and lay the groundwork for subsequent image-processing work.
Background
Object detection is the important task of classifying and locating objects of interest in images or videos, and it is the basis for solving complex visual tasks such as object segmentation, scene understanding, object tracking, and image description. Small targets are ubiquitous in real-world applications, including smart medicine, defect detection, driving assistance, large-scale surveillance, and rescue at sea. Small target detection has now evolved into a popular sub-field of target detection and an important basis for verifying the reliability of target detection algorithms.
While target detection has made great progress in performance and speed with the rapid development of deep learning, most algorithms are directed at objects of normal size. Small target objects often exhibit very limited visual feature information, which increases the difficulty of detecting them and has made progress in small target detection very slow. The feature pyramid network (Feature Pyramid Network, FPN) is a highly representative structure in current small target detection algorithms; it enriches features through hierarchical detection to improve detection performance. Although FPN-based methods have achieved many satisfactory results, problems remain, such as inconsistent gradient computation across layers and insufficient exploitation of shallow features, all of which reduce the effectiveness of the FPN structure. Therefore, to solve the problems caused by the FPN structure, the invention makes a thorough improvement to it, which better raises small target detection performance in different environments.
Most existing small target detection methods can be roughly divided into four categories: data enhancement, multi-scale learning, custom training strategies for small targets, and feature enhancement strategies. One simple and efficient approach in data enhancement is to collect more small target data; another is to use simple augmentations, including rotation, image flipping, and upsampling. The multi-resolution image pyramid is a basic approach to multi-scale learning. To reduce its computational cost, some studies proposed constructing the FPN, and many later approaches have tried to further improve it, such as PANet, BiFPN, and Recursive-FPN. Multi-scale learning strategies typically improve tiny object detection (TOD) performance at the price of additional computation. An object detector is generally unable to obtain satisfactory performance on both small and large objects simultaneously; inspired by this fact, SNIP and SNIPER are designed to train selectively on objects within a certain scale range. Finally, among feature enhancement strategies, some studies have proposed using a GAN to enhance the feature representation of small objects, with PGAN being the first attempt to apply a GAN to small target detection.
Most methods dedicated to small target detection incur additional annotation or computational costs. In contrast, the method provided by the invention does not increase extra cost in the reasoning stage, and can better improve the detection efficiency and detection precision of the small target.
Disclosure of Invention
The invention aims to improve the FPN structure of an existing algorithm so as to raise the detection precision and efficiency for small targets while reducing the occupation of computing resources. A small target detection algorithm based on the feature pyramid is provided that achieves higher small target detection accuracy.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows:
the small target detection algorithm based on the feature pyramid is realized by the following steps:
step one, in order to alleviate the problem that shallow features are underutilized in the FPN of DetectoRS, a bottom-up feature extraction structure is added after the FPN structure to reinforce the accurate positional information carried by the shallow layers. The added bottom-up path lets the positional information of shallow features propagate more easily, improving small target detection performance;
step two, in order to better extract small target features, an attention module is added to the fusion module of DetectoRS. The module comprises two parts: channel attention and spatial attention. The channel attention module adopts a local cross-channel interaction strategy without dimensionality reduction, implemented by a one-dimensional convolution. The spatial attention module adopts a Contextual Transformer (CoT) block, which fully exploits the contextual information among input keys to guide the learning of a dynamic attention matrix, thereby strengthening the visual representation capacity;
step three, the IoU (Intersection over Union) based metric is very sensitive to the positional deviations of small targets, and its use in anchor-based detectors can significantly degrade detection performance. To solve this problem, a new metric, the Normalized Wasserstein Distance (NWD), is introduced. The NWD metric can easily be embedded into the label assignment, non-maximum suppression (NMS), and loss function of any anchor-based detector to replace the usual IoU metric, effectively improving small target detection performance.
In step one, a bottom-up path is added after the FPN structure, enhancing performance through path enhancement and aggregation. The enhancement path starts from the shallowest layer P2 of the FPN, which is mapped directly to the shallowest feature N2 of the enhancement path. Each subsequent layer is then built during upward propagation through a lateral connection: a higher-resolution feature map Ni and a coarser map Pi+1 are taken to generate a new feature map Ni+1. Each feature map Ni first passes through a 3×3 convolution layer with stride 2 to reduce the spatial size. Each element of the feature map Pi+1 is then added to the downsampled map through the lateral connection. The fused feature map is processed by another 3×3 convolution layer to generate Ni+1 for the subsequent subnetwork. This iterative process terminates upon reaching P5.
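The iteration above can be sketched in a few lines of numpy. Only the path structure Ni+1 = conv(downsample(Ni) + Pi+1) is illustrated: the learned 3×3 convolutions are replaced by stand-ins (2×2 average pooling for the stride-2 convolution, identity for the smoothing convolution), so this is a shape-bookkeeping sketch, not the trained network:

```python
import numpy as np

def downsample(x):
    """Stand-in for the stride-2 3x3 convolution: 2x2 average pooling."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def smooth(x):
    """Stand-in for the stride-1 3x3 convolution applied after fusion."""
    return x

def bottom_up_path(P):
    """P: [P2, P3, P4, P5], each (C, H, W) with H and W halving per level.
    Builds N2..N5 with N2 = P2 and N_{i+1} = smooth(downsample(N_i) + P_{i+1})."""
    N = [P[0]]                                         # N2 is mapped directly from P2
    for P_next in P[1:]:                               # iterate upward until P5
        N.append(smooth(downsample(N[-1]) + P_next))   # lateral connection + fusion
    return N

pyramid = [np.random.rand(256, 64 >> i, 64 >> i) for i in range(4)]
N = bottom_up_path(pyramid)   # output shapes mirror the input pyramid
```

The sketch makes the key property visible: each Ni keeps the resolution of the corresponding Pi, so the shallow, high-resolution positional information travels upward through at most a few fused layers.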
The attention module in step two is divided into two parts: channel attention and spatial attention. The convolution kernel size k of the channel attention is determined adaptively, with k proportional to the channel dimension. The convolutional features are first aggregated using global average pooling (GAP); a one-dimensional convolution with the adaptively determined kernel size k is then performed. Finally, the channel attention is obtained through a sigmoid function. The spatial attention adopts a CoT block, which integrates contextual information and self-attention into a unified architecture. The CoT block first applies a 3×3 convolution to obtain spatial key features as the static context information K1; the query is directly the input feature x, and the value is the result of a 1×1 convolution. K1 and the query are then concatenated, and the attention matrix A is obtained through two consecutive 1×1 convolutions. Finally, A is multiplied with the value to obtain the feature map K2, and K1 and K2 are fused to obtain the final feature attention output.
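The channel-attention branch (GAP, adaptive 1D convolution without dimensionality reduction, sigmoid) can be sketched as follows in numpy. The adaptive-kernel rule — the nearest odd value of |log2(C)/γ + b| with γ = 2 and b = 1 — is an assumption borrowed from the ECA design that the text appears to describe, and a uniform kernel stands in for the learned 1D convolution weights:

```python
import numpy as np

def adaptive_kernel_size(channels, gamma=2, b=1):
    """k grows with the channel dimension: nearest odd value of |log2(C)/gamma + b|."""
    t = int(abs(np.log2(channels) / gamma + b))
    return t if t % 2 else t + 1

def channel_attention(x, gamma=2, b=1):
    """x: (C, H, W) feature map. GAP over space, a k-tap 1D convolution across
    channels (local cross-channel interaction, no dimensionality reduction),
    a sigmoid, then channel-wise rescaling of the input."""
    c = x.shape[0]
    k = adaptive_kernel_size(c, gamma, b)
    gap = x.mean(axis=(1, 2))                      # (C,) channel descriptor
    kernel = np.ones(k) / k                        # stand-in for learned 1D weights
    mixed = np.convolve(gap, kernel, mode="same")  # 1D conv across channels
    weights = 1.0 / (1.0 + np.exp(-mixed))         # sigmoid gate per channel
    return x * weights[:, None, None]

x = np.random.rand(256, 8, 8)
y = channel_attention(x)   # same shape as x, channels rescaled by attention
```

For a 256-channel map this rule gives k = 5, so each channel weight is informed only by its few neighbors, which is what keeps the module cheap in computing resources.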
In step three, the NWD metric models each bounding box as a two-dimensional Gaussian distribution and computes the distance between the two distributions using the Wasserstein distance from optimal transport theory. The distribution distance is then exponentially normalized into the similarity metric NWD. The calculation formulas are as follows:

W_2^2(N_a, N_b) = ||[cx_a, cy_a, w_a/2, h_a/2]^T - [cx_b, cy_b, w_b/2, h_b/2]^T||_2^2 (1)

NWD(N_a, N_b) = exp(-sqrt(W_2^2(N_a, N_b)) / C) (2)

where N_a and N_b are the Gaussian models of boxes A = (cx_a, cy_a, w_a, h_a) and B = (cx_b, cy_b, w_b, h_b), and C is a constant closely related to the dataset.
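The metric can be checked with a small numpy sketch; the constant C is a dataset-dependent hyperparameter, and the value used below is purely illustrative:

```python
import numpy as np

def wasserstein2(box_a, box_b):
    """Squared 2-Wasserstein distance between the Gaussians fitted to two
    (cx, cy, w, h) boxes: the squared L2 distance between the vectors
    [cx, cy, w/2, h/2] of the two boxes."""
    va = np.array([box_a[0], box_a[1], box_a[2] / 2, box_a[3] / 2])
    vb = np.array([box_b[0], box_b[1], box_b[2] / 2, box_b[3] / 2])
    return float(np.sum((va - vb) ** 2))

def nwd(box_a, box_b, C=12.8):
    """Exponential normalization into a similarity in (0, 1]; C illustrative."""
    return float(np.exp(-np.sqrt(wasserstein2(box_a, box_b)) / C))

same = nwd((10, 10, 4, 4), (10, 10, 4, 4))   # identical boxes
near = nwd((10, 10, 4, 4), (12, 10, 4, 4))   # shifted by half a box width
far = nwd((10, 10, 4, 4), (50, 50, 4, 4))    # no overlap at all
```

Note how smoothly the similarity decays: a 2-pixel shift of a 4×4 box still scores high under NWD, whereas the IoU of those two boxes has already dropped to 1/3, which is exactly the sensitivity to small positional deviations that motivates the replacement.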
The modification with the NWD metric mainly covers three parts: positive and negative label assignment, NMS, and the regression loss function. Label assignment generates anchors of different scales and aspect ratios and then assigns them binary labels for training the classification and regression heads. Specifically, a positive label is assigned to two kinds of anchors: (1) the anchor having the highest NWD value with a ground-truth box, provided that value exceeds θn, and (2) any anchor whose NWD value with some ground-truth box exceeds the positive threshold θp. Accordingly, if an anchor's NWD value is below the negative threshold θn for all ground-truth boxes, the anchor is assigned a negative label. Anchors assigned neither a positive nor a negative label do not participate in training. In NMS, IoU is replaced with the NWD metric: all prediction boxes are sorted by score, the highest-scoring prediction box M is selected, and all other prediction boxes having significant overlap with M (using a predefined threshold Nt) are suppressed; this process is applied recursively to the remaining boxes. The regression loss function is defined as:
L_NWD = 1 - NWD(N_P, N_G) (3)

where N_P is the Gaussian distribution model of the prediction box P, and N_G is the Gaussian distribution model of the ground-truth box G.
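The NWD-based NMS and the loss of formula (3) can be sketched together as follows; the threshold Nt and the constant C below are illustrative values, not values fixed by the invention:

```python
import numpy as np

def nwd(box_a, box_b, C=12.8):
    """NWD between (cx, cy, w, h) boxes, per formulas (1)-(2); C illustrative."""
    va = np.array([box_a[0], box_a[1], box_a[2] / 2, box_a[3] / 2])
    vb = np.array([box_b[0], box_b[1], box_b[2] / 2, box_b[3] / 2])
    return float(np.exp(-np.linalg.norm(va - vb) / C))

def nwd_nms(boxes, scores, nt=0.5, C=12.8):
    """Greedy NMS with NWD replacing IoU: keep the highest-scoring box M,
    suppress the remaining boxes whose NWD with M exceeds the threshold Nt,
    then recurse on the rest."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        m = order.pop(0)
        keep.append(m)
        order = [i for i in order if nwd(boxes[m], boxes[i], C) <= nt]
    return keep

def nwd_loss(pred, gt, C=12.8):
    """Regression loss of formula (3): L_NWD = 1 - NWD(N_P, N_G)."""
    return 1.0 - nwd(pred, gt, C)

boxes = [(10, 10, 4, 4), (10.5, 10, 4, 4), (50, 50, 4, 4)]
scores = [0.9, 0.8, 0.7]
keep = nwd_nms(boxes, scores)        # the near-duplicate box is suppressed
loss = nwd_loss(boxes[0], boxes[0])  # a perfect prediction gives zero loss
```

The same `nwd` call would serve label assignment as well: compare each anchor against all ground-truth boxes and apply the θp/θn thresholds described above.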
The main advantages of the method provided by the invention are: (1) the added bottom-up path allows the positional information of shallow features to propagate more easily, improving small target detection accuracy; (2) the attention module reduces the interference of redundant information with small target predictions while occupying few computing resources; (3) introducing NWD in place of IoU as a better measure of the similarity between two bounding boxes significantly improves the detector's small target performance.
Drawings
FIG. 1 is a flow chart of the feature pyramid-based small target detection algorithm of the present invention;
FIGS. 2 and 3 are structure diagrams of the channel attention module and the spatial attention module of the feature pyramid-based small target detection algorithm of the present invention;
FIG. 4 is a structure diagram of the fusion module of the feature pyramid-based small target detection algorithm of the present invention;
FIG. 5 is a network structure diagram of the feature pyramid-based small target detection algorithm of the present invention;
FIGS. 6 and 7 are examples of image detection results on the AI-TOD dataset, comparing the feature pyramid-based small target detection algorithm of the present invention with the original model;
FIGS. 8 and 9 are comparisons of the AP indices obtained on the AI-TOD dataset by the feature pyramid-based small target detection algorithm of the present invention and the original model.
Detailed Description
The present invention is described in detail below with reference to the drawings so that those skilled in the art can better understand the present invention. It should be noted that modifications can be made to the present invention by those skilled in the art without departing from the core concept of the present invention, which falls within the scope of the present invention.
As shown in fig. 1, the general flow of the small target detection algorithm based on the feature pyramid of the present invention includes the following steps:
in the first step, a bottom-up path is added after the FPN structure, constructed as follows: starting from the shallowest layer P2 of the FPN, P2 is mapped directly to the shallowest feature N2 of the enhancement path. Each subsequent layer is then built during upward propagation through a lateral connection: a higher-resolution feature map Ni and a coarser map Pi+1 are taken to generate a new feature map Ni+1. Each feature map Ni first passes through a 3×3 convolution layer with stride 2 to reduce the spatial size. Each element of the feature map Pi+1 is then added to the downsampled map through the lateral connection. The fused feature map is processed by another 3×3 convolution layer to generate Ni+1 for the subsequent subnetwork. This iterative process terminates upon reaching P5;
in the second step, an attention module is added to the fusion module; it is divided into two parts: channel attention and spatial attention. The convolution kernel size k of the channel attention is determined adaptively, with k proportional to the channel dimension. The convolutional features are first aggregated using global average pooling (GAP); a one-dimensional convolution with the adaptively determined kernel size k is then performed. Finally, the channel attention is obtained through a sigmoid function. The spatial attention adopts a CoT block, which integrates contextual information and self-attention into a unified architecture. The CoT block first applies a 3×3 convolution to obtain spatial key features as the static context information K1; the query is directly the input feature x, and the value is the result of a 1×1 convolution. K1 and the query are then concatenated, and the attention matrix A is obtained through two consecutive 1×1 convolutions. Finally, A is multiplied with the value to obtain the feature map K2, and K1 and K2 are fused to obtain the final feature attention output;
the IoU metric is modified in step three to be an NWD metric. The NWD index uses the wasperstein distance in the most common transmission theory to calculate the distribution distance. And then carrying out index normalization on the distribution distance to convert the distribution distance into a similarity measurement index NWD. The calculation formula is as follows:
(1)
(2)
The modification with the NWD metric mainly covers three parts: positive and negative label assignment, NMS, and the regression loss function. Label assignment generates anchors of different scales and aspect ratios and then assigns them binary labels for training the classification and regression heads. Specifically, a positive label is assigned to two kinds of anchors: (1) the anchor having the highest NWD value with a ground-truth box, provided that value exceeds θn, and (2) any anchor whose NWD value with some ground-truth box exceeds the positive threshold θp. Accordingly, if an anchor's NWD value is below the negative threshold θn for all ground-truth boxes, the anchor is assigned a negative label. Anchors assigned neither a positive nor a negative label do not participate in training. In NMS, IoU is replaced with the NWD metric: all prediction boxes are sorted by score, the highest-scoring prediction box M is selected, and all other prediction boxes having significant overlap with M (using a predefined threshold Nt) are suppressed; this process is applied recursively to the remaining boxes. The regression loss function is defined as:
L_NWD = 1 - NWD(N_P, N_G) (3)

where N_P is the Gaussian distribution model of the prediction box P, and N_G is the Gaussian distribution model of the ground-truth box G.
Figs. 2, 3 and 4 are structure diagrams of the attention module of the present invention. The spatial attention module and the channel attention module together form the complete attention module, which is added into the fusion module.
FIG. 5 is a network structure diagram of the feature pyramid-based small target detection algorithm of the present invention. As shown, on the basis of DetectoRS, the bottom-up path structure is introduced after the FPN structure and the attention module is added to the original fusion module.
Figs. 6 through 9 show comparative examples of the original model and the model of the present invention. From figs. 6 and 7 it is evident that the model of the invention detects small targets more accurately, finding more small targets with higher detection accuracy than the original model. Figs. 8 and 9 show that, by the AP indices, both the mAP value and the mAP_s value for small targets are significantly improved.

Claims (4)

1. A small target detection algorithm based on a feature pyramid, wherein the method is realized through the following steps:
step one, in order to alleviate the problem that shallow features are underutilized in the FPN of DetectoRS, a bottom-up feature extraction structure is added after the FPN structure to reinforce the accurate positional information carried by the shallow layers. The added bottom-up path lets the positional information of shallow features propagate more easily, improving small target detection performance;
step two, in order to better extract small target features, an attention module is added to the fusion module of DetectoRS. The module comprises two parts: channel attention and spatial attention. The channel attention module adopts a local cross-channel interaction strategy without dimensionality reduction, implemented by a one-dimensional convolution. The spatial attention module adopts a Contextual Transformer (CoT) block, which fully exploits the contextual information among input keys to guide the learning of a dynamic attention matrix, thereby strengthening the visual representation capacity;
step three, the IoU (Intersection over Union) based metric is very sensitive to the positional deviations of small targets, and its use in anchor-based detectors can significantly degrade detection performance. To solve this problem, a new metric, the Normalized Wasserstein Distance (NWD), is introduced. The NWD metric can easily be embedded into the label assignment, non-maximum suppression (NMS), and loss function of any anchor-based detector to replace the usual IoU metric, effectively improving small target detection performance.
2. The feature pyramid-based small object detection algorithm according to claim 1, wherein in the first step, a bottom-up path is added after the FPN structure, and the construction process is as follows:
starting from the shallowest layer P2 of the FPN, P2 is mapped directly to the shallowest feature N2 of the enhancement path. Each subsequent layer is then built during upward propagation through a lateral connection: a higher-resolution feature map Ni and a coarser map Pi+1 are taken to generate a new feature map Ni+1. Each feature map Ni first passes through a 3×3 convolution layer with stride 2 to reduce the spatial size. Each element of the feature map Pi+1 is then added to the downsampled map through the lateral connection. The fused feature map is processed by another 3×3 convolution layer to generate Ni+1 for the subsequent subnetwork. This iterative process terminates upon reaching P5.
3. The feature pyramid-based small target detection algorithm according to claim 2, wherein the second step adds an attention module to the fusion module, and the attention module is divided into two parts: channel attention and spatial attention. The specific operation is described as follows:
the convolution kernel size k of the channel attention is determined adaptively, with k proportional to the channel dimension. The convolutional features are first aggregated using global average pooling (GAP); a one-dimensional convolution with the adaptively determined kernel size k is then performed. Finally, the channel attention is obtained through a sigmoid function. The spatial attention adopts a CoT block, which integrates contextual information and self-attention into a unified architecture. The CoT block first applies a 3×3 convolution to obtain spatial key features as the static context information K1; the query is directly the input feature x, and the value is the result of a 1×1 convolution. K1 and the query are then concatenated, and the attention matrix A is obtained through two consecutive 1×1 convolutions. Finally, A is multiplied with the value to obtain the feature map K2, and K1 and K2 are fused to obtain the final feature attention output.
4. The feature pyramid-based small object detection algorithm according to claim 3, wherein the IoU metric is modified in step three into the NWD metric. The NWD metric models each bounding box as a two-dimensional Gaussian distribution and computes the distance between the two distributions using the Wasserstein distance from optimal transport theory. The distribution distance is then exponentially normalized into the similarity metric NWD. The calculation formulas are as follows:

W_2^2(N_a, N_b) = ||[cx_a, cy_a, w_a/2, h_a/2]^T - [cx_b, cy_b, w_b/2, h_b/2]^T||_2^2 (1)

NWD(N_a, N_b) = exp(-sqrt(W_2^2(N_a, N_b)) / C) (2)

where N_a and N_b are the Gaussian models of boxes A = (cx_a, cy_a, w_a, h_a) and B = (cx_b, cy_b, w_b, h_b), and C is a constant closely related to the dataset.
The modification with the NWD metric mainly covers three parts: positive and negative label assignment, NMS, and the regression loss function. Label assignment generates anchors of different scales and aspect ratios and then assigns them binary labels for training the classification and regression heads. Specifically, a positive label is assigned to two kinds of anchors: (1) the anchor having the highest NWD value with a ground-truth box, provided that value exceeds θn, and (2) any anchor whose NWD value with some ground-truth box exceeds the positive threshold θp. Accordingly, if an anchor's NWD value is below the negative threshold θn for all ground-truth boxes, the anchor is assigned a negative label. Anchors assigned neither a positive nor a negative label do not participate in training. In NMS, IoU is replaced with the NWD metric: all prediction boxes are sorted by score, the highest-scoring prediction box M is selected, and all other prediction boxes having significant overlap with M (using a predefined threshold Nt) are suppressed; this process is applied recursively to the remaining boxes. The regression loss function is defined as:
L_NWD = 1 - NWD(N_P, N_G) (3)

where N_P is the Gaussian distribution model of the prediction box P, and N_G is the Gaussian distribution model of the ground-truth box G.
CN202310803480.4A 2023-07-03 2023-07-03 Small target detection algorithm based on feature pyramid Pending CN116994034A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310803480.4A CN116994034A (en) 2023-07-03 2023-07-03 Small target detection algorithm based on feature pyramid

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310803480.4A CN116994034A (en) 2023-07-03 2023-07-03 Small target detection algorithm based on feature pyramid

Publications (1)

Publication Number Publication Date
CN116994034A true CN116994034A (en) 2023-11-03

Family

ID=88533021

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310803480.4A Pending CN116994034A (en) 2023-07-03 2023-07-03 Small target detection algorithm based on feature pyramid

Country Status (1)

Country Link
CN (1) CN116994034A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117237830A (en) * 2023-11-10 2023-12-15 湖南工程学院 Unmanned aerial vehicle small target detection method based on dynamic self-adaptive channel attention
CN117237830B (en) * 2023-11-10 2024-02-20 湖南工程学院 Unmanned aerial vehicle small target detection method based on dynamic self-adaptive channel attention

Similar Documents

Publication Publication Date Title
Jin et al. Multi-feature fusion and enhancement single shot detector for traffic sign recognition
Yuan et al. Anomaly detection in traffic scenes via spatial-aware motion reconstruction
KR100483832B1 (en) Method of describing image texture descriptor
Ju et al. A simple and efficient network for small target detection
CN108734210B (en) Object detection method based on cross-modal multi-scale feature fusion
Derpanis et al. Classification of traffic video based on a spatiotemporal orientation analysis
Wang et al. Multifocus image fusion using convolutional neural networks in the discrete wavelet transform domain
Min et al. New approach to vehicle license plate location based on new model YOLO‐L and plate pre‐identification
TW202207077A (en) Text area positioning method and device
Wang et al. An advanced YOLOv3 method for small-scale road object detection
Gao et al. CAMRL: A joint method of channel attention and multidimensional regression loss for 3D object detection in automated vehicles
CN111626120B (en) Target detection method based on improved YOLO-6D algorithm in industrial environment
CN111723693A (en) Crowd counting method based on small sample learning
Liu et al. Study of human action recognition based on improved spatio-temporal features
CN116994034A (en) Small target detection algorithm based on feature pyramid
Li et al. Lcnn: Low-level feature embedded cnn for salient object detection
Wu et al. FSANet: Feature-and-spatial-aligned network for tiny object detection in remote sensing images
Yu et al. Traffic sign detection based on visual co-saliency in complex scenes
Gu et al. Embedded and real-time vehicle detection system for challenging on-road scenes
Xiao et al. Pedestrian object detection with fusion of visual attention mechanism and semantic computation
CN112446431A (en) Feature point extraction and matching method, network, device and computer storage medium
CN103324753A (en) Image retrieval method based on symbiotic sparse histogram
CN115527133A (en) High-resolution image background optimization method based on target density information
Li et al. Fast object detection from unmanned surface vehicles via objectness and saliency
Tighkhorshid et al. Car depth estimation within a monocular image using a light CNN

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination