CN116681983A - Long and narrow target detection method based on deep learning - Google Patents

Long and narrow target detection method based on deep learning

Info

Publication number
CN116681983A
Authority
CN
China
Prior art keywords
detection
loss
target
deep learning
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310648368.8A
Other languages
Chinese (zh)
Inventor
Jiao Wenhua (焦文华)
Luo Yuan (骆园)
Tian Yuyu (田玉宇)
Li Ruilin (李瑞林)
Xie Xiaohao (谢小浩)
Cai Xiaoyi (蔡晓异)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Mining and Technology CUMT
Original Assignee
China University of Mining and Technology CUMT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China University of Mining and Technology CUMT filed Critical China University of Mining and Technology CUMT
Priority to CN202310648368.8A priority Critical patent/CN116681983A/en
Publication of CN116681983A publication Critical patent/CN116681983A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/24 Aligning, centring, orientation detection or correction of the image
    • G06V10/245 Aligning, centring, orientation detection or correction of the image by locating a pattern; Special marks for positioning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/7715 Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a deep-learning-based method for detecting long and narrow targets, relating to the technical field of elongated target detection. A test image is input into a detection model to detect the target objects in the image, and the detection model comprises a data acquisition and preprocessing module, a long and narrow target detection network training module, and a test image detection frame generation module. With this structure, the invention obtains images of suitable size and enlarges the training set through data preprocessing, improving the generalization ability of the network model; a global attention mechanism (GAM) is added between the Backbone network and the Neck, enhancing the network's ability to extract target-object features and thereby improving detection precision; an oriented-bounding-box representation is introduced for accurate regression of the detection boxes, duplicate detection boxes are removed with a control threshold, and a CIoU loss function is adopted to obtain more accurate detection-box results.

Description

Long and narrow target detection method based on deep learning
Technical Field
The invention relates to the technical field of long and narrow target detection, in particular to a long and narrow target detection method based on deep learning.
Background
Object detection in computer vision aims to identify and localize the target objects present in an image. It is a classical task in the field of computer vision, has important application value in informatized intelligent agriculture, industrial intelligence, autonomous driving, and other fields, and is an important precondition for subsequent vision tasks. With the rapid development of deep learning, object detection has advanced step by step into new fields, successively overcoming the low efficiency, poor accuracy, and labor-intensiveness of traditional manual inspection.
In recent years, narrow, long, and densely packed targets have appeared in detection tasks across many fields: adhered wheat grains and dense wheat ears in agricultural scenes, remote-sensing images of aircraft and vessels acquired from satellites, and dense cracks on industrial products in industrial scenes. Because such targets occlude one another and are arranged in varying directions, the effective resolution of each target object is reduced, and the conventional single-stage (YOLO, SSD, RetinaNet) and two-stage (Fast R-CNN, Faster R-CNN) detection methods suffer from low precision and missed detections.
Among existing methods for elongated target detection, publication No. CN113326763A discloses a remote-sensing target detection method based on bounding-box consistency. It mainly uses a ResNet101 Conv1-5 network as the base network, generates predicted bounding boxes from a heat map together with offset, box, and direction information, displays the localization according to the predicted boxes, and improves regression quality and detection speed. However, the method depends strongly on its data set and generalizes poorly; when the scene is switched to a data set of long, narrow, densely packed targets with varying orientations, its efficiency is low and its miss rate is high.
Accordingly, there is a need for a deep-learning-based elongated target detection method that solves the above problems.
Disclosure of Invention
The invention aims to provide a deep-learning-based method for detecting long and narrow targets, mainly to solve the low efficiency and missed detections that arise when detecting long and narrow targets that are unevenly arranged and differently oriented.
To achieve the above purpose, the invention provides a deep-learning-based long and narrow target detection method in which a test image is input into a detection model to detect the target objects in the image, the detection model comprising a data acquisition and preprocessing module, a long and narrow target detection network training module, and a test image detection frame generation module.
Preferably, the data acquisition and preprocessing module comprises a data acquisition module and a data preprocessing module; the data acquisition module takes a number of target images shot by a camera as the data set for model training, validation, and testing; the data preprocessing module annotates the target images with the rotated-box annotation tool roLabelImg, crops and rotates the data set, and randomly divides it into a training set, a validation set, and a test set.
Preferably, the detection model extracts feature maps using convolution, normalization, and activation operations and, combined with channel-information fusion, feeds feature maps at different downsampling rates into the Neck structure.
Preferably, the detection model is an improvement trained from the initial YOLOX: the detection and regression mode of the detection model during training and inference is changed to oriented-bounding-box detection, a global attention mechanism GAM is adopted, and the loss function is optimized.
Preferably, the oriented-bounding-box detection adds a rotation angle θ to the conventional rectangular box, expressed algebraically as (x_c, y_c, w, h, θ), where (x_c, y_c) are the coordinates of the box center point and (w, h) are the width and height of the box.
Preferably, the global attention mechanism GAM is added between the Backbone network and the Neck network.
Preferably, the global attention mechanism GAM comprises the following steps:
S1: compress the feature map of the target image with a global average pooling (GAP) module;
S2: reduce the feature dimension with an S_D downsampling module;
S3: activate with a ReLU function;
S4: restore the original dimension through a fully connected layer with an S_U upsampling module;
S5: obtain normalized weights through a sigmoid function;
S6: weight each channel with the normalized weights using Scale, outputting as many weights as there are input features.
Preferably, the loss function takes a multi-task form, consisting mainly of a positioning loss L_obj, a classification loss L_cls, and a confidence loss L_reg; the total loss L_total is expressed as:
L_total = L_obj + L_cls + L_reg
where the positioning loss L_obj computes the localization error of the prediction box of the image target object, including the coordinate error and the width-height error of the bounding box; the confidence loss L_reg computes the position error of the target-object prediction box; and the classification loss L_cls computes the class error of the detected target's prediction box;
the classification loss L_cls consists of a target-class loss and an angle loss and is expressed with binary cross entropy, where S² denotes the number of grid cells, B denotes the number of anchors, θ is the angle category, I_ij = 1 when the j-th anchor detects the target object and I_ij = 0 when it does not, P_i(c) denotes the probability that the detection is the target object, and P_i(θ) denotes the probability that the rotation angle of the target object is θ.
Preferably, the confidence loss L_obj of the detection layer is improved on the basis of the intersection over union (IoU), and the positioning loss uses CIoU to compute the true spatial relation between the boxes. The intersection over union is computed as
IoU = area(pred ∩ targ) / area(pred ∪ targ)
where pred denotes the predicted box of the target object and targ denotes the ground-truth bounding box of the target object;
the aspect-ratio similarity is measured by
v = (4 / π²) · (arctan(w_gt / h_gt) - arctan(w_b / h_b))²
the weight function is
α = v / ((1 - IoU) + v)
and the CIoU loss function is
L_CIoU = 1 - IoU + l²(O_b, O_gt) / c² + α·v
where l(O_b, O_gt) denotes the Euclidean distance between the anchor-box center point and the ground-truth-box center point, c denotes the diagonal length of the smallest rectangle enclosing both boxes, w_gt and h_gt are the width and height of the ground-truth box, and w_b and h_b are the width and height of the anchor box.
Preferably, the test image detection frame generation module covers generation of the detection boxes and display of the detection results; during generation, duplicate detection boxes are removed using a control threshold.
Therefore, the deep-learning-based long and narrow target detection method has the following beneficial effects:
(1) The detection and regression mode of the detection model during training and inference is changed to oriented-bounding-box detection, meeting the detection requirements of long and narrow target objects.
(2) The invention adopts a global attention mechanism to improve the representation of the image and thereby obtain richer target features.
(3) The invention adopts oriented bounding boxes, from which the exact position of the rotated rectangle in the image can be obtained; the oriented-bounding-box detection method improves the detection performance and precision for rotated targets while reducing the size of the corresponding model.
(4) The invention uses the CIoU loss function, which takes the positional relation between the detection box and the ground-truth box into account, improving detection performance.
(5) The invention removes duplicates with a control threshold, solving the problem of multiple detection boxes appearing on one target object in the visualized results.
The technical scheme of the invention is further described in detail through the drawings and the embodiments.
Drawings
FIG. 1 is an overall implementation flowchart of the deep-learning-based long and narrow target detection method;
FIG. 2 is a data annotation diagram of the deep-learning-based long and narrow target detection method of the present invention;
FIG. 3 is a CBS module architecture diagram of the deep-learning-based long and narrow target detection method of the present invention;
FIG. 4 is a schematic diagram of an oriented bounding box in the deep-learning-based long and narrow target detection method of the present invention;
FIG. 5 is a GAM schematic diagram of the deep-learning-based long and narrow target detection method of the present invention;
FIG. 6 is a diagram of the decoupled detection head of the deep-learning-based long and narrow target detection method of the present invention;
FIG. 7 is a before-and-after comparison of the de-duplication process of the deep-learning-based long and narrow target detection method of the present invention;
FIG. 8 is a model block diagram of the deep-learning-based long and narrow target detection method of the present invention.
Detailed Description
The technical scheme of the invention is further described below through the attached drawings and the embodiments.
Unless defined otherwise, technical or scientific terms used herein have the ordinary meaning understood by one of ordinary skill in the art to which this invention belongs. The terms "first", "second", and the like do not denote any order, quantity, or importance, but merely distinguish one element from another. The word "comprising", "comprises", or the like means that the elements or items preceding the word include those listed after the word and their equivalents, without excluding other elements or items. The terms "connected" and the like are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "Upper", "lower", "left", "right", etc. merely indicate relative positional relationships, which may change when the absolute position of the described object changes.
Examples
As shown in FIGS. 1 to 8, the invention provides a deep-learning-based method for detecting long and narrow targets: a test image is input into a detection model to detect the target objects in the image, the detection model comprising a data acquisition and preprocessing module, a long and narrow target detection network training module, and a test image detection frame generation module.
The data acquisition and preprocessing module comprises a data acquisition module and a data preprocessing module. The data acquisition module takes a number of target images shot by a camera as the data set for model training, validation, and testing. The data preprocessing module annotates the target images with the rotated-box annotation tool roLabelImg, then crops and rotates the data set: the images are rotated by 30, 60, 90, 120, and 180 degrees respectively, and each 2688 x 2688 image is cropped into 1024 x 1024 tiles with an overlap of 200 pixels between adjacent crops. The data set is then randomly divided into a training set, a validation set, and a test set in the proportion 7:2:1, as sketched below.
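A minimal Python sketch of this preprocessing scheme, assuming OpenCV-readable images; the helper names and the use of cv2/NumPy are illustrative and not taken from the patent, and label transformation is omitted for brevity.

import cv2
import numpy as np

ANGLES = [30, 60, 90, 120, 180]   # rotation angles stated in the text
TILE, OVERLAP = 1024, 200          # crop size and overlap stated in the text

def rotate(img, angle):
    """Rotate an image about its center, keeping the original canvas size for simplicity."""
    h, w = img.shape[:2]
    m = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    return cv2.warpAffine(img, m, (w, h))

def crop_tiles(img, tile=TILE, overlap=OVERLAP):
    """Slide a tile-sized window with the given overlap over a 2688 x 2688 image."""
    step = tile - overlap
    h, w = img.shape[:2]
    return [img[y:y + tile, x:x + tile]
            for y in range(0, h - tile + 1, step)
            for x in range(0, w - tile + 1, step)]

def augment(img):
    """Original image plus one copy per rotation angle, each cut into tiles."""
    out = crop_tiles(img)
    for a in ANGLES:
        out.extend(crop_tiles(rotate(img, a)))
    return out

def split(samples, ratios=(0.7, 0.2, 0.1), seed=0):
    """Randomly split sample indices into train/val/test in the 7:2:1 ratio."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(samples))
    n_tr, n_va = int(ratios[0] * len(samples)), int(ratios[1] * len(samples))
    return idx[:n_tr], idx[n_tr:n_tr + n_va], idx[n_tr + n_va:]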
The detection model builds its initial network from convolution, batch normalization, and SiLU activation (CBS) modules, cross-stage partial (CSP) structures, a feature pyramid network (FPN), a path aggregation network (PAN) module, and a spatial pyramid pooling (SPP) module; the architecture of the CBS module is shown in FIG. 3. The detection model extracts feature maps using convolution, normalization, and activation operations and, combined with channel-information fusion, feeds feature maps at different downsampling rates into the Neck structure. Through the oriented-bounding-box detection method, the exact position of the rotated rectangle in the image can be obtained, improving the detection performance and precision for rotated targets while reducing the size of the corresponding model.
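For reference, a minimal PyTorch sketch of such a CBS block, in the style of common YOLOX implementations; the parameter names are illustrative.

import torch.nn as nn

class CBS(nn.Module):
    """Convolution -> batch normalization -> SiLU activation."""
    def __init__(self, c_in, c_out, k=3, s=1):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, k, s, padding=k // 2, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))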
The detection model is an improvement trained from the initial YOLOX: the detection and regression mode of the detection model during training and inference is changed to oriented-bounding-box detection, the YOLOX-Darknet53 backbone is retained, a global attention mechanism GAM is adopted, and the loss function is optimized.
Oriented-bounding-box detection adds a rotation angle θ to the conventional rectangular box, expressed algebraically as (x_c, y_c, w, h, θ), where (x_c, y_c) are the coordinates of the box center point and (w, h) are the width and height of the box.
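A short sketch of how the (x_c, y_c, w, h, θ) representation can be converted to corner points, e.g. for visualization or rotated-IoU computation; the counter-clockwise degree convention for θ is an assumption, since the text does not state one.

import numpy as np

def obb_to_corners(x_c, y_c, w, h, theta_deg):
    """Return the 4 corners of a rotated rectangle as a (4, 2) array."""
    t = np.deg2rad(theta_deg)                      # assumed CCW, in degrees
    rot = np.array([[np.cos(t), -np.sin(t)],
                    [np.sin(t),  np.cos(t)]])
    # Axis-aligned corners relative to the center, then rotate and translate.
    half = np.array([[-w / 2, -h / 2], [w / 2, -h / 2],
                     [w / 2,  h / 2], [-w / 2,  h / 2]])
    return half @ rot.T + np.array([x_c, y_c])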
A global attention mechanism GAM is added between the Backbone network and the Neck network.
The global attention mechanism GAM comprises the following steps, sketched in code below:
S1: compress the feature map of the target image with a global average pooling (GAP) module;
S2: reduce the feature dimension with an S_D downsampling module;
S3: activate with a ReLU function;
S4: restore the original dimension through a fully connected layer with an S_U upsampling module;
S5: obtain normalized weights through a sigmoid function;
S6: weight each channel with the normalized weights using Scale, outputting as many weights as there are input features.
The loss function takes a multi-task form, consisting mainly of a positioning loss L_obj, a classification loss L_cls, and a confidence loss L_reg; the total loss L_total is expressed as:
L_total = L_obj + L_cls + L_reg
where the positioning loss L_obj computes the localization error of the prediction box of the image target object, including the coordinate error and the width-height error of the bounding box; the confidence loss L_reg computes the position error of the target-object prediction box; and the classification loss L_cls computes the class error of the detected target's prediction box;
the classification loss L_cls consists of a target-class loss and an angle loss and is expressed with binary cross entropy, where S² denotes the number of grid cells, B denotes the number of anchors, θ is the angle category, I_ij = 1 when the j-th anchor detects the target object and I_ij = 0 when it does not, P_i(c) denotes the probability that the detection is the target object, and P_i(θ) denotes the probability that the rotation angle of the target object is θ.
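A hedged sketch of this classification term: binary cross entropy applied to both the class probabilities P_i(c) and the angle-category probabilities P_i(θ), summed over anchors matched to a target (I_ij = 1). The tensor shapes, the logit formulation, and the angle-binning scheme are assumptions, not taken from the patent.

import torch.nn.functional as F

def classification_loss(cls_pred, cls_tgt, ang_pred, ang_tgt, obj_mask):
    """cls_pred/ang_pred: (N, n_cls)/(N, n_angle_bins) logits;
    cls_tgt/ang_tgt: one-hot float targets of matching shape;
    obj_mask: (N,) bool, True where an anchor detects a target (I_ij = 1)."""
    cls = F.binary_cross_entropy_with_logits(
        cls_pred[obj_mask], cls_tgt[obj_mask], reduction="sum")
    ang = F.binary_cross_entropy_with_logits(
        ang_pred[obj_mask], ang_tgt[obj_mask], reduction="sum")
    return cls + ang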
The confidence loss L_obj of the detection layer is improved on the basis of the intersection over union (IoU), and the positioning loss uses CIoU to compute the true spatial relation between the boxes. The intersection over union is computed as
IoU = area(pred ∩ targ) / area(pred ∪ targ)
where pred denotes the predicted box of the target object and targ denotes the ground-truth bounding box of the target object;
the aspect-ratio similarity is measured by
v = (4 / π²) · (arctan(w_gt / h_gt) - arctan(w_b / h_b))²
the weight function is
α = v / ((1 - IoU) + v)
and the CIoU loss function is
L_CIoU = 1 - IoU + l²(O_b, O_gt) / c² + α·v
where l(O_b, O_gt) denotes the Euclidean distance between the anchor-box center point and the ground-truth-box center point, c denotes the diagonal length of the smallest rectangle enclosing both boxes, w_gt and h_gt are the width and height of the ground-truth box, and w_b and h_b are the width and height of the anchor box.
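A sketch of the CIoU terms defined above, written for axis-aligned boxes in (x1, y1, x2, y2) form; applying it to oriented boxes would additionally require a rotated-IoU routine, which is omitted here.

import math
import torch

def ciou_loss(pred, targ, eps=1e-7):
    """pred, targ: (N, 4) tensors of (x1, y1, x2, y2) boxes."""
    # Intersection over union.
    xi1, yi1 = torch.max(pred[:, 0], targ[:, 0]), torch.max(pred[:, 1], targ[:, 1])
    xi2, yi2 = torch.min(pred[:, 2], targ[:, 2]), torch.min(pred[:, 3], targ[:, 3])
    inter = (xi2 - xi1).clamp(0) * (yi2 - yi1).clamp(0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (targ[:, 2] - targ[:, 0]) * (targ[:, 3] - targ[:, 1])
    iou = inter / (area_p + area_t - inter + eps)

    # Squared center distance l² over squared enclosing-box diagonal c².
    cxp, cyp = (pred[:, 0] + pred[:, 2]) / 2, (pred[:, 1] + pred[:, 3]) / 2
    cxt, cyt = (targ[:, 0] + targ[:, 2]) / 2, (targ[:, 1] + targ[:, 3]) / 2
    rho2 = (cxp - cxt) ** 2 + (cyp - cyt) ** 2
    cw = torch.max(pred[:, 2], targ[:, 2]) - torch.min(pred[:, 0], targ[:, 0])
    ch = torch.max(pred[:, 3], targ[:, 3]) - torch.min(pred[:, 1], targ[:, 1])
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term v and its weight alpha.
    wp, hp = pred[:, 2] - pred[:, 0], pred[:, 3] - pred[:, 1]
    wt, ht = targ[:, 2] - targ[:, 0], targ[:, 3] - targ[:, 1]
    v = (4 / math.pi ** 2) * (torch.atan(wt / (ht + eps))
                              - torch.atan(wp / (hp + eps))) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v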
The test image detection frame generation module covers generation of the detection boxes and display of the detection results; during generation, duplicate detection boxes are removed using a control threshold.
Example 1
Taking densely packed wheat grains and impurities as an example: for every detection box of a target object, the minimum enclosing rectangle is taken; the center of that rectangle, i.e., the center-point coordinates of the rotated rectangular detection box, is computed; and the boxes are screened according to the distance between the center points and the confidence scores.
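The pseudo code referenced in the source is not reproduced in this text; the following is a hedged Python reconstruction of the described screening: boxes are sorted by confidence, and a box is kept only if its center (which coincides with the center of its minimum enclosing rectangle) is farther than a control threshold from every already-kept center. The threshold value is an assumption.

import numpy as np

def dedup_by_center(boxes, scores, dist_thresh=20.0):
    """boxes: (N, 5) array of (x_c, y_c, w, h, theta); scores: (N,).
    Returns the indices of the boxes kept after de-duplication."""
    order = np.argsort(scores)[::-1]   # highest confidence first
    centers = boxes[:, :2]             # rotated-box center = enclosing-rect center
    keep = []
    for i in order:
        if all(np.linalg.norm(centers[i] - centers[j]) > dist_thresh
               for j in keep):
            keep.append(i)
    return keep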
Therefore, with this deep-learning-based long and narrow target detection method, data preprocessing yields images of suitable size and adds training samples, improving the generalization ability of the network model; a GAM global attention mechanism added between the Backbone network and the Neck enhances the network's extraction of target-object features and further improves detection precision; an oriented-bounding-box representation is introduced for accurate regression of the detection boxes, duplicate boxes are removed with a control threshold, and a CIoU loss function yields more accurate detection-box results, solving the low efficiency and missed detections caused by detecting long and narrow targets that are unevenly arranged and differently oriented.
Finally, it should be noted that the above embodiments merely illustrate, and do not limit, the technical solution of the present invention. Although the invention has been described in detail with reference to preferred embodiments, those skilled in the art will understand that the technical solution of the invention may be modified or equivalently replaced without departing from its spirit and scope.

Claims (10)

1. A long and narrow target detection method based on deep learning, characterized in that a test image is input into a detection model to detect the target objects in the image, the detection model comprising a data acquisition and preprocessing module, a long and narrow target detection network training module, and a test image detection frame generation module.
2. The method for detecting an elongated object based on deep learning according to claim 1, wherein: the data acquisition and preprocessing module comprises a data acquisition module and a data preprocessing module; the data acquisition module takes a number of target images shot by a camera as the data set for model training, validation, and testing; the data preprocessing module annotates the target images with the rotated-box annotation tool roLabelImg, crops and rotates the data set, and randomly divides it into a training set, a validation set, and a test set.
3. The method for detecting an elongated object based on deep learning according to claim 1, wherein: the detection model extracts feature maps using convolution, normalization, and activation operations and, combined with channel-information fusion, feeds feature maps at different downsampling rates into the Neck structure.
4. The method for detecting an elongated object based on deep learning according to claim 1, wherein: the detection model is an improvement trained from the initial YOLOX, the detection and regression mode of the detection model during training and inference is changed to oriented-bounding-box detection, a global attention mechanism GAM is adopted, and the loss function is optimized.
5. The method for detecting an elongated object based on deep learning according to claim 4, wherein: the oriented-bounding-box detection adds a rotation angle θ to the conventional rectangular box, expressed algebraically as (x_c, y_c, w, h, θ), where (x_c, y_c) are the coordinates of the box center point and (w, h) are the width and height of the box.
6. The method for detecting an elongated object based on deep learning according to claim 4, wherein: the global attention mechanism GAM is added between the Backbone network and the Neck network.
7. The method for detecting an elongated object based on deep learning according to claim 6, wherein the global attention mechanism GAM comprises the following steps:
S1: compress the feature map of the target image with a global average pooling (GAP) module;
S2: reduce the feature dimension with an S_D downsampling module;
S3: activate with a ReLU function;
S4: restore the original dimension through a fully connected layer with an S_U upsampling module;
S5: obtain normalized weights through a sigmoid function;
S6: weight each channel with the normalized weights using Scale, outputting as many weights as there are input features.
8. The method for detecting an elongated object based on deep learning according to claim 4, wherein: the loss function takes a multi-task form, consisting mainly of a positioning loss L_obj, a classification loss L_cls, and a confidence loss L_reg; the total loss L_total is expressed as:
L_total = L_obj + L_cls + L_reg
where the positioning loss L_obj computes the localization error of the prediction box of the image target object, including the coordinate error and the width-height error of the bounding box; the confidence loss L_reg computes the position error of the target-object prediction box; and the classification loss L_cls computes the class error of the detected target's prediction box;
the classification loss L_cls consists of a target-class loss and an angle loss and is expressed with binary cross entropy, where S² denotes the number of grid cells, B denotes the number of anchors, θ is the angle category, I_ij = 1 when the j-th anchor detects the target object and I_ij = 0 when it does not, P_i(c) denotes the probability that the detection is the target object, and P_i(θ) denotes the probability that the rotation angle of the target object is θ.
9. The method for detecting an elongated object based on deep learning according to claim 8, wherein: the confidence loss L_obj of the detection layer is improved on the basis of the intersection over union (IoU), and the positioning loss uses CIoU to compute the true spatial relation between the boxes; the intersection over union is computed as
IoU = area(pred ∩ targ) / area(pred ∪ targ)
where pred denotes the predicted box of the target object and targ denotes the ground-truth bounding box of the target object;
the aspect-ratio similarity is measured by
v = (4 / π²) · (arctan(w_gt / h_gt) - arctan(w_b / h_b))²
the weight function is
α = v / ((1 - IoU) + v)
and the CIoU loss function is
L_CIoU = 1 - IoU + l²(O_b, O_gt) / c² + α·v
where l(O_b, O_gt) denotes the Euclidean distance between the anchor-box center point and the ground-truth-box center point, c denotes the diagonal length of the smallest rectangle enclosing both boxes, w_gt and h_gt are the width and height of the ground-truth box, and w_b and h_b are the width and height of the anchor box.
10. The method for detecting an elongated object based on deep learning according to claim 1, wherein: the test image detection frame generation module covers detection-box generation and detection-result display, and duplicate detection boxes are removed using a control threshold during the generation process.
CN202310648368.8A 2023-06-02 2023-06-02 Long and narrow target detection method based on deep learning Pending CN116681983A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310648368.8A CN116681983A (en) 2023-06-02 2023-06-02 Long and narrow target detection method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310648368.8A CN116681983A (en) 2023-06-02 2023-06-02 Long and narrow target detection method based on deep learning

Publications (1)

Publication Number Publication Date
CN116681983A true CN116681983A (en) 2023-09-01

Family

ID=87786637

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310648368.8A Pending CN116681983A (en) 2023-06-02 2023-06-02 Long and narrow target detection method based on deep learning

Country Status (1)

Country Link
CN (1) CN116681983A (en)


Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112668560A (en) * 2021-03-16 2021-04-16 中国矿业大学(北京) Pedestrian detection method and system for pedestrian flow dense area
CN113298169A (en) * 2021-06-02 2021-08-24 浙江工业大学 Convolutional neural network-based rotating target detection method and device
CN114581847A (en) * 2022-03-04 2022-06-03 山东科技大学 Method and device for detecting abnormal behaviors of pedestrians in community based on GAM tracker
CN115272828A (en) * 2022-08-11 2022-11-01 河南省农业科学院农业经济与信息研究所 Intensive target detection model training method based on attention mechanism
CN115588126A (en) * 2022-09-29 2023-01-10 长三角信息智能创新研究院 GAM, CARAFE and SnIoU fused vehicle target detection method
CN115546499A (en) * 2022-10-12 2022-12-30 中国人民解放军陆军炮兵防空兵学院 Progressive auxiliary target detection method and system based on CNN and ViT fusion
CN115841608A (en) * 2022-11-02 2023-03-24 国网青海省电力公司海北供电公司 Multi-chamber lightning arrester identification method based on improved YOLOX
CN115690627A (en) * 2022-11-03 2023-02-03 安徽大学 Method and system for detecting aerial image rotating target
CN115861853A (en) * 2022-11-22 2023-03-28 西安工程大学 Transmission line bird nest detection method in complex environment based on improved yolox algorithm
CN116052218A (en) * 2023-02-13 2023-05-02 中国矿业大学 Pedestrian re-identification method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
XU Zhijing et al., "Small target detection of remote sensing ships based on dual feature enhancement", Acta Optica Sinica, vol. 42, no. 18, 30 September 2022 (2022-09-30), page 2 *

Similar Documents

Publication Publication Date Title
CN110084292B (en) Target detection method based on DenseNet and multi-scale feature fusion
CN113269073B (en) Ship multi-target tracking method based on YOLO V5 algorithm
Lee et al. Simultaneous traffic sign detection and boundary estimation using convolutional neural network
CN109829398B (en) Target detection method in video based on three-dimensional convolution network
CN108052942B (en) Visual image recognition method for aircraft flight attitude
CN110660052A (en) Hot-rolled strip steel surface defect detection method based on deep learning
CN111368769B (en) Ship multi-target detection method based on improved anchor point frame generation model
CN110610210B (en) Multi-target detection method
CN110647802A (en) Remote sensing image ship target detection method based on deep learning
CN108711172B (en) Unmanned aerial vehicle identification and positioning method based on fine-grained classification
CN112949380B (en) Intelligent underwater target identification system based on laser radar point cloud data
CN110008899B (en) Method for extracting and classifying candidate targets of visible light remote sensing image
CN115829991A (en) Steel surface defect detection method based on improved YOLOv5s
CN113033315A (en) Rare earth mining high-resolution image identification and positioning method
CN110110618A (en) A kind of SAR target detection method based on PCA and global contrast
CN113516053A (en) Ship target refined detection method with rotation invariance
CN110866931B (en) Image segmentation model training method and classification-based enhanced image segmentation method
CN116128883A (en) Photovoltaic panel quantity counting method and device, electronic equipment and storage medium
CN113284185B (en) Rotating target detection method for remote sensing target detection
CN113496260B (en) Grain depot personnel non-standard operation detection method based on improved YOLOv3 algorithm
CN110826575A (en) Underwater target identification method based on machine learning
CN116681983A (en) Long and narrow target detection method based on deep learning
CN116051808A (en) YOLOv 5-based lightweight part identification and positioning method
WO2023273337A1 (en) Representative feature-based method for detecting dense targets in remote sensing image
CN116246096A (en) Point cloud 3D target detection method based on foreground reinforcement knowledge distillation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination