CN111985284A - Single-stage target detection device without anchor box based on attention mechanism and semantic weak supervision

Publication number
CN111985284A
Authority
CN
China
Prior art keywords
target detection
semantic
attention mechanism
weak supervision
detection
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910443385.1A
Other languages
Chinese (zh)
Inventor
胡志强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University of Science and Technology
Original Assignee
Tianjin University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-05-21
Filing date
2019-05-21
Publication date
2020-11-24
Application filed by Tianjin University of Science and Technology
Priority to CN201910443385.1A
Publication of CN111985284A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to an anchor-box-free single-stage target detection device based on an attention mechanism and weak semantic supervision. It improves the accuracy of the target detection algorithm by adopting a new method that estimates object center points from a heat map. Compared with conventional methods, it greatly reduces the influence of manually designed anchor box structures on detection precision and allows better features to be used to train the deep convolutional neural network, thereby achieving higher precision. The method provides a theoretical basis for future single-stage target detection frameworks.

Description

Single-stage target detection device without anchor box based on attention mechanism and semantic weak supervision
Technical Field
The invention belongs to the technical field of image recognition, and particularly relates to a target recognition algorithm based on deep learning.
Background
Object recognition can be applied in many fields, such as driver assistance and autonomous driving, where it is used to recognize motor vehicles and pedestrians on the road. In recent years, target recognition algorithms have gradually evolved from manually designed image features followed by machine-learning classifiers to deep-learning-based methods. Deep-learning-based target detection algorithms are further divided into single-stage and two-stage algorithms.
Girshick et al. proposed the two-stage Faster R-CNN target detection algorithm. In the first stage, an RPN (Region Proposal Network) detects regions of the image where an object may exist; in the second stage, the deep features of those regions are used for object classification and position regression. In contrast to the two-stage approach, W. Liu et al. proposed a single-stage target detection algorithm, in which manually designed anchor boxes traverse the whole feature map and objects are finally detected by feature classification and position regression.
In summary, the performance of existing target detection algorithms depends heavily on the anchor box design and the choice of hyper-parameters, and the anchor box design greatly limits further improvement in detection precision.
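To make this dependence concrete, the following minimal sketch shows how an anchor-based detector typically tiles anchor boxes over a feature map. The function name `make_anchors` and the stride, scale and aspect-ratio values are illustrative assumptions and do not come from this patent; the point is only that each of them is a hand-chosen hyper-parameter.

```python
from itertools import product
from math import sqrt


def make_anchors(feat_h, feat_w, stride, scales, ratios):
    """Tile (cx, cy, w, h) anchors over a feat_h x feat_w feature map.

    All arguments are hand-picked hyper-parameters (hypothetical values below).
    """
    anchors = []
    for y, x in product(range(feat_h), range(feat_w)):
        # Centre of the feature map cell, expressed in image pixels.
        cx, cy = (x + 0.5) * stride, (y + 0.5) * stride
        for s, r in product(scales, ratios):
            # r is the width/height ratio; the area stays s * s for every r.
            anchors.append((cx, cy, s * sqrt(r), s / sqrt(r)))
    return anchors


anchors = make_anchors(4, 4, stride=16, scales=(32, 64), ratios=(0.5, 1.0, 2.0))
print(len(anchors))  # 4 * 4 cells * 2 scales * 3 ratios = 96
```

Even this tiny 4 x 4 map yields 96 candidate boxes, and changing any scale or ratio changes which ground-truth objects can be matched well.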
Disclosure of Invention
The invention aims to overcome the shortcomings of the prior art by improving on conventional target detection algorithms that rely on manually designed anchor boxes. An attention mechanism module is added after the detection feature map, so that the weights of feature map regions containing a target are increased and the weights of regions without a target are decreased.
A second innovation is that, to preserve the resolution of the heat map, the heat map is generated on the deconvolved feature map. Feature maps from different stages are fused in a pyramid fashion, combining information that has high resolution but weak semantics with information that has low resolution but strong semantics. This can greatly improve detection precision.
A third innovation is a new algorithm that uses box-level semantic segmentation as weak supervision to guide the target detection module, without introducing new annotation information. The weak semantic information gives the detection network a more global view and, from a different angle, makes the feature extraction network concentrate on the feature map regions where targets are located, thereby achieving better detection results.
To achieve this purpose, the technical scheme of the invention comprises the following steps:
Step 1: Extract input image features using a generic feature extraction backbone network (e.g., VGG, ResNet, GoogLeNet, MobileNet, ShuffleNet, etc.).
Step 2: Extract feature maps of different scales.
Step 3: Successively upsample the feature maps of different scales to the same scale by deconvolution.
Step 4: Successively fuse the deconvolved feature maps.
Step 5: Perform target detection on each fused feature map.
Step 6: Detection head output: 1. a heat map used to estimate the center position of each object; 2. a classification output used to estimate the object class; 3. the width, height and position refinement of each object; 4. box-level semantic information that assists the object detection process.
Step 7: Apply non-maximum suppression to the detection results output by each DP in Step 6 to obtain the final detection result.
The results of the invention provide a theoretical basis for designing a fast, reliable and easily trained target detection algorithm.
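As an illustration of the last two steps, the sketch below decodes a center heat map into boxes and then applies non-maximum suppression. It is a simplified single-class example under stated assumptions, not the patented implementation: the 3 x 3 peak neighbourhood, the score threshold, the stride of 4, the IoU threshold and all function names are hypothetical, and the separate classification output is omitted.

```python
def local_peaks(heat, thresh=0.3):
    """Return (score, row, col) for cells that are 3x3 local maxima >= thresh."""
    h, w = len(heat), len(heat[0])
    peaks = []
    for r in range(h):
        for c in range(w):
            v = heat[r][c]
            if v < thresh:
                continue
            neigh = [heat[i][j]
                     for i in range(max(0, r - 1), min(h, r + 2))
                     for j in range(max(0, c - 1), min(w, c + 2))]
            if v >= max(neigh):
                peaks.append((v, r, c))
    return peaks


def decode(heat, size, offset, stride=4, thresh=0.3):
    """Turn heat map peaks into (score, x1, y1, x2, y2) boxes in image pixels."""
    boxes = []
    for score, r, c in local_peaks(heat, thresh):
        w, h = size[r][c]        # predicted width and height at the peak
        dx, dy = offset[r][c]    # sub-cell position refinement
        cx, cy = (c + dx) * stride, (r + dy) * stride
        boxes.append((score, cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return boxes


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, iou_thresh=0.5):
    """Greedy non-maximum suppression, highest score first."""
    kept = []
    for box in sorted(boxes, reverse=True):
        if all(iou(box[1:], k[1:]) < iou_thresh for k in kept):
            kept.append(box)
    return kept


heat = [[0.0, 0.1, 0.0, 0.0],
        [0.1, 0.9, 0.5, 0.0],
        [0.0, 0.2, 0.1, 0.8],
        [0.0, 0.0, 0.0, 0.1]]
size = [[(0.0, 0.0)] * 4 for _ in range(4)]
size[1][1], size[2][3] = (8.0, 6.0), (4.0, 4.0)
offset = [[(0.5, 0.5)] * 4 for _ in range(4)]

boxes = decode(heat, size, offset)
# A near-duplicate detection, e.g. from another scale, is merged in and suppressed.
final = nms(boxes + [(0.6, 2.5, 3.0, 10.0, 9.5)])
print(final)
```

The cell with score 0.5 is not reported because its neighbour scores 0.9, and the lower-scoring duplicate box is removed by suppression, leaving two detections.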
Drawings
FIG. 1 is a block diagram of the overall object detection algorithm of the present invention.
FIG. 2 shows details of the detection head and output features.
FIG. 3 is a schematic diagram of a target detection result, in which target center point information is estimated.
FIG. 4 is a specific example of weak semantic supervision information.
FIG. 5 is a schematic illustration of an attention mechanism.
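The feature fusion and attention re-weighting described in the Disclosure can be sketched on toy 2-D maps as follows. Several simplifications are assumptions made for illustration only: nearest-neighbour upsampling stands in for the learned deconvolution, fusion is an element-wise sum, and attention is a per-location sigmoid gate. The patent text does not specify these choices, and the function names are hypothetical.

```python
from math import exp


def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling (stand-in for a learned deconvolution)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def fuse(fine, coarse):
    """Element-wise sum of a fine map and the upsampled coarse map."""
    up = upsample2x(coarse)
    return [[f + u for f, u in zip(fr, ur)] for fr, ur in zip(fine, up)]


def attend(fmap, logits):
    """Scale each location by a sigmoid attention weight in (0, 1)."""
    return [[v / (1.0 + exp(-a)) for v, a in zip(fr, ar)]
            for fr, ar in zip(fmap, logits)]


coarse = [[1.0, 0.0]]                   # low resolution, strong semantics
fine = [[0.5] * 4 for _ in range(2)]    # high resolution, weak semantics
fused = fuse(fine, coarse)

# Hypothetical attention logits: high on the left half (target), low on the right.
logits = [[4.0, 4.0, -4.0, -4.0] for _ in range(2)]
weighted = attend(fused, logits)
print(fused[0], [round(v, 3) for v in weighted[0]])
```

With a sigmoid gate the weights never exceed 1, so regions containing a target are emphasised relative to the background rather than amplified in absolute terms.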

Claims (7)

1. An anchor-box-free single-stage target detection device based on an attention mechanism and semantic weak supervision, characterized by comprising the following steps:
Step 1: Extract input image features using a generic feature extraction backbone network (e.g., VGG, ResNet, GoogLeNet, MobileNet, ShuffleNet, etc.).
Step 2: Extract feature maps of different scales.
Step 3: Successively upsample the feature maps of different scales to the same scale by deconvolution.
Step 4: Successively fuse the deconvolved feature maps.
Step 5: Add an attention mechanism module after each fused feature map and then perform target detection.
Step 6: Detection head output: 1. a heat map used to estimate the center position of each object; 2. a classification output used to estimate the object class; 3. the width, height and position refinement of each object; 4. a weakly supervised semantic output.
Step 7: Apply non-maximum suppression to the detection results output by each DP in Step 6 to obtain the final detection result.
2. The deconvolution processing method of the feature map according to step 3 of claim 1.
3. The method for using the deconvolution feature map fusion processing mode and the attention module according to the method of claim 2 and step 4.
4. The detection head according to claim 3 or 6 outputs a heat map, and takes the local extreme points of the heat map as the centers of objects.
5. The detector head output structure of claim 4 or 6.
6. The weakly supervised semantic segmentation branch of claim 5 step 6.
7. The detection head of claim 6, step 6, outputting object width and height and position refinements.
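One possible way to obtain a box-level semantic target without new annotation is to rasterise the existing bounding boxes onto the output grid, as sketched below. This is an assumption about how such a weak supervision mask could be built, not a description taken from the claims; the stride of 4, the class-index encoding and the name `box_mask` are hypothetical.

```python
def box_mask(boxes, img_h, img_w, stride=4):
    """Rasterise (x1, y1, x2, y2, class_id) boxes into a class-index grid.

    0 is background and class ids start at 1. Later boxes overwrite earlier
    ones where they overlap.
    """
    gh, gw = img_h // stride, img_w // stride
    mask = [[0] * gw for _ in range(gh)]
    for x1, y1, x2, y2, cls in boxes:
        # Floor the top-left corner and ceil the bottom-right corner so that
        # every grid cell touched by the box is labelled.
        c0, c1 = max(0, int(x1 // stride)), min(gw, int(-(-x2 // stride)))
        r0, r1 = max(0, int(y1 // stride)), min(gh, int(-(-y2 // stride)))
        for r in range(r0, r1):
            for c in range(c0, c1):
                mask[r][c] = cls
    return mask


mask = box_mask([(2.0, 3.0, 10.0, 9.0, 1)], img_h=16, img_w=16)
for row in mask:
    print(row)
```

Such a mask labels everything inside a box as the object class, including background pixels, which is why the supervision it provides is only weak.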
CN201910443385.1A 2019-05-21 2019-05-21 Single-stage target detection device without anchor box based on attention mechanism and semantic weak supervision Pending CN111985284A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910443385.1A CN111985284A (en) 2019-05-21 2019-05-21 Single-stage target detection device without anchor box based on attention mechanism and semantic weak supervision

Publications (1)

Publication Number Publication Date
CN111985284A (en) 2020-11-24

Family

ID=73436857

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910443385.1A Pending CN111985284A (en) 2019-05-21 2019-05-21 Single-stage target detection device without anchor box based on attention mechanism and semantic weak supervision

Country Status (1)

Country Link
CN (1) CN111985284A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113255759A (en) * 2021-05-20 2021-08-13 广州广电运通金融电子股份有限公司 Attention mechanism-based in-target feature detection system, method and storage medium
CN113255759B (en) * 2021-05-20 2023-08-22 广州广电运通金融电子股份有限公司 In-target feature detection system, method and storage medium based on attention mechanism

Similar Documents

Publication Publication Date Title
Dharneeshkar et al. Deep Learning based Detection of potholes in Indian roads using YOLO
CN109615016B (en) Target detection method of convolutional neural network based on pyramid input gain
Börcs et al. Instant object detection in lidar point clouds
CN103927526B (en) Vehicle detecting method based on Gauss difference multi-scale edge fusion
CN102646199B (en) Motorcycle type identifying method in complex scene
CN105046196A (en) Front vehicle information structured output method base on concatenated convolutional neural networks
CN102073852B (en) Multiple vehicle segmentation method based on optimum threshold values and random labeling method for multiple vehicles
JP6095817B1 (en) Object detection device
CN114627437B (en) Traffic target identification method and system
CN112270383B (en) Tunnel large-scale rivet hole extraction method based on full convolution neural network
CN111985286A (en) Target detection algorithm without anchor box based on Gaussian thermodynamic diagram attention mechanism and semantic weak supervision
Aneesh et al. Real-time traffic light detection and recognition based on deep retinanet for self driving cars
CN109543498B (en) Lane line detection method based on multitask network
KR20180062683A (en) Apparatus and Method for Detecting Vehicle using Image Pyramid
Mijić et al. Traffic sign detection using YOLOv3
CN111985284A (en) Single-stage target detection device without anchor box based on attention mechanism and semantic weak supervision
CN111985493A (en) Anchor-box-free target detection algorithm based on Gaussian thermodynamic diagram and attention mechanism
CN113701642A (en) Method and system for calculating appearance size of vehicle body
CN117197146A (en) Automatic identification method for internal defects of castings
CN111986140A (en) Single-stage face detection device without anchor box and using semantic information weak supervision
CN111985285A (en) Target detection device without anchor box based on Gaussian attention mechanism and semantic weak supervision
CN111985515A (en) Single-stage target detection algorithm using semantic information weak supervision deconvolution feature layer fusion
CN111985516A (en) Single-stage target detection algorithm based on attention mechanism
CN111985288A (en) Single-stage unmanned vehicle detection device without anchor box
CN111985287A (en) Quick target detection device based on Gaussian thermodynamic diagram

Legal Events

Date Code Title Description
PB01 Publication
DD01 Delivery of document by public notice

Addressee: Hu Zhiqiang

Document name: Notice of publication of application for patent for invention

WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20201124