CN114154563A - Target detection method based on hybrid supervised training


Info

Publication number
CN114154563A
CN114154563A (Application No. CN202111355318.8A)
Authority
CN
China
Prior art keywords
prediction
training
peak
class
loss function
Prior art date
Legal status
Pending
Application number
CN202111355318.8A
Other languages
Chinese (zh)
Inventor
李甲
穆凯
齐云山
赵沁平
Current Assignee
Beihang University
Original Assignee
Beihang University
Priority date
Filing date
Publication date
Application filed by Beihang University filed Critical Beihang University
Priority to CN202111355318.8A
Publication of CN114154563A
Legal status: Pending

Classifications

    • G: PHYSICS
      • G06: COMPUTING; CALCULATING OR COUNTING
        • G06F: ELECTRIC DIGITAL DATA PROCESSING
          • G06F 18/00: Pattern recognition
            • G06F 18/20: Analysing
              • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
                • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
              • G06F 18/24: Classification techniques
                • G06F 18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
              • G06F 18/25: Fusion techniques
                • G06F 18/253: Fusion techniques of extracted features
        • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
          • G06N 3/00: Computing arrangements based on biological models
            • G06N 3/02: Neural networks
              • G06N 3/04: Architecture, e.g. interconnection topology
                • G06N 3/045: Combinations of networks
              • G06N 3/08: Learning methods
                • G06N 3/084: Backpropagation, e.g. using gradient descent


Abstract

The invention provides a target detection method based on hybrid supervised training. Starting from an analysis of the annotation strategies used to build training data sets for object detectors, the method trains the detector on a mixture of a small amount of fully annotated data and a large amount of weakly supervised (class-label-only) annotated data. During training, a peak class activation response mechanism models the mapping from image-level class labels to coarse-grained location information for the weakly annotated data and thereby assists the training of the detection branch, while the classification and localization branches of the model are trained on the fully annotated data. Finally, the results of the two branches are adaptively fused, which improves the performance of the target detector. On the one hand, the invention provides a training method for a hybrid supervised target detector based on peak class activation response that markedly reduces annotation and training cost while preserving performance; on the other hand, the method can be combined with existing target detectors, markedly reducing training cost and improving detection performance to a certain extent.

Description

Target detection method based on hybrid supervised training
Technical Field
The invention relates to the field of computer vision and multimedia analysis, in particular to a target detection method based on hybrid supervised training.
Background
Object detection is a fundamental task in computer vision: given a set of classes, the goal is to locate every object of those classes in an input image and output a rectangular bounding box for each. Object detectors are commonly divided into one-stage and two-stage detectors. Two-stage detectors follow the R-CNN architecture proposed by Girshick et al. at UC Berkeley, in which regions of interest are first generated by a low-level computer vision algorithm and then classified and localized. SPPNet, proposed by He et al. at Microsoft Research, and Fast R-CNN, proposed by Girshick at Microsoft Research, compute the feature map only once and extract region features through spatial pyramid pooling or RoI pooling, effectively reducing redundant computation. Faster R-CNN, proposed by Ren et al. of the University of Science and Technology of China, further improves performance by replacing the time-consuming region proposal algorithm with a region proposal network. R-FCN, proposed by Dai et al. at Microsoft Research, avoids per-region computation by generating position-sensitive score maps with a fully convolutional network. Mask R-CNN, proposed by He et al. at Facebook AI Research, effectively addresses coarse spatial quantization with a region-of-interest alignment layer. FPN, proposed by Lin et al. at Facebook AI Research, fuses low-resolution, semantically strong features with high-resolution, semantically weak features through a top-down pathway and skip connections, alleviating the problem of scale variation. Two-stage detectors generally achieve better detection performance but incur a larger computational overhead and often fail to meet real-time requirements. To address this, one-stage detectors avoid the time-consuming proposal generation step and directly classify predefined detection boxes; examples include YOLO by Redmon et al. at the University of Washington and the SSD model by Liu et al. at the University of North Carolina at Chapel Hill.
Existing object detectors are usually trained with full supervision, i.e., the data set annotates both the class and the bounding box of every object. Such annotation is expensive and time-consuming, and in some settings, such as medical imaging, it is difficult to obtain. In complex and dense scenes in particular, the number of object instances is large, objects are densely distributed and heavily occlude one another, so bounding box annotation is costly and the training cost is high. A separate line of research proposes weakly supervised methods, in which the data set only annotates the classes that appear in each image without bounding boxes; this annotation scheme markedly reduces labeling cost. However, existing weakly supervised methods are typically formulated as multiple-instance learning and, lacking explicit location supervision, their performance generally falls far short of fully supervised detectors. Therefore, hybrid supervised training, i.e., training the detection network with a small amount of fully supervised annotation together with a large amount of easily obtained weakly supervised annotation, can substantially reduce the training cost while preserving performance.
The present invention is the first to provide a lightweight hybrid supervised object detection training method that greatly reduces the training and annotation cost of the model while achieving comparable performance, and that is clearly superior to existing methods at the same annotation cost.
Disclosure of Invention
In view of the above practical needs and key problems, the present invention aims to provide a target detection method based on hybrid supervised training, in which a small amount of fully supervised annotated data is used to train the classification and regression heads of the model, and a large amount of low-cost weakly supervised annotated data is used to train the classification head. For weakly annotated data the model trains the classification branch of the classification head and, at the same time, introduces a peak class activation response mechanism that models the mapping from classification information to coarse-grained location information; in the test stage, the extracted coarse-grained location information is fused with the original location feature map, which enhances the response at object locations while suppressing noise.
The invention comprises the following 3 steps:
step S100, for a weakly annotated image of the training data set, calculating the loss between the class labels and the model classification prediction using the network loss function, and minimizing the loss function by gradient back-propagation to train the classification branch of the model;
step S200, for a fully annotated image of the training data set, calculating the loss between the class labels and the classification prediction and the loss between the location labels and the localization prediction using the network loss function, minimizing the loss function by gradient back-propagation, and training the classification and localization branches of the model;
and step S300, for the image to be detected, performing forward computation with the convolutional neural network whose weights were trained by the above method, shifting the result of the peak class activation response branch and fusing it into the center point detection branch, and obtaining the prediction boxes from the enhanced detection features.
Drawings
FIG. 1 is a flow chart of a hybrid supervised training based target detection method of the present invention;
FIG. 2 is a block diagram of the hybrid supervised training based target detection method of the present invention;
FIG. 3 is a training strategy diagram of the hybrid supervised training based target detection method of the present invention;
FIG. 4 is a fusion detection diagram of the target detection method based on hybrid supervised training of the present invention.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.
It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings. The embodiments and features of the embodiments in the present disclosure may be combined with each other without conflict.
It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.
It is noted that the modifiers "a", "an", and "the" in this disclosure are intended to be illustrative rather than limiting, and those skilled in the art will understand that they should be read as "one or more" unless the context clearly indicates otherwise.
The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information. The following detailed description of embodiments of the invention is provided in connection with the accompanying drawings. The following examples or figures are illustrative of the present invention and are not intended to limit the scope of the present invention.
FIG. 1 is a flow chart of a target detection method based on hybrid supervised training, which includes the following steps:
step S100, for a weakly annotated image of the training data set, calculating the loss between the class labels and the model classification prediction using the network loss function, and minimizing the loss function by gradient back-propagation to train the classification branch of the model;
step S200, for a fully annotated image of the training data set, calculating the loss between the class labels and the classification prediction and the loss between the location labels and the localization prediction using the network loss function, minimizing the loss function by gradient back-propagation, and training the classification and localization branches of the model;
and step S300, for the image to be detected, performing forward computation with the convolutional neural network whose weights were trained by the above method, shifting the result of the peak class activation response branch and fusing it into the center point detection branch, and forming the final detection boxes from the enhanced detection heat map prediction, the length-width prediction and the center point offset prediction, as sketched in the illustrative code below.
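The following is a minimal, non-limiting PyTorch-style sketch of this mixed training loop, given only to illustrate how weakly and fully annotated batches could be routed to the two loss terms of steps S100 and S200. The names model, weak_loader, full_loader, weak_loss, full_loss and all hyper-parameters are assumptions introduced for illustration and are not part of the disclosure.

```python
import torch

def train_hybrid(model, weak_loader, full_loader, weak_loss, full_loss,
                 epochs=12, lr=1.25e-4):
    """Hypothetical mixed-supervision loop: weakly annotated batches drive the
    classification branches (step S100); fully annotated batches drive the
    classification and localization branches (step S200)."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for (weak_imgs, labels_w), (full_imgs, labels_f, boxes) in zip(weak_loader, full_loader):
            # Step S100: class-label-only supervision on the classification branch.
            out_w = model(weak_imgs)
            loss_w = weak_loss(out_w, labels_w)
            # Step S200: class + bounding-box supervision on classification and localization.
            out_f = model(full_imgs)
            loss_f = full_loss(out_f, labels_f, boxes)
            opt.zero_grad()
            (loss_w + loss_f).backward()   # minimize by gradient back-propagation
            opt.step()
```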
Referring to fig. 2, a frame diagram of the target detection method based on hybrid supervised training of the present invention and fig. 3, a training strategy diagram of the target detection method based on hybrid supervised training of the present invention, the target detection method based on hybrid supervised training of the present invention includes the following steps in the training process:
and S100, calculating the loss of class labels and model classification prediction by using a network loss function for the weakly labeled image of the training data set, and minimizing the classification branch of the loss function training model by using a gradient back propagation method.
The weakly supervised training images and the corresponding class labels are used to train the peak class activation response branch and the classification capability of the central heat map prediction of the center point detection branch, where a weakly annotated image is an image annotated only with class labels and without bounding box labels. The classification loss function for weakly supervised training, $\mathcal{L}_{cls}^{weak}$, is:

$$\mathcal{L}_{cls}^{weak} = \mathrm{BCE}(s_{aggr}, \mathrm{label}) + \mathrm{BCE}(\mathrm{MaxPool}(\hat{Y}), \mathrm{label})$$

where $s_{aggr}$ is the peak aggregate response confidence,

$$s_{aggr} = \frac{1}{N_c} \sum_{k=1}^{N_c} M^{c}_{(i_k, j_k)}$$

where $M^{c}_{(i_k, j_k)}$ denotes the response at the $k$-th peak point of the peak class activation response map, $(i_k, j_k)$ denotes the position of the $k$-th peak point, $N_c$ denotes the number of peak points, label is the class label vector in the data set annotation, and BCE is the binary cross entropy loss function. MaxPool denotes the max pooling operation that pools the $N \times C \times H \times W$ prediction tensor $\hat{Y}$ into an $N \times C$ class prediction vector. Here, the peak aggregate response confidence is the classification confidence obtained by aggregating the predictions of the peak class activation response branch.
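A minimal sketch of this weakly supervised classification loss is given below, assuming the peak-point responses and the center heat map prediction are already available as tensors. The function and tensor names are illustrative, and the sigmoid used to map responses to probabilities is an assumption of the sketch, not a statement of the disclosed implementation.

```python
import torch
import torch.nn.functional as F

def weak_classification_loss(peak_responses, heatmap_pred, label):
    """peak_responses: (N, C, Nc) responses at the Nc peak points per class (raw scores).
    heatmap_pred:   (N, C, H, W) center heat map prediction (raw scores).
    label:          (N, C) multi-hot class label vector."""
    # Peak aggregate response confidence: average the peak-point responses per class,
    # then squash to [0, 1] (assumption) so BCE applies.
    s_aggr = torch.sigmoid(peak_responses.mean(dim=2))
    # MaxPool the N x C x H x W prediction into an N x C class prediction vector.
    cls_pred = torch.sigmoid(heatmap_pred.amax(dim=(2, 3)))
    return F.binary_cross_entropy(s_aggr, label) + F.binary_cross_entropy(cls_pred, label)
```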
In step S200, for a fully annotated image of the training data set, the loss between the class labels and the classification prediction and the loss between the location labels and the localization prediction are calculated with the network loss function, the loss function is minimized by gradient back-propagation, and the classification and localization branches of the model are trained.
Starting from the model trained in step S100, the class annotation of the fully annotated images is used to train the peak class activation response branch and the classification part of the center point detection branch, and the fully supervised data is used to train the prediction heads of the center point detection branch of the proposed model, where a fully annotated image is an image annotated with both a class label and a bounding box label. The loss function of the center point detection branch, $\mathcal{L}_{det}$, is as follows:

$$\mathcal{L}_{det} = \mathrm{FocalLoss}(\hat{Y}, Y) + \mathrm{L1}(\hat{S}, (w_i, h_i)) + \mathrm{L1}(\hat{O}, (\delta w_i, \delta h_i)) + \mathrm{L1}(\hat{P}, (\delta px_i, \delta py_i))$$

where $\hat{Y}$ denotes the center point heat map prediction, $\hat{S}$ denotes the length-width size prediction, $\hat{O}$ denotes the center point offset prediction, and $\hat{P}$ denotes the offset-to-center prediction; $Y$ denotes the heat map generated by the CenterNet algorithm from the data set annotations, $(w_i, h_i)$ denotes the length-width size generated by the CenterNet algorithm from the data set annotations, $(\delta w_i, \delta h_i)$ denotes the center point offset generated by the CenterNet algorithm from the data set annotations, and $(\delta px_i, \delta py_i)$ denotes the learning target of the offset to the center generated by the CenterNet algorithm from the data set annotations. The prediction heads are trained with the focal loss (FocalLoss) and the L1 distance loss, respectively: $\mathrm{FocalLoss}(\hat{Y}, Y)$ is the loss of the center heat map prediction, $\mathrm{L1}(\hat{P}, (\delta px_i, \delta py_i))$ is the loss of the offset-to-center prediction, $\mathrm{L1}(\hat{O}, (\delta w_i, \delta h_i))$ is the loss of the center point offset prediction, and $\mathrm{L1}(\hat{S}, (w_i, h_i))$ is the loss of the length-width size prediction. GT labels denote the data set class labels, GT boxes denote the data set bounding box labels, and $\mathrm{BCE}(s_{aggr}, \mathrm{label})$ denotes the loss of the peak aggregate response confidence prediction.

For the peak class activation response branch, the classification head is also trained with the fully supervised data; during fully supervised training the loss function of the peak class activation response branch, $\mathcal{L}_{peak}$, is:

$$\mathcal{L}_{peak} = \mathrm{BCE}(s_{aggr}, \mathrm{label})$$

where $s_{aggr}$ is the peak aggregate response confidence, label is the class label vector in the data set annotation, and BCE is the cross entropy loss function.

The overall loss function for fully supervised training, $\mathcal{L}^{full}$, is:

$$\mathcal{L}^{full} = \mathcal{L}_{det} + \mathcal{L}_{peak}$$
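A minimal sketch of this fully supervised loss is given below, assuming CenterNet-style targets have already been rendered from the ground-truth boxes. The focal loss here is a standard penalty-reduced variant, the L1 terms are written densely for brevity (in practice they would be masked to annotated center locations), and all names are assumptions of the sketch.

```python
import torch
import torch.nn.functional as F

def center_focal_loss(pred, gt, alpha=2, gamma=4, eps=1e-6):
    """Penalty-reduced pixelwise focal loss on the center heat map (CenterNet-style)."""
    pos = gt.eq(1).float()
    neg = 1.0 - pos
    pred = pred.clamp(eps, 1 - eps)
    pos_loss = -((1 - pred) ** alpha) * torch.log(pred) * pos
    neg_loss = -((1 - gt) ** gamma) * (pred ** alpha) * torch.log(1 - pred) * neg
    n_pos = pos.sum().clamp(min=1)
    return (pos_loss.sum() + neg_loss.sum()) / n_pos

def full_supervised_loss(hm_pred, wh_pred, off_pred, coff_pred,
                         hm_gt, wh_gt, off_gt, coff_gt, s_aggr, label):
    """L_full = L_det (heat map + size + center offset + offset-to-center) + L_peak (BCE)."""
    l_det = (center_focal_loss(hm_pred, hm_gt)
             + F.l1_loss(wh_pred, wh_gt)       # length-width size
             + F.l1_loss(off_pred, off_gt)     # center point offset
             + F.l1_loss(coff_pred, coff_gt))  # offset to center
    l_peak = F.binary_cross_entropy(s_aggr, label)
    return l_det + l_peak
```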
referring to fig. 4, referring to a fusion detection diagram of the target detection method based on hybrid supervised training of the present invention in fig. 4, the target detection method based on hybrid supervised training of the present invention includes the following steps in the test inference detection process:
and step S300, performing forward calculation on the detection image by using the convolutional neural network with the trained network weight through the methods of the steps S100 and S200, fusing the result of the peak type activation response branch into a central point detection branch after deviation, and forming a final detection frame by using the enhanced detection heat map prediction, the length and width prediction and the central point deviation prediction.
In the detection process, the hybrid supervised model trained in the above steps is used. The peak class activation response branch produces the class activation responses, and the center point detection branch produces the heat map prediction, the length-width size prediction, the center point offset prediction and the offset-to-center prediction. Each peak point is then shifted by the predicted offset to the center, which moves it close to an object center, and its response is fused with the response at the corresponding position of the center heat map, yielding an enhanced center point heat map. The enhanced center point heat map prediction, the length-width prediction and the center point offset prediction together form the final detection boxes.
For the peak class activation response branch, a class activation response map is constructed. The partial derivative of the class probability $y^c$ output by the last classification layer with respect to every pixel $A^k_{i,j}$ of the current layer feature map is computed:

$$\frac{\partial y^c}{\partial A^k_{i,j}}$$

where $y^c$ is the classification probability output for class $C$ and $A^k_{i,j}$ is the pixel at position $(i, j)$ on the $k$-th channel of the feature map $A$. The partial derivatives are averaged over the spatial dimensions to obtain the weight coefficient of class $C$ for each channel:

$$\alpha^c_k = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^c}{\partial A^k_{i,j}}$$

which gives the contribution weight $\alpha^c_k$ of the features of channel $k$ to class $C$, where $Z = i \times j$ is the number of pixels. The weights and the feature maps are combined by weighted summation (a linear combination) and passed through the activation function ReLU to obtain the class activation response map:

$$M^c = \mathrm{ReLU}\left(\sum_{k} \alpha^c_k A^k\right)$$

where $M^c$ is the class activation response heat map of class $C$ and $A^k$ denotes that the operation runs over all channels $k$ of the feature map $A$.

Peak points are selected on the class activation response map as the output of the peak class activation response; a series of local maxima within a given neighborhood window are selected with a max pooling operation:

$$P^{c} = \left\{ (i_k, j_k) \;\middle|\; M^{c}_{i_k, j_k} = \max_{(i,j) \in \mathcal{N}(i_k, j_k)} M^{c}_{i,j} \right\}, \quad k = 1, \dots, N_c$$

The positions of the local maxima of the class activation response map of class $C$ are obtained with a max pooling sliding window, where $N_c$ denotes the number of local maxima for class $C$. Here the max pooling sliding window is the operation of taking the neighborhood maximum by max pooling: the neighborhood window $\mathcal{N}(i_k, j_k)$ is the square region extending a certain range $r$ up, down, left and right of the current pixel, and the maximum value inside the window is obtained by the sampling function.
For the image to be detected, the center point detection branch predicts the heat map $\hat{Y}$, the length-width size $\hat{S}$, the center point offset $\hat{O}$, and the offset to the center $\hat{P}$. Forward computation is performed with the convolutional neural network whose weights have been trained, and the result of the peak class activation response branch is shifted and then fused into the center point detection branch:

$$\tilde{Y}_{p_k + \hat{P}_{p_k}} = \hat{Y}_{p_k + \hat{P}_{p_k}} + \beta \, M^{c}_{p_k}$$

where $\tilde{Y}$ is the fusion-enhanced center point heat map, $p_k = (i_k, j_k)$ denotes the position of an output peak point, $M^{c}_{p_k}$ denotes the class activation response at the corresponding peak point position, $\hat{P}_{p_k}$ denotes the predicted offset from that point to the center point, and $\beta$ is a hyper-parameter controlling the proportion of the peak class activation response in the whole fusion process.
Finally, the points with high response in the fusion-enhanced center point heat map prediction $\tilde{Y}$ are selected to form the final detection boxes:

$$\hat{B}_i = \left(\hat{x}_i + \delta \hat{x}_i - \frac{\hat{w}_i}{2},\;\; \hat{y}_i + \delta \hat{y}_i - \frac{\hat{h}_i}{2},\;\; \hat{x}_i + \delta \hat{x}_i + \frac{\hat{w}_i}{2},\;\; \hat{y}_i + \delta \hat{y}_i + \frac{\hat{h}_i}{2}\right)$$

where $\hat{w}_i$ denotes the width of the prediction bounding box centered at the $i$-th position, $\hat{h}_i$ denotes the height of the prediction bounding box centered at the $i$-th position, $\hat{x}_i$ denotes the abscissa of the prediction bounding box centered at the $i$-th position, $\hat{y}_i$ denotes the ordinate of the prediction bounding box centered at the $i$-th position, and $(\delta \hat{x}_i, \delta \hat{y}_i)$ denote the predicted offsets of the x and y coordinates, respectively.
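A minimal sketch of decoding the final boxes from the enhanced heat map, following the CenterNet-style formula above, is given below. The top-k selection, the score threshold, the 3x3 peak suppression and the tensor names are assumptions of the sketch, and the boxes are returned in feature-map (output-stride) coordinates.

```python
import torch
import torch.nn.functional as F

def decode_boxes(fused_hm, wh, offset, k=100, score_thresh=0.3):
    """fused_hm: (C, H, W) fusion-enhanced center heat map Y_tilde.
    wh:       (2, H, W) length-width prediction.  offset: (2, H, W) center point offset.
    Returns a list of (x1, y1, x2, y2, score, class) tuples."""
    C, H, W = fused_hm.shape
    # Keep only local peaks of the heat map (3x3 suppression via max pooling).
    pooled = F.max_pool2d(fused_hm[None], 3, stride=1, padding=1)[0]
    hm = fused_hm * (fused_hm == pooled).float()
    scores, inds = hm.view(-1).topk(k)
    boxes = []
    for s, ind in zip(scores, inds):
        if s < score_thresh:
            continue
        c, rem = divmod(int(ind), H * W)
        y, x = divmod(rem, W)
        dx, dy = offset[0, y, x], offset[1, y, x]     # center point offset
        w, h = wh[0, y, x], wh[1, y, x]               # length-width size
        cx, cy = x + dx, y + dy
        boxes.append((float(cx - w / 2), float(cy - h / 2),
                      float(cx + w / 2), float(cy + h / 2), float(s), c))
    return boxes
```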
The foregoing description is only exemplary of the preferred embodiments of the disclosure and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention in the embodiments of the present disclosure is not limited to the specific combination of the above-mentioned features, but also encompasses other embodiments in which any combination of the above-mentioned features or their equivalents is made without departing from the inventive concept defined above. For example, technical solutions formed by replacing the above features with (but not limited to) technical features having similar functions disclosed in the embodiments of the present disclosure are also encompassed.

Claims (4)

1. A target detection method based on hybrid supervised training, comprising the following steps:
step S100, for a weakly annotated image of the training data set, calculating the loss between the class labels and the model classification prediction using the network loss function, and minimizing the loss function by gradient back-propagation to train the classification branch of the model;
step S200, for a fully annotated image of the training data set, calculating the loss between the class labels and the classification prediction and the loss between the location labels and the localization prediction using the network loss function, minimizing the loss function by gradient back-propagation, and training the classification and localization branches of the model;
and step S300, for the image to be detected, performing forward computation with the convolutional neural network whose weights were trained by the above method, shifting the result of the peak class activation response branch and fusing it into the center point detection branch, and forming the final detection boxes from the enhanced detection heat map prediction, the length-width prediction and the center point offset prediction.
2. The method of claim 1, wherein, for a weakly annotated image of the training data set, calculating the loss between the class labels and the model classification prediction using the network loss function and minimizing the loss function by gradient back-propagation to train the classification branch of the model comprises:

for a weakly annotated image of the training data set, using the class annotation to train the peak class activation response branch and the central heat map prediction of the center point detection branch;

constructing a class activation response map by computing the partial derivative of the class probability $y^c$ output by the last classification layer with respect to every pixel $A^k_{i,j}$ of the current layer feature map,

$$\frac{\partial y^c}{\partial A^k_{i,j}}$$

where $y^c$ is the classification probability output for class $C$ and $A^k_{i,j}$ is the pixel at position $(i, j)$ on the $k$-th channel of the feature map $A$;

averaging the partial derivatives of each pixel over the spatial dimensions to obtain the weight coefficient of class $C$ for each channel:

$$\alpha^c_k = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^c}{\partial A^k_{i,j}}$$

obtaining the weight coefficient $\alpha^c_k$ of the features of channel $k$ for class $C$, where $Z = i \times j$ is the number of pixels;

combining the weight coefficients and the feature maps by weighted summation (a linear combination) and applying the activation function ReLU to obtain the class activation response map:

$$M^c = \mathrm{ReLU}\left(\sum_{k} \alpha^c_k A^k\right)$$

where $M^c$ is the class activation response heat map of class $C$ and $A^k$ denotes that the operation runs over all channels $k$ of the feature map $A$;

selecting peak points on the class activation response map as the output of the peak class activation response, a series of local maxima within a given neighborhood window being selected with a max pooling operation:

$$P^{c} = \left\{ (i_k, j_k) \;\middle|\; M^{c}_{i_k, j_k} = \max_{(i,j) \in \mathcal{N}(i_k, j_k)} M^{c}_{i,j} \right\}, \quad k = 1, \dots, N_c$$

where $N_c$ denotes the number of local maxima for class $C$;

calculating the loss function from the output of the peak class activation response and the data set class labels, the peak aggregate response confidence being

$$s_{aggr} = \frac{1}{N_c} \sum_{k=1}^{N_c} M^{c}_{(i_k, j_k)}$$

where $M^{c}_{(i_k, j_k)}$ denotes the response at the $k$-th peak point of the peak class activation response map, $(i_k, j_k)$ denotes the position of the $k$-th peak point, and $N_c$ denotes the number of peak points; and computing the classification loss function from the aggregated response confidence and the data set class annotation, and from the heat map prediction $\hat{Y}$ of the center point detection branch after max pooling:

$$\mathcal{L}_{cls}^{weak} = \mathrm{BCE}(s_{aggr}, \mathrm{label}) + \mathrm{BCE}(\mathrm{MaxPool}(\hat{Y}), \mathrm{label})$$

where BCE is the cross entropy loss function, $s_{aggr}$ is the peak aggregate response confidence, label is the class label vector in the data set annotation, and MaxPool denotes the max pooling operation.
3. The method of claim 1, wherein, for a fully annotated image of the training data set, calculating the loss between the class labels and the classification prediction and the loss between the location labels and the localization prediction using the network loss function, minimizing the loss function by gradient back-propagation, and training the classification and localization branches of the model comprises:

for a fully annotated image of the training data set, using the class annotation to train the peak class activation response branch and the classification part of the center point detection branch, and using the fully supervised data to train the prediction heads of the center point detection branch of the proposed model, the loss function of the center point detection branch, $\mathcal{L}_{det}$, being:

$$\mathcal{L}_{det} = \mathrm{FocalLoss}(\hat{Y}, Y) + \mathrm{L1}(\hat{S}, (w_i, h_i)) + \mathrm{L1}(\hat{O}, (\delta w_i, \delta h_i)) + \mathrm{L1}(\hat{P}, (\delta px_i, \delta py_i))$$

where $\hat{Y}$ denotes the center point heat map prediction, $\hat{S}$ denotes the length-width size prediction, $\hat{O}$ denotes the center point offset prediction, $\hat{P}$ denotes the offset-to-center prediction, and $Y$, $(w_i, h_i)$, $(\delta w_i, \delta h_i)$, $(\delta px_i, \delta py_i)$ respectively denote the learning targets of the heat map, the length-width size, the center point offset and the offset to the center generated by the CenterNet algorithm from the data set annotations; FocalLoss is the focal loss function and L1 is the L1 distance loss used for training; and the loss function of the peak class activation response branch, $\mathcal{L}_{peak}$, being:

$$\mathcal{L}_{peak} = \mathrm{BCE}(s_{aggr}, \mathrm{label})$$

where BCE is the cross entropy loss function, $s_{aggr}$ denotes the peak aggregate response confidence, and label is the class label vector in the data set annotation.
4. The method according to claim 1, wherein, for the image to be detected, performing forward computation with the convolutional neural network whose weights were trained by the above method, shifting the result of the peak class activation response branch and fusing it into the center point detection branch, and forming the final detection boxes from the enhanced detection heat map prediction, the length-width prediction and the center point offset prediction comprises:

for the image to be detected, performing forward computation with the convolutional neural network whose weights have been trained, and fusing the shifted result of the peak class activation response branch into the center point detection branch:

$$\tilde{Y}_{p_k + \hat{P}_{p_k}} = \hat{Y}_{p_k + \hat{P}_{p_k}} + \beta \, M^{c}_{p_k}$$

where $\hat{Y}$ denotes the center point heat map prediction, $\tilde{Y}$ denotes the fusion-enhanced center point heat map prediction, $p_k = (i_k, j_k)$ denotes the position of an output peak point, $M^{c}_{p_k}$ denotes the class activation response at the corresponding peak point position, $\hat{P}_{p_k}$ denotes the predicted offset from that point to the center point, and $\beta$ is a hyper-parameter controlling the proportion of the peak class activation response in the whole fusion process; and

forming the target detection result from the enhanced detection heat map prediction, the length-width prediction and the center point offset prediction.
CN202111355318.8A 2021-11-16 2021-11-16 Target detection method based on hybrid supervised training Pending CN114154563A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111355318.8A CN114154563A (en) 2021-11-16 2021-11-16 Target detection method based on hybrid supervised training

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111355318.8A CN114154563A (en) 2021-11-16 2021-11-16 Target detection method based on hybrid supervised training

Publications (1)

Publication Number Publication Date
CN114154563A true CN114154563A (en) 2022-03-08

Family

ID=80456492

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111355318.8A Pending CN114154563A (en) 2021-11-16 2021-11-16 Target detection method based on hybrid supervised training

Country Status (1)

Country Link
CN (1) CN114154563A (en)


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114972118A (en) * 2022-06-30 2022-08-30 抖音视界(北京)有限公司 Noise reduction method and device for inspection image, readable medium and electronic equipment
CN116503618A (en) * 2023-04-25 2023-07-28 东北石油大学三亚海洋油气研究院 Method and device for detecting remarkable target based on multi-mode and multi-stage feature aggregation
CN116503618B (en) * 2023-04-25 2024-02-02 东北石油大学三亚海洋油气研究院 Method and device for detecting remarkable target based on multi-mode and multi-stage feature aggregation


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination