CN114155398A - Label type self-adaptive active learning image target detection method and device - Google Patents

Label type self-adaptive active learning image target detection method and device

Info

Publication number
CN114155398A
CN114155398A (application CN202111435129.1A)
Authority
CN
China
Prior art keywords
target
information
detection
labeling
label
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111435129.1A
Other languages
Chinese (zh)
Inventor
吕梦遥
陈辉
张希雅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Zhuoxi Brain And Intelligence Research Institute
Original Assignee
Hangzhou Zhuoxi Brain And Intelligence Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Zhuoxi Brain And Intelligence Research Institute filed Critical Hangzhou Zhuoxi Brain And Intelligence Research Institute
Priority to CN202111435129.1A
Publication of CN114155398A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a label-type-adaptive active learning image target detection method and device. The method includes: obtaining a target image and detecting it to obtain positioning information and classification information corresponding to each target object; labeling target objects whose classification information satisfies a first preset condition to obtain corresponding class labels; labeling target objects whose positioning information satisfies a second preset condition to obtain corresponding supplementary bounding-box labels; generating first annotation data from the two kinds of labels and adding it to an annotation data set in which second annotation data is pre-stored; and retraining the semi-supervised detection model on the annotation data set to obtain an iteratively updated target semi-supervised detection model, until the model reaches the expected performance or the annotation quantity reaches the budget. The method not only significantly saves annotation cost but also improves the detection algorithm's judgment of target categories and positions.

Description

Label type self-adaptive active learning image target detection method and device
Technical Field
The invention relates to the technical field of self-adaptive active learning, in particular to a label type self-adaptive active learning image target detection method and device.
Background
In the related art, the target detection method based on the convolutional neural network mainly depends on a large-scale data set and full-supervised training, and mainly comprises a two-stage detector based on a candidate frame, a single-stage detector based on an anchor frame and a frame-free detector based on feature points.
In general, two-stage detection first extracts candidate boxes by selective search or a region proposal network, then extracts image features of the candidate boxes to make category and location predictions. Girshick et al. first extracted candidate-box features with a convolutional neural network, with classification and localization realized by a support vector machine and a regression model, respectively. The spatial pyramid pooling model maps candidate boxes onto the feature map so that the whole image needs only one forward pass; by inserting a pooling layer before the network's last fully connected layer, a fixed-length image representation is obtained without scaling the candidate boxes. Moreover, in the related art, the accuracy of single-stage detection methods has reached the level of two-stage methods, but the large number of background anchor boxes limits network performance.
However, these algorithms still rely on large-scale, diverse, and exhaustively annotated datasets, and manual annotation is increasingly time-consuming and expensive, so it is only feasible to select a representative subset of the data for annotation. If randomly sampled images are annotated, sufficiently rich information is obtained only when the sample is large enough; otherwise the generalization ability of the model is severely affected.
Disclosure of Invention
The present invention is directed to solving, at least to some extent, one of the technical problems in the related art.
Therefore, the first objective of the present invention is to provide an annotation-type-adaptive active learning image target detection method, which decouples multiple targets in a single image and decouples the classification and localization tasks, and uses joint training on fully supervised and weakly supervised data to save annotation cost as much as possible.
The second objective of the present invention is to provide an active learning image target detection device with adaptive annotation type.
A third object of the invention is to propose a non-transitory computer-readable storage medium.
A fourth object of the invention is to propose a computer program product.
To achieve the above object, an embodiment of a first aspect of the present invention provides a method, including:
detecting the target image by using the detection model to obtain positioning information and classification information corresponding to the target object;
selecting, within the quantitative annotation quota, the most valuable target objects whose classification information satisfies the first preset condition and labeling them to obtain corresponding class labels, and selecting, within the quota, the most valuable target objects whose positioning information satisfies the second preset condition and labeling them to obtain corresponding supplementary bounding-box labels;
generating first labeling data of the target object according to the category label and the supplementary bounding box label, and adding the first labeling data into a labeling data set, wherein second labeling data are prestored in the labeling data set;
and (4) retraining the semi-supervised detection model according to the labeled data set to obtain an iteratively updated target semi-supervised detection model until the model reaches the expected performance or the labeled quantity reaches the budget.
Optionally, in an embodiment of the present application, the initial semi-supervised detection model is designed by:
extracting a multi-scale feature map of the image, taking the central point estimation as a first branch, taking the weak supervision global average pooling as a second branch, and sharing parameters of part of the multi-scale feature map by the first branch and the second branch;
in the first branch, performing convolution on the multi-scale feature map to obtain predicted position information;
in the second branch, convolving the multi-scale feature map yields a response map that can be supervised by image-level labels.

Optionally, in an embodiment of the present application, detecting the target image to obtain the positioning information and the classification information corresponding to the target object includes:
and predicting the target image through the initial semi-supervised detection model to obtain positioning information and classification information corresponding to the target object.
Optionally, in an embodiment of the present application, the first preset condition is that the classification information amount of the target object is higher than a first threshold, and the second preset condition is that the localization information amount of the target object is higher than a second threshold.
Optionally, in an embodiment of the present application, labeling a target object whose classification information satisfies a first preset condition to obtain a class label corresponding to the target object, and labeling a target object whose positioning information satisfies a second preset condition to obtain a supplementary bounding box label corresponding to the target object includes:
The class information amount of a target is measured with entropy:

$$I_{cls}(\hat{p}) = -\sum_{c=1}^{C} \hat{Y}_{\hat{p},c}\log \hat{Y}_{\hat{p},c}$$

where $I_{cls}(\hat{p})$ is the entropy-based class information amount of the target, $\hat{Y}_{\hat{p},c}$ is the class prediction probability at center point $\hat{p}$, and $C$ is the total number of candidate classes.

To compute the localization information amount at a center point $\hat{p}$, first compute the expectation of the local probability distribution of the compensation prediction $\hat{O}$:

$$\bar{O}_{\hat{p}} = \frac{1}{(2r+1)^2}\sum_{\|\delta\|_{\infty}\le r}\hat{O}_{\hat{p}+\delta}$$

where $r$ defines the local neighborhood radius.

Second, the difference between the entropy of the locally averaged prediction and the mean of the prediction entropies measures the mutual information between the data distribution and the model's predictive distribution, giving $I_{off}(\hat{p})$ as an estimate of the localization information amount:

$$I_{off}(\hat{p}) = H(\bar{O}_{\hat{p}}) - \frac{1}{(2r+1)^2}\sum_{\|\delta\|_{\infty}\le r}H(\hat{O}_{\hat{p}+\delta})$$

where $H(\cdot)$ computes the information entropy, defined here as:

$$H(O) = -\sum_{i} O_i \log O_i$$

Similarly, the size information amount $I_{size}(\hat{p})$ at center point $\hat{p}$ is obtained, and the total localization information amount is:

$$I_{loc}(\hat{p}) = I_{off}(\hat{p}) + I_{size}(\hat{p})$$

Thresholds $\epsilon_c$ and $\epsilon_l$ are set separately for classification and localization; when one type of information amount for a target exceeds the corresponding threshold, an annotation of the corresponding type is adaptively requested.
To achieve the above object, a second aspect of the present invention provides an active learning image target detection apparatus with label type adaptation, including:
the detection module is used for detecting the target detection object by using the detection model to obtain the positioning information and the classification information corresponding to the target object;
the evaluation module is configured to select, within the quantitative annotation quota, the most valuable target objects whose classification information satisfies the first preset condition and label them to obtain corresponding class labels, and to select, within the quota, the most valuable target objects whose positioning information satisfies the second preset condition and label them to obtain corresponding supplementary bounding-box labels;
the labeling module is used for generating first labeling data of the target object according to the category label and the supplementary bounding box label, and adding the first labeling data into a labeling data set, wherein second labeling data are prestored in the labeling data set;
and the training module is used for retraining the semi-supervised detection model according to the labeled data set to obtain an iteratively updated target semi-supervised detection model until the model reaches the expected performance or the labeled quantity reaches the budget.
Optionally, in an embodiment of the present application, the initial semi-supervised detection model is designed by:
extracting a multi-scale feature map of an image, estimating a central point as a first branch, performing weak supervision global average pooling as a second branch, and sharing parameters of part of the multi-scale feature map by the first branch and the second branch;
in the first branch, performing convolution on the multi-scale feature map to obtain predicted position information;
in the second branch, convolving the multi-scale feature map results in a response map that can be supervised by image-level labels.
Optionally, in an embodiment of the present application, the detection module is further configured to:
and predicting the target image through the initial semi-supervised detection model to obtain positioning information and classification information corresponding to the target object.
Optionally, in an embodiment of the present application, the first preset condition is that the classification information amount of the target object is higher than a first threshold, and the second preset condition is that the localization information amount of the target object is higher than a second threshold.
In order to achieve the above object, a third aspect of the present application provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the method for detecting an object in an active learning image with adaptive annotation type according to the first aspect of the present application is implemented.
To achieve the above object, a non-transitory computer-readable storage medium is provided in a fourth embodiment of the present application, and a computer program is stored thereon, and when executed by a processor, the computer program implements the annotation type adaptive active learning image target detection method described in the first embodiment of the present application.
In summary, the method, apparatus, computer device, and non-transitory computer-readable storage medium of the embodiments of the present invention organize the active detection iteration process into five steps: model inference, target retrieval, information amount evaluation, adaptive annotation, and semi-supervised training. The method estimates the classification information amount and the localization information amount of each target in the image separately, selects valuable targets to adaptively add class labels or bounding-box labels, and designs a detection model capable of joint training on fully supervised and weakly supervised data.

Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 is a flowchart of an annotation type adaptive active learning image target detection method according to an embodiment of the present invention.
Fig. 2 is a device structure diagram of an annotation type adaptive active learning image target detection method according to an embodiment of the present invention.
Fig. 3 is a structural diagram of a target detection model of the supervised weak supervised joint training provided in the embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative and intended to be illustrative of the invention and are not to be construed as limiting the invention.
The following describes an annotation type adaptive active learning image target detection method and apparatus according to an embodiment of the present invention with reference to the drawings.
Fig. 1 is a schematic flowchart of a label type adaptive active learning image target detection method according to an embodiment of the present invention.
As shown in fig. 1, the method for detecting an image target by label type adaptive active learning includes the following steps:
and step S1, detecting the target detection object by using the detection model to obtain the positioning information and the classification information corresponding to the target object.
In one embodiment of the present application, before step S1, it is first necessary to design a semi-supervised detection model, including:
extracting a multi-scale feature map of the image, taking the central point estimation as a first branch, taking the weakly supervised global average pooling as a second branch, and sharing parameters of part of the multi-scale feature map by the first branch and the second branch;
in the first branch, performing convolution on the multi-scale feature map to obtain predicted position information;
in the second branch, the multi-scale feature map is convolved to obtain a response map which can be supervised by image-level labels.
Specifically, in one embodiment of the present application, extracting a multi-scale feature map of the image, taking the center-point estimation as a first branch and the weakly supervised global average pooling as a second branch, with the two branches sharing parameters of part of the multi-scale feature map, includes:

First, a feature pyramid network extracts multi-scale feature maps $F_i$ of the image with dimensions $W_i \times H_i \times D_i$, where $i \in \{1,2,3\}$ indexes three sequentially increasing feature-map resolutions and $W$, $H$, $D$ denote the feature map width, height, and depth, respectively. The center-point estimation branch and the weakly supervised branch share these features. In the weakly supervised branch, the feature map is compressed by a 3×3 convolution into a response map $\hat{Y} \in \mathbb{R}^{W \times H \times C}$, where $C$ is the number of detection classes. The response map can be supervised at the pixel level, and after global average pooling it yields a class prediction $\hat{g}$ of length $C$ that can be supervised by image-level labels. The other branch is responsible for predicting position information: the feature map is compressed by 3×3 convolutions into $\hat{O} \in \mathbb{R}^{W \times H \times 2}$, representing the two-dimensional position compensation of the center point, and, similarly, $\hat{S} \in \mathbb{R}^{W \times H \times 2}$, representing the size estimate of the bounding-box width and height.
Step S2, selecting the most valuable object according to the quantitative index quota for the target object whose classification information satisfies the first preset condition, and labeling to obtain the class label corresponding to the target detection object, and selecting the most valuable detection object according to the quantitative index quota for the target detection object whose positioning information satisfies the second preset condition, and labeling to obtain the complementary bounding box label corresponding to the detection target object.
In an embodiment of the present application, detecting a target image to obtain positioning information and classification information corresponding to a target object includes:
and predicting the target object through the initial semi-supervised detection model to obtain the positioning information and the classification information corresponding to the target object.
And, in one embodiment of the present application, the first preset condition is that the classification information amount of the target object is higher than a first threshold, and the second preset condition is that the localization information amount of the target object is higher than a second threshold.
Specifically, the model is trained on the currently available annotated data set. When the detection model is trained for the first time, a subset of images (for example, 2,000 images from VOC07 are commonly used) is randomly selected and fully annotated to form the initial training set.
The model prediction $\hat{Y}_{xyc}$ represents the probability that the target represented by the keypoint at $(x, y)$ on the feature map belongs to class $c$. The center point of each manually annotated target is denoted $p \in \mathbb{R}^2$; prediction and loss computation are performed on the low-resolution feature map with down-sampling ratio $R$, so each ground-truth center maps to $\tilde{p} = \lfloor p / R \rfloor$. The ground-truth centers are mapped onto a heatmap $Y \in [0,1]^{W \times H \times C}$ with the following Gaussian kernel:

$$Y_{xyc} = \exp\!\left(-\frac{(x - \tilde{p}_x)^2 + (y - \tilde{p}_y)^2}{2\sigma_p^2}\right)$$

where $\sigma_p$ is a standard deviation adapted to the target size.
The loss function for center-point estimation is a pixel-level logistic regression (focal loss), denoted $L_k$:

$$L_k = -\frac{1}{N}\sum_{xyc}\begin{cases}(1-\hat{Y}_{xyc})^{\alpha}\log\hat{Y}_{xyc} & \text{if } Y_{xyc}=1\\(1-Y_{xyc})^{\beta}\,\hat{Y}_{xyc}^{\alpha}\,\log(1-\hat{Y}_{xyc}) & \text{otherwise}\end{cases}$$

where $\alpha$ and $\beta$ are hyper-parameters and $N$ is the number of objects in the image.
To recover the discretization error introduced by down-sampling, a local coordinate compensation $\hat{O}_{\tilde{p}}$ is predicted for each center point and supervised with an L1 loss, denoted $L_{off}$:

$$L_{off} = \frac{1}{N}\sum_{p}\left|\hat{O}_{\tilde{p}} - \left(\frac{p}{R} - \tilde{p}\right)\right|$$
The target size of each manual annotation is denoted $s_k$, where $k$ indexes the objects in an image. The model regresses a size estimate $\hat{S}_{\tilde{p}_k}$ for each target object, supervised with an L1 loss, denoted $L_{size}$:

$$L_{size} = \frac{1}{N}\sum_{k=1}^{N}\left|\hat{S}_{\tilde{p}_k} - s_k\right|$$
The image-level label of a sample is represented by a one-hot (multi-hot) vector $g$: when at least one object of class $c$ exists in the sample, the corresponding position is set to 1:

$$g_c = \mathbb{1}\!\left[\exists k:\; c^{(k)} = c\right]$$

where $\mathbb{1}[\cdot]$ is the indicator function, $c^{(k)}$ is the class of the $k$-th object, and $g_c$ is the ground-truth label for class $c$. The weakly supervised branch is supervised by a multi-label cross-entropy loss, denoted $L_{cls}$:

$$L_{cls} = -\frac{1}{C}\sum_{c=1}^{C}\left[g_c\log\hat{g}_c + (1-g_c)\log(1-\hat{g}_c)\right]$$
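A minimal sketch of the image-level label vector and the multi-label cross-entropy described above (function names are illustrative):

```python
import numpy as np

def image_level_label(classes_present, C):
    """Multi-hot vector g: g_c = 1 iff at least one object of class c
    exists in the sample."""
    g = np.zeros(C)
    g[list(classes_present)] = 1.0
    return g

def multilabel_ce(g_hat, g):
    """Multi-label cross-entropy supervising the weakly supervised branch."""
    eps = 1e-12
    return -np.mean(g * np.log(g_hat + eps)
                    + (1 - g) * np.log(1 - g_hat + eps))
```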
Based on the above, the total training objective of the model is:

$$L = L_k + \lambda_{off}L_{off} + \lambda_{size}L_{size} + \lambda_{cls}L_{cls}$$

where $\lambda_{off}$, $\lambda_{size}$, $\lambda_{cls}$ are hyper-parameters controlling the weight of each branch during training.
Step S3, generating first labeled data of the target object according to the category label and the supplemental bounding box label, and adding the first labeled data to a labeled data set, where the labeled data set pre-stores second labeled data.
In an embodiment of the present application, labeling a target object whose classification information satisfies a first preset condition to obtain a class label corresponding to the target object, and labeling a target object whose positioning information satisfies a second preset condition to obtain a supplementary bounding box label corresponding to the target object includes:
The class information amount of a target is measured with entropy:

$$I_{cls}(\hat{p}) = -\sum_{c=1}^{C} \hat{Y}_{\hat{p},c}\log \hat{Y}_{\hat{p},c}$$

where $I_{cls}(\hat{p})$ is the entropy-based class information amount of the target, $\hat{Y}_{\hat{p},c}$ is the class prediction probability at center point $\hat{p}$, and $C$ is the total number of candidate classes.

To compute the localization information amount at a center point $\hat{p}$, first compute the expectation of the local probability distribution of the compensation prediction $\hat{O}$:

$$\bar{O}_{\hat{p}} = \frac{1}{(2r+1)^2}\sum_{\|\delta\|_{\infty}\le r}\hat{O}_{\hat{p}+\delta}$$

where $r$ defines the local neighborhood radius.

Second, the difference between the entropy of the locally averaged prediction and the mean of the prediction entropies measures the mutual information between the data distribution and the model's predictive distribution, giving $I_{off}(\hat{p})$ as an estimate of the localization information amount:

$$I_{off}(\hat{p}) = H(\bar{O}_{\hat{p}}) - \frac{1}{(2r+1)^2}\sum_{\|\delta\|_{\infty}\le r}H(\hat{O}_{\hat{p}+\delta})$$

where $H(\cdot)$ computes the information entropy, defined here as:

$$H(O) = -\sum_{i} O_i \log O_i$$

Similarly, the size information amount $I_{size}(\hat{p})$ at center point $\hat{p}$ is obtained, and the total localization information amount is:

$$I_{loc}(\hat{p}) = I_{off}(\hat{p}) + I_{size}(\hat{p})$$

Thresholds $\epsilon_c$ and $\epsilon_l$ are set separately for classification and localization; when one type of information amount for a target exceeds the corresponding threshold, an annotation of the corresponding type is adaptively requested.
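The two information measures above can be sketched as follows (a NumPy illustration under our own naming; the neighborhood is passed in as a flat list of per-pixel predictive distributions rather than sliced from a feature map):

```python
import numpy as np

def entropy(p):
    """Shannon entropy of a discrete distribution."""
    eps = 1e-12
    p = np.asarray(p, dtype=float)
    return -np.sum(p * np.log(p + eps))

def class_information(class_probs):
    """Classification information amount of one target: entropy of its
    class prediction probabilities (length C)."""
    return entropy(class_probs)

def localization_information(local_preds):
    """Mutual-information-style score over the (2r+1)^2 per-pixel
    predictive distributions around a center point: entropy of the
    local mean prediction minus the mean of the per-pixel entropies.
    High values mean the neighborhood disagrees with itself."""
    local_preds = np.asarray(local_preds, dtype=float)  # (K, C)
    mean_pred = local_preds.mean(axis=0)
    mean_entropy = np.mean([entropy(p) for p in local_preds])
    return entropy(mean_pred) - mean_entropy
```

A target would then be queued for class annotation when `class_information` exceeds the classification threshold, and for bounding-box annotation when the localization score exceeds its threshold.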
And step S4, retraining the semi-supervised detection model according to the annotation data set to obtain an iteratively updated target semi-supervised detection model until the model reaches the expected performance or the annotation quantity reaches the budget.
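The overall active-detection cycle of steps S1 to S4 can be sketched as follows. Every class and method here (`model.detect`, `oracle`, the per-object information attributes) is a hypothetical stand-in for a component the patent describes, not an actual API:

```python
def active_detection_loop(model, unlabeled, labeled, eps_c, eps_l,
                          budget, target_perf, oracle):
    """Sketch of the iteration: inference -> target retrieval ->
    information evaluation -> adaptive annotation -> retraining."""
    spent = 0
    while model.performance() < target_perf and spent < budget:
        for image in unlabeled:
            for obj in model.detect(image):           # inference + retrieval
                ann = {}
                if obj.class_information > eps_c:     # evaluate class info
                    ann["class"] = oracle.class_label(obj)
                if obj.localization_information > eps_l:  # evaluate loc info
                    ann["bbox"] = oracle.bounding_box(obj)
                if ann:                               # adaptive annotation
                    labeled.add(image, obj, ann)
                    spent += 1
        model.retrain(labeled)                        # semi-supervised training
    return model
```

The loop terminates exactly as step S4 states: either the model reaches the expected performance or the annotation quantity exhausts the budget.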
Technical effects of the present application: by fully exploiting the decoupling of classification and localization and the separability of multiple targets in a detection task, the method estimates the classification information amount and the localization information amount of each target in an image separately, selects valuable targets, and adaptively adds class labels or bounding-box labels. A detection model capable of joint training on fully supervised and weakly supervised data is designed, which significantly saves annotation cost and specifically improves the detection algorithm's judgment of target categories and positions.
In order to implement the above embodiments, the present invention further provides an active learning image target detection apparatus with adaptive annotation types.
Fig. 2 is a schematic structural diagram of an annotation type adaptive active learning image target detection apparatus according to an embodiment of the present invention.
As shown in fig. 2, the label type adaptive active learning image target detection apparatus includes:
the detection module is used for detecting the target detection object by using the detection model to obtain the positioning information and the classification information corresponding to the target object;
the evaluation module is configured to select, within the quantitative annotation quota, the most valuable target objects whose classification information satisfies the first preset condition and label them to obtain corresponding class labels, and to select, within the quota, the most valuable target objects whose positioning information satisfies the second preset condition and label them to obtain corresponding supplementary bounding-box labels;
the labeling module is used for generating first labeling data of the target object according to the category label and the supplementary bounding box label, and adding the first labeling data into a labeling data set, wherein second labeling data are prestored in the labeling data set;
and the training module is used for retraining the semi-supervised detection model according to the labeled data set to obtain an iteratively updated target semi-supervised detection model until the model reaches the expected performance or the labeled quantity reaches the budget.
In an embodiment of the present application, further, the method further includes:
an initial semi-supervised detection model was designed by:
extracting a multi-scale feature map of the image, taking the central point estimation as a first branch, taking the weak supervision global average pooling as a second branch, and sharing parameters of part of the multi-scale feature map by the first branch and the second branch;
in the first branch, performing convolution on the multi-scale characteristic graph to obtain predicted position information;
in the second branch, the convolution of the multi-scale feature map results in a response map that can be supervised by image-level labels.
In an embodiment of the present application, further, the method further includes:
and the detection module is used for predicting the target object through the initial semi-supervised detection model to obtain the positioning information and the classification information corresponding to the target object.
In an embodiment of the present application, further, the method further includes:
the preset conditions include classifying target objects with information amount higher than a first specific threshold, and the second preset conditions include locating target objects with information amount higher than a second specific threshold.
In one embodiment of the present application, the overall test model structure is shown in FIG. 3.
Technical effects of the present application: by fully exploiting the decoupling of classification and localization and the separability of multiple targets in a detection task, the device estimates the classification information amount and the localization information amount of each target in an image separately, selects valuable targets, and adaptively adds class labels or bounding-box labels. A detection model capable of joint training on fully supervised and weakly supervised data is designed, which significantly saves annotation cost and specifically improves the detection algorithm's judgment of target categories and positions.
To achieve the above object, a third aspect of the present application provides a computer device including a memory, a processor, and a computer program stored on the memory and executable on the processor; when executed by the processor, the computer program implements the label-type-adaptive active learning image target detection method described in the first aspect of the present application.
To achieve the above object, a non-transitory computer-readable storage medium is provided in a fourth embodiment of the present application, and a computer program is stored thereon, and when executed by a processor, the computer program implements a method for label type adaptive active learning image target detection as described in the first embodiment of the present application.
Although the present application has been disclosed in detail with reference to the accompanying drawings, it is to be understood that such description is merely illustrative and not restrictive of the present application. The scope of the present application is defined by the appended claims and may include various modifications, adaptations, and equivalents of the invention without departing from the scope and spirit of the application.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing steps of a custom logic function or process, and alternate implementations are included within the scope of the preferred embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present invention.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc. Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims (10)

1. A label type self-adaptive active learning image target detection method is characterized by comprising the following steps:
detecting a target object by using a detection model to obtain positioning information and classification information corresponding to the target object;
selecting, within a quantitative quota, the most valuable target objects whose classification information satisfies a first preset condition for labeling, so as to obtain class labels corresponding to the target objects, and selecting, within the quantitative quota, the most valuable target objects whose positioning information satisfies a second preset condition for labeling, so as to obtain supplementary bounding-box labels corresponding to the target objects;
generating first labeling data of the target object according to the category label and the supplementary bounding box label, and adding the first labeling data into a labeling data set, wherein second labeling data are prestored in the labeling data set;
and retraining the semi-supervised detection model according to the labeled data set to obtain an iteratively updated target semi-supervised detection model until the model reaches the expected performance or the labeled quantity reaches the budget.
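The iterative loop of claim 1 (detect, score, select within a quota, annotate, retrain until performance or budget is reached) can be sketched in a few lines. This is a runnable toy, not the claimed method: `model` here is any callable returning an informativeness score, the "annotation" step is simulated by moving samples between pools, and the retraining step is only marked by a comment.

```python
# Hypothetical sketch of the claim-1 active-learning loop. A real system would
# replace the scoring callable with the semi-supervised detector's information
# estimates and actually retrain at step 4.

def active_learning_loop(model, unlabeled, labeled, budget, quota, rounds=3):
    for _ in range(rounds):
        if len(labeled) >= budget:          # stop when the labeling budget is spent
            break
        # 1. score every unlabeled target by its estimated information amount
        scored = [(model(x), x) for x in unlabeled]
        # 2. keep the `quota` most informative targets this round
        scored.sort(key=lambda t: t[0], reverse=True)
        picked = [x for _, x in scored[:quota]]
        # 3. "annotate" the picked targets and move them into the labeled pool
        labeled.extend(picked)
        unlabeled = [x for x in unlabeled if x not in picked]
        # 4. retraining of the semi-supervised detector would happen here
    return labeled
```

With an identity scoring function, the highest-valued samples are labeled first, which is the intended selection behavior.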
2. The method of claim 1, wherein the semi-supervised detection model is designed by:
extracting a multi-scale feature map of an image, and estimating a central point as a first branch and weakly supervised global average pooling as a second branch, wherein the first branch and the second branch share part of parameters of the multi-scale feature map;
in the first branch, performing convolution on the multi-scale feature map to obtain predicted position information;
in the second branch, convolving the multi-scale feature map results in a response map that can be supervised by image-level labels.
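The two-branch head of claim 2 can be illustrated numerically. This is a shape-level sketch only, with assumed tensor shapes and with the 1×1 convolutions of both branches modelled as per-pixel matrix multiplies over the channel axis; the weight names `w_loc` and `w_cls` are hypothetical.

```python
import numpy as np

def two_branch_head(feat, w_loc, w_cls):
    """feat: (C, H, W) shared multi-scale feature map.
    Returns (center-point heatmap, image-level class scores)."""
    C, H, W = feat.shape
    flat = feat.reshape(C, -1)                   # (C, H*W)
    # Branch 1: convolve shared features into predicted position information.
    heatmap = (w_loc @ flat).reshape(-1, H, W)
    # Branch 2: convolve into a per-class response map, then global average
    # pooling yields scores supervisable by image-level (weak) labels.
    response = (w_cls @ flat).reshape(-1, H, W)
    image_scores = response.mean(axis=(1, 2))
    return heatmap, image_scores
```

The two branches share the feature map (and hence part of the parameters), which is what lets weakly labeled images contribute gradients to the same backbone as fully labeled ones.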
3. The method according to claim 1 or 2, wherein detecting the target object to obtain the corresponding positioning information and classification information comprises:
predicting the target object through an initial semi-supervised detection model to obtain the positioning information and the classification information corresponding to the target object.
4. The method according to claim 3, wherein the first preset condition includes the classification information amount of a target object being above a first specific threshold, and the second preset condition includes the positioning information amount of a target object being above a second specific threshold.
5. The method according to claim 4, wherein labeling the target object whose classification information satisfies a first preset condition to obtain a class label corresponding to the target object, and labeling the target object whose positioning information satisfies a second preset condition to obtain a supplementary bounding box label corresponding to the target object, comprises:
measuring the class information amount of a target by entropy:

$I_{cls}(\hat{x}, \hat{y}) = -\sum_{c=1}^{C} p_c(\hat{x}, \hat{y}) \log p_c(\hat{x}, \hat{y})$

wherein $I_{cls}(\hat{x}, \hat{y})$ is the class information amount, measured by entropy, of the target whose center point has coordinates $(\hat{x}, \hat{y})$, $p_c(\hat{x}, \hat{y})$ is the class prediction probability for class $c$ at that point, and $C$ is the total number of candidate classes;

for calculating the positioning information amount of the center point $(\hat{x}, \hat{y})$, first calculating, over a scale-compensated neighborhood, the expectation of the local probability distribution:

$\bar{p}(\hat{x}, \hat{y}) = \frac{1}{|N_r|} \sum_{(x', y') \in N_r(\hat{x}, \hat{y})} p(x', y')$

where $r$ defines the local neighborhood radius of $N_r$;

secondly, using the difference between the entropy of the locally averaged prediction and the mean of the prediction entropies to measure the mutual information between the data distribution and the model's predictive distribution, taken as the estimate $I_{ctr}(\hat{x}, \hat{y})$ of the center-point positioning information amount:

$I_{ctr}(\hat{x}, \hat{y}) = H\big(\bar{p}(\hat{x}, \hat{y})\big) - \frac{1}{|N_r|} \sum_{(x', y') \in N_r} H\big(p(x', y')\big)$

wherein $H(\cdot)$ computes the information entropy, defined herein as $H(p) = -\sum_{c} p_c \log p_c$;

similarly obtaining the size information amount $I_{wh}(\hat{x}, \hat{y})$ of the center point $(\hat{x}, \hat{y})$, and using $I_{loc}(\hat{x}, \hat{y}) = I_{ctr}(\hat{x}, \hat{y}) + I_{wh}(\hat{x}, \hat{y})$ to represent the total positioning information amount;

setting thresholds $\epsilon_c$ and $\epsilon_l$ separately for classification and positioning, respectively screening out the targets whose information amount exceeds the corresponding threshold, selecting, up to the quota, the targets with the largest information amount, and providing labels of the corresponding type.
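The two information measures of claim 5 have direct numerical counterparts: the class information amount is the entropy of one class-probability vector, and the center-point positioning information amount is a mutual-information estimate, the entropy of the neighborhood-averaged prediction minus the mean of the per-location entropies. The sketch below assumes the neighborhood predictions are already gathered into an array; function names are illustrative.

```python
import numpy as np

def entropy(p, axis=-1):
    """Information entropy H(p) = -sum_c p_c log p_c (natural log)."""
    p = np.clip(p, 1e-12, 1.0)   # guard against log(0)
    return -(p * np.log(p)).sum(axis=axis)

def classification_info(p):
    """Class information amount of one target: entropy of its class prediction."""
    return entropy(np.asarray(p))

def localization_info(local_preds):
    """Center-point information amount over a local neighborhood:
    H(average prediction) - average(H(prediction))."""
    local_preds = np.asarray(local_preds)     # (N, C): predictions within radius r
    mean_pred = local_preds.mean(axis=0)      # expectation of the local distribution
    return entropy(mean_pred) - entropy(local_preds, axis=1).mean()
```

When all neighborhood predictions agree, the mutual-information term is zero (the model is locally consistent); when they disagree, it grows, flagging the target as worth a bounding-box label.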
6. An adaptive active learning image target detection device of label type, characterized by comprising:
the detection module is used for detecting a target object by using a detection model to obtain positioning information and classification information corresponding to the target object;
the evaluation module is used for selecting, within a quantitative quota, the most valuable target objects whose classification information satisfies a first preset condition for labeling, so as to obtain class labels corresponding to the target objects, and for selecting, within the quantitative quota, the most valuable target objects whose positioning information satisfies a second preset condition for labeling, so as to obtain supplementary bounding-box labels corresponding to the target objects;
the labeling module is used for generating first labeling data of the target object according to the category label and the supplementary bounding box label, and adding the first labeling data into a labeling data set, wherein second labeling data are prestored in the labeling data set;
and the training module is used for retraining the semi-supervised detection model according to the labeled data set to obtain an iteratively updated target semi-supervised detection model until the model reaches the expected performance or the labeled quantity reaches the budget.
7. The apparatus of claim 6, wherein the initial semi-supervised detection model is designed by:
extracting a multi-scale feature map of an image, estimating a central point as a first branch, performing weak supervision global average pooling as a second branch, and sharing parameters of part of the multi-scale feature map by the first branch and the second branch;
in the first branch, performing convolution on the multi-scale feature map to obtain predicted position information;
in the second branch, convolving the multi-scale feature map results in a response map that can be supervised by image-level labels.
8. The apparatus of claim 6 or 7, wherein the detection module is further configured to:
and predicting the target image through the initial semi-supervised detection model to obtain positioning information and classification information corresponding to the target object.
9. The apparatus of claim 8, wherein the first preset condition includes the classification information amount of a target object being above a first specific threshold, and the second preset condition includes the positioning information amount of a target object being above a second specific threshold.
10. A non-transitory computer-readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the annotation type adaptive active learning image target detection method according to any one of claims 1 to 5.
CN202111435129.1A 2021-11-29 2021-11-29 Label type self-adaptive active learning image target detection method and device Pending CN114155398A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111435129.1A CN114155398A (en) 2021-11-29 2021-11-29 Label type self-adaptive active learning image target detection method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111435129.1A CN114155398A (en) 2021-11-29 2021-11-29 Label type self-adaptive active learning image target detection method and device

Publications (1)

Publication Number Publication Date
CN114155398A true CN114155398A (en) 2022-03-08

Family

ID=80784242

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111435129.1A Pending CN114155398A (en) 2021-11-29 2021-11-29 Label type self-adaptive active learning image target detection method and device

Country Status (1)

Country Link
CN (1) CN114155398A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114972222A (en) * 2022-05-13 2022-08-30 徕卡显微系统科技(苏州)有限公司 Cell information statistical method, device, equipment and computer readable storage medium
CN115527083A (en) * 2022-09-27 2022-12-27 中电金信软件有限公司 Image annotation method and device and electronic equipment
CN115527083B (en) * 2022-09-27 2023-04-11 中电金信软件有限公司 Image annotation method and device and electronic equipment
CN116403074A (en) * 2023-04-03 2023-07-07 上海锡鼎智能科技有限公司 Semi-automatic image labeling method and device based on active labeling
CN116403074B (en) * 2023-04-03 2024-05-14 上海锡鼎智能科技有限公司 Semi-automatic image labeling method and device based on active labeling
CN117828538A (en) * 2024-03-06 2024-04-05 山东大学 Multi-source information comprehensive analysis method and system based on weight distribution
CN117828538B (en) * 2024-03-06 2024-05-31 山东大学 Multi-source information comprehensive analysis method and system based on weight distribution

Similar Documents

Publication Publication Date Title
CN114155398A (en) Label type self-adaptive active learning image target detection method and device
US10910099B2 (en) Segmentation, landmark detection and view classification using multi-task learning
CN109086811B (en) Multi-label image classification method and device and electronic equipment
Ke et al. Adaptive change detection with significance test
US20180300576A1 (en) Semi-automatic labelling of datasets
US8442309B2 (en) Semantic scene segmentation using random multinomial logit (RML)
CN108564085B (en) Method for automatically reading of pointer type instrument
CN108229522B (en) Neural network training method, attribute detection device and electronic equipment
KR20170058263A (en) Methods and systems for inspecting goods
CN114067109B (en) Grain detection method, grain detection device and storage medium
CN107578424B (en) Dynamic background difference detection method, system and device based on space-time classification
CN106815806B (en) Single image SR reconstruction method based on compressed sensing and SVR
CN111967535B (en) Fault diagnosis method and device for temperature sensor of grain storage management scene
CN112906816A (en) Target detection method and device based on optical differential and two-channel neural network
CN115496892A (en) Industrial defect detection method and device, electronic equipment and storage medium
CN117671508B (en) SAR image-based high-steep side slope landslide detection method and system
CN115082781A (en) Ship image detection method and device and storage medium
CN112991280B (en) Visual detection method, visual detection system and electronic equipment
CN116090938B (en) Method for identifying load state of rear loading vehicle
CN112348750A (en) SAR image change detection method based on threshold fusion and neighborhood voting
CN116630268A (en) Road disease detection method, system, equipment and medium
CN115512202A (en) Small sample target detection method, system and storage medium based on metric learning
CN114663760A (en) Model training method, target detection method, storage medium and computing device
CN111815627B (en) Remote sensing image change detection method, model training method and corresponding device
Chaabane et al. Self attention deep graph CNN classification of times series images for land cover monitoring

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination