CN116152721A - Target detection method and device based on annealing type label transfer learning - Google Patents
- Publication number
- CN116152721A (CN202310414703.8A)
- Authority
- CN
- China
- Prior art keywords
- target detection
- class
- unknown
- label
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a target detection method and device based on annealing type label transfer learning. Based on the principle that known and unknown semantic features are informationally coupled during target image feature extraction, the method performs label transfer and feature decoupling on the known-class instances in the data and constructs an annealing type scheduling curve to dynamically adjust the degree of decoupling between the two kinds of semantic features. Effective general target information in the known classes is thereby extracted to guide the learning of unknown knowledge, which improves the model's detection of unknown classes while retaining its strong detection capability on the original known classes. The method can be applied to open-world target detection scenarios such as automatic driving, defect detection, and target tracking, and has high practical value.
Description
Technical Field
The invention relates to a target detection method, in particular to a target detection method based on annealing type label transfer learning, and also relates to a corresponding target detection device, belonging to the technical field of computer vision.
Background
Currently, the task of object detection in computer vision faces new challenges. People are increasingly dissatisfied with object detection limited to conventional closed-set scenes and have begun to work on object detection for open-world scene images. The difficulty of this task is that knowledge keeps growing in the open world: the detector must find unknown targets not included in the training set while detecting the known targets in an image, and must incrementally learn the identified unknown targets as the data set is updated.
Compared with traditional closed-set target detection, target detection in the open world presents two new challenges: (1) detection of unknown classes: an unknown instance must be detected and distinguished from similar known instances and from its context; (2) incremental learning: the identified unknown classes must be learned incrementally while balancing the learning of the original known classes and the newly annotated known classes. To this end, Joseph et al. use an unknown-class candidate region generation network with an automatic labeling strategy and provide an energy-based binary classifier to distinguish unknown classes from known classes. Yang et al. predefine a semantic centroid for each class and push object instances toward their centroids during incremental learning to sharpen the discrimination between unknown and known classes. Gupta et al. add attention-driven pseudo-labeling, novelty classification, and objectness scoring to the DETR model to detect unknown classes. Zhao et al. make additional corrections to the benchmarks and evaluation metrics of the open-world target detection task and use a non-parametric candidate box guidance module and a class-specific exclusion classifier to improve the detection of unknown classes.
The above methods detect unknown-class objects through specific network structures. However, in open-world target detection the data set contains a large amount of labeling information for known classes only, and detection of unknown classes depends on an unknown-class candidate box generation mechanism with high uncertainty. The detection performance of the existing methods on unknown classes therefore still needs to be improved.
Disclosure of Invention
The invention aims to provide a target detection method based on annealing type label transfer learning.
Another technical problem to be solved by the invention is to provide an annealing-based target detection device for label transfer learning.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
According to a first aspect of an embodiment of the present invention, there is provided a target detection method based on annealing type label transfer learning, including the steps of:
S1, using known-class data with labeling information to guide the open target detection model of the current stage to perform pre-training on the known classes;
S2, adding a combined label to each data instance to realize information decoupling of the known and unknown semantic features;
S3, constructing an annealing type scheduling curve to dynamically adjust the degree of information decoupling between the known and unknown semantic features, and guiding the open target detection model to perform collaborative learning of the known knowledge and the unknown knowledge;
S4, updating the training data set according to the task setting in the open world, and guiding the open target detection model to perform incremental learning of new classes;
S5, iteratively executing steps S1 to S4 to complete the training of the open target detection model, and performing target detection in the open world by using the trained open target detection model.
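The order in which steps S1 to S4 are repeated across incremental stages can be sketched as a small driver loop. This is only an illustrative skeleton: the phase names, the `train_open_world` function, and the callback interface are hypothetical and do not come from the patent, which does not specify any programming interface.

```python
from typing import Callable, Dict, List

# Hypothetical phase names standing in for steps S1, S2, S3 and S4.
PHASES = ["pretrain_known", "transfer_labels", "annealed_cotraining", "update_dataset"]


def train_open_world(stages: List[str],
                     phase_fns: Dict[str, Callable[[str], None]]) -> List[str]:
    """Run the four phases in order for every incremental stage; return a log."""
    log = []
    for stage in stages:
        for name in PHASES:
            phase_fns[name](stage)          # placeholder for the real training work
            log.append(f"{stage}:{name}")
    return log


def _noop(stage: str) -> None:
    """Stand-in phase that does nothing."""


log = train_open_world(["stage1", "stage2"], {name: _noop for name in PHASES})
print(log)
```

In a real system each callback would wrap the corresponding training routine; the sketch only shows the control flow.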
Preferably, the step S2 specifically includes:
An image is sampled from the training set of the current stage. The image contains several instances of known targets, each recorded as a pair consisting of a sample instance and its corresponding truth label. For each instance in the sampled image, a new label, called the transfer label of the instance, is added; the original label is called the truth label. The transfer label of every instance in every image is set to the unknown class, i.e., the transfer label treats the instance as belonging to an unknown class.
Preferably, the step S3 specifically includes:
S31, inputting the image into the open target detection model of the current stage to obtain the output classification probability of each instance;
S32, obtaining the combined unit effective code of each instance according to the truth label and the added transfer label of each target instance in the image;
S33, calculating the combined cross entropy loss according to the classification probability and the combined unit effective code;
S34, adjusting the weights of the truth label and the transfer label in the calculation rule of the combined unit effective code by adopting an annealing type scheduling strategy, and guiding the open target detection model to perform collaborative learning of the known classes and the unknown classes.
Preferably, the step S31 specifically includes:
The image is input to the open target detection model of the current stage, and the classification probability output by the detection head is obtained by the following formula:
[formula not reproduced in the source text]
Preferably, the step S32 specifically includes:
The unit effective (one-hot) encodings of the truth label and of the transfer label of each instance are computed according to the following rule:
[formula not reproduced in the source text]
The rule is expressed in terms of the total number of known classes. The combined unit effective code is then calculated by the following formula, so that it contains both the known-class information carried by the truth label and the unknown-class information carried by the transfer label, and the effective unknown-class features contained in the instance are decoupled:
[formula not reproduced in the source text]
Preferably, in step S33, the combined cross entropy loss is calculated by the following formula:
[formula not reproduced in the source text]
where the terms are the coupling degree of the known-class and unknown-class features, and the probability on each class given by the classifier output after normalization by the softmax function.
Preferably, in step S34, the annealing scheduling strategy that regulates the change of the coupling degree is defined as follows:
[formula not reproduced in the source text]
where the quantities are the number of iterations of the current stage, the total number of iterations of the pre-training stage in step S1, and a constant that adjusts how the weight changes as the number of iterations increases.
Preferably, the step S4 specifically includes:
S41, adding new known categories and updating the data set according to the task setting in the open world;
S42, pre-training the open target detection model of the new stage on the known classes by using the updated data set;
S43, performing label transfer on the updated data set, and adopting an annealing scheduling strategy to guide the open target detection model of the new stage to perform collaborative learning of the known classes and the unknown classes;
S44, performing small-sample incremental fine-tuning training on the open target detection model of the new stage to retain its detection capability on the semantics of the original known categories.
Preferably, in step S41, after the open target detection model has been trained using the training set of stage t, training of stage t+1 is carried out according to the task setting in the open world: n new classes are incrementally added to the known classes of the data set, i.e., the known class set is updated, and the training set is updated accordingly.
According to a second aspect of the embodiment of the present invention, there is provided an object detection device based on annealing type tag transfer learning, including a processor and a memory, where the processor reads a computer program in the memory, and is configured to execute the above object detection method based on annealing type tag transfer learning.
Compared with the prior art, the target detection method and device based on annealing type label transfer learning provided by the invention rely on the principle that known and unknown semantic features are informationally coupled during target image feature extraction. Label transfer and feature decoupling are performed on the known-class instances in the data, and an annealing type scheduling curve is constructed to dynamically adjust the degree of decoupling between the two kinds of semantic features. Effective general target information in the known classes is thereby extracted to guide the learning of unknown knowledge, which improves the model's detection of unknown classes while retaining its strong detection capability on the original known classes. The method can be applied to open-world target detection scenarios such as automatic driving, defect detection, and target tracking, and has high practical value.
Drawings
FIG. 1 is a flowchart of a training process of an open target detection model used in a target detection method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a visual result of an open target detection model capable of detecting known and unknown class instances and performing incremental learning during an autopilot mission;
FIG. 3 is a flowchart of constructing an annealing type scheduling curve to dynamically allocate the information decoupling degrees of the known and unknown semantic features and guiding an open target detection model to perform collaborative learning of the known and unknown knowledge in the embodiment of the present invention;
FIG. 4 is a flowchart of updating the training data set according to the task setting in the open world and guiding the open target detection model to perform incremental learning of new classes in accordance with an embodiment of the present invention;
FIG. 5 is a schematic diagram of a target detection device based on annealing type label transfer learning according to an embodiment of the present invention.
Detailed Description
The technical contents of the present invention will be described in detail with reference to the accompanying drawings and specific examples.
Autopilot, defect detection, target tracking, etc. are typical open world target detection application scenarios. In the embodiment of the present invention, an autopilot scenario is mainly described as an example, but the application scenario of the present invention is not limited thereto. As shown in fig. 1, the present invention first provides a target detection method based on annealing type label transfer learning. The open target detection model used by the target detection method is obtained through training the following steps: s1, using known class data with labeling information to guide an open target detection model (simply called a model) of the current stage to perform the pre-training of the known class; s2, adding a combined label for the data example to realize information decoupling of known and unknown semantic features; s3, constructing an annealing type scheduling curve to dynamically allocate the information decoupling degrees of the known semantic features and the unknown semantic features, and guiding an open target detection model to perform collaborative learning of the known knowledge and the unknown knowledge; and S4, performing updating operation of the training data set according to task setting in the open world, and guiding the open target detection model to perform new incremental learning.
In one embodiment of the present invention, steps S1, S2, S3 and S4 are performed iteratively. Incremental learning is divided into four stages in total; each new stage must learn the new known classes while retaining the detection capability for the original known classes, and must also be able to detect unknown instances. The open target detection model is trained with an SGD optimizer and a batch size of 8. For the hyperparameters, the peak value is set to 1 and the curve change rate is set to [value not reproduced in the source text]. In the known-class pre-training phase of step S1, the initial learning rate is set to 0.01. In the unknown-class training phase of step S3, learning rate decay is added: the initial learning rate is set to 0.0001 and is reduced to 1/10 of its previous value at training iterations 12000 and 16000. Training is iterated until the loss function of the open target detection model converges, and the parameters of each neural network layer that perform best on the validation set are saved, completing the training of the open target detection model.
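The step decay used in the unknown-class training phase can be written as a small function. The sketch below uses the values stated in this embodiment (initial rate 0.0001, a factor of 1/10 at iterations 12000 and 16000); the function name `step_lr` and the choice to apply the decay exactly at the milestone iteration are assumptions made for illustration.

```python
def step_lr(iteration: int,
            base_lr: float = 1e-4,
            milestones=(12000, 16000),
            gamma: float = 0.1) -> float:
    """Learning rate after multiplying by gamma once per milestone already reached."""
    reached = sum(1 for m in milestones if iteration >= m)
    return base_lr * (gamma ** reached)


for it in (0, 11999, 12000, 16000):
    print(it, step_lr(it))
```

Deep learning frameworks typically provide an equivalent multi-step scheduler, so in practice this function would be replaced by the framework's own component.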
On the basis of the trained open target detection model, the target detection method can be applied to scenarios such as automatic driving, defect detection, and target tracking, i.e., step S5 is implemented: targets in the open world are detected by using the trained open target detection model. Taking the automatic driving scenario as an example, the inventor applies the method to an open target detection data set collected in real natural scenes in order to verify the practical effect of the target detection method provided by the embodiment of the invention. The details are as follows:
The inventor selects image data from the Pascal-VOC and MS-COCO data sets, which were acquired in natural scenes and contain images of 80 object classes. Incremental learning is divided into four stages; in each stage, 20 classes are set as new known classes, and all classes other than the original known classes and the new known classes are set as unknown classes. In the first stage, the number of original known classes is 0, so the stage contains only the 20 new classes and the remaining unknown classes; in the fourth stage, the new known classes and the original known classes total 80, and no unknown class remains.
FIG. 2 shows a visual example on the automatic driving task. The target detection method provided by the embodiment of the invention can detect known classes such as automobiles and pedestrians in an open road scene, and can also detect objects it has not learned, namely a bridge deck baffle (left in FIG. 2), a tire barrier (right in FIG. 2), and a skateboard in the middle of the road (right in FIG. 2), marking them as unknown classes. Known and unknown targets in the open road scene are thus identified together, which helps avoid accidents caused by unknown objects appearing during automatic driving.
To measure the performance of the target detection method provided by the embodiment of the invention quantitatively, the inventor adopts the known-class mean average precision (K-mAP), the unknown-class mean average precision (U-mAP), and the unknown-class recall (U-Recall) as metrics and compares the invention with other similar methods under the same conditions. The results from the first stage to the fourth stage are shown in Table 1.
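The class counts per stage follow directly from the setup above (80 classes, four stages, 20 new known classes per stage). The short sketch below computes them; the function name `stage_split` and the dictionary keys are illustrative only.

```python
def stage_split(stage: int, total_classes: int = 80, new_per_stage: int = 20) -> dict:
    """Number of original known, new known and unknown classes at a 1-based stage."""
    original_known = (stage - 1) * new_per_stage
    unknown = total_classes - original_known - new_per_stage
    return {"original_known": original_known,
            "new_known": new_per_stage,
            "unknown": unknown}


for s in range(1, 5):
    print(s, stage_split(s))
```

The first stage has no original known classes and 60 unknown classes, and the fourth stage has no unknown classes, which matches the description.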
TABLE 1
As can be seen from Table 1, the method provided by the invention outperforms other similar methods in detecting both unknown and known classes. In particular, the average precision on unknown classes in the first stage is improved by 200%, which indicates good practical value for automatic driving tasks.
The specific training process of the open target detection model is described in further detail below.
In one embodiment of the present invention, step S1 specifically includes the following sub-steps: the training set of the current stage, which contains a set of known classes, is used to guide the open target detection model of the current stage through multiple rounds of iteration to perform pre-training on the known classes. The loss function at pre-training is as follows:
[formula not reproduced in the source text]
where the terms are the unit effective (one-hot) code of the data truth label and the probability on each class obtained by processing the classifier output with the normalized exponential function, i.e., the softmax function.
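Since the pre-training loss is described in terms of a one-hot truth code and softmax probabilities, a standard cross-entropy loss is a natural reading. The sketch below implements that standard form as an assumption; the patent's own formula is not available in this text, and the function names are illustrative.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def one_hot(label: int, num_classes: int):
    """One-hot (unit effective) code of a class index."""
    return [1.0 if k == label else 0.0 for k in range(num_classes)]


def cross_entropy(logits, label: int) -> float:
    """Assumed standard form: minus the sum over classes of y_k * log(p_k)."""
    probs = softmax(logits)
    target = one_hot(label, len(logits))
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0.0)


print(cross_entropy([2.0, 0.5, -1.0], label=0))
```

With two equal logits the loss equals log 2, which is a quick sanity check for the implementation.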
In one embodiment of the present invention, step S2 specifically includes the following sub-steps: an image is sampled from the training set of the current stage. The image contains several instances of known targets, each of which can be recorded as a pair consisting of a sample instance and its corresponding truth label. Label transfer is then performed on the sampled image: for every instance in the sampled image, a new label, called the transfer label of the instance, is added, and the original label is called the truth label. The transfer label of every instance in every image is set to the unknown class, i.e., the transfer label treats the instance as belonging to an unknown class. In this way, effective unknown-class semantic features are decoupled from the known-class instances without any unknown-class supervision information, which reduces the uncertainty of unknown-class identification.
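The label transfer step amounts to attaching a second label to every annotated instance. The following sketch shows one way to represent this; the `Instance` dataclass, its field names, and the `"unknown"` marker are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

UNKNOWN = "unknown"  # hypothetical marker for the unknown class


@dataclass
class Instance:
    box: Tuple[float, float, float, float]  # (x1, y1, x2, y2), illustrative layout
    truth_label: str                        # original annotation
    transfer_label: Optional[str] = None    # added by label transfer


def add_transfer_labels(instances: List[Instance]) -> List[Instance]:
    """Return copies of the instances whose transfer label is the unknown class."""
    return [Instance(i.box, i.truth_label, UNKNOWN) for i in instances]


image_instances = [Instance((0, 0, 10, 10), "car"), Instance((5, 5, 8, 9), "person")]
for inst in add_transfer_labels(image_instances):
    print(inst.truth_label, inst.transfer_label)
```

The truth label is left untouched, so each instance carries both kinds of information afterwards.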
As shown in fig. 3, the step S3 specifically includes the following sub-steps:
S31, inputting the image into the open target detection model of the current stage to obtain the output classification probability of each instance (the logits, i.e., the raw values output by the model before processing by the softmax function);
S32, obtaining the combined unit effective code of each instance according to the truth label and the added transfer label of each target instance in the image;
S33, calculating the combined cross entropy loss according to the classification probability and the combined unit effective code;
S34, adjusting the weights of the truth label and the transfer label in the calculation rule of the combined unit effective code by adopting an annealing type scheduling strategy, and guiding the open target detection model to perform collaborative learning of the known classes and the unknown classes.
In one embodiment of the invention, the classification probability is obtained by:
The image is input to the open target detection model of the current stage, and the classification probability output by the detection head is acquired.
In one embodiment of the invention, the combined unit effective code is calculated by the steps of:
The unit effective encodings of the truth label and of the transfer label of each instance are computed according to the following rule:
[formula not reproduced in the source text]
The rule is expressed in terms of the total number of known classes. The combined unit effective code is then calculated.
The combined code contains both the known-class information carried by the truth label and the unknown-class information carried by the transfer label, so that the effective unknown-class features contained in the instances are decoupled. The specific calculation method is as follows:
[formula not reproduced in the source text]
where the coupling degree of the known-class and unknown-class features is a hyperparameter that controls the weights of the truth label and the transfer label in the combined loss calculation.
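One plausible way to combine the two one-hot codes is a convex mixture weighted by the coupling degree. The sketch below is an assumption, not the patent's formula: it assumes the unknown class occupies one extra index after the known classes and that the coupling degree is the weight given to the transfer label. The names `combined_code` and `coupling` are hypothetical.

```python
def one_hot(index: int, size: int):
    """One-hot (unit effective) code of a class index."""
    return [1.0 if k == index else 0.0 for k in range(size)]


def combined_code(truth: int, num_known: int, coupling: float):
    """Assumed mixture: (1 - coupling) * truth code + coupling * unknown code."""
    if not 0.0 <= coupling <= 1.0:
        raise ValueError("coupling must lie in [0, 1]")
    size = num_known + 1                    # assumption: one extra slot for "unknown"
    truth_code = one_hot(truth, size)
    transfer_code = one_hot(num_known, size)
    return [(1.0 - coupling) * a + coupling * b
            for a, b in zip(truth_code, transfer_code)]


print(combined_code(truth=1, num_known=3, coupling=0.25))
```

Under this assumption the result is still a probability vector, with most of the mass on the truth class when the coupling degree is small.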
In one embodiment of the invention, the combined cross entropy loss is calculated by:
The corresponding cross entropy loss is calculated from the classification probability and the combined unit effective code. The specific calculation formula is as follows:
[formula not reproduced in the source text]
where the coupling degree of the known-class and unknown-class features controls the weights of the truth label and the transfer label in the combined loss calculation (a hyperparameter), and the remaining term is the probability on each class given by the classifier output after normalization by the softmax function.
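Given any target vector that mixes the truth and transfer codes, a cross-entropy loss against softmax probabilities can be computed directly from the raw logits. The sketch below shows this generic soft-target cross entropy; treating it as the combined loss is an assumption, and the example target vector is made up for illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of raw scores."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]


def soft_cross_entropy(logits, target) -> float:
    """Minus the sum over classes of target_k * log softmax(logits)_k."""
    if len(logits) != len(target):
        raise ValueError("logits and target must have the same length")
    return -sum(t * lp for t, lp in zip(target, log_softmax(logits)))


# Three known classes plus one unknown slot; the target values are illustrative.
logits = [1.2, 3.0, -0.5, 0.8]
target = [0.0, 0.75, 0.0, 0.25]
print(soft_cross_entropy(logits, target))
```

When the target is a pure one-hot vector this reduces to the ordinary cross-entropy loss.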
In one embodiment of the invention, collaborative learning guided by an annealing scheduling policy specifically comprises the following steps:
As the number of iterations changes, an annealing scheduling strategy adjusts the coupling degree of the known-class and unknown-class features, thereby regulating the weights with which the truth label and the transfer label take part in the combined loss calculation. This guides the open target detection model to learn the unknown classes and the known classes collaboratively and finally reach a balance between the two kinds of knowledge. Specifically, the annealing scheduling strategy that regulates the change of the coupling degree is defined as follows:
[formula not reproduced in the source text]
where the quantities are the number of iterations of the current stage, the total number of iterations of the pre-training stage in step S1, and a constant that adjusts how the weight changes as the number of iterations increases.
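The exact scheduling curve is not available in this text, so the following is only a hypothetical stand-in that uses the same three ingredients: the current iteration count, the total number of pre-training iterations, and a rate constant. It assumes an exponential decay from a peak value, which is one common annealing shape and may differ from the patented curve. The function name and parameters are illustrative.

```python
import math


def annealed_coupling(iteration: int, pretrain_iters: int,
                      rate: float, peak: float = 1.0) -> float:
    """Hypothetical schedule: peak * exp(-rate * iteration / pretrain_iters)."""
    if pretrain_iters <= 0:
        raise ValueError("pretrain_iters must be positive")
    if iteration < 0:
        raise ValueError("iteration must be non-negative")
    return peak * math.exp(-rate * iteration / pretrain_iters)


for it in (0, 5000, 10000, 20000):
    print(it, round(annealed_coupling(it, pretrain_iters=10000, rate=2.0), 4))
```

The rate value used in the demonstration is arbitrary; in this sketch a larger rate shifts the weight between the truth and transfer labels more quickly.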
As shown in fig. 4, the step S4 specifically includes the following sub-steps:
S41, adding new known categories and updating the data set according to the task setting in the open world;
S42, pre-training the detection model of the new stage on the known classes by using the updated data set;
S43, performing label transfer on the updated data set, and adopting an annealing scheduling strategy to guide the model of the new stage to perform collaborative learning of the known classes and the unknown classes;
S44, performing small-sample incremental fine-tuning training on the model of the new stage to retain the model's detection capability on the semantics of the original known categories.
In one embodiment of the invention, the updating of the data set specifically comprises the following sub-steps:
After the model has been trained using the training set of stage t, training of stage t+1 is carried out according to the task setting in the open world: n new classes (n is a positive integer, the same applies below) are incrementally added to the known classes of the data set, i.e., the known class set is updated, and the training set is updated accordingly.
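The update of the known class set can be pictured as a set union with the newly introduced classes. The sketch below is illustrative; the function name and the check that new classes do not overlap the existing ones are assumptions.

```python
def update_known_classes(known: set, new_classes: list) -> set:
    """Return the known class set of the next stage after adding n new classes."""
    overlap = known & set(new_classes)
    if overlap:
        raise ValueError(f"classes already known: {sorted(overlap)}")
    return known | set(new_classes)


known_t = {"car", "person"}
known_next = update_known_classes(known_t, ["bicycle", "truck"])
print(sorted(known_next))
```

The training set of the next stage would then be rebuilt so that instances of the added classes carry known-class annotations.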
In one embodiment of the invention, the known-class pre-training of stage t+1 specifically comprises the following sub-steps:
The training set of stage t+1 is used to pre-train the model of stage t+1 on the new known class set. The loss of the training process is calculated as follows:
[formula not reproduced in the source text]
where the terms are the unit effective code of the truth labels of the known-class data of stage t+1, and the probability on each class given by the classifier output at stage t+1 after normalization by the softmax function.
In one embodiment of the present invention, the collaborative learning guided by the annealing type scheduling strategy at stage t+1 specifically comprises the following sub-steps:
Label transfer is performed on the data set of the new stage, and the open target detection model is guided to perform collaborative training of the known classes and the unknown classes. The loss function at training is as follows:
[formula not reproduced in the source text]
where the terms are the coupling degree of the known-class and unknown-class features, the probability on each class given by the classifier output after normalization by the softmax function, and the unit effective codes of the truth label, the transfer label, and the final combined label at time t+1.
In one embodiment of the invention, the small sample fine tuning training specifically comprises the following sub-steps:
A sample replay strategy from incremental learning is adopted to preserve the model's semantic recognition capability on the old known classes. A small sample set is constructed from the previous stage, containing a set number of samples for each known class. After the incremental learning of stage t+1 using the new data set is finished, the small sample set is used to fine-tune the model, and the loss of the training process is calculated as follows:
[formula not reproduced in the source text]
where the probability term is the final classification output in fine-tuning training, normalized by the softmax function, on each class. The updated open target detection model is finally obtained.
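Building the small replay set can be sketched as drawing a fixed number of samples per known class. The code below is a minimal illustration; the per-class count, the random selection, and the function name `build_replay_set` are assumptions, since the text only states that the set holds a certain number of samples of each known class.

```python
import random
from collections import defaultdict


def build_replay_set(samples, per_class: int, seed: int = 0):
    """Pick up to per_class (item, label) pairs for every label in samples."""
    by_class = defaultdict(list)
    for item, label in samples:
        by_class[label].append(item)
    rng = random.Random(seed)               # fixed seed for reproducibility
    replay = []
    for label in sorted(by_class):
        items = by_class[label]
        chosen = rng.sample(items, min(per_class, len(items)))
        replay.extend((item, label) for item in chosen)
    return replay


samples = [(f"img{i}", "car") for i in range(5)] + [("img5", "person")]
replay = build_replay_set(samples, per_class=2)
print(replay)
```

Classes with fewer samples than the requested count contribute everything they have, so rare classes are never dropped from the replay set.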
On the basis of the target detection method based on the annealing type label transfer learning, the invention further provides a target detection device based on the annealing type label transfer learning. As shown in fig. 5, the object detection device includes one or more processors 11 and a memory 12. Wherein the memory 12 is coupled to the processor 11 for storing one or more programs that, when executed by the one or more processors 11, cause the one or more processors 11 to implement the target detection method based on annealed tag migration learning as in the above embodiments.
The processor 11 is configured to control the overall operation of the target detection apparatus to complete all or part of the steps of the target detection method based on annealing type label transfer learning. The processor 11 may be a Central Processing Unit (CPU), a Graphics Processor (GPU), a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), a Digital Signal Processing (DSP) chip, or the like. The memory 12 is used to store various types of data to support operation at the object detection device, which may include, for example, instructions for any application or method operating on the object detection device, as well as application-related data.
The memory 12 may be implemented by any type or combination of volatile or non-volatile memory devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, etc.
In an exemplary embodiment, the target detection device based on annealing type label transfer learning may be implemented by a computer chip or a physical entity, or by a product having a certain function, so as to perform the target detection method based on annealing type label transfer learning and achieve technical effects consistent with the method. One exemplary embodiment is a computer. Specifically, the computer may be, for example, a personal computer, a laptop computer, a vehicle-mounted human-machine interaction device, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
In another exemplary embodiment, the invention also provides a computer readable storage medium comprising program instructions which, when executed by a processor, implement the steps of the target detection method based on annealing type label transfer learning in any of the above embodiments. For example, the computer readable storage medium may be a memory including program instructions executable by a processor of the target detection device based on annealing type label transfer learning to complete the target detection method and achieve technical effects consistent with the method described above.
Compared with the prior art, the target detection method and device based on annealing type label transfer learning provided by the invention start from the observation that information coupling exists between known and unknown semantic features during target image feature extraction. Label transfer and feature decoupling are performed on the known-class instances in the data, and an annealing type scheduling curve is constructed to dynamically allocate the degree of decoupling between the two types of semantic features. Effective general target information in the known classes is thereby extracted to guide the learning of unknown knowledge, which improves the model's detection of unknown classes while retaining its original detection capability on known classes. The method can be applied to open-world target detection scenarios such as automatic driving, defect detection, and target tracking, and has high practical application value.
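The interplay of label transfer and annealing can be sketched in Python. The patent's own formulas are not present in this text, so everything concrete here is an assumption made for illustration: the combined encoding is taken to be a convex blend of the truth-label and transfer-label one-hot vectors, the transfer-label weight is taken to follow a cosine curve from `start` down to `end`, and the unknown class is placed at the index after the last known class. The function names are hypothetical.

```python
import math


def annealed_weight(step, total_steps, start=0.5, end=0.0):
    """Cosine-annealed weight for the transfer label (assumed schedule)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return end + 0.5 * (start - end) * (1.0 + math.cos(math.pi * progress))


def one_hot(index, size):
    """One-hot vector of length `size` with a 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def combined_encoding(truth_label, num_known, weight):
    """Blend the truth-label and transfer-label one-hot vectors.

    Classes 0..num_known-1 are known; index num_known stands for the unknown
    class, which serves as the transfer label of every instance (assumption).
    """
    size = num_known + 1
    truth = one_hot(truth_label, size)
    transfer = one_hot(num_known, size)
    return [(1.0 - weight) * t + weight * u for t, u in zip(truth, transfer)]


def combined_cross_entropy(probs, target, eps=1e-12):
    """Cross-entropy between predicted probabilities and the soft target."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, probs))
```

With these defaults the target for a known instance gradually returns to its pure truth label as training proceeds; a different schedule direction or curve shape would only require changing `annealed_weight`.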
The target detection method and device based on annealing type label transfer learning provided by the invention have been described in detail above. Any obvious modification of the invention that does not depart from its essential spirit would constitute an infringement of the patent rights of the invention and would incur the corresponding legal liability.
Claims (10)
1. A target detection method based on annealing type label transfer learning, characterized by comprising the following steps:
S1, using known-class data with labeling information to guide the open target detection model of the current stage to perform pre-training on the known classes;
S2, adding a combined label to each data instance to realize information decoupling of known and unknown semantic features;
S3, constructing an annealing type scheduling curve to dynamically allocate the degree of information decoupling between the known and unknown semantic features, and guiding the open target detection model to perform collaborative learning of known and unknown knowledge;
S4, updating the training data set according to the task setting in the open world, and guiding the open target detection model to perform new incremental learning;
S5, iteratively executing steps S1 to S4 to complete training of the open target detection model, and performing target detection in the open world using the trained open target detection model.
2. The target detection method based on annealing type label transfer learning as claimed in claim 1, wherein step S2 specifically comprises:
sampling an image from the training set of the current stage, the image containing a plurality of instances of known targets, each annotated as a pair consisting of a sample instance and its corresponding truth label; for each instance in the sampled image, adding a new label, called the transfer label of that instance, the original label being called the truth label; and setting the transfer labels of all instances in each image to the unknown class, i.e., the transfer label of each instance is regarded as the unknown class.
3. The target detection method based on annealing type label transfer learning as claimed in claim 2, wherein step S3 specifically comprises:
S31, inputting the image into the open target detection model of the current stage to obtain the output classification probability of each instance;
S32, obtaining the combined one-hot encoding of each instance from the truth label and the added transfer label of each target instance in the image;
S33, calculating the combined cross-entropy loss from the classification probability and the combined one-hot encoding;
S34, adopting an annealing type scheduling strategy to adjust the weights of the truth label and the transfer label in the calculation rule of the combined one-hot encoding, and guiding the open target detection model to perform collaborative learning of known classes and unknown classes.
4. The target detection method based on annealing type label transfer learning as claimed in claim 3, wherein step S31 specifically comprises:
inputting the image into the open target detection model of the current stage, wherein the classification probability output by the detection head is obtained by the following formula:
5. The target detection method based on annealing type label transfer learning as claimed in claim 3, wherein step S32 specifically comprises:
computing the one-hot encoding of the truth label of the instance and the one-hot encoding of the transfer label according to the following calculation rule:
wherein the class count used in the rule is the total number of known classes; the combined one-hot encoding is then calculated, which simultaneously contains the known-class and unknown-class information carried by the truth label and the transfer label, and decouples the effective unknown-class features contained in the instance:
6. The target detection method based on annealing type label transfer learning as claimed in claim 3, wherein in step S33, the combined cross-entropy loss is calculated by the following formula:
7. The target detection method based on annealing type label transfer learning as claimed in claim 3, wherein in step S34, the annealing type scheduling strategy used to control the variation of the weight is defined as follows:
8. The target detection method based on annealing type label transfer learning as claimed in claim 1, wherein step S4 specifically comprises:
S41, adding new known categories and updating the data set according to the task setting in the open world;
S42, pre-training the open target detection model of the new stage on the known classes using the updated data set;
S43, performing label transfer on the updated data set, and adopting the annealing type scheduling strategy to guide the open target detection model of the new stage to perform collaborative learning of known classes and unknown classes;
S44, performing small-sample incremental fine-tuning training on the open target detection model of the new stage to retain its detection capability for the semantics of the original known categories.
9. The target detection method based on annealing type label transfer learning as claimed in claim 8, wherein in step S41, after training of the open target detection model on the training set of one stage is completed, n new classes are incrementally added to the known classes of the data set for training in the following stage according to the task setting in the open world, i.e., the known class set is updated to include the n new classes.
10. A target detection device based on annealing type label transfer learning, comprising a processor and a memory, wherein the processor reads a computer program from the memory to execute the target detection method based on annealing type label transfer learning according to any one of claims 1 to 9.
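The stage-to-stage update of the known class set described in claims 8 and 9 can be pictured with a short Python sketch. It is a sketch under stated assumptions, not the claimed procedure: class names are plain strings, the unknown class is assumed to occupy the index one past the last known class, and the helper names `extend_known_classes`, `unknown_index`, and `remap_label` are hypothetical.

```python
def extend_known_classes(known_classes, new_classes):
    """Return the known-class list for the next stage.

    New names are appended in order; names already known are skipped so
    that existing class indices stay stable across stages.
    """
    updated = list(known_classes)
    for name in new_classes:
        if name not in updated:
            updated.append(name)
    return updated


def unknown_index(known_classes):
    """Index reserved for the unknown class: one past the last known class."""
    return len(known_classes)


def remap_label(name, known_classes):
    """Map a class name to its index, or to the unknown index if not yet known."""
    if name in known_classes:
        return known_classes.index(name)
    return unknown_index(known_classes)
```

Under this convention a category that was treated as unknown in one stage receives its own index once it is added, and the unknown slot moves to the end of the enlarged set.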
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310414703.8A CN116152721B (en) | 2023-04-18 | 2023-04-18 | Target detection method and device based on annealing type label transfer learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310414703.8A CN116152721B (en) | 2023-04-18 | 2023-04-18 | Target detection method and device based on annealing type label transfer learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116152721A (en) | 2023-05-23 |
CN116152721B (en) | 2023-06-20 |
Family
ID=86360369
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310414703.8A Active CN116152721B (en) | 2023-04-18 | 2023-04-18 | Target detection method and device based on annealing type label transfer learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116152721B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104751198A (en) * | 2013-12-27 | 2015-07-01 | 华为技术有限公司 | Method and device for identifying target object in image |
CN112084866A (en) * | 2020-08-07 | 2020-12-15 | 浙江工业大学 | Target detection method based on improved YOLO v4 algorithm |
US20220092407A1 (en) * | 2020-09-23 | 2022-03-24 | International Business Machines Corporation | Transfer learning with machine learning systems |
CN115731445A (en) * | 2021-08-26 | 2023-03-03 | 丰田自动车株式会社 | Learning method, information processing apparatus, and recording medium having learning program recorded thereon |
2023-04-18: CN application CN202310414703.8A filed; patent CN116152721B, status Active
Non-Patent Citations (3)
Title |
---|
K J JOSEPH et al.: "Towards Open World Object Detection", Retrieved from the Internet <URL:https://arxiv.org/abs/2103.02603> * |
NA DONG et al.: "OpenWorld DETR: Transformer based Open World Object Detection", Retrieved from the Internet <URL:https://arxiv.org/abs/2212.02969> * |
ZHENYU WANG et al.: "Detecting Everything in the OpenWorld: Towards Universal Object Detection", Retrieved from the Internet <URL:https://arxiv.org/abs/2303.11749> * |
Also Published As
Publication number | Publication date |
---|---|
CN116152721B (en) | 2023-06-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
He et al. | Local descriptors optimized for average precision | |
CN110852447A (en) | Meta learning method and apparatus, initialization method, computing device, and storage medium | |
CN109214378A (en) | A kind of method and system integrally identifying metering meter reading based on neural network | |
CN111027605A (en) | Fine-grained image recognition method and device based on deep learning | |
CN106372624B (en) | Face recognition method and system | |
Cermelli et al. | Modeling missing annotations for incremental learning in object detection | |
CN116910571B (en) | Open-domain adaptation method and system based on prototype comparison learning | |
CN112884147A (en) | Neural network training method, image processing method, device and electronic equipment | |
CN114186063A (en) | Training method and classification method of cross-domain text emotion classification model | |
WO2023093124A1 (en) | Lane line tracking method and apparatus, and computer device, storage medium and computer program product | |
CN112527959A (en) | News classification method based on pooling-free convolution embedding and attention distribution neural network | |
CN117218408A (en) | Open world target detection method and device based on causal correction learning | |
CN112836753B (en) | Method, apparatus, device, medium, and article for domain adaptive learning | |
CN114140645A (en) | Photographic image aesthetic style classification method based on improved self-supervision feature learning | |
CN116152721B (en) | Target detection method and device based on annealing type label transfer learning | |
CN116958809A (en) | Remote sensing small sample target detection method for feature library migration | |
CN114863193B (en) | Long-tail learning image classification and training method and device based on mixed batch normalization | |
CN115018884B (en) | Visible light infrared visual tracking method based on multi-strategy fusion tree | |
CN115730312A (en) | Deep hash-based family malware detection method | |
CN115618019A (en) | Knowledge graph construction method and device and terminal equipment | |
CN114970732A (en) | Posterior calibration method and device for classification model, computer equipment and medium | |
JP2016062249A (en) | Identification dictionary learning system, recognition dictionary learning method and recognition dictionary learning program | |
CN112861594A (en) | Online handwritten digit recognition method based on incremental semi-supervised kernel extreme learning machine | |
CN111984872A (en) | Multi-modal information social media popularity prediction method based on iterative optimization strategy | |
CN118552813B (en) | Training method for image processing model, electronic device and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||