CN114912516A - Cross-domain target detection method and system for coordinating feature consistency and specificity - Google Patents



Publication number
CN114912516A
Authority
CN
China
Prior art keywords
domain
target
memory
category
source
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210440038.5A
Other languages
Chinese (zh)
Other versions
CN114912516B (en
Inventor
王晓伟
蒋沛文
王惠
秦晓辉
边有钢
胡满江
秦洪懋
徐彪
谢国涛
秦兆博
丁荣军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuxi Institute Of Intelligent Control Hunan University
Original Assignee
Wuxi Institute Of Intelligent Control Hunan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuxi Institute Of Intelligent Control Hunan University filed Critical Wuxi Institute Of Intelligent Control Hunan University
Priority to CN202210440038.5A priority Critical patent/CN114912516B/en
Publication of CN114912516A publication Critical patent/CN114912516A/en
Application granted granted Critical
Publication of CN114912516B publication Critical patent/CN114912516B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a cross-domain target detection method and system for coordinating feature consistency and specificity. The method comprises: step 1, constructing a source domain data set and a target domain data set, and building a reference cross-domain target detection model; step 2, continuously updating the memory elements in a memory unit through a feature specificity memory read-write module, thereby guiding the reference cross-domain target detection model to learn feature specificity; using the source domain memory elements and target domain memory elements, through a feature consistency weighted alignment module, to guide the confusion of same-category memory elements of the two domains, and weighting the loss function of each category-level domain discriminator according to the occurrence proportion of the target categories to be detected, so that the features further learn cross-domain consistency on the basis of semantic specificity, thereby obtaining the cross-domain target detection model; and step 3, training the model with the loss function of the cross-domain target detection model for coordinating feature consistency and specificity as the optimization target, and applying the trained model to the target domain.

Description

Cross-domain target detection method and system for coordinating feature consistency and specificity
Technical Field
The invention relates to the technical field of domain-adaptive target detection based on deep learning, and in particular to a cross-domain target detection method and system for coordinating feature consistency and specificity.
Background
Current target detection models based on deep learning generally suffer from domain shift, caused by the difference in data distribution between the training set (called the source domain) and the test set (called the target domain). This problem limits how well a target detection model generalizes, and poses great challenges for practical application scenarios such as road traffic video surveillance and intelligent vehicle target detection. Unsupervised domain-adaptive target detection attempts to transfer the knowledge of the source domain to the target domain, and can improve the detection performance of a target detection model across domains without additional labeling of target domain data and without retraining the model. How to use an appropriate domain adaptation strategy to complete the knowledge transfer from the source domain to the target domain, and thus improve the cross-domain robustness of a target detection model, is becoming a focus of attention in research fields such as computer vision and transfer learning.
Existing unsupervised domain-adaptive target detection methods usually start from the pixel level, the image level and the instance level. Domain discriminators of different levels are added to the target detector, and a gradient reversal layer inverts the sign of the gradient flowing from the domain discriminators to the target detector, which establishes an adversarial game between the target detector and the domain discriminators. During training, the target detector continuously generates source domain features and target domain features that are as similar as possible in order to fool the domain discriminator, while the domain discriminator tries as far as possible to distinguish whether the features generated by the target detector come from the source domain or the target domain. Practice shows that this adversarial interplay between the target detector and the domain discriminator can significantly enhance the consistency of the source domain and target domain features and greatly improve the cross-domain detection capability of the target detection model.
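The behaviour of the gradient reversal layer described above can be illustrated with a minimal pure-Python sketch. It is not taken from the patent: the class name `GradientReversal` and the `scale` parameter are assumptions made for illustration, and a real implementation would hook into the automatic differentiation of a deep learning framework.

```python
class GradientReversal:
    """Identity in the forward pass; sign-inverted gradient in the backward pass."""

    def __init__(self, scale=1.0):
        # Hypothetical strength of the reversed gradient (not specified in the text).
        self.scale = scale

    def forward(self, features):
        # Features pass through unchanged on their way to the domain discriminator.
        return list(features)

    def backward(self, grad_from_discriminator):
        # The detector receives the discriminator's gradient with its sign inverted,
        # so minimizing the discriminator loss pushes the detector to confuse it.
        return [-self.scale * g for g in grad_from_discriminator]


grl = GradientReversal(scale=1.0)
forwarded = grl.forward([0.5, -1.2, 3.0])
reversed_grad = grl.backward([0.1, -0.4, 0.25])
```

Because the forward pass is the identity, the discriminator sees the detector's features unchanged; only the direction of the update reaching the detector is flipped.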
In domain adaptation research, feature consistency means that the extracted source domain features and target domain features reach the same state as each other, while feature specificity means that the extracted features of different categories reach states that differ from each other. The two are equally important in the feature alignment process and often conflict with each other.
Most existing unsupervised domain-adaptive target detection methods focus on learning feature consistency but ignore the potential loss of feature specificity caused by over-learning feature consistency. As a result, the detection-related tasks downstream in the model (such as classification and localization) are adversely affected and the risk of feature misalignment is high, which hinders the cross-domain target detection performance of the model to a certain extent.
In addition, the numbers of targets of different categories in the source domain data itself may be highly imbalanced. In the feature consistency learning process, a category with more targets effectively receives a larger weight, and a category with fewer targets effectively receives a smaller weight. Even if an existing unsupervised domain-adaptive target detection method is able to acquire specific features, it can hardly avoid insufficient alignment of those features when facing rich and complex practical application scenarios, which aggravates the bias of the cross-domain target detection model.
Disclosure of Invention
The invention aims to provide a cross-domain target detection method and system for coordinating feature consistency and specificity, which can coordinate feature specificity and consistency in cross-domain target detection.
To achieve the above object, the present invention provides a cross-domain target detection method for coordinating feature consistency and specificity, which comprises:
step 1, constructing a source domain data set and a target domain data set according to actual application requirements, and building a reference cross-domain target detection model capable of preliminarily extracting consistent features;
step 2, setting a feature specificity memory read-write module and a feature consistency weighted alignment module on the reference cross-domain target detection model; continuously updating, through the feature specificity memory read-write module, the memory elements in the memory unit that represent the feature information of different categories, thereby guiding the reference cross-domain target detection model to learn feature specificity; using the source domain memory elements and target domain memory elements, through the feature consistency weighted alignment module, to guide the confusion of same-category memory elements of the source domain and the target domain, and weighting the loss function of each category-level domain discriminator according to the occurrence proportion of the target categories to be detected, so that the features further learn cross-domain consistency on the basis of semantic specificity; thereby obtaining a cross-domain target detection model that coordinates feature specificity and consistency in cross-domain target detection;
step 3, training the model with the loss function of the cross-domain target detection model for coordinating feature consistency and specificity as the optimization target, and applying the trained model to the target domain.
Further, setting the feature specificity memory read-write module on the reference cross-domain target detection model in step 2 specifically includes:
step 2.1.1, acquiring source domain query vectors and target domain query vectors;
step 2.1.2, retrieving source domain memory elements and target domain memory elements;
step 2.1.3, updating the source domain memory unit and the target domain memory unit:
the read-out source domain memory element $\hat{v}_{s,k}$ is written into the memory element $v_{s,k}$ at the corresponding category position of the source domain memory unit $V_s$, where $v_{s,k} \in \mathbb{R}^d$ denotes the memory element of the k-th category of the source domain;
the read-out target domain memory element $\hat{v}_{t,k}$ is written into the memory element $v_{t,k}$ at the corresponding category position of the target domain memory unit $V_t$, where $v_{t,k} \in \mathbb{R}^d$ denotes the memory element of the k-th category of the target domain.
Further, the read-out source domain memory element $\hat{v}_{s,k}$ and the read-out target domain memory element $\hat{v}_{t,k}$ are obtained in the following two cases:
Case 1:
a1. If the k-th memory element in the source domain memory unit is retrieved by a source domain query vector whose category label index is k, the read-out source domain memory element $\hat{v}_{s,k}$ is given by formula (2):
$$\hat{v}_{s,k} = \gamma_1 \bar{q}_{s,k} + \gamma_2 v_{s,k} \qquad (2)$$
where $\bar{q}_{s,k}$ denotes the mean of the source domain query vectors belonging to category k, $\gamma_1$ denotes the weight coefficient applied to that mean, $\gamma_2$ denotes the weight coefficient applied to the memory element $v_{s,k}$ of the k-th category of the source domain, $0 < \gamma_1 < 1$, $0 < \gamma_2 < 1$, and $\gamma_1 + \gamma_2 = 1$;
a2. If the k-th memory element in the target domain memory unit is retrieved by a target domain query vector whose category label index is k, the read-out target domain memory element $\hat{v}_{t,k}$ is given by formula (4):
$$\hat{v}_{t,k} = \gamma_3 \bar{q}_{t,k} + \gamma_4 v_{t,k} \qquad (4)$$
where $\bar{q}_{t,k}$ denotes the mean of the target domain query vectors belonging to category k, $\gamma_3$ denotes the weight coefficient applied to that mean, $\gamma_4$ denotes the weight coefficient applied to the memory element $v_{t,k}$ of the k-th category of the target domain, $0 < \gamma_3 < 1$, $0 < \gamma_4 < 1$, and $\gamma_3 + \gamma_4 = 1$;
Case 2:
b1. If the k-th memory element in the source domain memory unit is not retrieved, the memory element of the k-th category of the source domain is directly assigned to the read-out source domain memory element;
b2. If the k-th memory element in the target domain memory unit is not retrieved, the memory element of the k-th category of the target domain is directly assigned to the read-out target domain memory element.
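The two read-out cases and the subsequent write-back of step 2.1.3 can be sketched in a few lines of pure Python. This is only an illustrative sketch for one domain: the function names, the 0-based category indices and the value 0.5 for the query weight are assumptions, and the same routine would be applied separately to the source domain and the target domain.

```python
def read_memory_element(memory_element, class_queries, gamma_query=0.5):
    """Return the read-out memory element for one category of one domain."""
    if not class_queries:
        # Case 2: the category was not retrieved in this iteration,
        # so the stored memory element is read out unchanged.
        return list(memory_element)
    gamma_memory = 1.0 - gamma_query  # the two weight coefficients sum to 1
    n = len(class_queries)
    dim = len(memory_element)
    mean_query = [sum(q[j] for q in class_queries) / n for j in range(dim)]
    # Case 1: weighted blend of the class-mean query vector and the stored element.
    return [gamma_query * m + gamma_memory * v
            for m, v in zip(mean_query, memory_element)]


def update_memory_unit(memory_unit, queries, labels, gamma_query=0.5):
    """Read every category's element, then write it back at the same position."""
    for k in range(len(memory_unit)):
        class_queries = [q for q, y in zip(queries, labels) if y == k]
        memory_unit[k] = read_memory_element(memory_unit[k], class_queries, gamma_query)
    return memory_unit


# Toy example: K = 2 categories, d = 2; both queries carry category index 0.
source_memory = [[0.0, 0.0], [1.0, 1.0]]
source_queries = [[2.0, 4.0], [4.0, 0.0]]
source_labels = [0, 0]
update_memory_unit(source_memory, source_queries, source_labels, gamma_query=0.5)
```

In this toy run, category 0 is blended with the mean query `[3.0, 2.0]`, while category 1, which no query retrieved, keeps its stored element.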
Further, setting the feature consistency weighted alignment module on the reference cross-domain target detection model in step 2 specifically includes:
step 2.2.1, constructing the category-level domain discriminators;
step 2.2.2, constructing a source domain vector counter and a target domain vector counter, which accumulate the number of occurrences of each category of query vector within one training round of the reference cross-domain target detection model, so as to obtain the occurrence proportion of the target categories to be detected and weight the loss function of each category-level domain discriminator;
step 2.2.3, calculating the weight of the source domain part and the weight of the target domain part of the loss function applied to each category-level domain discriminator according to the occurrence proportion of the source domain and target domain query vectors of each category in one training round.
Further, the weight of the source domain part and the weight of the target domain part of the loss function in step 2.2.3 are calculated by formula (12) and formula (13), respectively:
[Formula (12), image not available]
[Formula (13), image not available]
where $\alpha_k$ denotes the weight of the source domain part of the loss function applied to the k-th category-level domain discriminator, $\beta_k$ denotes the weight of the target domain part of the loss function applied to the k-th category-level domain discriminator, $0 < \alpha_k < 1$, and $0 < \beta_k < 1$.
Further, the weighted alignment loss function of each category-level domain discriminator in step 2.2.1 is given by formula (14):
[Formula (14), image not available]
where the terms of the formula denote, respectively, the probability predicted by the k-th category-level domain discriminator that the read-out source domain memory element belongs to the source domain, the probability predicted by the k-th category-level domain discriminator that the read-out target domain memory element belongs to the target domain, and the expectation.
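Since the formula image itself is not available in this text, the following pure-Python sketch only illustrates one plausible reading of the weighted alignment loss: a binary cross-entropy whose source domain part and target domain part are scaled by separate weights. The functional form, the function name and all numeric values are assumptions and should not be taken as the patent's formula (14).

```python
import math


def weighted_alignment_loss(p_source, p_target, alpha_k, beta_k):
    """Assumed weighted loss for the k-th category-level domain discriminator.

    p_source: predicted probabilities that read-out source elements are from the source domain.
    p_target: predicted probabilities that read-out target elements are from the target domain.
    alpha_k, beta_k: weights of the source domain part and the target domain part.
    """
    # Expectations are approximated by averages over the given predictions.
    source_term = -sum(math.log(p) for p in p_source) / len(p_source)
    target_term = -sum(math.log(p) for p in p_target) / len(p_target)
    return alpha_k * source_term + beta_k * target_term


# Hypothetical values: a confident discriminator versus a fully confused one.
loss_confident = weighted_alignment_loss([0.9], [0.9], alpha_k=0.5, beta_k=0.5)
loss_confused = weighted_alignment_loss([0.5], [0.5], alpha_k=0.5, beta_k=0.5)
```

Under this assumed form, a discriminator that cannot tell the domains apart (probabilities near 0.5) incurs the larger loss, and the weights decide how strongly each domain's part of the loss counts for that category.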
Further, step 2.2.2 specifically includes:
before each training round begins, the number $N_{s,k}$ of source domain query vectors belonging to category k in the source domain vector counter and the number $N_{t,k}$ of target domain query vectors belonging to category k in the target domain vector counter are both set to 0;
in each training iteration, the total number $n_{s,k}$ of source domain query vectors whose category label index is k and the total number $n_{t,k}$ of target domain query vectors whose category label index is k are used to update the values at the corresponding category positions of the source domain and target domain vector counters according to formula (10) and formula (11), respectively:
$$N_{s,k} \leftarrow N_{s,k} + n_{s,k} \qquad (10)$$
$$N_{t,k} \leftarrow N_{t,k} + n_{t,k} \qquad (11)$$
when a training round ends, the values stored in the source domain and target domain vector counters are the total numbers of source domain and target domain query vectors of each category in that training round, respectively.
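The counter bookkeeping of step 2.2.2 can be sketched as follows. The class name `VectorCounter`, the 0-based category indices and the `proportions` helper are assumptions made for illustration; the sketch deliberately stops at the occurrence proportions and does not attempt to reproduce the weight formulas (12) and (13).

```python
class VectorCounter:
    """Per-category counter of query vectors for one domain (source or target)."""

    def __init__(self, num_classes):
        self.counts = [0] * num_classes

    def reset(self):
        # Called before each training round begins: every N_k is set to 0.
        self.counts = [0] * len(self.counts)

    def accumulate(self, labels):
        # Formulas (10)/(11): N_k <- N_k + n_k, where n_k is the number of
        # query vectors with category index k in the current iteration.
        for k in labels:
            self.counts[k] += 1

    def proportions(self):
        # Hypothetical helper: share of each category among all counted vectors.
        total = sum(self.counts)
        return [c / total if total else 0.0 for c in self.counts]


source_counter = VectorCounter(num_classes=3)
source_counter.reset()
# Category indices of the query vectors seen in three training iterations.
for iteration_labels in ([0, 0, 1], [0, 2], [1]):
    source_counter.accumulate(iteration_labels)
```

A second instance of the same class would serve as the target domain vector counter, fed with the pseudo category labels of the target domain query vectors.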
The invention also provides a cross-domain target detection system for coordinating feature consistency and specificity, which comprises a reference cross-domain target detection model, a feature specificity memory read-write module and a feature consistency weighted alignment module, wherein:
the reference cross-domain target detection model is provided with a basic target detector, domain discriminators of different levels and a gradient reversal layer, the gradient reversal layer being used to realize adversarial training between the basic target detector and the domain discriminators of each level, so that the reference cross-domain target detection model is able to preliminarily extract consistent features;
the feature specificity memory read-write module is provided with a source domain memory unit and a target domain memory unit, which support the basic read and write operations and store the feature information of different categories; in each training iteration, the feature specificity memory read-write module reads out the source domain memory elements and target domain memory elements from the memory units, updates the source domain memory unit and the target domain memory unit themselves, guides the reference cross-domain target detection model to learn the semantic specificity of the features, and provides input for the subsequent feature consistency weighted alignment module;
the feature consistency weighted alignment module is provided with a source domain vector counter, a target domain vector counter, a plurality of category-level domain discriminators and a gradient reversal layer, wherein each category-level domain discriminator takes the source domain memory element and target domain memory element of the corresponding category as input and confuses the same-category memory elements of the source domain and the target domain; the module is also used to weight the loss function of each category-level domain discriminator according to the occurrence proportion of the target categories to be detected, so that the features further learn cross-domain consistency on the basis of semantic specificity.
Further, the feature specificity memory read-write module specifically comprises:
a query vector acquisition unit for acquiring source domain query vectors and target domain query vectors;
a memory element retrieval unit for retrieving source domain memory elements and target domain memory elements;
a memory element updating unit for updating the source domain memory unit and the target domain memory unit:
the read-out source domain memory element $\hat{v}_{s,k}$ is written into the memory element $v_{s,k}$ at the corresponding category position of the source domain memory unit $V_s$, where $v_{s,k} \in \mathbb{R}^d$ denotes the memory element of the k-th category of the source domain;
the read-out target domain memory element $\hat{v}_{t,k}$ is written into the memory element $v_{t,k}$ at the corresponding category position of the target domain memory unit $V_t$, where $v_{t,k} \in \mathbb{R}^d$ denotes the memory element of the k-th category of the target domain.
Further, the feature consistency weighted alignment module specifically includes:
a discriminator construction unit for constructing the category-level domain discriminators;
a vector counter construction unit for constructing a source domain vector counter and a target domain vector counter, which accumulate the number of occurrences of each category of query vector within one training round of the reference cross-domain target detection model;
a discrimination loss weighting unit for calculating, according to the occurrence proportion of the source domain and target domain query vectors of each category in one training round, the weight of the source domain part and the weight of the target domain part of the loss function applied to each category-level domain discriminator, so as to weight the loss function of each category-level domain discriminator.
In the invention, the feature specificity memory read-write module continuously updates the memory elements in the memory unit that represent the feature information of different categories, which guides the reference cross-domain target detection model to learn feature specificity. The source domain and target domain memory elements are then used to guide the alignment of same-category features of the two domains with equal attention to each category, which further enhances the cross-domain consistency of the features on the basis of their specificity. Feature specificity and consistency are thereby coordinated in cross-domain target detection.
Drawings
FIG. 1 is an architecture diagram of a cross-domain target detection method that coordinates feature consistency and specificity provided by an embodiment of the invention.
FIG. 2 is a flowchart of a cross-domain target detection method that coordinates feature consistency and specificity provided by an embodiment of the invention.
Fig. 3 is a flowchart of a method for implementing setting of a feature-specific memory read-write module on the reference cross-domain target detection model in step 2 of fig. 2.
FIG. 4 is a diagram illustrating a learning process of the feature-specific memory read/write module according to an embodiment of the present invention.
FIG. 5 is a flowchart of a method for implementing the setting of a feature consistency weighted alignment module on the reference cross-domain target detection model in step 2 of FIG. 2.
Fig. 6 is a schematic diagram of a learning process of a feature consistency weighted alignment module according to an embodiment of the present invention.
FIG. 7 is an architecture diagram of a cross-domain object detection system that coordinates feature consistency and specificity provided by embodiments of the present invention.
Detailed Description
The technical solutions provided by the present invention will be described in detail below, and it should be understood that the following detailed description is only illustrative of the present invention and is not intended to limit the scope of the present invention.
The key terms are defined as follows:
① Feature consistency: the source domain features and target domain features extracted by the trained model reach nearly the same state; this measures the transfer capability of the model across different domains.
② Feature specificity: the features of different categories extracted by the trained model reach states that differ from each other; this measures the ability of the model to discriminate between different categories.
As shown in fig. 1 and fig. 2, the cross-domain target detection method for coordinating feature consistency and specificity provided by the embodiment of the present invention includes:
Step 1, according to actual application requirements, constructing a source domain data set and a target domain data set, and building a reference cross-domain target detection model that is able to preliminarily extract consistent features. Here "preliminarily" means that the model can already extract consistent features but its performance is still deficient: its ability to extract consistent features may need to be improved, or it may not yet coordinate feature consistency and specificity well. The subsequent steps of the present invention are therefore needed to further improve the performance of cross-domain target detection.
Step 2, setting a feature specificity memory read-write module and a feature consistency weighted alignment module on the reference cross-domain target detection model to obtain a cross-domain target detection model for coordinating feature consistency and specificity, which maintains the specificity of the features at the semantic level and the consistency of the features at the cross-domain level.
Step 3, training the model with the loss function of the cross-domain target detection model for coordinating feature consistency and specificity as the optimization target, and applying the trained model to the target domain.
The step 1 specifically comprises:
step 1.1, a source domain data set and a target domain data set are constructed.
According to the actual application requirements, a labeled public data set is selected as the source domain, and a data set collected in the actual scene is used as the target domain. Labels here refer to bounding box labels and category labels. The target domain data has neither bounding box labels nor category labels, and the labels of the source domain data need to undergo category filtering, category merging and similar processing to ensure that the source domain and the target domain share the same categories to be detected.
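The category filtering and merging of source domain labels can be sketched as a simple mapping step. The category names, the mapping and the annotation layout below are purely hypothetical examples, not taken from the patent or from any particular data set.

```python
# Hypothetical mapping from source data set categories to the shared categories
# to be detected; source categories absent from the mapping are filtered out.
CATEGORY_MAP = {
    "car": "vehicle",
    "truck": "vehicle",
    "bus": "vehicle",
    "person": "pedestrian",
    "rider": "pedestrian",
}


def align_source_labels(annotations, category_map):
    """Keep only mapped categories and rename them to the shared category names."""
    aligned = []
    for box, category in annotations:
        if category not in category_map:
            continue  # category filtering
        aligned.append((box, category_map[category]))  # category merging
    return aligned


# Each annotation is (bounding box as (x1, y1, x2, y2), category name).
source_annotations = [
    ((10, 20, 50, 80), "car"),
    ((5, 5, 30, 60), "person"),
    ((0, 0, 15, 15), "traffic sign"),
    ((40, 10, 90, 70), "truck"),
]
aligned = align_source_labels(source_annotations, CATEGORY_MAP)
```

After this step, every remaining source domain annotation carries one of the categories that also needs to be detected in the target domain.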
Step 1.2, building the reference cross-domain target detection model.
According to the actual application requirements, a target detection model based on deep learning is selected as the basic target detector, which ensures that the subsequently built model can complete the basic task of target detection; the loss function of the basic target detector is hereafter called the detection loss. Pixel-level, image-level and instance-level domain discriminators are set on the basic target detector through a gradient reversal layer (GRL) to build the reference cross-domain target detection model, which ensures that the subsequently built model is able to preliminarily extract consistent features; the loss function introduced by these domain discriminators is hereafter called the domain discrimination loss.
To explain the model building process below, Faster R-CNN, which is commonly adopted in domain-adaptive target detection research, is selected as the basic target detector, and a domain discriminator is set through the gradient reversal layer GRL on the feature map output by the feature extraction network E. As the skilled person will understand, the basic target detector selected when building the cross-domain target detection model for coordinating feature consistency and specificity only needs to have a network structure similar to that of Faster R-CNN, and is not limited to Faster R-CNN. In addition, although the reference cross-domain target detection model in fig. 1 has a domain discriminator at only one level, any model that is able to preliminarily extract domain-consistent features after pixel-level, image-level and instance-level domain discriminators are set on the basic target detector falls within the scope of the "reference cross-domain target detection model" of the present invention.
The step 2 specifically comprises the following steps:
Step 2.1, setting the feature specificity memory read-write module on the reference cross-domain target detection model, as shown in fig. 3.
Although the reference cross-domain target detection model built in step 1.2 has a certain ability to generate consistent features, the consistency learning process may not be effectively controlled and may act excessively. The generated features then have poor specificity, which negatively affects the subsequent detection-related tasks (such as classification and localization) and causes negative transfer between the source domain and target domain features.
To overcome this problem, memory units that support the basic read and write operations are used to store the feature information of different categories and to guide the reference cross-domain target detection model to learn feature specificity at the semantic level. In a given training iteration, the features of different categories act as query vectors to search the memory unit, and the memory elements that represent the feature information of the different categories are read out from the memory unit. These memory elements are then written back into the memory unit for the next retrieval and read-out, and they also provide useful category signals for the subsequent learning of feature consistency.
Step 2.1 specifically includes, as shown in fig. 3:
and 2.1.1, acquiring a source domain query vector and a target domain query vector.
The query vector can retrieve memory elements related to the class feature information from the memory unit, and is the basis for learning feature specificity.
Extracting n from the feature map output by the feature extraction network E according to the correct (Ground Truth) bounding box label of the source domain image s A source domain feature matrix, where s represents the source domain. Pooling the characteristic matrixes in an area of interest (RoI Pooling), flattening the characteristic matrixes with fixed dimensionality obtained after Pooling, and finally obtaining source domain query vectors through two full-connected (FC) layers
Figure BDA0003613511210000081
Figure BDA0003613511210000082
These source domain query vectors all have their respective correct category labels. Wherein the content of the first and second substances,
Figure BDA0003613511210000083
the dimension representing the source domain query vector is d.
Extracting n from the feature graph output by the feature extraction network E according to the pseudo border frame label obtained by predicting the target domain image by the reference cross-domain target detection model t And (3) a target domain feature matrix, wherein t represents the target domain. Pooling the interest regions of the feature matrixes, flattening the feature matrixes with fixed dimensions obtained after pooling, and finally obtaining query vectors of the target region through two full-connection layers
Figure BDA0003613511210000084
Each of these target domain query vectors has its own pseudo category label. Here,
Figure BDA0003613511210000091
denotes that the dimension of each target domain query vector is d.
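The query-vector pipeline of step 2.1.1 (RoI pooling, flattening, two fully connected layers) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it uses a single-channel feature map, integer end-exclusive boxes, max pooling, and a ReLU between the two layers, none of which the text specifies. The names `roi_max_pool`, `fully_connected`, and `query_vector` are hypothetical.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]
Layer = Tuple[Matrix, List[float]]  # (weights, bias) of one fully connected layer


def roi_max_pool(feature_map: Matrix, box: Tuple[int, int, int, int],
                 out_size: int = 2) -> Matrix:
    """Max-pool the region box = (x0, y0, x1, y1), end-exclusive, to out_size x out_size."""
    x0, y0, x1, y1 = box
    height, width = y1 - y0, x1 - x0
    pooled: Matrix = []
    for i in range(out_size):
        # Row range of bin i; every bin covers at least one row.
        r0 = y0 + (i * height) // out_size
        r1 = max(r0 + 1, y0 + ((i + 1) * height) // out_size)
        row: List[float] = []
        for j in range(out_size):
            c0 = x0 + (j * width) // out_size
            c1 = max(c0 + 1, x0 + ((j + 1) * width) // out_size)
            row.append(max(feature_map[r][c]
                           for r in range(r0, r1) for c in range(c0, c1)))
        pooled.append(row)
    return pooled


def fully_connected(x: Sequence[float], weights: Matrix,
                    bias: Sequence[float]) -> List[float]:
    """Compute weights @ x + bias for one fully connected layer."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def query_vector(feature_map: Matrix, box: Tuple[int, int, int, int],
                 fc1: Layer, fc2: Layer, out_size: int = 2) -> List[float]:
    """RoI pooling -> flatten -> FC -> ReLU (assumed) -> FC, giving a d-dimensional query."""
    pooled = roi_max_pool(feature_map, box, out_size)
    flat = [v for row in pooled for v in row]
    hidden = [max(0.0, h) for h in fully_connected(flat, *fc1)]
    return fully_connected(hidden, *fc2)
```

For the source domain the boxes would come from ground-truth labels, and for the target domain from the pseudo labels predicted by the reference model; the pooling and projection are the same in both cases.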
Step 2.1.2, retrieving the source domain memory elements and the target domain memory elements.
The memory elements in the memory unit store the feature information of the different categories and are updated as the reference cross-domain target detection model is trained, so that the model can learn feature specificity.
Before the training iterations start, the memory elements in the source domain and target domain memory units are each initialized with random numbers. The number of memory elements in each memory unit equals the total number of target categories to be detected, and the dimension of each memory element equals the dimension of the query vectors.
The source domain memory unit is denoted as $V_s=\{v_{s,1},v_{s,2},\dots,v_{s,(K-1)},v_{s,K}\}$, where
Figure BDA0003613511210000092
denotes the memory element of the k-th category of the source domain.
The target domain memory unit is denoted as $V_t=\{v_{t,1},v_{t,2},\dots,v_{t,(K-1)},v_{t,K}\}$, where
Figure BDA0003613511210000093
denotes the memory element of the k-th category of the target domain.
Here, k denotes the category label index of a target category to be detected, $k\in\{1,2,\dots,K\}$, and K denotes the total number of target categories to be detected.
Because each memory element in the source domain and target domain memory units represents the feature information of one target category to be detected, in a given training iteration the source domain query vectors
Figure BDA0003613511210000094
can each retrieve the memory element of the same category from the source domain memory unit by means of their ground-truth category labels, and the target domain query vectors
Figure BDA0003613511210000095
can each retrieve the memory element of the same category from the target domain memory unit by means of their pseudo category labels.
In case 1, if the k-th memory element in the source domain memory unit is retrieved by source domain query vectors whose category label index is k, the mean of the source domain query vectors belonging to category k, taken over those query vectors, is computed
Figure BDA0003613511210000096
This mean represents the category feature information of the current source domain query vectors, as shown in equation (1). Here,
Figure BDA0003613511210000097
denotes that the dimension of the mean of the source domain query vectors belonging to category k is d.
Figure BDA0003613511210000098
In the formula, $n_{s,k}$ denotes the total number of source domain query vectors whose category label index is k, with
Figure BDA0003613511210000099
So that the category feature information represented by the memory element read from the source domain memory unit reflects both the current source domain query vectors and the memory element previously stored in the memory unit, the mean of the source domain query vectors (the current category feature information) and the stored source domain memory element (the previous category feature information) are each weighted and then summed to obtain the read source domain memory element
Figure BDA0003613511210000101
as shown in equation (2).
Figure BDA0003613511210000102
In the formula, $\gamma_1$ denotes the weight coefficient applied to the mean of the source domain query vectors belonging to category k, and $\gamma_2$ denotes the weight coefficient applied to the memory element of the k-th category of the source domain, with $0<\gamma_1<1$, $0<\gamma_2<1$, and $\gamma_1+\gamma_2=1$.
Similarly, if the k-th memory element in the target domain memory unit is retrieved by target domain query vectors whose category label index is k, the mean of the target domain query vectors belonging to category k is computed
Figure BDA0003613511210000103
as shown in equation (3). Here,
Figure BDA0003613511210000104
denotes that the dimension of the mean of the target domain query vectors belonging to category k is d.
Figure BDA0003613511210000105
In the formula, $n_{t,k}$ denotes the total number of target domain query vectors whose category label index is k, with
Figure BDA0003613511210000106
The mean of the target domain query vectors belonging to category k and the memory element of the k-th category of the target domain are each weighted and then summed to obtain the read target domain memory element
Figure BDA0003613511210000107
as shown in equation (4).
Figure BDA0003613511210000108
In the formula, $\gamma_3$ denotes the weight coefficient applied to the mean of the target domain query vectors belonging to category k, and $\gamma_4$ denotes the weight coefficient applied to the memory element of the k-th category of the target domain, with $0<\gamma_3<1$, $0<\gamma_4<1$, and $\gamma_3+\gamma_4=1$.
Because the source domain and target domain query vectors in a given training iteration do not necessarily cover all target categories to be detected, that is, $n_{s,k}=0$ or $n_{t,k}=0$ may hold for some k, some memory elements in the source domain and target domain memory units may not be read out. The subsequent feature consistency learning, however, takes the source domain and target domain memory elements that represent the feature information of the same category as input. This situation is handled by case 2 below.
In case 2, if the k-th memory element in the source domain memory unit is not retrieved, the memory element of the k-th category of the source domain is assigned directly to the read source domain memory element, as shown in formula (5). In other words, the category feature information already stored in the memory unit is used as the category feature information of the memory element read from the source domain memory unit.
Figure BDA0003613511210000109
Similarly, if the k-th memory element in the target domain memory unit is not retrieved, the k-th class memory element of the target domain is directly assigned to the read target domain memory element, as shown in equation (6).
Figure BDA0003613511210000111
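The read operation described in case 1 and case 2 can be sketched as follows. This is an illustrative sketch, not the authors' code: the function name `read_memory`, the dictionary layout of the memory unit, and the default weight of 0.5 are assumptions, and the same function would be applied separately to the source domain (with $\gamma_1,\gamma_2$) and the target domain (with $\gamma_3,\gamma_4$).

```python
from typing import Dict, List, Sequence

Vector = List[float]


def read_memory(memory: Dict[int, Vector],
                queries: Sequence[Vector],
                labels: Sequence[int],
                gamma_query: float = 0.5) -> Dict[int, Vector]:
    """Return the read memory element for every category stored in `memory`."""
    gamma_memory = 1.0 - gamma_query  # the two weights must sum to 1

    # Group the query vectors of this iteration by their category label.
    grouped: Dict[int, List[Vector]] = {}
    for query, label in zip(queries, labels):
        grouped.setdefault(label, []).append(query)

    read: Dict[int, Vector] = {}
    for k, stored in memory.items():
        members = grouped.get(k, [])
        if members:
            # Case 1: mean over the queries of category k, then weighted sum
            # with the stored memory element.
            mean = [sum(column) / len(members) for column in zip(*members)]
            read[k] = [gamma_query * m + gamma_memory * v
                       for m, v in zip(mean, stored)]
        else:
            # Case 2: no query of category k in this iteration, so the stored
            # memory element is reused unchanged.
            read[k] = list(stored)
    return read
```

Because case 2 always returns a value, every category has a read memory element in every iteration, which is what the later category-level alignment requires as input.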
Step 2.1.3, updating the source domain memory unit and the target domain memory unit.
During training of the reference cross-domain target detection model, the read memory elements are used to update the source domain and target domain memory units. On the one hand, this lets each memory element keep absorbing new feature information of its category on top of the category feature information it already holds, which sustains the memory capacity of the memory unit. On the other hand, it prepares the memory unit for retrieval by the query vectors of the next training iteration, so that the category feature information represented by the memory elements read out next is more reasonable and reliable.
For the source domain, the source domain memory element read out by formula (2) or formula (5)
Figure BDA0003613511210000112
is written to the memory element $v_{s,k}$ at the corresponding category position of the source domain memory unit $V_s$, as shown in formula (7).
For the target domain, the target domain memory element read out by formula (4) or formula (6)
Figure BDA0003613511210000113
is written to the memory element $v_{t,k}$ at the corresponding category position of the target domain memory unit $V_t$, as shown in formula (8).
Figure BDA0003613511210000114
Figure BDA0003613511210000115
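Read together, the weighted read of formula (2) and the write of formula (7) amount to an exponential moving average over the per-iteration query means. The derivation below is a sketch under assumed notation: $\hat{v}_{s,k}$ for the read source domain memory element, $\bar{q}^{(i)}_{s,k}$ for the mean of the source domain query vectors of category k in iteration i, and $v^{(i)}_{s,k}$ for the stored memory element after iteration i. These symbols are not taken from the original formula images, and the closed form assumes that category k is retrieved in every iteration (case 1).

```latex
\begin{align*}
\hat{v}^{(i)}_{s,k} &= \gamma_1\,\bar{q}^{(i)}_{s,k} + \gamma_2\,v^{(i-1)}_{s,k}
  && \text{(read: weighted sum)}\\
v^{(i)}_{s,k} &\leftarrow \hat{v}^{(i)}_{s,k}
  && \text{(write)}\\
v^{(T)}_{s,k} &= \gamma_2^{\,T}\,v^{(0)}_{s,k}
  + \gamma_1 \sum_{i=1}^{T} \gamma_2^{\,T-i}\,\bar{q}^{(i)}_{s,k}
  && \text{(unrolled over } T \text{ iterations)}
\end{align*}
```

Since $0<\gamma_2<1$, the influence of the random initialization $v^{(0)}_{s,k}$ decays geometrically, and in an iteration where category k is not retrieved (case 2) the stored element is simply left unchanged. The same reasoning applies to the target domain with $\gamma_3$ and $\gamma_4$.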
In each training iteration, for both the source domain and the target domain, every memory element that represents the feature information of its category can be retrieved from the memory unit according to the query vectors, either directly or indirectly. As shown in fig. 4, suppose the source domain and the target domain contain only two target categories to be detected. As training of the reference cross-domain target detection model progresses, the query vectors find the memory elements of the same domain and the same category and keep updating their category feature information, so that the memory elements represent the features of their categories more and more accurately and the features of different categories become better and better separated at the semantic level. Features whose specificity increases step by step in this way reduce the risk of semantic confusion between categories.
Step 2.2, setting a feature consistency weighted alignment module on the reference cross-domain target detection model, as shown in fig. 5.
After the feature specificity memory read-write module has been set on the reference cross-domain target detection model in step 2.1, features with preliminary specificity at the semantic level can be obtained, but the cross-domain consistency of the features of different categories still has room for improvement.
In order to effectively coordinate specificity and consistency of features, it can be considered to use the source domain memory elements and the target domain memory elements read from the source domain memory units and the target domain memory units at each training iteration to guide the alignment of the same class features of the two domains.
In addition, the loss function of each category level domain discriminator is weighted according to the occurrence proportion of the target category to be detected, and the different categories of features are ensured to have the same attention in the alignment process, so that the cross-domain consistency of the different categories of features is enhanced on the basis of semantic specificity.
The step 2.2 specifically comprises the following steps:
Step 2.2.1, constructing a category-level domain discriminator.
The basic function of a domain discriminator is to distinguish whether features come from the source domain or the target domain, whereas the purpose of consistency learning is to make the same-category features of the source domain and the target domain indistinguishable. Therefore, a category-level domain discriminator with a gradient inversion layer is constructed for each target category to be detected, in order to align the features of the corresponding category.
Specifically, each category-level domain discriminator consists of a series of fully connected layers. Its input is the read source domain memory element and the read target domain memory element of the corresponding category, and its output is the predicted probability that the read memory element comes from the source domain. The gradient inversion layer enables adversarial training between the category-level domain discriminator and the basic target detector. Once the reference cross-domain target detection model has been trained to a certain degree, the basic target detector and the category-level domain discriminator reach a dynamic balance, at which point the features with semantic specificity also maintain cross-domain consistency. The loss function of each category-level domain discriminator
Figure BDA0003613511210000121
is shown in equation (9).
Figure BDA0003613511210000122
In the formula,
Figure BDA0003613511210000123
denotes the probability, predicted by the k-th category-level domain discriminator, that the read source domain memory element
Figure BDA0003613511210000124
belongs to the source domain, and
Figure BDA0003613511210000125
denotes the corresponding probability predicted for the read target domain memory element
Figure BDA0003613511210000126
Figure BDA0003613511210000127
Figure BDA0003613511210000128
Figure BDA0003613511210000129
denotes the expectation.
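Formula (9) survives only as an image placeholder here. For orientation, a domain discriminator that outputs the probability of "source domain" is conventionally trained with a binary cross-entropy loss of the following form. This is a hedged reconstruction from the surrounding definitions, not the confirmed formula (9); the symbols $D_k$, $\hat{v}_{s,k}$, $\hat{v}_{t,k}$, and $\mathcal{L}_k$ are assumed notation.

```latex
\mathcal{L}_{k}
  = -\,\mathbb{E}\!\left[\log D_k\!\left(\hat{v}_{s,k}\right)\right]
    -\,\mathbb{E}\!\left[\log\!\left(1 - D_k\!\left(\hat{v}_{t,k}\right)\right)\right]
```

Under this form the first term is the source domain part and the second term is the target domain part of the loss, which are the two parts that the later weighting step treats separately.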
Step 2.2.2, constructing source domain and target domain vector counters.
The main function of the vector counters is to accumulate the number of occurrences of the query vectors of each category within one training round of the reference cross-domain target detection model. These counts are used to obtain the occurrence proportion of each target category to be detected and to weight the loss function of each category-level domain discriminator. The capacity of each vector counter equals the total number of target categories to be detected.
Before each training round begins, all values of the vector counters are set to 0, that is, $N_{s,k}=0$ and $N_{t,k}=0$, where $N_{s,k}$ denotes the number of source domain query vectors belonging to category k in the source domain vector counter, and $N_{t,k}$ denotes the number of target domain query vectors belonging to category k in the target domain vector counter.
In each training iteration, the total number $n_{s,k}$ of source domain query vectors whose category label index is k and the total number $n_{t,k}$ of target domain query vectors whose category label index is k are used to update the values at the corresponding category positions of the source domain and target domain vector counters, as shown in formula (10) and formula (11).
$N_{s,k} \leftarrow N_{s,k} + n_{s,k}$ (10)
$N_{t,k} \leftarrow N_{t,k} + n_{t,k}$ (11)
When a training round ends, the values stored in the source domain and target domain vector counters are the total numbers of source domain and target domain query vectors of each category in that round. After the occurrence proportion of the source domain and target domain query vectors of each category in the round has been calculated, all values of the two vector counters are reset to 0 so that they can count the query vectors of each category in the next training round.
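The counter life cycle (reset at the start of a round, accumulate per iteration as in formulas (10) and (11), compute proportions at the end of the round) can be sketched as follows. The class name `VectorCounter` and its method names are hypothetical; one instance would be kept for the source domain and one for the target domain.

```python
from typing import Iterable, List


class VectorCounter:
    """Counts query vectors per category (labels 1..K) within one training round."""

    def __init__(self, num_categories: int) -> None:
        self.counts: List[int] = [0] * num_categories

    def update(self, labels: Iterable[int]) -> None:
        """Accumulate the labels of one iteration: N_k <- N_k + n_k."""
        for k in labels:
            self.counts[k - 1] += 1  # category labels are 1-based

    def proportions(self) -> List[float]:
        """Occurrence proportion of each category in the round so far."""
        total = sum(self.counts)
        if total == 0:
            return [0.0] * len(self.counts)
        return [count / total for count in self.counts]

    def reset(self) -> None:
        """Set all values back to 0 before the next training round."""
        self.counts = [0] * len(self.counts)
```

A typical round would call `update` once per iteration with that iteration's category labels, read `proportions()` after the last iteration, and then call `reset()`.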
Step 2.2.3, a weighted alignment loss function is calculated.
Since the numbers of instances of different target categories in the source domain data may be heavily imbalanced, and the number of instances of the same target category may differ greatly between the source domain and target domain data, a category with many instances effectively receives a large weight in the alignment process, and a category with few instances effectively receives a small weight.
In order to adjust the degree of feature alignment for target categories with different numbers of instances, the weight of the source domain part and the weight of the target domain part of the loss function of each category-level domain discriminator are calculated according to the occurrence proportion of the source domain and target domain query vectors of each category in a training round, as shown in formula (12) and formula (13).
Figure BDA0003613511210000131
Figure BDA0003613511210000132
In the formula, $\alpha_k$ denotes the weight applied to the source domain part of the loss function of the k-th category-level domain discriminator, and $\beta_k$ denotes the weight applied to the target domain part of the loss function of the k-th category-level domain discriminator, with $0<\alpha_k<1$ and $0<\beta_k<1$. If the source domain and target domain query vectors of a category make up a large proportion of a training round, the feature alignment strength for that category can be reduced appropriately during feature consistency learning; if they make up a small proportion, the feature alignment strength should be increased as much as possible.
It should be noted that formula (12) and formula (13) are only one way of obtaining the weights of the source domain part and the target domain part of the loss function of a category-level domain discriminator. Any method in which the obtained weights are inversely related to the occurrence proportion of each category's query vectors in a training round falls within the scope of protection of the present invention.
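Since the text allows any weighting that is inversely related to the occurrence proportion, one simple possibility is sketched below. It is explicitly not formula (12) or formula (13), whose exact form is not recoverable from this text: the rule "weight = 1 - proportion" and the function name `inverse_proportion_weights` are assumptions chosen for illustration.

```python
from typing import List, Sequence


def inverse_proportion_weights(counts: Sequence[int]) -> List[float]:
    """Map per-category counts of one round to weights that fall as counts rise.

    Hypothetical rule: weight_k = 1 - N_k / sum_j N_j. With at least two
    categories and every count positive, each weight lies strictly in (0, 1).
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("no query vectors were counted in this round")
    return [1.0 - count / total for count in counts]
```

Applied to the source domain counts this would give the $\alpha_k$ values, and applied to the target domain counts the $\beta_k$ values. A category that never occurs in a round would receive weight 1 under this rule, so a real implementation would need to decide how to treat that boundary case.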
Rewriting equation (9) with the weights obtained from equations (12) and (13), the weighted alignment loss function of each category-level domain discriminator
Figure BDA0003613511210000133
is shown in equation (14).
Figure BDA0003613511210000134
Since a category-level domain discriminator is constructed for each target category to be detected in order to learn feature consistency, the total weighted alignment loss function
Figure BDA0003613511210000141
is the sum of the weighted alignment loss functions of all category-level domain discriminators, as shown in equation (15).
Figure BDA0003613511210000142
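The weighting and summation of formulas (14) and (15) can be sketched as follows, assuming the conventional binary cross-entropy form for the two parts of each discriminator loss. The cross-entropy form, the single memory element per category and domain, and the function name `total_weighted_alignment_loss` are assumptions, since the original formulas are not available in this text.

```python
import math
from typing import Sequence


def total_weighted_alignment_loss(p_source: Sequence[float],
                                  p_target: Sequence[float],
                                  alpha: Sequence[float],
                                  beta: Sequence[float]) -> float:
    """Sum over categories of alpha_k * source part + beta_k * target part.

    p_source[k] and p_target[k] are the k-th discriminator's predicted
    probabilities of "source domain" for the read source and target elements.
    """
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for p_s, p_t, a, b in zip(p_source, p_target, alpha, beta):
        source_part = -math.log(p_s + eps)        # source element should score near 1
        target_part = -math.log(1.0 - p_t + eps)  # target element should score near 0
        total += a * source_part + b * target_part
    return total
```

With all weights equal, every category contributes equally regardless of how often it occurs; lowering $\alpha_k$ or $\beta_k$ for a frequent category reduces its share of the alignment signal.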
In each training iteration, the memory elements of the same category from the source domain and the target domain pass through a gradient inversion layer into the corresponding category-level domain discriminator, which is used to confuse the same-category features of the source domain and the target domain. As shown in fig. 6, suppose the source domain and the target domain contain only two target categories to be detected and every query vector is matched to its corresponding memory element. As training of the reference cross-domain target detection model progresses, the difference between the source domain memory elements and the target domain memory elements keeps decreasing.
Step 3 specifically includes:
Step 3.1, obtaining the loss function of the cross-domain target detection model that coordinates feature consistency and specificity, and training the model.
The loss function of the cross-domain target detection model that coordinates feature consistency and specificity
Figure BDA0003613511210000148
consists of the loss function of the reference cross-domain target detection model and the total weighted alignment loss function
Figure BDA0003613511210000143
The loss function of the reference cross-domain target detection model in turn consists of the basic target detector loss function
Figure BDA0003613511210000144
and the domain discriminator loss function
Figure BDA0003613511210000145
The loss function of the cross-domain target detection model that coordinates feature consistency and specificity
Figure BDA0003613511210000146
is shown in equation (16).
Figure BDA0003613511210000147
In the formula, $\lambda_1$ and $\lambda_2$ are balance coefficients, which usually have to be determined through model tuning; typical values are 0.01, 0.1, 1, and so on.
Formula (16) is taken as the optimization objective of the cross-domain target detection model that coordinates feature consistency and specificity, and the model is trained with a suitable optimization algorithm (such as SGD or Adam). To avoid instability of the model in the initial training phase, training is divided into two stages. The first stage trains only the reference cross-domain target detection model. The second stage trains the whole model, which consists of the reference cross-domain target detection model, the feature specificity memory read-write module, and the feature consistency weighted alignment module.
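The two-stage objective can be sketched as follows. The additive form and the pairing of $\lambda_1$ with the domain discriminator loss and $\lambda_2$ with the total weighted alignment loss are assumptions, since formula (16) is not reproduced in this text; the function name `total_loss` and the default coefficient values are likewise hypothetical.

```python
def total_loss(detector_loss: float,
               domain_loss: float,
               weighted_alignment_loss: float,
               lambda_1: float = 0.1,
               lambda_2: float = 0.1,
               stage: int = 2) -> float:
    """Combine the loss terms for the current training stage.

    Stage 1 trains only the reference cross-domain detection model, so the
    weighted alignment term is left out. Stage 2 trains the whole model.
    """
    if stage not in (1, 2):
        raise ValueError("stage must be 1 or 2")
    reference_loss = detector_loss + lambda_1 * domain_loss
    if stage == 1:
        return reference_loss
    return reference_loss + lambda_2 * weighted_alignment_loss
```

In a training loop, `stage` would switch from 1 to 2 after a warm-up period whose length is a tuning choice not specified in the text.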
Step 3.2, performing cross-domain target detection with the trained cross-domain target detection model that coordinates feature consistency and specificity.
The weight file saved during training that corresponds to the best model performance is loaded into the basic target detector (excluding the domain discriminators, the feature specificity memory read-write module, and the feature consistency weighted alignment module), and the model is used to detect targets on the unlabeled target domain. Because feature consistency and specificity are coordinated during training, the resulting model has higher cross-domain robustness.
As shown in fig. 7, the cross-domain target detection system for coordinating feature consistency and specificity provided in the embodiment of the present invention includes a reference cross-domain target detection model, a feature specificity memory read-write module, and a feature consistency weighting alignment module, where:
The reference cross-domain target detection model consists of a basic target detector, domain discriminators at different levels (such as pixel level, image level, and instance level), and a gradient inversion layer. The gradient inversion layer enables adversarial training between the basic target detector and the domain discriminators at each level, so that the reference cross-domain target detection model can preliminarily extract consistent features. In addition, the reference cross-domain target detection model is the basis on which the feature specificity memory read-write module and the feature consistency weighted alignment module are subsequently set.
The core of the feature specificity memory read-write module is the source domain memory unit and the target domain memory unit, which support basic read and write operations and store the feature information of the different categories. In each training iteration, the module reads the source domain and target domain memory elements out of the memory units and then updates the source domain and target domain memory units. The module guides the reference cross-domain target detection model to learn the semantic specificity of features and provides the input for the subsequent feature consistency weighted alignment module.
The feature consistency weighted alignment module consists of a source domain vector counter, a target domain vector counter, several category-level domain discriminators, and a gradient inversion layer. Each category-level domain discriminator takes the source domain memory element and the target domain memory element of the corresponding category as input and is used to confuse the same-category memory elements of the two domains. In addition, the loss function of each category-level domain discriminator is weighted according to the occurrence proportion of the target categories to be detected, which further guides the features to learn cross-domain consistency on the basis of semantic specificity.
In one embodiment, the feature-specific memory read/write module specifically includes:
a query vector acquisition unit for acquiring a source domain query vector and a target domain query vector;
a memory element retrieval unit for retrieving a source domain memory element and a target domain memory element;
a memory element updating unit for updating the source domain memory unit and the target domain memory unit:
the read source domain memory element
Figure BDA0003613511210000151
is written to the memory element $v_{s,k}$ at the corresponding category position of the source domain memory unit $V_s$, where
Figure BDA0003613511210000152
denotes the memory element of the k-th category of the source domain;
the read target domain memory element
Figure BDA0003613511210000153
is written to the memory element $v_{t,k}$ at the corresponding category position of the target domain memory unit $V_t$, where
Figure BDA0003613511210000154
denotes the memory element of the k-th category of the target domain.
The feature specificity memory read-write module provided by the embodiment of the invention continuously updates the category feature information held by the memory elements through the read and write operations of the memory units, so that the memory elements represent the features of their categories more and more accurately. The specificity of the features of different categories is thereby preserved at the semantic level, the potential loss of specificity caused by over-learning feature consistency is avoided, the risk of misaligning the features of different categories is reduced, and the performance of the reference cross-domain target detection model is improved.
In one embodiment, the feature consistency weighted alignment module specifically includes:
a discriminator construction unit for constructing a category-level domain discriminator;
a vector counter construction unit for constructing a source domain vector counter and a target domain vector counter, which accumulate the number of occurrences of the query vectors of each category within one training round of the reference cross-domain target detection model;
and a discrimination loss weighting unit for calculating, according to the occurrence proportion of the source domain and target domain query vectors of each category in one training round, the weight of the source domain part and the weight of the target domain part of the loss function of each category-level domain discriminator, so as to weight that loss function. In this embodiment, during feature consistency learning, the source domain part and the target domain part of the loss function of each category-level domain discriminator are balanced by a weighting strategy according to the number of occurrences of each target category to be detected.
The feature consistency weighted alignment module provided by the embodiment of the invention aligns the source domain memory elements and the target domain memory elements through the category-level domain discriminator, indirectly guides the alignment of the same category features of the source domain and the target domain, can also ensure that different category features have the same attention in the alignment process, can enhance the cross-domain consistency of different category features on the basis of semantic specificity, overcomes the problem of insufficient alignment of specific features, and reduces the deviation of a reference cross-domain target detection model.
The invention can reasonably coordinate the specificity and consistency of the features in the training process of the reference cross-domain target detection model, and is different from the prior technical scheme that the learning of the feature consistency is only concerned unilaterally and the learning of the feature specificity is ignored.
Finally, it should be pointed out that: the above examples are only for illustrating the technical solutions of the present invention, and are not limited thereto. Those of ordinary skill in the art will understand that: modifications can be made to the technical solutions described in the foregoing embodiments, or some technical features may be equivalently replaced; such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A cross-domain target detection method for coordinating feature consistency and specificity is characterized by comprising the following steps:
step 1, constructing a source domain data set and a target domain data set according to actual application requirements, and constructing a reference cross-domain target detection model capable of preliminarily extracting consistency characteristics;
step 2, a characteristic specificity memory read-write module and a characteristic consistency weighting alignment module are arranged on a reference cross-domain target detection model, memory elements which can represent different types of characteristic information in a memory unit are continuously updated through the characteristic specificity memory read-write module, the reference cross-domain target detection model is guided to learn the characteristic specificity, the source domain memory elements and the target domain memory elements are used by the characteristic consistency weighting alignment module to guide the memory elements of the same type of the source domain and the target domain to be mixed, a loss function of each type level domain discriminator is weighted according to the occurrence proportion of the type of a target to be detected, the characteristic is further guided to learn the cross-domain consistency on the basis of the semantic specificity, and the cross-domain target detection model which can coordinate the specificity and the consistency of the characteristic in the cross-domain target detection is obtained;
step 3, training the model with the loss function of the cross-domain target detection model that coordinates feature consistency and specificity as the optimization objective, and applying the trained model to the target domain.
2. The method for cross-domain target detection with feature consistency and specificity harmonized according to claim 1, wherein the step 2 of setting a feature-specific memory read-write module on a reference cross-domain target detection model specifically comprises:
step 2.1.1, acquiring a source domain query vector and a target domain query vector;
step 2.1.2, retrieving memory elements of a source domain and memory elements of a target domain;
step 2.1.3, updating the source domain memory unit and the target domain memory unit:
the read source domain memory element
Figure FDA0003613511200000011
is written to the memory element $v_{s,k}$ at the corresponding category position of the source domain memory unit $V_s$, where
Figure FDA0003613511200000012
denotes the memory element of the k-th category of the source domain;
the read target domain memory element
Figure FDA0003613511200000013
is written to the memory element $v_{t,k}$ at the corresponding category position of the target domain memory unit $V_t$, where
Figure FDA0003613511200000014
denotes the memory element of the k-th category of the target domain.
3. The cross-domain target detection method for coordinating feature consistency and specificity according to claim 2, wherein the read source domain memory element
Figure FDA0003613511200000015
is obtained in the following two cases:
case 1:
a1. if the k-th memory element in the source domain memory unit is retrieved by the source domain query vector with the class label index of k, the read-out source domain memory element
Figure FDA0003613511200000016
is given by formula (2):
Figure FDA0003613511200000021
in the formula,
Figure FDA0003613511200000022
denotes the mean of the source domain query vectors belonging to category k,
Figure FDA0003613511200000023
$\gamma_1$ denotes the weight coefficient applied to the mean of the source domain query vectors belonging to category k, and $\gamma_2$ denotes the weight coefficient applied to the memory element of the k-th category of the source domain, with $0<\gamma_1<1$, $0<\gamma_2<1$, and $\gamma_1+\gamma_2=1$;
a2. If the k-th memory element in the target domain memory unit is retrieved by the target domain query vector with the category label index of k, the read target domain memory element
Figure FDA0003613511200000024
is given by formula (4):
Figure FDA0003613511200000025
in the formula,
Figure FDA0003613511200000026
denotes the mean of the target domain query vectors belonging to category k,
Figure FDA0003613511200000027
$\gamma_3$ denotes the weight coefficient applied to the mean of the target domain query vectors belonging to category k, and $\gamma_4$ denotes the weight coefficient applied to the memory element of the k-th category of the target domain, with $0<\gamma_3<1$, $0<\gamma_4<1$, and $\gamma_3+\gamma_4=1$;
Case 2:
b1. if the k-th memory element in the source domain memory unit is not retrieved, the memory element of the k-th category of the source domain is directly assigned to the read-out source domain memory element;
b2. if the k-th memory element in the target domain memory unit is not retrieved, the memory element of the k-th category of the target domain is directly assigned to the read-out target domain memory element.
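As an illustration of claim 3, the following plain-Python sketch shows one possible reading of the read-out and write-back steps for a single domain. Formulas (2) and (4) survive here only as image placeholders, so the convex combination of the per-category query mean and the stored memory element is inferred from the description of the weight coefficients and is an assumption. Treating a category as "retrieved" when at least one query vector in the iteration carries its label is also an assumption, and the names `mean_vector`, `read_and_write_memory`, the dictionary layout and the default weight value are hypothetical.

```python
from collections import defaultdict


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def read_and_write_memory(memory, queries, labels, gamma_query=0.1):
    """Read out one memory element per category, then write it back.

    memory      : dict mapping category index k to its memory element (list of floats)
    queries     : query vectors produced in the current training iteration
    labels      : category label index of each query vector
    gamma_query : weight on the query mean (gamma_1 or gamma_3); the weight on
                  the stored element is 1 - gamma_query, so the two sum to 1
    """
    if not 0.0 < gamma_query < 1.0:
        raise ValueError("gamma_query must lie strictly between 0 and 1")
    gamma_memory = 1.0 - gamma_query

    # Group the query vectors of this iteration by category label.
    queries_by_category = defaultdict(list)
    for query, k in zip(queries, labels):
        queries_by_category[k].append(query)

    read_out = {}
    for k, stored in memory.items():
        if k in queries_by_category:
            # Case 1 (assumed form): blend the query mean with the stored element.
            q_mean = mean_vector(queries_by_category[k])
            read_out[k] = [gamma_query * q + gamma_memory * v
                           for q, v in zip(q_mean, stored)]
        else:
            # Case 2: category not retrieved, so the stored element is read out as is.
            read_out[k] = list(stored)

    # Step 2.1.3: write each read-out element back to its category position.
    for k, element in read_out.items():
        memory[k] = element
    return read_out


source_memory = {0: [0.0, 0.0], 1: [1.0, 1.0]}
result = read_and_write_memory(
    source_memory, queries=[[2.0, 4.0], [4.0, 8.0]], labels=[0, 0], gamma_query=0.5
)
```

In this example category 0 is retrieved by two query vectors and is blended with their mean, while category 1 is not retrieved and keeps its stored value. The same function would be called once for the source domain memory unit and once for the target domain memory unit.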
4. The cross-domain target detection method coordinating feature consistency and specificity according to any one of claims 1 to 3, wherein setting the feature consistency weighted alignment module on the reference cross-domain target detection model in step 2 specifically comprises:
step 2.2.1, constructing category-level domain discriminators;
step 2.2.2, constructing source domain and target domain vector counters for accumulating the number of occurrences of the query vectors of each category during one round of training of the reference cross-domain target detection model, so as to obtain the occurrence proportion of each target category to be detected and weight the loss function of each category-level domain discriminator;
step 2.2.3, calculating the weight of the source domain part and the weight of the target domain part of the loss function applied to each category-level domain discriminator according to the occurrence proportions of the source domain and target domain query vectors of each category in one training round.
5. The cross-domain target detection method coordinating feature consistency and specificity according to claim 4, wherein the weight of the source domain part and the weight of the target domain part of the loss function in step 2.2.3 are calculated by formula (12) and formula (13), respectively:
Figure FDA0003613511200000031
Figure FDA0003613511200000032
where $\alpha_k$ denotes the weight of the source domain part of the loss function applied to the k-th category-level domain discriminator, $\beta_k$ denotes the weight of the target domain part of the loss function applied to the k-th category-level domain discriminator, $0 < \alpha_k < 1$, and $0 < \beta_k < 1$.
6. The cross-domain target detection method coordinating feature consistency and specificity according to claim 5, wherein the weighted alignment loss function [Figure FDA0003613511200000033] of each category-level domain discriminator in step 2.2.1 is given by formula (14):
Figure FDA0003613511200000034
where [Figure FDA0003613511200000035] denotes the k-th category-level domain discriminator, which predicts the probability [Figure FDA0003613511200000037] that the read-out source domain memory element [Figure FDA0003613511200000036] belongs to the source domain and the probability [Figure FDA0003613511200000039] that the read-out target domain memory element [Figure FDA0003613511200000038] belongs to the target domain, and [Figure FDA00036135112000000310] denotes the expectation.
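The weighting described in claims 5 and 6 can be pictured with the short plain-Python sketch below. Formula (14) is available here only as an image placeholder, so the weighted negative log-likelihood form used in the sketch is an assumption modelled on a standard domain discriminator loss, not a transcription of the claimed formula. The function name `weighted_alignment_loss`, its argument names and the clamping constant `eps` are hypothetical.

```python
import math


def weighted_alignment_loss(p_source, p_target, alpha_k, beta_k, eps=1e-12):
    """Weighted loss for one category-level domain discriminator (assumed form).

    p_source : predicted probabilities that read-out source domain memory
               elements of category k belong to the source domain
    p_target : predicted probabilities that read-out target domain memory
               elements of category k belong to the target domain
    alpha_k  : weight of the source domain part, 0 < alpha_k < 1
    beta_k   : weight of the target domain part, 0 < beta_k < 1
    eps      : lower clamp that keeps log() finite (implementation detail)
    """
    # Sample means stand in for the expectation over memory elements.
    source_term = sum(math.log(max(p, eps)) for p in p_source) / len(p_source)
    target_term = sum(math.log(max(p, eps)) for p in p_target) / len(p_target)
    return -(alpha_k * source_term + beta_k * target_term)


perfect = weighted_alignment_loss([1.0], [1.0], alpha_k=0.4, beta_k=0.6)
confused = weighted_alignment_loss([0.5], [0.5], alpha_k=0.4, beta_k=0.6)
```

Under this assumed form, a discriminator that classifies both domains perfectly has zero loss, and a larger weight makes the corresponding domain part contribute more to the alignment of that category.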
7. The cross-domain target detection method coordinating feature consistency and specificity according to claim 4, wherein step 2.2.2 specifically comprises:
before each training round begins, setting to 0 both the number $N_{s,k}$ of source domain query vectors belonging to category k in the source domain vector counter and the number $N_{t,k}$ of target domain query vectors belonging to category k in the target domain vector counter;
in each training iteration, using the total number $n_{s,k}$ of source domain query vectors whose category label index is k and the total number $n_{t,k}$ of target domain query vectors whose category label index is k to update the values at the corresponding category positions of the source domain and target domain vector counters according to formula (10) and formula (11), respectively:
$N_{s,k} \leftarrow N_{s,k} + n_{s,k}$ (10)
$N_{t,k} \leftarrow N_{t,k} + n_{t,k}$ (11)
when a training round ends, the values stored in the source domain and target domain vector counters are the total numbers of source domain and target domain query vectors of each category in that training round, respectively.
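The counter behaviour in claim 7 maps directly onto a few lines of Python. The sketch below follows formulas (10) and (11): the counts are reset before a round and incremented per iteration by the number of query vectors of each category. The class name `DomainVectorCounter` and its method names are hypothetical, and one instance would be kept per domain.

```python
from collections import Counter


class DomainVectorCounter:
    """Accumulates per-category query vector counts over one training round."""

    def __init__(self, num_categories):
        self.num_categories = num_categories
        self.counts = [0] * num_categories  # N_k for k = 0 .. num_categories - 1

    def reset(self):
        """Set every N_k to 0 before a training round begins."""
        self.counts = [0] * self.num_categories

    def update(self, labels):
        """Apply N_k <- N_k + n_k, where n_k counts label k in this iteration."""
        for k, n_k in Counter(labels).items():
            self.counts[k] += n_k


source_counter = DomainVectorCounter(num_categories=3)
source_counter.reset()
source_counter.update([0, 0, 2])  # iteration 1: two vectors of category 0, one of category 2
source_counter.update([2, 1])     # iteration 2: one vector each of categories 2 and 1
round_totals = list(source_counter.counts)
```

At the end of the round, `round_totals` holds the per-category totals that step 2.2.3 uses to derive the loss weights.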
8. A cross-domain target detection system coordinating feature consistency and specificity, characterized by comprising a reference cross-domain target detection model, a feature specificity memory read-write module and a feature consistency weighted alignment module, wherein:
the reference cross-domain target detection model is provided with a basic target detector, domain discriminators at different levels and a gradient inversion layer, wherein the gradient inversion layer is used to realize adversarial training between the basic target detector and the domain discriminators at each level, so that the reference cross-domain target detection model has the ability to preliminarily extract consistent features;
the feature specificity memory read-write module is provided with a source domain memory unit and a target domain memory unit, which support the basic operations of reading and writing and are used for storing feature information of different categories; in each training iteration, the feature specificity memory read-write module reads out the source domain memory elements and the target domain memory elements from the memory units, updates the source domain memory unit and the target domain memory unit, guides the reference cross-domain target detection model to learn the semantic specificity of the features, and provides input for the subsequent feature consistency weighted alignment module;
the feature consistency weighted alignment module is provided with source domain and target domain vector counters, a plurality of category-level domain discriminators and a gradient inversion layer, wherein each category-level domain discriminator takes the source domain memory element and the target domain memory element of the corresponding category as input and confuses the memory elements of the same category from the source domain and the target domain; the module also weights the loss function of each category-level domain discriminator according to the occurrence proportion of the target categories to be detected, further guiding the features to learn cross-domain consistency on the basis of semantic specificity.
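The gradient inversion layer named in claim 8 (more commonly called a gradient reversal layer) can be illustrated without any deep learning framework. The minimal sketch below assumes the usual behaviour of such a layer: features pass through unchanged in the forward direction, and the gradient coming back from a domain discriminator has its sign flipped before it reaches the detector. The class name `GradientInversionLayer` and the `scale` factor are hypothetical and are not specified in the claims.

```python
class GradientInversionLayer:
    """Identity in the forward pass; flips the gradient sign in the backward pass."""

    def __init__(self, scale=1.0):
        # `scale` is a hypothetical strength factor for the reversed gradient.
        self.scale = scale

    def forward(self, features):
        # Features reach the domain discriminator unchanged.
        return list(features)

    def backward(self, grad_from_discriminator):
        # The detector receives the negated gradient, so minimizing the
        # discriminator loss pushes the detector toward domain-confusing features.
        return [-self.scale * g for g in grad_from_discriminator]


layer = GradientInversionLayer(scale=1.0)
out = layer.forward([0.2, -1.5, 3.0])
grad_to_detector = layer.backward([0.1, -0.4, 0.0])
```

In a real training framework the same effect is normally obtained with a custom automatic-differentiation operation rather than explicit `forward` and `backward` calls.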
9. The system of claim 1, wherein the feature specificity memory read-write module comprises:
a query vector acquisition unit for acquiring a source domain query vector and a target domain query vector;
a memory element retrieval unit for retrieving a source domain memory element and a target domain memory element;
a memory element updating unit for updating the source domain memory unit and the target domain memory unit:
writing the read-out source domain memory element [Figure FDA0003613511200000041] into the memory element $v_{s,k}$ at the corresponding category position of the source domain memory unit $V_s$, where $v_{s,k}$ [Figure FDA0003613511200000042] denotes the memory element of the k-th category of the source domain;
writing the read-out target domain memory element [Figure FDA0003613511200000043] into the memory element $v_{t,k}$ at the corresponding category position of the target domain memory unit $V_t$, where $v_{t,k}$ [Figure FDA0003613511200000044] denotes the memory element of the k-th category of the target domain.
10. The system for cross-domain object detection that coordinates feature consistency and specificity of claim 8 or 9, wherein the feature consistency weighted alignment module specifically comprises:
a discriminator construction unit for constructing a category-level domain discriminator;
a vector counter construction unit for constructing a source domain vector counter and a target domain vector counter, which accumulate the number of occurrences of the query vectors of each category during one round of training of the reference cross-domain target detection model;
a discrimination loss weighting unit for calculating the weight of the source domain part and the weight of the target domain part of the loss function applied to each category-level domain discriminator according to the occurrence proportions of the source domain and target domain query vectors of each category in one training round, so as to weight the loss function of each category-level domain discriminator.
CN202210440038.5A 2022-04-25 2022-04-25 Cross-domain target detection method and system for coordinating feature consistency and specificity Active CN114912516B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210440038.5A CN114912516B (en) 2022-04-25 2022-04-25 Cross-domain target detection method and system for coordinating feature consistency and specificity

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210440038.5A CN114912516B (en) 2022-04-25 2022-04-25 Cross-domain target detection method and system for coordinating feature consistency and specificity

Publications (2)

Publication Number Publication Date
CN114912516A true CN114912516A (en) 2022-08-16
CN114912516B CN114912516B (en) 2023-06-06

Family

ID=82765596

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210440038.5A Active CN114912516B (en) 2022-04-25 2022-04-25 Cross-domain target detection method and system for coordinating feature consistency and specificity

Country Status (1)

Country Link
CN (1) CN114912516B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109858505A (en) * 2017-11-30 2019-06-07 Xiamen University Classifying identification method, device and equipment
CN111542816A (en) * 2018-02-06 2020-08-14 HRL Laboratories, LLC Domain adaptive learning system
CN109753992A (en) * 2018-12-10 2019-05-14 Nanjing Normal University Unsupervised domain adaptation image classification method based on conditional generative adversarial network
WO2021258967A1 (en) * 2020-06-24 2021-12-30 Huawei Technologies Co., Ltd. Neural network training method and device, and data acquisition method and device
CN112861616A (en) * 2020-12-31 2021-05-28 University of Electronic Science and Technology of China Passive field self-adaptive target detection method
CN113807420A (en) * 2021-09-06 2021-12-17 Hunan University Domain self-adaptive target detection method and system considering category semantic matching
CN114386527A (en) * 2022-01-18 2022-04-22 Wuxi Institute of Intelligent Control, Hunan University Category regularization method and system for domain adaptive target detection
CN114332568A (en) * 2022-03-16 2022-04-12 University of Science and Technology of China Training method, system, equipment and storage medium of domain adaptive image classification network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
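XIAOHUA HUANG et al.: "Domain adaptive object detection with generative adversarial network", 2020 International Conference on Internet of Things and Intelligent Applications (ITIA) *
ZHANG Chao: "Research on image retrieval methods based on regions of interest", China Master's Theses Full-text Database, Information Science and Technology *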

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116778277A (en) * 2023-07-20 2023-09-19 Wuxi Institute of Intelligent Control, Hunan University Cross-domain model training method based on progressive information decoupling
CN116778277B (en) * 2023-07-20 2024-03-01 Wuxi Institute of Intelligent Control, Hunan University Cross-domain model training method based on progressive information decoupling

Also Published As

Publication number Publication date
CN114912516B (en) 2023-06-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant