CN117152563A - Training method and device for hybrid target domain adaptive model and computer equipment - Google Patents

Training method and device for hybrid target domain adaptive model and computer equipment

Info

Publication number
CN117152563A
CN117152563A
Authority
CN
China
Prior art keywords
domain
target domain
features
samples
source
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311337554.6A
Other languages
Chinese (zh)
Other versions
CN117152563B (en)
Inventor
Lu Yuwu (陆玉武)
Hu Xue (胡雪)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China Normal University
Original Assignee
South China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China Normal University filed Critical South China Normal University
Priority to CN202311337554.6A priority Critical patent/CN117152563B/en
Publication of CN117152563A publication Critical patent/CN117152563A/en
Application granted granted Critical
Publication of CN117152563B publication Critical patent/CN117152563B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/09Supervised learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/094Adversarial learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the application relate to the technical field of machine learning and provide a training method, a training apparatus, and computer equipment for a hybrid target domain adaptive model. The method comprises the following steps: acquiring a source domain and a target domain, wherein the source domain comprises a plurality of marked samples and the target domain is obtained by mixing a plurality of sub-target domains, each comprising a plurality of unmarked samples; extracting sample features with a feature extractor, wherein the sample features comprise source domain features of the marked samples, target domain features of the unmarked samples, and fusion features of the source domain and target domain samples; calculating the model loss based on the source domain features, the target domain features, and the fusion features; if the loss function of the hybrid target domain adaptive model is determined to have converged based on the model loss, determining that training of the model is complete; if the loss function has not converged, updating the parameters of the feature extractor, the classifier, and the domain discriminator, and continuing to train the model until the loss function converges. This method improves the performance of the model.

Description

Training method and device for hybrid target domain adaptive model and computer equipment
Technical Field
The application belongs to the technical field of machine learning, and particularly relates to a training method and apparatus for a hybrid target domain adaptive model and to computer equipment.
Background
In deep learning, models are mostly trained on abundant labeled data. In practice, however, manually annotating large amounts of data is expensive and time-consuming, which greatly limits the feasibility of deep learning. Furthermore, conventional deep learning generalizes poorly to a new data set due to the domain bias between data sets. That is, a model trained on one labeled dataset often cannot be used directly on another, unlabeled dataset.
To address these issues, a domain adaptation (Domain Adaptation) method may be used to perform transfer learning, so that a model trained to high accuracy on source domain data can be applied to a target domain with little labeled data, reducing the labor cost of labeling data.
However, existing domain adaptation methods mainly align the entire image across the source and target domains. Not all parts of an image are transferable, and forced alignment of non-transferable areas, such as the background, may lead to negative transfer. Negative transfer can hinder training of the model on the target domain and thereby reduce its performance; for example, the trained model may fail to classify accurately when performing classification tasks.
Disclosure of Invention
In view of this, the embodiment of the application provides a training method, a training device and a computer device for a hybrid target domain adaptive model, which are used for improving the classification accuracy of the model.
A first aspect of an embodiment of the present application provides a training method for a hybrid target domain adaptive model, where the model includes a feature extractor, a classifier, and a domain discriminator, and the method includes:
acquiring a source domain and a target domain, wherein the source domain comprises a plurality of marked samples with real tag information, the target domain comprises a plurality of sub-target domains, each sub-target domain comprises a plurality of unmarked samples, and the target domain is obtained by mixing the unmarked samples of the plurality of sub-target domains;
extracting sample features respectively with the feature extractor, wherein the sample features comprise source domain features of the marked samples, target domain features of the unmarked samples, and fusion features of the samples in the source domain and the samples in the target domain;
calculating a model loss based on the source domain features, the target domain features, and the fusion features, wherein the model loss comprises a source supervised classification loss of the hybrid target domain adaptive model, a domain adversarial loss corresponding to the domain discriminator, and a prediction distribution difference loss corresponding to the classifier and the feature extractor;
if it is determined, based on the model loss, that the loss function of the hybrid target domain adaptive model has converged, determining that training of the hybrid target domain adaptive model is complete;
if the loss function has not converged, updating the parameters of the feature extractor, the classifier, and the domain discriminator, and continuing to train the hybrid target domain adaptive model until the loss function converges.
A second aspect of an embodiment of the present application provides a training apparatus for a hybrid target domain adaptive model, the model including a feature extractor, a classifier, and a domain discriminator, the apparatus comprising:
a data acquisition module, configured to acquire a source domain and a target domain, wherein the source domain comprises a plurality of marked samples with real tag information, the target domain comprises a plurality of sub-target domains, each sub-target domain comprises a plurality of unmarked samples, and the target domain is obtained by mixing the unmarked samples of the plurality of sub-target domains;
a feature extraction module for extracting sample features respectively by a feature extractor, wherein the sample features comprise source domain features of the marked samples, target domain features of the unmarked samples and fusion features of the samples in the source domain and the target domain;
a loss calculation module, configured to calculate the model loss based on the source domain features, the target domain features, and the fusion features, where the model loss includes a source supervised classification loss of the hybrid target domain adaptive model, a domain adversarial loss corresponding to the domain discriminator, and a prediction distribution difference loss corresponding to the classifier and the feature extractor;
a first judging module, configured to determine that training of the hybrid target domain adaptive model is complete if it is determined, based on the model loss, that the loss function of the model has converged;
and a second judging module, configured to update the parameters of the feature extractor, the classifier, and the domain discriminator if the loss function has not converged, and to continue training the hybrid target domain adaptive model until the loss function converges.
A third aspect of an embodiment of the present application provides a computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the method according to the first aspect as described above when executing the computer program.
A fourth aspect of embodiments of the present application provides a computer readable storage medium storing a computer program which, when executed by a processor, implements a method as described in the first aspect above.
A fifth aspect of an embodiment of the application provides a computer program product for, when run on a computer device, causing the computer device to perform the method of the first aspect described above.
Compared with the prior art, the embodiment of the application has the following advantages:
The hybrid target domain adaptive model in the embodiments of the application can comprise a feature extractor, a classifier, and a domain discriminator. When the model is trained, training data can be obtained comprising a source domain and a target domain, wherein the source domain can comprise a plurality of marked samples with real tag information, and the target domain can comprise a plurality of unmarked samples coming from a plurality of sub-target domains mixed together. Sample features can be extracted with the feature extractor, and the extracted sample features can include source domain features of the marked samples, target domain features of the unmarked samples, and fusion features of the marked and unmarked samples. Based on the extracted sample features, the model loss can be calculated, so that whether training of the model is complete can be determined from the model loss. When it is determined from the model loss that the loss function has converged, model training can be deemed complete. When it is determined that the loss function has not converged, the parameters of the feature extractor, the classifier, and the domain discriminator can be adjusted, and training of the hybrid target domain adaptive model continues until the loss function converges. In the embodiments of the application, feature fusion of the marked and unmarked samples during training makes the sample features contain the semantic information of the source domain and the style of the target domain; training with these fusion features improves the recognition capability of the classifier and the pseudo-label precision on the target domain, so the trained model achieves higher classification accuracy.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the following briefly introduces the drawings needed for describing the embodiments or the prior art.
FIG. 1 is a schematic diagram illustrating a comparison of multi-target domain adaptation and hybrid target domain adaptation according to an embodiment of the present application;
FIG. 2 is a schematic view of a feature visualization of hybrid target domain adaptation in a mixed feature space provided by an embodiment of the present application;
FIG. 3 is a schematic step flow diagram of a training method of a hybrid target domain adaptive model according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a hybrid target domain adaptive model according to an embodiment of the present application;
FIG. 5 is a graph comparing the effects of different methods provided by embodiments of the present application on discriminative semantic emphasis;
FIG. 6 is a graph showing the comparison of effects of feature clustering based on different models provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of a training device for a hybrid target domain adaptive model according to an embodiment of the present application;
fig. 8 is a schematic diagram of a computer device according to an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth such as the particular system architecture, techniques, etc., in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
In recent years, deep learning has made remarkable progress in many applications such as image classification, object recognition, and natural language processing. Most of these algorithms are trained on abundant labeled data. In practice, however, manually annotating large amounts of data is expensive and time-consuming, which greatly limits the feasibility of deep learning. Furthermore, conventional deep learning generalizes poorly to a new data set due to domain bias. To solve these problems, unsupervised domain adaptation (UDA: Unsupervised Domain Adaptation) has been widely studied. UDA migrates knowledge from label-rich source domains to unlabeled target domains to solve the domain shift problem.
Most UDA studies focus on adaptation from one or more source domains to a single target domain (STDA: Single-Target Domain Adaptation). However, unlabeled data may come from a number of different domains, making multi-target domain adaptation (MTDA: Multi-Target Domain Adaptation) a research hotspot. Most such work, however, requires the construction of complex network models. In fact, target samples from multiple distributions will typically be mixed with each other, so the blended target domain (BTDA) setting is a more realistic transfer scenario.
Domain adaptation learning (Domain Adaptation Learning) may utilize source domain knowledge associated with a target domain to assist learning on the target domain. Blended-target domain adaptation (BTDA) is one type of domain adaptation learning, motivated by the fact that in the real world the target domains may differ and the distribution of each target class may differ. Current research on BTDA is mostly based on direct adaptation from the source domain to the target domain, which leaves a large domain gap. In addition, it is inevitably affected by irrelevant semantic information, resulting in negative transfer.
BTDA is more challenging than MTDA for the following reasons: the mixed feature space suffers from severe domain shift and category shift, making it difficult for samples to form good cluster structures, and the clusters of one class may overlap with the clusters of another class. Fig. 1 is a schematic comparison of multi-target domain adaptation and hybrid target domain adaptation according to an embodiment of the present application: as shown in fig. 1, the target domains in MTDA are separated from each other, whereas the target domains in BTDA are mixed. Fig. 2 is a schematic feature visualization of hybrid target domain adaptation in the mixed feature space, where the numbers represent categories and samples of the same category may lie in different domains; as shown in fig. 2, a class-1 cluster in one domain may overlap a class-6 cluster in another domain. Hybrid target domain adaptation is therefore more complex than multi-target domain adaptation.
Based on this, the present application proposes a Semantic Dual-adversarial Network (SDN: Semantic Dual-adversarial Network for Blended-target Domain Adaptation) method that can effectively reduce domain differences without using domain labels. Specifically, the method aligns category distributions by extending the output of the domain discriminator to the number of categories; in this way, the domain discriminator and the feature extractor are trained adversarially. To suppress irrelevant semantic information, the classifier is drawn into a min-max game: the classifier strives to maximize the difference between predicted distributions, while the feature extractor strives to minimize it. In this process, irrelevant semantic information is suppressed and the primary semantic information is emphasized. In addition, the application introduces a random-ratio feature fusion scheme to enhance the source domain so that it carries the texture and style of the blended target domain, reducing the domain gap. As a result, the feature distributions of the same class from different domains form good clusters.
The technical scheme of the application is described below through specific examples.
Referring to fig. 3, a schematic step flow diagram of a training method of a hybrid target domain adaptive model according to an embodiment of the present application may specifically include the following steps:
S301, acquiring a source domain and a target domain, wherein the source domain comprises a plurality of marked samples with real tag information, the target domain comprises a plurality of sub-target domains, each sub-target domain comprises a plurality of unmarked samples, and the target domain is obtained by mixing the unmarked samples of the sub-target domains.
The method provided by the embodiments of the application can be applied to computer devices such as mobile phones, tablet computers, wearable devices, vehicle-mounted devices, augmented reality (AR)/virtual reality (VR) devices, notebook computers, ultra-mobile personal computers (UMPC), netbooks, and personal digital assistants (PDA); the embodiments of the application do not limit the specific type of the computer device.
The hybrid target domain adaptive model can be applied to cross-domain classification, for example cross-domain target recognition and cross-domain image classification; the trained model may be used to perform image classification tasks. The model may include a feature extractor for extracting features of the sample data, a classifier, and a domain discriminator. The feature extractor may be a neural network, for example ResNet-50. The features extracted by the feature extractor are passed to the classifier and the domain discriminator, so that the classifier can classify according to the features, and the domain discriminator can judge whether a sample belongs to the target domain or the source domain based on the features. In addition, in the present application, after training, the domain discriminator can also discriminate category information.
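For illustration, a minimal PyTorch sketch of such a three-part model is given below. The module names, the split point between low-level and high-level layers, and the discriminator architecture are assumptions of this sketch rather than specifics from the patent; only the ResNet-50 backbone, the classifier, and a domain discriminator whose output is widened to the number of categories follow the text.

```python
import torch
import torch.nn as nn
from torchvision import models

class SDNModel(nn.Module):
    """Illustrative three-part architecture: feature extractor F,
    classifier C, and a domain discriminator D_c with one output
    per class (all module names are this sketch's own)."""

    def __init__(self, num_classes: int, feat_dim: int = 2048):
        super().__init__()
        backbone = models.resnet50(weights="IMAGENET1K_V2")
        # Low-level layers: texture/style statistics are taken here.
        self.low_level = nn.Sequential(
            backbone.conv1, backbone.bn1, backbone.relu,
            backbone.maxpool, backbone.layer1)
        # High-level layers: semantic features for classification.
        self.high_level = nn.Sequential(
            backbone.layer2, backbone.layer3, backbone.layer4,
            backbone.avgpool, nn.Flatten())
        self.classifier = nn.Linear(feat_dim, num_classes)
        # Discriminator output widened to num_classes sigmoids so each
        # logit can align one class distribution across domains.
        self.domain_disc = nn.Sequential(
            nn.Linear(feat_dim, 1024), nn.ReLU(),
            nn.Linear(1024, num_classes), nn.Sigmoid())

    def forward(self, x):
        z = self.low_level(x)      # low-level feature map (B, C, H, W)
        feat = self.high_level(z)  # pooled feature vector (B, 2048)
        return z, feat, self.classifier(feat), self.domain_disc(feat)
```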
The source domain and target domain may be data sets composed of training data for model training. The data in the source domain contains real tag information. The target domain may be a blended target domain, i.e. a mixture of a plurality of different data sets: it may include a plurality of sub-target domains, each containing a plurality of unmarked samples, and the unmarked samples of the sub-target domains are mixed to obtain the target domain. In the embodiments of the application, the source domain containing $n_s$ marked samples may be defined as $\mathcal{S}=\{(x_i^s, y_i^s)\}_{i=1}^{n_s}$, and the $M$ target domains as $\{\mathcal{T}_m\}_{m=1}^{M}$, where $\mathcal{T}_m=\{x_j^{t_m}\}_{j=1}^{n_m}$ contains $n_m$ unmarked samples. The $M$ target domains are mixed together to form the blended target domain $\mathcal{T}$.
The data in the source domain and the target domain may be images, for example. The image in the source domain may have a corresponding label, while the image in the target domain may be unlabeled.
S302, respectively extracting sample features by a feature extractor, wherein the sample features comprise source domain features of the marked samples, target domain features of the unmarked samples and fusion features of the samples in the source domain and the target domain.
In general, feature fusion is a data enhancement method that improves model performance and neural network robustness. Studies have shown that low-level features mainly represent the texture, structure, and style of an image. Prior knowledge from the first few layers of the convolutional neural network is therefore used to extract the texture and style information of the target domain and mix it with the source domain at a random ratio. The resulting sample features then contain the semantic information of the source domain and the style of the target domain. Model training based on this fusion scheme improves the recognition capability of the classifier and the accuracy of the pseudo labels on the target domain.
Illustratively, the low-level feature map may be represented as $z \in \mathbb{R}^{C\times H\times W}$, where $C$ denotes the channel dimension and $H$ and $W$ denote the height and width. First, the channel-wise mean and variance of the source domain and target domain samples are calculated separately:

$$\mu(z) = \frac{1}{HW}\sum_{h=1}^{H}\sum_{w=1}^{W} z_{:,h,w},\qquad \sigma^2(z) = \frac{1}{HW}\sum_{h=1}^{H}\sum_{w=1}^{W}\bigl(z_{:,h,w}-\mu(z)\bigr)^2$$

Then, a mixed mean $\mu_{st}$ and mixed standard deviation $\sigma_{st}$ can be obtained using a random number $\alpha$ between 0 and 1:

$$\mu_{st} = \alpha\,\mu(z_s) + (1-\alpha)\,\mu(z_t),\qquad \sigma_{st} = \alpha\,\sigma(z_s) + (1-\alpha)\,\sigma(z_t)$$

Finally, enhanced source features with the target style can be obtained as:

$$z_{st} = \sigma_{st}\,\frac{z_s-\mu(z_s)}{\sigma(z_s)+\epsilon} + \mu_{st}$$

where $\epsilon$ may be set to $10^{-5}$ to prevent the denominator from being 0.
The enhanced source domain serves as a bridge to the target domain for knowledge transfer. Fusing the source features with the target low-level features improves the final recognition effect.
Based on this, when performing feature extraction, the computer device may extract the low-level source domain features of the marked samples and the low-level target domain features of the unmarked samples, respectively, through the low-level network of the feature extractor; fuse the low-level source domain features and the low-level target domain features to obtain low-level fusion features; and pass the low-level source domain features, the low-level target domain features, and the low-level fusion features through the high-level network of the feature extractor to obtain the source domain features, the target domain features, and the fusion features. The low-level fusion features are obtained by the formula above:

$$z_{st} = \sigma_{st}\,\frac{z_s-\mu(z_s)}{\sigma(z_s)+\epsilon} + \mu_{st}$$

wherein $z_{st}$ is the low-level fusion feature, $z_s$ the low-level source domain feature, $z_t$ the low-level target domain feature, the statistics $\mu(\cdot)$ and $\sigma(\cdot)$ are computed over the $H\times W$ spatial positions of the sample image ($H$ the height, $W$ the width), and $\alpha$ is a random number.
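The following is a minimal PyTorch sketch of this random-ratio fusion, assuming batched (B, C, H, W) low-level feature maps of equal batch size and one α drawn per sample; the function name and these batching details are this sketch's assumptions.

```python
import torch

def fuse_low_level_features(z_s: torch.Tensor, z_t: torch.Tensor,
                            eps: float = 1e-5) -> torch.Tensor:
    """Random-ratio style fusion of low-level feature maps (B, C, H, W):
    source content is re-normalized with statistics mixed between the
    source and target samples, following the mu/sigma mixing above."""
    mu_s = z_s.mean(dim=(2, 3), keepdim=True)
    sig_s = z_s.var(dim=(2, 3), keepdim=True).add(eps).sqrt()
    mu_t = z_t.mean(dim=(2, 3), keepdim=True)
    sig_t = z_t.var(dim=(2, 3), keepdim=True).add(eps).sqrt()
    alpha = torch.rand(z_s.size(0), 1, 1, 1, device=z_s.device)  # in (0, 1)
    mu_st = alpha * mu_s + (1 - alpha) * mu_t      # mixed mean
    sig_st = alpha * sig_s + (1 - alpha) * sig_t   # mixed std
    # Enhanced source features carrying the blended-target style.
    return sig_st * (z_s - mu_s) / sig_s + mu_st
```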
S303, calculating the model loss based on the source domain features, the target domain features, and the fusion features, wherein the model loss comprises the source supervised classification loss of the hybrid target domain adaptive model, the domain adversarial loss corresponding to the domain discriminator, and the prediction distribution difference loss corresponding to the classifier and the feature extractor.
Existing domain adaptation methods mainly align the whole image across the source and target domains. However, not all parts of an image are transferable, and forced alignment of non-transferable areas, such as the background, may result in negative transfer. It is therefore necessary to locate the irrelevant areas that lead to misclassification and suppress their features. The application proposes discriminative semantic adversarial learning, which focuses more on the discriminative features in the image and suppresses the features of wrongly predicted categories. Specifically, the classifier and the feature extractor may be trained adversarially. On one hand, the classifier increases the feature weight of the wrong class by maximizing the predicted distribution difference between pairs of samples of the same class. On the other hand, the feature extractor is trained to suppress the features of the wrong class and emphasize the discriminative feature map regions by minimizing that difference. Through this adversarial training process, the scheme can suppress irrelevant semantic information and carry out purer knowledge transfer.
Specifically, the source domain features, the target domain features, and the fusion features can be input into the classifier respectively to obtain a first prediction label corresponding to the source domain features, a second prediction label corresponding to the target domain features, and a third prediction label corresponding to the fusion features. The source supervised classification loss is calculated based on the first and third prediction labels by the following formula:

$$L_{CE} = \frac{1}{n_s}\sum_{i=1}^{n_s} \ell_{ce}\bigl(C(F(x_i^s)),\,y_i^s\bigr) + \frac{1}{n_s}\sum_{j=1}^{n_s} \ell_{ce}\bigl(C(f(z_{st}^j)),\,y_j^s\bigr)$$

wherein $L_{CE}$ is the source supervised classification loss, $n_s$ the number of samples in the source domain, $\ell_{ce}$ the cross-entropy loss function, $C$ the classifier, $F$ the feature extractor, $x_i^s$ the $i$-th marked sample, $y_i^s$ its real label preset for the source domain sample, $f$ the deep (high-level) network of the feature extractor, $z_{st}^j$ the $j$-th fusion feature, and $y_j^s$ its corresponding real label.
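A short PyTorch sketch of this loss, assuming the classifier logits for the source samples and for the fused features have already been computed; the argument names are this sketch's own.

```python
import torch.nn.functional as F_nn

def source_supervised_loss(cls_logits_src, cls_logits_fused, labels_src):
    """Cross-entropy on (a) labeled source samples and (b) fused
    source-target features, which inherit the source sample's label."""
    return (F_nn.cross_entropy(cls_logits_src, labels_src)
            + F_nn.cross_entropy(cls_logits_fused, labels_src))
```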
The domain adversarial loss may then be calculated based on the first and second prediction labels by the following formula:

$$L_D = -\frac{1}{n_s}\sum_{i=1}^{n_s} (y_i^s)^{\top}\log D_c\bigl(F(x_i^s)\bigr) - \frac{1}{n_t}\sum_{j=1}^{n_t} (\hat{y}_j^t)^{\top}\log\Bigl(1 - D_c\bigl(F(x_j^t)\bigr)\Bigr)$$

wherein $L_D$ is the domain adversarial loss, $n_s$ the number of samples in the source domain, $D_c$ the domain discriminator output for category $c$, $F$ the feature extractor, $x_i^s$ the $i$-th marked sample, $y_i^s$ its real (one-hot) label, $n_t$ the number of samples in the target domain, $x_j^t$ the $j$-th unmarked sample, and $\hat{y}_j^t$ its hybrid label, i.e. a mixture of the soft pseudo label and the one-hot pseudo label of the unmarked sample.
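A hedged PyTorch sketch of this class-conditional domain adversarial loss, assuming the discriminator emits per-class sigmoid outputs and that the target hybrid labels are available (see the uncertainty-guided pseudo-label mechanism described later):

```python
import torch

def domain_adversarial_loss(d_src, d_tgt, onehot_src, hybrid_tgt,
                            eps: float = 1e-8):
    """Class-conditional adversarial loss sketch: d_src/d_tgt are the
    per-class sigmoid outputs of the discriminator (B, num_classes);
    each output is gated by the (pseudo-)label so that only the
    sample's own class contributes, as in a per-class GAN."""
    loss_src = -(onehot_src * torch.log(d_src + eps)).sum(dim=1).mean()
    loss_tgt = -(hybrid_tgt * torch.log(1.0 - d_tgt + eps)).sum(dim=1).mean()
    return loss_src + loss_tgt
```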
The prediction distribution difference loss may be calculated based on the first and second prediction labels by the following formula:

$$L_{JS} = \frac{1}{N_{s,s}}\sum_{y_i^s=y_j^s} \mathrm{JS}\bigl(C(F(x_i^s)),\,C(F(x_j^s))\bigr) + \frac{T}{N_{s,t}}\sum_{y_i^s=\hat{y}_j^t} \mathrm{JS}\bigl(C(F(x_i^s)),\,C(F(x_j^t))\bigr)$$

wherein $L_{JS}$ is the prediction distribution difference loss, $\mathrm{JS}$ the Jensen-Shannon divergence, $T$ a harmonic parameter, $N_{s,s}$ the number of sample pairs satisfying $y_i^s=y_j^s$ (the real labels of two source domain samples are identical), and $N_{s,t}$ the number of sample pairs satisfying $y_i^s=\hat{y}_j^t$ (the real label of a source domain sample is identical to the predicted label of a target domain sample).
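A sketch of the JS-divergence term in PyTorch; the pairing of same-class samples is assumed to have been done by the caller, and treating T as a simple weighting factor is this sketch's assumption.

```python
import torch

def js_divergence(p: torch.Tensor, q: torch.Tensor) -> torch.Tensor:
    """Jensen-Shannon divergence between two batches of predicted
    class distributions (each row sums to 1); one value per pair."""
    m = 0.5 * (p + q)
    def kl(a, b):
        return (a * (a.clamp_min(1e-8).log()
                     - b.clamp_min(1e-8).log())).sum(dim=1)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def prediction_gap_loss(ss_pairs, st_pairs, T: float = 1.0) -> torch.Tensor:
    """L_JS sketch: ss_pairs and st_pairs are (p, q) tuples holding the
    predicted distributions of matched same-class source-source and
    source-target pairs; T weights the source-target term."""
    p_ss, q_ss = ss_pairs
    p_st, q_st = st_pairs
    return (js_divergence(p_ss, q_ss).mean()
            + T * js_divergence(p_st, q_st).mean())
```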
The loss function of the hybrid target domain adaptive model may be set to:

$$L(F, C, D_c) = L_{CE} + L_D + \beta\, L_{JS}$$

wherein $L_{CE}$ is the source supervised classification loss, $L_D$ the domain adversarial loss, $L_{JS}$ the prediction distribution difference loss, $F$ the feature extractor, $C$ the classifier, $D_c$ the domain discriminator, and $\beta$ a positive weight parameter for $L_{JS}$. As described above, $F$ and $D_c$ are optimized adversarially over $L_D$, and $F$ and $C$ are optimized adversarially over $L_{JS}$.
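One way to realize both min-max games in a single backward pass is a gradient reversal layer; the sketch below reuses the helper functions from the previous sketches. The gradient-reversal realization, the single shared optimizer, and the omission of same-class pair selection are simplifications of this sketch, not the patent's prescribed procedure.

```python
import torch
import torch.nn.functional as F_nn

class GradReverse(torch.autograd.Function):
    """Gradient reversal: identity in the forward pass, gradient scaled
    by -lamb in the backward pass, so the module before it maximizes
    what the module after it minimizes."""
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lamb * grad_output, None

def grad_reverse(x, lamb=1.0):
    return GradReverse.apply(x, lamb)

def train_step(model, optimizer, x_s, y_s, x_t, hybrid_tgt,
               beta=0.1, lamb=1.0):
    """One training iteration; source/target batches assumed equal size."""
    z_s, f_s, logits_s, _ = model(x_s)
    z_t, f_t, _, _ = model(x_t)

    # Fused features keep the source labels (fuse_low_level_features above).
    z_st = fuse_low_level_features(z_s, z_t)
    logits_st = model.classifier(model.high_level(z_st))
    l_ce = (F_nn.cross_entropy(logits_s, y_s)
            + F_nn.cross_entropy(logits_st, y_s))

    # F vs D_c game: gradients entering the discriminator are reversed,
    # so D_c minimizes L_D while F maximizes it (confuses D_c).
    d_s = model.domain_disc(grad_reverse(f_s, lamb))
    d_t = model.domain_disc(grad_reverse(f_t, lamb))
    onehot_s = F_nn.one_hot(y_s, logits_s.size(1)).float()
    l_d = domain_adversarial_loss(d_s, d_t, onehot_s, hybrid_tgt)

    # F vs C game: the minus sign makes C ascend on the JS gap, and the
    # reversal flips it again so F descends on the gap. Same-class pair
    # selection is omitted here for brevity.
    p_s = model.classifier(grad_reverse(f_s, lamb)).softmax(dim=1)
    p_t = model.classifier(grad_reverse(f_t, lamb)).softmax(dim=1)
    l_js = js_divergence(p_s, p_t).mean()

    loss = l_ce + l_d - beta * l_js
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss)
```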
S304, if it is determined, based on the model loss, that the loss function of the hybrid target domain adaptive model has converged, it is determined that training of the hybrid target domain adaptive model is complete.
If the model loss is smaller than a preset threshold, the loss function can be determined to have converged, and model training can accordingly be judged complete.
S305, if the loss function has not converged, the parameters of the feature extractor, the classifier, and the domain discriminator are updated, and training of the hybrid target domain adaptive model continues until the loss function converges.
In one possible implementation, the marked samples in the source domain are images with real tag information, the unmarked samples in the target domain are unmarked images, and the trained hybrid target domain adaptive model is used to classify the images.
Referring to fig. 4, a schematic diagram of the framework of the hybrid target domain adaptive model according to an embodiment of the present application is shown. As shown in fig. 4, the framework is composed of a feature extractor, a class classifier, and a domain discriminator with class information. The model is trained using source samples with real labels and unlabeled mixed target samples. The feature extractor and the domain discriminator perform adversarial training to align the category distributions and solve the domain shift problem, while the feature extractor and the classifier perform adversarial training to emphasize discriminative semantic information, suppress irrelevant semantic regions, and perform purer knowledge transfer.
Based on this framework of the hybrid target domain adaptive model, the scheme of the application provides a novel method for solving the image classification task of BTDA. The method comprises discriminative semantic adversarial learning, a random-ratio feature fusion scheme, and explicit class distribution alignment assisted by low-uncertainty pseudo labels.
In the BTDA scenario, the source domain containing $n_s$ marked samples is defined as $\mathcal{S}=\{(x_i^s, y_i^s)\}_{i=1}^{n_s}$, and the $M$ target domains are defined as $\{\mathcal{T}_m\}_{m=1}^{M}$, where $\mathcal{T}_m=\{x_j^{t_m}\}_{j=1}^{n_m}$ contains $n_m$ unmarked samples. The $M$ target domains are mixed together to form the blended target domain $\mathcal{T}$. Here $x_i^s$ and $y_i^s$ denote the $i$-th image and its corresponding label, and $n_s$ and $n_m$ denote the numbers of samples in $\mathcal{S}$ and $\mathcal{T}_m$, respectively. The data distributions of the source domain and the target domains differ from each other, which is one of the main obstacles in the BTDA problem. In the following, the feature extractor is denoted $F(\cdot)$, the classifier $C(\cdot)$, and the domain discriminator $D_c(\cdot)$. The goal of the scheme is to transfer knowledge from the source domain $\mathcal{S}$ to the blended target domain $\mathcal{T}$ and accurately predict the labels of the unlabeled target samples.
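As an aside, mixing the M sub-target datasets into one blended target domain can be sketched in PyTorch as follows; the dataset paths and transforms are placeholders, and shuffling discards the sub-domain identity, as the BTDA setting requires.

```python
from torch.utils.data import ConcatDataset, DataLoader
from torchvision import datasets, transforms

# Placeholder paths and transforms for the sub-target domains.
tf = transforms.Compose([transforms.Resize((224, 224)),
                         transforms.ToTensor()])
sub_targets = [datasets.ImageFolder(path, transform=tf)
               for path in ("data/clipart", "data/product", "data/real_world")]
# Concatenating and shuffling yields the single blended target domain T,
# whose samples are treated as unlabeled (labels are ignored in training).
blended_target = ConcatDataset(sub_targets)
target_loader = DataLoader(blended_target, batch_size=32, shuffle=True)
```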
Existing domain adaptation methods mainly align the whole image across the source and target domains. However, not all parts of an image are transferable, and forced alignment of non-transferable areas, such as the background, may lead to negative transfer. It is therefore necessary to locate the irrelevant areas that lead to misclassification and suppress their features. The scheme of the application proposes discriminative semantic adversarial learning, which focuses more on the discriminative features in the image and suppresses the features of mispredicted categories. Specifically, the classifier and the feature extractor undergo adversarial training: on one hand, the classifier increases the feature weight of the wrong class by maximizing the predicted distribution difference between pairs of samples of the same class; on the other hand, the feature extractor is trained to suppress the features of the wrong class and emphasize the discriminative feature map regions by minimizing that difference. Through this adversarial training process, the scheme can suppress irrelevant semantic information and carry out purer knowledge transfer.
The class activation map (CAM) for a particular class shows which parts of the image a convolutional neural network (CNN) uses to distinguish that class. For a given image, let $a_k(u,v)$ denote the activation of unit $k$ of the last convolutional layer at spatial location $(u,v)$. After global average pooling, the result for unit $k$ is

$$F_k = \frac{1}{HW}\sum_{u,v} a_k(u,v)$$

where $H$ and $W$ are the feature map sizes. Thus, for category $c$, the logit output is

$$z_c = \sum_k w_k^c\, F_k$$

where $w_k^c$ is the weight of class $c$ for unit $k$; in effect, $w_k^c$ represents the importance of $F_k$ for category $c$. The prediction score for category $c$ is then $p_c = \exp(z_c)/\sum_{c'}\exp(z_{c'})$. Substituting $F_k$ into $z_c$, it is possible to obtain:

$$z_c = \sum_k w_k^c \frac{1}{HW}\sum_{u,v} a_k(u,v) = \frac{1}{HW}\sum_{u,v}\sum_k w_k^c\, a_k(u,v)$$

Defining $M_c$ as the class-$c$ activation map, then

$$M_c(u,v) = \sum_k w_k^c\, a_k(u,v), \qquad z_c = \frac{1}{HW}\sum_{u,v} M_c(u,v)$$

Thus $M_c(u,v)$ directly shows the importance of the activation at $(u,v)$ for classifying the image into category $c$. By simply upsampling the class activation map to the size of the original input image, the image regions most relevant to a particular class can be found. It can thus be seen that the prediction score $p_c$ depends on the corresponding class activation map, and the CAM can locate class-specific image regions. This motivates the approach of finding irrelevant semantic regions and suppressing their features, which is discussed in the next subsection.
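A one-function PyTorch sketch of computing M_c from the last convolutional activations and the classifier weights; the shapes and names are this sketch's assumptions.

```python
import torch

def class_activation_map(a: torch.Tensor, w: torch.Tensor, c: int) -> torch.Tensor:
    """M_c(u, v) = sum_k w_k^c * a_k(u, v) for one image.
    a: last conv activations, shape (K, H, W);
    w: classifier weight matrix, shape (num_classes, K).
    The caller upsamples the returned (H, W) map to the input size."""
    return torch.einsum("k,khw->hw", w[c], a)
```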
The scheme uses two types of sample pairs: source-source pairs, where both images come from the source domain, and source-target pairs, where one image comes from the source domain and the other from the target domain. Within each pair, the images belong to the same class. To increase the weight of the irrelevant regions, the classifier maximizes the prediction distribution difference. Since the sample pairs belong to the same class and the prediction scores for that class are all high, maximizing the prediction distribution difference of a pair increases the weights of incorrect classes. For example, consider two pictures, one a cat with flowers and the other a cat with a hairtail fish, with predicted distributions [0.03, 0.81, 0.16] and [0.21, 0.7, 0.09], where the elements represent "fish", "cat", and "flower" in turn. When the distribution difference is maximized, the former image increases the score of "flower" and the latter increases the score of "fish"; "flower" and "fish" are semantics irrelevant to the prediction of "cat", so the weight of the wrong classes increases. The loss for maximizing the prediction distribution difference is defined as follows:

$$\max_{C}\; L_{JS} = \frac{1}{N_{s,s}}\sum_{y_i^s=y_j^s}\mathrm{JS}\bigl(C(F(x_i^s)),\,C(F(x_j^s))\bigr) + \frac{T}{N_{s,t}}\sum_{y_i^s=\hat{y}_j^t}\mathrm{JS}\bigl(C(F(x_i^s)),\,C(F(x_j^t))\bigr)$$

The Jensen-Shannon (JS) divergence is used to measure the distribution difference between a pair of predictions. Here $L_{JS}$ is the prediction distribution difference loss, $T$ a harmonic parameter, $N_{s,s}$ and $N_{s,t}$ the numbers of pairs satisfying $y_i^s=y_j^s$ (the real labels of two source domain samples are identical) and $y_i^s=\hat{y}_j^t$ (the real label of a source domain sample is identical to the predicted label of a target domain sample), respectively, and $C$ is the classifier.
In the classifier training above, the scheme increases the weight of irrelevant semantics and thereby finds the misclassified regions. These irrelevant regions then need to be suppressed, and the main features emphasized, by training the feature extractor. The feature extractor is trained to emphasize the discriminative regions by minimizing the prediction distribution difference of the sample pairs, i.e. $\min_{F} L_{JS}$.
Through this adversarial training method, purer knowledge transfer can be performed, interference from irrelevant semantic information is avoided, and classification accuracy is greatly improved.
In general, feature fusion is a data enhancement method that improves model performance and neural network robustness. Studies have shown that low-level features mainly represent the texture, structure, and style of an image. The scheme utilizes the prior knowledge of the first few layers of the convolutional neural network to extract the texture and style information of the target domain and mixes it with the source domain at a random ratio. The resulting samples then contain the semantic information of the source domain and the style of the target domain. This fusion scheme improves the discrimination capability of the classifier and the pseudo-label precision on the target domain during model training.
Formally, the low-level feature map is represented as $z\in\mathbb{R}^{C\times H\times W}$, where $C$ denotes the channel dimension and $H$ and $W$ the height and width. First, the mean and variance of the source domain and target domain samples are calculated separately:

$$\mu(z)=\frac{1}{HW}\sum_{h=1}^{H}\sum_{w=1}^{W} z_{:,h,w},\qquad \sigma^2(z)=\frac{1}{HW}\sum_{h=1}^{H}\sum_{w=1}^{W}\bigl(z_{:,h,w}-\mu(z)\bigr)^2$$

The mixed mean $\mu_{st}$ and mixed standard deviation $\sigma_{st}$ are then obtained using a random number $\alpha$ between 0 and 1:

$$\mu_{st}=\alpha\,\mu(z_s)+(1-\alpha)\,\mu(z_t),\qquad \sigma_{st}=\alpha\,\sigma(z_s)+(1-\alpha)\,\sigma(z_t)$$

Finally, the enhanced source features with the target style are obtained as:

$$z_{st}=\sigma_{st}\,\frac{z_s-\mu(z_s)}{\sigma(z_s)+\epsilon}+\mu_{st}$$

where $\epsilon$ is set to $10^{-5}$ to prevent the denominator from being 0.
Fusing the source features with the target low-level features improves the final recognition effect. Furthermore, the method does not require generating specific images, making it efficient in practice.
In previous adversarial-learning-based domain adaptation, the last layer of the domain discriminator has only one output per sample, called a logit, followed by a sigmoid function that produces the probability that the sample belongs to the source or the target. Similar to the paper "Mutual Conditional Blended-Target Domain Adaptation", the present approach extends the number of outputs of the domain discriminator to $c$, where $c$ is the number of categories. Each logit works independently, like a GAN (generative adversarial network), to reduce the conditional distribution differences between specific classes of $\mathcal{S}$ and $\mathcal{T}$. To make each logit correspond to a category, the prediction generated by the domain discriminator $D_c$ is multiplied by the corresponding one-hot label $y_i\in\{0,1\}^c$, and only the corresponding logit is activated to compute the adversarial loss. The domain adversarial loss is defined as follows:

$$L_D = -\frac{1}{n_s}\sum_{i=1}^{n_s}(y_i^s)^{\top}\log D_c\bigl(F(x_i^s)\bigr) - \frac{1}{n_t}\sum_{j=1}^{n_t}(\hat{y}_j^t)^{\top}\log\Bigl(1-D_c\bigl(F(x_j^t)\bigr)\Bigr)$$

wherein $y_i^s$ is the one-hot label of the $i$-th source sample, and $\hat{y}_j^t$ is the hybrid label mixing the soft and one-hot pseudo labels of the target data.
Target pseudo labels obtained from classifier predictions may be unreliable and adversely affect distribution alignment. An uncertainty-guided filtering mechanism is therefore adopted: low-uncertainty predictions are converted into one-hot pseudo labels, while high-uncertainty predictions are used directly as soft pseudo labels. Entropy is used as the uncertainty measure, and a prediction is considered reliable if its uncertainty is less than a threshold $\gamma$. The uncertainty measure and the hybrid label are as follows:

$$u(x_j^t) = -\sum_{c=1}^{C} p_c^j \log p_c^j, \qquad \hat{y}_j^t = \begin{cases}\mathrm{onehot}\bigl(\arg\max_c p_c^j\bigr), & u(x_j^t) < \gamma\\ p^j, & \text{otherwise}\end{cases}$$

where $p^j = C(F(x_j^t))$ is the predicted distribution of the $j$-th target sample.
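A PyTorch sketch of this uncertainty-guided filtering, assuming softmax predictions for a batch of target samples; the function name is this sketch's own.

```python
import torch

def hybrid_pseudo_labels(p_tgt: torch.Tensor, gamma: float) -> torch.Tensor:
    """p_tgt: softmax predictions for target samples, shape (B, C).
    Low-entropy (confident) rows become one-hot pseudo labels; the
    rest are kept as soft pseudo labels."""
    entropy = -(p_tgt * p_tgt.clamp_min(1e-8).log()).sum(dim=1)
    onehot = torch.zeros_like(p_tgt).scatter_(
        1, p_tgt.argmax(dim=1, keepdim=True), 1.0)
    confident = (entropy < gamma).float().unsqueeze(1)
    return confident * onehot + (1.0 - confident) * p_tgt
```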
during training, the domain discriminator gradually changes from indistinguishable to distinguishable classes, each log may represent a class distribution P (z|y). Some previous approaches actually align marginal distributions, but the approach of the present approach aligns category distributions. Therefore, the scheme of the application can align the category distribution and reduce the inter-domain difference, so that the mixed data can form good clusters.
It should be noted that the sequence numbers of the steps in the above embodiments do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
In order to illustrate the beneficial effects of the embodiments of the present application, the effects of the embodiments of the present application may be compared with other embodiments.
The method of the scheme was evaluated on the following three standard BTDA benchmark datasets.
Office-31: comprises three different domains: Amazon (A), Webcam (W), and DSLR (D). Each domain contains 31 common classes.
Office-Home: includes images from 65 categories across 4 different domains: Artistic (Ar), Product (Pr), Clipart (Cl), and Real-World (Rw).
ImageCLEF-DA: includes four different domains, each with images in 12 categories: Bing (B), ImageNet ILSVRC 2012 (I), Caltech-256 (C), and Pascal VOC 2012 (P).
For fair comparison, ResNet-50 is used as the backbone network for all datasets. All experiments were performed using PyTorch running on a GPU (GeForce RTX 4090). A mini-batch SGD optimizer with momentum 0.9 is adopted for network optimization. One subset of each dataset is selected as the source domain, and the remaining subsets are mixed together as the blended target domain. The accuracy on each sub-target domain is calculated, and the evaluation metric is the average of all the accuracies.
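A sketch of the optimizer setup consistent with this description: momentum 0.9 and mini-batch SGD follow the text, while the learning rate and weight decay below are placeholders of this sketch.

```python
import torch

model = SDNModel(num_classes=65)  # e.g. Office-Home; see the model sketch above
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3,
                            momentum=0.9, weight_decay=5e-4)
```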
The classification results on Office-31, Office-Home, and ImageCLEF-DA are shown in Tables 1 and 2. The "Avg" column is the average value.
TABLE 1
The results of the Office-31 evaluation using ResNet-50 are shown in Table 1. The average accuracy of the proposed method is 91.6%, a significant performance improvement over the baseline method DANN. Notably, the most significant accuracy improvement is observed on task A→W/D, where the method achieves a gain of 16.5%. In addition, the method is 2% higher than the most advanced BTDA methods (such as MCDA) on Office-31.
As Table 1 also shows, the scheme achieves better results on Office-Home. For example, the method achieves an accuracy gain of about 4.7% over the latest BTDA method MCDA. Its average accuracy is 75.8%, which is 17.9% higher than the baseline method DANN. This substantial improvement demonstrates the effectiveness and superiority of the method. Furthermore, even compared with methods that use domain labels, such as MTDA, DLC, and DCGCT, the method still leads them by 11.7% on Office-Home.
TABLE 2
Table 2: accuracy (%) of BTDA on ImageCLEF-DA (ResNet-50). Each field in the table represents a source, and the remaining fields are blended as targets. Precision is the average precision of all target fields in the hybrid target. * Representing the results generated by the publicly published code. Table 2 shows the classification accuracy of the ResNet-50 based ImageCLEF-DA dataset. In the task C-B/I/P, the classification accuracy is slightly lower than that of the current most advanced method HTA. However, among the other three tasks, a large number of experimental results reached the highest value, and the average accuracy on ImageCLEF-DA was also the highest comparison method. Referring to fig. 5, a visual effect diagram for semantic discrimination based on a hybrid target domain adaptive model according to an embodiment of the present application is shown. As shown in fig. 5, semantic information located by different methods is visually displayed. Original represents an Original picture, MCDA is the latest method for comparison, SDN is the method proposed by the present application, and as can be seen in fig. 5, the actual labels of the Original pictures in the prediction result domain of the method proposed by the present application are the same. The scheme of the application has better effect.
To verify the discriminative semantic emphasis method of the scheme, feature maps sampled from the Office-Home dataset were randomly visualized. It can be seen from the figure that the proposed method helps the adaptive model concentrate on the feature map regions with discriminative semantic information. These results intuitively demonstrate that SDN can successfully address the semantic emphasis of key parts in image classification.
Referring to fig. 6, a comparison of the feature clustering visualizations of different models according to the embodiment of the present application is shown. FIG. 6 visualizes the feature distributions of task A→D/W using t-SNE. For ResNet-50, the target domain features are scattered in a cluttered manner around poorly formed source domain feature clusters. The DANN method also fails to form good clusters of all domain features in the mixed feature space. In contrast, SDN builds compact clusters of all domain features. These results show that the proposed method is very effective for BTDA tasks.
Referring to fig. 7, a schematic diagram of a training apparatus for a hybrid target domain adaptive model according to an embodiment of the present application may specifically include a data acquisition module 71, a feature extraction module 72, a loss calculation module 73, a first judgment module 74, and a second judgment module 75, where:
a data acquisition module 71, configured to acquire a source domain and a target domain, where the source domain includes a plurality of marked samples with real tag information, and the target domain includes a plurality of unmarked samples;
a feature extraction module 72 for extracting sample features by a feature extractor, respectively, the sample features including source domain features of the marked samples, target domain features of the unmarked samples, and fusion features of the samples in the source domain and the samples in the target domain;
a loss calculation module 73, configured to calculate the model loss based on the source domain features, the target domain features, and the fusion features, where the model loss includes the source supervised classification loss of the hybrid target domain adaptive model, the domain adversarial loss corresponding to the domain discriminator, and the prediction distribution difference loss corresponding to the classifier and the feature extractor;
a first judging module 74, configured to determine that the training of the hybrid target domain adaptive model is completed if it is determined that the loss function of the hybrid target domain adaptive model converges based on the model loss;
and a second judging module 75, configured to update parameters of the feature extractor, the classifier, and the domain discriminator if the loss function does not converge, and continue training the hybrid target domain adaptive model until the loss function converges.
In one possible implementation, the feature extraction module 72 includes:
a low-level feature extraction sub-module, configured to extract, through a low-level network of the feature extractor, low-level features of the samples in the source domain and low-level features of the samples in the target domain, respectively;
a low-level feature fusion sub-module, configured to fuse the low-level features of the sample in the source domain with the low-level features of the sample in the target domain to obtain low-level fusion features;
and a high-level feature extraction sub-module, configured to pass the low-level source domain features, the low-level target domain features, and the low-level fusion features through the high-level network of the feature extractor to obtain the source domain features, the target domain features, and the fusion features.
In one possible implementation, the low-level source domain features and the low-level target domain features are fused by the following formula to obtain the low-level fusion features:

$$z_{st}=\sigma_{st}\,\frac{z_s-\mu(z_s)}{\sigma(z_s)+\epsilon}+\mu_{st}$$

wherein $z_{st}$ is the low-level fusion feature, $z_s$ the low-level source domain feature, $z_t$ the low-level target domain feature, the statistics are computed over the $H\times W$ spatial positions of the sample image ($H$ the height, $W$ the width), and $\alpha$ is the random number used to mix the means and standard deviations.
In one possible implementation, the loss calculation module 73 includes:
the prediction tag calculation sub-module is used for respectively inputting the source domain feature, the target domain feature and the fusion feature into the classifier to obtain a first prediction tag corresponding to the source domain feature, a second prediction tag corresponding to the target domain feature and a third prediction tag corresponding to the fusion feature;
a source supervision classification loss calculation sub-module for calculating the source supervision classification loss based on the first prediction tag and the third prediction tag;
A domain counter loss calculation sub-module for calculating the domain counter loss based on the first predictive tag and the second predictive tag;
and the prediction distribution difference loss calculation sub-module is used for calculating the prediction distribution difference loss based on the first prediction tag and the second prediction tag.
In one possible implementation, the source supervised classification loss is calculated based on the first and third prediction labels by:

$$L_{CE} = \frac{1}{n_s}\sum_{i=1}^{n_s}\ell_{ce}\bigl(C(F(x_i^s)),\,y_i^s\bigr) + \frac{1}{n_s}\sum_{j=1}^{n_s}\ell_{ce}\bigl(C(f(z_{st}^j)),\,y_j^s\bigr)$$

wherein $L_{CE}$ is the source supervised classification loss, $n_s$ the number of samples in the source domain, $\ell_{ce}$ the cross-entropy loss function, $C$ the classifier, $F$ the feature extractor, $x_i^s$ the $i$-th marked sample with real label $y_i^s$, $f$ the deep network of the feature extractor, $z_{st}^j$ the $j$-th fusion feature, and $y_j^s$ its corresponding real label.
In one possible implementation, the domain adversarial loss is calculated based on the first and second prediction labels by:

$$L_D = -\frac{1}{n_s}\sum_{i=1}^{n_s}(y_i^s)^{\top}\log D_c\bigl(F(x_i^s)\bigr) - \frac{1}{n_t}\sum_{j=1}^{n_t}(\hat{y}_j^t)^{\top}\log\Bigl(1-D_c\bigl(F(x_j^t)\bigr)\Bigr)$$

wherein $L_D$ is the domain adversarial loss, $n_s$ the number of samples in the source domain, $D_c$ the domain discriminator output for category $c$, $F$ the feature extractor, $x_i^s$ the $i$-th marked sample with real label $y_i^s$, $n_t$ the number of samples in the target domain, $x_j^t$ the $j$-th unmarked sample, and $\hat{y}_j^t$ its hybrid label.
In one possible implementation, the prediction distribution difference loss is calculated based on the first and second prediction labels by:

$$L_{JS} = \frac{1}{N_{s,s}}\sum_{y_i^s=y_j^s}\mathrm{JS}\bigl(C(F(x_i^s)),\,C(F(x_j^s))\bigr) + \frac{T}{N_{s,t}}\sum_{y_i^s=\hat{y}_j^t}\mathrm{JS}\bigl(C(F(x_i^s)),\,C(F(x_j^t))\bigr)$$

wherein $L_{JS}$ is the prediction distribution difference loss, $\mathrm{JS}$ the Jensen-Shannon divergence function, $T$ a harmonic parameter, $N_{s,s}$ the number of pairs satisfying $y_i^s=y_j^s$ (the real labels of two source domain samples are identical), and $N_{s,t}$ the number of pairs satisfying $y_i^s=\hat{y}_j^t$ (the real label of a source domain sample is identical to the predicted label of a target domain sample).
In one possible implementation, the loss function is:

$$L(F, C, D) = L_{CE} + L_D + \beta\,L_{JS}$$

wherein $L_{CE}$ is the source supervised classification loss, $L_D$ the domain adversarial loss, $L_{JS}$ the prediction distribution difference loss, $F$ the feature extractor, $C$ the classifier, $D$ the domain discriminator, and $\beta$ a positive weight parameter for $L_{JS}$.
In one possible implementation, the marked samples in the source domain are images with real tag information, the unmarked samples in the target domain are unmarked images, and the trained hybrid target domain adaptive model is used for classifying images.
For the device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference should be made to the description of the method embodiments.
Fig. 8 is a schematic structural diagram of a computer device according to an embodiment of the present application. As shown in fig. 8, the computer device 800 of this embodiment includes: at least one processor 80 (only one is shown in fig. 8), a memory 81, and a computer program 82 stored in the memory 81 and executable on the at least one processor 80, wherein the processor 80 implements the steps in any of the method embodiments described above when executing the computer program 82.
The computer device 800 may be a desktop computer, a notebook computer, a palmtop computer, a cloud server, or the like. The computer device may include, but is not limited to, the processor 80 and the memory 81. It will be appreciated by those skilled in the art that fig. 8 is merely an example of the computer device 800 and does not limit it; the computer device 800 may include more or fewer components than shown, combine certain components, or use different components, and may further include, for example, input/output devices and network access devices.
The processor 80 may be a central processing unit (Central Processing Unit, CPU), and may also be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 81 may, in some embodiments, be an internal storage unit of the computer device 800, such as a hard disk or memory of the computer device 800. In other embodiments, the memory 81 may also be an external storage device of the computer device 800, such as a plug-in hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, or a flash card (Flash Card) equipped on the computer device 800. Further, the memory 81 may include both an internal storage unit and an external storage device of the computer device 800. The memory 81 is used for storing an operating system, application programs, a boot loader (BootLoader), data, and other programs, such as the program code of the computer program. The memory 81 may also be used to temporarily store data that has been output or is to be output.
Embodiments of the present application also provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the various method embodiments described above.

Embodiments of the present application also provide a computer program product which, when run on a computer device, causes the computer device to perform the steps of the respective method embodiments described above.
The above embodiments are merely intended to illustrate the technical solution of the present application, not to limit it. Although the application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (10)

1. A method of training a hybrid target domain adaptive model, the hybrid target domain adaptive model comprising a feature extractor, a classifier, and a domain discriminator, the method comprising:
acquiring a source domain and a target domain, wherein the source domain comprises a plurality of marked samples with real label information, the target domain comprises a plurality of sub-target domains, each sub-target domain comprises a plurality of unmarked samples, and the unmarked samples of the sub-target domains are mixed to obtain the target domain;
extracting sample features by the feature extractor, wherein the sample features comprise source domain features of the marked samples, target domain features of the unmarked samples, and fusion features of the samples in the source domain and the samples in the target domain;
calculating model losses based on the source domain features, the target domain features and the fusion features, wherein the model losses comprise a source supervised classification loss of the hybrid target domain adaptive model, a domain adversarial loss corresponding to the domain discriminator, and a prediction distribution difference loss corresponding to the classifier and the feature extractor;
if it is determined, based on the model losses, that the loss function of the hybrid target domain adaptive model converges, determining that training of the hybrid target domain adaptive model is completed;
if the loss function does not converge, updating parameters of the feature extractor, the classifier and the domain discriminator, and continuing to train the hybrid target domain adaptive model until the loss function converges.
2. The method of claim 1, wherein the extracting, by the feature extractor, the source domain features corresponding to the marked samples, the target domain features corresponding to the unmarked samples, and the fusion features of the samples in the source domain and the samples in the target domain comprises:
extracting low-level features of the samples in the source domain and low-level features of the samples in the target domain respectively through a low-level network of the feature extractor;
fusing the low-level features of the samples in the source domain and the low-level features of the samples in the target domain to obtain low-level fusion features;
and inputting the low-level features of the samples in the source domain, the low-level features of the samples in the target domain and the low-level fusion features into a high-level network of the feature extractor respectively, to obtain the source domain features, the target domain features and the fusion features.
3. The method of claim 2, wherein the low-level features of the samples in the source domain are fused with the low-level features of the samples in the target domain to obtain the low-level fusion features by the following formula:
$$z_{st} = \alpha z_s + (1-\alpha)\, z_t, \qquad z_s,\, z_t \in \mathbb{R}^{C \times H \times W}$$

wherein $z_{st}$ is the low-level fusion feature, $z_s$ is the low-level source domain feature, $z_t$ is the low-level target domain feature, $H$ is the height of the sample image, $W$ is the width of the sample image, and $\alpha$ is a random number.
4. The method of claim 1, wherein the calculating the source supervised classification loss of the hybrid target domain adaptive model, the domain adversarial loss corresponding to the domain discriminator, and the prediction distribution difference loss corresponding to the classifier and the feature extractor based on the source domain features, the target domain features and the fusion features comprises:
inputting the source domain features, the target domain features and the fusion features into the classifier respectively, to obtain a first prediction label corresponding to the source domain features, a second prediction label corresponding to the target domain features and a third prediction label corresponding to the fusion features;
calculating the source supervised classification loss based on the first prediction label and the third prediction label;
calculating the domain adversarial loss based on the first prediction label and the second prediction label; and
calculating the prediction distribution difference loss based on the first prediction label and the second prediction label.
5. The method of claim 4, wherein the source supervised classification loss is calculated based on the first prediction label and the third prediction label by the following formula:
$$L_{CE} = \frac{1}{n_s}\sum_{i=1}^{n_s} l_{ce}\big(\hat{y}_i^s,\, y_i^s\big) + \frac{1}{n_s}\sum_{j=1}^{n_s} l_{ce}\big(C\big(f(z_{st}^j)\big),\, y_j^{st}\big), \qquad \hat{y}_i^s = C\big(F(x_i^s)\big)$$

wherein $L_{CE}$ is the source supervised classification loss, $n_s$ is the number of samples of the source domain, $l_{ce}$ is the cross-entropy loss function, $C$ is the classifier, $F$ is the feature extractor, $x_i^s$ is the $i$-th marked sample, $\hat{y}_i^s$ is the first prediction label corresponding to $x_i^s$, $f$ is the high-level network of the feature extractor, $z_{st}^j$ is the $j$-th fusion feature, and $y_j^{st}$ is the real label corresponding to $z_{st}^j$.
6. The method of claim 4, wherein the domain adversarial loss is calculated based on the first prediction label and the second prediction label by the following formula:
$$L_D = -\frac{1}{n_s}\sum_{i=1}^{n_s} \log D_{y_i^s}\big(F(x_i^s)\big) - \frac{1}{n_t}\sum_{j=1}^{n_t} \log\Big(1 - D_{\hat{y}_j^t}\big(F(x_j^t)\big)\Big)$$

wherein $L_D$ is the domain adversarial loss, $n_s$ is the number of samples of the source domain, $D_c$ is the discrimination output of the domain discriminator for class $c$, $F$ is the feature extractor, $x_i^s$ is the $i$-th marked sample, $y_i^s$ is the real label corresponding to $x_i^s$, $n_t$ is the number of samples of the target domain, $x_j^t$ is the $j$-th unmarked sample, and $\hat{y}_j^t$ is the mixed label of $x_j^t$.
7. The method of claim 4, wherein the prediction distribution difference loss is calculated based on the first prediction label and the second prediction label by the following formula:
$$L_{JS} = \frac{1}{N_{s,s}} \sum_{y_i^s = y_j^s} \mathrm{JS}\big(p_i^s \,\|\, p_j^s\big) + \frac{1}{N_{s,t}} \sum_{y_i^s = \hat{y}_j^t} \mathrm{JS}\big(p_i^s \,\|\, p_j^t\big), \qquad p = \mathrm{softmax}\big(C(F(x))/T\big)$$

wherein $L_{JS}$ is the prediction distribution difference loss, $\mathrm{JS}$ is the Jensen–Shannon divergence function, $T$ is the harmonic (temperature) parameter used to soften the predicted distributions $p$, $N_{s,s}$ is the number of sample pairs satisfying $y_i^s = y_j^s$, i.e. the true labels of the two source domain samples are identical, and $N_{s,t}$ is the number of sample pairs satisfying $y_i^s = \hat{y}_j^t$, i.e. the true label of a source domain sample is identical to the predicted label of a target domain sample.
8. The method of claim 4, wherein the loss function is:
$$\min_{F,\,C}\,\max_{D}\;\; L_{CE} - L_D + \beta L_{JS}$$

wherein $L_{CE}$ is the source supervised classification loss, $L_D$ is the domain adversarial loss, $L_{JS}$ is the prediction distribution difference loss, $F$ is the feature extractor, $C$ is the classifier, $D$ is the domain discriminator, and $\beta$ is the positive weight parameter of $L_{JS}$.
9. The method of claim 1, wherein the marked samples in the source domain are images with real label information, the unmarked samples in the target domain are unmarked images, and the trained hybrid target domain adaptive model is used to classify images.
10. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 1-9 when executing the computer program.
CN202311337554.6A 2023-10-16 2023-10-16 Training method and device for hybrid target domain adaptive model and computer equipment Active CN117152563B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311337554.6A CN117152563B (en) 2023-10-16 2023-10-16 Training method and device for hybrid target domain adaptive model and computer equipment

Publications (2)

Publication Number Publication Date
CN117152563A true CN117152563A (en) 2023-12-01
CN117152563B CN117152563B (en) 2024-05-14

Family

ID=88898930

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311337554.6A Active CN117152563B (en) 2023-10-16 2023-10-16 Training method and device for hybrid target domain adaptive model and computer equipment

Country Status (1)

Country Link
CN (1) CN117152563B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111738315A (en) * 2020-06-10 2020-10-02 西安电子科技大学 Image classification method based on countermeasure fusion multi-source transfer learning
US20210012198A1 (en) * 2018-05-31 2021-01-14 Huawei Technologies Co., Ltd. Method for training deep neural network and apparatus
CN113298189A (en) * 2021-06-30 2021-08-24 广东工业大学 Cross-domain image classification method based on unsupervised domain self-adaption
CN114492574A (en) * 2021-12-22 2022-05-13 中国矿业大学 Pseudo label loss unsupervised countermeasure domain adaptive picture classification method based on Gaussian uniform mixing model
CN114781647A (en) * 2022-04-11 2022-07-22 南京信息工程大学 Unsupervised domain adaptation method for distinguishing simple samples from difficult samples
CN115393599A (en) * 2021-05-21 2022-11-25 北京沃东天骏信息技术有限公司 Method, device, electronic equipment and medium for constructing image semantic segmentation model and image processing
US20220383052A1 (en) * 2021-05-18 2022-12-01 Zhejiang University Unsupervised domain adaptation method, device, system and storage medium of semantic segmentation based on uniform clustering
CN115439788A (en) * 2022-09-11 2022-12-06 复旦大学 Domain self-adaptive method for migrating video model from source domain to target domain
CN116227578A (en) * 2022-12-13 2023-06-06 浙江工业大学 Unsupervised domain adaptation method for passive domain data
CN116263785A (en) * 2022-11-16 2023-06-16 中移(苏州)软件技术有限公司 Training method, classification method and device of cross-domain text classification model
CN116468991A (en) * 2023-02-24 2023-07-21 西安电子科技大学 Incremental-like non-supervision domain self-adaptive image recognition method based on progressive calibration
CN116824216A (en) * 2023-05-22 2023-09-29 南京信息工程大学 Passive unsupervised domain adaptive image classification method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LI Y, et al., "Network architecture search for domain adaptation", arXiv, 31 December 2020 *
YAN K, et al., "Deep transfer learning for cross-species plant disease diagnosis adapting mixed subdomains", IEEE/ACM Transactions on Computational Biology and Bioinformatics, 31 December 2021 *

Also Published As

Publication number Publication date
CN117152563B (en) 2024-05-14

Similar Documents

Publication Publication Date Title
US10762373B2 (en) Image recognition method and device
CN110837836A (en) Semi-supervised semantic segmentation method based on maximized confidence
GB2547313A (en) Accurate tag relevance prediction for image search
WO2019080411A1 (en) Electrical apparatus, facial image clustering search method, and computer readable storage medium
US20210056362A1 (en) Negative sampling algorithm for enhanced image classification
US20210174104A1 (en) Finger vein comparison method, computer equipment, and storage medium
JP2022014776A (en) Activity detection device, activity detection system, and activity detection method
Li et al. Lcnn: Low-level feature embedded cnn for salient object detection
Wang et al. S 3 d: scalable pedestrian detection via score scale surface discrimination
Liu et al. Robust salient object detection for RGB images
Karaoglu et al. Detect2rank: Combining object detectors using learning to rank
Ma et al. Location-aware box reasoning for anchor-based single-shot object detection
Lin et al. Hierarchical representation via message propagation for robust model fitting
CN115391588B (en) Fine adjustment method and image-text retrieval method of visual language pre-training model
WO2020135054A1 (en) Method, device and apparatus for video recommendation and storage medium
CN117152563B (en) Training method and device for hybrid target domain adaptive model and computer equipment
CN112424784A (en) Systems, methods, and computer-readable media for improved table identification using neural networks
Pereira et al. Assessing active learning strategies to improve the quality control of the soybean seed vigor
CN112989869B (en) Optimization method, device, equipment and storage medium of face quality detection model
Ma et al. Depth-guided progressive network for object detection
CN114219047B (en) Heterogeneous domain self-adaption method, device and equipment based on pseudo label screening
Tian et al. Structure-aware semantic-aligned network for universal cross-domain retrieval
Cai et al. Semantic and Correlation Disentangled Graph Convolutions for Multilabel Image Recognition
Ma et al. Capsule-Based Regression Tracking via Background Inpainting
En et al. Human-like delicate region erasing strategy for weakly supervised detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant