CN109919209B - Domain self-adaptive deep learning method and readable storage medium - Google Patents

Domain self-adaptive deep learning method and readable storage medium

Info

Publication number
CN109919209B
CN109919209B (application CN201910139916.8A)
Authority
CN
China
Prior art keywords
learning
self
image
domain
target domain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910139916.8A
Other languages
Chinese (zh)
Other versions
CN109919209A (en)
Inventor
许娇龙
聂一鸣
肖良
朱琪
商尔科
戴斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Defense Technology Innovation Institute PLA Academy of Military Science
Original Assignee
National Defense Technology Innovation Institute PLA Academy of Military Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Defense Technology Innovation Institute PLA Academy of Military Science filed Critical National Defense Technology Innovation Institute PLA Academy of Military Science
Priority to CN201910139916.8A priority Critical patent/CN109919209B/en
Publication of CN109919209A publication Critical patent/CN109919209A/en
Application granted granted Critical
Publication of CN109919209B publication Critical patent/CN109919209B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a domain adaptive deep learning method, which applies rotation transformations to target-domain images to obtain a self-supervised learning training sample set, then jointly trains on this transformed self-supervised sample set and the source-domain training sample set to obtain a domain adaptive deep learning model for the visual task on the target domain. The method requires no labeling of target-domain samples, effectively learns a feature representation of the target domain, and improves the performance of computer vision tasks on the target domain. The application also discloses a readable storage medium for domain adaptive deep learning, which shares the same benefits.

Description

Domain self-adaptive deep learning method and readable storage medium
Technical Field
The invention relates to the field of domain adaptive deep learning, and in particular to a domain adaptive deep learning method and a readable storage medium for computer vision tasks.
Background
Models for computer vision tasks such as image classification, semantic segmentation, object recognition, and object detection are usually obtained through supervised training. Supervised learning, particularly with deep neural networks, typically requires a large number of labeled training samples. Labeling these samples consumes substantial manpower and material resources; image segmentation, for example, requires pixel-by-pixel semantic annotation, which is both difficult and costly. After a model is trained on annotated data, it is applied to test data. Supervised learning is very effective when the test data and the training data share the same distribution. In practice, however, the test distribution often differs from the training distribution, degrading the performance of a model learned on the training data.
Domain adaptation is a class of techniques for addressing this performance degradation caused by the mismatch between the training and test distributions. The training data set is referred to as the source domain and the test data set as the target domain. Source-domain data carry annotation information, while target-domain data typically carry none. Domain adaptation aims to transfer the supervision information of the source domain to the target domain and thereby improve task performance on the target domain.
Domain adaptive learning based on deep neural networks generally improves target-domain performance by learning a feature representation that is invariant across domains, i.e., one with domain commonality. The current mainstream approach obtains such a representation through domain adversarial training. Because adversarial training must simultaneously optimize a pair of mutually opposed objective functions, it converges with more difficulty than non-adversarial training, and the resulting model is often suboptimal.
Disclosure of Invention
The technical problem solved by the invention is to provide a domain adaptive deep learning method oriented to computer vision tasks: a non-adversarial domain adaptation method that improves task performance on the target domain. The application also provides a readable storage medium for domain adaptive deep learning, which addresses the same technical problem.
The application provides a domain adaptive deep learning method, comprising the following steps:
Step 1: rotate each target-domain image by a set of angles, the images formed by each rotation corresponding to distinct category labels; scale and crop the rotated images to the same size, then randomly shuffle the order of all images while keeping each image's class label unchanged, forming a self-supervised learning training sample set;
Step 2: through a multi-task learning deep neural network, construct the target-domain visual task (T) for the source-domain training sample set and an image classification task for the self-supervised learning training sample set, then jointly train on the source-domain training sample set and the self-supervised learning training sample set;
Step 3: apply the deep learning model obtained by the joint training to the visual task (T) on the target domain.
Optionally, each target-domain image in step 1 is rotated by 0°, 90°, and 180°, forming three new images.
Optionally, the rotated images in step 1 are scaled and cropped to 224 pixels in both height and width.
Optionally, the rotated images in step 1 undergo data enhancement before scaling, including random brightness or saturation adjustment.
Optionally, the multitask learning deep neural network comprises an encoder backbone network (F), an image classifier network branch (C) and a visual task network branch (S);
the encoder backbone network (F) and the image classifier network branch (C) construct the image classification task for the self-supervised learning training sample set, and the encoder backbone network (F) and the visual task network branch (S) construct the visual task (T) for the source-domain training sample set.
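As a rough illustration of this branch structure, the sketch below wires a shared encoder backbone F into the two heads C and S. The toy `Linear` module, the feature width of 64, and the use of plain matrix multiplies in place of deep networks are illustrative assumptions, not the patent's architecture; the head sizes (3 rotation classes, 13 task classes) follow the embodiments described later.

```python
import numpy as np

rng = np.random.default_rng(0)

class Linear:
    """Toy stand-in for a network module (the patent's F, C, S are deep nets)."""
    def __init__(self, n_in, n_out):
        self.W = rng.standard_normal((n_in, n_out)) * 0.01

    def __call__(self, x):
        return x @ self.W

F = Linear(224 * 224, 64)   # shared encoder backbone network F
C = Linear(64, 3)           # image classifier branch C: 3 rotation classes
S = Linear(64, 13)          # visual-task branch S, e.g. 13 segmentation classes

x = rng.standard_normal((1, 224 * 224))   # one flattened 224x224 image
rot_logits = C(F(x))   # self-supervised branch: F -> C
task_out = S(F(x))     # supervised branch: F -> S (F is shared by both)
```

Because both branches call the same `F`, any parameter update to `F` from either loss affects the other branch — the property the joint training below relies on.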
The invention provides a readable storage medium having stored thereon a program which, when executed by a processor, performs the steps of the domain adaptive deep learning method.
The domain adaptive deep learning method provided by the invention learns a feature representation of the target domain by combining supervised learning on the source domain with a self-supervised learning task on the target domain, thereby achieving domain adaptation. The method fully exploits the efficiency of supervised learning for deep neural network training while constructing a target-domain self-supervised training set that requires no manual labeling. Through joint training on source-domain and target-domain samples, the commonality between the two domains is exploited to establish a feature representation adapted to the target-domain task, improving task performance on the target domain.
The domain adaptive deep learning readable storage medium provided by the invention shares these benefits.
Drawings
Fig. 1 is a schematic flow chart of the self-supervised domain adaptive deep learning training process according to an embodiment of the present invention.
Fig. 2 is a schematic flow chart of applying the trained model to target-domain testing.
Detailed Description
The present invention is described in further detail below with reference to the attached drawing figures.
Fig. 1 shows a schematic diagram of the training process of the domain adaptive deep learning method according to an embodiment of the present invention. The method comprises the following steps:
Step 1: apply rotation transformations to the target-domain images to obtain a self-supervised learning training sample set;
Step 2: jointly train on the transformed self-supervised learning training sample set and the source-domain training samples to obtain a deep learning model;
Step 3: apply the model obtained by the joint training to the visual task T on the target domain.
Step 1 applies rotation transformations to the target-domain images to obtain the self-supervised learning training sample set. Each target-domain image is first rotated by 0°, 90°, and 180°, the three rotation angles corresponding to category labels 0, 1, and 2 respectively. No image needs to be labeled manually: labeled samples for self-supervised learning are generated automatically. The rotated images then undergo data-enhancement preprocessing, including random contrast and brightness adjustment, and are scaled and cropped to a uniform size of 224 × 224 pixels. Finally, the processed images are randomly shuffled, with each image's label unchanged. This process takes the target-domain images as input and outputs the transformed target-domain images X_t and their corresponding rotation-type labels Y_t.
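Under simplifying assumptions (square image arrays, rotations restricted to 90° multiples, with the scaling/cropping and photometric jitter steps omitted for brevity), the sample-set construction above could be sketched as:

```python
import random
import numpy as np

def build_rotation_samples(images, angles=(0, 90, 180), seed=0):
    """Generate a self-supervised training set: each image is rotated by the
    given angles, and the index of the rotation serves as its class label
    (0, 90, 180 degrees -> labels 0, 1, 2). No manual annotation is needed.
    The scale-and-crop to 224x224 and the brightness/contrast jitter described
    in the patent are omitted here."""
    samples = [(np.rot90(img, k=angle // 90), label)
               for img in images
               for label, angle in enumerate(angles)]
    random.Random(seed).shuffle(samples)  # shuffle order; labels stay attached
    return samples
```

The function name and signature are illustrative; the key point is that the label is derived from the transformation itself, so the labeled set is generated automatically.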
Step 2 jointly trains on the transformed self-supervised learning training sample set and the source-domain training samples to obtain the deep learning model. A multi-task learning deep neural network is first constructed, comprising a feature-extraction encoder backbone network F, an image classifier network branch C, and a network branch S corresponding to the visual task T used on the target domain. The encoder backbone F together with the visual task branch S performs supervised learning of task T on the source-domain samples; the encoder backbone F together with the image classifier branch C performs the image classification task on the transformed target-domain self-supervised samples.
In the figure, solid black arrows denote forward propagation of data through the neural network, and dashed lines denote backward propagation of gradients. For the image classification task, the transformed target-domain image X_t is fed into the encoder backbone F, whose output features serve as input to the image classifier branch C; the output Y_t* is the predicted rotation category. The classifier loss function computes the classifier error from the label class Y_t and the predicted class Y_t*, and this error updates the parameters of the image classifier branch C and the feature encoder F by back-propagation. For the visual task T, the source-domain image X_s is fed into the encoder backbone F, whose output features serve as input to the visual-task branch S; the output Y_s* is the prediction for task T. The loss function of task T computes the error from the label Y_s and the prediction Y_s*, and updates the parameters of the visual-task branch S and the encoder backbone F by back-propagation. Because samples from both the source and target domains affect the parameter updates of the encoder backbone F, the trained encoder learns a cross-domain feature representation, achieving adaptation to the target domain.
Fig. 2 shows a schematic flow chart of applying the model obtained by domain adaptive learning to the target-domain visual task T.
As shown in Fig. 2, the feature-extraction encoder backbone network F extracts features from an input target-domain image; the features are then fed into the task T network branch S, and a forward-propagation computation yields the task T prediction.
The readable storage medium provided by an embodiment of the present application is described below; it corresponds to the domain adaptive deep learning method described above.
A readable storage medium is disclosed having a program stored thereon which, when executed by a processor, performs the steps of the domain adaptive deep learning method.
As is clear to those skilled in the art, for brevity the flow of the program stored on the readable storage medium follows the corresponding processes in the foregoing method embodiments and is not repeated here.
To better illustrate the technical effects of the present invention, the inventors performed the following experiments, taking image semantic segmentation as the example task:
Experiment 1: domain adaptive learning for image semantic segmentation from the SYNTHIA dataset to the Cityscapes dataset
This experiment performs domain adaptive learning between the SYNTHIA and Cityscapes datasets. SYNTHIA is a virtual-scene dataset generated entirely by three-dimensional simulation software; it contains 9,400 images with corresponding semantic segmentation labels. Cityscapes is a real-world dataset whose training split contains 2,975 images and whose validation split contains 500 images, with 19 semantic classes in total. This experiment uses the 13 semantic classes common to SYNTHIA and Cityscapes, with SYNTHIA as the source-domain dataset and Cityscapes as the target-domain dataset. The Cityscapes validation split is used to evaluate the proposed method. The evaluation metric is the mean intersection over union (mIoU), which measures the overlap between the predicted semantic segmentation and the ground truth. The test results are shown below:
table 1:
[Table 1 appears as an image (Figure BDA0001978170560000041) in the original patent; the per-class figures are not reproduced here.]
In Table 1, each column from the third to the last represents one semantic class, and the second column gives the mean of the per-class segmentation accuracies. Table 1 compares three methods: training on source-domain samples only (SRC), adversarial training (FCN-W), and the method proposed by the present invention (RotDA). Training on source-domain samples alone performs worst on the target domain, since no domain adaptation is performed. The adversarial training method achieves a better adaptation effect by increasing the domain confusion of the feature representation. The proposed method, through self-supervised learning, obtains a feature representation better adapted to the target domain and thus achieves a marked performance improvement there.
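For reference, the mIoU metric reported in these experiments can be computed from a confusion matrix. The function below is an illustrative sketch (it ignores the void/ignore labels that a full Cityscapes evaluation handles, and skips classes that never occur):

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Mean intersection over union: per class c,
    IoU_c = TP_c / (TP_c + FP_c + FN_c), averaged over observed classes."""
    conf = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(conf, (gt.ravel(), pred.ravel()), 1)    # rows: ground truth, cols: prediction
    tp = np.diag(conf).astype(float)
    denom = conf.sum(axis=1) + conf.sum(axis=0) - tp  # (TP+FN) + (TP+FP) - TP
    ious = np.where(denom > 0, tp / np.maximum(denom, 1), np.nan)
    return float(np.nanmean(ious))                    # average over classes present
```

A perfect prediction yields an mIoU of 1.0; a prediction that collapses to a single class scores only as well as that class's overlap allows.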
Experiment 2: domain adaptive learning for image semantic segmentation from the GTA dataset to the Cityscapes dataset
This experiment performs domain adaptive learning between the GTA and Cityscapes datasets. The GTA dataset is derived from a Los Angeles city scene in a three-dimensional video game and contains 24,966 images with corresponding semantic segmentation labels. The annotations use 19 semantic classes consistent with the Cityscapes label definitions, so this experiment evaluates on all 19 classes. GTA serves as the source-domain dataset and Cityscapes as the target-domain dataset. The test results are shown below:
table 2:
[Table 2 appears as an image (Figure BDA0001978170560000051) in the original patent; the per-class figures are not reproduced here.]
Table 2 supports the same conclusion as Table 1: the domain adaptive deep learning method outperforms both the source-only model and adversarial training. The method owes its strong performance to the feature representation adapted to the target domain, obtained through joint training on source-domain and target-domain samples.
Although the present invention has been described in terms of preferred embodiments, it is not limited to the embodiments described herein and encompasses various changes and modifications that do not depart from its scope.

Claims (6)

1. A domain adaptive deep learning method is characterized by comprising the following steps:
s1: rotating each target domain image according to a set angle, wherein the images formed after rotation correspond to different category labels respectively; zooming and cutting the images formed after rotation to the same size, then randomly disordering all the images in sequence, keeping the class label corresponding to each image unchanged, and forming a target domain self-supervision learning training sample set;
s2: performing joint training on the target domain self-supervision learning training sample set and the source domain training samples to obtain a deep learning model;
s2.1, constructing a multitask learning neural network comprising a visual task network branch S and an image classifier network branch C;
s2.2 Source Domain samples { Xs,YsAnd the target domain self-supervised learning training sample set { X ] obtained in the S1t,YtInputting the multitask learning deep neural network for joint training;
wherein X in the source domain samplesObtaining an output Y through a visual task network branch Ss *Visual task T loss function based on sample tag value YsAnd the predicted value Ys *Calculating the error of the visual task T; x in target domain self-supervised learning training sample settObtaining an output Y through an image classifier network branch Ct *The classifier penalty function is based on the sample label value YtAnd the predicted value Yt *Calculating the error of the self-supervised learning task;
s3: and applying the deep learning model obtained by the joint training to the visual task T on the target domain.
2. The method of claim 1, wherein: each target domain image in step S1 is rotated by 0°, 90°, and 180° to form three new pictures.
3. The method of claim 2, wherein: the image formed after the rotation in step S1 is scaled and cropped to 224 pixels in both length and width.
4. The method of claim 1, wherein: the image formed after rotation in step S1 is subjected to data enhancement including random brightness or saturation adjustment prior to scaling.
5. The method of claim 1, wherein: the multi-task learning deep neural network comprises an encoder backbone network F, an image classifier network branch C and a visual task network branch S;
the encoder backbone network F and the image classifier network branch C construct an image classification task for the self-supervision learning training sample set, and the encoder backbone network F and the visual task network branch S construct a visual task T for the source field training sample set.
6. A readable storage medium, characterized in that the readable storage medium has stored thereon a program which, when executed by a processor, implements the domain adaptive deep learning method according to any one of claims 1 to 5.
CN201910139916.8A 2019-02-26 2019-02-26 Domain self-adaptive deep learning method and readable storage medium Active CN109919209B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910139916.8A CN109919209B (en) 2019-02-26 2019-02-26 Domain self-adaptive deep learning method and readable storage medium


Publications (2)

Publication Number Publication Date
CN109919209A CN109919209A (en) 2019-06-21
CN109919209B true CN109919209B (en) 2020-06-19

Family

ID=66962296

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910139916.8A Active CN109919209B (en) 2019-02-26 2019-02-26 Domain self-adaptive deep learning method and readable storage medium

Country Status (1)

Country Link
CN (1) CN109919209B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI840640B (en) 2020-12-25 2024-05-01 台達電子工業股份有限公司 Semi-supervised learning system and semi-supervised learning method

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11113829B2 (en) * 2019-08-20 2021-09-07 GM Global Technology Operations LLC Domain adaptation for analysis of images
CN111093123B (en) * 2019-12-09 2020-12-18 华中科技大学 Flexible optical network time domain equalization method and system based on composite neural network
CN111160553B (en) * 2019-12-23 2022-10-25 中国人民解放军军事科学院国防科技创新研究院 Novel field self-adaptive learning method
CN111144565B (en) * 2019-12-27 2020-10-27 中国人民解放军军事科学院国防科技创新研究院 Self-supervision field self-adaptive deep learning method based on consistency training
CN111898696B (en) * 2020-08-10 2023-10-27 腾讯云计算(长沙)有限责任公司 Pseudo tag and tag prediction model generation method, device, medium and equipment
CN112308149B (en) * 2020-11-02 2023-10-24 平安科技(深圳)有限公司 Optimization method and device for image information identification based on machine learning
CN112949583A (en) * 2021-03-30 2021-06-11 京科互联科技(山东)有限公司 Target detection method, system, equipment and storage medium for complex city scene
CN114283290B (en) * 2021-09-27 2024-05-03 腾讯科技(深圳)有限公司 Training of image processing model, image processing method, device, equipment and medium
CN114549904B (en) * 2022-02-25 2023-07-07 北京百度网讯科技有限公司 Visual processing and model training method, device, storage medium and program product
CN115701868B (en) * 2022-08-22 2024-02-06 中山大学中山眼科中心 Domain self-adaptive enhancement method applicable to various visual tasks

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107909101A (en) * 2017-11-10 2018-04-13 清华大学 Semi-supervised transfer learning character identifying method and system based on convolutional neural networks

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10354199B2 (en) * 2015-12-07 2019-07-16 Xerox Corporation Transductive adaptation of classifiers without source data
CN107273927B (en) * 2017-06-13 2020-09-22 西北工业大学 Unsupervised field adaptive classification method based on inter-class matching
CN107316061B (en) * 2017-06-22 2020-09-22 华南理工大学 Deep migration learning unbalanced classification integration method
CN108009633A (en) * 2017-12-15 2018-05-08 清华大学 A kind of Multi net voting towards cross-cutting intellectual analysis resists learning method and system

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107909101A (en) * 2017-11-10 2018-04-13 清华大学 Semi-supervised transfer learning character identifying method and system based on convolutional neural networks

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Jiaolong Xu et al., "Domain Adaptation of Deformable Part-Based Models," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 12, pp. 2367–2380, Dec. 2014. *


Also Published As

Publication number Publication date
CN109919209A (en) 2019-06-21

Similar Documents

Publication Publication Date Title
CN109919209B (en) Domain self-adaptive deep learning method and readable storage medium
Li et al. A closed-form solution to photorealistic image stylization
Liu et al. Auto-painter: Cartoon image generation from sketch by using conditional Wasserstein generative adversarial networks
Xu et al. Learning deep structured multi-scale features using attention-gated crfs for contour prediction
CN110188760B (en) Image processing model training method, image processing method and electronic equipment
CN109754015B (en) Neural networks for drawing multi-label recognition and related methods, media and devices
Son et al. Urie: Universal image enhancement for visual recognition in the wild
CN110399518B (en) Visual question-answer enhancement method based on graph convolution
CN111754596A (en) Editing model generation method, editing model generation device, editing method, editing device, editing equipment and editing medium
CN110458084B (en) Face age estimation method based on inverted residual error network
CN113408537B (en) Remote sensing image domain adaptive semantic segmentation method
Liao et al. A deep ordinal distortion estimation approach for distortion rectification
CN111742345A (en) Visual tracking by coloring
CN107886491A (en) A kind of image combining method based on pixel arest neighbors
CN114548279A (en) Semi-supervised image classification method based on distillation network
Zheng et al. T-net: Deep stacked scale-iteration network for image dehazing
Liu et al. Attentive semantic and perceptual faces completion using self-attention generative adversarial networks
Zhang et al. DuGAN: An effective framework for underwater image enhancement
Heo et al. Automatic sketch colorization using DCGAN
CN107729885B (en) Face enhancement method based on multiple residual error learning
Song et al. HDR-Net-Fusion: Real-time 3D dynamic scene reconstruction with a hierarchical deep reinforcement network
CN113781324A (en) Old photo repairing method
Wei et al. Facial image inpainting with deep generative model and patch search using region weight
Wang et al. A multi-scale attentive recurrent network for image dehazing
CN116777929A (en) Night scene image semantic segmentation method, device and computer medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant