CN112836740A - Markov open composite domain-based method for improving model domain adaptivity - Google Patents

Markov open composite domain-based method for improving model domain adaptivity

Info

Publication number
CN112836740A
CN112836740A
Authority
CN
China
Prior art keywords
domain
data
formula
encoder
markov
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110129302.9A
Other languages
Chinese (zh)
Other versions
CN112836740B (en)
Inventor
谭志
刘兴业
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Civil Engineering and Architecture
Original Assignee
Beijing University of Civil Engineering and Architecture
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Civil Engineering and Architecture
Priority to CN202110129302.9A
Publication of CN112836740A
Application granted
Publication of CN112836740B
Status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/457 - Local feature extraction by analysing connectivity, e.g. edge linking, connected component analysis or slices

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a method for improving model domain adaptivity based on a Markov open composite domain. The method proposes the concept of the Markov open composite domain and mixes different data sets together through a Markov process, ensuring that the elements from different data sets are distributed more dispersedly within the composite domain, thereby achieving a better domain adaptation effect. The method provided by the invention also has the following beneficial effects: a plurality of domains without domain labels are mixed into a Markov composite domain by using a Markov process combined with Bernoulli's law of large numbers, and an open domain is then combined with it to form a Markov open composite domain, so that the elements from different data sets are distributed more dispersedly within the composite domain and a better domain adaptation effect is achieved; meanwhile, a neural network encoder based on parameterized rectified linear units is constructed, so that the neural network can make full use of all data information during encoding, fulfilling the aim of fully extracting image features.

Description

Markov open composite domain-based method for improving model domain adaptivity
Technical Field
The invention relates to the technical field of computer vision, and in particular to a method for improving model domain adaptivity based on a Markov open composite domain.
Background
Image recognition is a technology in which computers process, analyze and understand images in order to identify targets and objects of different patterns; it plays a very important role in intelligent data acquisition and processing centered on images. In principle, computer image recognition does not differ fundamentally from human image recognition, except that a machine lacks the human influence of sensation and vision. When a human sees a picture, the brain quickly senses whether this picture, or a similar one, has been seen before; it matches the picture against the categories stored in memory and checks whether a memory with the same or similar features exists. Machine image recognition works likewise: a recognition model is built from the collected input image information, image features are then analyzed and extracted, and a suitable classifier is constructed to achieve the final recognition effect.
Research in the field of image recognition has long been a very important part of artificial-intelligence research. Image recognition serves monitoring, inspection and even supervision functions; it has become increasingly inseparable from people's daily lives and appears in every corner of them. With the continuous development of information technology, image recognition has shown its great utility and occupies a significant position in daily life, and with the development of computer technology, human understanding of image recognition technology grows ever deeper.
In the training of an image recognition model, data from the same data set are usually divided into a training set and a test set, so that the test data and the training data share the same underlying characteristics, and a high-performance recognition model can be obtained through supervised learning. Although such models achieve excellent performance on benchmark data sets, real applications still face huge differences in background, illumination, image quality and so on. Research and experiments therefore show that if an image recognition model trained on a specific data set is used to recognize other data sets, its recognition accuracy drops markedly and its predictions deviate greatly. To make image recognition models act effectively on more realistic target images, performing well not only in the experimental stage but also in real applications, many researchers at home and abroad have been engaged in domain-adaptation research on image recognition models.
To address these problems, Liu et al. (Liu Ziwei, Miao Zhongqi, Pan Xingang, et al. Open compound domain adaptation [C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 2020, pp. 12403-12412.) proposed an open compound domain adaptation method that constructs a continuous, more realistic domain-adaptation environment, regards the open compound target domain as a combination of several traditional homogeneous domains, and markedly improves model adaptivity. First, the source domain, the composite domain and the open domain are determined, where the source domain is the data set used for training the classifier, the composite domain is a collection of several data sets used for training and testing the model, and the open domain is a data set used only for testing the model. Second, a class encoder, a source domain encoder and a target domain encoder are constructed, used respectively for extracting the class features, source domain features and target domain features of an image. Third, a class classifier for classifying images is constructed through the source domain data and the class encoder. Fourth, a discriminator is trained on the source domain data set through the source domain encoder, such that the result of the source domain content identified by the discriminator is as close as possible to the result classified by the classifier. Fifth, after the discriminator is preliminarily trained on the source domain, the trained source domain feature encoder is perturbed through a cross-entropy function between the discrimination result of the discriminator and randomly generated class labels. Sixth, a strong source domain encoder is trained by repeating the mutually adversarial "training-perturbation" process, i.e. the fourth and fifth steps. Seventh, after the source domain encoder is trained, the distance between the content in the target domain and the content in the source domain is calculated. Eighth, according to the distance calculated in the seventh step, from near to far, the training of a target domain encoder is completed on the target domain using the standard loss function of a Generative Adversarial Network (GAN); the target domain feature mapping is then substantially similar to the source domain feature mapping, so that the class classifier constructed in the third step can be applied directly to the target domain. Ninth, the direct features of the open domain content are extracted. Tenth, the centroid of each category is calculated according to the classification information of the source domain. Eleventh, an enhancer is constructed from the centroids to enhance the direct features of the open domain content, so that the classifier trained on the source domain suits the open domain content.
In the solution proposed by Liu et al., the following drawbacks also exist:
in the composite domain, a sufficiently dispersed distribution of the individuals within the domain cannot be guaranteed;
the constructed encoder sets every part of the data information smaller than 0 to 0, so that the information carried by data smaller than 0 is lost.
Disclosure of Invention
The embodiments of the invention provide a method for improving model domain adaptivity based on a Markov open composite domain, which improves the construction method of the composite domain, constructs a more complete encoder, and realizes sufficient feature extraction.
In order to achieve the purpose, the invention adopts the following technical scheme.
A method for improving the adaptivity of a model domain based on a Markov open composite domain comprises the following steps:
s1, acquiring an initial data set for training the model;
s2, obtaining a data set DS for forming a Markov open composite domain based on the initial data set;
s3, slicing the data set DS to obtain a slice combination CD, and performing a probability operation on the slice combination CD to obtain a transition matrix P between the elements of the slice combination CD;
s4, performing a transition operation on the transition matrix P to obtain a probability set P_CD corresponding to the elements of the slice combination CD, and, according to the probability set P_CD, randomly obtaining a data set D_d to be processed from the slice combination CD and putting it into a storage unit M'CD;
s5, repeating step S4 until the storage unit M'CD contains each slice of all data sets in the slice combination CD;
s6, randomly disordering the elements of the storage unit M'CD, and adding an open domain to obtain a Markov open composite domain MOCD;
s7, constructing, with a neural network based on parameterized rectified linear units, a class encoder Epc(·), a source domain encoder Eps(·) and a target domain encoder Ept(·), and training a class classifier through the source domain data DT{Xs, Ys} and the class encoder;
s8, training a discriminator D(·) based on the source domain data DT{Xs, Ys}, and perturbing the trained source domain encoder Eps(·) by combining the trained discriminator D(·) with a cross-entropy function between randomly generated class labels;
s9, repeatedly executing step S8 to obtain an enhanced source domain encoder;
s10, calculating the distance between the content in the storage unit M'CD and the content of the source domain data DT{Xs, Ys}, and training the target domain encoder Ept(·) according to this distance;
s11, extracting the direct features of the open domain data, and calculating the centroid of each category in the target domain according to the classification information of the source domain data DT{Xs, Ys};
s12, constructing an enhancer according to the centroids, and enhancing the direct features of the open domain data through the enhancer;
s13, outputting the learned weights of the optimized model.
Preferably, step S2 includes:
s21, setting the composite domain to include d data sets, and obtaining the data set DS = {DS_1, DS_2, ..., DS_i, ..., DS_d} (1); where DS_i, i ∈ [1, d], denotes a data set making up the composite domain.
Preferably, step S3 includes:
s31, setting each data set to contain z images, and dividing each data set into w batches with the same number of images in each batch through

DS_i = {DS_i^1, DS_i^2, ..., DS_i^m, ..., DS_i^w} (2);

obtaining the slice combination CD,

CD = {DS_1^1, ..., DS_1^w, DS_2^1, ..., DS_2^w, ..., DS_d^1, ..., DS_d^w} (3);

where DS_i^m denotes the m-th batch of images of DS_i, each batch containing z/w images;
s32, assuming that the slice combination CD has N elements in total, simplifying the slice combination CD to CD = {D_1, D_2, ..., D_i, ..., D_N} (4);
s33, expressing the simplified slice combination CD by the probability formula (5)

P{s_n = D_n | s_1 = D_1, s_2 = D_2, ..., s_{n-1} = D_{n-1}} = P{s_n = D_n | s_{n-1} = D_{n-1}} (5);

for the data D_{n-1} ∈ CD selected by the system at time t_{n-1}; where s_j denotes the state of the system at time t_j, and D_j ∈ CD, j ∈ [1, n], denotes the data selected by the system at time t_j;
s34, obtaining, through the formula

p(D_{n-1}, D_n) = P{s_n = D_n | s_{n-1} = D_{n-1}} (6);

the one-step transition probability from the data D_{n-1} ∈ CD selected at time t_{n-1} to the state data D_n ∈ CD at time t_n;
s35, obtaining, based on the one-step transition probabilities, the transition matrix between the elements of the slice combination CD

P = [p(D_x, D_y)], x, y ∈ [1, N], D_x, D_y ∈ CD (7);

where p(D_x, D_y) denotes the transition probability between elements;
s36, assigning a random value to each element p(D_x, D_y) of the transition matrix P, the random values satisfying the condition

Σ_{y=1}^{N} p(D_x, D_y) = 1, 0 ≤ p(D_x, D_y) ≤ 1 (8).
Preferably, step S4 includes:
s41, according to the Chapman-Kolmogorov equation P^{(a+b)} = P^{(a)} P^{(b)} (9), obtaining the calculation formula of the (N-1)-step transition probability matrix between elements, P^{(N-1)} = P^{(N-2)} P = P^{(N-3)} P P = ... = P^{N-1} (10); where P^{(N-1)} is the (N-1)-step transition probability matrix, and P^{N-1} denotes the (N-1)-th power of the transition matrix P between the data;
s42, obtaining, based on formula (10), the (N-1)-step transition probability matrix

P^{(N-1)} = [p^{(N-1)}(D_x, D_y)], x, y ∈ [1, N], D_x, D_y ∈ CD (11);

where p^{(N-1)}(D_x, D_y) denotes the (N-1)-step transition probability between the data;
s43, randomly acquiring one data from the slice combination CD as the transfer data D_T according to these probabilities;
s44, repeatedly executing the substeps S41 to S43 to obtain a plurality of the transfer data D_T and form the transfer data set D_S,

D_S = {D_T^1, D_T^2, ..., D_T^q, ..., D_T^Q} (12);

where D_T^q is the transfer data randomly selected from the slice combination CD after the q-th execution of the substeps S41 to S43, and Q is the number of times the substeps S41 to S43 are repeatedly executed;
s45, obtaining the frequency of occurrence of each data according to the number of times each data in the slice combination CD appears in the transfer data set D_S,

f_i = count(D_i, D_S) / len(D_S) (13);

and further obtaining the probability set P_CD corresponding to the elements of the slice combination CD,

P_CD = {f_1, f_2, ..., f_i, ..., f_N} (14);

where count(D_i, D_S) denotes the number of times the data D_i in CD appears in D_S, and len(D_S) denotes the total length of the set D_S;
s46, according to the probability set P_CD, randomly selecting one data set D_d to be processed from the slice combination CD and putting it into the storage unit M'CD, M'CD = M'CD.append(D_d) (15).
Preferably, step S7 further includes:
training, according to the formula

L(Cs) = -E_{(x_s, y_s) ~ (X_s, Y_s)} Σ_{k=1}^{K} 1[k = y_s] log Cs(Epc(x_s))_k (16);

a class classifier Cs(·) through the source domain data DT{Xs, Ys} and the class encoder by a supervised learning method; where (x_s, y_s) ~ (X_s, Y_s) denotes that (x_s, y_s) is drawn from (X_s, Y_s), Epc(x_s) denotes encoding x_s with the neural-network class encoder Epc(·), Cs(Epc(x_s))_k denotes the predicted probability of class k, and K denotes the number of classes.
Preferably, step S8 includes:
s81, training the discriminator D(·) based on the source domain data DT{Xs, Ys} and the formula

L(D) = -E_{(x_s, y_s) ~ (X_s, Y_s)} Σ_{k=1}^{K} 1[k = y_cs] log D(Eps(x_s))_k (17);

where y_cs is the classification result of the classifier Cs(·);
s82, perturbing the trained source domain encoder Eps(·) by combining the trained discriminator D(·) with a cross-entropy function between its discrimination result and randomly generated class labels, according to the formula

L(Eps) = -E_{(x_s, y_s) ~ (X_s, Y_s)} Σ_{k=1}^{K} 1[k = y_rand] log D(Eps(x_s))_k (18);

where y_rand denotes a randomly generated class label.
Preferably, in step S10, the training of the target domain encoder Ept(·) according to the distance includes:
according to the distance between the content in the storage unit M'CD and the content of the source domain data DT{Xs, Ys}, training, from near to far, the target domain encoder Ept(·) through the formulas

L(D) = -E_{x_s ~ X_s}[log D(Eps(x_s))] - E_{x_t ~ X_t}[log(1 - D(Ept(x_t)))] (19);

L(Ept) = -E_{x_t ~ X_t}[log D(Ept(x_t))] (20);

where x_s ~ X_s denotes that x_s is drawn from X_s, x_t ~ X_t denotes that x_t is drawn from X_t, L(D) denotes the loss function of the discriminator D(·), and L(Ept) denotes the loss function of the target domain encoder Ept(·).
It can be seen from the technical solutions provided by the embodiments of the present invention that the method for improving model domain adaptivity based on the Markov open composite domain proposes the concept of the Markov open composite domain and mixes different data sets together through a Markov process, ensuring that the elements from different data sets are distributed more dispersedly within the composite domain, thereby achieving a better domain adaptation effect. The method provided by the invention also has the following beneficial effects:
a plurality of domains without domain labels are mixed into a Markov composite domain by using a Markov process combined with Bernoulli's law of large numbers, and an open domain is then combined with it to form a Markov open composite domain, so that the elements from different data sets are distributed more dispersedly within the composite domain and a better domain adaptation effect is achieved;
meanwhile, a neural network encoder based on parameterized rectified linear units is constructed, so that the neural network can make full use of all data information during encoding, fulfilling the aim of fully extracting image features.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a process flow diagram of a Markov open composite domain based method for improving domain adaptivity of a model according to the present invention;
FIG. 2 is a flow chart of a Markov open composite domain in a Markov open composite domain based method for improving the domain adaptivity of a model provided by the present invention;
FIG. 3 is a structural diagram of an encoder in a Markov open composite domain-based method for improving the domain adaptivity of a model according to the present invention;
FIG. 4 is a process flow diagram of a preferred embodiment of a Markov open composite domain based method for improving model domain adaptivity in accordance with the present invention;
fig. 5 is a comparison graph of the effect of using the method provided by the invention and the prior art.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or coupled. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
For the convenience of understanding the embodiments of the present invention, the following description will be further explained by taking several specific embodiments as examples in conjunction with the drawings, and the embodiments are not to be construed as limiting the embodiments of the present invention.
Referring to fig. 1, the method for improving the domain adaptivity of a model based on a markov open composite domain provided by the invention comprises the following steps:
s1, acquiring an initial data set for training the model;
s2, obtaining a data set DS for forming a Markov open composite domain based on the initial data set;
s3, slicing the data set DS to obtain a slice combination CD, and performing a probability operation on the slice combination CD to obtain a transition matrix P between the elements of the slice combination CD;
s4, performing a transition operation on the transition matrix P to obtain a probability set P_CD corresponding to the elements of the slice combination CD, and, according to the probability set P_CD, randomly obtaining a data set D_d to be processed from the slice combination CD and putting it into a storage unit M'CD;
s5, repeating step S4 until the storage unit M'CD contains each slice of all data sets in the slice combination CD;
s6, randomly disordering the elements of the storage unit M'CD, and adding an open domain to obtain a Markov open composite domain MOCD;
s7, constructing, with a neural network based on parameterized rectified linear units, a class encoder Epc(·), a source domain encoder Eps(·) and a target domain encoder Ept(·), and training a class classifier through the source domain data DT{Xs, Ys} and the class encoder;
s8, training a discriminator D(·) based on the source domain data DT{Xs, Ys}, and perturbing the trained source domain encoder Eps(·) by combining the trained discriminator D(·) with a cross-entropy function between randomly generated class labels;
s9, repeatedly executing step S8 to obtain an enhanced source domain encoder;
s10, calculating the distance between the content in the storage unit M'CD and the content of the source domain data DT{Xs, Ys}, and training the target domain encoder Ept(·) according to this distance;
s11, extracting the direct features of the open domain data, and calculating the centroid of each category in the target domain according to the classification information of the source domain data DT{Xs, Ys};
s12, constructing an enhancer according to the centroids, and enhancing the direct features of the open domain data through the enhancer, matching the direct features to the weights learned in the composite domain so that the model can use in the open domain what it has learned in the composite domain;
s13, outputting the features of the images in the optimized composite domain, i.e., the weights learned by the model.
The invention proposes the concept of the Markov Open Compound Domain (MOCD), which is composed of a Markov Compound Domain (MCD) and an open domain. The Markov compound domain is a combination of several traditional homogeneous domains without any domain labels. An open domain refers to a domain that does not occur during training and likewise has no domain label. The invention mixes different data sets together through a Markov process (as shown in fig. 2) and ensures that the elements from different data sets are distributed more dispersedly within the composite domain, thereby achieving a better domain adaptation effect.
In the embodiment provided by the invention, the data sets making up the composite domain are determined first; the whole composite-domain data set is the basis of the subsequent improvement process. Step S2 specifically includes:
S21: the composite domain is set to include d data sets, and the data set DS is obtained, DS = {DS_1, DS_2, ..., DS_i, ..., DS_d} (1); where DS_i, i ∈ [1, d], denotes a data set making up the composite domain.
Further, step S3 specifically includes:
S31: each data set (assumed to contain z images) is sliced, i.e. divided into several batches (assumed to be w batches) with the same number of images in each batch, expressed by formula (2):

DS_i = {DS_i^1, DS_i^2, ..., DS_i^m, ..., DS_i^w} (2);

where DS_i^m denotes the m-th batch of images of DS_i, each batch containing z/w images;
after each data set is sliced, the obtained slices are integrated together, finally yielding the composite domain data set CD with the structure shown in formula (3):

CD = {DS_1^1, ..., DS_1^w, DS_2^1, ..., DS_2^w, ..., DS_d^1, ..., DS_d^w} (3).
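By way of illustration, the slicing of formulas (2) and (3) can be sketched in Python as follows; this is a minimal sketch, and the function name slice_datasets and the dummy arrays are our own illustration rather than part of the patent:

```python
import numpy as np

def slice_datasets(datasets, w):
    """Split each data set of z images into w equal batches (formula (2))
    and gather all batches into the slice combination CD (formula (3))."""
    cd = []
    for ds in datasets:                 # ds: array of shape (z, H, W, C)
        assert len(ds) % w == 0, "each batch must hold the same number of images"
        cd.extend(np.split(ds, w))      # w batches of z/w images each
    return cd                           # N = d * w elements in total

# example: d = 3 data sets of z = 12 dummy 'images', w = 4 batches each
datasets = [np.random.rand(12, 8, 8, 3) for _ in range(3)]
CD = slice_datasets(datasets, w=4)
print(len(CD))                          # 12 slices in the composite set
```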
S32: for subsequent convenience, the representation of CD is simplified. Assuming a total of N elements in CD, the expression of CD is abbreviated as

CD = {D_1, D_2, ..., D_i, ..., D_N} (4);

S33: the meaning of a discrete Markov process is described as follows: supposing that the state of the system at time t_{n-1} is s_{n-1}, the state s_n at the future time t_n is related only to the state s_{n-1} at time t_{n-1} and is independent of the past states s_{n-2}, s_{n-3}, ..., s_1 at times t_{n-2}, t_{n-3}, ..., t_1. In operation, the composite domain data set CD is taken as the state space, and each element D_1, D_2, ..., D_i, ..., D_N can be regarded as a state required at a certain moment, which is expressed by the probability formula (5):

P{s_n = D_n | s_1 = D_1, s_2 = D_2, ..., s_{n-1} = D_{n-1}} = P{s_n = D_n | s_{n-1} = D_{n-1}} (5);

where s_j denotes the state of the system at time t_j, and D_j ∈ CD, j ∈ [1, n], denotes the data selected by the system at time t_j;
S34: formula (5) shows that which data the system selects at time t_n is related only to the data selected at time t_{n-1}; therefore, given the data D_{n-1} ∈ CD selected by the system at time t_{n-1}, the one-step transition probability to the state data D_n ∈ CD at time t_n is obtained as shown in formula (6):

p(D_{n-1}, D_n) = P{s_n = D_n | s_{n-1} = D_{n-1}} (6);

S35: from the one-step transition probabilities of the system, the transition matrix between the elements of the composite set CD is obtained as:

P = [p(D_x, D_y)], x, y ∈ [1, N], D_x, D_y ∈ CD (7);

where p(D_x, D_y) denotes the transition probability between elements, i.e. the probability of first selecting data D_x in the composite domain set and then selecting data D_y;
S36: to achieve randomness, there is no need to specify a particular value for each p(D_x, D_y); instead, in the initial state, the transition matrix P is given an initial value by assigning a random value to each element p(D_x, D_y) of P, the random values satisfying the following condition:

Σ_{y=1}^{N} p(D_x, D_y) = 1, 0 ≤ p(D_x, D_y) ≤ 1 (8).
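A minimal sketch of this random initialization, assuming nothing beyond condition (8), draws each row of P at random and normalizes it to sum to 1:

```python
import numpy as np

def init_transition_matrix(n, rng=np.random.default_rng()):
    """Random N x N transition matrix: entries in [0, 1], rows summing to 1."""
    p = rng.random((n, n))                     # random p(D_x, D_y) values
    return p / p.sum(axis=1, keepdims=True)    # enforce condition (8)

P = init_transition_matrix(12)                 # N = 12 slices, as in the sketch above
print(P.sum(axis=1))                           # every row sums to 1.0
```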
further, step S4 specifically includes:
s41 at least N-1 steps of the transfer process are required to be performed in order to be able to traverse all the data sets in the CD. According to the Chepmann-Kerr Morgoffer equation
P(a+b)=P(a)P(b) (9);
Wherein P is(a+b)For a (a + b) step transfer matrix, P(a)And P(b)Respectively an a-step transition matrix and a b-step transition matrix, and the calculation mode of obtaining the N-1-step transition probability matrix among elements is
P(N-1)=P(N-2)P=P(N-3)P P=P……P P=PN-1(10) (ii) a Of note is P(N-1)Is an N-1 step transition probability matrix, and PN-1Represents the transfer matrix between the data to the power of N-1;
s42 calculating to obtain N-1 step transition probability matrix P(N-1)Is the same as the transfer matrix P between data:
Figure BDA0002924591210000114
wherein
Figure BDA0002924591210000115
Dx,DyEpsilon, CD, represents the probability of N-1 step transition between data, i.e., data D is selected first in the set of composite domainsXThen selecting the number after N-1 step transferAccording to DyThe probability of (d);
s43, after determining the transition probability matrix of N-1 steps between data, determining to start with a certain data according to the transition probability of N-1 steps between data, and after N-1 steps of transition, selecting the data corresponding to the probability with a certain probability. E.g. starting data as D1After N-1 step of transfer, the probability of selecting each data in the CD is respectively
Figure BDA0002924591210000116
Figure BDA0002924591210000117
Randomly selecting one data from the CD as the transferred data according to the corresponding probability, namely the data is called as transfer data, and using DT,DTE is expressed by CD;
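The computation of formulas (10) and (11) and the draw of one transfer data D_T can be sketched as follows; np.linalg.matrix_power raises P to the power N-1 exactly as formula (10) prescribes, while taking the element of index start as the starting data is our illustrative assumption:

```python
import numpy as np

def draw_transfer_data(P, start=0, rng=np.random.default_rng()):
    """Draw one transfer data index D_T after N-1 transition steps."""
    n = P.shape[0]
    p_n1 = np.linalg.matrix_power(P, n - 1)    # formula (10): P^(N-1) = P**(N-1)
    return rng.choice(n, p=p_n1[start])        # probabilities p^(N-1)(D_start, .)

d_t = draw_transfer_data(P)                    # reuses P from the sketch above
```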
s44 in repeated experiments, the frequency of an event is estimated approximately as its probability, as known from bernoulli' S law. Therefore, it is necessary to perform the process of sub-steps S41 to S43 a plurality of times (assumed to be Q times) in constructing the Markov composite domain, resulting in a plurality of transition data DTForming a set D of transfer dataSExpressed as shown in equation 12
Figure BDA0002924591210000121
Wherein
Figure BDA0002924591210000122
For transfer data randomly selected from the CD after each execution of substeps S41 through S43;
s45 then combines the individual data in the CD at D according to the sliceSNumber of occurrences, resulting in frequency of occurrence of the data set
Figure BDA0002924591210000123
Wherein
Figure BDA0002924591210000124
Representing data D in CDiAt DSNumber of occurrences in (D), len (D)S) Set of representations DSTotal length of (D)SThe total number of medium elements; further, a probability set P corresponding to the CD can be obtainedCD
Figure BDA0002924591210000125
S46 is according to PCDRandomly selecting one data from CD as a data set to be processed by using the probability Dd,DdE.g. CD. At DdThe probability in the transition matrix P needs to be updated after being selected so that DdRepeated selection in the subsequent selection process is avoided;
obtaining data D to be processed in the above processdThen put it into a storage unit
M′CD=M′CD.append(Dd) (15);
Wherein M' CD is a storage unit, and the initial value is null; the process of step S4 is executed multiple times until M' CD contains each slice of all data sets in CD, and then its internal elements are randomly scrambled to obtain a markov composite domain MCD. And adding an open domain on the basis of the MCD to form a Markov open composite domain MOCD.
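Putting the pieces together, the following sketch covers substeps S41 to S46 together with the outer loop of steps S4 to S6. The text does not spell out the exact rule for updating the probabilities in P after D_d is selected, so removing the already selected slice from the candidate set is our assumption:

```python
import numpy as np

def build_mcd(cd, P, q=2000, rng=np.random.default_rng()):
    """Assemble the Markov composite domain MCD from the slice combination CD."""
    n = len(cd)
    m_cd, remaining = [], list(range(n))
    p_n1 = np.linalg.matrix_power(P, n - 1)           # (N-1)-step matrix, formula (10)
    while remaining:
        # S44: draw Q transfer data; their frequencies approximate P_CD (Bernoulli)
        d_s = rng.choice(n, size=q, p=p_n1[0])        # transfer data set D_S
        freq = np.bincount(d_s, minlength=n) + 1e-12  # formula (13), smoothed
        # S46: restrict to slices not yet selected, so D_d is never repeated
        p_cd = freq[remaining] / freq[remaining].sum()
        d_d = rng.choice(remaining, p=p_cd)
        m_cd.append(cd[d_d])
        remaining.remove(d_d)
    order = rng.permutation(len(m_cd))                # random shuffle -> MCD
    return [m_cd[i] for i in order]

MCD = build_mcd(CD, P)   # reuses CD and P from the sketches above; add an open domain for MOCD
```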
The applicant found that in the technical solution disclosed by Liu et al., a neural network encoder based on rectified linear units (ReLU) was constructed, and on this basis relatively advanced domain adaptation results were obtained. However, during encoding the rectified linear unit keeps all positive values in their original form and sets all negative values to zero, so the information carried by the replaced values is eliminated and cannot be fully utilized, which may cause errors in the final encoding result. To address this drawback, the present invention employs parameterized rectified linear units (PReLU) to construct the neural network encoder Ep(·). The newly constructed encoder assigns a non-zero, learnable parameter to the negative values appearing during encoding, so that the encoder can automatically determine how to use negative values according to its own needs, make full use of all data information, and improve encoder performance.
In contrast to the solution of Liu et al., the newly constructed encoder also uses convolution, BatchNorm and Dropout. The new encoder is based on parameterized rectified linear units and has 3 convolutional layers; the specific encoder structure is shown in fig. 3.
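By way of a minimal PyTorch sketch of such an encoder: fig. 3 fixes only the three convolutional layers with PReLU, BatchNorm and Dropout, so the kernel sizes, channel widths, strides and dropout rate below are assumptions of our own:

```python
import torch.nn as nn

class PReLUEncoder(nn.Module):
    """Three-layer convolutional encoder; PReLU keeps a learnable slope for
    negative inputs instead of zeroing them out as ReLU does."""
    def __init__(self, in_ch=3, width=64, feat_dim=128, p_drop=0.5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, width, 3, stride=2, padding=1),
            nn.BatchNorm2d(width), nn.PReLU(width),
            nn.Conv2d(width, width * 2, 3, stride=2, padding=1),
            nn.BatchNorm2d(width * 2), nn.PReLU(width * 2),
            nn.Conv2d(width * 2, feat_dim, 3, stride=2, padding=1),
            nn.BatchNorm2d(feat_dim), nn.PReLU(feat_dim),
            nn.Dropout(p_drop),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),   # -> (batch, feat_dim) features
        )

    def forward(self, x):
        return self.net(x)
```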
In a preferred embodiment provided by the present invention, step S7 further specifically includes:
training, according to the formula

L(Cs) = -E_{(x_s, y_s) ~ (X_s, Y_s)} Σ_{k=1}^{K} 1[k = y_s] log Cs(Epc(x_s))_k (16);

the class classifier Cs(·) through the source domain data DT{Xs, Ys} and the class encoder by a supervised learning method; where (x_s, y_s) ~ (X_s, Y_s) denotes that (x_s, y_s) is drawn from (X_s, Y_s), Epc(x_s) denotes encoding x_s with the neural-network class encoder Epc(·), Cs(Epc(x_s))_k denotes the predicted probability of class k, and K denotes the number of classes.
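Formula (16) is the usual supervised cross-entropy objective; one training step can be sketched as follows, where the optimizer, learning rate and the number of classes K are assumptions and PReLUEncoder is the sketch above:

```python
import torch
import torch.nn as nn

K = 10                                     # assumed number of classes
epc = PReLUEncoder()                       # class encoder Epc(.)
cs = nn.Linear(128, K)                     # class classifier Cs(.)
opt = torch.optim.Adam(list(epc.parameters()) + list(cs.parameters()), lr=1e-3)
ce = nn.CrossEntropyLoss()                 # realizes the cross-entropy of formula (16)

def train_step(x_s, y_s):
    """One supervised step on a source batch (x_s, y_s) ~ (Xs, Ys)."""
    opt.zero_grad()
    loss = ce(cs(epc(x_s)), y_s)           # Cs(Epc(x_s)) against the true label
    loss.backward()
    opt.step()
    return loss.item()
```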
Further, step S8 specifically includes:
S81: the discriminator D(·) is trained based on the source domain data DT{Xs, Ys} and the formula

L(D) = -E_{(x_s, y_s) ~ (X_s, Y_s)} Σ_{k=1}^{K} 1[k = y_cs] log D(Eps(x_s))_k (17);

so that the result of the source content identified by the discriminator D(·) is as close as possible to the result of its classification by the classifier Cs(·); where y_cs is the classification result of the classifier Cs(·);
S82: the trained source domain encoder Eps(·) is perturbed by combining the trained discriminator D(·) with a cross-entropy function between its discrimination result and randomly generated class labels, according to the formula

L(Eps) = -E_{(x_s, y_s) ~ (X_s, Y_s)} Σ_{k=1}^{K} 1[k = y_rand] log D(Eps(x_s))_k (18);

where y_rand denotes a randomly generated class label.
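One "training-perturbation" round of substeps S81 and S82 can be sketched as follows; it reuses epc, cs and K from the sketch above, and modelling the discriminator D(·) as a K-way linear head over the encoded features is our assumption:

```python
import torch
import torch.nn as nn

eps_enc = PReLUEncoder()                   # source domain encoder Eps(.)
disc = nn.Linear(128, K)                   # discriminator D(.) with K class outputs
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)
opt_e = torch.optim.Adam(eps_enc.parameters(), lr=1e-3)
ce = nn.CrossEntropyLoss()

def adversarial_round(x_s):
    """Alternate S81 (fit D) and S82 (perturb Eps) on one source batch."""
    # S81 / formula (17): fit D(.) to the classifier's predictions y_cs
    with torch.no_grad():
        y_cs = cs(epc(x_s)).argmax(dim=1)
    opt_d.zero_grad()
    ce(disc(eps_enc(x_s).detach()), y_cs).backward()
    opt_d.step()
    # S82 / formula (18): perturb Eps(.) toward randomly generated class labels
    y_rand = torch.randint(0, K, (x_s.size(0),))
    opt_e.zero_grad()
    ce(disc(eps_enc(x_s)), y_rand).backward()
    opt_e.step()
```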
Further, in step S10, the process of training the target domain encoder Ept(·) according to the distance specifically includes:
according to the distance between the content in the storage unit M'CD and the content of the source domain data DT{Xs, Ys}, training, from near to far, the target domain encoder Ept(·) through the formulas

L(D) = -E_{x_s ~ X_s}[log D(Eps(x_s))] - E_{x_t ~ X_t}[log(1 - D(Ept(x_t)))] (19);

L(Ept) = -E_{x_t ~ X_t}[log D(Ept(x_t))] (20);

At this point, the target domain feature mapping is substantially similar to the source domain feature mapping, so the class classifier constructed in this way can be applied directly to the target domain. In the formulas, x_s ~ X_s denotes that x_s is drawn from X_s, x_t ~ X_t denotes that x_t is drawn from X_t, L(D) denotes the loss function of the discriminator D(·), and L(Ept) denotes the loss function of the target domain encoder Ept(·).
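Formulas (19) and (20) are the standard GAN losses; one adversarial update can be sketched as follows, where the binary discriminator head and the optimizers are assumptions, eps_enc is the sketch above, and the source/target batches are assumed to be fed from near to far according to the distance of step S10:

```python
import torch
import torch.nn as nn

ept_enc = PReLUEncoder()                   # target domain encoder Ept(.)
d_adv = nn.Sequential(nn.Linear(128, 1), nn.Sigmoid())  # adversarial discriminator
opt_da = torch.optim.Adam(d_adv.parameters(), lr=1e-3)
opt_t = torch.optim.Adam(ept_enc.parameters(), lr=1e-3)
bce = nn.BCELoss()

def gan_step(x_s, x_t):
    """One update of formulas (19) and (20) on paired source/target batches."""
    real = torch.ones(x_s.size(0), 1)
    fake = torch.zeros(x_t.size(0), 1)
    # formula (19): discriminator loss L(D)
    opt_da.zero_grad()
    loss_d = (bce(d_adv(eps_enc(x_s).detach()), real)
              + bce(d_adv(ept_enc(x_t).detach()), fake))
    loss_d.backward()
    opt_da.step()
    # formula (20): non-saturating loss L(Ept) for the target encoder
    opt_t.zero_grad()
    bce(d_adv(ept_enc(x_t)), torch.ones(x_t.size(0), 1)).backward()
    opt_t.step()
```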
The present invention further provides an embodiment, which is used for displaying a preferred processing flow of the present invention, as shown in fig. 4, and specifically includes the following steps:
first, a data set for training the model is prepared.
In a second step, a data set DS constituting the composite domain is determined according to equation (1).
Third, all data sets for constituting the composite domain are sliced according to equations (2) to (4) and a slice combination CD is obtained.
And fourthly, obtaining a transition matrix P among the elements in the composite set CD according to formula (5) to formula (8).
And fifthly, forming a transfer data set D_S according to formula (9) to formula (12).
Sixthly, obtaining a probability set P_CD corresponding to the elements in the CD according to formula (13) and formula (14).
Seventh step, according to the probabilities in P_CD, randomly selecting one data D_d from the CD as the data set to be processed.
Eighth step, after the data D_d to be processed is obtained, putting it into the storage unit M'CD according to formula (15).
Ninth step, after D_d is selected, updating the probabilities in the transition matrix P so that D_d is not selected repeatedly in the subsequent selection process.
And a tenth step of executing the process from the fifth step to the ninth step multiple times until each slice of all data sets in the CD is contained in the M'CD.
And an eleventh step of randomly disordering the internal elements of the M'CD to obtain a Markov composite domain MCD.
And a twelfth step of adding an open domain on the basis of the MCD to form a Markov open composite domain MOCD.
And a thirteenth step of constructing a class encoder, a source domain encoder and a target domain encoder by using the neural network based on parameterized rectified linear units, denoted Epc(·), Eps(·) and Ept(·) respectively.
Fourteenth, according to formula (16), a class classifier Cs(·) is trained by a supervised learning method through the source domain data DT{Xs, Ys} and the class encoder.
Fifteenth, according to formula (17), the discriminator D(·) is trained on the source domain, so that the result of the source content identified by the discriminator D(·) is as close as possible to the result of its classification by the classifier Cs(·).
Sixteenth, after the discriminator is preliminarily trained in the source domain, the trained source domain feature encoder is perturbed according to formula (18) through the cross-entropy function between the discrimination result of the discriminator and randomly generated class labels.
Seventeenth, through repeating the mutually adversarial "training-perturbation" process of the fifteenth step and the sixteenth step, a strong source domain encoder is trained.
Eighteenth, after the source domain encoder is trained, the distance between the content in the MCD and the source domain content is calculated.
Nineteenth, on the basis of MCD{Xt, Yt}, according to the distance calculated in the eighteenth step, from near to far, the training of the target domain encoder is completed through formulas (19) and (20) (the standard loss functions of the GAN); the target domain feature mapping is then substantially similar to the source domain feature mapping, so that the class classifier constructed in the fourteenth step can be applied directly to the target domain.
And twentieth, extracting direct features of the open domain data.
Twenty-first, the centroid of each class in the target domain is calculated based on the classification information of the source domain.
And twenty-second step, constructing an enhancer according to the centroid to enhance the direct characteristics of the open domain content, so that the classifier trained from the source domain is suitable for the open domain content.
And a twenty-third step of outputting the learned weight of the optimized model.
With the above method, the improved model outperforms the model proposed by Liu et al. and the other comparison models in both the composite domain and the open composite domain, so the adaptivity of the image recognition model is improved significantly. The experimental results are shown in fig. 5, where "Ours" denotes the model trained by the method of the invention, "OCDA" denotes the model proposed by Liu et al., and "DADA", "BTDA" and "MTDA" denote the comparison models from the work of Liu et al.
In summary, the method for improving model domain adaptivity based on the Markov open composite domain proposes the concept of the Markov open composite domain and mixes different data sets together through a Markov process, ensuring that the elements from different data sets are distributed more dispersedly within the composite domain, thereby achieving a better domain adaptation effect. The method provided by the invention also has the following beneficial effects:
a plurality of domains without domain labels are mixed into a Markov composite domain by using a Markov process combined with Bernoulli's law of large numbers, and an open domain is then combined with it to form a Markov open composite domain, so that the elements from different data sets are distributed more dispersedly within the composite domain and a better domain adaptation effect is achieved;
meanwhile, a neural network encoder based on parameterized rectified linear units is constructed, so that the neural network can make full use of all data information during encoding, fulfilling the aim of fully extracting image features.
Those of ordinary skill in the art will understand that: the figures are merely schematic representations of one embodiment, and the blocks or flow diagrams in the figures are not necessarily required to practice the present invention.
From the above description of the embodiments, it is clear to those skilled in the art that the present invention can be implemented by software plus necessary general hardware platform. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which may be stored in a storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the embodiments or some parts of the embodiments.
The embodiments in this specification are described in a progressive manner; identical or similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, since the apparatus and system embodiments are substantially similar to the method embodiments, their description is relatively brief, and reference may be made to the corresponding parts of the method embodiments. The apparatus and system embodiments described above are merely illustrative: units described as separate parts may or may not be physically separate, and parts shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without inventive effort.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (7)

1. A method for improving the domain adaptivity of a model based on a Markov open composite domain is characterized by comprising the following steps:
s1, acquiring an initial data set for training the model;
s2, obtaining a data set DS for forming a Markov open composite domain based on the initial data set;
s3, slicing the data set DS to obtain a slice combination CD, and performing a probability operation on the slice combination CD to obtain a transition matrix P between the elements of the slice combination CD;
s4, performing a transition operation on the transition matrix P to obtain a probability set P_CD corresponding to the elements of the slice combination CD, and, according to the probability set P_CD, randomly obtaining a data set D_d to be processed from the slice combination CD and putting it into a storage unit M'CD;
s5, repeating step S4 until each slice of all data sets in the slice combination CD is contained in the storage unit M'CD;
s6, randomly disordering the elements of the storage unit M'CD, and adding an open domain to obtain a Markov open composite domain MOCD;
s7, constructing, with a neural network based on parameterized rectified linear units, a class encoder Epc(·), a source domain encoder Eps(·) and a target domain encoder Ept(·), and training a class classifier through the source domain data DT{Xs, Ys} and the class encoder;
s8, training a discriminator D(·) based on the source domain data DT{Xs, Ys}, and perturbing the trained source domain encoder Eps(·) by combining the trained discriminator D(·) with a cross-entropy function between randomly generated class labels;
s9, repeatedly executing step S8 to obtain an enhanced source domain encoder;
s10, calculating the distance between the content in the storage unit M'CD and the content of the source domain data DT{Xs, Ys}, and training the target domain encoder Ept(·) according to this distance;
s11, extracting the direct features of the open domain data, and calculating the centroid of each category in the target domain according to the classification information of the source domain data DT{Xs, Ys};
s12, constructing an enhancer according to the centroids, and enhancing the direct features of the open domain data through the enhancer;
s13, outputting the learned weights of the optimized model.
2. The method according to claim 1, wherein step S2 includes:
s21, setting the composite domain to include d data sets, and obtaining the data set DS = {DS_1, DS_2, ..., DS_i, ..., DS_d} (1); where DS_i, i ∈ [1, d], denotes a data set making up the composite domain.
3. The method according to claim 2, wherein step S3 includes:
s31, setting each data set to contain z images, and dividing each data set into w batches with the same number of images in each batch through

DS_i = {DS_i^1, DS_i^2, ..., DS_i^m, ..., DS_i^w} (2);

obtaining the slice combination CD,

CD = {DS_1^1, ..., DS_1^w, DS_2^1, ..., DS_2^w, ..., DS_d^1, ..., DS_d^w} (3);

where DS_i^m denotes the m-th batch of images of DS_i, each batch containing z/w images;
s32, assuming that the slice combination CD has N elements in total, reducing the slice combination CD to CD = {D_1, D_2, ..., D_i, ..., D_N} (4);
s33, expressing the simplified slice combination CD by the probability formula (5)

P{s_n = D_n | s_1 = D_1, s_2 = D_2, ..., s_{n-1} = D_{n-1}} = P{s_n = D_n | s_{n-1} = D_{n-1}} (5);

for the data D_{n-1} ∈ CD selected by the system at time t_{n-1}; where s_j denotes the state of the system at time t_j, and D_j ∈ CD, j ∈ [1, n], denotes the data selected by the system at time t_j;
s34, obtaining, through the formula

p(D_{n-1}, D_n) = P{s_n = D_n | s_{n-1} = D_{n-1}} (6);

the one-step transition probability from the data D_{n-1} ∈ CD selected at time t_{n-1} to the state data D_n ∈ CD at time t_n;
s35, obtaining, based on the one-step transition probabilities, the transition matrix between the elements of the slice combination CD

P = [p(D_x, D_y)], x, y ∈ [1, N], D_x, D_y ∈ CD (7);

where p(D_x, D_y) denotes the transition probability between elements;
s36, assigning a random value to each element p(D_x, D_y) of the transition matrix P, the random values satisfying the condition

Σ_{y=1}^{N} p(D_x, D_y) = 1, 0 ≤ p(D_x, D_y) ≤ 1 (8).
4. The method according to claim 3, wherein step S4 includes:
s41, according to the Chapman-Kolmogorov equation P^{(a+b)} = P^{(a)} P^{(b)} (9), obtaining the calculation formula of the (N-1)-step transition probability matrix between elements, P^{(N-1)} = P^{(N-2)} P = P^{(N-3)} P P = ... = P^{N-1} (10); where P^{(N-1)} is the (N-1)-step transition probability matrix, and P^{N-1} denotes the (N-1)-th power of the transition matrix P between the data;
s42, obtaining, based on formula (10), the (N-1)-step transition probability matrix

P^{(N-1)} = [p^{(N-1)}(D_x, D_y)], x, y ∈ [1, N], D_x, D_y ∈ CD (11);

where p^{(N-1)}(D_x, D_y) denotes the (N-1)-step transition probability between the data;
s43, randomly acquiring one data from the slice combination CD as the transfer data D_T according to these probabilities;
s44, repeatedly executing the substeps S41 to S43 to obtain a plurality of the transfer data D_T and form the transfer data set D_S,

D_S = {D_T^1, D_T^2, ..., D_T^q, ..., D_T^Q} (12);

where D_T^q is the transfer data randomly selected from the slice combination CD after the q-th execution of the substeps S41 to S43, and Q is the number of times the substeps S41 to S43 are repeatedly executed;
s45, obtaining the frequency of occurrence of each data according to the number of times each data in the slice combination CD appears in the transfer data set D_S,

f_i = count(D_i, D_S) / len(D_S) (13);

and further obtaining the probability set P_CD corresponding to the elements of the slice combination CD,

P_CD = {f_1, f_2, ..., f_i, ..., f_N} (14);

where count(D_i, D_S) denotes the number of times the data D_i in CD appears in D_S, and len(D_S) denotes the total length of the set D_S;
s46, according to the probability set P_CD, randomly selecting one data set D_d to be processed from the slice combination CD and putting it into the storage unit M'CD, M'CD = M'CD.append(D_d) (15).
5. The method according to claim 4, wherein step S7 further comprises:
training, according to the formula

L(Cs) = -E_{(x_s, y_s) ~ (X_s, Y_s)} Σ_{k=1}^{K} 1[k = y_s] log Cs(Epc(x_s))_k (16);

a class classifier Cs(·) through the source domain data DT{Xs, Ys} and the class encoder by a supervised learning method; where (x_s, y_s) ~ (X_s, Y_s) denotes that (x_s, y_s) is drawn from (X_s, Y_s), Epc(x_s) denotes encoding x_s with the neural-network class encoder Epc(·), Cs(Epc(x_s))_k denotes the predicted probability of class k, and K denotes the number of classes.
6. The method according to claim 5, wherein step S8 includes:
s81, training the discriminator D(·) based on the source domain data DT{Xs, Ys} and the formula

L(D) = -E_{(x_s, y_s) ~ (X_s, Y_s)} Σ_{k=1}^{K} 1[k = y_cs] log D(Eps(x_s))_k (17);

where y_cs is the classification result of the classifier Cs(·);
s82, perturbing the trained source domain encoder Eps(·) by combining the trained discriminator D(·) with a cross-entropy function between its discrimination result and randomly generated class labels, according to the formula

L(Eps) = -E_{(x_s, y_s) ~ (X_s, Y_s)} Σ_{k=1}^{K} 1[k = y_rand] log D(Eps(x_s))_k (18);

where y_rand denotes a randomly generated class label.
7. The method of claim 5, wherein the step S10 of training the target domain encoder Ept (-) according to the distance comprises:
according to the distance between the content in the storage unit M'CD and the content of the source domain data DT{Xs, Ys}, training, from near to far, the target domain encoder Ept(·) through the formulas

L(D) = -E_{x_s ~ X_s}[log D(Eps(x_s))] - E_{x_t ~ X_t}[log(1 - D(Ept(x_t)))] (19);

L(Ept) = -E_{x_t ~ X_t}[log D(Ept(x_t))] (20);

where x_s ~ X_s denotes that x_s is drawn from X_s, x_t ~ X_t denotes that x_t is drawn from X_t, L(D) denotes the loss function of the discriminator D(·), and L(Ept) denotes the loss function of the target domain encoder Ept(·).
CN202110129302.9A 2021-01-29 2021-01-29 Markov open composite domain-based method for improving model domain adaptivity Active CN112836740B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110129302.9A CN112836740B (en) 2021-01-29 Markov open composite domain-based method for improving model domain adaptivity

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110129302.9A CN112836740B (en) 2021-01-29 Markov open composite domain-based method for improving model domain adaptivity

Publications (2)

Publication Number Publication Date
CN112836740A true CN112836740A (en) 2021-05-25
CN112836740B CN112836740B (en) 2021-11-02

Family

ID=75931194

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110129302.9A Active CN112836740B (en) Markov open composite domain-based method for improving model domain adaptivity

Country Status (1)

Country Link
CN (1) CN112836740B (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200125982A1 (en) * 2018-02-06 2020-04-23 Hrl Laboratories, Llc System and method for unsupervised domain adaptation via sliced-wasserstein distance
CN111931814A (en) * 2020-07-03 2020-11-13 浙江大学 Unsupervised anti-domain adaptation method based on intra-class structure compactness constraint

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZIWEI LIU et al.: "Open Compound Domain Adaptation", Computer Vision Foundation *

Also Published As

Publication number Publication date
CN112836740B (en) 2021-11-02

Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant