CN113920382B - Cross-domain image classification method based on class consistency structured learning and related device - Google Patents
- Publication number: CN113920382B (application CN202111530728.1A)
- Authority: CN (China)
- Prior art keywords: matrix, target domain, label, domain, updated
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F18/214 — Physics; Computing; Electric digital data processing; Pattern recognition; Analysing; Design or setup of recognition systems or techniques; Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/24143 — Physics; Computing; Electric digital data processing; Pattern recognition; Analysing; Classification techniques; Distances to neighbourhood prototypes, e.g. restricted Coulomb energy networks [RCEN]
Abstract
The embodiment of the application discloses a cross-domain image classification method based on class consistency structured learning, together with a corresponding device, electronic equipment, and computer-readable storage medium. The method comprises the following steps: obtaining initialized pseudo labels for the target domain samples from a first image classifier trained on source domain data; initializing, based on the initialized pseudo labels and the source domain labels, a projection matrix, a graph matrix, a category weight matrix, a Laplacian matrix of the source domain, and a Laplacian matrix of the target domain; updating the projection matrix using the initialized matrices; performing projection learning with the updated projection matrix to update the initialized pseudo labels; and finally, when the number of cycles reaches a preset number, taking the updated pseudo labels as the final classification results of the target domain sample images. By learning the class-wise Laplacian matrix from the source domain to the target domain, the consistency and continuity of intra-class samples are improved, and with them the classification performance of the model on target domain samples.
Description
Technical Field
The application belongs to the technical field of machine learning, and particularly relates to a cross-domain image classification method and device based on class consistency structured learning, electronic equipment and a computer-readable storage medium.
Background
Unsupervised domain adaptation (Domain Adaptation) refers to machine learning methods that use a labeled source domain to train a model to be applied to an unlabeled target domain.
Data distribution differences (marginal distribution differences and conditional distribution differences) may exist between the labeled source domain data set and the unlabeled target domain data set; consequently, when a model trained on the source domain is applied to the target domain, its performance may drop significantly. To alleviate the data distribution difference between the source domain and the target domain, conventional unsupervised domain adaptation may adopt feature-adaptation-based methods, such as Transfer Component Analysis (TCA) and Joint Distribution Adaptation (JDA); instance-weight-based methods can also be employed, such as the Transfer Joint Matching method (TJM) and the Coupled Knowledge Transfer method (CKET).
In order to overcome the data distribution difference between the source and target domains, most current unsupervised domain adaptation methods for image classification introduce marginal distribution matching and conditional distribution matching. After such matching, however, sample points of the same category in different domains remain scattered; that is, the intra-class samples across domains are loosely distributed and have poor consistency and continuity. Such scattered, sparse same-category sample clusters greatly reduce the classification performance of the model on target domain samples.
Disclosure of Invention
The embodiments of the present application provide a cross-domain image classification method, device, electronic equipment, and computer-readable storage medium based on class consistency structured learning, which can address the problem that the poor consistency and continuity of intra-class samples in existing unsupervised domain adaptation methods leads to low classification performance of the model on target domain samples.
In a first aspect, an embodiment of the present application provides a cross-domain image classification method based on class consistency structured learning, including:
acquiring a source domain data set and a target domain data set, wherein the source domain data set comprises source domain sample images and labels of the source domain sample images, and the target domain data set comprises target domain sample images;
obtaining an initialization pseudo label of each target domain sample image based on a first image classifier trained by using a source domain data set;
projecting the source domain data set and the target domain data set to the same public space to obtain source domain sample points and target domain sample points, initializing according to the same type of source domain sample points and target domain sample points based on the initialized pseudo labels and labels, and obtaining a projection matrix, a graph matrix, a type weight matrix, a Laplace matrix of the source domain and a Laplace matrix of the target domain;
updating the projection matrix according to the graph matrix, the category weight matrix, the Laplace matrix of the source domain and the Laplace matrix of the target domain to obtain an updated projection matrix;
performing projection learning on the source domain data set and the target domain data set by using the updated projection matrix to obtain projected source domain sample data and projected target domain sample data;
classifying the projected target domain sample data based on a second image classifier trained by using the projected source domain sample data with the label to obtain a pseudo label of the projected target domain sample data;
when the cycle times do not reach the preset times, respectively updating the graph matrix, the category weight matrix and the Laplacian matrix of the target domain according to the labels and the pseudo labels to obtain an updated graph matrix, an updated category weight matrix and an updated Laplacian matrix of the target domain, and returning to update the projection matrix according to the graph matrix, the category weight matrix, the Laplacian matrix of the source domain and the Laplacian matrix of the target domain to obtain an updated projection matrix based on the updated graph matrix, the updated category weight matrix and the updated Laplacian matrix of the target domain;
when the cycle times reach the preset times, obtaining the classification result of each target domain sample image, wherein the classification result is a pseudo label;
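Under stated assumptions, the claimed iteration can be sketched as follows. The function names (`ccsl_loop`, `nearest_centroid_labels`) are illustrative, a nearest-centroid classifier stands in for the first and second image classifiers, and the eigen-solve that actually updates the projection matrix from the graph matrix, category weight matrix, and the two Laplacians is left as a pluggable hook (defaulting to the identity projection) — a sketch of the control flow, not the patented solver itself.

```python
import numpy as np

def nearest_centroid_labels(Zs, ys, Zt):
    """Label each target column with the class of the closest source centroid."""
    classes = np.unique(ys)
    cents = np.stack([Zs[:, ys == c].mean(axis=1) for c in classes])  # (C, m)
    d2 = ((Zt.T[:, None, :] - cents[None, :, :]) ** 2).sum(axis=2)    # (nt, C)
    return classes[np.argmin(d2, axis=1)]

def ccsl_loop(Xs, ys, Xt, T=10, update_projection=None):
    """Skeleton of the claimed iteration. `update_projection` stands in for
    the eigen-solve the patent specifies; when omitted, the projection
    stays the identity and only the pseudo labels are refreshed."""
    P = np.eye(Xs.shape[0])                        # initialised projection matrix
    yt = nearest_centroid_labels(Xs, ys, Xt)       # initialised pseudo labels
    for _ in range(T):                             # preset number of cycles
        if update_projection is not None:
            P = update_projection(Xs, ys, Xt, yt)  # would use G, Mc, Ls, Lt
        Zs, Zt = P.T @ Xs, P.T @ Xt                # projection learning
        yt = nearest_centroid_labels(Zs, ys, Zt)   # update pseudo labels
    return yt                                      # final classification result
```

On two well-separated toy classes, the loop recovers the target labels even with the identity projection, which is the degenerate baseline the real projection update improves upon.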
wherein, according to the graph matrix, the category weight matrix, the laplacian matrix of the source domain and the laplacian matrix of the target domain, updating the projection matrix to obtain an updated projection matrix, including:
by the formula (XΩX^T + βI_m)P = XHX^T PΦ, obtaining an updated projection matrix, where β is a hyperparameter;
wherein X = [X_s, X_t] ∈ R^{m×n}, X_s is the source domain data set, X_t is the target domain data set, P is the projection matrix, I_m ∈ R^{m×m} is the identity matrix of dimension m, R^{m×m} denotes the real space of dimension m×m, and R^{m×n} denotes the real space of dimension m×n, where m is the feature dimension and n is the total number of samples;
Φ = diag(Φ_1, Φ_2, ..., Φ_d) ∈ R^{d×d} is a diagonal matrix whose diagonal elements are Lagrange multipliers; α, η and γ are all hyperparameters; M_c is the category weight matrix;
v^{(i)} is the set of the first ⌈δ·n_t^{(c)}⌉ nearest-neighbour target domain sample points containing z_{t,i}, where δ ∈ [0,1] is a preset adjacency factor and ⌈·⌉ is the round-up symbol; y(z_{t,i}) and y(z_{t,j}) denote the category labels of sample points z_{t,i} and z_{t,j}, where z_{t,j} is a neighbourhood point of z_{t,i}; α_i and α_j are weight coefficients of the respective target domain samples; Z_t = P^T X_t represents the target domain data in the projection space, and z_{t,i} represents the i-th data in Z_t;
x_{s,i} and x_{t,j} are a source domain sample point and a target domain sample point respectively; x_s^{(c)} and x_t^{(c)} are the source domain and target domain sample points of category c; L_s is the Laplacian matrix of the source domain and L_t is the Laplacian matrix of the target domain; G is the graph matrix, G ∈ R^{n_s×n_t}, where R^{n_s×n_t} denotes the real space of dimension n_s×n_t and n_s, n_t are the numbers of source and target samples;
u^{(i)} is a set of k nearest-neighbour target domain sample points, where k is the preset number of nearest neighbours; c ∈ {1, 2, ..., C} indexes the C categories common to the source and target domains; n_s^{(c)} and n_t^{(c)} are the numbers of class-c samples in the source and target domains respectively; n = n_s + n_t;
Therefore, by learning the class-wise Laplacian matrix from the source domain to the target domain, the consistency and continuity of intra-class samples across the domains are improved, which further improves the classification performance of the model on target domain samples and promotes knowledge transfer from the source domain to the target domain.
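The update (XΩX^T + βI_m)P = XHX^T PΦ is a generalised eigenproblem in the columns of P. A minimal NumPy sketch is shown below, assuming Ω is a stand-in identity matrix in place of the patent's combination of M_c, L_s, L_t and graph terms, and using the centering matrix H = I_n − (1/n)1_{n×n} defined later in the document; the Cholesky whitening trick is one standard way to solve it, not necessarily the patent's.

```python
import numpy as np

def update_projection(X, Omega, H, beta, d):
    """Solve (X Omega X^T + beta I_m) P = X H X^T P Phi with plain NumPy:
    whiten by the Cholesky factor of the left-hand matrix, then take the
    eigenvectors of the d largest eigenvalues (the diagonal of Phi)."""
    m = X.shape[0]
    A = X @ H @ X.T                                # X H X^T (symmetric)
    B = X @ Omega @ X.T + beta * np.eye(m)         # positive definite for beta > 0
    L = np.linalg.cholesky(B)
    Linv = np.linalg.inv(L)
    w, V = np.linalg.eigh(Linv @ A @ Linv.T)       # ascending eigenvalues
    return Linv.T @ V[:, ::-1][:, :d]              # back-transform, top-d columns

rng = np.random.default_rng(0)
m, n, d = 6, 12, 3
X = rng.normal(size=(m, n))
H = np.eye(n) - np.ones((n, n)) / n                # centering matrix I_n - (1/n)1_{n x n}
Omega = np.eye(n)                                  # toy stand-in for the M_c / L_s / L_t terms
P = update_projection(X, Omega, H, beta=0.1, d=d)
```

Each returned column p satisfies (XHX^T)p = φ(XΩX^T + βI_m)p for its multiplier φ, matching the formula term by term.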
In some possible implementation manners of the first aspect, updating the graph matrix, the category weight matrix and the Laplacian matrix of the target domain according to the label and the pseudo label respectively, to obtain an updated graph matrix, an updated category weight matrix and an updated Laplacian matrix of the target domain, includes: calculating, by their respective formulas, the updated graph matrix, the updated category weight matrix and the updated Laplacian matrix of the target domain from the label and the pseudo label.
In some possible implementation manners of the first aspect, classifying the projected target domain sample data based on a second image classifier trained using the labeled projected source domain sample data to obtain a pseudo label of the projected target domain sample data includes:
training an image classifier by using the projected source domain sample data with the label to obtain a second image classifier;
and classifying the projected target domain sample data by using a second image classifier to obtain a first classification result of each projected target domain sample data, wherein the first classification result is a pseudo label.
In some possible implementations of the first aspect, obtaining an initialization pseudo label for each target domain sample image based on a first image classifier trained using a source domain dataset includes:
training an image classifier by using a source domain data set to obtain a trained first image classifier;
and classifying the sample images of each target domain by using the first image classifier to obtain a second classification result of the sample images of each target domain, wherein the second classification result is an initialization pseudo label.
In a second aspect, an embodiment of the present application provides a cross-domain image classification device based on class consistency structured learning, including:
the system comprises an acquisition module, a storage module and a processing module, wherein the acquisition module is used for acquiring a source domain data set and a target domain data set, the source domain data set comprises a source domain sample image and a label of the source domain sample image, and the target domain data set comprises a target domain sample image;
the pseudo label initialization module is used for obtaining initialization pseudo labels of all target domain sample images based on a first image classifier trained by using a source domain data set;
the initialization module is used for projecting the source domain data set and the target domain data set to the same public space to obtain source domain sample points and target domain sample points, and initializing according to the source domain sample points and target domain sample points of the same category, based on the initialization pseudo labels and the labels, to obtain a projection matrix, a graph matrix, a category weight matrix, a Laplacian matrix of the source domain and a Laplacian matrix of the target domain;
the projection matrix updating module is used for updating the projection matrix according to the graph matrix, the category weight matrix, the Laplace matrix of the source domain and the Laplace matrix of the target domain to obtain an updated projection matrix;
the projection module is used for performing projection learning on the source domain data set and the target domain data set by using the updated projection matrix to obtain projected source domain sample data and projected target domain sample data;
the pseudo label updating module is used for classifying the projected target domain sample data based on a second image classifier trained by using the projected source domain sample data with the label to obtain a pseudo label of the projected target domain sample data;
the circulation module is used for, when the number of cycles has not reached the preset number, respectively updating the graph matrix, the category weight matrix and the Laplacian matrix of the target domain according to the label and the pseudo label, to obtain an updated graph matrix, an updated category weight matrix and an updated Laplacian matrix of the target domain, and returning to the step of updating the projection matrix according to the graph matrix, the category weight matrix, the Laplacian matrix of the source domain and the Laplacian matrix of the target domain, based on the updated graph matrix, the updated category weight matrix and the updated Laplacian matrix of the target domain; and when the number of cycles reaches the preset number, obtaining the classification result of each target domain sample image, wherein the classification result is a pseudo label;
wherein, the projection matrix updating module is specifically configured to:
by the formula (XΩX^T + βI_m)P = XHX^T PΦ, obtaining an updated projection matrix, where β is a hyperparameter;
wherein X = [X_s, X_t] ∈ R^{m×n}, X_s is the source domain data set, X_t is the target domain data set, P is the projection matrix, I_m ∈ R^{m×m} is the identity matrix of dimension m, R^{m×m} denotes the real space of dimension m×m, and R^{m×n} denotes the real space of dimension m×n, where m is the feature dimension and n is the total number of samples;
Φ = diag(Φ_1, Φ_2, ..., Φ_d) ∈ R^{d×d} is a diagonal matrix whose diagonal elements are Lagrange multipliers; α, η and γ are all hyperparameters; M_c is the category weight matrix; v^{(i)} is the set of the first ⌈δ·n_t^{(c)}⌉ nearest-neighbour target domain sample points containing z_{t,i}, where δ ∈ [0,1] is a preset adjacency factor and ⌈·⌉ is the round-up symbol;
y(z_{t,i}) and y(z_{t,j}) denote the category labels of sample points z_{t,i} and z_{t,j}, where z_{t,j} is a neighbourhood point of z_{t,i}; α_i and α_j are weight coefficients of the respective target domain samples; Z_t = P^T X_t represents the target domain data in the projection space, and z_{t,i} represents the i-th data in Z_t;
x_{s,i} and x_{t,j} are a source domain sample point and a target domain sample point respectively; x_s^{(c)} and x_t^{(c)} are the source domain and target domain sample points of category c;
L_s is the Laplacian matrix of the source domain and L_t is the Laplacian matrix of the target domain; G is the graph matrix, G ∈ R^{n_s×n_t}, where R^{n_s×n_t} denotes the real space of dimension n_s×n_t and n_s, n_t are the numbers of source and target samples;
z_i = P^T x_i and z_j = P^T x_j are projected sample points, and y(z_i), y(z_j) are the labels of the projected sample points; u^{(i)} is a set of k nearest-neighbour target domain sample points, where k is the preset number of nearest neighbours; c ∈ {1, 2, ..., C} indexes the C categories common to the source and target domains; n_s^{(c)} and n_t^{(c)} are the numbers of class-c samples in the source and target domains respectively; n = n_s + n_t;
I_n and I_d are identity matrices of dimensions n and d, and 1_{n×n} is a square matrix whose elements are all 1.
In some possible implementations of the second aspect, the loop module is specifically configured to: calculate, by their respective formulas, the updated graph matrix, the updated category weight matrix and the updated Laplacian matrix of the target domain from the label and the pseudo label.
In some possible implementations of the second aspect, the pseudo tag updating module is specifically configured to:
training an image classifier by using the projected source domain sample data with the label to obtain a second image classifier;
and classifying the projected target domain sample data by using a second image classifier to obtain a first classification result of each projected target domain sample data, wherein the first classification result is a pseudo label.
In some possible implementations of the second aspect, the pseudo tag initialization module is specifically configured to:
training an image classifier by using a source domain data set to obtain a trained first image classifier;
and classifying the sample images of each target domain by using the first image classifier to obtain a second classification result of the sample images of each target domain, wherein the second classification result is an initialization pseudo label.
In a third aspect, an embodiment of the present application provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the method according to any one of the first aspect is implemented.
In a fourth aspect, the present application provides a computer-readable storage medium, in which a computer program is stored, and the computer program is executed by a processor to implement the method according to any one of the above first aspects.
In a fifth aspect, embodiments of the present application provide a computer program product, which, when run on an electronic device, causes the electronic device to perform the method of any one of the above first aspects.
It is understood that the beneficial effects of the second aspect to the fifth aspect can be referred to the related description of the first aspect, and are not described herein again.
Drawings
Fig. 1 is a schematic flowchart of a cross-domain image classification method based on class consistency structured learning according to an embodiment of the present application;
FIG. 2 is a schematic block diagram of another flowchart of a cross-domain image classification method based on class consistency structure learning according to an embodiment of the present application;
fig. 3 is a block diagram illustrating a structure of a cross-domain image classification device based on class consistency structure learning according to an embodiment of the present application;
fig. 4 is a block diagram schematically illustrating a structure of an electronic device according to an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon" or "in response to" determining "or" in response to detecting ". Similarly, the phrase "if it is determined" or "if a [ described condition or event ] is detected" may be interpreted contextually to mean "upon determining" or "in response to determining" or "upon detecting [ described condition or event ]" or "in response to detecting [ described condition or event ]".
Furthermore, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used for distinguishing between descriptions and not necessarily for describing or implying relative importance.
Reference throughout this specification to "one embodiment" or "some embodiments," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," or the like, in various places throughout this specification are not necessarily all referring to the same embodiment, but rather "one or more but not all embodiments" unless specifically stated otherwise. The terms "comprising," "including," "having," and variations thereof mean "including, but not limited to," unless expressly specified otherwise.
At present, most unsupervised domain adaptation methods do not consider the consistency and continuity of same-class samples across domains, so that after convergence through several rounds of projection learning, inter-class distribution boundaries become blurred and intra-class sample distributions become sparse.
In order to improve the consistency and continuity of intra-class samples, the embodiments of the present application provide a cross-domain image classification method based on class consistency structure learning (Cross-Domain Class-Wise Structure Learning, CCSL), which promotes knowledge migration from the source domain to the target domain by learning a class-wise Laplacian matrix from the source domain to the target domain on the basis of the CKET algorithm.
In addition, considering that the pseudo label learned by the target domain has certain uncertainty, the embodiment of the application provides a pseudo label credibility mechanism based on the sample distribution characteristics so as to improve the credibility of the pseudo label of the target domain and further reduce the risk of negative knowledge migration from the source domain to the target domain.
Referring to fig. 1, a flowchart of a cross-domain image classification method based on class consistency structured learning according to an embodiment of the present application is shown, where the method includes the following steps:
step S101, a source domain data set and a target domain data set are obtained, wherein the source domain data set comprises source domain sample images and labels of the source domain sample images, and the target domain data set comprises target domain sample images.
It is to be understood that the source domain data set is labeled data and the target domain data set is unlabeled data. The label in the source domain sample image may characterize the class to which the image belongs.
Illustratively, the source domain data set and the target domain data set may use existing public data sets, such as Office-Caltech (SURF), COIL20, PIE, and the like. Office-Caltech (SURF) and COIL20 are data sets for object recognition, and PIE is a data set for face pose recognition.
Taking the Office-Caltech (SURF) data set as an example, Office-Caltech (SURF) includes four sub data sets: C (Caltech), A (Amazon), W (Webcam) and D (DSLR). Each sub data set comprises a different number of pictures, with 10 categories common to the 4 sub data sets. One sub data set is randomly selected as the source domain data set, and one of the remaining 3 is selected as the target domain data set. For example, the Caltech sub data set is selected as the source domain data set, and the Amazon sub data set as the target domain data set. The Caltech sub data set comprises 1123 pictures and the Amazon sub data set comprises 958 pictures; each picture is compressed and extracted into an 800-dimensional column vector, finally yielding a source domain data matrix X_s ∈ R^{800×1123} and a target domain data matrix X_t ∈ R^{800×958}. The label information of the source domain is the real category information of each picture, giving source domain sample labels Y_s ∈ R^{1123×1}; each element y_s of Y_s represents the label information of one sample, and since there are ten categories, y_s ∈ {1, 2, ..., 10}.
In addition, some relevant parameters need to be acquired. The relevant parameters may include α, β, η, γ, δ, projection subspace dimension d, number of neighbors k, and number of iterations T. Illustratively, k is 5 and T is 10.
After obtaining the relevant parameters, the optimal values of the parameters α, β and γ may be found by grid search; illustratively, η is 0.5.
In a specific application, corresponding parameter values can be input according to the selected source domain and target domain data sets. For example, for Office-Caltech10 the subspace dimension d is 10 and δ is 0.1; for COIL20, d is 20 and δ is 0.05; for PIE, d is 100 and δ is 0.25.
Step S102, obtaining initialization pseudo labels of all target domain sample images based on a first image classifier trained by using a source domain data set.
After obtaining the target domain data set, the source domain data set, and the relevant parameters, the pseudo label y_{t,i} (1 ≤ i ≤ n_t) of each target domain sample image may be initialized in the original space, where the original space is defined relative to the projection space. Illustratively, the initialization pseudo labels of the target domain sample images may be obtained by the following process: an image classifier, for example a K-nearest-neighbour classifier, is trained using the labeled source domain sample images to obtain a trained first image classifier. After the trained first image classifier is obtained, it classifies each target domain sample image to obtain a classification result for each target domain sample image, and the classification result of each target domain sample image is taken as its initialization pseudo label; the classification result characterizes the category to which the image belongs.
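The pseudo-label initialization above can be sketched as a small k-nearest-neighbour classifier over the labelled source columns; the function name `knn_pseudo_labels` and the toy data are illustrative, not part of the patent.

```python
import numpy as np

def knn_pseudo_labels(Xs, ys, Xt, k=1):
    """First image classifier: a k-nearest-neighbour vote over the labelled
    source columns; its predictions initialise the target pseudo labels."""
    d2 = ((Xt.T[:, None, :] - Xs.T[None, :, :]) ** 2).sum(axis=2)   # (nt, ns)
    nn = np.argsort(d2, axis=1)[:, :k]                              # k closest source samples
    return np.array([np.bincount(ys[row]).argmax() for row in nn])  # majority vote

# toy data: two well-separated source classes, two target samples
Xs = np.array([[0.0, 0.2, 5.0, 5.1],
               [0.0, 0.1, 5.0, 4.9]])
ys = np.array([0, 0, 1, 1])
Xt = np.array([[0.3, 4.8],
               [0.2, 5.2]])
yt0 = knn_pseudo_labels(Xs, ys, Xt, k=3)
```

The data is laid out feature-by-column to match the X_s ∈ R^{m×n_s}, X_t ∈ R^{m×n_t} convention used throughout the description.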
Step S103, projecting the source domain data set and the target domain data set to the same public space to obtain source domain sample points and target domain sample points, and initializing according to the source domain sample points and the target domain sample points of the same type based on the initialized pseudo labels and the labels to obtain a projection matrix, a graph matrix, a type weight matrix, a Laplacian matrix of the source domain and a Laplacian matrix of the target domain.
In specific application, each sample image in the source domain data set and each sample image in the target domain data set are projected to the same public space, so that a source domain sample point corresponding to each source domain sample image and a target domain sample point corresponding to each target domain sample image are obtained. That is, the source domain sample points are sample points at which the source domain sample images are projected into the common space, and the target domain sample points are sample points at which the target domain sample images are projected into the common space.
It will be appreciated that the source domain exemplar points in the common space have corresponding labels, and that the labels of the source domain exemplar points in the common space are the same as the labels of the source domain exemplar images in the original space. Similarly, the target domain sample point also corresponds to a corresponding pseudo label, and is the same as the pseudo label of the target domain sample image in the original space. Based on the above, after projection, the category to which each source domain sample point belongs can be determined according to the label, the category to which each target domain sample point belongs can be determined according to the pseudo label, and further the target domain sample point and the source domain sample point which belong to the same category can be determined.
From the target domain sample points and the source domain sample points that belong to the same class, an initialized projection matrix P = I_m ∈ R^{m×m} can be obtained. The remaining matrices are then initialized, by their respective formulas, from the initialized projection matrix, the labels and the pseudo labels.
First, in order to enhance the classification performance on target domain images by making same-category sample clusters more compact, the following mathematical expression can be obtained:
P ∈ R^{m×d}; c ∈ {1, 2, ..., C} indexes the C categories common to the source and target domains; n_s^{(c)} and n_t^{(c)} are the numbers of class-c samples in the source and target domains respectively; x_s^{(c)} and x_t^{(c)} are class-c sample points of the source domain and the target domain respectively. The weight α_j reassigns weights to the target domain samples based on the sample distribution characteristics; its assignment rule is determined by the following trust mechanism: when a projected target domain sample point lies among the first k nearest neighbours of a projected source domain sample point, the weight is assigned 1; otherwise it is assigned 0. u^{(i)} is the corresponding set of k nearest neighbours, where k is the preset number of nearest neighbours.
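One plausible reading of this trust mechanism, sketched for a single class c, is that a target point keeps weight 1 only if it appears among the k nearest target neighbours of at least one projected same-class source point; the function name and the neighbour direction are assumptions of this sketch, not language from the patent.

```python
import numpy as np

def trust_weights(Zs_c, Zt_c, k):
    """Pseudo-label trust mechanism for one class c: assign weight 1 to a
    target point that lies among the k nearest target neighbours of some
    projected source point of the same class, and weight 0 otherwise."""
    d2 = ((Zs_c.T[:, None, :] - Zt_c.T[None, :, :]) ** 2).sum(axis=2)  # (ns_c, nt_c)
    k = min(k, d2.shape[1])
    nn = np.argsort(d2, axis=1)[:, :k]          # k nearest target points per source point
    w = np.zeros(Zt_c.shape[1])
    w[np.unique(nn)] = 1.0                      # trusted target points keep weight 1
    return w

# a target outlier far from every class-c source point gets weight 0
Zs_c = np.array([[0.0], [0.0]])                 # one source point at the origin
Zt_c = np.array([[0.1, 0.0, 10.0],
                 [0.0, 0.1, 10.0]])
w = trust_weights(Zs_c, Zt_c, k=2)
```

Down-weighting such outliers is what reduces the risk of negative transfer from unreliable pseudo labels.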
Based on this, the above equation (1) can be rewritten as the following equation (2):
X = [X_s, X_t] ∈ R^{m×n} is the combined original source and target domain data, where X_s is the source domain data set, X_t is the target domain data set, and n = n_s + n_t; z_i = P^T x_i and z_j = P^T x_j are projected sample points; y(z_i) and y(z_j) are the label values of the respective sample points; the expression further involves two diagonal matrices.
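As a hedged illustration of the trust mechanism above, the binary weights α_i can be sketched as follows. The exact neighborhood definition is ambiguous in the extracted text, so assigning α_i = 1 whenever projected target point i lies among the k nearest target-domain candidates of at least one projected source point is an assumption of this sketch:

```python
import numpy as np

def trust_weights(Zs, Zt, k):
    """Binary trust weights for projected target samples.

    Zs: (m, ns) projected source points (columns are samples).
    Zt: (m, nt) projected target points.
    alpha[i] = 1 if target point i is among the k nearest target
    points of at least one source point, else 0 (an assumed reading
    of the patent's trust mechanism).
    """
    alpha = np.zeros(Zt.shape[1])
    for j in range(Zs.shape[1]):
        # distances from every target point to source point j
        dists = np.linalg.norm(Zt - Zs[:, [j]], axis=0)
        for i in np.argsort(dists)[:k]:
            alpha[i] = 1.0
    return alpha
```

With k = 1, only the single target point closest to each source point keeps a nonzero weight, which matches the "assign 1 if within the first k nearest neighbors, otherwise 0" rule.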
After the initialized projection matrix is obtained, the initialized graph matrix G and the initialized category weight matrix M_c are calculated from the initialized projection matrix, the labels, the pseudo labels and related information. Likewise, based on the same information, the initialized Laplacian matrix L_s of the source domain and the initialized Laplacian matrix L_t of the target domain are calculated.
Specifically, in order to realize global matching of the source domain and the target domain, marginal distribution matching and conditional distribution matching are introduced; the specific expression is as follows:
wherein:
i and j are subscripts of the source domain and target domain sample points respectively; x_s,i and x_t,j are a source domain sample point and a target domain sample point respectively; x_s^(c) and x_t^(c) are a source domain sample point and a target domain sample point of class c respectively.
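The marginal-distribution matching matrix itself is an image in the original and is not reproduced here, but methods of this family (e.g. JDA/TCA) conventionally use an MMD matrix M_0 of the following standard form; treating the patent's matrix as this standard form is an assumption of this sketch:

```python
import numpy as np

def mmd_matrix(ns, nt):
    """Standard marginal MMD matrix M0 (assumed form, as in JDA/TCA):
    (M0)_ij = 1/ns^2 for two source samples, 1/nt^2 for two target
    samples, and -1/(ns*nt) across domains. Built as an outer product
    of the signed indicator vector e."""
    e = np.concatenate([np.full(ns, 1.0 / ns), np.full(nt, -1.0 / nt)])
    return np.outer(e, e)
```

The matrix sums to zero by construction, so trace(P^T X M_0 X^T P) measures the squared distance between the projected domain means.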
Then, local manifold learning is added in the source domain and the target domain respectively; the specific expression is as follows:
wherein:
to ensure a more compact distribution of sample points of the same class, the weighting coefficients for the target domain samples are reassigned using the following equation:
δ ∈ [0,1] is a preset adjacency factor, ⌈·⌉ is the round-up (ceiling) symbol, and v^(i) is the set of target domain sample points containing the first several nearest neighbors of z_t,i, the count being given by the ceiling expression.
Based on this, the weight coefficients for the target domain sample points may be defined as follows:
At this point, the above equations (3) and (4) can be rewritten as the following expressions, respectively:
wherein 1 ≤ c ≤ C, and M_0 is assigned as in equation (3) above.
wherein the graph weight is assigned when y(x_t,i) = y(x_t,j) ∧ α_i·α_j = 1; L_s and L_t are the Laplacian matrices of the source domain and the target domain respectively, each formed with a diagonal degree matrix; the solution process of L_s is similar to that of L_t.
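Assuming the standard unnormalized construction L = D − G hinted at by the diagonal matrix mentioned above (an assumption, since the patent's formula is an image), the Laplacian matrices L_s and L_t can be sketched as:

```python
import numpy as np

def laplacian(G):
    """Unnormalized graph Laplacian L = D - G, where D is the diagonal
    degree matrix with D_ii = sum_j G_ij. This is the conventional
    construction for the L_s and L_t described in the text."""
    D = np.diag(G.sum(axis=1))
    return D - G
```

Each row of L then sums to zero, which is what makes trace(Z L Z^T) a sum of weighted pairwise squared distances between connected samples.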
The initialized target domain pseudo labels, the initialized graph matrix G, the initialized category weight matrix M_c, the initialized Laplacian matrix L_s of the source domain and the initialized Laplacian matrix L_t of the target domain can be obtained through the above steps.
And step S104, updating the projection matrix according to the graph matrix, the category weight matrix, the Laplace matrix of the source domain and the Laplace matrix of the target domain to obtain an updated projection matrix.
After the initialized graph matrix G, category weight matrix M_c, source domain Laplacian matrix L_s and target domain Laplacian matrix L_t are obtained, the updated projection matrix P is calculated by the formula (XΩX^T + βI_m)P = XHX^T PΦ.
wherein I_m ∈ R^{m×m} is the identity matrix of dimension m, and Φ = diag(Φ_1, Φ_2, ..., Φ_d) ∈ R^{d×d} is a diagonal matrix whose diagonal elements are Lagrange multipliers.
equation (X Ω X)T+βIm)P=XHXTThe acquisition process of P Φ can be as follows:
First, the above formulas (2), (7) and (8) are combined and a regularization term β‖P‖_F^2 is added, yielding the final objective function of CCSL:
s.t. P^T XHX^T P = I_d (9)
wherein the constraint term is derived from Principal Component Analysis (PCA) so as to maximize the data variance; I_n and I_d are identity matrices of dimensions n and d respectively; H = I_n − (1/n)·1, where 1 is a square matrix whose elements are all 1; and α, β, η and γ are four preset hyper-parameters.
The above equation (9) is nonlinear, so it is solved using Lagrange multipliers; the Lagrangian function is as follows:
Finally, taking the derivative of equation (10) with respect to P and setting it to 0, the following equation is obtained:
(XΩX^T + βI_m)P = XHX^T PΦ (11)
The eigenvectors corresponding to the first d smallest eigenvalues are then selected to form the projection matrix P.
That is, based on the above equation (11), the projection matrix P is calculated from the graph matrix G, the category weight matrix M_c, the source domain Laplacian matrix L_s and the target domain Laplacian matrix L_t; this projection matrix P is the updated projection matrix.
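Equation (11) is a generalized eigenproblem, so the projection update of step S104 can be sketched as follows. The tiny ridge added to the right-hand-side matrix is a numerical-safety assumption of this sketch (XHX^T can be singular in practice), not part of the method:

```python
import numpy as np
from scipy.linalg import eigh

def update_projection(X, Omega, H, beta, d):
    """Solve (X Omega X^T + beta I_m) P = X H X^T P Phi and return the
    d eigenvectors with smallest eigenvalues as the columns of P."""
    m = X.shape[0]
    A = X @ Omega @ X.T + beta * np.eye(m)
    B = X @ H @ X.T
    # eigh(A, B) solves A v = w B v; B must be positive definite, so a
    # small ridge is added for numerical safety (sketch assumption).
    B = B + 1e-6 * np.eye(m)
    w, V = eigh(A, B)          # eigenvalues in ascending order
    return V[:, :d]            # the d smallest eigenvectors form P
```

scipy normalizes the generalized eigenvectors so that P^T B P = I_d, which is exactly the PCA-style variance constraint of equation (9).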
And S105, performing projection learning on the source domain data set and the target domain data set by using the updated projection matrix to obtain projected source domain sample data and projected target domain sample data.
After the updated projection matrix P is obtained according to the above equation (11), each source domain sample image in the source domain data set and each target domain sample image in the target domain data set are projected to the same common space using the updated projection matrix P, obtaining the projected source domain sample data Z_s = P^T X_s and the projected target domain sample data Z_t = P^T X_t.
And S106, classifying the projected target domain sample data based on a second image classifier trained by using the projected source domain sample data with the label to obtain a pseudo label of the projected target domain sample data.
In some embodiments, the labelled projected source domain sample data Z_s = P^T X_s is used to train the image classifier, obtaining the trained second image classifier. The image classifier may be, for example, a K-nearest-neighbor classifier. Then, the second image classifier is used to classify the projected target domain sample data Z_t = P^T X_t, obtaining a classification result for each projected target domain sample, and the classification result is taken as the pseudo label of the projected target domain sample data. That is, the initialized pseudo label is updated to obtain an updated target domain pseudo label.
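Steps S105 and S106 together amount to projecting both domains with the current P and refitting a nearest-neighbor classifier on the projected source data. A minimal sketch, with sklearn's `KNeighborsClassifier` standing in for the second image classifier:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def update_pseudo_labels(P, Xs, ys, Xt, n_neighbors=1):
    """Project both domains with the current P and refresh the target
    pseudo labels with a K-nearest-neighbor classifier trained on the
    projected, labelled source data (a sketch of steps S105-S106)."""
    Zs = P.T @ Xs                 # projected source data (columns)
    Zt = P.T @ Xt                 # projected target data (columns)
    clf = KNeighborsClassifier(n_neighbors=n_neighbors)
    clf.fit(Zs.T, ys)             # samples are columns, so transpose
    return clf.predict(Zt.T)      # new pseudo labels
```

The returned predictions replace the previous pseudo labels before the loop check of step S107.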
Step S107: judge whether the number of cycles reaches the preset number; if not, proceed to step S108; if the number of cycles reaches the preset number, proceed to step S109.
In a specific application, steps S104 to S108 are executed in a loop, that is, the projection matrix is continuously updated and the updated projection matrix is used to update the pseudo labels of the target domain. If, after step S106 is executed, the cumulative number of cycles is greater than or equal to the preset number, the process proceeds to step S109: the pseudo label calculated in step S106 of the current cycle is taken as the final image classification result, and the updated projection matrix obtained in step S104 of the current cycle is output. Otherwise, if the cumulative number of cycles has not reached the preset number, the process proceeds to step S108 and then returns to step S104.
Step S108: respectively update the graph matrix, the category weight matrix and the Laplacian matrix of the target domain according to the labels and the pseudo labels to obtain an updated graph matrix, an updated category weight matrix and an updated Laplacian matrix of the target domain, and return to step S104.
In a specific application, according to information such as the labels and the pseudo labels, the graph matrix G may be updated according to the above formula (2), the category weight matrix M_c according to the above formula (7), and the Laplacian matrix L_t of the target domain according to the above formula (8). Then, the projection matrix is updated according to the Laplacian matrix L_s of the source domain and the updated matrices G, M_c and L_t obtained in step S108, yielding the updated projection matrix. Next, the updated projection matrix is used to project the data in the source domain data set and the target domain data set, obtaining projected source domain sample data and projected target domain sample data; the projected source domain sample data is used to train an image classifier, and the trained image classifier classifies the projected target domain sample data, obtaining the pseudo labels of the projected target domain sample data. Finally, it is judged whether the current cumulative number of cycles reaches the preset number: if so, the pseudo labels are taken as the final image classification result; if not, the matrix G, the matrix M_c and the Laplacian matrix L_t of the target domain are updated again. This is repeated until the number of cycles reaches the preset number.
And step S109, obtaining the classification result of each target domain sample image, wherein the classification result is a pseudo label.
It is to be understood that the pseudo label of each target domain sample image may characterize the category to which the target domain sample image belongs. In other embodiments, the projection matrix P may be output in addition to the pseudo-tags.
It should be noted that while the CKET learns the projection matrix, the weights of the target domain samples are redistributed according to the sparsity of the distribution of the target domain samples in the projection space. On the basis of CKET, the Laplace matrix from the source domain to the target domain in the same category is learned, so that the consistency and continuity of the intra-class samples between the domains are further improved, and further the knowledge transfer from the source domain to the target domain is improved.
Therefore, by learning the Laplacian matrix from the source domain to the target domain, the consistency and continuity of intra-class samples between domains are improved, the classification performance of the model on target domain samples is further improved, and the knowledge transfer from the source domain to the target domain is promoted.
In order to better describe the CCSL method provided in the embodiment of the present application, it is described with reference to another schematic flow diagram of the class-consistency structured learning based cross-domain image classification method shown in fig. 2. Here, the image classifier is specifically a K-nearest-neighbor classifier, and the preset number of cycles is 10.
As shown in fig. 2, the cycle counter is initialized to t = 1, and the parameter values of α, β, η, γ, δ, etc., as well as the projection subspace dimension d and the nearest-neighbor number k, are input according to the data sets used. That is, the corresponding parameter values are input according to the selected source domain data set and target domain data set.
And then training the K neighbor classifier by using the source domain data set to obtain the trained K neighbor classifier.
Next, each sample in the target domain data set is classified using the trained K-nearest-neighbor classifier to initialize the pseudo label of each target domain sample image. The graph matrix G is calculated according to the above formula (2), the category weight matrix M_c according to the above formula (7), and the Laplacian matrix L_t of the target domain and the Laplacian matrix L_s of the source domain according to the above formula (8), thereby initializing the matrix G, the matrix M_c, the Laplacian matrix L_t of the target domain and the Laplacian matrix L_s of the source domain.
Then, the projection matrix P is calculated according to the above equation (11) to update the initialized projection matrix, obtaining an updated projection matrix. The updated projection matrix is used to project the source domain data and the target domain data, obtaining the projected source domain data Z_s = P^T X_s and the projected target domain data Z_t = P^T X_t. The projected source domain data Z_s = P^T X_s is then used to train the K-nearest-neighbor classifier; the trained K-nearest-neighbor classifier classifies the projected target domain data Z_t = P^T X_t, and the classification result is taken as the target domain pseudo label.
It is then judged whether the cycle count t is less than or equal to 10. If so, the process returns to recalculating the matrix G according to formula (2), the matrix M_c according to formula (7), and the Laplacian matrix L_t of the target domain and the Laplacian matrix L_s of the source domain according to formula (8). If not, the target domain pseudo label and the projection matrix P are output, the target domain pseudo label being the final image classification result.
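The fig. 2 loop can be sketched end to end as below. The graph/weight/Laplacian updates of formulas (2), (7) and (8) are collapsed into a single placeholder same-class graph Laplacian Ω built from the current pseudo labels; that placeholder is an assumption of this sketch and not the patent's actual construction:

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.neighbors import KNeighborsClassifier

def ccsl_loop(Xs, ys, Xt, d=2, beta=0.1, T=10, k=1):
    """High-level sketch of the fig. 2 alternation: (re)solve the
    generalized eigenproblem for P, re-project, refit a K-NN
    classifier, refresh the pseudo labels, repeat T times."""
    X = np.hstack([Xs, Xt])
    n = X.shape[1]
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    clf = KNeighborsClassifier(n_neighbors=k).fit(Xs.T, ys)
    yt = clf.predict(Xt.T)                       # initial pseudo labels
    for _ in range(T):
        y_all = np.concatenate([ys, yt])
        # placeholder Omega: Laplacian of a same-class affinity graph
        G = (y_all[:, None] == y_all[None, :]).astype(float)
        Omega = np.diag(G.sum(axis=1)) - G
        A = X @ Omega @ X.T + beta * np.eye(X.shape[0])
        B = X @ H @ X.T + 1e-6 * np.eye(X.shape[0])  # ridge for safety
        _, V = eigh(A, B)
        P = V[:, :d]                             # updated projection
        clf = KNeighborsClassifier(n_neighbors=k).fit((P.T @ Xs).T, ys)
        yt = clf.predict((P.T @ Xt).T)           # updated pseudo labels
    return yt, P
```

The structure mirrors the flow above: initialize pseudo labels with a source-trained classifier, then alternate projection updates and pseudo-label updates for the preset number of cycles.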
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
Corresponding to the method for classifying cross-domain images based on class consistency structure learning described in the foregoing embodiments, fig. 3 shows a block diagram of a cross-domain image classification device based on class consistency structure learning provided in an embodiment of the present application, and for convenience of explanation, only the relevant parts of the embodiment of the present application are shown.
Referring to fig. 3, the apparatus includes:
an obtaining module 31, configured to obtain a source domain data set and a target domain data set, where the source domain data set includes a source domain sample image and a label of the source domain sample image, and the target domain data set includes a target domain sample image;
a pseudo label initialization module 32, configured to obtain an initialized pseudo label of each target domain sample image based on a first image classifier trained using the source domain data set;
the initialization module 33 is configured to project the source domain data set and the target domain data set to the same public space, obtain source domain sample points and target domain sample points, and initialize according to the same type of source domain sample points and target domain sample points based on the initialization pseudo tag and the initialization label, so as to obtain a projection matrix, a graph matrix, a type weight matrix, a laplacian matrix of the source domain, and a laplacian matrix of the target domain;
a projection matrix updating module 34, configured to update the projection matrix according to the graph matrix, the category weight matrix, the laplacian matrix of the source domain, and the laplacian matrix of the target domain, so as to obtain an updated projection matrix;
the projection module 35 is configured to perform projection learning on the source domain data set and the target domain data set by using the updated projection matrix, and obtain projected source domain sample data and projected target domain sample data;
a pseudo label updating module 36, configured to classify the projected target domain sample data based on a second image classifier trained using the projected source domain sample data with a label, and obtain a pseudo label of the projected target domain sample data;
the circulation module 37 is configured to: when the number of cycles does not reach the preset number, update the graph matrix, the category weight matrix and the Laplacian matrix of the target domain according to the label and the pseudo label to obtain an updated graph matrix, an updated category weight matrix and an updated Laplacian matrix of the target domain, and return, based on the updated graph matrix, the updated category weight matrix and the updated Laplacian matrix of the target domain, to the step of updating the projection matrix according to the graph matrix, the category weight matrix, the Laplacian matrix of the source domain and the Laplacian matrix of the target domain to obtain an updated projection matrix; and when the number of cycles reaches the preset number, obtain the classification result of each target domain sample image, the classification result being a pseudo label;
wherein the projection matrix updating module is specifically configured to: obtain the updated projection matrix by the formula (XΩX^T + βI_m)P = XHX^T PΦ; β is a hyper-parameter;
wherein X = [X_s, X_t] ∈ R^{m×n}, X_s is the source domain data set, X_t is the target domain data set, P is the projection matrix, I_m ∈ R^{m×m} is the identity matrix of dimension m, R^{m×m} represents the real space of dimension m×m, R^{m×n} represents the real space of dimension m×n, and m and n respectively represent space dimensions; Φ = diag(Φ_1, Φ_2, ..., Φ_d) ∈ R^{d×d} is a diagonal matrix whose diagonal elements are Lagrange multipliers; α, η and γ are all hyper-parameters; M_c is the category weight matrix;
v^(i) is the set of target domain sample points containing the first several nearest neighbors of z_t,i, the count being given by the ceiling expression; δ ∈ [0,1] is a preset adjacency factor; ⌈·⌉ is the round-up symbol;
y(x_t,i) = y(x_t,j) represents that the category labels of sample points x_t,i and x_t,j are the same; y(x_t,i) represents the category label of sample point x_t,i, and y(x_t,j) represents the category label of sample point x_t,j, where x_t,j is a neighborhood point of x_t,i; α_i, α_j represent the weight coefficient of each sample in the target domain; Z_t represents the form of the target domain data in the projection space, Z_t = P^T X_t, and z_t,i represents the i-th data in Z_t;
x_s,i and x_t,j are a source domain sample point and a target domain sample point respectively; x_s^(c) and x_t^(c) are the source domain sample points and target domain sample points of class c;
L_s is the Laplacian matrix of the source domain and L_t is the Laplacian matrix of the target domain; G is the graph matrix, G ∈ R^{n_s×n_t}, where R^{n_s×n_t} represents the real space of dimension n_s×n_t, n_s and n_t representing spatial dimensions;
z_i = P^T x_i and z_j = P^T x_j are the projected sample points; y(z_i), y(z_j) are the labels of the projected sample points;
u^(i) is the target domain sample point set comprising the k nearest neighbors, k being the preset number of nearest neighbors; c ∈ {1, 2, ..., C}, where C is the number of categories common to the source domain and the target domain; n_s^(c) and n_t^(c) are the numbers of class-c samples in the source domain and the target domain respectively; x_s^(c) and x_t^(c) are the sample points of class c in the source domain and the target domain respectively; n = n_s + n_t;
In some possible implementations, the loop module is specifically configured to: calculate the updated graph matrix according to the label and the pseudo label by formula (2); calculate the updated category weight matrix according to the label and the pseudo label by formula (7); and calculate the updated Laplacian matrix of the target domain according to the label and the pseudo label by formula (8).
In some possible implementations, the pseudo tag update module is specifically configured to: training an image classifier by using the projected source domain sample data with the label to obtain a second image classifier; and classifying the projected target domain sample data by using a second image classifier to obtain a first classification result of each projected target domain sample data, wherein the first classification result is a pseudo label.
In some possible implementations, the pseudo tag initialization module is specifically configured to: training an image classifier by using a source domain data set to obtain a trained first image classifier; and classifying the sample images of each target domain by using the first image classifier to obtain a second classification result of the sample images of each target domain, wherein the second classification result is an initialization pseudo label.
It should be noted that, for the information interaction, execution process, and other contents between the above-mentioned devices/units, the specific functions and technical effects thereof are based on the same concept as those of the method embodiment in the embodiment of the present application, which may be referred to in the method embodiment section specifically, and are not described herein again.
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 4, the electronic device 4 of this embodiment includes: at least one processor 40 (only one is shown in fig. 4), a memory 41, and a computer program 42 stored in the memory 41 and executable on the at least one processor 40; the processor 40 implements the steps in any of the method embodiments described above when executing the computer program 42.
The electronic device 4 may be a desktop computer, a notebook, a palm computer, a cloud server, or other computing devices. The electronic device may include, but is not limited to, a processor 40, a memory 41. Those skilled in the art will appreciate that fig. 4 is merely an example of the electronic device 4, and does not constitute a limitation of the electronic device 4, and may include more or less components than those shown, or combine some of the components, or different components, such as an input-output device, a network access device, etc.
The processor 40 may be a Central Processing Unit (CPU); the processor 40 may also be another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 41 may in some embodiments be an internal storage unit of the electronic device 4, such as a hard disk or a memory of the electronic device 4. The memory 41 may also be an external storage device of the electronic device 4 in other embodiments, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the electronic device 4. Further, the memory 41 may also include both an internal storage unit and an external storage device of the electronic device 4. The memory 41 is used for storing an operating system, an application program, a BootLoader (BootLoader), data, and other programs, such as program codes of the computer program. The memory 41 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
An embodiment of the present application further provides an electronic device, including: at least one processor, a memory, and a computer program stored in the memory and executable on the at least one processor, the processor implementing the steps of any of the various method embodiments described above when executing the computer program.
The embodiments of the present application further provide a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the computer program implements the steps in the above-mentioned method embodiments.
The embodiments of the present application provide a computer program product, which when running on an electronic device, enables the electronic device to implement the steps in the above method embodiments when executed.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, all or part of the processes in the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium and can implement the steps of the embodiments of the methods described above when the computer program is executed by a processor. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer readable medium may include at least: any entity or device capable of carrying computer program code to a photographing apparatus/terminal apparatus, a recording medium, computer Memory, Read-Only Memory (ROM), Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, and a software distribution medium. Such as a usb-disk, a removable hard disk, a magnetic or optical disk, etc. In certain jurisdictions, computer-readable media may not be an electrical carrier signal or a telecommunications signal in accordance with legislative and patent practice.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus, electronic device and method may be implemented in other ways. For example, the above-described apparatus/electronic device embodiments are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.
Claims (10)
1. A cross-domain image classification method based on class consistency structured learning is characterized by comprising the following steps:
obtaining a source domain data set and a target domain data set, wherein the source domain data set comprises a source domain sample image and a label of the source domain sample image, and the target domain data set comprises a target domain sample image;
obtaining an initialized pseudo label of each target domain sample image based on a first image classifier trained by using the source domain data set;
projecting the source domain data set and the target domain data set to the same public space to obtain source domain sample points and target domain sample points, and initializing according to the source domain sample points and the target domain sample points of the same type based on the initialized pseudo labels and the labels to obtain a projection matrix, a graph matrix, a type weight matrix, a Laplace matrix of a source domain and a Laplace matrix of a target domain;
updating the projection matrix according to the graph matrix, the class weight matrix, the Laplace matrix of the source domain and the Laplace matrix of the target domain to obtain an updated projection matrix;
performing projection learning on the source domain data set and the target domain data set by using the updated projection matrix to obtain projected source domain sample data and projected target domain sample data;
classifying the projected target domain sample data based on a second image classifier trained by using the projected source domain sample data with the label to obtain a pseudo label of the projected target domain sample data;
when the cycle times do not reach the preset times, respectively updating the graph matrix, the category weight matrix and the Laplacian matrix of the target domain according to the label and the pseudo label to obtain an updated graph matrix, an updated category weight matrix and an updated Laplacian matrix of the target domain, and returning to update the projection matrix according to the graph matrix, the category weight matrix, the Laplacian matrix of the source domain and the Laplacian matrix of the target domain to obtain an updated projection matrix based on the updated graph matrix, the updated category weight matrix and the updated Laplacian matrix of the target domain;
when the cycle times reach the preset times, obtaining the classification result of each target domain sample image, wherein the classification result is the pseudo label;
wherein updating the projection matrix according to the graph matrix, the class weight matrix, the Laplacian matrix of the source domain and the Laplacian matrix of the target domain to obtain an updated projection matrix comprises: obtaining the updated projection matrix by the formula (XΩX^T + βI_m)P = XHX^T PΦ; β is a hyper-parameter;
wherein X = [X_s, X_t] ∈ R^{m×n}, X_s is the source domain data set, X_t is the target domain data set, P is the projection matrix, I_m ∈ R^{m×m} is the identity matrix of dimension m, R^{m×m} represents the real space of dimension m×m, R^{m×n} represents the real space of dimension m×n, and m and n respectively represent space dimensions;
Φ=diag(Φ1,Φ2,...,Φd)∈Rd×da diagonal matrix whose diagonal elements are lagrange multipliers;alpha, eta and gamma are all hyper-parameters; mcFor the said class weight matrix, the class weight matrix,
δ∈[0,1]is a pre-set abutment factor and is,is a rounded up symbol;representing sample pointsAnd sample pointThe same as the category label of (1);representing sample pointsThe category label of (a) is set,representing sample pointsThe category label of (1), wherein,is thatIs determined by the point of the neighborhood of the point,αi、αja weight coefficient representing each sample in the target domain; ztRepresenting the form of the target domain data in projection space, Zt=PTXt,zt,iRepresents ZtThe ith data in (1);
respectively a source domain sample point and a target domain sample point;the source domain sample points and the target domain sample points with the category of c;
Lsis the Laplace matrix of the source domain, LtA Laplace matrix for the target domain; g is the matrix of the graph, and is And is With a representation dimension of ns×ntReal number space of, nsAnd ntRepresenting a spatial dimension;
u(i)to compriseThe target domain sample point set of k nearest neighbors,k is the preset number of nearest neighbors, C ∈ {1, 2., C } is the number of categories common to the source domain and the target domain,andthe number of samples of the source domain and target domain class c respectively,respectively a sample point with the category c in the source domain and the target domain; n is ns+nt;
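The update equation for P is a generalized eigenvalue problem, so the updated projection matrix can be obtained from the eigenvectors of the matrix pair (XΩX^T + βI_m, XHX^T). The following is a minimal sketch of that step; the function name `update_projection`, the use of `scipy.linalg.eig`, and keeping the eigenvectors of the d smallest eigenvalues are assumptions not stated in the text:

```python
import numpy as np
from scipy.linalg import eig

def update_projection(X, Omega, H, beta=0.1, d=2):
    # Sketch of the projection-matrix update: the claimed equation
    # (X Omega X^T + beta I_m) P = X H X^T P Phi is a generalized
    # eigenproblem A p = phi B p with A = X Omega X^T + beta I_m and
    # B = X H X^T. Keeping the eigenvectors of the d smallest
    # eigenvalues is an assumption (the selection rule is not stated).
    m = X.shape[0]
    A = X @ Omega @ X.T + beta * np.eye(m)
    B = X @ H @ X.T
    vals, vecs = eig(A, B)             # generalized eigenpairs A v = phi B v
    order = np.argsort(np.real(vals))  # sort by eigenvalue, smallest first
    return np.real(vecs[:, order[:d]])
```

Here Ω and H stand for the composite matrices built from the graph matrix, class weight matrix and the two Laplacian matrices; their exact composition is given by the formulas of the original publication.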
2. The method of claim 1, wherein updating the graph matrix, the class weight matrix, and the target domain laplacian matrix according to the labels and the pseudo labels to obtain an updated graph matrix, an updated class weight matrix, and an updated target domain laplacian matrix respectively comprises:
calculating, by the corresponding formula, the updated graph matrix according to the label and the pseudo label;
calculating, by the corresponding formula, the updated class weight matrix according to the label and the pseudo label;
calculating, by the corresponding formula, the updated Laplacian matrix of the target domain according to the label and the pseudo label.
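The update formulas of claim 2 are not reproduced in this text, but the surrounding definitions (k nearest neighbors, the adjacency factor δ, the round-up symbol, and same-pseudo-label sample points) suggest a class-consistent k-NN graph over the projected target domain. The sketch below is an illustrative reading under those assumptions: `build_graph_matrix`, the ⌈δ·k⌉ threshold rule, and the unnormalized Laplacian L = D − G are guesses, not the patent's exact formulas:

```python
import numpy as np

def build_graph_matrix(Zt, pseudo_labels, k=5, delta=0.5):
    # Illustrative class-consistent k-NN graph on projected target data
    # (columns of Zt are samples, matching Z_t = P^T X_t). A sample keeps
    # its neighbours only when at least ceil(delta * k) of its k nearest
    # neighbours share its pseudo label -- one plausible reading of the
    # adjacency factor delta and the round-up symbol.
    nt = Zt.shape[1]
    sq = np.sum(Zt ** 2, axis=0)
    D = sq[:, None] + sq[None, :] - 2.0 * Zt.T @ Zt  # pairwise squared distances
    np.fill_diagonal(D, np.inf)                      # exclude self-matches
    need = int(np.ceil(delta * k))
    G = np.zeros((nt, nt))
    for i in range(nt):
        nn = np.argsort(D[i])[:k]                    # k nearest neighbours of i
        same = nn[pseudo_labels[nn] == pseudo_labels[i]]
        if len(same) >= need:
            G[i, same] = 1.0
    return G

def laplacian(G):
    # Standard unnormalised graph Laplacian L = D - G (the patent's exact
    # normalisation is not reproduced in the text).
    return np.diag(G.sum(axis=1)) - G
```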
3. The method of claim 1, wherein classifying the projected target domain sample data based on a second image classifier trained using the projected source domain sample data with the label to obtain a pseudo label for the projected target domain sample data comprises:
training an image classifier by using the projected source domain sample data with the label to obtain a second image classifier;
classifying the projected target domain sample data by using the second image classifier to obtain a first classification result of each projected target domain sample data, wherein the first classification result is the pseudo label.
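Claim 3 leaves the classifier type open; a 1-nearest-neighbor classifier is a common choice for this step in subspace-based domain adaptation and is assumed in this sketch (`update_pseudo_labels` is an illustrative name; samples are stored as columns, matching Z = P^T X above):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def update_pseudo_labels(Zs, ys, Zt):
    # Train a classifier on the projected, labelled source data and
    # classify the projected target data. The claims leave the classifier
    # type open; 1-nearest-neighbour is assumed here. Samples are stored
    # as columns (Z = P^T X), hence the transposes.
    clf = KNeighborsClassifier(n_neighbors=1)
    clf.fit(Zs.T, ys)
    return clf.predict(Zt.T)
```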
4. The method of claim 1, wherein obtaining an initialized pseudo label for each of the target domain sample images based on a first image classifier trained using the source domain dataset comprises:
training an image classifier by using the source domain data set to obtain the trained first image classifier;
classifying each target domain sample image by using the first image classifier to obtain a second classification result of each target domain sample image, wherein the second classification result is the initialized pseudo label.
5. A cross-domain image classification device based on class consistency structured learning, characterized by comprising:
an obtaining module, configured to obtain a source domain data set and a target domain data set, where the source domain data set includes a source domain sample image and a label of the source domain sample image, and the target domain data set includes a target domain sample image;
a pseudo label initialization module, configured to obtain an initialized pseudo label of each target domain sample image based on a first image classifier trained using the source domain data set;
the initialization module is used for projecting the source domain data set and the target domain data set to the same public space to obtain source domain sample points and target domain sample points, and initializing according to the source domain sample points and the target domain sample points of the same type based on the initialized pseudo labels and the labels to obtain a projection matrix, a graph matrix, a type weight matrix, a Laplace matrix of a source domain and a Laplace matrix of a target domain;
the projection matrix updating module is used for updating the projection matrix according to the graph matrix, the category weight matrix, the Laplace matrix of the source domain and the Laplace matrix of the target domain to obtain an updated projection matrix;
the projection module is used for performing projection learning on the source domain data set and the target domain data set by using the updated projection matrix to obtain projected source domain sample data and projected target domain sample data;
a pseudo label updating module, configured to classify the projected target domain sample data based on a second image classifier trained using the projected source domain sample data with the label, and obtain a pseudo label of the projected target domain sample data;
a circulation module configured to, when the number of iterations has not reached a preset number, respectively update the graph matrix, the class weight matrix and the Laplacian matrix of the target domain according to the label and the pseudo label to obtain an updated graph matrix, an updated class weight matrix and an updated Laplacian matrix of the target domain, and return, based on the updated graph matrix, the updated class weight matrix and the updated Laplacian matrix of the target domain, to the step of updating the projection matrix according to the graph matrix, the class weight matrix, the Laplacian matrix of the source domain and the Laplacian matrix of the target domain to obtain an updated projection matrix; and, when the number of iterations reaches the preset number, obtain the classification result of each target domain sample image, wherein the classification result is the pseudo label;
wherein the projection matrix updating module is specifically configured to:
obtain the updated projection matrix by the formula (XΩX^T + βI_m)P = XHX^T PΦ, where β is a hyperparameter;
wherein X = [X_s, X_t] ∈ R^(m×n), X_s is the source domain data set, X_t is the target domain data set, P is the projection matrix, I_m ∈ R^(m×m) is the identity matrix of dimension m, R^(m×m) denotes the real space of dimension m×m, R^(m×n) denotes the real space of dimension m×n, and m and n respectively denote the spatial dimensions; Φ = diag(Φ_1, Φ_2, ..., Φ_d) ∈ R^(d×d) is a diagonal matrix whose diagonal elements are Lagrange multipliers;
α, η and γ are all hyperparameters; M_c is the class weight matrix;
δ ∈ [0, 1] is a preset adjacency factor, and ⌈·⌉ is the round-up (ceiling) symbol; the graph indicator denotes that the class labels of the sample points x_{t,i} and x_{t,j} are the same; y_{t,i} denotes the class label of the sample point x_{t,i}, and y_{t,j} denotes the class label of the sample point x_{t,j}, where x_{t,j} is a neighborhood point of x_{t,i}; α_i and α_j denote the weight coefficients of the samples in the target domain; Z_t denotes the target domain data in the projection space, Z_t = P^T X_t, and z_{t,i} denotes the i-th datum in Z_t;
x_{s,i} and x_{t,j} are respectively a source domain sample point and a target domain sample point; x_s^c and x_t^c are respectively the source domain sample points and the target domain sample points of class c;
L_s is the Laplacian matrix of the source domain, and L_t is the Laplacian matrix of the target domain; G is the graph matrix, G ∈ R^(n_s×n_t), where R^(n_s×n_t) denotes the real space of dimension n_s×n_t, and n_s and n_t denote the spatial dimensions;
u(i) is the set of the k nearest-neighbor target domain sample points of x_{t,i}, k being the preset number of nearest neighbors; c ∈ {1, 2, ..., C}, where C is the number of categories common to the source domain and the target domain; n_s^c and n_t^c are respectively the numbers of samples of class c in the source domain and the target domain, and x_s^c and x_t^c are respectively sample points of class c in the source domain and the target domain; n = n_s + n_t;
6. The apparatus of claim 5, wherein the circulation module is specifically configured to:
calculate, by the corresponding formula, the updated graph matrix according to the label and the pseudo label;
calculate, by the corresponding formula, the updated class weight matrix according to the label and the pseudo label;
calculate, by the corresponding formula, the updated Laplacian matrix of the target domain according to the label and the pseudo label.
7. The apparatus of claim 5, wherein the pseudo tag update module is specifically configured to:
training an image classifier by using the projected source domain sample data with the label to obtain a second image classifier;
classifying the projected target domain sample data by using the second image classifier to obtain a first classification result of each projected target domain sample data, wherein the first classification result is the pseudo label.
8. The apparatus of claim 5, wherein the pseudo tag initialization module is specifically configured to:
training an image classifier by using the source domain data set to obtain the trained first image classifier;
classifying each target domain sample image by using the first image classifier to obtain a second classification result of each target domain sample image, wherein the second classification result is the initialized pseudo label.
9. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the method of any of claims 1 to 4 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111530728.1A CN113920382B (en) | 2021-12-15 | 2021-12-15 | Cross-domain image classification method based on class consistency structured learning and related device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113920382A CN113920382A (en) | 2022-01-11 |
CN113920382B true CN113920382B (en) | 2022-03-15 |
Family
ID=79248895
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111530728.1A Active CN113920382B (en) | 2021-12-15 | 2021-12-15 | Cross-domain image classification method based on class consistency structured learning and related device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113920382B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114219047B (en) * | 2022-02-18 | 2022-05-10 | 深圳大学 | Heterogeneous domain self-adaption method, device and equipment based on pseudo label screening |
CN117237857B (en) * | 2023-11-13 | 2024-02-09 | 腾讯科技(深圳)有限公司 | Video understanding task execution method and device, storage medium and electronic equipment |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112348081A (en) * | 2020-11-05 | 2021-02-09 | 平安科技(深圳)有限公司 | Transfer learning method for image classification, related device and storage medium |
CN113420775A (en) * | 2021-03-31 | 2021-09-21 | 中国矿业大学 | Image classification method under extremely small quantity of training samples based on adaptive subdomain field adaptation of non-linearity |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110320387A1 (en) * | 2010-06-28 | 2011-12-29 | International Business Machines Corporation | Graph-based transfer learning |
US10956817B2 (en) * | 2018-04-18 | 2021-03-23 | Element Ai Inc. | Unsupervised domain adaptation with similarity learning for images |
CN109299676A (en) * | 2018-09-07 | 2019-02-01 | 电子科技大学 | A kind of visual pursuit method of combining classification and domain adaptation |
KR20200075344A (en) * | 2018-12-18 | 2020-06-26 | 삼성전자주식회사 | Detector, method of object detection, learning apparatus, and learning method for domain transformation |
CN110348579B (en) * | 2019-05-28 | 2023-08-29 | 北京理工大学 | Domain self-adaptive migration feature method and system |
CN111340021B (en) * | 2020-02-20 | 2022-07-15 | 中国科学技术大学 | Unsupervised domain adaptive target detection method based on center alignment and relation significance |
CN111783831B (en) * | 2020-05-29 | 2022-08-05 | 河海大学 | Complex image accurate classification method based on multi-source multi-label shared subspace learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||