CN114528913A - Model migration method, device, equipment and medium based on trust and consistency

Info

Publication number: CN114528913A
Application number: CN202210023290.6A
Authority: CN (China)
Prior art keywords: model, source domain, sample, trust, consistency
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 陈辉 (Chen Hui), 丁贵广 (Ding Guiguang)
Original assignee: Tsinghua University
Current assignee: Tsinghua University
Application filed by Tsinghua University

Classifications

    • G06F18/2415 Classification techniques relating to the classification model based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F18/214 Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06N3/045 Neural network architectures; combinations of networks
    • G06N3/088 Learning methods; non-supervised learning, e.g. competitive learning


Abstract

The application relates to a model migration method, apparatus, device, and medium based on trust and consistency. The method includes the following steps: based on preset labeled source domain data, using a convolutional neural network as a feature extractor to extract features of source domain images; performing label prediction with a source domain classification layer and performing training optimization with a cross-entropy loss function to obtain a pre-trained source domain model; and, based on the pre-trained source domain model and unlabeled target domain data, performing model adaptive learning with a dual classification network, optimized by a trust- and consistency-based mechanism, to obtain an adaptively learned source domain model. This solves the problem of domain adaptation when source domain data are missing: given only a source domain model and unlabeled target domain data, target domain adaptive learning is performed through the model migration method, realizing unsupervised learning and significantly improving the adaptability of the model.

Description

Model migration method, device, equipment and medium based on trust and consistency
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a model migration method, apparatus, device, and medium based on trust and consistency.
Background
Deep-learning-based artificial intelligence technology often suffers performance degradation in practical applications because of information deviation: the target data encountered in deployment and the source domain data used for model training are not identically distributed. To address this problem, academia and industry have studied Unsupervised Domain Adaptation (UDA) methods, which aim to transfer the knowledge of source domain data to the learning process on target domain data. Methods in the related art basically assume that, during domain-adaptive learning, the model can access a large amount of labeled source domain data, so the labeled source domain data and the unlabeled target domain data are used for model training simultaneously, and the resulting model is tested on the unlabeled target domain data.
Unsupervised domain adaptation methods in the related art can be roughly divided into three categories: discrepancy-based, reconstruction-based, and adversarial. Discrepancy-based methods optimize a measurement function for the discrepancy between the source and target data distributions; popular examples include Maximum Mean Discrepancy (MMD), higher-order central moment discrepancy, contrastive domain discrepancy, and Wasserstein metrics. Reconstruction-based methods typically introduce an auxiliary reconstruction task to learn a representation shared by the two domains; domain-specific reconstruction and cycle consistency have further been proposed to improve adaptation performance. Adversarial methods employ generative adversarial networks to optimize the distance between the different data distributions.
While these methods work well, they all assume that the labeled source data can be accessed during domain adaptation. In practical applications, however, this assumption cannot always be satisfied because of data privacy or resource constraints. Much industrial data, such as medical diagnostic records, product defect records, and user behavior information, is typically kept for internal use only. Furthermore, on terminal devices (e.g., cameras), storage and computing resources are often very limited, which also makes large-scale data access and model training very difficult.
That is, transferability and adaptability are the two key issues that domain adaptation needs to solve. Transferability requires that the trained model transfer the knowledge of the source domain data as fully as possible during learning on the target domain; adaptability aims to let the model perceive the specific information distribution of the target domain data and flexibly adjust its parameters to meet that learning objective. However, most existing unsupervised domain adaptation methods that do not depend on source domain data aim only to reduce the distance between the source and target information distributions, thereby enhancing the transferability of the model. In practical applications, target domain data and source domain data differ markedly in texture, color, and even background; merely reducing the distance between the target and source domains weakens the model's ability to learn target-specific information and reduces its adaptability.
Disclosure of Invention
The application provides a model migration method, apparatus, device, and medium based on trust and consistency, aiming to solve the problem of domain adaptation when source domain data are missing: given a source domain model and unlabeled target domain data, target domain adaptive learning is performed through the model migration method, realizing unsupervised learning and significantly improving the adaptability of the model.
An embodiment of a first aspect of the present application provides a model migration method based on trust and consistency, including the following steps:
based on preset labeled source domain data, using a convolutional neural network as a feature extractor to extract features of source domain images;
performing label prediction with a source domain classification layer, wherein the source domain classification layer consists of a fully connected layer and a weight normalization layer, and performing training optimization with a cross-entropy loss function to obtain a pre-trained source domain model; and
based on the pre-trained source domain model and unlabeled target domain data, performing model adaptive learning with a dual classification network, and performing training optimization with a trust- and consistency-based mechanism to obtain an adaptively learned source domain model.
Optionally, performing training optimization with the trust- and consistency-based mechanism to obtain the adaptively learned source domain model includes:
when performing model migration on the target domain data, inputting samples from the target domain data into the model to obtain probability distributions;
selecting the label with the maximum probability as the pseudo label of the corresponding sample, and measuring the model's trust in the pseudo label with entropy;
ranking all samples in the target domain data by entropy from small to large to obtain trusted samples, and training the network with the trusted samples and their corresponding pseudo labels to obtain the adaptively learned source domain model.
Optionally, before performing model adaptive learning with the dual classification network, the method further includes:
constructing the dual classification network, wherein the dual classification network comprises a feature extractor and a dual classification head; during training, the parameters of the source classifier are fixed, while the feature extractor and the target classifier are updated by stochastic gradient descent.
Optionally, performing training optimization with the trust- and consistency-based mechanism includes:
extracting features of unlabeled samples in the target domain data, and obtaining a first distribution prediction result and a second distribution prediction result from a preset first classifier and a preset second classifier;
calculating the information entropy of the model's predicted distributions based on the first and second distribution prediction results to obtain the confidence of the model's predictions.
Optionally, performing training optimization with the trust- and consistency-based mechanism includes:
randomly rotating a trusted sample by a preset angle to obtain a new trusted sample;
inputting the trusted sample and the new trusted sample into the dual classification network to obtain their respective features and predicted distribution results;
using preset loss functions to keep the features and probability distributions of the trusted sample and the new trusted sample consistent;
predicting the rotation angle of the new trusted sample relative to the trusted sample with a preset classification layer, and calculating the rotation-angle prediction loss.
An embodiment of a second aspect of the present application provides a model migration apparatus based on trust and consistency, including:
an extraction module, configured to use a convolutional neural network as a feature extractor to extract features of source domain images based on preset labeled source domain data;
an optimization module, configured to perform label prediction with a source domain classification layer, wherein the source domain classification layer consists of a fully connected layer and a weight normalization layer, and to perform training optimization with a cross-entropy loss function to obtain a pre-trained source domain model; and
an acquisition module, configured to perform model adaptive learning with a dual classification network based on the pre-trained source domain model and unlabeled target domain data, and to perform training optimization with a trust- and consistency-based mechanism to obtain an adaptively learned source domain model.
Optionally, the optimization module is specifically configured to:
when performing model migration on the target domain data, input samples from the target domain data into the model to obtain probability distributions;
select the label with the maximum probability as the pseudo label of the corresponding sample, and measure the model's trust in the pseudo label with entropy;
rank all samples in the target domain data by entropy from small to large to obtain trusted samples, and train the network with the trusted samples and their corresponding pseudo labels to obtain the adaptively learned source domain model.
Optionally, before performing model adaptive learning with the dual classification network, the acquisition module is further configured to:
construct the dual classification network, wherein the dual classification network comprises a feature extractor and a dual classification head; during training, the parameters of the source classifier are fixed, while the feature extractor and the target classifier are updated by stochastic gradient descent.
Optionally, the optimization module is specifically configured to:
extract features of unlabeled samples in the target domain data, and obtain a first distribution prediction result and a second distribution prediction result from a preset first classifier and a preset second classifier;
calculate the information entropy of the model's predicted distributions based on the first and second distribution prediction results to obtain the confidence of the model's predictions.
Optionally, the optimization module is specifically configured to:
randomly rotate a trusted sample by a preset angle to obtain a new trusted sample;
input the trusted sample and the new trusted sample into the dual classification network to obtain their respective features and predicted distribution results;
use preset loss functions to keep the features and probability distributions of the trusted sample and the new trusted sample consistent;
predict the rotation angle of the new trusted sample relative to the trusted sample with a preset classification layer, and calculate the rotation-angle prediction loss.
An embodiment of a third aspect of the present application provides an electronic device, including: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor executing the program to implement the trust and consistency based model migration method as described in the above embodiments.
A fourth aspect of the present application provides a computer-readable storage medium storing computer instructions for causing a computer to perform the trust and consistency-based model migration method according to the above embodiments.
Therefore, in the method, based on preset labeled source domain data, a convolutional neural network is used as a feature extractor to extract features of source domain images; a source domain classification layer is used for label prediction, and a cross-entropy loss function is used for training optimization to obtain a pre-trained source domain model; then, based on the pre-trained source domain model and unlabeled target domain data, a dual classification network is used for model adaptive learning, and a trust and consistency mechanism is used for training optimization to obtain an adaptively learned source domain model. This solves the problem of domain adaptation when source domain data are missing: given only a source domain model and unlabeled target domain data, target domain adaptive learning is performed through the model migration method, realizing unsupervised learning and significantly improving the adaptability of the model.
Additional aspects and advantages of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a flowchart of a trust and consistency-based model migration method according to an embodiment of the present application;
FIG. 2 is an exemplary diagram of a dual classification network framework according to one embodiment of the present application;
FIG. 3 is a graphical illustration of a comparison of performance on a VisDA dataset according to one embodiment of the present application;
FIG. 4 is an exemplary diagram of a trust and consistency based model migration apparatus according to an embodiment of the present application;
fig. 5 is an exemplary diagram of an electronic device according to an embodiment of the application.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary and intended to be used for explaining the present application and should not be construed as limiting the present application.
The trust and consistency-based model migration method, apparatus, device, and medium according to embodiments of the present application are described below with reference to the accompanying drawings. In the method, based on preset labeled source domain data, a convolutional neural network is used as a feature extractor to extract features of source domain images; a source domain classification layer is used for label prediction, and a cross-entropy loss function is used for training optimization to obtain a pre-trained source domain model; then, based on the pre-trained source domain model and unlabeled target domain data, a dual classification network is used for model adaptive learning, and a trust and consistency mechanism is used for training optimization to obtain an adaptively learned source domain model. This solves the problem of domain adaptation when source domain data are missing: given only a source domain model and unlabeled target domain data, target domain adaptive learning is performed through the model migration method, realizing unsupervised learning and significantly improving the adaptability of the model.
Specifically, fig. 1 is a schematic flowchart of a trust and consistency-based model migration method according to an embodiment of the present application.
As shown in FIG. 1, the model migration method based on trust and consistency comprises the following steps:
in step S101, based on the preset labeled source domain data, a convolutional neural network is used as a feature extractor to extract features of the source domain image.
In step S102, a source domain classification layer is used for label prediction, wherein the source domain classification layer consists of a fully connected layer and a weight normalization layer, and a cross-entropy loss function is used for training optimization to obtain a pre-trained source domain model.
In particular, the embodiments of the present application are given a labeled source domain dataset $\mathcal{D}_s$ and a label set $Y$; each sample $x_s \in \mathcal{D}_s$ has a label $y_s \in Y$. First, a neural network model $\theta_s = \{f_s, h_s\}$ is constructed, where $f_s$ is a feature extractor whose parameters are initialized with a convolutional neural network pre-trained on ImageNet, and $h_s$ is a classifier initialized randomly. During training, $f_s$ is used to extract the features of $x_s$, which $h_s$ then classifies; the output of the classifier is $p_s(x_s) = h_s(f_s(x_s))$. Finally, the model $\theta_s$ is trained with a cross-entropy loss:

$$\mathcal{L}_{src} = -\mathbb{E}_{(x_s, y_s)} \log p_s^{y_s}(x_s)$$

where $p_s^{y_s}(x_s)$ denotes the probability $P(y_s \mid x_s)$ that sample $x_s$ has label $y_s$, i.e., the probability value assigned to $y_s$ in the classifier output $p_s(x_s)$, and $\mathbb{E}$ is the expectation.
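As a minimal sketch of this pre-training stage (PyTorch is assumed; class and attribute names such as `SourceModel` and `backbone` are illustrative, not taken from the patent):

```python
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

class SourceModel(nn.Module):
    """theta_s = {f_s, h_s}: ImageNet-initialized feature extractor plus classifier."""
    def __init__(self, num_classes: int):
        super().__init__()
        resnet = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
        self.backbone = nn.Sequential(*list(resnet.children())[:-1], nn.Flatten())  # f_s
        # h_s: fully connected layer with weight normalization, randomly initialized.
        self.classifier = nn.utils.weight_norm(nn.Linear(2048, num_classes))

    def forward(self, x):
        return self.classifier(self.backbone(x))   # p_s(x_s) = h_s(f_s(x_s))

def pretrain_step(model, optimizer, x_s, y_s):
    """One cross-entropy training step (L_src) on a batch of labeled source data."""
    loss = F.cross_entropy(model(x_s), y_s)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```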
In step S103, based on the pre-trained source domain model and the unlabeled target domain data, model adaptive learning is performed using a dual classification network, and training optimization is performed using a trust- and consistency-based mechanism to obtain an adaptively learned source domain model.
As shown in fig. 2, the dual classification network may adopt a convolutional neural network as the feature extractor and then classify the extracted features with a dual classification head. The dual classification head contains two different classification layers, each consisting of a fully connected layer and a weight normalization layer. When the dual classification network is initialized, its feature extractor parameters are initialized with the feature extractor parameters of the source domain model; one classification layer of the dual classification head is initialized with the source domain classification layer parameters of the source domain model and is called the source classifier, while the other classification layer is initialized with random floating-point values and is called the target classifier. During model migration on the target domain data, the parameters of the source classifier are not updated, while the parameters of the target classifier are updated by stochastic gradient descent.
Optionally, in some embodiments, before performing model adaptive learning using the dual classification network, the method further includes: constructing the dual classification network, wherein the dual classification network comprises a feature extractor and a dual classification head; during training, the parameters of the source classifier are fixed, while the feature extractor and the target classifier are updated by stochastic gradient descent.
It should be understood that the dual classification network $\theta_t$ constructed by the embodiments of the present application comprises a feature extractor $f_t$ and a dual classification head. The parameters of the feature extractor are initialized with the feature extractor of the model pre-trained on the source domain, i.e., $f_t = f_s$. The dual classification head comprises two classification layers, a source domain classification layer $h_s'$ and a target domain classification layer $h_t$, where the parameters of the source domain classification layer are initialized with the classification layer of the model pre-trained on the source domain, i.e., $h_s' = h_s$, and the target domain classification layer $h_t$ is initialized with random values. The resulting dual classification network is $\theta_t = \{f_t, h_s, h_t\}$. During training, the parameters of the source classifier $h_s$ are fixed, while the feature extractor $f_t$ and the target classifier $h_t$ are updated by stochastic gradient descent.
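A hedged sketch of this dual classification network, reusing the `SourceModel` from the previous example (names are again illustrative assumptions):

```python
import torch.nn as nn

class DualClassificationNetwork(nn.Module):
    """f_t plus a dual classification head {h_s, h_t}, initialized from a source model."""
    def __init__(self, source_model, num_classes: int, feat_dim: int = 2048):
        super().__init__()
        # f_t is initialized from the source feature extractor f_s.
        self.backbone = source_model.backbone
        # h_s: copied from the source classifier; frozen during adaptation.
        self.source_head = source_model.classifier
        for p in self.source_head.parameters():
            p.requires_grad = False
        # h_t: randomly initialized target classifier, updated by SGD.
        self.target_head = nn.utils.weight_norm(nn.Linear(feat_dim, num_classes))

    def forward(self, x):
        g = self.backbone(x)                       # features f_t(x)
        return g, self.source_head(g), self.target_head(g)
```

Only the `backbone` and `target_head` parameters would then be passed to the optimizer, matching the description above.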
Optionally, in some embodiments, performing training optimization with the trust- and consistency-based mechanism to obtain the adaptively learned source domain model includes: when performing model migration on the target domain data, inputting samples from the target domain data into the model to obtain probability distributions; selecting the label with the maximum probability as the pseudo label of the corresponding sample, and measuring the model's trust in the pseudo label with entropy; and ranking all samples in the target domain data by entropy from small to large to obtain trusted samples, and training the network with these trusted samples and their pseudo labels to obtain the adaptively learned source domain model.
Optionally, in some embodiments, performing training optimization with the trust- and consistency-based mechanism includes: extracting features of unlabeled samples in the target domain data and obtaining a first distribution prediction result and a second distribution prediction result from a preset first classifier and a preset second classifier; and calculating the information entropy of the model's predicted distributions based on the two prediction results to obtain the confidence of the model's predictions.
Specifically, during training, for an unlabeled target domain sample $x_t \in \mathcal{D}_t$, the embodiments of the present application use the feature extractor $f_t$ to extract the features of $x_t$ and then feed them into the two classifiers $h_s$ and $h_t$, obtaining two predicted distributions, i.e., $p_s(x_t) = h_s(f_t(x_t))$ and $p_t(x_t) = h_t(f_t(x_t))$. Next, the information entropy of each predicted distribution is calculated to represent the confidence of the model's prediction:

$$H_s(x_t) = -\sum_{y_t \in Y} p_s^{y_t}(x_t) \log p_s^{y_t}(x_t)$$

$$H_t(x_t) = -\sum_{y_t \in Y} p_t^{y_t}(x_t) \log p_t^{y_t}(x_t)$$

where $p_s^{y_t}(x_t)$ denotes the probability $P(y_t \mid x_t)$ that sample $x_t$ has label $y_t$, i.e., the probability value assigned to $y_t$ in the output $p_s(x_t)$ of classifier $h_s$; similarly, $p_t^{y_t}(x_t)$ denotes the same probability taken from the output $p_t(x_t)$ of classifier $h_t$.
Further, for any sample $x_t \in \mathcal{D}_t$, the embodiments of the present application obtain its prediction confidence, here computed with the target classifier $h_t$, i.e., $H_t(x_t)$. The samples are then ranked by this entropy from small to large, and the samples ranked in the top r% are selected to form the trusted sample set $\mathcal{D}_c$. For any trusted sample $\hat{x}_t \in \mathcal{D}_c$, the outputs of the model are $p_s(\hat{x}_t)$ and $p_t(\hat{x}_t)$, and the label with the maximum probability value is used as its pseudo label $y_t'$, i.e.:

$$y_t' = \arg\max_{y} \; p_t^{y}(\hat{x}_t)$$

$p_s(\hat{x}_t)$ may also be used here; experimental results show that the difference between the two is not great.
Finally, the embodiments of the present application may use only the samples $\hat{x}_t$ in the trusted sample set $\mathcal{D}_c$ and the resulting pseudo labels $y_t'$ to calculate the loss values of the two classifiers:

$$\mathcal{L}_s = -\mathbb{E}_{\hat{x}_t \in \mathcal{D}_c} \log p_s^{y_t'}(\hat{x}_t)$$

$$\mathcal{L}_t = -\mathbb{E}_{\hat{x}_t \in \mathcal{D}_c} \log p_t^{y_t'}(\hat{x}_t)$$

where $p_s^{y_t'}(\hat{x}_t)$ denotes the probability that sample $\hat{x}_t$ has label $y_t'$, i.e., the probability value assigned to $y_t'$ in the output of classifier $h_s$; similarly, $p_t^{y_t'}(\hat{x}_t)$ denotes the same probability taken from the output of classifier $h_t$. $\mathbb{E}$ is the expectation.
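The trusted-sample selection and the pseudo-label losses can be sketched as follows, assuming the `DualClassificationNetwork` above and a keep ratio `r` (r = 0.8 mirrors the top-80% choice mentioned later in the text); the function names are illustrative:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def select_trusted(net, loader, r: float = 0.8, device: str = "cuda"):
    """Rank target samples by the entropy H_t of the target classifier's prediction
    (smaller entropy = more trusted), keep the top r fraction, assign argmax pseudo labels."""
    net.eval()
    xs, ents, labels = [], [], []
    for x_t in loader:                                   # unlabeled target batches
        x_t = x_t.to(device)
        _, _, logits_t = net(x_t)
        p_t = F.softmax(logits_t, dim=1)
        ents.append(-(p_t * torch.log(p_t + 1e-8)).sum(dim=1).cpu())  # H_t(x_t)
        labels.append(p_t.argmax(dim=1).cpu())           # pseudo label y_t'
        xs.append(x_t.cpu())
    xs, ents, labels = torch.cat(xs), torch.cat(ents), torch.cat(labels)
    keep = ents.argsort()[: int(r * len(ents))]          # smallest entropy first
    return xs[keep], labels[keep]

def classifier_losses(net, x_hat, y_pseudo):
    """Cross-entropy losses L_s and L_t on a batch of trusted samples."""
    _, logits_s, logits_t = net(x_hat)
    return F.cross_entropy(logits_s, y_pseudo), F.cross_entropy(logits_t, y_pseudo)
```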
Optionally, in some embodiments, performing training optimization with the trust- and consistency-based mechanism includes: randomly rotating a trusted sample by a preset angle to obtain a new trusted sample; inputting the trusted sample and the new trusted sample into the dual classification network to obtain their respective features and predicted distributions; using preset loss functions to keep the features and probability distributions of the trusted sample and the new trusted sample consistent; and predicting the rotation angle of the new trusted sample relative to the trusted sample with a preset classification layer, and calculating the rotation-angle prediction loss.
Specifically, the embodiments of the present application may employ a self-supervised approach to enhance feature learning. For a given trusted sample $\hat{x}_t$, another version $\tilde{x}_t$ is obtained by randomly rotating it by a certain angle, which may be any one of 0, 90, 180, and 270 degrees. Then, $\hat{x}_t$ and $\tilde{x}_t$ are input into the dual classification network to obtain their respective features $\hat{g}_t = f_t(\hat{x}_t)$ and $\tilde{g}_t = f_t(\tilde{x}_t)$ and predicted distributions $p_s(\hat{x}_t)$, $p_t(\hat{x}_t)$, $p_s(\tilde{x}_t)$, and $p_t(\tilde{x}_t)$. Next, the following loss functions are used to force the features extracted by the model and the probability distributions to remain consistent between $\hat{x}_t$ and $\tilde{x}_t$:

$$\mathcal{L}_{fc} = -\mathbb{E}_{\hat{x}_t}\log\frac{\exp(\hat{g}_t^{\top}\tilde{g}_t)}{\sum_{j}\exp(\hat{g}_t^{\top}\tilde{g}_j)}$$

$$\mathcal{L}_{pc} = -\mathbb{E}_{\hat{x}_t}\left[\log p_s^{y_t'}(\tilde{x}_t) + \log p_t^{y_t'}(\tilde{x}_t)\right]$$

where, in the contrastive loss $\mathcal{L}_{fc}$, the rotated version $\tilde{g}_t$ of the same picture serves as the positive example of $\hat{g}_t$, while the rotated versions $\tilde{g}_j$ of other pictures serve as negative examples. In the consistency loss $\mathcal{L}_{pc}$, $p_s^{y_t'}(\tilde{x}_t)$ denotes the probability that the rotated sample $\tilde{x}_t$ has label $y_t'$, i.e., the probability value assigned to $y_t'$ in the output of classifier $h_s$; similarly, $p_t^{y_t'}(\tilde{x}_t)$ is the corresponding probability from the output of classifier $h_t$. $\mathbb{E}$ is the expectation.
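A sketch of these two consistency losses under the same assumptions (the feature normalization and batch-negative choice are illustrative design decisions, not confirmed details of the patent):

```python
import torch
import torch.nn.functional as F

def rotate_batch(x):
    """Rotate each image in a BCHW batch by a random multiple of 90 degrees."""
    k = torch.randint(0, 4, (x.size(0),))
    x_rot = torch.stack([torch.rot90(img, int(ki), dims=(1, 2)) for img, ki in zip(x, k)])
    return x_rot, k                                      # k in {0,1,2,3} is the angle id

def consistency_losses(net, x_hat, y_pseudo):
    """Contrastive feature consistency plus pseudo-label prediction consistency."""
    x_rot, _ = rotate_batch(x_hat)
    g_hat, _, _ = net(x_hat)
    g_rot, logits_s_rot, logits_t_rot = net(x_rot)
    # Contrastive: the rotated view of the same image is the positive,
    # rotated views of the other images in the batch are negatives.
    sim = F.normalize(g_hat, dim=1) @ F.normalize(g_rot, dim=1).t()
    targets = torch.arange(x_hat.size(0), device=x_hat.device)
    loss_feat = F.cross_entropy(sim, targets)            # L_fc
    # Prediction consistency: the rotated view keeps the same pseudo label.
    loss_prob = F.cross_entropy(logits_s_rot, y_pseudo) + \
                F.cross_entropy(logits_t_rot, y_pseudo)  # L_pc
    return loss_feat, loss_prob
```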
In addition, the embodiments of the present application may use an additional classification layer $h_r$ to predict the rotation angle of $\tilde{x}_t$ relative to $\hat{x}_t$. The predicted distribution of the classification layer $h_r$ is:

$$p_r = h_r([\hat{g}_t, \tilde{g}_t])$$

where $[a, b]$ indicates connecting the two vectors together to form a new vector. The relative rotation angle prediction loss is then calculated:

$$\mathcal{L}_{rot} = -\mathbb{E}_{\hat{x}_t} \log p_r^{y_r}([\hat{g}_t, \tilde{g}_t])$$

where $y_r$ is the relative rotation angle, taking one of the values 0 to 3, corresponding to 0, 90, 180, and 270 degrees, respectively, and $p_r^{y_r}([\hat{g}_t, \tilde{g}_t])$ is the probability value assigned to $y_r$ in the output of classifier $h_r$.
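A minimal sketch of the rotation-prediction head $h_r$, with the feature dimension again an assumption:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RotationHead(nn.Module):
    """h_r: predicts the relative rotation (4 classes) from concatenated features."""
    def __init__(self, feat_dim: int = 2048):
        super().__init__()
        self.fc = nn.Linear(2 * feat_dim, 4)

    def forward(self, g_hat, g_rot):
        # p_r = h_r([g_hat, g_rot]): concatenation followed by a linear layer.
        return self.fc(torch.cat([g_hat, g_rot], dim=1))

def rotation_loss(head, g_hat, g_rot, y_rot):
    """Cross-entropy between the predicted and actual relative rotation id (0-3)."""
    return F.cross_entropy(head(g_hat, g_rot), y_rot)
```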
Further, in order to encourage the model to output a confident probability distribution, for any sample $x_t \in \mathcal{D}_t$ an information entropy loss function is calculated:

$$\mathcal{L}_{ent} = \mathbb{E}_{x_t}\left[H_s(x_t) + H_t(x_t)\right]$$

with $H_s(x_t) = -\mathrm{sum}\left(p_s(x_t)\log p_s(x_t)\right)$ and $H_t(x_t) = -\mathrm{sum}\left(p_t(x_t)\log p_t(x_t)\right)$, where the sum function denotes the addition of all elements of a vector.
Finally, all of the above loss functions are integrated, and balance factors are introduced to adjust the gradient contribution of each loss:

$$\mathcal{L} = \mathcal{L}_s + \mathcal{L}_t + \lambda_{fc}\mathcal{L}_{fc} + \lambda_{pc}\mathcal{L}_{pc} + \lambda_{rot}\mathcal{L}_{rot} + \lambda_{ent}\mathcal{L}_{ent}$$

where $\lambda_{fc}$, $\lambda_{pc}$, $\lambda_{rot}$, and $\lambda_{ent}$ are the balance factors.
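Putting the pieces together, one adaptation step over a batch of trusted samples might look as follows; the helper functions are the sketches above, and the balance factor values are assumptions:

```python
import torch.nn.functional as F

def adaptation_step(net, rot_head, optimizer, x_hat, y_pseudo,
                    lambdas=(1.0, 1.0, 1.0, 1.0)):
    """One optimization step combining all losses, reusing the helpers sketched above."""
    l_fc, l_pc, l_rot, l_ent = lambdas
    loss_s, loss_t = classifier_losses(net, x_hat, y_pseudo)
    loss_fc, loss_pc = consistency_losses(net, x_hat, y_pseudo)
    x_rot, y_rot = rotate_batch(x_hat)                   # a fresh random rotation
    g_hat, logits_s, logits_t = net(x_hat)
    g_rot, _, _ = net(x_rot)
    loss_rot = rotation_loss(rot_head, g_hat, g_rot, y_rot.to(x_hat.device))
    # Entropy term (here over the current batch; the text computes it on any sample x_t).
    ent = -(F.softmax(logits_s, 1) * F.log_softmax(logits_s, 1)).sum(1).mean() \
          - (F.softmax(logits_t, 1) * F.log_softmax(logits_t, 1)).sum(1).mean()
    loss = loss_s + loss_t + l_fc * loss_fc + l_pc * loss_pc \
           + l_rot * loss_rot + l_ent * ent
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```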
in summary, the model migration method based on trust and consistency in the embodiment of the present application adopts a dual classification network, and introduces a trust and consistency mechanism to train and optimize the dual classification network, wherein a basic framework of the model migration method includes two stages, namely model pre-training on a source domain and model adaptive learning on a target domain, wherein:
model pre-training on the source domain is: giving labeled source domain data, using a convolutional neural network as a feature extractor, providing features of a source domain image, and then using a source domain classification layer for label prediction, wherein the source domain classification layer consists of a full connection layer and a weight specification layer; and (4) performing training optimization by using a cross entropy loss function to obtain a source domain model.
Model adaptive learning on the target domain is: given a pre-trained source domain model and unlabeled target domain data, model adaptive learning is performed using a dual classification network, and training optimization is performed using a trust and consistency-based mechanism.
The trust-based optimization mechanism specifically comprises the following steps: when model migration is performed on target domain data, firstly, a target domain sample is input into a model to obtain probability distribution, then a label with the maximum probability is selected as a pseudo label of the sample, in addition, the trust degree of the model on the pseudo label is measured by using entropy, and the smaller the entropy is, the more the model trusts the label. We rank all target samples from small to large according to entropy, choose the top 80% of the samples as trustworthy samples, and then train the network with these samples and their pseudo-labels.
The optimization mechanism based on consistency is specifically as follows: for each picture sample on the target domain, we randomly rotate an angle, which can be any number of 0 degrees, 90 degrees, 180 degrees and 270 degrees, to get another picture. The two pictures present the same information from different angles, but the features and probability distributions obtained after the two pictures pass through the dual classification network are different, so that the embodiment of the application uses contrast loss to approximate the feature distance of the two pictures and uses cross entropy loss to approximate the probability distributions of the two pictures, thereby ensuring that the two pictures keep consistency on the features and the prediction results.
Therefore, the model migration method based on trust and consistency in the embodiment of the application can enhance the migration and the adaptability of the model, and is greatly helpful for the feature representation learning of the sample, as can be seen from fig. 3, the embodiment of the application can significantly improve the performance of unsupervised field self-adaptation irrelevant to source domain data, and on a common standard evaluation set of VisDA, compared with a leading method SHOT, the performance can be improved by 2.0% under the same condition, even if compared with the SHOT + + using an additional training method, the application also has the improvement of 0.2%, and the effectiveness is fully proved.
According to the model migration method based on trust and consistency, provided by the embodiment of the application, the characteristics of the source domain image can be provided by using a convolutional neural network as a characteristic extractor based on preset labeled source domain data, and label prediction is performed by using a source domain classification layer, wherein the source domain classification layer consists of a full connection layer and a weight specification layer, training optimization is performed by using a cross entropy loss function to obtain a pre-trained source domain model, model self-adaptive learning is performed by using a dual classification network based on the pre-trained source domain model and unlabeled target domain data, and training optimization is performed by using a mechanism based on trust and consistency to obtain a self-adaptively learned source domain model. Therefore, the problem of field self-adaption under the condition of source domain data missing is solved, namely, the source domain model and the target domain data without labels are given, and the target domain self-adaption learning is carried out through a model migration method, so that the unsupervised learning is realized, and the self-adaption capability of the model is obviously improved.
Next, a trust and consistency-based model migration apparatus proposed according to an embodiment of the present application is described with reference to the drawings.
FIG. 4 is a block diagram of a trust and consistency based model migration apparatus according to an embodiment of the present application.
As shown in fig. 4, the trust and consistency-based model migration apparatus 10 includes: an extraction module 100, an optimization module 200 and an acquisition module 300.
The extraction module 100 is configured to use a convolutional neural network as a feature extractor to extract features of source domain images based on preset labeled source domain data;
the optimization module 200 is configured to perform label prediction with a source domain classification layer, wherein the source domain classification layer consists of a fully connected layer and a weight normalization layer, and to perform training optimization with a cross-entropy loss function to obtain a pre-trained source domain model; and
the acquisition module 300 is configured to perform model adaptive learning with a dual classification network based on the pre-trained source domain model and unlabeled target domain data, and to perform training optimization with a trust- and consistency-based mechanism to obtain an adaptively learned source domain model.
Optionally, the optimization module 200 is specifically configured to:
when performing model migration on the target domain data, input samples from the target domain data into the model to obtain probability distributions;
select the label with the maximum probability as the pseudo label of the corresponding sample, and measure the model's trust in the pseudo label with entropy;
rank all samples in the target domain data by entropy from small to large to obtain trusted samples, and train the network with the trusted samples and their corresponding pseudo labels to obtain the adaptively learned source domain model.
Optionally, before performing model adaptive learning with the dual classification network, the acquisition module 300 is further configured to:
construct the dual classification network, wherein the dual classification network comprises a feature extractor and a dual classification head; during training, the parameters of the source classifier are fixed, while the feature extractor and the target classifier are updated by stochastic gradient descent.
Optionally, the optimization module 200 is specifically configured to:
extract features of unlabeled samples in the target domain data, and obtain a first distribution prediction result and a second distribution prediction result from a preset first classifier and a preset second classifier;
calculate the information entropy of the model's predicted distributions based on the first and second distribution prediction results to obtain the confidence of the model's predictions.
Optionally, the optimization module 200 is specifically configured to:
randomly rotate a trusted sample by a preset angle to obtain a new trusted sample;
input the trusted sample and the new trusted sample into the dual classification network to obtain their respective features and predicted distribution results;
use preset loss functions to keep the features and probability distributions of the trusted sample and the new trusted sample consistent;
predict the rotation angle of the new trusted sample relative to the trusted sample with a preset classification layer, and calculate the rotation-angle prediction loss.
It should be noted that the foregoing explanation of the embodiment of the trust and consistency-based model migration method also applies to the trust and consistency-based model migration apparatus of this embodiment, and details are not repeated here.
According to the model migration apparatus based on trust and consistency provided by the embodiments of the present application, based on preset labeled source domain data, a convolutional neural network can be used as a feature extractor to extract features of source domain images, and label prediction can be performed with a source domain classification layer consisting of a fully connected layer and a weight normalization layer; training optimization with a cross-entropy loss function yields a pre-trained source domain model. Then, based on the pre-trained source domain model and unlabeled target domain data, model adaptive learning is performed with a dual classification network, and training optimization is performed with the trust- and consistency-based mechanism to obtain an adaptively learned source domain model. This solves the problem of domain adaptation when source domain data are missing: given only a source domain model and unlabeled target domain data, target domain adaptive learning is performed through the model migration method, realizing unsupervised learning and significantly improving the adaptability of the model.
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application. The electronic device may include:
memory 501, processor 502, and computer programs stored on memory 501 and executable on processor 502.
The processor 502, when executing a program, implements the trust and consistency based model migration method provided in the embodiments described above.
Further, the electronic device further includes:
a communication interface 503 for communication between the memory 501 and the processor 502.
A memory 501 for storing computer programs that can be run on the processor 502.
The memory 501 may comprise high-speed RAM memory and may also include non-volatile memory, such as at least one disk memory.
If the memory 501, the processor 502 and the communication interface 503 are implemented independently, the communication interface 503, the memory 501 and the processor 502 may be connected to each other through a bus and perform communication with each other. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended ISA (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 5, but this is not intended to represent only one bus or type of bus.
Alternatively, in practical implementation, if the memory 501, the processor 502 and the communication interface 503 are integrated on a chip, the memory 501, the processor 502 and the communication interface 503 may complete communication with each other through an internal interface.
The processor 502 may be a Central Processing Unit (CPU), an Application-Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present application.
The present embodiments also provide a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the trust and consistency based model migration method as described above.
In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or N embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "N" means at least two, e.g., two, three, etc., unless explicitly defined otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more N executable instructions for implementing steps of a custom logic function or process, and alternate implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of implementing the embodiments of the present application.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or N wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the N steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disk, or the like.
Although embodiments of the present application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present application, and that variations, modifications, substitutions, and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.

Claims (12)

1. A model migration method based on trust and consistency, characterized by comprising the following steps:
based on preset labeled source domain data, using a convolutional neural network as a feature extractor to extract features of source domain images;
performing label prediction with a source domain classification layer, wherein the source domain classification layer consists of a fully connected layer and a weight normalization layer, and performing training optimization with a cross-entropy loss function to obtain a pre-trained source domain model; and
based on the pre-trained source domain model and unlabeled target domain data, performing model adaptive learning with a dual classification network, and performing training optimization with a trust- and consistency-based mechanism to obtain an adaptively learned source domain model.
2. The method of claim 1, wherein performing training optimization with the trust- and consistency-based mechanism to obtain the adaptively learned source domain model comprises:
when performing model migration on the target domain data, inputting samples from the target domain data into the model to obtain probability distributions;
selecting the label with the maximum probability as the pseudo label of the corresponding sample, and measuring the model's trust in the pseudo label with entropy;
ranking all samples in the target domain data by entropy from small to large to obtain trusted samples, and training the network with the trusted samples and their corresponding pseudo labels to obtain the adaptively learned source domain model.
3. The method of claim 2, further comprising, before performing model adaptive learning with the dual classification network:
constructing the dual classification network, wherein the dual classification network comprises a feature extractor and a dual classification head, the parameters of the source classifier are fixed during training, and the feature extractor and the target classifier are updated by stochastic gradient descent.
4. The method of claim 3, wherein performing training optimization with the trust- and consistency-based mechanism comprises:
extracting features of unlabeled samples in the target domain data, and obtaining a first distribution prediction result and a second distribution prediction result from a preset first classifier and a preset second classifier; and
calculating the information entropy of the model's predicted distributions based on the first and second distribution prediction results to obtain the confidence of the model's predictions.
5. The method of claim 3 or 4, wherein performing training optimization with the trust- and consistency-based mechanism comprises:
randomly rotating a trusted sample by a preset angle to obtain a new trusted sample;
inputting the trusted sample and the new trusted sample into the dual classification network to obtain their respective features and predicted distribution results;
using preset loss functions to keep the features and probability distributions of the trusted sample and the new trusted sample consistent; and
predicting the rotation angle of the new trusted sample relative to the trusted sample with a preset classification layer, and calculating the rotation-angle prediction loss.
6. A model migration apparatus based on trust and consistency, comprising:
the extraction module is used for extracting the characteristics of the source domain image by using a convolutional neural network as a characteristic extractor based on preset labeled source domain data;
the optimization module is used for predicting the label by using a source domain classification layer, wherein the source domain classification layer consists of a full connection layer and a weight specification layer, and a cross entropy loss function is used for training and optimizing to obtain the pre-trained source domain model; and
and the acquisition module is used for carrying out model self-adaptive learning by using a dual classification network based on the pre-trained source domain model and the non-labeled target domain data, and carrying out training optimization by using a trust and consistency-based mechanism to obtain a source domain model after self-adaptive learning.
7. The apparatus of claim 6, wherein the optimization module is specifically configured to:
when model migration is carried out on the target domain data, inputting samples in the target domain data into a model to obtain probability distribution;
selecting a label with the maximum probability as a pseudo label of a corresponding sample, and measuring the trust degree of a model to the pseudo label by using entropy;
and sequencing all samples in the target domain data from small to large according to the entropy generated by the trust degree to obtain trusted samples, and training the network by using the trusted samples and corresponding pseudo labels to obtain the source domain model after the self-adaptive learning.
8. The apparatus of claim 7, wherein prior to performing model adaptive learning with the dual classification network, the acquisition module is further configured to:
construct the dual classification network, wherein the dual classification network comprises a feature extractor and dual classification heads, parameters of the source classifier are fixed during training, and the feature extractor and the target classifier are updated through stochastic gradient descent.
9. The apparatus of claim 8, wherein the optimization module is specifically configured to:
extract features of the unlabeled samples in the target domain data, and obtain a first prediction distribution and a second prediction distribution based on a preset first classifier and a preset second classifier;
and calculate the information entropy of the model's prediction distribution based on the first prediction distribution and the second prediction distribution, so as to obtain the trust degree of the model prediction.
10. The apparatus of claim 8 or 9, wherein the optimization module is specifically configured to:
randomly rotate the trusted sample by a preset angle to obtain a new trusted sample;
input the trusted sample and the new trusted sample into the dual classification network, and obtain the features and prediction distributions of the trusted sample and the new trusted sample, respectively;
use a preset loss function to make the features and probability distributions of the trusted sample and the new trusted sample consistent;
and predict the rotation angle of the new trusted sample relative to the trusted sample by using a preset classification layer, and calculate the prediction loss of the rotation angle.
11. An electronic device, comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor executes the program to implement the trust and consistency-based model migration method of any one of claims 1 to 5.
12. A computer-readable storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the trust and consistency-based model migration method of any one of claims 1 to 5.
CN202210023290.6A 2022-01-10 2022-01-10 Model migration method, device, equipment and medium based on trust and consistency Pending CN114528913A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210023290.6A CN114528913A (en) 2022-01-10 2022-01-10 Model migration method, device, equipment and medium based on trust and consistency

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210023290.6A CN114528913A (en) 2022-01-10 2022-01-10 Model migration method, device, equipment and medium based on trust and consistency

Publications (1)

Publication Number Publication Date
CN114528913A true CN114528913A (en) 2022-05-24

Family

ID=81620948

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210023290.6A Pending CN114528913A (en) 2022-01-10 2022-01-10 Model migration method, device, equipment and medium based on trust and consistency

Country Status (1)

Country Link
CN (1) CN114528913A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115186773A (en) * 2022-09-13 2022-10-14 杭州涿溪脑与智能研究所 Passive active field self-adaptive model training method and device
CN115186773B (en) * 2022-09-13 2022-12-09 杭州涿溪脑与智能研究所 Passive active field adaptive model training method and device
CN116468959A (en) * 2023-06-15 2023-07-21 清软微视(杭州)科技有限公司 Industrial defect classification method, device, electronic equipment and storage medium
CN116468959B (en) * 2023-06-15 2023-09-08 清软微视(杭州)科技有限公司 Industrial defect classification method, device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN112990280B (en) Class increment classification method, system, device and medium for image big data
WO2021140426A1 (en) Uncertainty guided semi-supervised neural network training for image classification
CN111666993A (en) Medical image sample screening method and device, computer equipment and storage medium
CN114528913A (en) Model migration method, device, equipment and medium based on trust and consistency
US20220092407A1 (en) Transfer learning with machine learning systems
CN110738235B (en) Pulmonary tuberculosis judging method, device, computer equipment and storage medium
CN111583199B (en) Sample image labeling method, device, computer equipment and storage medium
CN114330588A (en) Picture classification method, picture classification model training method and related device
WO2022121544A1 (en) Normalizing oct image data
CN114663687A (en) Model training method, target recognition method, device, equipment and storage medium
CN112733724B (en) Relativity relationship verification method and device based on discrimination sample meta-digger
Struski et al. ProMIL: Probabilistic multiple instance learning for medical imaging
CN115661502A (en) Image processing method, electronic device, and storage medium
CN113222053A (en) Malicious software family classification method, system and medium based on RGB image and Stacking multi-model fusion
Wu et al. Practical and efficient model extraction of sentiment analysis APIs
CN108364067B (en) Deep learning method based on data segmentation and robot system
CN112446231A (en) Pedestrian crossing detection method and device, computer equipment and storage medium
US20220180200A1 (en) Unsupervised domain adaptation using joint loss and model parameter search
Sirhan et al. Multilabel CNN model for asphalt distress classification
CN116992937A (en) Neural network model restoration method and related equipment
Ali et al. Diagnosing COVID-19 Lung Inflammation Using Machine Learning Algorithms: A Comparative Study
Mercy Rajaselvi Beaulah et al. Categorization of images using autoencoder hashing and training of intra bin classifiers for image classification and annotation
Gallée et al. Interpretable Medical Image Classification Using Prototype Learning and Privileged Information
JP2020181265A (en) Information processing device, system, information processing method, and program
CN115631391B (en) Image selection method and device based on deep active learning and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination