CN113031520B - Meta-invariant feature space learning method for cross-domain prediction - Google Patents
- Publication number: CN113031520B (application CN202110228766.5A)
- Authority
- CN
- China
- Prior art keywords
- feature space
- invariant feature
- learning
- model
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B19/00—Programme-control systems
- G05B19/02—Programme-control systems electric
- G05B19/18—Numerical control [NC], i.e. automatically operating machines, in particular machine tools, e.g. in a manufacturing environment, so as to execute positioning, movement or co-ordinated operations by means of programme data in numerical form
- G05B19/406—Numerical control [NC], i.e. automatically operating machines, in particular machine tools, e.g. in a manufacturing environment, so as to execute positioning, movement or co-ordinated operations by means of programme data in numerical form characterised by monitoring or safety
- G05B19/4065—Monitoring tool breakage, life or condition
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B2219/00—Program-control systems
- G05B2219/30—Nc systems
- G05B2219/37—Measurements
- G05B2219/37232—Wear, breakage detection derived from tailstock, headstock or rest
Abstract
A meta-invariant feature space learning method for cross-domain prediction: existing data are taken as source domain data, and the source domain data are grouped and paired. A prediction model is established for the data of each group in a pair, an invariant feature space learning model of the paired data is then constructed, and the invariant feature space of each pair is learned through collaborative training. With the invariant feature space learning model as the base model, the meta-invariant feature space across different pairs is learned by a meta-learning method to obtain a meta-invariant feature space learning model, and the target domain is predicted based on that model. The invention obtains the invariant feature space under two working conditions through collaborative learning, solves the problem of marginal distribution adaptation, and improves prediction precision across domains.
Description
Technical Field
The invention relates to the field of artificial intelligence, in particular to cross-domain prediction for intelligent manufacturing, and specifically to a meta-invariant feature space learning method for cross-domain prediction.
Background
Cross-domain prediction is an important research problem in machine learning. In manufacturing it arises mainly because large changes in working conditions cause large differences in the marginal and conditional distributions of the data, so a model trained on a source domain adapts poorly to new working conditions; a typical instance is tool wear prediction. Real-time monitoring of tool wear is important for dynamic control of the machining process, especially for complex aircraft parts made of difficult-to-machine materials, where tool wear severely affects machining quality and is very difficult to predict. Data-driven methods autonomously learn a model from large amounts of machining data, can approximate a complex mechanistic model within a certain error range, and offer a new route to accurate tool wear prediction. Deep learning is a typical data-driven method, but its training requires large amounts of labeled sample data under different working conditions, which are very difficult to obtain in actual machining, so tool wear remains hard to monitor accurately under continuously changing conditions. The latest research achieves quantitative prediction of tool wear under variable working conditions with small-sample multi-task methods such as meta-learning, but these apply only to small condition changes such as cutting-parameter variation, and cannot accurately predict tool wear under large condition changes such as large cutting-parameter variation, tool diameter change, tool material change, or part material change.
In machine learning terms, the model essentially learns the cross-domain joint probability distribution P(X, Y) of input and output; by Bayes' theorem, the prediction accuracy and adaptability of the model are reflected in two aspects, the marginal distribution P(X) and the conditional distribution P(Y|X). Here the marginal distribution P(X) is the data distribution of the input, and the conditional distribution P(Y|X) is the parameter distribution of the model. In the cross-domain prediction problem, both the marginal and conditional distributions differ greatly. A common solution to this distribution adaptation problem is transfer learning, but most transfer-learning methods focus on a single differing marginal distribution or a single differing conditional distribution, require a certain amount of target data, and are therefore of limited use for the problem above. This patent provides a meta-invariant feature space learning method for cross-domain prediction, which performs distribution adaptation from the two aspects of data marginal distribution and model conditional distribution, thereby achieving accurate cross-domain prediction.
Disclosure of Invention
The invention aims to provide a meta-invariant feature space learning method aiming at the problem of cross-domain prediction.
The technical scheme of the invention is as follows:
A meta-invariant feature space learning method for cross-domain prediction, characterized in that: existing data are taken as source domain data and grouped, with the jth group denoted D_j; the grouped data are then paired, with the ith pair denoted (D_j, D_k)_i. For each pair, prediction models are established for D_j and D_k respectively, and an invariant feature space learning model of the ith pair is constructed from them; the invariant feature space of each pair is learned through collaborative training. With the invariant feature space learning model as the base model, the meta-invariant feature space across different pairs is learned by a meta-learning method, yielding a meta-invariant feature space learning model f_Φ, and the target domain is predicted based on this model.
Further, the grouping method for the source domain data groups the data drawn from one specific distribution (e.g., one working condition) into one group.
Further, the pairing method measures the distance between the data distributions of different groups and selects the two groups with the minimum distribution distance for pairing; the preferred distribution measure is the maximum mean difference (MMD).
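The MMD used for pairing can be sketched as follows. This is an illustrative pure-Python reconstruction, not the patent's actual code; the Gaussian (RBF) kernel and the bandwidth `sigma` are assumptions.

```python
import math

def rbf(x, y, sigma=1.0):
    # Gaussian kernel between two feature vectors (assumed kernel choice)
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2 * sigma ** 2))

def mmd2(X, Y, sigma=1.0):
    """Squared maximum mean difference between two samples of vectors."""
    m, n = len(X), len(Y)
    kxx = sum(rbf(a, b, sigma) for a in X for b in X) / (m * m)
    kyy = sum(rbf(a, b, sigma) for a in Y for b in Y) / (n * n)
    kxy = sum(rbf(a, b, sigma) for a in X for b in Y) / (m * n)
    return kxx + kyy - 2 * kxy
```

The two groups for which `mmd2` is smallest would then be selected as a pair.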
Further, the invariant feature space learning model comprises two prediction models, one for each group in the pair, both constructed with neural networks; each takes the input quantities of its group and outputs a predicted target quantity and hidden variables, and a loss function L is constructed:

L = L_M + L_R^S + L_R^T + L_P^S + L_P^T

where L_M is the matching loss between the hidden variables of the two prediction models, L_R^S and L_R^T are the reconstruction losses of the input quantities of the two prediction models, and L_P^S and L_P^T are the prediction losses of their outputs.
Further, the meta-invariant feature space learning model f_Φ learns the change rule across multiple invariant feature spaces by a meta-learning method; the parameters of the meta-learner are denoted Φ and the parameters of the base model θ_i, and Φ and θ_i are iteratively updated by gradient descent:

θ_i' = θ_i - α ∇_θi L_Ti(f_θi)

Φ ← Φ - β ∇_Φ Σ_(Ti~p(T)) L_Ti(f_θi')

where the learning rates α and β are fixed hyper-parameters, ∇_θi L_Ti(f_θi) is the gradient of the invariant feature space model's loss function, L_Ti is the loss function of the ith task, f_θi is the ith invariant feature space model, ∇_Φ is the gradient of the meta-invariant feature space model's loss function, T_i denotes the ith learning task, p denotes the distribution of learning tasks, and T denotes a learning task.
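The two-level update above can be sketched on a toy problem. This is an illustrative first-order simplification (the outer gradient is evaluated at the adapted parameters rather than differentiated through the inner step), with scalar tasks L_i(θ) = (θ - t_i)²; the task targets and learning rates are invented for illustration.

```python
# Toy MAML-style meta-update on scalar tasks L_i(theta) = (theta - t_i)^2.
def inner_update(phi, t, alpha):
    grad = 2 * (phi - t)          # d/d(theta) of (theta - t)^2 at theta = phi
    return phi - alpha * grad     # adapted base parameters theta_i'

def meta_step(phi, tasks, alpha, beta):
    # First-order approximation: outer gradient taken at adapted parameters
    outer_grad = 0.0
    for t in tasks:
        theta_i = inner_update(phi, t, alpha)
        outer_grad += 2 * (theta_i - t)
    return phi - beta * outer_grad

phi = 0.0
tasks = [1.0, 3.0]                # two hypothetical working-condition tasks
for _ in range(200):
    phi = meta_step(phi, tasks, alpha=0.4, beta=0.1)
```

On this toy problem Φ converges to the point from which both tasks are best adapted in one gradient step (here the midpoint of the task targets).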
Further, to predict the target domain, the target domain data are paired with one group of existing source domain data, and prediction is based on the meta-invariant feature space learning model, which is fine-tuned using the target domain data and the paired source domain data:

θ_new = Φ - α ∇_Φ L_Tnew(f_Φ)

thereby obtaining the target prediction model f_θnew, where θ_new denotes the parameters of the target prediction model and T_new denotes the target prediction task.
The invention has the beneficial effects that:
1. The invention obtains the invariant feature space under two working conditions through collaborative learning, solving the problem of marginal distribution adaptation.
2. The invention utilizes the meta-learning idea to learn the change rule of the invariant feature space, obtaining the meta-invariant feature space and thereby solving the problem of conditional distribution adaptation.
3. The invention uses the meta-invariant feature space model, improving prediction precision across domains.
Drawings
FIG. 1 is a schematic diagram of a meta-invariant feature space learning method of the present invention, in which TP represents a task pair, Condi represents a working condition, SNA and SNB represent two sub-networks, IFS represents an invariant feature space learning model, and MIFS represents a meta-invariant feature space learning model.
FIG. 2 is a schematic diagram of the invariant feature space model of the present invention, wherein X_S and X_T respectively represent the input quantities of the two groups, Ŷ_S and Ŷ_T the predicted target outputs, Y_S and Y_T the corresponding labels, and Z_S and Z_T the hidden variables. L_M is the matching loss of the hidden variables Z of the two prediction models, L_R^S and L_R^T are the reconstruction losses of the inputs X_S and X_T, and L_P^S and L_P^T are the prediction losses on Y_S and Y_T. Enc_S and Enc_T denote the coding networks of the two sub-networks, Dec_S and Dec_T their decoding networks, and FC_S and FC_T their prediction networks.
Detailed Description
The invention will be further described with reference to the drawings and examples, to which the invention is not restricted.
As shown in fig. 1-2.
A meta-invariant feature space learning method for cross-domain prediction, taking tool wear prediction in numerically controlled machining as an example: cross-domain prediction is realized as tool wear prediction under variable working conditions, where variable working conditions refer to changes in workpiece material, tool size or material, cutting parameters, and the like; the model input is monitoring-signal features and the output is tool wear. The specific steps are as follows:
1. First, for the specific data distribution under each working condition, the data of one working condition are taken as one group, and data pairing is performed. The invention adopts the maximum mean difference (MMD) to measure the distance between the data distributions of two sub-domains and selects the two distributions with the minimum MMD distance for pairing.
2. On the variable-working-condition tool wear data set, the signal feature data are paired pairwise based on the MMD method. The pairing strategy for the training set is: first, among working conditions Condi_1-9, one condition Condi_1 is randomly selected as the condition to be paired C1, and the condition with the minimum MMD distance to C1 among the remaining 8 conditions is selected as its pairing condition C2; then C2 becomes the condition to be paired, and the condition with the minimum MMD distance to C2 among the remaining 7 conditions is selected as pairing condition C3; this is repeated until all training-set conditions are paired. The pairing strategy for the test set is: for each test condition, the training-set condition closest in MMD distance to the current test condition is selected as its pairing condition.
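The greedy chain-pairing strategy above can be sketched as follows; `mmd` is assumed to be a precomputed symmetric matrix of MMD distances between the working conditions, and the function itself is an illustrative reconstruction rather than the patent's code.

```python
def chain_pairs(mmd, start=0):
    """Greedily chain-pair conditions: the most recently paired condition
    seeks its nearest (minimum-MMD) neighbour among the remaining ones."""
    n = len(mmd)
    remaining = set(range(n)) - {start}
    pairs, current = [], start
    while remaining:
        nxt = min(remaining, key=lambda j: mmd[current][j])
        pairs.append((current, nxt))
        remaining.remove(nxt)
        current = nxt
    return pairs
```

For a test condition, the analogous step is a single `min` over the training-set conditions.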
3. Establishing an invariant feature space model: through collaborative learning of the prediction models under paired working conditions, features under different distributions are converted into an invariant feature space and common features of the data under different conditions are extracted, laying the foundation for meta-learning to capture the model's internal rule. The invariant feature space model architecture is shown in fig. 2 and comprises two sub-networks SNA and SNB; the vertical direction is the coding-decoding module for the input features under the two working conditions, and the horizontal direction is the tool wear prediction module under the two working conditions.
In the invariant feature space model, Enc_S and Enc_T are the coding layers of sub-networks SNA and SNB respectively, Dec_S and Dec_T their decoding layers, and FC_S and FC_T their regression layers. X is the input vector, Y the label vector, Ŷ the output vector of the model, and Z the hidden vector. L_M is the matching loss of the hidden vectors of the two sub-networks, L_R^S and L_R^T are the reconstruction losses of SNA and SNB, and L_P^S and L_P^T are the tool wear prediction losses of SNA and SNB.
Given source-domain and target-domain data sets D_S = {X_S, Y_S} and D_T = {X_T, Y_T}, the two sub-domains execute task pairs under similar working conditions, so their data distributions can be considered to follow the same distribution. The embedding networks parameterized by θ_S and θ_T respectively encode the input features X_S and X_T into latent variables Z_S and Z_T in the invariant feature space Z. The loss function of the invariant feature space model consists of three parts: the matching loss L_M, the reconstruction losses L_R^S and L_R^T, and the prediction losses L_P^S and L_P^T.
Matching loss:

L_M = 1 - (Z_S · Z_T) / (||Z_S|| ||Z_T||)

where the cosine distance is chosen as the distance measure between the latent variables of the two sub-domains.

Reconstruction loss:

L_R^S = ||X_S - X̂_S||², L_R^T = ||X_T - X̂_T||²

Prediction loss:

L_P^S = ||Y_S - Ŷ_S||², L_P^T = ||Y_T - Ŷ_T||²

Loss function:

L = L_M + L_R^S + L_R^T + L_P^S + L_P^T
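On toy vectors the loss components combine as sketched below. The cosine-distance matching loss and squared-error reconstruction/prediction losses follow the description above, but this is an illustrative reconstruction, not the patent's verbatim code.

```python
import math

def cosine_distance(zs, zt):
    # matching loss L_M between the two latent vectors
    dot = sum(a * b for a, b in zip(zs, zt))
    ns = math.sqrt(sum(a * a for a in zs))
    nt = math.sqrt(sum(b * b for b in zt))
    return 1.0 - dot / (ns * nt)

def sq_err(u, v):
    # squared-error loss used for reconstruction and prediction terms
    return sum((a - b) ** 2 for a, b in zip(u, v))

def total_loss(zs, zt, xs, xs_rec, xt, xt_rec, ys, ys_hat, yt, yt_hat):
    lm = cosine_distance(zs, zt)                 # L_M
    lr = sq_err(xs, xs_rec) + sq_err(xt, xt_rec) # L_R^S + L_R^T
    lp = sq_err(ys, ys_hat) + sq_err(yt, yt_hat) # L_P^S + L_P^T
    return lm + lr + lp
```

When the two latent vectors align and all reconstructions and predictions are exact, the total loss is zero.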
Minimizing the loss function yields the maximal invariant feature space under the paired working conditions, and the latent features in this space can be used to predict the tool wear amount. The parameters θ_i of the invariant feature space model are updated by gradient descent:

θ_i' = θ_i - α ∇_θi L_Ti(f_θi)

The objective function of the invariant feature space model optimization is:

min_θi L_Ti(f_θi)

where θ_i starts from the initialization parameters, the learning rate α is a fixed hyper-parameter, and ∇_θi L_Ti(f_θi) denotes the gradient of the invariant feature space model's loss function.
4. The change rule of the invariant feature space IFS is learned from multiple tasks by a meta-learning method, constructing the meta-invariant feature space MIFS. The MAML framework is selected to realize the meta-learning process. The meta-invariant feature space learning method is shown in fig. 1.
The meta-learner is the MIFS model parameterized by Φ, and the base learner is the invariant feature space IFS model parameterized by θ_i = (θ, ρ), where θ is the parameter of the embedding network f and ρ is the parameter of the regression network g. The base-learner parameters are updated by gradient descent:

θ_i' = θ_i - α ∇_θi L_Ti(f_θi)

where the base learning rate α is a fixed hyper-parameter and L_Ti is the loss function of the base learner. The initial network parameters of the base learner are the parameters Φ of the meta-learner, which are in turn updated by gradient descent based on the base-learner parameters:

Φ ← Φ - β ∇_Φ Σ_(Ti~p(T)) L_Ti(f_θi')

where the meta learning rate β is a fixed hyper-parameter, ∇_Φ denotes the gradient of the meta-invariant feature space model's loss function, T_i denotes the ith learning task, p the distribution of learning tasks, and T a learning task. The meta-optimization objective function of the meta-invariant feature space is:

min_Φ Σ_(Ti~p(T)) L_Ti(f_θi')
5. The meta-learner is optimized to find the best initialization parameters for the base learner. Facing a new task T_new, the model is first paired with an existing working condition, and the meta-parameters Φ are fine-tuned through a small number of stochastic gradient descent (SGD) steps, using a small amount of data from the new condition together with data from the paired condition, to obtain the base-learner parameters:

θ_new = Φ - α ∇_Φ L_Tnew(f_Φ)

The prediction model f_θnew under the new working condition can then use the base learner to accurately predict the tool wear amount, where θ_new denotes the parameters of the tool wear prediction model for the new working condition and T_new denotes the tool wear prediction task for the new working condition.
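The few-step fine-tuning for a new working condition can be sketched on the same toy scalar setting used earlier; the loss form, step count, and learning rate here are illustrative assumptions, not values from the patent.

```python
def fine_tune(phi, loss_grad, alpha=0.25, steps=5):
    """A small number of SGD steps starting from the meta-initialization phi."""
    theta = phi
    for _ in range(steps):
        theta = theta - alpha * loss_grad(theta)
    return theta

# Hypothetical new task with optimum 2.5: L(theta) = (theta - 2.5)^2,
# starting from a meta-initialization of 2.0.
theta_new = fine_tune(2.0, lambda th: 2 * (th - 2.5))
```

A good meta-initialization places `phi` close to the new task's optimum, so a handful of steps suffices.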
Parts of the present invention that are not described in detail are the same as, or can be implemented using, the prior art.
Claims (4)
1. A meta-invariant feature space learning method for cross-domain prediction, characterized by: taking wear data of existing numerically controlled machining tools as source domain data and grouping them, with the jth group denoted D_j; pairing the grouped data, with the ith pair denoted (D_j, D_k)_i; for each pair, establishing prediction models for D_j and D_k respectively, and further constructing an invariant feature space learning model of the ith pair; learning the invariant feature space of each pair through collaborative training; with the invariant feature space learning model as the base model, learning the meta-invariant feature space across different pairs by a meta-learning method to obtain a meta-invariant feature space learning model f_Φ, and predicting the target domain based on this model; the invariant feature space learning model comprises the two prediction models, which are constructed with neural networks, take the input quantities X_S and X_T of the two groups, and output the predicted target quantities Y_S and Y_T and the hidden variable Z; a loss function L is constructed:

L = L_M + L_R^S + L_R^T + L_P^S + L_P^T
2. The meta-invariant feature space learning method for cross-domain prediction according to claim 1, wherein: the grouping of the source domain data refers to grouping source domain data drawn from one specific distribution into one group.
3. The meta-invariant feature space learning method for cross-domain prediction according to claim 1, wherein: the pairing method measures the distance between the data distributions of different groups and selects the two groups with the minimum distribution distance for pairing; the distribution measure is the maximum mean difference (MMD).
4. The meta-invariant feature space learning method for cross-domain prediction according to claim 1, wherein: the meta-invariant feature space learning model f_Φ learns the change rule from multiple invariant feature spaces by a meta-learning method; the parameters of the meta-learner are denoted Φ and the parameters of the base model θ_i, and Φ and θ_i are iteratively updated by gradient descent:

θ_i' = θ_i - α ∇_θi L_Ti(f_θi)

Φ ← Φ - β ∇_Φ Σ_(Ti~p(T)) L_Ti(f_θi')

where α and β are fixed learning-rate hyper-parameters, ∇_θi L_Ti(f_θi) represents the gradient of the invariant feature space model's loss function, L_Ti the loss function of the ith task, f_θi the ith invariant feature space model, ∇_Φ the gradient of the meta-invariant feature space model's loss function, T_i the ith learning task, p the distribution of learning tasks, and T a learning task;
the target domain prediction pairs the target domain data with one group of existing source domain data and predicts the target domain based on the meta-invariant feature space learning model, which is fine-tuned using the target domain data and the paired source domain data:

θ_new = Φ - α ∇_Φ L_Tnew(f_Φ)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110228766.5A CN113031520B (en) | 2021-03-02 | 2021-03-02 | Meta-invariant feature space learning method for cross-domain prediction |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113031520A CN113031520A (en) | 2021-06-25 |
CN113031520B true CN113031520B (en) | 2022-03-22 |
Family
ID=76465312
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110032646A (en) * | 2019-05-08 | 2019-07-19 | 山西财经大学 | The cross-domain texts sensibility classification method of combination learning is adapted to based on multi-source field |
CN111199458A (en) * | 2019-12-30 | 2020-05-26 | 北京航空航天大学 | Recommendation system based on meta-learning and reinforcement learning |
CN111813869A (en) * | 2020-08-21 | 2020-10-23 | 支付宝(杭州)信息技术有限公司 | Distributed data-based multi-task model training method and system |
CN112257868A (en) * | 2020-09-25 | 2021-01-22 | 建信金融科技有限责任公司 | Method and device for constructing and training integrated prediction model for predicting passenger flow |
Non-Patent Citations (2)
Title |
---|
Aayush Sharma et al., "Meta-tag Propagation by Co-training an Ensemble Classifier for Improving Image Search Relevance," 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 28 June 2008, pp. 1-6 *
Cheng Zhonghui et al., "Named Entity Recognition Method Based on Co-training with Reinforcement Learning," Software Engineering, January 2020, pp. 7-11 *
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |