CN117932347A - Small-sample time series prediction method and system based on adversarial transfer learning - Google Patents

Small-sample time series prediction method and system based on adversarial transfer learning

Info

Publication number
CN117932347A
Authority
CN
China
Prior art keywords
time sequence
sequence data
domain
small sample
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410332311.1A
Other languages
Chinese (zh)
Other versions
CN117932347B (en)
Inventor
蒿颜奇
罗川
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan University
Original Assignee
Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan University filed Critical Sichuan University
Priority to CN202410332311.1A priority Critical patent/CN117932347B/en
Publication of CN117932347A publication Critical patent/CN117932347A/en
Application granted granted Critical
Publication of CN117932347B publication Critical patent/CN117932347B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04SSYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a small-sample time series prediction method and system based on adversarial transfer learning, belonging to the technical field of time series prediction. The method constructs a composite neural network model comprising a k-means clustering algorithm, a one-dimensional convolutional neural network, a domain classifier, and an LSTM (long short-term memory) network, and uses the trained model to predict small-sample time series data. Through adversarial transfer learning, the composite model is trained on source-domain time series data so that it can predict target-domain time series data, performing well on both the source and target domains even when only a small amount of target-domain data is available. This addresses the low data utilization and inaccurate predictions that traditional time series prediction models suffer from due to poor generalization and large distribution gaps between domains.

Description

Small-sample time series prediction method and system based on adversarial transfer learning
Technical Field
The invention belongs to the technical field of time series prediction, and in particular relates to a small-sample time series prediction method and system based on adversarial transfer learning.
Background
The goal of transfer learning is to use information previously obtained from a source task to effectively overcome the distribution differences observed across different but interrelated target domains or tasks, a phenomenon also known as domain shift. Transfer learning requires domain matching, i.e., optimizing for a specific target domain to improve model performance there; adversarial transfer learning lets the model autonomously learn feature representations that generalize across domains.
In the prior art, when predicting a small-sample time series, the limited number of samples means that existing neural network models may overfit to noise and detail in the training set, leaving them with weak generalization. When processing new data, their predictions tend to fluctuate widely, accuracy is low, and the uncertainty of the results is difficult to evaluate or quantify. At the same time, large distribution gaps between domains leave traditional models with low data utilization, so they cannot make effective and accurate predictions.
Disclosure of Invention
To address the above shortcomings of the prior art, the small-sample time series prediction method and system based on adversarial transfer learning provided herein can solve, through adversarial transfer learning, the problems of low data utilization and low prediction accuracy caused by the poor generalization of traditional models and large differences between data distributions.
In order to achieve the aim of the invention, the invention adopts the following technical scheme:
In one aspect, the invention provides a small-sample time series prediction method based on adversarial transfer learning, comprising the following steps:
S1: based on adversarial transfer learning, construct and train a composite neural network model to obtain a trained composite neural network model;
S2: acquire the small-sample time series data to be predicted, and extract features on each time slice with the trained composite neural network model to obtain a feature representation of the time series data;
S3: perform regression prediction with the trained composite neural network model on the feature representation of the time series data to obtain the small-sample time series prediction result.
The beneficial effects of the invention are as follows: through adversarial transfer learning, a model trained on source-domain time series data can directly help predict target-domain time series data. By treating the feature extraction module as a generator and the domain classification module as a discriminator, the invention can predict time series even when only a small amount of target-domain data is available, remaining accurate and convenient under data scarcity, and it solves the problems of low data utilization and inaccurate prediction caused by the poor generalization of traditional models and large distribution gaps between domains.
Further: the specific steps of constructing and training the composite neural network model are as follows:
a1: acquiring time sequence data in the existing multi-domain small sample feature space, and dividing the time sequence data into a training set and a testing set;
A2: dividing the time sequence data of the training set into a source domain and a target domain on different granularities by using a k-means clustering algorithm to obtain source domain long time sequence data and target domain time sequence data;
A3: according to the source domain long time sequence data, utilizing a multi-layer one-dimensional convolutional neural network to extract the characteristics on a time slice, and obtaining the characteristic representation of the source domain long time sequence data and the internal association of the characteristic attribute;
a4: according to the characteristic representation of the source domain long time sequence data and the internal association of the characteristic attribute, the domain classification is utilized, the characteristic representation of the source domain is migrated to the target domain through resistance migration learning, the characteristic representation learning is carried out on the target domain time sequence data, and the characteristic representation of the target domain is obtained;
a5: according to the characteristic representation of the target domain and the time sequence data of the target domain, performing regression prediction by using an LSTM long-term memory neural network to obtain a small sample time sequence prediction result;
A6: and calculating to obtain a final loss function of the composite neural network model according to the test set and the small sample time sequence prediction result, judging whether the final loss function is minimum, if so, completing training to obtain a trained composite neural network model, otherwise, returning to A3.
The beneficial effect of the above further scheme is: training the composite neural network model improves its generalization and performance metrics, enabling it to predict small-sample time series data.
Further: the specific steps of A3 are as follows:
A301: according to the source domain long time sequence data, extracting the characteristics on a time slice by utilizing a plurality of one-dimensional convolution layers to obtain the original characteristic representation of the source domain long time sequence data;
A302: and flattening according to the original local characteristics of the source domain long time sequence data to obtain the characteristic representation and the internal association of the characteristic attribute of the source domain long time sequence data.
The beneficial effect of the above further scheme is: extracting features from the time series data with a one-dimensional convolutional neural network effectively captures local features, improves generalization, and recognizes features over longer time spans.
Further: the specific steps of A4 are as follows:
A401: according to the local characteristics of the source domain long time sequence data and the internal correlation of the characteristic attributes, processing is carried out by utilizing a gradient inversion layer, so that the local characteristics of the forward source domain long time sequence data are obtained;
A402: and utilizing a plurality of layers of full-connection layers and a Softmax function, migrating local features of the forward source domain long time sequence data to a target domain through resistance migration learning, and assisting the target domain time sequence data to perform feature representation learning to obtain feature representation of the target domain.
The beneficial effect of the above further scheme is: adversarial learning between feature extraction and domain classification migrates the source-domain feature representation to the target domain, improving both the feature extraction and domain classification capabilities and making the prediction more accurate.
Further: the specific steps of A5 are as follows:
A501: according to the characteristic representation of the target domain and the time sequence data of the target domain, performing regression prediction by using a forward LSTM long-term memory neural network to obtain a small sample time sequence forward prediction result;
A502: according to the characteristic representation of the target domain and the time sequence data of the target domain, performing regression prediction by using a reverse LSTM long-term memory neural network to obtain a small sample time sequence reverse prediction result;
A503: and adding according to the forward prediction result and the reverse prediction result to obtain a small sample time sequence prediction result.
The beneficial effect of the above further scheme is: the forward and backward LSTM networks improve the composite model's ability to capture and predict global information in the features of long time series data, give it stronger expressive power, and reduce the influence of exploding or vanishing gradients.
Further: the expression of the final loss function is as follows:
wherein, As a final loss function,/>Is the global optimum,/>Weights extracted for features,/>Weight for regression prediction,/>Weights for domain classification,/>Loss function for regression prediction,/>For feature extraction,/>For regression prediction,/>For domain classification,/>Is the total number of domains,/>For 1 st source domain,/>For/>Source domain/>In order for the domain of interest to be a target,Loss function classifying a domain,/>As a gradient inversion function,/>For the characteristic value of the small sample time sequence data in the test set,/>For one piece of data in the small sample time sequence data set in the test set,/>For the number of sample timing predictions,/>/>, For small sample timing data in test setActual value/>For the/>, in the small sample timing prediction resultResults,/>To test the first/>, of small sample data in the setActual tag value of personal field,/>For the/>, in the small sample timing prediction resultTag value of individual field.
The beneficial effect of the above further scheme is: the final loss function makes the quality of the prediction directly observable and eases optimization of the composite neural network model.
In another aspect, the invention provides a small-sample time series prediction system based on adversarial transfer learning, comprising:
a clustering module, which divides the time series data into a source domain and a target domain at different granularities with a k-means clustering algorithm, obtaining source-domain long time series data and target-domain time series data;
a feature extraction module, which extracts features on each time slice from the time series data with a multi-layer one-dimensional convolutional neural network, obtaining the feature representation of the source-domain long time series data and the internal associations among feature attributes;
a domain classification module, which migrates the source-domain feature representation to the target domain through adversarial transfer learning according to that feature representation and the internal associations among feature attributes, assisting representation learning on the target-domain time series data to obtain the target-domain feature representation;
and a time series regression prediction module, which predicts the small-sample time series from the time series data and the feature representations.
The beneficial effects of the invention are as follows: based on the GAN (generative adversarial network) idea, the feature extraction module serves as the generator and the domain classification module as the discriminator for adversarial transfer learning on small-sample time series data. The system model is trained on source-domain time series data; once trained, it can predict target-domain time series data and performs well with only a small amount of small-sample data.
Further: the feature extraction module includes: the one-dimensional convolution layers are sequentially connected with the ReLU activation layer and the pooling layer;
the plurality of one-dimensional convolution layers are used for dividing the source domain long time sequence data into time slices and carrying out quick feature extraction;
and the flattening layer is used for flattening the original characteristic representation of the time sequence data.
The beneficial effect of the above further scheme is: with several one-dimensional convolution layers, a feature representation of long time series data can be obtained from a small amount of small-sample data, improving the composite model's feature extraction on small-sample time series data and easing subsequent analysis and prediction.
Further: the domain classification module includes: the gradient inversion layer is connected with the full-connection layers in sequence, the last full-connection layer of the full-connection layers is connected with the Softmax function, and the full-connection layers except the last layer are connected with the ReLU activation function and the Dropout discarding layer in sequence;
the gradient inversion layer is used for carrying out forward processing on the characteristic representation of the source domain long time sequence data and the internal association of the characteristic attribute;
The plurality of full-connection layers are used for converting the characteristic representation of the input source domain long time sequence data into the characteristic representation of the specified domain classification task;
The Dropout discarding layer is configured to discard data with a set probability.
The beneficial effect of the above further scheme is: the gradient reversal layer gives the regression prediction module's loss and the domain classification module's loss the same monotonicity, the Softmax function classifies the data, and the Dropout layer discards part of the data to prevent overfitting.
Drawings
FIG. 1 is a flow chart of the small-sample time series prediction method based on adversarial transfer learning;
FIG. 2 is a block diagram of the small-sample time series prediction system based on adversarial transfer learning;
FIG. 3 is a model architecture diagram of the small-sample time series prediction system based on adversarial transfer learning;
FIG. 4 shows an application of the small-sample time series prediction system based on adversarial transfer learning.
Detailed Description
The following description of embodiments of the invention is provided to help those skilled in the art understand the invention. It should be understood, however, that the invention is not limited to the scope of these embodiments; to those skilled in the art, all inventions that make use of the inventive concept fall within the spirit and scope of the invention as defined by the appended claims.
Example 1
In one embodiment of the invention, as shown in FIG. 1, a small-sample time series prediction method based on adversarial transfer learning comprises the following steps:
S1: based on adversarial transfer learning, construct and train a composite neural network model to obtain a trained composite neural network model;
S2: acquire the small-sample time series data to be predicted, and extract features on each time slice with the trained composite neural network model to obtain a feature representation of the time series data;
S3: perform regression prediction with the trained composite neural network model on the feature representation of the time series data to obtain the small-sample time series prediction result.
Before the composite neural network model is used to predict small-sample time series data, it must be trained. The specific steps of constructing and training the composite neural network model are as follows:
A1: acquire time series data in the existing multi-domain small-sample feature space, and divide it into a training set and a test set;
A2: divide the training-set time series data into a source domain and a target domain at different granularities with a k-means clustering algorithm, obtaining source-domain long time series data and target-domain time series data;
A3: from the source-domain long time series data, extract features on each time slice with a multi-layer one-dimensional convolutional neural network, obtaining the feature representation of the source-domain long time series data and the internal associations among feature attributes;
A4: from that feature representation and the internal associations among feature attributes, use domain classification and adversarial transfer learning to migrate the source-domain feature representation to the target domain, performing representation learning on the target-domain time series data to obtain the target-domain feature representation;
A5: from the target-domain feature representation and the target-domain time series data, perform regression prediction with an LSTM (long short-term memory) network to obtain the small-sample time series prediction result;
A6: compute the final loss function of the composite neural network model from the test set and the prediction result, and judge whether the final loss has reached its minimum; if so, training is complete and the trained composite neural network model is obtained; otherwise, return to A3.
The k-means clustering algorithm in A2 uses the following distance (reconstructed here from the variable definitions):

$$D(C_a,C_b)=\left\|x_a-x_b\right\|_2=\sqrt{\sum_{j=1}^{m}\big(x_{aj}-x_{bj}\big)^2}$$

wherein $D$ is the distance between different domains, $\|\cdot\|_2$ the 2-norm, $k$ the number of clusters at each granularity, $j$ the feature index, $m$ the total number of features, $C_a$ and $C_b$ two different domains, and $x_{aj}$ and $x_{bj}$ the feature values of domains $C_a$ and $C_b$ respectively.
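As one illustration, the distance above and the k-means assignment step it drives can be sketched in NumPy (a minimal sketch; the function names are illustrative, not from the patent):

```python
import numpy as np

def domain_distance(xa, xb):
    """L2 (Euclidean) distance between the feature vectors of two domains,
    matching D(C_a, C_b) above."""
    xa, xb = np.asarray(xa, dtype=float), np.asarray(xb, dtype=float)
    return float(np.sqrt(np.sum((xa - xb) ** 2)))

def assign_domain(x, centroids):
    """k-means assignment step: the index of the nearest cluster centre."""
    return int(np.argmin([domain_distance(x, c) for c in centroids]))
```

Repeating the assignment step and re-computing centroids yields the source/target split at a given granularity k.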
The specific steps of A3 are as follows:
A301: from the source-domain long time series data, extract features on each time slice with several one-dimensional convolution layers to obtain the original feature representation of the source-domain long time series data, where the one-dimensional convolutional network (reconstructed here from the variable definitions) is

$$h^{l}=\sigma\big(W^{l}*h^{l-1}+b^{l}\big),\qquad h^{0}=x,\qquad F=\mathrm{Flatten}\big(h^{L}\big)$$

wherein $F$ is the final output feature representation, $h^{L}$ the output of the $L$-th convolution layer, $W^{l}$ the filter, $h^{l-1}$ the output of the $(l-1)$-th convolution layer, $b^{l}$ the bias, $\mathrm{Flatten}$ the flattening layer, $\sigma$ the activation function, $x$ the input features, $l$ the index of each convolution layer, and $L$ the number of convolution layers;
A302: flatten the original local features of the source-domain long time series data to obtain its feature representation and the internal associations among feature attributes.
The specific steps of A4 are as follows:
A401: process the local features of the source-domain long time series data and the internal associations among feature attributes with a gradient reversal layer, obtaining forward-processed local features of the source-domain long time series data; the gradient reversal layer (reconstructed here from the variable definitions, following the standard DANN schedule) is

$$R(x)=x,\qquad \frac{dR}{dx}=-\lambda I,\qquad \lambda=\frac{2}{1+e^{-10p}}-1,\qquad p=\frac{i\,n_b+b}{N\,n_b}$$

wherein $R$ is the gradient reversal function, $x$ the feature values of a specific piece of data, $dR/dx$ the derivative of the gradient reversal function during back-propagation, $\lambda$ a hyper-parameter that changes dynamically with the iterative process, $I$ the identity matrix, $e$ the natural constant, $p$ the training progress, $b$ the current batch, $i$ the current iteration, $n_b$ the length of the smallest total batch across the target- and source-domain training data, and $N$ the overall number of iterations; as the model iterates to reduce its error, $p$ (and hence $\lambda$) varies between 0 and 1.
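The adaptation schedule of the gradient reversal layer can be sketched as a small function (the constant gamma = 10 is the usual DANN choice and an assumption here, not taken from the patent):

```python
import math

def grl_lambda(i, n, gamma=10.0):
    """Adaptation weight for the gradient reversal layer.

    p = i / n is the training progress in [0, 1]; lambda rises smoothly
    from 0 toward 1, so domain-adversarial pressure is weak early in
    training and strong near the end.
    """
    p = i / n
    return 2.0 / (1.0 + math.exp(-gamma * p)) - 1.0
```

In the forward pass the layer is the identity; only the backward pass multiplies incoming gradients by -lambda.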
A402: and utilizing a plurality of layers of full-connection layers and a Softmax function, migrating local features of the forward source domain long time sequence data to a target domain through resistance migration learning, and assisting the target domain time sequence data to perform feature representation learning to obtain feature representation of the target domain.
The specific steps of A5 are as follows:
A501: from the target-domain feature representation and the target-domain time series data, perform regression prediction with a forward LSTM network to obtain the forward small-sample time series prediction;
A502: likewise, perform regression prediction with a backward LSTM network to obtain the backward small-sample time series prediction;
A503: add the forward and backward predictions to obtain the small-sample time series prediction result, i.e.

$$\hat{y}=\overrightarrow{y}+\overleftarrow{y}$$

wherein $\hat{y}$ is the small-sample time series prediction, $\overrightarrow{y}$ the forward prediction output by the forward LSTM network, and $\overleftarrow{y}$ the backward prediction output by the backward LSTM network.
The LSTM network in A5 (reconstructed here from the variable definitions) is:

$$f_t=\sigma\big(W_{fh}h_{t-1}+W_{fx}^{\mathsf T}x_t+b_f\big)$$
$$i_t=\sigma\big(W_{ih}h_{t-1}+W_{ix}^{\mathsf T}x_t+b_i\big)$$
$$o_t=\sigma\big(W_{oh}h_{t-1}+W_{ox}^{\mathsf T}x_t+b_o\big)$$
$$\tilde{c}_t=\tanh\big(W_{ch}h_{t-1}+W_{cx}^{\mathsf T}x_t+b_c\big)$$
$$c_t=f_t\odot c_{t-1}+i_t\odot\tilde{c}_t$$
$$h_t=o_t\odot\tanh(c_t)$$

wherein $f_t$ is the forget gate, $i_t$ the input gate, $o_t$ the output gate, $\tilde{c}_t$ the candidate gate, $\sigma$ the sigmoid activation function, $\tanh$ the tanh activation function, $c_t$ the updated cell state, $c_{t-1}$ the cell state at the previous moment, $h_t$ the hidden-layer output at time $t$, $h_{t-1}$ the hidden-layer output of the previous cell, $W_{fh}$ and $W_{fx}$ the forget-gate weights multiplied by the hidden output and the feature values, $W_{ih}$ and $W_{ix}$ the corresponding input-gate weights, $W_{oh}$ and $W_{ox}$ the output-gate weights, $W_{ch}$ and $W_{cx}$ the candidate-gate weights, $x_t$ the feature values at time $t$, $\mathsf T$ the matrix transpose, and $b_f$, $b_i$, $b_o$, $b_c$ the biases of the forget, input, output, and candidate gates. One LSTM prediction unit is divided into the input gate $i_t$, output gate $o_t$, and forget gate $f_t$: the input gate quantifies the amount of information added from the input and the previous hidden state, the output gate judges whether information from the cell state should be output to the hidden state, and the forget gate quantifies the amount of information removed from the cell state.
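The gate equations above can be exercised with a minimal NumPy cell (a sketch; stacking the four gates into one weight matrix W over the concatenated [h_prev, x_t] is an implementation choice, not from the patent):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One step of the LSTM gate equations.

    W has shape (4H, H + D) and maps the concatenated [h_prev, x_t] to the
    stacked pre-activations of the gates f, i, o and candidate c~.
    """
    H = len(h_prev)
    z = W @ np.concatenate([h_prev, x_t]) + b
    f = sigmoid(z[:H])               # forget gate
    i = sigmoid(z[H:2 * H])          # input gate
    o = sigmoid(z[2 * H:3 * H])      # output gate
    c_tilde = np.tanh(z[3 * H:])     # candidate state
    c_t = f * c_prev + i * c_tilde   # updated cell state
    h_t = o * np.tanh(c_t)           # hidden output
    return h_t, c_t
```

Running the cell over the sequence and over its mirror copy, then summing the two outputs, gives the bidirectional prediction of A503.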
In one embodiment of the invention, a GRU (gated recurrent unit) network may be used instead. Compared with the LSTM network, information flow is still regulated by gates, but there are only an update gate and a reset gate (reconstructed here from the variable definitions):

$$z_t=\sigma\big(W_z[h_{t-1},x_t]+b_z\big)$$
$$r_t=\sigma\big(W_r[h_{t-1},x_t]+b_r\big)$$
$$\tilde{h}_t=\tanh\big(W_h[r_t\odot h_{t-1},x_t]+b_h\big)$$
$$h_t=(1-z_t)\odot h_{t-1}+z_t\odot\tilde{h}_t$$

wherein $z_t$ is the update gate, $r_t$ the reset gate, $\sigma$ the sigmoid activation function, $\tanh$ the tanh activation function, $W_z$, $W_r$, $W_h$ the weights of the update gate, reset gate, and candidate hidden output, $h_{t-1}$ the hidden-layer output of the previous cell, $x_t$ the feature values at time $t$, $b_z$, $b_r$, $b_h$ the biases of the update gate, reset gate, and candidate hidden output, $\tilde{h}_t$ the candidate hidden-layer output, and $h_t$ the final hidden-layer output. The update gate $z_t$ quantifies the extent to which the new hidden state comes from the previous hidden state versus the current input, and the reset gate $r_t$ determines how much of the previous hidden state should be forgotten.
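The GRU alternative can likewise be sketched in NumPy (an illustrative cell matching the equations above; the argument layout is an implementation choice):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, Wz, Wr, Wh, bz, br, bh):
    """One GRU step: update gate z_t, reset gate r_t, candidate h~_t,
    and the interpolated final hidden output h_t."""
    xh = np.concatenate([h_prev, x_t])
    z = sigmoid(Wz @ xh + bz)                                   # update gate
    r = sigmoid(Wr @ xh + br)                                   # reset gate
    h_tilde = np.tanh(Wh @ np.concatenate([r * h_prev, x_t]) + bh)
    return (1.0 - z) * h_prev + z * h_tilde                     # final output
```

With only two gates instead of three plus a cell state, the GRU trades some capacity for fewer parameters, which can matter in the small-sample setting.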
In A6, training the composite neural network model must make the final output loss as small as possible, which indicates that the model has reached its best prediction; the final loss function (reconstructed here from the variable definitions, in the standard DANN form) is:

$$E(\theta_f,\theta_y,\theta_d)=\frac{1}{n}\sum_{i=1}^{n}L_y\big(G_y(G_f(x_i;\theta_f);\theta_y),\,y_i\big)+\frac{1}{n}\sum_{i=1}^{n}\sum_{k\in\{S_1,\dots,S_{K-1},T\}}L_d\big(G_d(R(G_f(x_i;\theta_f));\theta_d),\,d_{ik}\big)$$

$$(\hat{\theta}_f,\hat{\theta}_y,\hat{\theta}_d)=\arg\min_{\theta_f,\theta_y,\theta_d}E(\theta_f,\theta_y,\theta_d)$$

wherein $E$ is the final loss function, $(\hat{\theta}_f,\hat{\theta}_y,\hat{\theta}_d)$ the global optimum, $\theta_f$ the feature-extraction weights, $\theta_y$ the regression-prediction weights, $\theta_d$ the domain-classification weights, $L_y$ the regression loss, $G_f$ feature extraction, $G_y$ regression prediction, $G_d$ domain classification, $K$ the total number of domains, $S_1$ the first source domain, $S_{K-1}$ the $(K-1)$-th source domain, $T$ the target domain, $L_d$ the domain-classification loss, $R$ the gradient reversal function, $x_i$ the feature values of small-sample time series data in the test set, $i$ the index of one piece of test-set data, $n$ the number of predictions, $y_i$ the actual value of the $i$-th test-set sample, $\hat{y}_i$ the $i$-th predicted result, $d_{ik}$ the actual label of the $k$-th domain for the $i$-th test-set sample, and $\hat{d}_{ik}$ the predicted label of the $k$-th domain.
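A runnable sketch of the combined objective, under the assumption that L_y is mean squared error and L_d is softmax cross-entropy (the patent does not fix these concrete choices, and the weights are illustrative):

```python
import numpy as np

def combined_loss(y_pred, y_true, d_logits, d_true, w_reg=1.0, w_dom=1.0):
    """Weighted sum of a regression loss (L_y, here MSE) and a
    domain-classification loss (L_d, here softmax cross-entropy).

    The adversarial sign flip comes from the gradient reversal layer
    during back-propagation, not from a minus sign in this expression.
    """
    y_pred, y_true = np.asarray(y_pred, float), np.asarray(y_true, float)
    reg = float(np.mean((y_pred - y_true) ** 2))                 # L_y: MSE
    z = np.asarray(d_logits, float)
    z = z - z.max(axis=1, keepdims=True)                         # stable softmax
    p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    dom = float(-np.mean(np.log(p[np.arange(len(d_true)), d_true])))  # L_d: CE
    return w_reg * reg + w_dom * dom
```

Minimizing this total over the feature-extraction, regression, and domain-classification weights corresponds to the optimum described above.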
The beneficial effects of the invention are as follows: a k-means clustering algorithm, a one-dimensional convolutional neural network, a domain classifier, and an LSTM (long short-term memory) network form the composite neural network model; through adversarial transfer learning, the model trained on source-domain time series data predicts target-domain time series data, performing well on both the source and target domains.
Example 2
In one embodiment of the present invention, as shown in fig. 2, a small sample timing prediction system structure diagram based on resistance transfer learning includes:
The clustering module is used for dividing the time sequence data into a source domain and a target domain on different granularities by using a k-means clustering algorithm to acquire source domain long time sequence data and target domain time sequence data;
the feature extraction module is used for carrying out feature extraction on time slices on the time series data by utilizing a multi-layer one-dimensional convolutional neural network to obtain feature representation of the source domain long time series data and internal association of feature attributes;
The domain classification module is used for migrating the characteristic representation of the source domain to the target domain through resistance migration learning according to the characteristic representation of the source domain long time sequence data and the internal association of the characteristic attribute, assisting the characteristic representation learning of the target domain time sequence data, and obtaining the characteristic representation of the target domain;
and the time sequence regression prediction module is used for predicting the time sequence of the small sample according to the time sequence data and the characteristic representation.
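As a toy sketch of the clustering step performed by the clustering module, a minimal one-dimensional k-means can split series-level values into two groups that then serve as source and target domains. The function name, the choice of k = 2, and the sample values below are assumptions:

```python
import random

def kmeans_1d(values, k=2, iters=20, seed=0):
    """Minimal 1-D k-means: returns final centers and the grouped values."""
    rng = random.Random(seed)
    centers = rng.sample(values, k)          # pick k initial centers
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            # assign each value to its nearest center
            groups[min(range(k), key=lambda i: abs(v - centers[i]))].append(v)
        # recompute centers as group means (keep old center for empty groups)
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers, groups
```

With two well-separated groups of series statistics, the two clusters converge to their means and can be labeled as source and target domains.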
Wherein, the feature extraction module includes: a plurality of one-dimensional convolution layers and a flattening layer connected in sequence, each one-dimensional convolution layer being followed in sequence by a ReLU activation layer and a pooling layer;
the one-dimensional convolution layers are used for dividing the source domain long time series data into time slices and performing fast feature extraction;
and the flattening layer is used for flattening the original feature representation of the time series data.
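The Conv → ReLU → pooling stage described above can be sketched for a single channel and a single time slice as follows (pure Python for clarity; the kernel, the pooling size of 2, and the function name are assumptions):

```python
def conv1d_relu_pool(x, w):
    """One Conv1d + ReLU + max-pool step on a 1-D time slice."""
    # valid cross-correlation of the slice x with kernel w
    conv = [sum(x[i + k] * w[k] for k in range(len(w)))
            for i in range(len(x) - len(w) + 1)]
    relu = [max(v, 0.0) for v in conv]                      # ReLU activation
    # non-overlapping max pooling with window size 2
    return [max(relu[i], relu[i + 1]) for i in range(0, len(relu) - 1, 2)]
```

Stacking several such stages and flattening the final output yields the feature representation that the later modules consume.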
The domain classification module comprises: a gradient reversal layer and a plurality of fully connected layers connected in sequence; the last of the fully connected layers is followed by a Softmax function, and each fully connected layer except the last is followed in sequence by a ReLU activation function and a Dropout layer;
the gradient reversal layer is used for forward processing of the feature representation of the source domain long time series data and the internal association of the feature attributes;
the fully connected layers are used for converting the input feature representation of the source domain long time series data into a feature representation for the specified domain classification task;
and the Dropout layer is used for discarding data with a set probability.
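The gradient reversal behaviour attributed to the layer above — identity on the forward pass, sign-flipped (and optionally scaled) gradients on the backward pass — can be sketched as a minimal manual-autograd object. The class name and the `lam` scaling factor are assumptions:

```python
class GradReverse:
    """Identity in the forward pass; multiplies gradients by -lam on backward.

    This is what lets the feature extractor and domain classifier train
    adversarially in a single backward pass, without segmented training.
    """

    def __init__(self, lam=1.0):
        self.lam = lam

    def forward(self, x):
        return x                      # pass features through unchanged

    def backward(self, grad):
        return -self.lam * grad       # reverse the gradient flowing back
```

In PyTorch this would normally be a custom `autograd.Function`; the sketch shows only the sign-flip contract.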
Fig. 3 shows a model structure diagram of the small sample time sequence prediction system based on resistance transfer learning. The clustering module divides the small sample time series data into different source domains and target domains and assigns the corresponding domain labels; the time series data of the source domains and of the target domain are input to the feature extraction module, which performs feature extraction on time slices using a multi-layer one-dimensional convolutional neural network and, after flattening, feeds the result to the domain classification module and the time sequence regression prediction module. In the domain classification module, the data first enters a gradient reversal layer, which avoids segmented training and preserves a better effect in domain-adversarial transfer; it then passes through a plurality of fully connected layers. Each fully connected layer except the last, namely Fc1 and Fc2, is followed in sequence by a ReLU activation function and a Dropout layer (Dropout1 after Fc1, Dropout2 after Fc2) to prevent overfitting, and the last fully connected layer Fc3 is followed by a Softmax function for classifying the data. The time sequence regression prediction module comprises a forward LSTM and a reverse LSTM long short-term memory neural network: in the input sequence, the first item is the original sequence and the second item is its mirror copy, so the feed-forward and feedback inputs complement each other. The module can thus capture global information in the time series data, learn from both past and future feature representations, and finally output a prediction result that takes into account context information from past and future instances.
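The bidirectional combination described above (a forward pass over the original sequence and a backward pass over its mirror copy, with the two output streams merged) can be sketched with a toy recurrence standing in for a real LSTM cell. Everything here is illustrative; in particular, the summation of the two directions and the `step` function are assumptions:

```python
def run_rnn(seq, step, h0=0.0):
    """Run a scalar recurrence h = step(h, x) over seq, collecting outputs."""
    h, out = h0, []
    for x in seq:
        h = step(h, x)
        out.append(h)
    return out

def bidirectional(seq, step):
    """Forward pass plus backward pass over the mirror copy, merged per step."""
    fwd = run_rnn(seq, step)
    bwd = run_rnn(seq[::-1], step)[::-1]   # backward outputs, re-aligned in time
    return [f + b for f, b in zip(fwd, bwd)]
```

Each output position thus sees both past (forward state) and future (backward state) context, which is the property the module relies on.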
The beneficial effects of the invention are as follows: based on the idea of a GAN (generative adversarial network), the invention uses the feature extraction module as the generator and the domain classification module as the discriminator to perform resistance transfer learning on small sample time series data. The system model is trained with source-domain time series data, and the trained system can then predict target-domain time series data, maintaining good performance even when only a small amount of small sample time series data is available.
Example 3
In one embodiment of the invention, a data set on urban air quality is selected as the small sample time series data. The data set consists of six kinds of data collected by the Microsoft urban computing research group in an urban air project, spanning one year, specifically from May 1, 2014 to April 30, 2015, and comprising city data, region data, air quality data, weather forecast data, quality station data and meteorological data. In the air quality data, the feature space contains 12 attributes, namely time, PM10 index, O3 index, CO index, SO2 index, NO2 index, weather, temperature, humidity, atmospheric pressure, wind power and wind direction, which can be used to forecast PM2.5 concentration values. The processor and GPU in this example are an Intel Xeon(R) and a GeForce RTX 3090, implemented with Python 3.8 and the PyTorch 1.7.7 framework, and the training parameters are set as follows: the batch size is 32, the number of epochs is 100, the learning rate is 1e-3, and the drop probability of the Dropout layer is set to 0.3.
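The training parameters quoted above can be collected into a configuration mapping (the key names are assumptions; the values are those stated in the text):

```python
# Hyperparameters from the embodiment; "epochs" stands for the "period size".
train_config = {
    "batch_size": 32,
    "epochs": 100,
    "learning_rate": 1e-3,
    "dropout_p": 0.3,
}
```

Keeping the settings in one mapping makes it easy to log them alongside each training run.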
According to the city distribution, the data set is divided by the clustering module into cluster A and cluster B, where cluster A consists of 19 cities near city A and cluster B covers 24 cities near city B, comprising 43 regions in total; regions 1-42 are used as the training set and region 43 as the test set, with cluster A as the source domain and cluster B as the target domain. The data set contains 2,891,393 hourly records from 437 air quality monitoring stations, and the data items include various air indicators and geographic environment indicators.
As shown in fig. 4, which is a process diagram of the small sample timing prediction system based on resistance transfer learning applied in practice to predicting PM2.5:
In the training process, the clustering module first classifies the air quality data sets of the stations in different regions and divides them into source domains and a target domain. The feature extraction module then extracts features on time slices; with the feature extraction module acting as the generator and the domain classification module as the discriminator, the feature extraction module generates the feature representation of the source domain long time series data and the internal association of the feature attributes, and the domain classification module discriminates them, transferring the features learned on the source domains to the target domain and assisting the feature representation learning of the target-domain time series data to obtain the feature representation of the target domain. Finally, the time sequence regression prediction module predicts the air quality data, and the composite neural network model is optimized according to the final loss function so that it reaches the best predicted value.
In the testing process, the air quality data set of a new station to be predicted is input into the trained composite neural network model: with the domain classification module frozen, the feature extraction module extracts features from the new station's air quality data set, and the time sequence regression prediction module produces the predicted PM2.5 change curve.
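The frozen-discriminator test flow described above — feature extraction followed directly by regression, with the domain classifier unused — can be sketched with hypothetical callables (all names here are assumptions):

```python
def predict_new_site(features, extract, predict, domain_classify=None):
    """Inference path: the domain classifier is frozen/ignored at test time.

    Data flows extractor -> regressor only; `domain_classify` is accepted
    but never called, mirroring the frozen discriminator.
    """
    z = extract(features)     # trained feature extraction module
    return predict(z)         # trained time sequence regression module
```

For example, with `extract = lambda x: [v * 2 for v in x]` and `predict = sum`, the call `predict_new_site([1, 2], extract, predict)` evaluates the same two-stage pipeline.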
The time sequence regression prediction module is used for small sample time sequence prediction. The model comprises a bidirectional LSTM long short-term memory neural network with three layers, each with a hidden size of 128. To improve training performance, the data are scaled to the range [0, 1] by min-max normalization, and any missing value in a column attribute is replaced by the mean of the corresponding column.
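The preprocessing described above (mean imputation of missing values per column, then min-max scaling to [0, 1]) might look as follows; the function name and the use of `None` for missing entries are assumptions:

```python
def preprocess(col):
    """Impute missing values with the column mean, then min-max scale to [0, 1]."""
    known = [v for v in col if v is not None]
    mean = sum(known) / len(known)
    filled = [mean if v is None else v for v in col]   # mean imputation
    lo, hi = min(filled), max(filled)
    return [(v - lo) / (hi - lo) for v in filled]      # min-max normalization
```

Applying this per attribute column gives features in a common [0, 1] range, which the text notes improves training performance.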
In one embodiment of the invention, the root mean square error (RMSE) and the mean absolute error (MAE) are selected as indicators to evaluate the model error and the air quality prediction performance at the target test location. The model performance evaluation indicators are calculated as follows:
$$\mathrm{RMSE}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2},\qquad \mathrm{MAE}=\frac{1}{n}\sum_{i=1}^{n}\left|y_i-\hat{y}_i\right|$$

wherein $n$ is the number of small sample time sequence predictions, $y_i$ is the actual value of the $i$-th small sample time series record in the test set, and $\hat{y}_i$ is the $i$-th result in the small sample time sequence prediction.
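The two indicators are the standard RMSE and MAE definitions; a direct implementation:

```python
from math import sqrt

def rmse(y_true, y_pred):
    """Root mean square error over paired actual/predicted values."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error over paired actual/predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
```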
The PM2.5 concentration value is used as the indicator showing the change over the next month, with a sequence size of 24 x 30, representing one value per hour over one month. The performance of the various models in predicting PM2.5 concentration values is evaluated, with the results shown in Table 1.
TABLE 1 (best values within each group shown in bold)

| Type | Model        | RMSE       | MAE         |
|------|--------------|------------|-------------|
|      | GRU          | 12.482     | 63.6334     |
|      | Conv-GRU     | 12.4194    | 62.6333     |
| DA   | DA-Conv-GRU  | **3.9553** | 40.1001     |
| DG   | DG-Conv-GRU  | 4.5386     | **37.7518** |
|      | LSTM         | 3.7976     | 36.4188     |
|      | Conv-LSTM    | 3.4802     | 35.9811     |
| DA   | DA-Conv-LSTM | **2.8610** | **30.2536** |
| DG   | DG-Conv-LSTM | 3.9908     | 32.5313     |
In Table 1, GRU denotes the gated recurrent unit neural network, LSTM the long short-term memory neural network, and Conv the one-dimensional convolutional neural network; DA denotes a binary classifier for domain adaptation on the hybrid network, in which knowledge is transferred from a source domain to a given target domain, and DG denotes multiple classifiers for domain generalization on the hybrid network, in which knowledge is transferred from multiple source domains to an unknown target domain. Different model combinations produce different losses. As can be seen from Table 1, the hybrid models with transfer strategies DA and DG have smaller RMSE and MAE and better predictive ability; when facing significant differences in distribution between the test and training sets, the transfer strategies significantly reduce the RMSE and MAE of the corresponding models. For the highly interconnected, long-term sequences of the air quality data set, the prediction modules using LSTM perform better than those using GRU, and DA performs better than DG. Compared with the other models, the proposed composite neural network model (DA-Conv-LSTM) attains the smallest root mean square error and mean absolute error, showing excellent performance; the results indicate that the model not only has better prediction performance on the original data but also has good generalization capability.
The beneficial effects of the invention are as follows: through resistance transfer learning, the invention can directly use the model trained on source-domain time series data to help predict target-domain time series data. By using the feature extraction module as the generator and the domain classification module as the discriminator, the invention can predict time series data when only a small amount of target-domain time series data is available, offering accuracy and convenience under data scarcity, and solving the problems of low data utilization and inaccurate prediction caused by the poor generalization capability of traditional models and the excessive difference in data distribution between domains.

Claims (9)

1. The small sample time sequence prediction method based on resistance transfer learning is characterized by comprising the following steps of:
s1: based on the resistance transfer learning, constructing and training a composite neural network model to obtain a trained composite neural network model;
s2: acquiring small sample time sequence data to be predicted, and extracting features on a time slice by utilizing a trained composite neural network model to obtain feature representation of the time sequence data;
s3: and carrying out regression prediction by using the trained composite neural network model according to the characteristic representation of the time sequence data to obtain a small sample time sequence prediction result.
2. The small sample time sequence prediction method based on resistance transfer learning according to claim 1, wherein the specific steps of constructing and training the composite neural network model are as follows:
a1: acquiring time sequence data in the existing multi-domain small sample feature space, and dividing the time sequence data into a training set and a testing set;
A2: dividing the time sequence data of the training set into a source domain and a target domain on different granularities by using a k-means clustering algorithm to obtain source domain long time sequence data and target domain time sequence data;
A3: according to the source domain long time sequence data, utilizing a multi-layer one-dimensional convolutional neural network to extract the characteristics on a time slice, and obtaining the characteristic representation of the source domain long time sequence data and the internal association of the characteristic attribute;
A4: according to the feature representation of the source domain long time series data and the internal association of the feature attributes, domain classification is utilized, the feature representation of the source domain is migrated to the target domain through resistance transfer learning, and feature representation learning is performed on the target-domain time series data to obtain the feature representation of the target domain;
a5: according to the characteristic representation of the target domain and the time sequence data of the target domain, performing regression prediction by using an LSTM long-term memory neural network to obtain a small sample time sequence prediction result;
A6: and calculating to obtain a final loss function of the composite neural network model according to the test set and the small sample time sequence prediction result, judging whether the final loss function is minimum, if so, completing training to obtain a trained composite neural network model, otherwise, returning to A3.
3. The small sample timing prediction method based on resistance transfer learning according to claim 2, wherein the specific steps of A3 are as follows:
A301: according to the source domain long time sequence data, extracting the characteristics on a time slice by utilizing a plurality of one-dimensional convolution layers to obtain the original characteristic representation of the source domain long time sequence data;
A302: and flattening according to the original local characteristics of the source domain long time sequence data to obtain the characteristic representation and the internal association of the characteristic attribute of the source domain long time sequence data.
4. The small sample timing prediction method based on resistance transfer learning according to claim 2, wherein the specific steps of A4 are as follows:
A401: according to the local features of the source domain long time series data and the internal association of the feature attributes, processing is performed using a gradient reversal layer to obtain the local features of the forward source domain long time series data;
A402: and utilizing a plurality of fully connected layers and a Softmax function, the local features of the forward source domain long time series data are migrated to the target domain through resistance transfer learning, assisting the target-domain time series data in feature representation learning to obtain the feature representation of the target domain.
5. The small sample timing prediction method based on resistance transfer learning according to claim 2, wherein the specific steps of A5 are as follows:
A501: according to the characteristic representation of the target domain and the time sequence data of the target domain, performing regression prediction by using a forward LSTM long-term memory neural network to obtain a small sample time sequence forward prediction result;
A502: according to the characteristic representation of the target domain and the time sequence data of the target domain, performing regression prediction by using a reverse LSTM long-term memory neural network to obtain a small sample time sequence reverse prediction result;
A503: and adding according to the forward prediction result and the reverse prediction result to obtain a small sample time sequence prediction result.
6. The small sample timing prediction method based on resistance transfer learning according to claim 2, wherein the expression of the final loss function is as follows:
wherein the quantities appearing in the final loss function are, in order: the final loss function; the global optimum; the weights for feature extraction; the weights for regression prediction; the weights for domain classification; the loss function for regression prediction; the feature-extraction mapping; the regression-prediction mapping; the domain-classification mapping; the total number of domains; the 1st to k-th source domains and the target domain; the loss function for domain classification; the gradient reversal function; the feature values of the small sample time series data in the test set; a single record of the small sample time series data set in the test set; the number of small sample time sequence predictions; the actual value of the i-th small sample time series record in the test set; the i-th result in the small sample time sequence prediction; the actual domain label of the i-th small sample record in the test set; and the predicted domain label of the i-th record.
7. A small sample timing prediction system based on resistance transfer learning for performing the small sample timing prediction method based on resistance transfer learning according to any one of claims 1 to 6, comprising:
The clustering module is used for dividing the time sequence data into a source domain and a target domain on different granularities by using a k-means clustering algorithm to acquire source domain long time sequence data and target domain time sequence data;
the feature extraction module is used for carrying out feature extraction on time slices on the time series data by utilizing a multi-layer one-dimensional convolutional neural network to obtain feature representation of the source domain long time series data and internal association of feature attributes;
The domain classification module is used for migrating the feature representation of the source domain to the target domain through resistance transfer learning according to the feature representation of the source domain long time series data and the internal association of the feature attributes, assisting the feature representation learning of the target-domain time series data, and obtaining the feature representation of the target domain;
and the time sequence regression prediction module is used for predicting the time sequence of the small sample according to the time sequence data and the characteristic representation.
8. The small sample timing prediction system based on resistance transfer learning of claim 7, wherein the feature extraction module comprises: a plurality of one-dimensional convolution layers and a flattening layer connected in sequence, each one-dimensional convolution layer being followed in sequence by a ReLU activation layer and a pooling layer;
the plurality of one-dimensional convolution layers are used for dividing the source domain long time series data into time slices and performing fast feature extraction;
and the flattening layer is used for flattening the original feature representation of the time series data.
9. The small sample timing prediction system based on resistance transfer learning of claim 7, wherein the domain classification module comprises: a gradient reversal layer and a plurality of fully connected layers connected in sequence; the last of the fully connected layers is followed by a Softmax function, and each fully connected layer except the last is followed in sequence by a ReLU activation function and a Dropout layer;
the gradient reversal layer is used for forward processing of the feature representation of the source domain long time series data and the internal association of the feature attributes;
the fully connected layers are used for converting the input feature representation of the source domain long time series data into a feature representation for the specified domain classification task;
and the Dropout layer is used for discarding data with a set probability.
CN202410332311.1A 2024-03-22 2024-03-22 PM2.5 prediction method and system based on resistance transfer learning Active CN117932347B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410332311.1A CN117932347B (en) 2024-03-22 2024-03-22 PM2.5 prediction method and system based on resistance transfer learning

Publications (2)

Publication Number Publication Date
CN117932347A true CN117932347A (en) 2024-04-26
CN117932347B CN117932347B (en) 2024-06-11

Family

ID=90765021

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410332311.1A Active CN117932347B (en) 2024-03-22 2024-03-22 PM2.5 prediction method and system based on resistance transfer learning

Country Status (1)

Country Link
CN (1) CN117932347B (en)

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105930686A (en) * 2016-07-05 2016-09-07 四川大学 Secondary protein structureprediction method based on deep neural network
CN112150209A (en) * 2020-06-19 2020-12-29 南京理工大学 Construction method of CNN-LSTM time sequence prediction model based on clustering center
CN112633658A (en) * 2020-12-16 2021-04-09 广东电网有限责任公司广州供电局 Low-voltage distribution area topological relation identification method based on CNN-LSTM
CN113032917A (en) * 2021-03-03 2021-06-25 安徽大学 Electromechanical bearing fault detection method based on generation countermeasure and convolution cyclic neural network and application system
US20210375441A1 (en) * 2020-05-29 2021-12-02 Regents Of The University Of Minnesota Using clinical notes for icu management
KR102374817B1 (en) * 2021-03-05 2022-03-16 경북대학교 산학협력단 Machinery fault diagnosis method and system based on advanced deep neural networks using clustering analysis of time series properties
CN114239652A (en) * 2021-12-15 2022-03-25 杭州电子科技大学 Clustering-based method for recognizing cross-tested EEG emotion through adaptation of confrontation partial domains
US20220108134A1 (en) * 2020-10-01 2022-04-07 Nvidia Corporation Unsupervised domain adaptation with neural networks
CN115114842A (en) * 2022-04-27 2022-09-27 中国水利水电科学研究院 Rainstorm waterlogging event prediction method based on small sample transfer learning algorithm
CN115640901A (en) * 2022-11-01 2023-01-24 华南理工大学 Small sample load prediction method based on hybrid neural network and generation countermeasure
US20230039900A1 (en) * 2021-08-07 2023-02-09 Fuzhou University Method for realizing a multi-channel convolutional recurrent neural network eeg emotion recognition model using transfer learning
CN115730635A (en) * 2022-12-06 2023-03-03 江南大学 Electric vehicle load prediction method
CN116849697A (en) * 2023-05-22 2023-10-10 四川大学 Basin bottom dysfunction assessment method based on self-supervision transfer learning
CN117371543A (en) * 2023-08-31 2024-01-09 浙江工业大学 Enhanced soft measurement method based on time sequence diffusion probability model
CN117494584A (en) * 2023-12-28 2024-02-02 湖南大学 High-dimensional reliability design optimization method based on neural network anti-migration learning
CN117688362A (en) * 2023-12-12 2024-03-12 天津大学 Photovoltaic power interval prediction method and device based on multivariate data feature enhancement

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ZENG, Yuyang et al.: "Cross-Domain Text Sentiment Classification Method Based on the CNN-BiLSTM-TE Model", Journal of Information Processing Systems, 31 December 2021 *
SONG Chuang; ZHAO Jiajia; WANG Kang; LIANG Xinkai: "A Survey of Few-Shot Learning Research for Intelligent Perception", Acta Aeronautica et Astronautica Sinica, no. 1, 31 December 2020 *
DONG Liang et al.: "Image Classification Based on Convolutional Neural Network and Transfer Learning", Information & Computer (Theory Edition), 31 December 2021 *

Also Published As

Publication number Publication date
CN117932347B (en) 2024-06-11

Similar Documents

Publication Publication Date Title
CN109492822B (en) Air pollutant concentration time-space domain correlation prediction method
CN109142171B (en) Urban PM10 concentration prediction method based on feature expansion and fusing with neural network
CN109117883B (en) SAR image sea ice classification method and system based on long-time memory network
Wu et al. A hybrid support vector regression approach for rainfall forecasting using particle swarm optimization and projection pursuit technology
CN109086926B (en) Short-time rail transit passenger flow prediction method based on combined neural network structure
CN112232543A (en) Multi-site prediction method based on graph convolution network
Chiang et al. Hybrid time-series framework for daily-based PM 2.5 forecasting
Li et al. Deep spatio-temporal wind power forecasting
CN112598165A (en) Private car data-based urban functional area transfer flow prediction method and device
CN111222689A (en) LSTM load prediction method, medium, and electronic device based on multi-scale temporal features
CN112285376A (en) Wind speed prediction method based on CNN-LSTM
CN116307103A (en) Traffic accident prediction method based on hard parameter sharing multitask learning
CN114596726B (en) Parking berth prediction method based on interpretable space-time attention mechanism
CN117390506A (en) Ship path classification method based on grid coding and textRCNN
CN117436653A (en) Prediction model construction method and prediction method for travel demands of network about vehicles
CN117932347B (en) PM2.5 prediction method and system based on resistance transfer learning
CN117152427A (en) Remote sensing image semantic segmentation method and system based on diffusion model and knowledge distillation
CN116245259A (en) Photovoltaic power generation prediction method and device based on depth feature selection and electronic equipment
CN115481788A (en) Load prediction method and system for phase change energy storage system
Mao et al. Naive Bayesian algorithm classification model with local attribute weighted based on KNN
Sangeetha et al. Crime Rate Prediction and Prevention: Unleashing the Power of Deep Learning
Bi et al. Multi-indicator water time series imputation with autoregressive generative adversarial networks
Liu et al. Short-term Load Forecasting Approach with SVM and Similar Days Based on United Data Mining Technology
Chen et al. Combining random forest and graph wavenet for spatial-temporal data prediction
Shi Image Recognition of Skeletal Action for Online Physical Education Class based on Convolutional Neural Network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant