CN116720158A - Time sequence regression prediction method and system with uncertainty estimation - Google Patents
- Publication number: CN116720158A (also listed: CN202310463764.3A)
- Authority: CN (China)
- Prior art keywords: uncertainty, time sequence, prediction, model, output
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G06F18/27: Regression, e.g. linear or logistic regression
- G06F18/213: Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06N3/045: Combinations of networks
- G06N3/0455: Auto-encoder networks; Encoder-decoder networks
- G06N3/084: Backpropagation, e.g. using gradient descent
- Y04S10/50: Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
Abstract
The application provides a time series regression prediction method and system with uncertainty estimation. The system comprises a data acquisition and preprocessing module, a neural network module, a training module and a testing module. In the method according to the embodiments of the application, a time series prediction and uncertainty model is constructed on the basis of an attention mechanism; the model extracts features from the input time series and decodes the output time series and the uncertainty of the prediction at the same time. This effectively improves the practical reference value of the time series prediction result and, compared with existing uncertainty estimation methods, offers higher execution efficiency and lower time and space cost. The method is also applied to discharge modeling of tokamak 0-dimensional diagnostic physical quantities, making the modeling results more reliable and of greater practical value.
Description
Technical Field
The application belongs to the field of deep learning and uncertainty measurement, and particularly relates to a time sequence regression prediction method and system with uncertainty estimation.
Background
Deep neural networks are a powerful machine learning tool that can be used to address time series prediction problems. However, deep neural networks often face many challenges in practical applications, such as insufficient sample size, skew of data set distribution, unknown input, etc., which can lead to uncertainty in the model, which can adversely affect the prediction results of the deep neural network, and reduce the accuracy and reliability of the prediction. For example, in the financial field, uncertainty may lead to models that cannot accurately predict stock price or exchange rate fluctuations, thereby bringing losses to investors; in the traffic transportation field, uncertainty may cause inaccurate traffic prediction, thereby causing inconvenience to traffic planners and passengers; in the field of controlled nuclear fusion, uncertainty may lead to inaccurate discharge predictions, thereby affecting safe and stable operation of the device, and possibly even damaging the device.
Therefore, the uncertainty measurement of the deep neural network is very important, and how to effectively measure and manage the uncertainty in the deep learning model is a research hotspot in the current deep learning field.
A search of prior patents finds almost no uncertainty measurement methods for time series; the existing literature estimates uncertainty with methods such as Monte Carlo dropout and model ensembling.
The disadvantages of the existing uncertainty measurement methods mainly include the following aspects:
1. High computational cost: these methods require running multiple models or multiple sampling passes, and are therefore computationally expensive, especially when high-precision predictions or large amounts of data are required;
2. Poor interpretability: the uncertainty metrics produced by these methods are often difficult to interpret, so applications that use them cannot clearly explain the source of the prediction uncertainty. This makes them less suitable for critical application scenarios such as controlled fusion, medicine and finance.
In summary, uncertainty measurement is a very important research field in deep learning, and research results can help us to better understand the performance and behavior of the deep learning model, so as to improve the effect and reliability of the deep learning model in practical application, and efficient low-cost modeling for uncertainty estimation of time series still remains a challenge.
Disclosure of Invention
In order to solve the problems that uncertainty in deep neural networks may affect the accuracy of prediction results and that existing uncertainty measurement methods are computationally expensive, the application provides a time series regression prediction method with uncertainty estimation, which models a time series and its prediction uncertainty by means of an attention mechanism. The model is further trained on tokamak discharge experimental data, achieving fast, high-fidelity modeling of diagnostic signal time series and of their prediction uncertainty, and enabling verification of the 0-dimensional diagnostic physical quantities of a tokamak experimental proposal.
In order to achieve the above purpose, the application adopts the following technical scheme:
A time series regression prediction method with uncertainty estimation, constructed on an attention-based deep learning model, specifically comprises the following steps:
S1, data acquisition and preprocessing: acquiring time series data, including an input time series and an output time series, and resampling and standardizing the data to build a data set for model training and testing;
S2, constructing an attention-based time series prediction and uncertainty estimation model: the input time series is fed into the model, which extracts features from it and maps them into a latent space; the output time series and its uncertainty are then obtained through a time-series output module and an uncertainty estimation module, respectively;
S3, training the model: first, the loss between the target output time series and the prediction of the time-series output module is calculated according to a loss function L1, and the model parameters are optimized with the error back-propagation algorithm; then, on the basis of the existing model parameters, the loss between the actual deviation and the prediction of the uncertainty estimation module is calculated according to a loss function L2, and the parameters of the uncertainty estimation network in the model are optimized with the error back-propagation algorithm, finally yielding the optimal time series regression prediction model with uncertainty estimation;
S4, testing and verifying the validity of the model: the test set data are input into the trained time series prediction and uncertainty model, which outputs the predicted time series and its prediction uncertainty.
Further, the attention-based time series prediction and uncertainty model constructed in step S2 comprises a position encoder, an input encoder, a time-series output module, a direct uncertainty estimation module and a linear output layer:
Position encoder: adds timing information to the data to help the model learn the relative and absolute position information of the data;
Input encoder: extracts features from the input time series, compresses it into semantic vectors of a specified length, and maps them into a latent space;
Time-series output module: decodes the feature vectors that the input encoder mapped into the latent space to obtain global features of the output time series, and obtains the final expression of the output time series through a linear fully connected layer;
Direct uncertainty estimation module: decodes the feature vectors that the input encoder mapped into the latent space to obtain global features of the time series uncertainty, and obtains the final expression of the uncertainty time series through a linear fully connected layer;
Linear output layer: applies a simple linear mapping to the target output dimension to the outputs of the time-series output module and the direct uncertainty estimation module.
Further, the loss functions L1 and L2 use a masking mechanism to calculate an effective mean square error loss and a weighted mean square error loss according to the effective length of the time sequence.
Further, the position encoder encodes the position information of the original tensor by using a periodic function to obtain a time sequence tensor, and combines the time sequence tensor with the original input tensor, so that the model has the capability of learning time sequence information.
Further, the direct uncertainty estimation module comprises an uncertainty estimation decoder and a linear output layer. The uncertainty estimation decoder decodes the latent-space mapping of the input features to obtain global features of the time series uncertainty; the linear output layer maps the uncertainty global features into the uncertainty output vector space of the samples by adjusting their weights.
Further, the uncertainty estimation decoder comprises a multi-head attention mechanism and a fully connected neural network. The multi-head attention mechanism builds several attention heads by means of linear layers; each head focuses on a different part of the input information, and the heads are then concatenated, which enhances the expressive power of the model. The fully connected neural network is a series connection of several linear layers and the ReLU activation function; composing simple linear and nonlinear processing units yields relatively complex nonlinear processing capability.
In another aspect, the application provides a time series regression prediction system with uncertainty estimation, which constructs a time series prediction and uncertainty model based on an attention mechanism and specifically comprises the following modules:
Data acquisition and preprocessing module: used for acquiring time series data, including an input time series and an output time series, and for resampling and standardizing the data to build a data set for model training and testing;
Neural network module: the neural network comprises a position encoder, an input encoder, a time-series output module, a direct uncertainty estimation module and a linear output layer, and is used for obtaining the output time series and its uncertainty from the input time series;
Training module: used for training the neural network on the training data set of the task to obtain a trained neural network;
Testing module: used for inputting the input time series data of the test set into the trained neural network model to obtain the predicted time series and its uncertainty.
In an embodiment of the present application, an electronic device is provided that includes a readable storage medium, a central processing unit, and a graphics processor. The readable storage medium has stored thereon a computer program executable by the central processor and the graphics processor, which when executed by the central processor and the graphics processor implement the steps of the time series prediction method.
In an embodiment of the application, a readable storage medium is provided, on which a computer program is stored which, when being executed by a processor, performs the steps of the time-series regression prediction model method with uncertainty estimation.
The beneficial effects of the application are as follows:
1. A time series regression prediction model with uncertainty estimation is built on an attention-based deep learning model. The model extracts features from the input time series and decodes the output time series and the uncertainty of the prediction at the same time, which effectively improves the practical reference value of the prediction result and gives higher execution efficiency and lower time and space cost than existing uncertainty estimation methods.
2. The attention-based time series regression prediction model with uncertainty estimation is innovatively applied to discharge modeling of tokamak 0-dimensional diagnostic physical quantities, making the modeling results more reliable and of practical value.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the embodiments or the description of the prior art will be briefly described below, and it is apparent that the drawings in the following description are only for the purpose of illustrating the preferred embodiments and are not to be construed as limiting the present application.
FIG. 1 is a schematic flow chart of a time series regression prediction method with uncertainty estimation according to an embodiment of the present application;
FIG. 2 is a basic flowchart of acquiring time series data according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a neural network model for attention-based time series prediction and uncertainty estimation according to an embodiment of the present application;
FIG. 4 is a basic flow chart of training a neural network to obtain a trained neural network according to an embodiment of the present application;
FIG. 5 is a schematic diagram of modeling effect of a time series prediction and uncertainty estimation model based on attention according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a time series prediction apparatus according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application. In addition, the technical features of the embodiments of the present application described below may be combined with each other as long as they do not collide with each other.
The application models the time series regression problem by using a time series prediction and uncertainty estimation model based on an attention mechanism, so that the model outputs a time series prediction result and uncertainty thereof at the same time. Therefore, the main contribution of the application is the construction of a time series prediction and uncertainty estimation model based on the attention mechanism, and in addition, in the embodiment, discharge modeling and uncertainty estimation modeling are performed on the Tokamak 0-dimensional diagnostic physical quantity.
As shown in FIG. 1, the present application provides a time series regression prediction method with uncertainty estimation, comprising the following steps:
S1, data acquisition and preprocessing: acquiring time series data, including an input time series and an output time series, and resampling and standardizing the data to build a data set for model training and testing;
S2, constructing an attention-based time series prediction and uncertainty estimation model: the input time series is fed into the model, which extracts features from it and maps them into a latent space; the output time series and its uncertainty are then obtained through a time-series output module and an uncertainty estimation module, respectively;
S3, training the model: first, the loss between the target output time series and the prediction of the time-series output module is calculated according to a loss function L1, and the model parameters are optimized with the error back-propagation algorithm; then, on the basis of the existing model parameters, the loss between the actual deviation and the prediction of the uncertainty estimation module is calculated according to a loss function L2, and the parameters of the uncertainty estimation network in the model are optimized with the error back-propagation algorithm, finally yielding the optimal time series regression prediction model with uncertainty estimation;
S4, testing and verifying the validity of the model: the test set data are input into the trained time series prediction and uncertainty model, which outputs the predicted time series and its prediction uncertainty.
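The two-stage idea in step S3 can be illustrated with a toy sketch. Everything here is an assumption made for illustration: a linear point predictor and a linear uncertainty head stand in for the attention-based network, plain mean square error stands in for both L1 and L2, the synthetic data are invented, and the point predictor is simply held fixed during the second stage.

```python
import numpy as np

def fit_linear(features, target, lr=0.1, epochs=2000):
    """Gradient descent on the mean square error of target ~ features @ theta."""
    theta = np.zeros(features.shape[1])
    for _ in range(epochs):
        grad = 2.0 * features.T @ (features @ theta - target) / len(target)
        theta -= lr * grad
    return theta

# Synthetic data (hypothetical): a linear trend plus a disturbance that grows with |x|.
x = np.linspace(-1.0, 1.0, 201)
y = 2.0 * x + 1.0 + 0.3 * x * np.cos(50.0 * x)

# Stage 1: fit the point predictor against the target series.
f_pred = np.stack([x, np.ones_like(x)], axis=1)
theta_pred = fit_linear(f_pred, y)
y_hat = f_pred @ theta_pred

# Stage 2: keep the predictor fixed and fit the uncertainty head
# against the actual deviation |y - y_hat| of the stage-1 prediction.
deviation = np.abs(y - y_hat)
f_unc = np.stack([np.abs(x), np.ones_like(x)], axis=1)
theta_unc = fit_linear(f_unc, deviation)
uncertainty = f_unc @ theta_unc
```

In this sketch the uncertainty head learns that the deviation grows with |x|, so a single forward pass yields both a prediction and an uncertainty estimate, without repeated sampling or multiple models.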
Specifically, as shown in fig. 2, the data acquisition and preprocessing process in step S1 includes the following steps:
S11, acquiring the original time series data: taking a tokamak dynamical system as an example, the original time series data are read from an MDSplus database and stored in separate HDF5 files by experiment sequence for subsequent use;
S12, resampling the original time series data at a fixed sampling rate to obtain resampled time series, reducing the computational cost while retaining sufficient information;
S13, standardizing the resampled time series data with the Z-Score standardization method, so that the processed data have a mean of 0 and a standard deviation of 1;
S14, splitting the preprocessed time series data in a 4:4:2 ratio into two training sets and one test set.
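Steps S12 to S14 can be sketched as follows. This is a minimal illustration under stated assumptions: resampling is done by linear interpolation, the split is sequential and made over a list of items (for example, shots), and the function names are hypothetical; the source does not specify any of these details.

```python
import numpy as np

def resample_fixed_rate(t, x, rate_hz):
    """Linearly interpolate samples (t, x) onto a uniform grid at rate_hz (assumed method)."""
    n = int(np.floor((t[-1] - t[0]) * rate_hz)) + 1
    t_new = t[0] + np.arange(n) / rate_hz
    return t_new, np.interp(t_new, t, x)

def z_score(x):
    """Z-Score standardization: subtract the mean and divide by the standard deviation."""
    mean, std = x.mean(), x.std()
    return (x - mean) / std, mean, std

def split_4_4_2(items):
    """Sequential 4:4:2 split into two training sets and one test set (assumed ordering)."""
    n = len(items)
    a, b = n * 4 // 10, n * 8 // 10
    return items[:a], items[a:b], items[b:]

# Usage on a synthetic signal sampled at 1 kHz and resampled to 100 Hz.
t = np.linspace(0.0, 1.0, 1001)
x = 3.0 * t + 2.0
t_rs, x_rs = resample_fixed_rate(t, x, rate_hz=100)
z, mean, std = z_score(x_rs)
train_a, train_b, test = split_4_4_2(list(range(10)))
```

The mean and standard deviation are returned so that the same statistics can be reused to standardize test data or to map predictions back to physical units.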
Specifically, as shown in fig. 3, the time series regression prediction model with uncertainty estimation includes a position encoder, an input encoder, a time series output module, a direct uncertainty estimation module and a linear output layer.
The position encoder encodes the position information of the original tensor by using a periodic function to obtain a time sequence tensor, and combines the time sequence tensor with the original input tensor, so that the model has the capability of learning the relative position information and the absolute position information of the time sequence.
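A position encoder of this kind can be sketched as below. The source only states that a periodic function is used; the sine/cosine form with base 10000 and the combination by addition are assumptions borrowed from the standard Transformer encoding, and the function names are hypothetical.

```python
import numpy as np

def sinusoidal_position_encoding(seq_len, d_model):
    """Return a (seq_len, d_model) tensor of sine/cosine position codes."""
    pos = np.arange(seq_len)[:, None]
    i = np.arange(0, d_model, 2)[None, :]
    angle = pos / np.power(10000.0, i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angle)                     # even channels
    pe[:, 1::2] = np.cos(angle[:, : d_model // 2])  # odd channels
    return pe

def add_position_info(x):
    """Combine the position tensor with the input tensor by addition (assumed)."""
    seq_len, d_model = x.shape
    return x + sinusoidal_position_encoding(seq_len, d_model)

pe = sinusoidal_position_encoding(4, 8)
```

Because each channel oscillates at a different frequency, every time step receives a distinct code, and the offset between two steps corresponds to a fixed phase shift per channel.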
The input encoder extracts features from the input time series, compresses it into semantic vectors of a specified length, and maps them into the latent space.
The time-series output module decodes the feature vectors that the input encoder mapped into the latent space to obtain global features of the output diagnostic signal, and obtains the final expression of the diagnostic signal time series through a linear fully connected layer.
The direct uncertainty estimation module includes an uncertainty estimation decoder and a regression output layer.
The attention-based uncertainty estimation decoder consists of a multi-head attention mechanism and a fully connected neural network, and decodes the latent-space mapping of the input features to obtain global features of the time series uncertainty. The multi-head attention mechanism splits the input vector into several heads, performs the attention calculation on each head, and finally concatenates the results of all heads to obtain the final output vector. In this way the model extracts task-related information and attends to different aspects of the information, which improves its generalization capability and effect. Each head is calculated with the following function:
$$\mathrm{head}_i = \mathrm{softmax}\!\left(\frac{(QW_i^{Q})(KW_i^{K})^{T}}{\sqrt{d_{model}/h}}\right)VW_i^{V}$$
wherein Q, K and V are the inputs of the multi-head attention mechanism and are identical here; softmax is an activation function that normalizes a numerical vector into a probability distribution vector whose entries sum to 1; T denotes the transpose operation; $d_{model}$ is the dimension of the position vector, which equals the hidden-state dimension of the whole model; h is the number of heads in the multi-head attention mechanism, with $i \in [1, h]$; $W_i^{Q}$, $W_i^{K}$ and $W_i^{V}$ are the weight matrices of Q, K and V, respectively.
The fully connected neural network applies the following function, which performs relatively complex nonlinear processing on the output of the attention mechanism:
$$\mathrm{FFN}(x) = W_2^{T}\,\mathrm{ReLU}(W_1^{T}x + b_1) + b_2$$
wherein x is the input of the fully connected neural network; ReLU is the rectified linear unit activation function; $W_1$ and $W_2$ are the weight parameters of the two linear layers in the fully connected neural network; T denotes the transpose operation; $b_1$ and $b_2$ are the biases of the two linear layers in the fully connected neural network.
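The multi-head self-attention and fully connected computations described above can be sketched in NumPy as follows. This is an illustrative sketch, not the patented implementation: the per-head dimension d_model / h, the scaling by its square root, the output projection `w_o`, and the row-vector convention (`x @ W` rather than a transposed weight applied to a column vector) are assumptions taken from the standard Transformer formulation.

```python
import numpy as np

def softmax(z, axis=-1):
    """Normalize scores into probabilities that sum to 1 along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, w_q, w_k, w_v, w_o):
    """x: (seq, d_model); w_q, w_k, w_v: (h, d_model, d_k); w_o: (h * d_k, d_model)."""
    heads = []
    for wq, wk, wv in zip(w_q, w_k, w_v):
        q, k, v = x @ wq, x @ wk, x @ wv            # Q, K and V share the same input
        weights = softmax(q @ k.T / np.sqrt(q.shape[-1]))
        heads.append(weights @ v)
    return np.concatenate(heads, axis=-1) @ w_o     # concatenate heads, then project

def feed_forward(x, w1, b1, w2, b2):
    """Two linear layers with a ReLU in between."""
    return np.maximum(x @ w1 + b1, 0.0) @ w2 + b2

# Usage with small random weights (shapes only; values are arbitrary).
rng = np.random.default_rng(0)
seq, d_model, h = 5, 8, 2
d_k = d_model // h
x = rng.normal(size=(seq, d_model))
w_q, w_k, w_v = (rng.normal(size=(h, d_model, d_k)) for _ in range(3))
w_o = rng.normal(size=(h * d_k, d_model))
attended = multi_head_self_attention(x, w_q, w_k, w_v, w_o)
```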
The linear output layer is a simple one-dimensional linear layer. By adjusting its weights it maps the uncertainty global features into the uncertainty output vector space of the samples, and this simple linear mapping aligns the model output tensor with the 0-dimensional diagnostic data tensor.
Specifically, as shown in fig. 4, the model training process in step S3 includes the following steps:
S31, randomly initializing the weight and bias parameters of the neural network;
S32, initializing the neural network hyperparameters and a stochastic gradient descent optimizer, the hyperparameters including the batch size, learning rate and number of iterations;
S33, calculating the loss between the target output time series and the prediction of the time-series output module according to the loss function L1, and optimizing the model parameters with the error back-propagation algorithm;
S34, on the basis of the existing model parameters, calculating the loss between the actual deviation and the prediction of the uncertainty estimation module according to the loss function L2, and optimizing the parameters of the uncertainty estimation network in the model with the error back-propagation algorithm;
The loss functions L1 and L2 use a masking mechanism and, according to the effective length of the time series, compute the effective mean square error loss VMSE and the effective weighted mean square error loss VWMSE, respectively;
The effective mean square error loss function VMSE is calculated as follows:
$$\mathrm{VMSE} = \frac{1}{n_V}\sum W_V \odot (Y - \hat{Y})^{2}$$
wherein $n_V$ is the effective length of the time series; $W_V$ is a matrix containing only 0 and 1, used to extract the effective part of the mean square error of the time series; $Y$ is the real experimental data; $\hat{Y}$ is the model prediction output.
The effective weighted mean square error loss VWMSE is calculated as follows:
where $n_{V\_in}$ is the number of time slices that are effective and within the coverage of the prediction uncertainty; $n_{V\_out}$ is the number of time slices that are effective and outside the coverage of the prediction uncertainty; $y_{V\_in}$ is the real experimental data within the coverage of the prediction uncertainty; $\hat{y}_{V\_in}$ is the model prediction output within the coverage of the prediction uncertainty; $y_{V\_out}$ is the real experimental data outside the coverage of the prediction uncertainty; $\hat{y}_{V\_out}$ is the model prediction output outside the coverage of the prediction uncertainty; $\lambda$ is a parameter that balances the importance of coverage against uncertainty width; $\mu$ is a predefined target uncertainty coverage; and PUCP is the prediction uncertainty coverage, calculated as follows:
where $N$ is the time series length and $a_i$ is a binary value calculated as follows:
where $y_i$ is the target value, $\hat{y}_i$ is the point prediction, and uncertainty is the predicted uncertainty.
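The coverage metric can be illustrated with a short sketch. It assumes that PUCP is the fraction of time slices whose target value falls inside a symmetric band around the point prediction, that is, $\mathrm{PUCP} = \frac{1}{N}\sum_{i=1}^{N} a_i$ with $a_i = 1$ when $\hat{y}_i - \text{uncertainty}_i \le y_i \le \hat{y}_i + \text{uncertainty}_i$ and $a_i = 0$ otherwise. The symmetric band, the inclusive bounds, and the function name `pucp` are assumptions, since the formulas themselves are not present in this text.

```python
import numpy as np


def pucp(y_true, y_pred, uncertainty):
    """Fraction of points whose target lies within y_pred +/- uncertainty.

    Assumes a symmetric band with inclusive bounds (an assumption, see lead-in).
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    uncertainty = np.asarray(uncertainty, dtype=float)
    # a_i: 1 where the target is covered by the uncertainty band, else 0.
    covered = np.abs(y_true - y_pred) <= uncertainty
    return covered.mean()


targets = [1.0, 2.0, 3.0, 4.0]
predictions = [1.1, 2.5, 3.0, 3.0]
bands = [0.2, 0.2, 0.2, 0.5]
coverage = pucp(targets, predictions, bands)  # first and third points are covered
```

During training this value would be compared with the target coverage $\mu$; a coverage below $\mu$ indicates bands that are too narrow.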
S35, finally obtaining the optimal time series regression prediction model with uncertainty estimation.
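The two-stage procedure of S33 and S34 can be sketched on a toy problem. The example below is a deliberately simplified stand-in, not the model of the application: it replaces the attention-based network with two linear models, replaces VMSE and VWMSE with a plain mean square error, and uses synthetic data whose noise grows with |x|. All names (`fit_linear`, `w1`, `w2`, and so on) are hypothetical.

```python
import numpy as np


def fit_linear(features, target, lr=0.1, steps=2000):
    """Gradient descent on plain MSE for target ~ features @ w + b."""
    n, d = features.shape
    w, b = np.zeros(d), 0.0
    for _ in range(steps):
        residual = features @ w + b - target
        w -= lr * 2.0 * features.T @ residual / n
        b -= lr * 2.0 * residual.mean()
    return w, b


rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(2000, 1))
noise = rng.normal(size=2000) * 0.5 * np.abs(x[:, 0])  # noise grows with |x|
y = 2.0 * x[:, 0] + noise

# Stage 1 (S33): fit the point predictor against the target series.
w1, b1 = fit_linear(x, y)
y_hat = x @ w1 + b1

# Stage 2 (S34): keep the predictor fixed and train only the uncertainty
# head, using the actual deviation |y - y_hat| as its regression target.
deviation = np.abs(y - y_hat)
w2, b2 = fit_linear(np.abs(x), deviation)
uncertainty = np.abs(x) @ w2 + b2
```

Because the second stage never updates `w1` or `b1`, the point prediction is unchanged by uncertainty training, which mirrors the "on the basis of the existing model parameters" wording of S34.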
The time sequence prediction method provided by the embodiment can be applied to various tasks, and the task of tokamak discharge modeling is taken as an example for discussion below.
The model for discharge modeling and uncertainty estimation of tokamak 0-dimensional diagnostic physical quantities is implemented as follows. Its input data are 92 channels of control-signal time series, including the plasma current feedforward, the poloidal field coil currents, the toroidal magnetic field, the power of the lower hybrid current drive and heating system, the neutral beam injection system, the ion cyclotron resonance heating system, the electron cyclotron resonance heating/current drive system, the gas puffing system, supersonic molecular beam injection, the particle injection system, and the plasma shape feedforward. Its modeling targets are 11 diagnostic signals: the actual plasma current $I_p$, the mean plasma electron density $n_e$ at the tokamak magnetic axis, the plasma stored energy $W_{mhd}$, the tokamak loop voltage $V_{loop}$, the normalized beta $\beta_n$, the toroidal beta $\beta_t$, the poloidal beta $\beta_p$, the elongation $k$, the internal inductance $l_i$, the safety factor $q_0$, and the safety factor $q_{95}$ on the 95% flux surface. As shown in fig. 2, the numbers of stacked layers D0, D1, and D2 are all 6, and the hidden layer size $d_{model}$ is 512.
With the target uncertainty coverage $\mu$ set to 0.9 and the parameter $\lambda$ set to 4, the discharge modeling and uncertainty estimation results for the tokamak 0-dimensional diagnostic physical quantities are shown in fig. 5. Shot #73873 is selected as test data; its discharge lasts more than 70 s, and its sequence length exceeds $7 \times 10^4$. The specific process is as follows. First, the inputs of the tokamak control system are set: in this embodiment, the control system source file corresponding to shot #73873 is called directly, and the actual actuator input signals of shot #73873 are used as the system input. Next, the data are converted to the Tensor type and loaded onto the GPU, the trained deep learning model is loaded onto the GPU, and the model computes on the input data to obtain the modeling results of the 11 0-dimensional diagnostic physical quantities together with their prediction uncertainty. Finally, the data are visualized.
On the Experimental Advanced Superconducting Tokamak (EAST) device in Hefei, 932 shots are taken as the test set, and the performance of the model on the whole test set is evaluated. Good performance on the test set indicates that the modeling results of the application are accurate and reliable and have practical value.
In another aspect, as shown in fig. 6, the present application provides a time series regression prediction system with uncertainty estimation, the system comprising:
a data acquisition and preprocessing module, configured to acquire time series data, including an input time series and an output time series, and to perform resampling and standardization preprocessing on the data to establish a data set for model training and testing;
a neural network module, in which the neural network comprises a position encoder, an input encoder, a time-series output module, a direct uncertainty estimation module and a linear output layer, configured to obtain an output time series and its uncertainty from an input time series;
a training module, configured to train the neural network with the training data set of the task to obtain a trained neural network;
a testing module, configured to input the input time series data of the test set into the trained neural network model to obtain the predicted time series and its uncertainty.
As shown in FIG. 7, the present application also provides an electronic device comprising a readable storage medium, a core processor, and a graphics processor. A computer program is stored on the readable storage medium and is executable on the core processor and the graphics processor; when executed, it implements the steps of the time series regression prediction method with uncertainty estimation of the present embodiment.
Specifically, the readable storage medium is a general-purpose storage medium, such as a removable disk, a hard disk, or an optical disk. The storage medium stores a computer program which, when executed by a processor, implements each process of the above embodiment of the time series prediction method with uncertainty estimation and achieves the same technical effect; to avoid repetition, the details are not described again here.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the scope of the embodiments of the application.
Claims (17)
1. A time series regression prediction method with uncertainty estimation, characterized by constructing a time series prediction and uncertainty model based on an attention mechanism, and specifically comprising the following steps:
S1, data acquisition and preprocessing: acquiring time series data, including an input time series and an output time series, and performing resampling, standardization, and discharge-type labeling preprocessing on the data to establish a data set for model training and testing;
S2, constructing an attention-based time series prediction and uncertainty estimation model: the input time series is fed into the attention-based time series prediction and uncertainty estimation model, the model extracts the features of the input time series and maps them to a latent space, and the output time series and its uncertainty are obtained through a time-series output module and an uncertainty estimation module, respectively;
S3, training the model: first, calculating the loss with a loss function L1 from the target output time series and the prediction output of the time-series output module, and optimizing the model parameters with an error back-propagation algorithm; then, on the basis of the existing model parameters, calculating the loss with a loss function L2 from the actual deviation and the prediction output of the uncertainty estimation module, and optimizing the parameters of the uncertainty estimation network in the model with the error back-propagation algorithm, to finally obtain the optimal time series regression prediction model with uncertainty estimation;
S4, testing and verifying the validity of the model: inputting the test set data into the trained time series prediction and uncertainty model, and outputting the model's predicted time series and its prediction uncertainty.
2. The time series regression prediction method with uncertainty estimation according to claim 1, wherein the data acquisition and preprocessing process in S1 comprises the steps of:
S11, acquiring original time series data, wherein the original time series data are read from an MDSplus database and stored in separate HDF5 files according to the experiment number for subsequent use;
S12, resampling the original time series data at a fixed sampling rate to obtain a resampled time series, reducing the computational cost while retaining sufficient data information;
S13, standardizing the resampled time series data with the Z-Score standardization method, so that the processed data have a mean of 0 and a standard deviation of 1;
S14, splitting the preprocessed time series data in a ratio of 4:4:2 to obtain two training sets and one test set.
3. The time series regression prediction method with uncertainty estimation of claim 1, wherein the time series prediction and uncertainty model based on the attention mechanism constructed in S2 comprises:
a position encoder, which encodes the position information of the original tensor with a periodic function to obtain a time-sequence tensor and combines the time-sequence tensor with the original input tensor, so that the model is able to learn the relative and absolute position information of the time series;
an input encoder, which extracts the features of the input time series, compresses the input time series into semantic vectors of a specified length, and maps the semantic vectors into a latent space;
a time-series output module, which decodes the feature vectors mapped to the latent space by the input encoder to obtain the global features of the output time series, and obtains the final expression of the output time series through a linear fully connected layer;
a direct uncertainty estimation module, which decodes the feature vectors mapped to the latent space by the input encoder to obtain the global features of the time series uncertainty, and obtains the final expression of the uncertainty time series through a linear fully connected layer;
a linear output layer, which linearly maps the outputs of the time-series output module and the direct uncertainty estimation module to the target output dimension.
4. The time series regression prediction method with uncertainty estimation of claim 1 wherein the loss functions L1, L2 in S3 use a masking mechanism to calculate the effective mean square error loss and weighted mean square error loss from the effective length of the time series.
5. The time series regression prediction method with uncertainty estimation of claim 2 wherein the position encoder encodes the position information of the original tensor with a periodic function to obtain a time series tensor and combines the time series tensor with the original input tensor to enable the model to learn the time series information.
6. The method of claim 2, wherein the direct uncertainty estimation module comprises an uncertainty estimation decoder and a linear output layer, wherein the uncertainty estimation decoder decodes the global features of the time series uncertainty from the mapping of the input features in the latent space, and the linear output layer maps the uncertainty global features into the uncertainty output vector space of the samples by adjusting its weights.
7. The time series regression prediction method with uncertainty estimation according to claim 5, wherein the uncertainty estimation decoder is composed of a multi-head attention mechanism and a fully connected neural network; the multi-head attention mechanism establishes a plurality of attention heads by means of linear layers, each head focusing on a different part of the input information, and then concatenates them, which enhances the expressive power of the model; the fully connected neural network is a series connection of a plurality of linear layers and the activation function ReLU, and obtains relatively complex nonlinear processing capability through the composite mapping of simple linear and nonlinear processing units.
8. The time series regression prediction method with uncertainty estimation according to claim 3, wherein the position encoder uses sine and cosine functions to add timing information to the original vector, which helps the model learn the relative and absolute position information of the data.
9. The time series regression prediction method with uncertainty estimation of claim 2, wherein
the uncertainty estimation decoder based on the attention mechanism consists of a multi-head attention mechanism and a fully connected neural network; the multi-head attention mechanism splits the input vector into a plurality of heads, performs the attention calculation on each head, and finally concatenates the results of all heads to obtain the final output vector, thereby extracting task-related information, increasing the model's attention to different aspects of the information, and improving the generalization capability and performance of the model, wherein each head is calculated with the following function:
where $Q$, $K$, $V$ are the inputs to the multi-head attention mechanism, and here $Q$, $K$, and $V$ are the same; softmax is an activation function that normalizes a numerical vector into a probability distribution vector whose probabilities sum to 1; $T$ denotes the transpose operation; $d_{model}$ is the dimension of the position vector, equal to the hidden-state dimension of the whole model; $h$ is the number of heads in the multi-head attention mechanism, $i \in [1, h]$; and $W_i^Q$, $W_i^K$, $W_i^V$ are the weight matrices of $Q$, $K$, $V$, respectively.
10. The time series regression prediction method with uncertainty estimation of claim 9, wherein the fully connected neural network uses the following function to perform relatively complex nonlinear processing on the output of the attention mechanism:
where $x$ is the input of the fully connected neural network; ReLU is the rectified linear unit activation function; $W_1$, $W_2$ are the weight parameters of the two linear layers in the fully connected neural network; $T$ denotes the transpose operation; and $b_1$, $b_2$ are the biases of the two linear layers in the fully connected neural network.
11. The time series regression prediction method with uncertainty estimation of claim 10 wherein the linear output layer is a simple one-dimensional linear layer, and performing a simple linear mapping aligns the model output tensor with the 0-dimensional diagnostic data tensor.
12. The time series regression prediction method with uncertainty estimation of claim 1, wherein the model training process of step S3 comprises the steps of:
S31, randomly initializing the weight and bias parameters of the neural network;
S32, initializing the neural network hyperparameters and a stochastic gradient descent optimizer, wherein the neural network hyperparameters comprise the batch size, the learning rate, and the number of iterations;
S33, calculating the loss with the loss function L1 from the target output time series and the prediction output of the time-series output module, and optimizing the model parameters with the error back-propagation algorithm;
S34, on the basis of the existing model parameters, calculating the loss with the loss function L2 from the actual deviation and the prediction output of the uncertainty estimation module, and optimizing the parameters of the uncertainty estimation network in the model with the error back-propagation algorithm;
S35, finally obtaining the optimal time series regression prediction model with uncertainty estimation.
13. The time series regression prediction method with uncertainty estimation according to claim 4, wherein the loss functions L1 and L2 using a masking mechanism to calculate the effective mean square error loss VMSE and the effective weighted mean square error loss VWMSE from the effective length of the time series specifically comprises:
the effective mean square error loss function VMSE is calculated as follows:
where $n_V$ is the effective length of the time series; $W_V$ is a matrix containing only 0 and 1 that extracts the effective part of the mean square error of the time series; $y$ is the real experimental data; and $\hat{y}$ is the model prediction output;
the effective weighted mean square error loss VWMSE is calculated as follows:
where $n_{V\_in}$ is the number of time slices that are effective and within the coverage of the prediction uncertainty; $n_{V\_out}$ is the number of time slices that are effective and outside the coverage of the prediction uncertainty; $y_{V\_in}$ is the real experimental data within the coverage of the prediction uncertainty; $\hat{y}_{V\_in}$ is the model prediction output within the coverage of the prediction uncertainty; $y_{V\_out}$ is the real experimental data outside the coverage of the prediction uncertainty; $\hat{y}_{V\_out}$ is the model prediction output outside the coverage of the prediction uncertainty; $\lambda$ is a parameter that balances the importance of coverage against uncertainty width; $\mu$ is a predefined target uncertainty coverage; and PUCP is the prediction uncertainty coverage.
14. The time series regression prediction method with uncertainty estimation of claim 13, wherein the PUCP is calculated as follows:
where $N$ is the time series length and $a_i$ is a binary value calculated as follows:
where $y_i$ is the target value, $\hat{y}_i$ is the point prediction, and uncertainty is the predicted uncertainty.
15. A time series regression prediction system with uncertainty estimation, the system comprising:
a data acquisition and preprocessing module, configured to acquire time series data, including an input time series and an output time series, and to perform resampling and standardization preprocessing on the data to establish a data set for model training and testing;
a neural network module, in which the neural network comprises a position encoder, an input encoder, a time-series output module, a direct uncertainty estimation module and a linear output layer, configured to obtain an output time series and its uncertainty from an input time series;
a training module, configured to train the neural network with the training data set of the task to obtain a trained neural network;
a testing module, configured to input the input time series data of the test set into the trained neural network model to obtain the predicted time series and its uncertainty.
16. An electronic device, comprising a readable storage medium, a central processing unit, and a graphics processor, wherein the readable storage medium is for storing one or more computer programs which, when executed by the processors, implement the steps of the time series regression prediction method with uncertainty estimation of any of claims 1 to 14.
17. A readable storage medium, characterized in that it has stored thereon a computer program which, when executed by a processor, performs the steps of the time-series regression prediction method with uncertainty estimation according to any of claims 1 to 14.
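For orientation, the multi-head attention and fully connected network recited in claims 9 and 10 can be sketched in NumPy. The per-head and feed-forward formulas are not present in this text, so the sketch assumes the standard Transformer form: each head computes softmax(Q_i K_i^T / sqrt(d_k)) V_i on linearly projected inputs, and the feed-forward part computes ReLU(x W_1 + b_1) W_2 + b_2 with the weights stored already transposed. The scaling by sqrt(d_k), the omission of any output projection, and all function and variable names are assumptions.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax; each slice along `axis` sums to 1."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_self_attention(x, w_q, w_k, w_v):
    """Self-attention (Q = K = V = x) with h heads, concatenated at the end.

    x: (seq_len, d_model); w_q, w_k, w_v: (h, d_model, d_k).
    """
    heads = []
    for wq_i, wk_i, wv_i in zip(w_q, w_k, w_v):
        q, k, v = x @ wq_i, x @ wk_i, x @ wv_i
        # Assumed scaling by sqrt(d_k), as in the standard Transformer.
        weights = softmax(q @ k.T / np.sqrt(k.shape[-1]))
        heads.append(weights @ v)
    return np.concatenate(heads, axis=-1)


def feed_forward(x, w1, b1, w2, b2):
    """Two linear layers with a ReLU in between."""
    return np.maximum(x @ w1 + b1, 0.0) @ w2 + b2


rng = np.random.default_rng(0)
seq_len, d_model, h = 5, 8, 2
d_k = d_model // h
x = rng.normal(size=(seq_len, d_model))
w_q, w_k, w_v = (rng.normal(size=(h, d_model, d_k)) for _ in range(3))
attn_out = multi_head_self_attention(x, w_q, w_k, w_v)
ffn_out = feed_forward(attn_out,
                       rng.normal(size=(d_model, 16)), np.zeros(16),
                       rng.normal(size=(16, d_model)), np.zeros(d_model))
```

With h heads of width d_model / h, the concatenated output has the same width as the input, so the attention and feed-forward blocks can be stacked.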
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310463764.3A CN116720158A (en) | 2023-04-26 | 2023-04-26 | Time sequence regression prediction method and system with uncertainty estimation |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116720158A true CN116720158A (en) | 2023-09-08 |
Family
ID=87870356
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310463764.3A Pending CN116720158A (en) | 2023-04-26 | 2023-04-26 | Time sequence regression prediction method and system with uncertainty estimation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116720158A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117371299A (en) * | 2023-12-08 | 2024-01-09 | 安徽大学 | Machine learning method for Tokamak new classical circumferential viscous torque |
CN117592571A (en) * | 2023-12-05 | 2024-02-23 | 武汉华康世纪医疗股份有限公司 | Air conditioning unit fault type diagnosis method and system based on big data |
CN117909682A (en) * | 2024-01-09 | 2024-04-19 | 浙江大学 | Local interpretation method of time sequence regression model based on Lime algorithm |
CN118379585A (en) * | 2024-06-24 | 2024-07-23 | 浙江大学 | Method and device for enhancing uncertainty of estimation model based on data |
CN118445734A (en) * | 2024-07-04 | 2024-08-06 | 深圳大学 | Tokamak experimental data intelligent processing method and system |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117592571A (en) * | 2023-12-05 | 2024-02-23 | 武汉华康世纪医疗股份有限公司 | Air conditioning unit fault type diagnosis method and system based on big data |
CN117592571B (en) * | 2023-12-05 | 2024-05-17 | 武汉华康世纪医疗股份有限公司 | Air conditioning unit fault type diagnosis method and system based on big data |
CN117371299A (en) * | 2023-12-08 | 2024-01-09 | 安徽大学 | Machine learning method for Tokamak new classical circumferential viscous torque |
CN117371299B (en) * | 2023-12-08 | 2024-02-27 | 安徽大学 | Machine learning method for Tokamak new classical circumferential viscous torque |
CN117909682A (en) * | 2024-01-09 | 2024-04-19 | 浙江大学 | Local interpretation method of time sequence regression model based on Lime algorithm |
CN118379585A (en) * | 2024-06-24 | 2024-07-23 | 浙江大学 | Method and device for enhancing uncertainty of estimation model based on data |
CN118445734A (en) * | 2024-07-04 | 2024-08-06 | 深圳大学 | Tokamak experimental data intelligent processing method and system |
CN118445734B (en) * | 2024-07-04 | 2024-09-24 | 深圳大学 | Tokamak experimental data intelligent processing method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||