CN117828308A - Time series prediction method based on local segmentation - Google Patents

Time series prediction method based on local segmentation

Info

Publication number: CN117828308A
Application number: CN202410238526.7A
Authority: CN (China)
Prior art keywords: data, training, time series, segmentation, attention
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Other languages: Chinese (zh)
Inventors: Wang Tao (王涛), Yang Bin (杨斌), Zhao Ying (赵影), He Yefeng (贺业凤)
Current and original assignee: Shandong Jerei Digital Technology Co Ltd (the listed assignee may be inaccurate; Google has not performed a legal analysis)
Application filed by: Shandong Jerei Digital Technology Co Ltd
Priority and filing date: 2024-03-04 (the priority date is an assumption and is not a legal conclusion)

Classifications

    • G PHYSICS > G06 COMPUTING; CALCULATING OR COUNTING > G06F ELECTRIC DIGITAL DATA PROCESSING > G06F18/00 Pattern recognition > G06F18/20 Analysing > G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation > G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS > G06 COMPUTING; CALCULATING OR COUNTING > G06F ELECTRIC DIGITAL DATA PROCESSING > G06F18/00 Pattern recognition > G06F18/20 Analysing > G06F18/24 Classification techniques > G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches > G06F18/2415 based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS > G06 COMPUTING; CALCULATING OR COUNTING > G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS > G06N3/00 Computing arrangements based on biological models > G06N3/02 Neural networks > G06N3/04 Architecture, e.g. interconnection topology > G06N3/048 Activation functions


Abstract

The application discloses a time series prediction method based on local segmentation, belonging to the technical field of artificial intelligence. The method comprises the following steps: dividing and preprocessing the original time series data to obtain a second training set; inputting the second training set into a constructed original prediction model, comprising a local segmentation module, a Transformer attention module, a segment flattening module and a fully connected layer connected in sequence, and training it to obtain a target prediction model; and inputting the time series data to be predicted into the target prediction model to obtain a target prediction result. The method converts the attention computation from correlations between time series values at single time points into similarities between sequence segments, so it can capture changes in the historical data more accurately while preserving the local semantic information within each time period, thereby reducing the spatial complexity of the attention computation and making the obtained prediction result more accurate.

Description

Time series prediction method based on local segmentation
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a time series prediction method based on local segmentation.
Background
A time series is data arranged in time order, reflecting the variation of one or more variables over a period of time; it exhibits time dependence and correlation. Time series prediction can reveal the change patterns and trends of observed variables and provide data support for fine-grained management and intelligent decision-making. As an important application of artificial intelligence, time series prediction can assist the development of industries such as manufacturing, economics and finance, and resource monitoring.
Currently, long short-term memory (LSTM) networks and other models based on recurrent neural networks (RNN) have shown good multi-step prediction performance in many practical applications. However, limited by their model structure, recurrent neural networks are often unsuited to parallel training and suffer from vanishing gradients, which limits the length of the sequences they can capture. The temporal convolutional network (TCN), based on convolutional neural networks (CNN), solves the parallel-training problem, but its reliance on stacked hidden layers to obtain a larger receptive field makes its memory requirements enormous. Therefore, a time series prediction method is needed to solve the above problems.
Disclosure of Invention
In view of this, the present application provides a time series prediction method based on local segmentation, which converts the attention computation from correlations between time series values at single time points into similarities between sequence segments, so that changes in the historical data can be grasped more accurately while the local semantic information within each time period is preserved, thereby reducing the spatial complexity of the attention computation and making the obtained prediction result more accurate.
Specifically, the technical solution is as follows:
The embodiment of the application provides a time series prediction method based on local segmentation, which comprises the following steps:
performing data division on the original time series data to obtain a first training set, a first validation set and a first test set;
preprocessing the first training set, the first validation set and the first test set to obtain a preprocessed second training set, second validation set and second test set;
constructing an original prediction model, wherein the original prediction model comprises a local segmentation module, a Transformer attention module, a segment flattening module and a fully connected layer connected in sequence;
inputting the second training set into the original prediction model for training to obtain a target prediction model;
and inputting the time series data to be predicted into the target prediction model to obtain a target prediction result.
In some embodiments, preprocessing the first training set, the first validation set and the first test set to obtain a preprocessed second training set, second validation set and second test set includes:
extracting feature maps of the first training set, the first validation set and the first test set;
calculating the per-channel data mean and variance of the first training set, the first validation set and the first test set;
and normalizing the feature maps based on the data mean and variance to obtain the second training set, the second validation set and the second test set.
In some embodiments, inputting the second training set into the original prediction model for training to obtain a target prediction model includes:
performing segmentation of the second training set into overlapping time series segments in the local segmentation module to obtain segmented data;
position-encoding the segmented data to obtain position-encoded data containing position information;
performing self-attention computation on the position-encoded data in the Transformer attention module to obtain decoded output data;
flattening the decoded output data in the segment flattening module to obtain one-dimensional flattened data;
inputting the one-dimensional flattened data into the fully connected layer to obtain an intermediate training result and an intermediate prediction model;
calculating a loss function value based on the intermediate training result, the mean absolute error $MAE$ and the Nash-Sutcliffe efficiency coefficient $NSE$;
and stopping training in response to the loss function value satisfying a condition, to obtain the target prediction model.
In some embodiments, the Transformer attention module has a Transformer architecture and includes an encoder and a decoder; the encoder computes self-attention and feeds its result to the decoder, which computes cross-attention;
the encoder includes a first multi-head self-attention layer, a first residual connection layer and a first normalization layer, and the decoder includes a second multi-head self-attention layer, a multi-head cross-attention layer, a second residual connection layer and a second normalization layer.
In some embodiments, performing segmentation of the second training set into overlapping time series segments in the local segmentation module to obtain segmented data includes:

partitioning the time series data $x \in \mathbb{R}^{L}$ in the second training set into overlapping or non-overlapping local segments $x_i \in \mathbb{R}^{P}$;

calculating the number of segments $N$ according to the following formula:

$$N = \left\lfloor \frac{L - P}{S} \right\rfloor + 2$$

where $i$ identifies an individual time series segment after segmentation and takes values in $(0, N)$, $P$ is the segment length, $N$ is the number of segments, $L$ is the total length of the time series data $x$ in the second training set, and $S$ is the stride, i.e. the non-overlapping step between two consecutive segments.
In some embodiments, the segmented data is position-encoded according to the following formulas, resulting in position-encoded data containing position information:

$$PE_{(pos,\,2i)} = \sin\!\left(\frac{pos}{10000^{2i/d_{model}}}\right), \qquad PE_{(pos,\,2i+1)} = \cos\!\left(\frac{pos}{10000^{2i/d_{model}}}\right)$$

where $PE_{(pos,2i)}$ denotes the even entries of the position-encoded data, $PE_{(pos,2i+1)}$ the odd entries, $pos$ the position of the position-encoded data in the time series data, and $d_{model}$ the dimension of the position encoding.
In some embodiments, for a fixed offset $k$, the relative positional relationship between $PE_{pos+k}$ and $PE_{pos}$ is obtained according to the following formulas, which follow from the sine and cosine addition theorems:

$$PE_{(pos+k,\,2i)} = PE_{(pos,\,2i)}\,PE_{(k,\,2i+1)} + PE_{(pos,\,2i+1)}\,PE_{(k,\,2i)}$$

$$PE_{(pos+k,\,2i+1)} = PE_{(pos,\,2i+1)}\,PE_{(k,\,2i+1)} - PE_{(pos,\,2i)}\,PE_{(k,\,2i)}$$
in some embodiments, the decoded output data is derived according to the following formula:
wherein,representing said decoded output data, +.>Represented is a normalized exponential function, +.>Representing an initial component of the position-coded data,QKVa query component, a key component and a numeric component, respectively, of the position-coded data, +.>、/>、/>Respectively represent a weight matrix corresponding to the query component, a weight matrix corresponding to the key component and a weight corresponding to the numerical componentMatrix (S)>Denoted by the transposed component of the key component,/->Representing the dimensions of the position-coded data,Linearrepresented as a linear function +.>Representing the position-coded data after a positive normalization process.
In some embodiments, the mean absolute error $MAE$ and the Nash-Sutcliffe efficiency coefficient $NSE$ are calculated according to the following formulas:

$$MAE = \frac{1}{M}\sum_{r=1}^{M}\left|\hat{y}_r - y_r\right|$$

$$NSE = 1 - \frac{\sum_{r=1}^{M}\left(y_r - \hat{y}_r\right)^2}{\sum_{r=1}^{M}\left(y_r - \bar{y}\right)^2}$$

where $M$ is the length of the prediction period in the intermediate training result, $\hat{y}_r$ is the predicted value of the time series data at the $r$-th time step in the intermediate training result, $y_r$ is the corresponding observed value, $\bar{y}$ is the mean of the observed values in the intermediate training result, and $r$ is the prediction time step.
The beneficial effects of the technical solution provided by the embodiments of the application include at least the following:
The embodiments of the application provide a time series prediction method based on local segmentation. The second training set is input into a constructed original prediction model, comprising a local segmentation module, a Transformer attention module, a segment flattening module and a fully connected layer connected in sequence, and trained to obtain a target prediction model. The constructed original prediction model addresses the slow training, slow prediction and low efficiency caused by long time series periods and irregular change trends: the time series data is divided into several equal-length, overlapping time segments, so that the Transformer attention module operates on the sequence segments corresponding to time periods rather than on the time series values at single time points, which strengthens the learning of similar temporal change processes and reduces the spatial complexity of the attention computation. The time series data to be predicted is then input into the target prediction model to obtain a target prediction result, so the obtained result is more accurate. The method converts the attention computation from correlations between time series values at single time points into similarities between sequence segments; it can therefore capture changes in the historical data more accurately while preserving the local semantic information within each time period, reducing the spatial complexity of the attention computation and making the obtained prediction result more accurate.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present application; other drawings can be obtained from them by a person skilled in the art without inventive effort.
Fig. 1 is a flowchart of the local segmentation-based time series prediction method provided by an embodiment of the present application;
Fig. 2 (a) shows the prediction results of 30-day runoff prediction with the local segmentation-based time series prediction method provided by an embodiment of the present application;
Fig. 2 (b) shows the prediction results of 90-day runoff prediction with the local segmentation-based time series prediction method provided by an embodiment of the present application;
Fig. 2 (c) shows the prediction results of 180-day runoff prediction with the local segmentation-based time series prediction method provided by an embodiment of the present application.
Detailed Description
The embodiments of the present application are described below clearly and completely with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art without inventive effort based on the present disclosure fall within the scope of the present disclosure.
The embodiment of the application provides a time series prediction method based on local segmentation, which comprises the following steps.
Step 101: divide the original time series data to obtain a first training set, a first validation set and a first test set.
In some embodiments, the raw time series data is aligned along the time dimension and then divided into the first training set, the first validation set and the first test set.
It should be noted that the first training set is used to train the model, the first validation set is used to tune the model parameters, and the first test set is used to evaluate the model.
Step 102: preprocess the first training set, the first validation set and the first test set to obtain a preprocessed second training set, second validation set and second test set.
The preprocessing reduces the effects of distribution shift between the second training set, the second validation set and the second test set.
In some embodiments, step 102 specifically includes: (1) extracting feature maps of the first training set, the first validation set and the first test set to obtain the feature information of the data; (2) calculating the per-channel data mean and variance of the first training set, the first validation set and the first test set; (3) normalizing the feature maps based on the data mean and variance to obtain the second training set, the second validation set and the second test set. Computing the per-channel mean and variance and using them to normalize the feature maps reduces the distribution shift between the three preprocessed sets.
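As a minimal sketch of this normalization step, assuming the splits are held as NumPy arrays of shape (length, channels); the function name and the small epsilon are illustrative additions, not taken from the patent:

```python
import numpy as np

def channel_normalize(x: np.ndarray) -> np.ndarray:
    """Z-score normalize each channel of a split with shape (length, channels)."""
    mean = x.mean(axis=0, keepdims=True)                 # per-channel mean
    std = np.sqrt(x.var(axis=0, keepdims=True)) + 1e-8   # per-channel std; epsilon avoids division by zero
    return (x - mean) / std

# Applied independently to the three splits described above:
# train2, val2, test2 = map(channel_normalize, (train1, val1, test1))
```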
In some embodiments, training runs for 100 epochs; the initial learning rate of the original prediction model is tuned, and the model is optimized with the Adam algorithm while the initial learning rate is decayed.
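A sketch of this training configuration in PyTorch; the learning rate and the exponential decay factor are assumptions, since the patent only states that the initial learning rate is tuned and decayed under Adam:

```python
import torch
from torch import nn

model = nn.Linear(16, 1)  # stand-in for the prediction model constructed in step 103
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.95)  # decay factor assumed

for epoch in range(100):  # 100 rounds of epoch training
    # ... forward pass, loss computation and loss.backward() over the second training set ...
    optimizer.step()
    optimizer.zero_grad()
    scheduler.step()      # attenuate the learning rate once per epoch
```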
Step 103: construct an original prediction model comprising a local segmentation module, a Transformer attention module, a segment flattening module and a fully connected layer connected in sequence.
The constructed original prediction model addresses the slow training, slow prediction and low efficiency caused by long time series periods and irregular change trends: the local segmentation module divides the time series data into several equal-length segments that may overlap, so that the Transformer attention module operates on the sequence segments corresponding to time periods rather than on the time series values at single time points, which strengthens the learning of similar temporal change processes and reduces the spatial complexity of the attention computation. The Transformer attention module obtains, via attention computation, the time periods with high correlation in the time series data; the segment flattening module flattens the segment-wise outputs back into dimensions usable by the model; and the fully connected layer converts the model's internal dimensions into the dimensions required for prediction.
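To make the module sequence concrete, the following is a structural sketch in PyTorch. It simplifies under stated assumptions: the attention module is encoder-only (the patent's module also contains a decoder with cross-attention), a learned linear embedding stands in for the per-segment projection, position encoding is omitted, and all hyperparameters are illustrative:

```python
import torch
from torch import nn

class LocalPatchPredictor(nn.Module):
    """Sketch of the pipeline: local segmentation -> Transformer attention ->
    segment flattening -> fully connected layer. Sizes are illustrative."""

    def __init__(self, seq_len=336, patch_len=16, stride=8, d_model=128, horizon=96):
        super().__init__()
        self.patch_len, self.stride = patch_len, stride
        # number of segments after padding the last value `stride` times
        self.n_patches = (seq_len - patch_len) // stride + 2
        self.embed = nn.Linear(patch_len, d_model)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True),
            num_layers=2)
        self.head = nn.Linear(self.n_patches * d_model, horizon)

    def forward(self, x):                                            # x: (batch, seq_len)
        x = torch.cat([x, x[:, -1:].repeat(1, self.stride)], dim=1)  # pad with last value
        patches = x.unfold(1, self.patch_len, self.stride)           # local segmentation module
        z = self.encoder(self.embed(patches))                        # Transformer attention module
        return self.head(z.flatten(1))                               # segment flattening + fully connected layer
```

For example, LocalPatchPredictor()(torch.randn(8, 336)) returns a tensor of shape (8, 96), i.e. one 96-step forecast per series in the batch.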
Step 104: input the second training set into the original prediction model for training to obtain the target prediction model.
Training the original prediction model yields the target prediction model, so that prediction can be performed more accurately.
In some embodiments, step 104 specifically includes:
(1) Performing segmentation of the second training set into overlapping time series segments in the local segmentation module to obtain segmented data.
The local segmentation module segments the second training set into time segments that may overlap, dividing the time series data into several equal-length, possibly overlapping segments to obtain the segmented data. This preserves the local semantic information within each time period, makes the obtained prediction result more accurate, and benefits the efficiency of the subsequent computation.
In some embodiments, performing segmentation of the second training set into overlapping time series segments in the local segmentation module to obtain segmented data specifically includes:

partitioning the time series data $x \in \mathbb{R}^{L}$ in the second training set into overlapping or non-overlapping local segments $x_i \in \mathbb{R}^{P}$;

calculating the number of segments $N$ according to the following formula:

$$N = \left\lfloor \frac{L - P}{S} \right\rfloor + 2$$

where $i$ identifies an individual time series segment after segmentation and takes values in $(0, N)$, $P$ is the segment length, $N$ is the number of segments, $L$ is the total length of the time series data $x$ in the second training set, and $S$ is the stride, i.e. the non-overlapping step between two consecutive segments.
Before the segmentation, the last value of $x$ is repeated $S$ times and padded to the end of the time series data in the second training set to ensure that the segmentation is well defined. The segmentation performed by the local segmentation module lets the attention computation operate on local time periods and their corresponding time series data, and reduces the number of inputs from $L$ to approximately $L/S$, meaning that the memory usage and computational complexity of the attention map decrease quadratically with the factor $S$. Therefore, when training time and GPU memory are limited, the segmentation allows the model to learn from a longer history, which significantly improves prediction performance.
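A minimal sketch of this padding-and-segmentation step in NumPy; the function name is illustrative:

```python
import numpy as np

def segment(x: np.ndarray, P: int, S: int) -> np.ndarray:
    """Split a 1-D series of length L into length-P segments with stride S.

    The last value is first repeated S times and appended, so the number of
    segments is N = (L - P) // S + 2, matching the formula above."""
    x = np.concatenate([x, np.repeat(x[-1], S)])   # pad with the last value
    N = (len(x) - P) // S + 1                      # segments of the padded series
    return np.stack([x[i * S : i * S + P] for i in range(N)])

x = np.arange(24.0)                  # L = 24
print(segment(x, P=8, S=4).shape)    # (6, 8): N = (24 - 8) // 4 + 2 = 6
```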
(2) Position-encoding the segmented data to obtain position-encoded data containing position information. The attention computation cannot by itself recover the positions of the input sequence; position encoding gives the segmented data a relative positional relationship so that the subsequent computation proceeds in order.
In some embodiments, the segmented data is position-encoded according to the following formulas, resulting in position-encoded data containing position information:

$$PE_{(pos,\,2i)} = \sin\!\left(\frac{pos}{10000^{2i/d_{model}}}\right), \qquad PE_{(pos,\,2i+1)} = \cos\!\left(\frac{pos}{10000^{2i/d_{model}}}\right)$$

where $PE_{(pos,2i)}$ denotes the even entries of the position-encoded data, $PE_{(pos,2i+1)}$ the odd entries, $pos$ the position of the position-encoded data in the time series data, and $d_{model}$ the dimension of the position encoding.
In some embodiments, for a fixed offset $k$, the relative positional relationship between $PE_{pos+k}$ and $PE_{pos}$ is obtained according to the following formulas, which follow from the sine and cosine addition theorems:

$$PE_{(pos+k,\,2i)} = PE_{(pos,\,2i)}\,PE_{(k,\,2i+1)} + PE_{(pos,\,2i+1)}\,PE_{(k,\,2i)}$$

$$PE_{(pos+k,\,2i+1)} = PE_{(pos,\,2i+1)}\,PE_{(k,\,2i+1)} - PE_{(pos,\,2i)}\,PE_{(k,\,2i)}$$
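A direct transcription of these formulas, assuming an even encoding dimension; in the model the encoding is added to the embedded segments:

```python
import numpy as np

def positional_encoding(n_pos: int, d_model: int) -> np.ndarray:
    """Sinusoidal position encoding: sin on even entries, cos on odd entries."""
    pos = np.arange(n_pos)[:, None]            # position of each segment
    i = np.arange(0, d_model, 2)[None, :]      # paired even/odd dimensions
    angle = pos / np.power(10000.0, i / d_model)
    pe = np.zeros((n_pos, d_model))
    pe[:, 0::2] = np.sin(angle)                # even entries
    pe[:, 1::2] = np.cos(angle)                # odd entries
    return pe

pe = positional_encoding(42, 128)              # one row per segment
```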
(3) Performing self-attention computation on the position-encoded data in the Transformer attention module to obtain decoded output data.
The Transformer attention module does not focus on a single time-period input alone but on the whole sequence, assigning a different weight to each time period in the sequence, so end-to-end global optimization can be achieved. The module computes over the time series data corresponding to time periods rather than single time points, which strengthens the learning of similar temporal change processes and reduces the spatial complexity of the attention computation.
In some embodiments, the Transformer attention module includes an encoder and a decoder; the encoder computes self-attention and feeds its result to the decoder, which computes cross-attention. The encoder includes a first multi-head self-attention layer, a first residual connection layer and a first normalization layer; the decoder includes a second multi-head self-attention layer, a multi-head cross-attention layer, a second residual connection layer and a second normalization layer.
In some embodiments, there may be multiple encoders and decoders, and each may be stacked in series over multiple layers.
In some embodiments, the decoded output data is derived according to the following formulas:

$$Q = X_p W_Q, \quad K = X_p W_K, \quad V = X_p W_V$$

$$X_{out} = Linear\left(softmax\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V\right)$$

where $X_{out}$ is the decoded output data; $softmax$ is the normalized exponential function; $X_p$ is the position-encoded data; $Q$, $K$ and $V$ are the query, key and value components of the position-encoded data; $W_Q$, $W_K$ and $W_V$ are the weight matrices corresponding to the query, key and value components respectively; $K^{T}$ is the transpose of the key component; $d_k$ is the dimension of the position-encoded data; and $Linear$ is a linear function. Conventional attention focuses on correlations between time series values at individual time points and ignores the local contextual trend around each point. With the segmentation performed by the local segmentation module, the attention computation operates on local time periods and their corresponding runoff values, and the number of inputs is reduced from $L$ to approximately $L/S$, meaning that the memory usage and computational complexity of the attention map decrease quadratically with the factor $S$.
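A single-head sketch of this computation in PyTorch; the patent's module is multi-head with residual connections and normalization layers, which are omitted here for clarity:

```python
import torch
import torch.nn.functional as F

def attention_output(Xp, Wq, Wk, Wv, linear):
    """Linear(softmax(Q K^T / sqrt(d_k)) V), following the formula above."""
    Q, K, V = Xp @ Wq, Xp @ Wk, Xp @ Wv                   # query, key and value components
    scores = Q @ K.transpose(-2, -1) / Q.size(-1) ** 0.5  # scaled dot-product scores
    return linear(F.softmax(scores, dim=-1) @ V)

# Illustrative sizes: 42 segments, model width 128.
Xp = torch.randn(1, 42, 128)
Wq, Wk, Wv = (torch.randn(128, 128) for _ in range(3))
out = attention_output(Xp, Wq, Wk, Wv, torch.nn.Linear(128, 128))
```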
(4) Flattening the decoded output data in the segment flattening module to obtain one-dimensional flattened data. The flattening is the inverse of the earlier segmentation and restores the data dimension.
(5) Inputting the one-dimensional flattened data into the fully connected layer to obtain an intermediate training result and an intermediate prediction model.
It should be noted that in a fully connected layer every node is connected to all nodes of the previous layer, integrating the features extracted upstream; because of this full connectivity, the fully connected layer generally also holds the most parameters. The number of nodes in the fully connected layer determines the output dimension. The intermediate prediction model is not yet the final target prediction model; its parameters still need tuning and evaluation to improve the prediction performance.
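A sketch of steps (4) and (5) together, with illustrative sizes:

```python
import torch
from torch import nn

# Decoded output of shape (batch, n_patches, d_model) is flattened to one
# dimension per sample and projected to the prediction horizon M.
batch, n_patches, d_model, horizon = 32, 42, 128, 96
decoded = torch.randn(batch, n_patches, d_model)
flat = decoded.flatten(start_dim=1)             # segment flattening: (batch, n_patches * d_model)
head = nn.Linear(n_patches * d_model, horizon)  # fully connected layer sets the output dimension
prediction = head(flat)                         # (batch, M): one value per predicted time step
```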
(6) Calculating the loss function value based on the intermediate training result, the mean absolute error $MAE$ and the Nash-Sutcliffe efficiency coefficient $NSE$.
In some embodiments, the loss functions may be the mean absolute error $MAE$ and the Nash-Sutcliffe efficiency coefficient $NSE$.
In some embodiments, the mean absolute error $MAE$ and the Nash-Sutcliffe efficiency coefficient $NSE$ are calculated according to the following formulas:

$$MAE = \frac{1}{M}\sum_{r=1}^{M}\left|\hat{y}_r - y_r\right|$$

$$NSE = 1 - \frac{\sum_{r=1}^{M}\left(y_r - \hat{y}_r\right)^2}{\sum_{r=1}^{M}\left(y_r - \bar{y}\right)^2}$$

where $M$ is the length of the prediction period in the intermediate training result, $\hat{y}_r$ is the predicted value of the time series data at the $r$-th time step in the intermediate training result, $y_r$ is the corresponding observed value, $\bar{y}$ is the mean of the observed values in the intermediate training result, and $r$ is the prediction time step.
(7) Stopping training in response to the loss function value satisfying the condition, to obtain the target prediction model.
In some embodiments, the condition on the loss function value may be that it decreases five times in succession while $MAE$ approaches 0 and $NSE$ approaches 1.
It is noted that a loss value decreasing five times in succession shows that the model is robust and still improving, while $MAE$ approaching 0 and $NSE$ approaching 1 show that the target prediction model is well trained and the target prediction result is more accurate.
In some embodiments, in response to the loss function value not decreasing five times in succession, the parameters of each neuron in the intermediate prediction model are adjusted and training continues.
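A direct implementation of the two metrics together with a sketch of the stopping test; the helper names are illustrative:

```python
import numpy as np

def mae(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean absolute error over the M-step prediction period."""
    return float(np.mean(np.abs(y_pred - y_true)))

def nse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Nash-Sutcliffe efficiency; values near 1 indicate a good fit."""
    return float(1.0 - np.sum((y_true - y_pred) ** 2)
                 / np.sum((y_true - np.mean(y_true)) ** 2))

def decreased_five_times(losses: list) -> bool:
    """True once the last five steps each lowered the loss value."""
    recent = losses[-6:]
    return len(recent) == 6 and all(a > b for a, b in zip(recent, recent[1:]))
```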
Step 105: input the time series data to be predicted into the target prediction model to obtain the target prediction result.
The time series data to be predicted is input into the target prediction model to obtain the target prediction result, revealing the change pattern and trend of the time series. The target prediction result is one-dimensional time series data, i.e. one value for each day of the time range to be predicted.
The target prediction model and several other prediction models (LSTM, TCN, Transformer, Informer and Autoformer prediction models) were used to predict the runoff (in m³/s) at the Pingshan hydrological station in the Yangtze river basin over different prediction horizons (3, 7, 15, 30, 90, 180 and 360 days), with $MAE$ and $NSE$ as the evaluation indicators of prediction quality; the comparison results are shown in Table 1. Analysis of Table 1 shows that in short-term prediction the target prediction model does not gain a significant advantage, indicating that the local segmentation operation brings no benefit to short-term prediction. As the prediction horizon lengthens, the target prediction model performs better in medium- and long-term prediction, especially at 90 days, where compared with the second-best model, Informer, $NSE$ increases by 7.1% and $MAE$ decreases by 4.9%; this improvement in prediction performance persists up to the 360-day prediction task.
Table 1: Pingshan station daily-scale prediction comparison results
The individual prediction results for the long-term horizons of 30, 90 and 180 days are presented as line charts in Fig. 2 (a), Fig. 2 (b) and Fig. 2 (c). The blue curve is the observation curve, i.e. the true runoff curve provided by the hydrological station; the orange curve is the prediction of the LSTM model; the gray curve that of the TCN model; the yellow curve that of the Transformer model; the light green curve that of the Informer model; the dark green curve that of the Autoformer model; and the brown curve the prediction of the target prediction model provided by the embodiment of the present application. The curve of the target prediction model runs closely along the observation curve, which shows that its judgment of the long-term predicted runoff and of the runoff trend is accurate.
According to the local segmentation-based time series prediction method provided by the embodiments of the application, the second training set is input into the constructed original prediction model, comprising a local segmentation module, a Transformer attention module, a segment flattening module and a fully connected layer connected in sequence, to obtain a target prediction model. The constructed original prediction model addresses the slow training, slow prediction and low efficiency caused by long time series periods and irregular change trends: the time series data is divided into several equal-length, overlapping time segments, so that the Transformer attention module operates on the time series data corresponding to sequence segments rather than single time points, which strengthens the learning of similar temporal change processes and reduces the spatial complexity of the attention computation. The time series data to be predicted is then input into the target prediction model to obtain a target prediction result, so the obtained result is more accurate. The method converts the attention computation from correlations between time series values at single time points into similarities between sequence segments; it can therefore capture changes in the historical data more accurately while preserving the local semantic information within each time period, reducing the spatial complexity of the attention computation and making the obtained prediction result more accurate.
In this application, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. The term "plurality" refers to two or more, unless explicitly defined otherwise.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the present application disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. The specification and examples are to be regarded in an illustrative manner only.
It is to be understood that the present application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof.

Claims (9)

1. A time series prediction method based on local segmentation, the method comprising:
performing data division on the original time series data to obtain a first training set, a first validation set and a first test set;
preprocessing the first training set, the first validation set and the first test set to obtain a preprocessed second training set, second validation set and second test set;
constructing an original prediction model, wherein the original prediction model comprises a local segmentation module, a Transformer attention module, a segment flattening module and a fully connected layer connected in sequence;
inputting the second training set into the original prediction model for training to obtain a target prediction model;
and inputting the time series data to be predicted into the target prediction model to obtain a target prediction result.
2. The local segmentation-based time series prediction method according to claim 1, wherein preprocessing the first training set, the first validation set and the first test set to obtain a preprocessed second training set, second validation set and second test set comprises:
extracting feature maps of the first training set, the first validation set and the first test set;
calculating the per-channel data mean and variance of the first training set, the first validation set and the first test set;
and normalizing the feature maps based on the data mean and variance to obtain the second training set, the second validation set and the second test set.
3. The local segmentation-based time series prediction method according to claim 1 or 2, wherein inputting the second training set into the original prediction model for training to obtain a target prediction model comprises:
performing segmentation of the second training set into overlapping time series segments in the local segmentation module to obtain segmented data;
position-encoding the segmented data to obtain position-encoded data containing position information;
performing self-attention computation on the position-encoded data in the Transformer attention module to obtain decoded output data;
flattening the decoded output data in the segment flattening module to obtain one-dimensional flattened data;
inputting the one-dimensional flattened data into the fully connected layer to obtain an intermediate training result and an intermediate prediction model;
calculating a loss function value based on the intermediate training result, the mean absolute error $MAE$ and the Nash-Sutcliffe efficiency coefficient $NSE$;
and stopping training in response to the loss function value satisfying a condition, to obtain the target prediction model.
4. The local segmentation-based time series prediction method according to claim 3, wherein the Transformer attention module comprises an encoder and a decoder, the encoder computing self-attention and feeding its result to the decoder, which computes cross-attention;
the encoder comprises a first multi-head self-attention layer, a first residual connection layer and a first normalization layer, and the decoder comprises a second multi-head self-attention layer, a multi-head cross-attention layer, a second residual connection layer and a second normalization layer.
5. The local segmentation-based time series prediction method according to claim 3, wherein performing segmentation of the second training set into overlapping time series segments in the local segmentation module to obtain the segmented data comprises:

partitioning the time series data $x \in \mathbb{R}^{L}$ in the second training set into overlapping or non-overlapping local segments $x_i \in \mathbb{R}^{P}$;

calculating the number of segments $N$ according to the following formula:

$$N = \left\lfloor \frac{L - P}{S} \right\rfloor + 2$$

where $i$ identifies an individual time series segment after segmentation and takes values in $(0, N)$, $P$ is the segment length, $N$ is the number of segments, $L$ is the total length of the time series data $x$ in the second training set, and $S$ is the stride, i.e. the non-overlapping step between two consecutive segments.
6. The local segmentation-based time series prediction method according to claim 5, wherein the segmented data is position-encoded according to the following formulas to obtain position-encoded data containing position information:

$$PE_{(pos,\,2i)} = \sin\!\left(\frac{pos}{10000^{2i/d_{model}}}\right), \qquad PE_{(pos,\,2i+1)} = \cos\!\left(\frac{pos}{10000^{2i/d_{model}}}\right)$$

where $PE_{(pos,2i)}$ denotes the even entries of the position-encoded data, $PE_{(pos,2i+1)}$ the odd entries, $pos$ the position of the position-encoded data in the time series data, and $d_{model}$ the dimension of the position encoding.
7. The local segmentation-based time series prediction method according to claim 6, wherein for a fixed offset $k$, the relative positional relationship between $PE_{pos+k}$ and $PE_{pos}$ is obtained according to the following formulas:

$$PE_{(pos+k,\,2i)} = PE_{(pos,\,2i)}\,PE_{(k,\,2i+1)} + PE_{(pos,\,2i+1)}\,PE_{(k,\,2i)}$$

$$PE_{(pos+k,\,2i+1)} = PE_{(pos,\,2i+1)}\,PE_{(k,\,2i+1)} - PE_{(pos,\,2i)}\,PE_{(k,\,2i)}$$
8. The local segmentation-based time series prediction method according to claim 6, wherein the decoded output data is derived according to the following formulas:

$$Q = X_p W_Q, \quad K = X_p W_K, \quad V = X_p W_V$$

$$X_{out} = Linear\left(softmax\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V\right)$$

where $X_{out}$ is the decoded output data; $softmax$ is the normalized exponential function; $X_p$ is the position-encoded data; $Q$, $K$ and $V$ are the query, key and value components of the position-encoded data; $W_Q$, $W_K$ and $W_V$ are the weight matrices corresponding to the query, key and value components respectively; $K^{T}$ is the transpose of the key component; $d_k$ is the dimension of the position-encoded data; and $Linear$ is a linear function.
9. The local segmentation-based time series prediction method according to claim 3, wherein the mean absolute error $MAE$ and the Nash-Sutcliffe efficiency coefficient $NSE$ are calculated according to the following formulas:

$$MAE = \frac{1}{M}\sum_{r=1}^{M}\left|\hat{y}_r - y_r\right|$$

$$NSE = 1 - \frac{\sum_{r=1}^{M}\left(y_r - \hat{y}_r\right)^2}{\sum_{r=1}^{M}\left(y_r - \bar{y}\right)^2}$$

where $M$ is the length of the prediction period in the intermediate training result, $\hat{y}_r$ is the predicted value of the time series data at the $r$-th time step in the intermediate training result, $y_r$ is the corresponding observed value, $\bar{y}$ is the mean of the observed values in the intermediate training result, and $r$ is the prediction time step.
CN202410238526.7A 2024-03-04 2024-03-04 Time series prediction method based on local segmentation Pending CN117828308A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202410238526.7A | 2024-03-04 | 2024-03-04 | Time series prediction method based on local segmentation

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202410238526.7A | 2024-03-04 | 2024-03-04 | Time series prediction method based on local segmentation

Publications (1)

Publication Number | Publication Date
CN117828308A

Family

ID=90522891

Family Applications (1)

Application Number | Title | Priority Date | Filing Date | Status
CN202410238526.7A | Time series prediction method based on local segmentation | 2024-03-04 | 2024-03-04 | Pending

Country Status (1)

Country Link
CN (1) CN117828308A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118013866A (en) * 2024-04-09 2024-05-10 西北工业大学 Medium-and-long-term runoff prediction method based on horizontal and vertical attention
CN118296499A (en) * 2024-06-05 2024-07-05 山东电力建设第三工程有限公司 Photo-thermal power station meteorological data prediction method based on self-attention mechanism


Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115146700A (en) * 2022-05-21 2022-10-04 西北工业大学 Runoff prediction method based on Transformer sequence-to-sequence model
US20240020527A1 (en) * 2022-07-13 2024-01-18 Home Depot Product Authority, Llc Machine learning modeling of time series with divergent scale
CN115600656A (en) * 2022-11-03 2023-01-13 杭州电子科技大学(Cn) Multi-element time sequence prediction method based on segmentation strategy and multi-component decomposition algorithm
CN116050652A (en) * 2023-02-22 2023-05-02 重庆邮电大学 Runoff prediction method based on local attention enhancement model
CN116596033A (en) * 2023-05-22 2023-08-15 南通大学 Transformer ozone concentration prediction method based on window attention and generator
CN116975782A (en) * 2023-08-10 2023-10-31 浙江大学 Hierarchical time sequence prediction method and system based on multi-level information fusion
CN117094451A (en) * 2023-10-20 2023-11-21 邯郸欣和电力建设有限公司 Power consumption prediction method, device and terminal
CN117494906A (en) * 2023-12-28 2024-02-02 浙江省白马湖实验室有限公司 Natural gas daily load prediction method based on multivariate time series

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
DU Shengdong; LI Tianrui; YANG Yan; WANG Hao; XIE Peng; HONG Xijin: "A traffic flow prediction model based on sequence-to-sequence spatio-temporal attention learning", Journal of Computer Research and Development (计算机研究与发展), no. 08, 6 August 2020 (2020-08-06) *


Similar Documents

Publication Publication Date Title
CN117828308A Time series prediction method based on local segmentation
Choi et al. Short-term load forecasting based on ResNet and LSTM
CN111080032B Load prediction method based on Transformer structure
CN113177633B (en) Depth decoupling time sequence prediction method
CN111161535A (en) Attention mechanism-based graph neural network traffic flow prediction method and system
CN110580543A (en) Power load prediction method and system based on deep belief network
Li et al. A new flood forecasting model based on SVM and boosting learning algorithms
CN109784473A (en) A kind of short-term wind power prediction method based on Dual Clocking feature learning
CN115587454A (en) Traffic flow long-term prediction method and system based on improved Transformer model
CN115688579A (en) Basin multi-point water level prediction early warning method based on generation of countermeasure network
Li et al. Deep spatio-temporal wind power forecasting
CN117094451B (en) Power consumption prediction method, device and terminal
CN114399021A (en) Probability wind speed prediction method and system based on multi-scale information
Kwon et al. Weekly peak load forecasting for 104 weeks using deep learning algorithm
CN115630101A (en) Hydrological parameter intelligent monitoring and water resource big data management system
CN116050652A (en) Runoff prediction method based on local attention enhancement model
CN115982567A (en) Refrigerating system load prediction method based on sequence-to-sequence model
CN114154732A (en) Long-term load prediction method and system
CN116089777A (en) Intelligent new energy settlement method and system based on intelligent information matching
Wenjie et al. A NOVEL MODEL FOR STOCK CLOSING PRICE PREDICTION USING CNN-ATTENTION-GRU-ATTENTION.
CN114186412A (en) Hydropower station water turbine top cover long sequence water level prediction system and method based on self-attention mechanism
Vogt et al. Wind power forecasting based on deep neural networks and transfer learning
Sari et al. Daily rainfall prediction using one dimensional convolutional neural networks
CN116743182A (en) Lossless data compression method
CN117409578A (en) Traffic flow prediction method based on combination of empirical mode decomposition and deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination