CN116703607A - Financial time sequence prediction method and system based on diffusion model - Google Patents

Info

Publication number
CN116703607A
Authority
CN
China
Prior art keywords
model
sequence
prediction
price change
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310710238.2A
Other languages
Chinese (zh)
Inventor
黄琪兴
李雅
褚健
杨根科
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ningbo Institute Of Artificial Intelligence Shanghai Jiaotong University
Original Assignee
Ningbo Institute Of Artificial Intelligence Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ningbo Institute Of Artificial Intelligence Shanghai Jiaotong University
Priority to CN202310710238.2A
Publication of CN116703607A
Current legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00: Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/06: Asset management; Financial planning or analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2458: Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474: Sequence data queries, e.g. querying versioned data
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10: Complex mathematical operations
    • G06F17/18: Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04: INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S: SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00: Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50: Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Computational Mathematics (AREA)
  • Development Economics (AREA)
  • Operations Research (AREA)
  • Databases & Information Systems (AREA)
  • Finance (AREA)
  • Probability & Statistics with Applications (AREA)
  • Evolutionary Biology (AREA)
  • Human Resources & Organizations (AREA)
  • Game Theory and Decision Science (AREA)
  • Marketing (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Economics (AREA)
  • Computational Linguistics (AREA)
  • Fuzzy Systems (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Strategic Management (AREA)
  • Algebra (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a financial time sequence prediction method and system based on a diffusion model, relating to the field of deep-learning-based time series prediction. The method comprises the following steps: step 1, selecting quotation sequence data of a prediction target financial asset, preprocessing the data to obtain data samples, and dividing the data samples into a training set, a validation set and a test set; step 2, constructing a price change prediction model for the prediction target financial asset, wherein the price change prediction model comprises a diffusion model forward process, a diffusion model backward process and a multi-scale denoising score matching mechanism; step 3, inputting the training set into the price change prediction model for training, and validating the model on the validation set to determine the final parameters of the price change prediction model; and step 4, inputting the test set into the price change prediction model with the determined parameters to perform prediction, and evaluating the prediction performance of the price change prediction model.

Description

Financial time sequence prediction method and system based on diffusion model
Technical Field
The invention relates to the field of time sequence prediction based on deep learning, in particular to a financial time sequence prediction method and system based on a diffusion model.
Background
Quantitative investment is an investment strategy based on mathematical and statistical methods, in which investment decisions are made using computer algorithms and techniques. Its goal is to construct a reproducible investment model from historical and real-time market data, using technologies such as machine learning and artificial intelligence, and thereby to optimize portfolios and control risk. In quantitative investment, investors analyze market and industry trends using large amounts of data and trade automatically through programs, pursuing their investment targets efficiently and at low risk. Quantitative investment offers high controllability, replicability and stability, and it reduces the influence of emotional factors on investment decisions, thereby improving investment efficiency and success rate. Quantitative investment strategies built on financial time series prediction play an important role in quantitative investment.
Because financial market data is sequential and periodic, time-series analysis of historical market data can reveal its patterns and trends, so that future market price changes can be predicted. Conventional time-series prediction techniques include the autoregressive moving average (ARMA) model and the generalized autoregressive conditional heteroskedasticity (GARCH) model. With the development of artificial intelligence techniques such as machine learning and deep learning, many researchers have begun to model market quotations with support vector machines, random forests, gradient-boosted tree models such as XGBoost and LightGBM, and deep learning models such as convolutional neural networks, recurrent neural networks and Transformers, in order to predict market price changes. However, these time-series prediction techniques often suffer from overfitting and poor generalization. Moreover, most time-series prediction models consider only the historical market information of a single asset when predicting the price change of that asset, so they cannot fully mine the patterns of the market sequence and cannot produce relatively accurate predictions.
Accordingly, those skilled in the art are working to develop a new financial time sequence prediction method and system that solves the above-mentioned problems in the prior art.
Disclosure of Invention
In view of the above drawbacks of the prior art, the technical problem to be solved by the present invention is how to overcome overfitting and poor generalization in the prediction process, and how to overcome the inaccurate predictions that result when a time-series prediction model considers only the historical market information of a single asset when predicting the price change of that asset.
To achieve the above objective, the present invention provides a financial time sequence change prediction method and system based on a diffusion probabilistic model, so as to reduce overfitting in market price change prediction and improve the generalization performance of the prediction. When predicting the market price change of a single financial asset, the method and system also take into account the market trend information of other related assets. The forward process of the diffusion model is used to augment the time-series feature sequence that aggregates multiple financial assets, which reduces overfitting of the time-series prediction model and improves its generalization. A prediction sequence is then generated through the backward process of the diffusion model, and a multi-scale denoising score matching mechanism is introduced to denoise the prediction sequence generated by the backward process and to obtain an uncertainty estimate of the prediction result.
Specifically, the financial time sequence prediction method based on the diffusion model provided by the invention comprises the following steps:
step 1, selecting quotation sequence data of a prediction target financial asset, preprocessing the data to obtain data samples, and dividing the data samples into a training set, a validation set and a test set;
step 2, constructing a price change prediction model for the prediction target financial asset, wherein the price change prediction model comprises a diffusion model forward process, a diffusion model backward process and a multi-scale denoising score matching mechanism;
step 3, inputting the training set into the price change prediction model for training, and validating the model on the validation set to determine the final parameters of the price change prediction model;
step 4, inputting the test set into the price change prediction model with the determined parameters to perform prediction, and evaluating the prediction performance of the price change prediction model.
Further, the step 1 includes the following substeps:
step 101, selecting the quotation sequence data; the quotation sequence data includes data of the prediction target financial asset and data of other financial assets associated with it;
step 102, selecting the opening price, closing price, highest price, lowest price, trading volume, open interest and BOLL (Bollinger Bands) indicator as feature dimensions;
step 103, adopting mean-variance normalization (z-score standardization) as the feature normalization method, and normalizing each feature dimension of the quotation sequence data independently, with the calculation formula:
$$x^{*} = \frac{x - \mu}{\sigma}$$
wherein $x$ represents the original value of a feature dimension, $x^{*}$ represents the value of that feature dimension after normalization, $\mu$ represents the mean of the feature dimension over a period of time before the current time point, and $\sigma$ represents the standard deviation of the feature dimension over that period;
step 104, denoting the normalized feature matrix at time $i$ as $X_i \in \mathbb{R}^{N \times D}$, wherein $N$ represents the total number of financial assets considered (the prediction target financial asset and the other financial assets associated with it) and $D$ represents the number of feature dimensions selected; taking the $t$ feature matrices at $t$ equally spaced time points to form a feature matrix sequence $X^{(0)} = (X_1, X_2, \ldots, X_t)$, and using the feature matrix sequence $X^{(0)}$ as the input sample; the prediction target sequence corresponding to the input sample is the time sequence that follows the input sample, each element of which is the price change over one time step, i.e. the prediction target sequence $Y^{(0)}$ is:
$$(Y_{t+1}, Y_{t+2}, \ldots, Y_{\tau}) = (P_{t+1} - P_t,\; P_{t+2} - P_{t+1},\; \ldots,\; P_{\tau} - P_{\tau-1})$$
wherein $P_t$ represents the price of the prediction target financial asset at time $t$, and $Y_{t+1}$ represents the price change from time $t$ to time $t+1$; the feature matrix sequence and the prediction target sequence together form the data sample;
step 105, dividing the data samples into the training set, the validation set and the test set.
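The normalization in step 103 can be illustrated with a minimal sketch. It assumes a fixed look-back window and the population standard deviation; the patent only says "a period of time before the current time point", so the window length, the handling of a zero standard deviation, and the function name `rolling_zscore` are illustrative assumptions.

```python
from statistics import mean, pstdev


def rolling_zscore(values, window):
    """Normalize each value with the mean/std of the `window` values before it.

    Only past observations are used, so no future information leaks in.
    Positions without a full look-back window yield None.
    """
    out = []
    for i, x in enumerate(values):
        if i < window:
            out.append(None)  # not enough history yet
            continue
        past = values[i - window:i]  # strictly before the current point
        mu = mean(past)
        sigma = pstdev(past)  # assumption: population standard deviation
        # Assumption: a flat window (sigma == 0) maps to 0.0 instead of dividing by zero.
        out.append(0.0 if sigma == 0 else (x - mu) / sigma)
    return out


closes = [10.0, 12.0, 14.0, 13.0, 15.0]  # made-up closing prices
print(rolling_zscore(closes, window=3))
```

Each feature dimension of each asset would be passed through such a function independently, which matches the per-dimension normalization described in step 103.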
Further, the step 2 includes the following substeps:
step 2.1, constructing the diffusion model forward process in the price change prediction model;
step 2.2, constructing the diffusion model backward process in the price change prediction model;
step 2.3, constructing the multi-scale denoising score matching mechanism in the price change prediction model;
wherein the diffusion model forward process in step 2.1 comprises: constructing a Markov chain in which Gaussian noise is gradually added; diffusing the input feature matrix sequence $X^{(0)} = (X_1, X_2, \ldots, X_t)$ and the corresponding prediction target sequence $Y^{(0)} = (Y_{t+1}, Y_{t+2}, \ldots, Y_{\tau})$ simultaneously, using different variance schedules, to obtain the augmented feature matrix sequence and prediction target sequence; wherein $\beta = (\beta_1, \ldots, \beta_T)$ is the diffusion variance schedule of the feature matrix sequence and $\beta' = (\beta'_1, \ldots, \beta'_T)$ is the diffusion variance schedule of the prediction target sequence;
the initially input feature matrix sequence $X^{(0)}$ can be decomposed as:
$$X = \langle X_r, \epsilon_X \rangle$$
wherein $X_r$ represents the denoised part of the feature matrix sequence and $\epsilon_X$ represents the noise in the feature matrix sequence;
the initially input prediction target sequence $Y^{(0)}$ can be decomposed similarly:
$$Y = \langle Y_r, \epsilon_Y \rangle$$
wherein $Y_r$ represents the denoised part of the prediction target sequence and $\epsilon_Y$ represents the noise in the prediction target sequence;
at step $t$ of noise addition, the feature matrix sequence is expressed as:
[formula missing in source]
wherein one component represents the noise part of the signal and the other represents the denoised part of the signal;
at step $t$ of noise addition, the prediction target sequence is expressed as:
[formula missing in source]
wherein one component represents the noise part of the signal and the other represents the denoised part of the signal.
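The forward process described above can be sketched in a few lines. The sketch assumes the standard closed form used by denoising diffusion probabilistic models, in which the step-t sample is a weighted mix of the clean input and fresh Gaussian noise. The patent text does not reproduce its own equations, so this form, the linear schedule shape, the numeric ranges, and the helper names `linear_schedule`, `alpha_bar` and `diffuse` are all illustrative assumptions. Flat lists stand in for the feature matrices.

```python
import math
import random


def linear_schedule(beta_start, beta_end, steps):
    """Linearly spaced variances beta_1..beta_T (an assumed schedule shape)."""
    if steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (steps - 1)
    return [beta_start + k * step for k in range(steps)]


def alpha_bar(betas, t):
    """Cumulative product of (1 - beta_s) for s = 1..t (equals 1.0 at t = 0)."""
    prod = 1.0
    for beta in betas[:t]:
        prod *= 1.0 - beta
    return prod


def diffuse(values, betas, t, rng):
    """Sample the step-t noisy version of `values` in closed form."""
    a_bar = alpha_bar(betas, t)
    signal, noise = math.sqrt(a_bar), math.sqrt(1.0 - a_bar)
    return [signal * v + noise * rng.gauss(0.0, 1.0) for v in values]


rng = random.Random(0)
beta_x = linear_schedule(1e-4, 0.02, steps=100)  # schedule for the features
beta_y = linear_schedule(1e-4, 0.01, steps=100)  # separate schedule for the targets
x_noisy = diffuse([0.5, -1.2, 0.3], beta_x, t=50, rng=rng)
y_noisy = diffuse([0.1, -0.4], beta_y, t=50, rng=rng)
print(x_noisy, y_noisy)
```

Using two schedules lets the inputs and the targets be perturbed at different strengths, which is the property the text attributes to $\beta$ and $\beta'$.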
Further, the diffusion model backward process in step 2.2 employs an NVAE model.
Further, after the feature matrix sequence $X'_1, X'_2, \ldots, X'_t$, augmented by diffusion in the diffusion model forward process, is input to the encoder of the price change prediction model, the diffusion model backward process in step 2.2 comprises the following sub-steps:
step 2.2.1, extracting feature representations from the input feature matrix sequence through a bottom-up deterministic network; in extracting the feature representations, the NVAE model uses residual blocks to build a hierarchical multi-scale model that captures long-term correlations in the feature matrix sequence;
step 2.2.2, inferring the latent variables group by group through the decoder and a top-down network, and outputting the generated sequence.
Further, the multi-scale denoising score matching mechanism in step 2.3 brings the distribution of the generated sequence closer to the true distribution $Y_1, Y_2, \ldots, Y_T$ of the prediction target sequence.
The objective function of the multi-scale denoising score matching mechanism is as follows:
[formula missing in source]
wherein $\sigma_t \in \{\sigma_1, \ldots, \sigma_T\}$, and $\{\sigma_1, \ldots, \sigma_T\}$ is a decreasing sequence;
to further remove the noise in the generated sequence, a single-step gradient denoising jump is used, specifically:
[formula missing in source]
wherein one of the quantities in this formula is an uncertainty estimate of the prediction.
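A single-step gradient denoising jump of this kind is commonly written in the form of Tweedie's formula: the noisy value is moved along the gradient of the log-density (the score), scaled by the noise variance. The patent's exact update is not reproduced in the text, so the sketch below is only a toy illustration under stated assumptions: the clean data are assumed to follow a zero-mean Gaussian, which makes the score available in closed form, whereas in the described model a learned network would supply the gradient. The names `gaussian_score` and `single_step_denoise` are hypothetical.

```python
def gaussian_score(x_noisy, prior_var, noise_var):
    """Score (gradient of the log-density) of noisy data when clean data ~ N(0, prior_var)."""
    return -x_noisy / (prior_var + noise_var)


def single_step_denoise(x_noisy, score, noise_var):
    """One gradient step: x_noisy + noise_var * score (Tweedie's formula)."""
    return x_noisy + noise_var * score


prior_var, noise_var = 4.0, 1.0  # toy values, not taken from the patent
x_noisy = 2.5
x_hat = single_step_denoise(
    x_noisy, gaussian_score(x_noisy, prior_var, noise_var), noise_var
)
print(x_hat)  # equals the Gaussian posterior mean x_noisy * prior_var / (prior_var + noise_var)
```

In this Gaussian toy case the single step recovers the exact posterior mean of the clean value, which is why one jump can be enough once the score is known.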
Further, the step 3 includes the following substeps:
step 3.1, defining the loss function for training the price change prediction model, the loss function being expressed as:
[formula missing in source]
wherein the KL divergence and MSE terms steer the generation direction of the diffusion model so that the sequence generated by the model is closer to the noised prediction target sequence $Y'^{(t)}$, while the goal of $L_{DSM}(\zeta, t)$ is to bring the predicted sequence closer to the prediction target sequence $Y^{(t)}$ before noise addition; $\psi$ and $\lambda$ are hyperparameters controlling the weights of the different losses;
step 3.2, inputting the training set into the price change prediction model for training;
step 3.3, testing different hyperparameters and selecting the model with the best evaluation metrics on the validation set, wherein the evaluation metrics are the mean squared error (MSE) and the continuous ranked probability score (CRPS):
wherein the MSE is computed from the predicted value at time $T$ and the true label value $Y_T$ at time $T$;
$$\mathrm{CRPS} = \int \left[ F_f(x) - F_o(x) \right]^2 \, dx$$
wherein $F_f(x)$ is the cumulative distribution function of the predicted value and $F_o(x)$ is the cumulative distribution function of $Y_T$; the closer the MSE and CRPS are to 0, the higher the prediction accuracy of the price change prediction model.
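The two evaluation metrics can be sketched as follows. The sketch assumes the forecast distribution is represented by a finite set of samples and that $F_o$ is a step function at the single observed value; under those assumptions the CRPS integral equals E|X - y| - 0.5 E|X - X'|, where X and X' are independent draws from the forecast. The function names are illustrative, not taken from the patent.

```python
def mse(predictions, targets):
    """Mean squared error between point forecasts and observed values."""
    return sum((p - y) ** 2 for p, y in zip(predictions, targets)) / len(targets)


def crps_from_samples(samples, observed):
    """Empirical CRPS of a sample-based forecast against one observation.

    Uses the identity CRPS = E|X - y| - 0.5 * E|X - X'|, which equals the
    integral of the squared CDF difference when F_o is a step at `observed`.
    """
    n = len(samples)
    term_obs = sum(abs(s - observed) for s in samples) / n
    term_spread = sum(abs(a - b) for a in samples for b in samples) / (n * n)
    return term_obs - 0.5 * term_spread


print(mse([1.0, 2.0], [1.0, 4.0]))
print(crps_from_samples([0.0, 2.0], observed=1.0))
```

A forecast that collapses onto the observed value scores 0 on both metrics; a wider or misplaced sample cloud raises the CRPS, which is how the metric rewards calibrated uncertainty as well as accuracy.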
The invention also provides a financial time sequence prediction system based on the diffusion model, which comprises: the system comprises a data preprocessing module, a model construction module, a model training module and a model prediction module;
the data preprocessing module is used for data cleaning, feature selection, normalization and data set construction; data cleaning removes data that fall outside trading hours; feature selection takes the opening price, closing price, highest price, lowest price, trading volume, open interest and BOLL indicator as feature dimensions; normalization uses the mean-variance normalization method; data set construction comprises obtaining data of the prediction target financial asset and of other financial assets related to it, forming a feature matrix sequence, taking the feature matrix sequence as data samples, and dividing the data samples into a training set, a validation set and a test set;
the model construction module constructs a price change prediction model for the prediction target financial asset, wherein the price change prediction model comprises a diffusion model forward process, a diffusion model backward process and a multi-scale denoising score matching mechanism;
the model training module trains the price change prediction model using the training set, and the total loss function during training comprises: the KL divergence loss, the MSE loss, and the loss introduced by multi-scale denoising score matching;
the model prediction module predicts price changes of the prediction target financial asset on the test set using the price change prediction model, and evaluates the prediction performance of the price change prediction model.
Further, the model construction module implements step 2 and its sub-steps of the diffusion-model-based financial time sequence prediction method of any one of claims 3 to 6.
Further, the model training module implements step 3 and its sub-steps of the diffusion-model-based financial time sequence prediction method of claim 7.
The financial time sequence prediction method and system based on the diffusion model provided by the invention have at least the following technical effects:
1. Most existing financial asset price change prediction systems consider only the information of the single financial asset to be predicted, and do not consider the information of the multiple financial assets associated with it. The technical scheme provided by the invention fully considers the market change information of other financial assets related to the prediction target financial asset, so that the prediction model performs better;
2. Because the signal-to-noise ratio of financial time series is extremely low, models trained on them are prone to overfitting and poor generalization. The technical scheme provided by the invention therefore uses a diffusion probabilistic model for financial time sequence prediction. To reduce overfitting, the forward process of the diffusion model is first used to augment the input feature sequence: in the forward process, Gaussian noise of different magnitudes is gradually added to the input data according to a schedule. A prediction sequence is then generated through the backward process of the diffusion model. Finally, a multi-scale denoising score matching mechanism is introduced to denoise the prediction sequence generated by the backward process, which improves prediction accuracy and yields an uncertainty estimate of the prediction result.
The concept, specific structure and technical effects of the present invention are further described below with reference to the accompanying drawings, so that its objects, features and effects can be fully understood.
Drawings
FIG. 1 is a schematic diagram of a system module according to a preferred embodiment of the present invention;
FIG. 2 is a schematic diagram of a predictive model provided by the embodiment of FIG. 1;
FIG. 3 is a schematic illustration of the forward process of the diffusion model provided by the embodiment of FIG. 1;
FIG. 4 is a schematic diagram of a diffusion model backward process provided by the embodiment of FIG. 1.
Detailed Description
The following describes preferred embodiments of the present invention with reference to the accompanying drawings, to make the technical content clearer and easier to understand. The present invention may be embodied in many different forms, and its scope is not limited to the embodiments described herein.
The embodiment of the invention provides a financial time sequence change prediction method and system based on a diffusion probabilistic model. When predicting the market price change of a single financial asset, the method and system also take into account the market trend information of other related assets. The forward process of the diffusion model is used to augment the time-series feature sequence that aggregates multiple financial assets, which reduces overfitting of the time-series prediction model and improves its generalization. A prediction sequence is then generated through the backward process of the diffusion model, and a multi-scale denoising score matching mechanism is introduced to denoise the prediction sequence generated by the backward process and to obtain an uncertainty estimate of the prediction result.
The embodiment of the invention provides a financial time sequence change prediction method based on a diffusion probabilistic model. A prediction target financial asset and the financial assets associated with it are selected, and indicators such as the opening price, closing price, highest price, lowest price, trading volume, open interest and BOLL indicator are selected as features. The features at each time point are normalized by mean-variance normalization, where the mean and variance are computed from the quotation sequence over a period of time before that time point. The start and end times of the training set, validation set and test set are then determined in chronological order to complete the construction of the data set. Next, the model is constructed by the model construction module, the data set output by the data preprocessing module is input into the model, and the model is trained by the model training module. Specifically, by combining the KL divergence loss, the MSE loss and the multi-scale denoising score matching loss, the model can generate sequences closer to the original prediction target sequence, which reduces overfitting and improves the generalization performance of the prediction.
Specifically, the financial time sequence prediction method based on the diffusion model provided by the invention comprises the following steps:
step 1, selecting quotation sequence data of a prediction target financial asset, preprocessing the data to obtain data samples, and dividing the data samples into a training set, a validation set and a test set;
step 2, constructing a price change prediction model for the prediction target financial asset, wherein the price change prediction model comprises a diffusion model forward process, a diffusion model backward process and a multi-scale denoising score matching mechanism (as shown in FIG. 2);
step 3, inputting the training set into the price change prediction model for training, and validating the model on the validation set to determine the final parameters of the price change prediction model;
step 4, inputting the test set into the price change prediction model with the determined parameters to perform prediction, and evaluating the prediction performance of the price change prediction model.
In particular, step 1 comprises the following sub-steps:
step 101, selecting quotation sequence data; the quotation sequence data includes data of the prediction target financial asset and data of other financial assets related to it; for example, when predicting the price change of soda ash futures, futures contracts for upstream and downstream products in the same industry chain, such as glass, are also considered;
step 102, selecting the opening price, closing price, highest price, lowest price, trading volume, open interest and BOLL (Bollinger Bands) indicator as feature dimensions;
step 103, adopting mean-variance normalization (z-score standardization) as the feature normalization method, and normalizing each feature dimension of the quotation sequence data independently, with the calculation formula:
$$x^{*} = \frac{x - \mu}{\sigma}$$
wherein $x$ represents the original value of a feature dimension, $x^{*}$ represents the value of that feature dimension after normalization, $\mu$ represents the mean of the feature dimension over a period of time before the current time point, and $\sigma$ represents the standard deviation of the feature dimension over that period;
step 104, denoting the normalized feature matrix at time $i$ as $X_i \in \mathbb{R}^{N \times D}$, wherein $N$ represents the total number of financial assets considered (the prediction target financial asset and the other financial assets associated with it) and $D$ represents the number of feature dimensions selected; taking the $t$ feature matrices at $t$ equally spaced time points to form a feature matrix sequence $X^{(0)} = (X_1, X_2, \ldots, X_t)$, and using the feature matrix sequence $X^{(0)}$ as the input sample; the prediction target sequence corresponding to the input sample is the time sequence that follows the input sample, each element of which is the price change over one time step, i.e. the prediction target sequence $Y^{(0)}$ is:
$$(Y_{t+1}, Y_{t+2}, \ldots, Y_{\tau}) = (P_{t+1} - P_t,\; P_{t+2} - P_{t+1},\; \ldots,\; P_{\tau} - P_{\tau-1})$$
wherein $P_t$ represents the price of the prediction target financial asset at time $t$, and $Y_{t+1}$ represents the price change from time $t$ to time $t+1$; the feature matrix sequence and the prediction target sequence together form the data sample;
step 105, dividing the data samples into a training set, a validation set and a test set.
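Steps 104 and 105 can be sketched as a sliding-window construction followed by a split in time order. This is a minimal sketch under assumptions: the window length, the prediction horizon, the split sizes, the use of every possible start index, and the function names `build_samples` and `chronological_split` are all illustrative, and the patent only states that the split boundaries are chosen chronologically.

```python
def build_samples(features, prices, input_len, horizon):
    """Pair each input window with the price changes that follow it.

    features: per-time-step feature matrices (N assets x D features each)
    prices:   target-asset prices aligned with `features`
    """
    samples = []
    last_start = len(prices) - input_len - horizon
    for start in range(last_start + 1):
        end = start + input_len          # index one past the input window
        x_seq = features[start:end]      # X_1 .. X_t
        # Y_{k+1} = P_{k+1} - P_k for the `horizon` steps after the window
        y_seq = [prices[k + 1] - prices[k] for k in range(end - 1, end - 1 + horizon)]
        samples.append((x_seq, y_seq))
    return samples


def chronological_split(samples, n_train, n_val):
    """Split in time order so validation and test data come after training data."""
    train = samples[:n_train]
    val = samples[n_train:n_train + n_val]
    test = samples[n_train + n_val:]
    return train, val, test


prices = [100.0, 101.0, 103.0, 102.0, 105.0, 107.0]  # made-up prices
features = [[[p]] for p in prices]                    # N = 1 asset, D = 1 feature
samples = build_samples(features, prices, input_len=3, horizon=2)
train, val, test = chronological_split(samples, n_train=1, n_val=0)
print(samples[0][1], len(train), len(test))
```

Splitting by position rather than at random keeps later market data out of the training set. Overlapping windows near a split boundary still share time steps, so a gap between the sets may be worth adding in practice.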
In particular, step 2 comprises the following sub-steps:
step 2.1, constructing the diffusion model forward process in the price change prediction model;
step 2.2, constructing the diffusion model backward process in the price change prediction model;
step 2.3, constructing the multi-scale denoising score matching mechanism in the price change prediction model;
the diffusion model forward process in step 2.1 comprises: constructing a Markov chain in which Gaussian noise is gradually added; diffusing the input feature matrix sequence $X^{(0)} = (X_1, X_2, \ldots, X_t)$ and the corresponding prediction target sequence $Y^{(0)} = (Y_{t+1}, Y_{t+2}, \ldots, Y_{\tau})$ simultaneously, using different variance schedules, to obtain the augmented feature matrix sequence and prediction target sequence; wherein $\beta = (\beta_1, \ldots, \beta_T)$ is the diffusion variance schedule of the feature matrix sequence and $\beta' = (\beta'_1, \ldots, \beta'_T)$ is the diffusion variance schedule of the prediction target sequence (as shown in FIG. 3);
the initially input feature matrix sequence $X^{(0)}$ can be decomposed as:
$$X = \langle X_r, \epsilon_X \rangle$$
wherein $X_r$ represents the denoised part of the feature matrix sequence and $\epsilon_X$ represents the noise in the feature matrix sequence;
the initially input prediction target sequence $Y^{(0)}$ can be decomposed similarly:
$$Y = \langle Y_r, \epsilon_Y \rangle$$
wherein $Y_r$ represents the denoised part of the prediction target sequence and $\epsilon_Y$ represents the noise in the prediction target sequence;
at step $t$ of noise addition, the feature matrix sequence is expressed as:
[formula missing in source]
wherein one component represents the noise part of the signal and the other represents the denoised part of the signal;
at step $t$ of noise addition, the prediction target sequence is expressed as:
[formula missing in source]
wherein one component represents the noise part of the signal and the other represents the denoised part of the signal.
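For orientation, the closed form that a Markov chain of gradually added Gaussian noise takes in the standard denoising diffusion probabilistic model is shown below. Whether the patent's own expressions match it exactly is an assumption, since they are not reproduced in the text; the symbols $\alpha_s$, $\bar{\alpha}_t$, $\delta_X$ and $\delta_Y$ are introduced here for illustration only.

```latex
\alpha_s = 1 - \beta_s, \qquad \bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s, \qquad
X^{(t)} = \sqrt{\bar{\alpha}_t}\, X^{(0)} + \sqrt{1 - \bar{\alpha}_t}\, \delta_X,
\quad \delta_X \sim \mathcal{N}(0, I)

\alpha'_s = 1 - \beta'_s, \qquad \bar{\alpha}'_t = \prod_{s=1}^{t} \alpha'_s, \qquad
Y^{(t)} = \sqrt{\bar{\alpha}'_t}\, Y^{(0)} + \sqrt{1 - \bar{\alpha}'_t}\, \delta_Y,
\quad \delta_Y \sim \mathcal{N}(0, I)
```

If the decomposition $X = \langle X_r, \epsilon_X \rangle$ is read as additive, substituting it gives a scaled denoised part $\sqrt{\bar{\alpha}_t}\, X_r$ and a combined noise part $\sqrt{\bar{\alpha}_t}\, \epsilon_X + \sqrt{1 - \bar{\alpha}_t}\, \delta_X$, which is consistent with the two components the text describes.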
In particular, the diffusion model backward process in step 2.2 uses an NVAE (Nouveau VAE) model. The NVAE model is a deep hierarchical VAE model in which the latent variables are divided into disjoint groups in order to increase the expressive power of the approximate posterior and prior distributions (as shown in fig. 4).
After the feature matrix sequence $X'_1, X'_2, \ldots, X'_t$, diffusion-enhanced by the diffusion model forward process of the price change prediction model, is input to the encoder, the diffusion model backward process in step 2.2 comprises the following sub-steps:
step 2.2.1, extracting feature representations from the input feature matrix sequence through a bottom-up deterministic network; in extracting the feature representations, the NVAE model uses residual blocks to build a hierarchical multi-scale model that captures long-term correlations in the feature matrix sequence. The encoder first starts from the top group of latent variables $z_1$ and samples from the hierarchy group by group while progressively increasing the dimension of the latent variable space; this multi-scale approach enables the NVAE to capture global long-term dependencies at the top of the hierarchy and local dependencies of the time-series representation at the bottom.
step 2.2.2, inferring the latent variables group by group through a decoder and a top-down network, and outputting a generated sequence. Because the KL divergence between the approximate posterior distribution and the prior distribution in a hierarchical VAE structure is unbounded, training is difficult to optimize; the NVAE model therefore improves minimization of the KL term through residual parameterization of the approximate posterior parameters, and stabilizes VAE training using spectral regularization.
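By way of non-limiting illustration, the effect of residual parameterization on the KL term may be sketched for a single scalar latent variable. It is assumed here, following the usual description of NVAE, that the approximate posterior is expressed relative to the prior as $q = \mathcal{N}(\mu_p + \Delta\mu, (\sigma_p \Delta\sigma)^2)$; the function names and example values are hypothetical.

```python
import math


def kl_gaussian(mu_q, sigma_q, mu_p, sigma_p):
    """KL( N(mu_q, sigma_q^2) || N(mu_p, sigma_p^2) ) for scalar Gaussians."""
    return (math.log(sigma_p / sigma_q)
            + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2.0 * sigma_p ** 2)
            - 0.5)


def kl_residual(delta_mu, delta_sigma, sigma_p):
    """Same KL when q = N(mu_p + delta_mu, (sigma_p * delta_sigma)^2)."""
    return 0.5 * (delta_mu ** 2 / sigma_p ** 2
                  + delta_sigma ** 2
                  - math.log(delta_sigma ** 2)
                  - 1.0)


# Toy check that both forms agree (hypothetical values).
mu_p, sigma_p = 0.3, 1.5
delta_mu, delta_sigma = 0.2, 0.8
direct = kl_gaussian(mu_p + delta_mu, sigma_p * delta_sigma, mu_p, sigma_p)
residual = kl_residual(delta_mu, delta_sigma, sigma_p)
```

In the residual form the KL term depends only on the relative quantities $\Delta\mu$ and $\Delta\sigma$ and on the prior scale, and it is zero when the encoder predicts no change from the prior.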
In particular, the distribution of the sequence generated by the diffusion model backward process tends to approach the distribution of the noise-added target prediction sequence $Y'_1, Y'_2, \ldots, Y'_T$; therefore, to reduce the uncertainty of the prediction result without sacrificing prediction accuracy, further denoising is required. A multi-scale denoising score matching method is used to bring the distribution of the generated sequence closer to the true target prediction sequence distribution $Y_1, Y_2, \ldots, Y_T$, i.e. the distribution of the future market trend sequence to be predicted.
The objective function of the multi-scale denoising score matching mechanism is:
[formula not reproduced in source]
wherein $\sigma_t \in \{\sigma_1, \ldots, \sigma_T\}$, and $\{\sigma_1, \ldots, \sigma_T\}$ is a decreasing sequence;
to further remove the noise in the generated sequence, a single-step gradient denoising jump is used, specifically:
[formula not reproduced in source]
wherein [symbol not reproduced in source] is an uncertainty estimate of the prediction.
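By way of non-limiting illustration, a single-step gradient denoising jump may be sketched as follows. The update $y + \sigma^2 \nabla_y \log p(y)$ is the usual score-based (Tweedie-style) form and is an assumption here, since the formula of the present text is not reproduced; a closed-form Gaussian score stands in for the learned score network, and all names and values are hypothetical.

```python
def gaussian_score(y, mu, var):
    """Gradient of log N(y; mu, var) with respect to y."""
    return -(y - mu) / var


def denoise_jump(y_noisy, sigma, score_fn):
    """Single-step gradient denoising: y + sigma^2 * score(y)."""
    return y_noisy + sigma ** 2 * score_fn(y_noisy)


# Toy setting: clean values ~ N(mu, s2), observed with Gaussian noise of scale sigma,
# so the noisy values follow N(mu, s2 + sigma^2).
mu, s2, sigma = 0.2, 1.0, 0.5
y_noisy = 1.0
y_clean = denoise_jump(y_noisy, sigma,
                       lambda y: gaussian_score(y, mu, s2 + sigma ** 2))

# For this Gaussian toy case the jump equals the posterior mean of the clean value.
posterior_mean = (s2 * y_noisy + sigma ** 2 * mu) / (s2 + sigma ** 2)
```

In an energy-based formulation with $E(y) = -\log p(y)$, the same step reads $y - \sigma^2 \nabla_y E(y)$.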
In particular, step 3 comprises the following sub-steps:
step 3.1, defining a loss function of the training price change prediction model, wherein the loss function is expressed as:
[formula not reproduced in source]
wherein the KL divergence term and the MSE term control the generation direction of the diffusion model so that the sequence generated by the model is closer to the noise-added predicted target sequence $Y'^{(t)}$, whereas the goal of $L_{DSM}(\zeta, t)$ is to bring the predicted sequence closer to the predicted target sequence $Y^{(t)}$ before noise addition; $\psi$ and $\lambda$ are hyper-parameters controlling the weights of the different loss terms.
Step 3.2, inputting the training set into a price change prediction model for training, wherein the training process is as follows:
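1. repeat;
2. sample $Y^{(0)} \sim q(Y^{(0)})$, $\delta_X \sim \mathcal{N}(0, I_d)$, $\delta_Y \sim \mathcal{N}(0, I_d)$;
3. select $t$ from $1, \ldots, T$;
4. [formula not reproduced in source]
5. generate the latent variable $Z$ using the NVAE model, $Z \sim p_\phi(Z \mid X^{(t)})$;
6. sample from [distribution not reproduced in source] to obtain the generated sequence, and calculate the KL divergence [formula not reproduced in source];
7. calculate the multi-scale denoising score matching loss: [formula not reproduced in source]
8. calculate the overall loss function: [formula not reproduced in source]
9. $\theta, \phi \leftarrow \arg\min(L)$;
10. until convergence.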
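By way of non-limiting illustration, the loss computation of one training iteration (steps 3 to 8 above) may be sketched as follows. The model components are passed in as callables and are hypothetical stand-ins; the pairing of the weight $\psi$ with the KL term and of $\lambda$ with the score matching term, and the unit weight on the MSE term, are assumptions, since the loss formula is not reproduced. The parameter update of step 9 is omitted.

```python
import random


def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(pred)


def training_loss(x0, y0, T, diffuse_x, diffuse_y, generate,
                  kl_term, dsm_term, psi, lam, rng):
    """Loss of one iteration; every callable is a hypothetical stand-in."""
    t = rng.randint(1, T)         # step 3: pick a diffusion step
    x_t = diffuse_x(x0, t)        # step 4 (assumed): noise the feature sequence
    y_t = diffuse_y(y0, t)        #                   and the target sequence
    y_hat = generate(x_t)         # steps 5-6: latent sampling and generation
    kl = kl_term(y_hat, y_t)      # step 6: KL against the noised target
    dsm = dsm_term(y_hat, y0, t)  # step 7: denoising score matching loss
    return psi * kl + lam * dsm + mse(y_hat, y_t)  # step 8 (assumed weighting)


# Toy stand-ins so that the sketch runs end to end.
loss = training_loss(
    x0=[1.0, 2.0], y0=[1.0, 4.0], T=10,
    diffuse_x=lambda x, t: x,
    diffuse_y=lambda y, t: y,
    generate=lambda x_t: x_t,
    kl_term=lambda y_hat, y_t: 1.0,
    dsm_term=lambda y_hat, y0, t: 2.0,
    psi=0.5, lam=0.1, rng=random.Random(0),
)
```

In practice the callables would be the forward diffusion, the NVAE and the score network, and the returned loss would be minimized over the model parameters until convergence.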
step 3.3, testing different hyper-parameters and selecting the model with the best evaluation indices on the validation set, wherein the evaluation indices are the mean square error (MSE) and the continuous ranked probability score (CRPS):
[MSE formula not reproduced in source]
wherein [symbol not reproduced in source] is the predicted value at time $T$ and $Y_T$ is the true label value at time $T$;
$\mathrm{CRPS} = \int [F_f(x) - F_o(x)]^2 \, dx$
wherein $F_f(x)$ is the cumulative distribution function of the predicted value and $F_o(x)$ is the cumulative distribution function of $Y_T$; the closer the MSE and CRPS are to 0, the closer the predicted distribution is to the true distribution and the higher the prediction accuracy of the model.
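By way of non-limiting illustration, the two evaluation indices may be computed as follows. The MSE is taken in its usual form, and the CRPS is estimated from samples of the predictive distribution using the identity $\mathrm{CRPS} = \mathbb{E}|X - y| - \tfrac{1}{2}\mathbb{E}|X - X'|$, which equals the integral above when $F_f$ is the empirical distribution of the samples and $F_o$ is a step at the observation $y$; the function names and example values are hypothetical.

```python
def mse(preds, targets):
    """Mean squared error between predictions and true values."""
    return sum((p - y) ** 2 for p, y in zip(preds, targets)) / len(preds)


def crps_from_samples(samples, observation):
    """CRPS of an empirical forecast distribution against one observed value."""
    n = len(samples)
    spread_to_obs = sum(abs(s - observation) for s in samples) / n
    spread_within = sum(abs(a - b) for a in samples for b in samples) / (n * n)
    return spread_to_obs - 0.5 * spread_within


# Two forecast samples around an observation of 1.0 (toy values).
score = crps_from_samples([0.0, 2.0], 1.0)

# A point forecast that hits the observation exactly scores zero.
perfect = crps_from_samples([1.0], 1.0)

error = mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

For the two-sample example, the squared difference between the forecast and observation distribution functions is 0.25 on the interval from 0 to 2, so the integral form gives the same value of 0.5.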
The model prediction of step 4 includes the steps of:
step 4.1, generating a prediction result according to the following steps:
1. input: $X \sim q(X)$;
2. sampling: $Z \sim p_\phi(Z \mid X)$;
3. generating: [formula not reproduced in source]
4. final output: [formula not reproduced in source]
step 4.2, evaluating prediction performance by calculating the MSE and CRPS on the test set.
The invention also provides a financial time series prediction system based on the diffusion model, the system comprising: a data preprocessing module, a model construction module, a model training module, and a model prediction module (as shown in fig. 1).
The data preprocessing module is used for data cleaning, feature selection, normalization and data set construction. Data cleaning removes data that falls outside trading hours; feature selection takes the opening price, closing price, highest price, lowest price, trading volume, open interest and the BOLL indicator as feature dimensions; normalization standardizes each feature at a time point using the mean and standard deviation calculated from the feature sequence over a period of time before that time point.
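By way of non-limiting illustration, the normalization of one feature dimension using statistics from a preceding period may be sketched as follows. The window length, the use of the population standard deviation and the handling of a zero standard deviation are assumptions not specified in the text.

```python
import statistics


def rolling_zscore(values, window):
    """Normalize each value with the mean and standard deviation of the
    `window` values strictly before it; the first `window` values are skipped."""
    out = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = statistics.fmean(history)
        sigma = statistics.pstdev(history)
        # Assumption: a flat history maps to 0.0 instead of dividing by zero.
        out.append((values[i] - mu) / sigma if sigma > 0 else 0.0)
    return out


closing_prices = [1.0, 2.0, 3.0, 4.0, 5.0]  # toy feature sequence
normalized = rolling_zscore(closing_prices, window=3)
flat = rolling_zscore([7.0, 7.0, 7.0, 7.0], window=3)
```

Because only past values enter the mean and standard deviation, the normalized value at a time point does not use information from later time points.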
The data set construction includes obtaining data of the prediction target financial asset and also data of other financial assets associated with it; for example, when predicting the price trend of soda ash futures, the price trend of glass futures, which are linked through the upstream and downstream of the industry chain, is also taken into account. The features of the plurality of assets at the same time point are spliced into a two-dimensional multi-asset matrix, the two-dimensional matrices over a period of time form a feature matrix sequence, the feature matrix sequence is taken as a data sample, and the data samples are divided into a training set, a validation set and a test set.
The model construction module constructs a price change prediction model for the prediction target financial asset. The input of the price change prediction model is the historical price and feature sequences of a plurality of financial assets, and the output is the future price change sequence of the plurality of financial assets. First, the market data of the target asset to be predicted, the market data of other related assets, and technical indicators such as BOLL form a two-dimensional matrix, and the two-dimensional matrices taken at t consecutive, equally spaced time points form a time sequence that serves as the input of the prediction model. Next, the data is enhanced through the diffusion model forward process, which improves the generation and generalization capability of the model. The diffused input sequence is then fed into the diffusion model backward process to obtain a generated sequence. To bring the generated sequence closer to the real target prediction sequence, a multi-scale denoising score matching mechanism is used so that the probability distribution of the generated sequence approaches that of the real target prediction sequence, i.e. the probability distribution of the future price sequence.
Specifically, the price change prediction model comprises a diffusion model forward process, a diffusion model backward process and a multi-scale denoising score matching mechanism.
The input of the diffusion model forward process is the time-series feature matrix sequence and the predicted target sequence, and its output is the feature matrix sequence and the predicted target sequence after time-series diffusion enhancement; the purpose of the enhancement is to compensate for the limitations of the time-series data and thereby reduce overfitting of the prediction model. The forward process is a Markov chain in which each time step adds Gaussian noise of a different degree to the output sequence of the previous step. The degrees of Gaussian noise applied to the feature sequence and to the predicted target sequence are controlled by different parameters.
The diffusion model backward process adopts an NVAE model; its input is the diffusion-enhanced feature sequence and its output is the generated prediction sequence. The NVAE model is a VAE model with an encoder-decoder structure: during training, data is input to the encoder to generate latent variables, and data is then generated by the decoder; the training objective of the VAE keeps the generated data distribution close to the original data distribution while still allowing variation. NVAE improves on the conventional VAE: minimization of the KL term is improved through residual parameterization of the approximate posterior parameters, and spectral regularization is used to stabilize training, which improves the training stability and generation quality of the VAE.
The input of the multi-scale denoising score matching mechanism is the original predicted target sequence and the predicted sequence generated by the diffusion model backward process, and the output is the denoised predicted sequence. The purpose of this mechanism is to reduce uncertainty in the generated target sequence without sacrificing prediction accuracy, because the distribution of the generated sequence tends to approach that of the predicted target sequence after time-series enhancement rather than that of the original predicted target sequence without enhancement.
The model training module trains the price change prediction model using the training set, and the total loss function during training has three parts: the first is the KL divergence loss between the distribution of the sequence generated by the model and the distribution of the noise-added predicted target sequence; the second is the MSE loss between the sequence generated by the model and the noise-added predicted target sequence; the third is the loss introduced by multi-scale denoising score matching. The first two parts aim to bring the generated sequence closer to the predicted target sequence after noise is added, and the last part aims to bring the generated sequence closer to the predicted target sequence before noise is added.
The model prediction module is used for predicting the price change of the financial asset on the test set by utilizing the model trained by the training module, and the prediction performance of the model is evaluated by using the related evaluation index.
The foregoing describes in detail preferred embodiments of the present invention. It should be understood that numerous modifications and variations can be made in accordance with the concepts of the invention without requiring creative effort by one of ordinary skill in the art. Therefore, all technical solutions which can be obtained by logic analysis, reasoning or limited experiments based on the prior art by the person skilled in the art according to the inventive concept shall be within the scope of protection defined by the claims.

Claims (10)

1. A method for predicting financial time series based on a diffusion model, the method comprising the steps of:
step 1, selecting quotation sequence data of a predicted target financial asset, preprocessing the data to obtain a data sample, and dividing the data sample into a training set, a verification set and a test set;
step 2, constructing a price change prediction model of the prediction target financial asset, wherein the price change prediction model comprises a diffusion model forward process, a diffusion model backward process and a multi-scale denoising score matching mechanism;
step 3, inputting the training set into the price change prediction model for training, and carrying out model verification through the verification set to finally determine parameters of the price change prediction model;
step 4, inputting the test set into the price change prediction model with the determined parameters to perform model prediction, and evaluating the prediction performance of the price change prediction model.
2. The diffusion model-based financial time series prediction method according to claim 1, wherein the step 1 comprises the sub-steps of:
step 101, selecting the quotation sequence data; the quotation sequence data includes data of the predicted target financial asset and also includes data of other financial assets associated with the predicted target financial asset;
step 102, selecting the opening price, closing price, highest price, lowest price, trading volume, open interest and the BOLL indicator as feature dimensions;
step 103, adopting mean-variance normalization as the feature normalization method, and normalizing each feature dimension of the quotation sequence data independently, wherein the calculation formula is:
$x^{*} = \frac{x - \mu}{\sigma}$
wherein $x$ denotes the original value of a feature dimension, $x^{*}$ denotes the value of the feature dimension after normalization, $\mu$ denotes the mean of the feature dimension over a period of time before the current time point, and $\sigma$ denotes the standard deviation of the feature dimension over a period of time before the current time point;
step 104, denoting the normalized feature matrix at time $i$ as $X_i \in \mathbb{R}^{N \times D}$, wherein $N$ denotes the number of financial assets, namely the prediction target financial asset and the other financial assets associated with it, and $D$ denotes the number of selected feature dimensions; taking the $t$ feature matrices at $t$ equally spaced time points to form the feature matrix sequence $X^{(0)}$, i.e. $(X_1, X_2, \ldots, X_t)$, and taking the feature matrix sequence $X^{(0)}$ as the input sample; the predicted target sequence corresponding to the input sample is a time sequence following the input sample; each element of the predicted target sequence is the price change over one time step, i.e. the predicted target sequence $Y^{(0)}$ is:
$(Y_{t+1}, Y_{t+2}, \ldots, Y_\tau) = (P_{t+1} - P_t, P_{t+2} - P_{t+1}, \ldots, P_\tau - P_{\tau-1})$
wherein $P_t$ denotes the price of the prediction target financial asset at time $t$, and $Y_{t+1}$ denotes the price change from time $t$ to time $t+1$; the feature matrix sequence and the predicted target sequence form the data sample;
step 105, dividing the data samples into the training set, the validation set and the test set.
3. The diffusion model-based financial time series prediction method according to claim 2, wherein the step 2 comprises the sub-steps of:
step 2.1, constructing a forward process of the diffusion model in the price change prediction model;
step 2.2, constructing a backward process of the diffusion model in the price change prediction model;
step 2.3, constructing the multi-scale denoising score matching mechanism in the price change prediction model;
wherein the diffusion model forward process in step 2.1 comprises: constructing a Markov chain in which Gaussian noise is gradually added; diffusing the input feature matrix sequence $X^{(0)} = (X_1, X_2, \ldots, X_t)$ and the corresponding predicted target sequence $Y^{(0)} = (Y_{t+1}, Y_{t+2}, \ldots, Y_\tau)$ simultaneously using different variance schedules to obtain the enhanced feature matrix sequence and predicted target sequence; wherein $\beta = (\beta_1, \ldots, \beta_T)$ is the diffusion variance schedule of the feature matrix sequence and $\beta' = (\beta'_1, \ldots, \beta'_T)$ is the diffusion variance schedule of the predicted target sequence;
the initially input feature matrix sequence $X^{(0)}$ can be decomposed as follows:
$X = \langle X_r, \epsilon_X \rangle$
wherein $X_r$ denotes the denoised part of the feature matrix sequence and $\epsilon_X$ denotes the noise in the feature matrix sequence;
the initially input predicted target sequence $Y^{(0)}$ can be decomposed similarly:
$Y = \langle Y_r, \epsilon_Y \rangle$
wherein $Y_r$ denotes the denoised part of the predicted target sequence and $\epsilon_Y$ denotes the noise in the predicted target sequence;
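at diffusion time step $t$ after noise is added, the feature matrix sequence is expressed as:
[formula not reproduced in source]
wherein one component of the expression represents the noise part of the signal and the other represents the denoised part of the signal;
at diffusion time step $t$ after noise is added, the predicted target sequence is expressed as:
[formula not reproduced in source]
wherein one component of the expression represents the noise part of the signal and the other represents the denoised part of the signal.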
4. The diffusion model-based financial time series prediction method according to claim 3, wherein the diffusion model backward process in step 2.2 employs an NVAE model.
5. The diffusion model based financial time series prediction method according to claim 4, wherein after the feature matrix sequence $X'_1, X'_2, \ldots, X'_t$, diffusion-enhanced by the diffusion model forward process in the price change prediction model, is input to the encoder, the diffusion model backward process in step 2.2 comprises the following sub-steps:
2.2.1, extracting feature representation from the input feature matrix sequence through a bottom-up deterministic network; in the process of extracting the feature representation, the NVAE model uses a residual block to construct a hierarchical multi-scale model to model long-term correlation in the feature matrix sequence;
step 2.2.2, inferring the latent variables group by group through a decoder and a top-down network, and outputting a generated sequence.
6. The diffusion model based financial time series prediction method of claim 5, wherein the multi-scale denoising score matching mechanism of step 2.3 brings the distribution of the generated sequence closer to the true distribution $Y_1, Y_2, \ldots, Y_T$ of the predicted target sequence;
the objective function of the multi-scale denoising score matching mechanism is as follows:
[formula not reproduced in source]
wherein $\sigma_t \in \{\sigma_1, \ldots, \sigma_T\}$, and $\{\sigma_1, \ldots, \sigma_T\}$ is a decreasing sequence;
to further remove the noise in the generated sequence, a single-step gradient denoising jump is used, specifically:
[formula not reproduced in source]
wherein [symbol not reproduced in source] is an uncertainty estimate of the prediction.
7. The diffusion model based financial time series prediction method according to claim 6, wherein said step 3 comprises the sub-steps of:
step 3.1, defining a loss function for training the price change prediction model, wherein the loss function is expressed as:
[formula not reproduced in source]
wherein the KL divergence term and the MSE term control the generation direction of the diffusion model so that the sequence generated by the model is closer to the noise-added predicted target sequence $Y'^{(t)}$, whereas the goal of $L_{DSM}(\zeta, t)$ is to bring the predicted sequence closer to the predicted target sequence $Y^{(t)}$ before noise addition; $\psi$ and $\lambda$ are hyper-parameters controlling the weights of the different loss terms;
step 3.2, inputting the training set into the price change prediction model for training;
step 3.3, testing different hyper-parameters and selecting the model with the best evaluation indices on the validation set, wherein the evaluation indices are the mean square error (MSE) and the continuous ranked probability score (CRPS):
[MSE formula not reproduced in source]
wherein [symbol not reproduced in source] is the predicted value at time $T$ and $Y_T$ is the true label value at time $T$;
$\mathrm{CRPS} = \int [F_f(x) - F_o(x)]^2 \, dx$
wherein $F_f(x)$ is the cumulative distribution function of the predicted value and $F_o(x)$ is the cumulative distribution function of $Y_T$; the closer the MSE and CRPS are to 0, the higher the prediction accuracy of the price change prediction model.
8. A financial asset price change prediction system based on a probabilistic diffusion model, the system comprising: the system comprises a data preprocessing module, a model construction module, a model training module and a model prediction module;
the data preprocessing module is used for data cleaning, feature selection, normalization and data set construction; the data cleaning removes data that falls outside trading hours; the feature selection takes the opening price, closing price, highest price, lowest price, trading volume, open interest and the BOLL indicator as feature dimensions; the normalization adopts a mean-variance normalization method; the data set construction comprises obtaining data of a prediction target financial asset and data of other financial assets associated with the prediction target financial asset, forming a feature matrix sequence, taking the feature matrix sequence as a data sample, and dividing the data samples into a training set, a validation set and a test set;
the model construction module constructs a price change prediction model of the prediction target financial asset, wherein the price change prediction model comprises a diffusion model forward process, a diffusion model backward process and a multi-scale denoising score matching mechanism;
the model training module trains the price change prediction model by using the training set, and the total loss function during training comprises: KL divergence loss, MSE loss, and loss introduced by multi-scale denoising score matching;
the model prediction module predicts price changes of the prediction target financial asset on the test set by using the price change prediction model, and evaluates prediction performance of the price change prediction model.
9. The financial asset price change prediction system based on a probabilistic diffusion model of claim 8, wherein said model construction module further comprises said step 2 and its sub-steps in the diffusion model based financial time series prediction method of any one of claims 3 to 6.
10. The probabilistic diffusion model-based financial asset price change prediction system of claim 8, wherein the model training module further comprises the step 3 and sub-steps thereof in the diffusion model-based financial time series prediction method of claim 7.
CN202310710238.2A 2023-06-15 2023-06-15 Financial time sequence prediction method and system based on diffusion model Pending CN116703607A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310710238.2A CN116703607A (en) 2023-06-15 2023-06-15 Financial time sequence prediction method and system based on diffusion model

Publications (1)

Publication Number Publication Date
CN116703607A true CN116703607A (en) 2023-09-05

Family

ID=87840761

Country Status (1)

Country Link
CN (1) CN116703607A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117312777A (en) * 2023-11-28 2023-12-29 北京航空航天大学 Industrial equipment time sequence generation method and device based on diffusion model
CN117312777B (en) * 2023-11-28 2024-02-20 北京航空航天大学 Industrial equipment time sequence generation method and device based on diffusion model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination