CN110633728A - Financial signal mining method and system based on Monte Carlo search algorithm - Google Patents

Info

Publication number
CN110633728A
Authority
CN
China
Prior art keywords
tree model
stock
adjusted
bayesian
probability
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910708355.9A
Other languages
Chinese (zh)
Inventor
金滢
郭健
郭家栋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Deep Asset Management Co Ltd
Peng Cheng Laboratory
Original Assignee
Hangzhou Deep Asset Management Co Ltd
Peng Cheng Laboratory
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Deep Asset Management Co Ltd, Peng Cheng Laboratory filed Critical Hangzhou Deep Asset Management Co Ltd
Priority to CN201910708355.9A priority Critical patent/CN110633728A/en
Publication of CN110633728A publication Critical patent/CN110633728A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/06 Asset management; Financial planning or analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Game Theory and Decision Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Operations Research (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Human Resources & Organizations (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a financial signal mining method and system based on a Monte Carlo search algorithm. For each Markov chain, a Bayesian tree model of a corresponding mathematical expression is generated in a tree structure, and the Bayesian tree model is adjusted by a proposal function to obtain an adjusted Bayesian tree model and parameters; each group of stock data in the stock data set is substituted into the mathematical expression represented by the adjusted Bayesian tree model, and a data expression for stock price prediction is obtained from the adjusted Bayesian tree model. In the embodiment, a mathematical expression representing stock information is converted into an equivalent tree structure, a Bayesian tree model is established for the tree structure, a symbolic regression system is built with the Bayesian tree model, the symbolic model is sampled with a Markov chain Monte Carlo algorithm, an optimal model is selected from the samples, and the fitting result is used to predict future stock prices, so as to estimate future stock prices accurately.

Description

Financial signal mining method and system based on Monte Carlo search algorithm
Technical Field
The invention relates to a financial signal mining method and system based on a Monte Carlo search algorithm.
Background
The Markov chain Monte Carlo method (MCMC method) is a method, based on Bayesian theory, for computing integrals by Monte Carlo simulation on a computer. It solves the problem of high-dimensional complex integration over Bayesian posterior distributions and has broadened the field of application of Bayesian inference. At present, Bayesian survival analysis based on the MCMC method is widely applied in various disciplines. The method introduces a Markov process into Monte Carlo simulation, overcomes the limitation that Monte Carlo integration can only be used for static simulation, and realizes dynamic simulation in which the sampling distribution changes as the simulation proceeds. It has solved practical Bayesian modeling and computation problems in engineering structural reliability research, improving the effectiveness and operability of the models.
In the financial market, the stock market has pronounced characteristics of its own and places high demands on investors; ordinary investors cannot accurately predict stock prices, and blind investment only leads to capital loss. The price of an asset may be regarded as a combination of signals, such as a price signal, a transaction amount signal, and the like. Most existing stock price prediction models adopt a fixed form of signal combination, such as linear regression. In the actual market, however, the form of the combination is often unknown, so a technique is needed that can better identify the signals underlying an asset price and predict the stock price.
Therefore, the prior art is subject to further improvement.
Disclosure of Invention
In view of the defects in the prior art, the invention provides a financial signal mining method and system based on a Monte Carlo search algorithm, addressing the lack in the prior art of a technique for accurately predicting stock prices.
The embodiment discloses a financial signal mining method based on a Monte Carlo search algorithm, wherein the method comprises the following steps:
acquiring a historical stock data set; each group of stock data contained in the stock data set contains stock characteristic information and stock price information;
generating an initial Bayesian tree model for each Markov chain in the tree structure according to a preset tree structure and prior distribution of parameters, and adjusting the initial Bayesian tree model according to a preset proposal function to obtain an adjusted Bayesian tree model and parameters;
respectively substituting each group of stock data in the stock data set into the adjusted Bayesian tree model, and calculating the acceptance probability of the adjusted Bayesian tree model of each Markov chain;
judging whether the number of adjusted Bayesian tree models of the Markov chains whose acceptance probability is greater than the preset probability exceeds the preset acceptance number;
if it does, obtaining a prediction data expression according to the adjusted Bayesian tree model, and predicting the stock price according to the prediction data expression.
Optionally, the step of generating an initial Bayesian tree model for each Markov chain in the tree structure according to a preset tree structure and the prior distribution of the parameters includes:
generating prior distribution of the tree structure by using a computer random number and preset probability distribution;
and generating an initial Bayesian tree model according to the prior distribution by using a random sampling method.
Optionally, the step of adjusting the initial Bayesian tree model according to a preset proposal function to obtain an adjusted Bayesian tree model and parameters includes:
adjusting parameters of at least one node in the initial Bayesian tree model to obtain an adjusted Bayesian tree model;
or/and adjusting at least one leaf node in the initial Bayesian tree model into a child node, and growing the leaf node according to the prior distribution;
or/and adjusting at least one child node in the initial Bayesian tree model to be a leaf node, and setting the probability value adjusted to be the leaf node as the probability mean value of each node;
or/and adjusting the operators represented by the nodes in the initial Bayesian tree model;
or/and adjusting the input stock characteristic information corresponding to at least one leaf node in the initial Bayesian tree model.
Optionally, the step of substituting each group of stock data in the stock data set into the adjusted Bayesian tree model and calculating the acceptance probability of the adjusted Bayesian tree model of each Markov chain includes:
calculating the acceptance probability of the adjusted Bayesian tree model of each Markov chain by reversible jump Markov chain Monte Carlo.
Optionally, after the step of judging whether the number of adjusted Bayesian tree models of the Markov chains whose acceptance probability is greater than the preset probability exceeds the preset acceptance number, the method further includes:
saving the log joint likelihood value and the estimation error corresponding to each adjusted Bayesian tree model.
Optionally, the step of obtaining the prediction data expression according to the adjusted Bayesian tree model if the preset acceptance number is exceeded includes:
judging whether the Markov chain corresponding to each adjusted Bayesian tree model has converged according to the log joint likelihood value and the estimation error;
if the Markov chain has converged, selecting a plurality of samples at the end of each Markov chain as posterior samples;
and fitting according to the selected posterior samples to obtain the prediction data expression.
Optionally, the step of predicting the stock price according to the prediction data expression includes:
acquiring a stock data set to be predicted;
inputting the stock characteristic information contained in the stock data set to be predicted into the prediction data expression to obtain the predicted stock price corresponding to the stock data set to be predicted.
On the basis of the above embodiment, the present invention also discloses a financial signal mining system based on the Monte Carlo search algorithm, wherein the system comprises:
the historical data collection module is used for acquiring a historical stock data set; each group of stock data contained in the stock data set contains stock characteristic information and stock price information;
the model establishing module is used for generating an initial Bayesian tree model for each Markov chain in the tree structure according to a preset tree structure and the prior distribution of the parameters, and adjusting the initial Bayesian tree model according to a preset proposal function to obtain an adjusted Bayesian tree model and parameters;
the acceptance probability calculation module is used for substituting each group of stock data in the stock data set into the adjusted Bayesian tree model respectively and calculating the acceptance probability of the adjusted Bayesian tree model of each Markov chain;
the acceptance number calculation module is used for judging whether the number of adjusted Bayesian tree models of the Markov chains whose acceptance probability is greater than the preset probability exceeds the preset acceptance number;
and the input prediction module is used for obtaining a prediction data expression according to the adjusted Bayesian tree model and predicting the stock price according to the prediction data expression when the number of adjusted Bayesian tree models of the Markov chains whose acceptance probability is greater than the preset probability exceeds the preset acceptance number.
The embodiment also discloses a computer device, which comprises a memory and a processor, wherein the memory stores a computer program, and the processor implements the steps of the method when executing the computer program.
The present embodiment also discloses a computer-readable storage medium having a computer program stored thereon, wherein the computer program realizes the steps of the method when being executed by a processor.
Compared with the prior art, the embodiment of the invention has the following advantages:
According to the method and the related equipment provided by the embodiment of the invention, the stock data and its labels are expressed as a mathematical expression in variables, the mathematical expression is converted into an equivalent tree structure, a Bayesian tree model is established for the tree structure, a symbolic regression system is built with the Bayesian tree model, the symbolic model is sampled with a Markov chain Monte Carlo algorithm, an optimal model is selected from the samples, and the future stock price is predicted with the fitting result, so that the future stock price is estimated accurately and reference information is provided for investors.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments described in the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a flow chart illustrating the steps of a method for mining financial signals based on a Monte Carlo search algorithm according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the tree structure used in the financial signal mining method according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention is further described in detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In the prior art, symbolic regression refers to a system that fits an algebraic expression (composed of basic mathematical operations such as +, -, ×, / and the like) to given input data and outputs. It can build a symbolic model of a nonlinear system without relying on prior knowledge of the model, and is currently applied in fields such as the fitting of engineering relations and signal detection. The genetic algorithm is a classical method for constructing a symbolic regression system, but its initialization has a large influence on its performance, and the algorithm is relatively rigid, which makes it difficult to introduce prior knowledge about the mathematical expression (for example, a preference for certain operators, or control over part of the structure). In view of these problems in the prior art, the present embodiment discloses a new symbolic regression system to predict the stock price more accurately.
The embodiment discloses a financial signal mining method based on a Monte Carlo search algorithm, and as shown in FIG. 1, the method comprises the following steps:
step S1, acquiring a historical stock data set; each group of stock data contained in the stock data set contains stock characteristic information and stock price information.
Historical stock information is collected and assembled into a historical stock data set. The stock data set comprises a plurality of groups of stock data, and each group of stock data contains the stock characteristic information and the stock price information of one stock. The stock characteristic information includes yesterday's stock price, the previous-day stock price, yesterday's transaction amount, and the like; the stock price information is information representing the stock price, such as the rate of return of the stock.
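As an illustration of the data layout described in step S1, the following minimal Python sketch builds one record per trading day from a price series and a transaction-amount series. The names `StockRecord` and `build_dataset`, the choice of three features, and the use of the one-day return as the price information are assumptions made for this example; the source does not fix a concrete layout.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StockRecord:
    features: List[float]  # stock characteristic information
    target: float          # stock price information (here: one-day return)


def build_dataset(prices: List[float], amounts: List[float]) -> List[StockRecord]:
    """Turn daily prices and transaction amounts into (features, target) records."""
    records = []
    for t in range(2, len(prices)):
        # Hypothetical feature choice: the two most recent prices and the
        # most recent transaction amount.
        features = [prices[t - 1], prices[t - 2], amounts[t - 1]]
        target = prices[t] / prices[t - 1] - 1.0  # return from day t-1 to day t
        records.append(StockRecord(features, target))
    return records


prices = [10.0, 10.5, 10.2, 10.8]
amounts = [1.0e6, 1.2e6, 0.9e6, 1.1e6]
dataset = build_dataset(prices, amounts)
```

Each record pairs a feature vector with the quantity to be predicted, which is the form in which the later steps substitute stock data into a tree model.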
Step S2, generating an initial Bayesian tree model for each Markov chain in the tree structure according to the preset tree structure and the prior distribution of the parameters, and adjusting the initial Bayesian tree model according to a preset proposal function to obtain an adjusted Bayesian tree model and parameters.
The tree structure is a nonlinear data structure in which each element is called a node. Nodes are divided into parent nodes, child nodes and leaf nodes; a parent node is connected to one or more child nodes, and a leaf node is a node without children. Each non-leaf node represents an operator: the data of its child nodes are combined by that operator, and the result becomes the data of the node. The operators are mathematical operations; for example, + represents addition and exp() represents the exponential transformation. The operands of the mathematical expression are thus represented by the elements of the nodes, and its operators by the connections between the nodes, so that the mathematical expression is converted into a data structure that is easy for a computer to read, which speeds up its evaluation. A binary operator node has two child nodes; a binary operator is an operator that needs two operands, for example + and ×. A unary operator node has one child node; a unary operator is an operator that needs only one operand, for example the exponential exp() and the trigonometric function cos(). Each leaf node represents a stock feature. For example, if the input data is $x = (x^1, \dots, x^d)$ with d features, a given leaf node selects one of these features. The input data is the stock characteristic information: the elements corresponding to the leaf nodes correspond one to one to stock features such as yesterday's stock price, the previous-day stock price and yesterday's trading volume.
For example, the expression $\cos(x^1 + x^2)$ can be represented by the tree structure in FIG. 2, in which the root node is labeled cos, its child node is the node labeled +, and the child nodes of the + node are two leaf nodes; the superscripts 1 and 2 denote the inputs $x^1$ and $x^2$.
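The tree described for FIG. 2 (a cos root, a + child, and two leaves selecting the first and second input) can be sketched in Python as follows. The `Node` class, the operator tables and the `evaluate` function are hypothetical names introduced for this illustration only.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional

UNARY = {"cos": math.cos, "exp": math.exp}
BINARY = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}


@dataclass
class Node:
    op: Optional[str] = None       # operator label for non-leaf nodes
    feature: Optional[int] = None  # 1-based input index for leaf nodes
    children: List["Node"] = field(default_factory=list)


def evaluate(node: Node, x: List[float]) -> float:
    """Evaluate the expression tree bottom-up on one input vector."""
    if node.op is None:                      # leaf: pick the selected feature
        return x[node.feature - 1]
    args = [evaluate(child, x) for child in node.children]
    if node.op in UNARY:                     # one child, one operand
        return UNARY[node.op](args[0])
    return BINARY[node.op](args[0], args[1])  # two children, two operands


# cos(x^1 + x^2): root "cos", child "+", two leaves selecting inputs 1 and 2.
tree = Node(op="cos", children=[
    Node(op="+", children=[Node(feature=1), Node(feature=2)]),
])
value = evaluate(tree, [0.25, 0.75])
```

Evaluation recurses from the root down to the leaves, so each non-leaf node only needs to know its operator and its children.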
The prior distribution is information known about an unknown parameter, such as real data, known in advance, of a quantity to be predicted. Because a certain tree structure is designed in advance in this step, the prior distribution of the parameters can be obtained from the designed tree structure at the same time.
The step of generating an initial Bayesian tree model for each Markov chain in the tree structure according to a preset tree structure and the prior distribution of the parameters comprises:
generating prior distribution of the tree structure by using a computer random number and preset probability distribution;
and generating an initial Bayesian tree model according to the prior distribution by using a random sampling method.
For example, let T denote the tree structure and M the feature inputs represented by the leaf nodes. Their joint prior distribution f(S) = f(T, M) is defined as follows. Starting from the root node (the final operation), each node η is handled according to its depth (the depth of a node is the number of nodes passed through on the way from that node to the root node). Using a computer random number, the node generates one child node with probability [formula] (where the parameters [symbol] and [symbol] are pre-assigned) and represents a unary operator chosen with average probability; it generates two child nodes with probability [formula] (where the parameters [symbol] and [symbol] are pre-assigned) and represents a binary operator chosen with average probability; and with the remaining probability [formula] it becomes a leaf node, whose input feature is determined with uniform probability. Here the average (equipartition) probability means that each candidate is selected with equal probability.
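The top-down generation of a tree from the prior can be sketched as below. The source gives the three probabilities only as formula images, so the depth-decaying form `alpha * (1 + depth) ** -beta` used here is an assumption chosen for illustration, as are the names `sample_tree`, `alpha1`, `alpha2`, `beta` and the `max_depth` safety cap.

```python
import random

UNARY_OPS = ["exp", "inv", "lt"]
BINARY_OPS = ["+", "*"]


def sample_tree(n_features, rng, depth=0, alpha1=0.4, alpha2=0.4, beta=1.0, max_depth=8):
    """Draw a tree as nested tuples: ("x", i), (unary_op, child) or (binary_op, left, right)."""
    # ASSUMPTION: split probabilities shrink with depth; not taken from the source.
    decay = (1 + depth) ** (-beta)
    p_unary, p_binary = alpha1 * decay, alpha2 * decay
    u = rng.random()
    if depth < max_depth:  # hypothetical cap that guarantees termination
        if u < p_unary:
            child = sample_tree(n_features, rng, depth + 1, alpha1, alpha2, beta, max_depth)
            return (rng.choice(UNARY_OPS), child)        # operator chosen uniformly
        if u < p_unary + p_binary:
            left = sample_tree(n_features, rng, depth + 1, alpha1, alpha2, beta, max_depth)
            right = sample_tree(n_features, rng, depth + 1, alpha1, alpha2, beta, max_depth)
            return (rng.choice(BINARY_OPS), left, right)  # operator chosen uniformly
    # Remaining probability: become a leaf and pick an input feature uniformly.
    return ("x", rng.randrange(n_features))


def leaf_features(tree):
    """Collect the feature indices selected by the leaves, left to right."""
    if tree[0] == "x":
        return [tree[1]]
    return [i for child in tree[1:] for i in leaf_features(child)]


rng = random.Random(0)
tree = sample_tree(n_features=3, rng=rng)
```

Letting the split probability decrease with depth keeps sampled expressions short on average; any other pre-assigned rule with the same three outcomes would fit the description equally well.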
The basic operators and the prior distributions of their parameters are designed as follows:
the basic binary operators + and × are used, together with the unary operators exp() (exponential transform), inv(x) = 1/x, and the linear transform lt(x) = ax + b, where a and b are linear parameters that are randomly sampled from the prior distribution. All linear parameters are represented by a length-invariant vector θ. The prior distribution of a is [formula], where [symbol] obeys an inverse gamma distribution; the prior distribution of b is [formula], where [symbol] obeys an inverse gamma distribution.
The output noise distribution is set as follows:
suppose that the tree structure and the linear parameters correspond to the function [formula] of the input [symbol]. The corresponding label output is then [formula], in which the noise (error) [symbol] obeys the distribution [formula], and [symbol] obeys an inverse gamma distribution.
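The hierarchical setup for the linear operator lt(x) = ax + b and for the output noise can be sketched as follows. The source states only that certain quantities obey inverse gamma distributions; the Gaussian form of the priors on a and b, their centres (1 and 0), the Gaussian noise, and the shape and scale values are all assumptions made for this example.

```python
import random


def sample_inverse_gamma(shape, scale, rng):
    """Draw from InvGamma(shape, scale) as the reciprocal of a Gamma draw."""
    # random.gammavariate takes (shape, scale); a Gamma with scale 1/scale
    # has rate `scale`, and its reciprocal is inverse-gamma distributed.
    return 1.0 / rng.gammavariate(shape, 1.0 / scale)


def sample_lt_params(rng, shape=3.0, scale=2.0):
    """Sample (a, b) for lt(x) = a*x + b under an ASSUMED Gaussian prior."""
    var_a = sample_inverse_gamma(shape, scale, rng)  # variance hyperparameter of a
    var_b = sample_inverse_gamma(shape, scale, rng)  # variance hyperparameter of b
    a = rng.gauss(1.0, var_a ** 0.5)  # assumed centre: identity slope
    b = rng.gauss(0.0, var_b ** 0.5)  # assumed centre: zero offset
    return a, b


def noisy_output(g_value, noise_var, rng):
    """Label = function value plus zero-mean noise (ASSUMED Gaussian)."""
    return g_value + rng.gauss(0.0, noise_var ** 0.5)


rng = random.Random(42)
a, b = sample_lt_params(rng)
noise_var = sample_inverse_gamma(3.0, 2.0, rng)
y = noisy_output(a * 0.5 + b, noise_var, rng)
```

An inverse gamma prior on a variance is a common conjugate choice for Gaussian models, which is why it is used as the stand-in here.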
The proposal function [formula] is the probability distribution of proposing a new structure [symbol] starting from the existing structure [symbol]; it does not include the sampling of the linear parameters. Starting from the previously designed tree structure [symbol], the structure is changed to obtain a new structure [symbol]. The methods by which the proposal function generates the Bayesian tree model of the new structure include:
1. keeping the tree structure of the initial Bayesian tree model unchanged, and adjusting parameters, wherein the parameters comprise: stock characteristics corresponding to the nodes or operation operators among the nodes;
2. adjusting the tree structure of the initial Bayesian tree model, for example: growing, namely continuously growing the leaf nodes according to prior distribution; pruning, selecting a certain non-leaf node, and setting the non-leaf node as a leaf node.
After the tree structure is preset, each Markov chain in the tree structure generates a Bayesian tree model and the parameters of the Bayesian tree model from the prior distribution [formula] by random sampling.
Specifically, the step of adjusting the initial Bayesian tree model according to a preset proposal function to obtain an adjusted Bayesian tree model and parameters includes:
adjusting parameters of at least one node in the Bayesian tree model to obtain an adjusted Bayesian tree model;
or/and adjusting at least one leaf node in the Bayesian tree model into a child node, and growing the leaf node according to the prior distribution;
or/and adjusting at least one child node in the Bayes tree model to be a leaf node, and setting the probability value adjusted to be the leaf node as the probability mean value of each node;
or/and adjusting operators represented by nodes in the Bayesian tree model;
or/and adjusting the input stock characteristic information corresponding to at least one leaf node in the Bayesian tree model.
Since the above modifications can be implemented individually or in combination, the specific implementation can be designed according to the situation of the tree structure.
The proposal function is the probability distribution of proposing a new structure starting from the existing structure [symbol]; it does not include the sampling of [symbol]. Five kinds of changes are included:
1) the structure is unchanged, and the parameter information is adjusted;
2) growing: selecting a certain leaf node according to the equipartition probability, and continuing to grow according to the prior distribution;
3) pruning: selecting a certain non-leaf node according to the equipartition probability, setting the non-leaf node as a leaf node, and selecting a certain characteristic according to the equipartition probability for inputting;
4) modifying an operator: selecting a certain non-leaf node according to the equipartition probability, and resetting its operator according to the equipartition probability. If a unary operator is changed into a binary operator, another child node (the right child node) is grown according to the prior distribution; if a binary operator is changed into a unary operator, the right child branch is cut off directly;
5) modifying the characteristics: and selecting a certain leaf node according to the equipartition probability, and resetting characteristic input according to the equipartition probability.
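The five changes listed above can be sketched as a single proposal routine. This is a simplified illustration: the `Node` class, the move names, and `random_subtree` (a stand-in for growth under the prior with a hypothetical constant split probability) are assumptions, and linear parameters are omitted, so the "stay" move leaves the tree unchanged.

```python
import copy
import random

UNARY_OPS = ["exp", "inv"]
BINARY_OPS = ["+", "*"]
MOVES = ["stay", "grow", "prune", "reop", "refeature"]


class Node:
    def __init__(self, op=None, feature=None, children=None):
        self.op, self.feature, self.children = op, feature, children or []

    def is_leaf(self):
        return self.op is None


def all_nodes(root):
    out, stack = [], [root]
    while stack:
        node = stack.pop()
        out.append(node)
        stack.extend(node.children)
    return out


def random_subtree(n_features, rng, p_split=0.3, depth=0, max_depth=4):
    # Stand-in for "grow according to the prior"; p_split is hypothetical.
    if depth < max_depth and rng.random() < p_split:
        op = rng.choice(UNARY_OPS + BINARY_OPS)
        arity = 2 if op in BINARY_OPS else 1
        kids = [random_subtree(n_features, rng, p_split, depth + 1, max_depth)
                for _ in range(arity)]
        return Node(op=op, children=kids)
    return Node(feature=rng.randrange(n_features))


def is_valid(node, n_features):
    if node.is_leaf():
        return not node.children and 0 <= node.feature < n_features
    arity = 2 if node.op in BINARY_OPS else 1
    return len(node.children) == arity and all(is_valid(c, n_features) for c in node.children)


def propose(root, n_features, rng, weights=(0.2, 0.2, 0.2, 0.2, 0.2)):
    """Return (move, new_tree); the input tree is left untouched."""
    new = copy.deepcopy(root)
    nodes = all_nodes(new)
    leaves = [n for n in nodes if n.is_leaf()]
    inner = [n for n in nodes if not n.is_leaf()]
    move = rng.choices(MOVES, weights=weights)[0]
    if move in ("prune", "reop") and not inner:
        move = "stay"  # a single-leaf tree has nothing to prune or relabel
    if move == "grow":  # 2) replace a uniformly chosen leaf by a sampled subtree
        leaf, sub = rng.choice(leaves), random_subtree(n_features, rng)
        leaf.op, leaf.feature, leaf.children = sub.op, sub.feature, sub.children
    elif move == "prune":  # 3) turn a non-leaf node into a leaf with a random feature
        node = rng.choice(inner)
        node.op, node.children, node.feature = None, [], rng.randrange(n_features)
    elif move == "reop":  # 4) reset the operator, fixing the number of children
        node = rng.choice(inner)
        node.op = rng.choice(UNARY_OPS + BINARY_OPS)
        if node.op in BINARY_OPS and len(node.children) == 1:
            node.children.append(random_subtree(n_features, rng))  # grow right child
        elif node.op in UNARY_OPS and len(node.children) == 2:
            node.children.pop()                                    # cut right child
    elif move == "refeature":  # 5) reset the input feature of one leaf
        rng.choice(leaves).feature = rng.randrange(n_features)
    return move, new
```

A full sampler would also need the forward and reverse proposal probabilities of each move to compute the acceptance probability; the fallback to "stay" in this sketch changes those probabilities and would have to be accounted for.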
Each of the five changes above generates a new structure with probability [symbol]. These probabilities can be specified in advance in the system, and can also adjust themselves toward the desired search direction, for example by pruning with greater probability when the tree is large.
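One way the move probabilities might adapt to the size of the tree is sketched below. The function name `move_weights`, the `soft_limit` threshold and the factors 2.0 and 0.5 are hypothetical; the source only says that pruning can be made more likely for larger trees.

```python
def move_weights(n_nodes, soft_limit=15):
    """Return normalized probabilities for the five moves given the tree size."""
    weights = {"stay": 0.2, "grow": 0.2, "prune": 0.2, "reop": 0.2, "refeature": 0.2}
    if n_nodes > soft_limit:
        # Hypothetical rule: favour pruning and discourage growth for big trees.
        weights["prune"] *= 2.0
        weights["grow"] *= 0.5
    total = sum(weights.values())
    return {move: w / total for move, w in weights.items()}


small = move_weights(5)
large = move_weights(40)
```

Because the weights depend on the current tree, the forward and reverse proposal probabilities differ between the old and new structure, and both have to enter the acceptance probability.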
Step S3, substituting each group of stock data in the stock data set into the adjusted Bayesian tree model respectively, and calculating the acceptance probability of the adjusted Bayesian tree model of each Markov chain.
After the adjusted Bayesian tree models corresponding to the Markov chains are obtained in step S2, the historical stock data set obtained in step S1 is input into the Bayesian tree models, the acceptance probability of each adjusted Bayesian tree model is calculated, and it is determined from the output of each Bayesian tree model whether the acceptance probability of the adjusted Bayesian tree model exceeds the preset probability [symbol]. If it does, the adjusted Bayesian tree model is accepted; if it does not, the adjusted Bayesian tree model is not accepted and the structure before adjustment is kept.
The step of calculating the acceptance probability of the adjusted Bayesian tree model of each Markov chain comprises the following step:
calculating the acceptance probability of the adjusted Bayesian tree model of each Markov chain by reversible jump Markov chain Monte Carlo. Reversible jump Markov chain Monte Carlo calculates the probability of each variable being selected by simulating a Markov chain that obeys a stationary distribution.
The linear parameters are sampled with the reversible jump Markov chain Monte Carlo algorithm (RJMCMC), and the acceptance probability of the adjusted Bayesian tree model of each Markov chain is calculated as follows:
when the number of linear operators lt() changes, the dimension of the parameters [symbol] changes, and the reversible jump Markov chain algorithm is used to sample the variable-dimension parameter space. Specifically, according to the dimension, auxiliary parameters are sampled first and an equivalent transformation is then applied to obtain the new parameters; the new structure and parameters are accepted with the corresponding probability [formula], and otherwise the original structure and parameters are kept.
When the number of linear operators lt() does not change, a Metropolis-Hastings step is used directly: new parameter values are sampled, and the new parameters and structure are accepted with the corresponding probability [formula]; otherwise the original structure and parameters are kept.
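The accept-or-keep rule can be illustrated with a standard Metropolis-Hastings acceptance step written in log space. The source gives its acceptance probability only as a formula image, so this is the textbook form rather than the patent's exact expression; the function names and the optional `log_jacobian` term (which a reversible jump move would supply for the dimension-changing transformation) are assumptions.

```python
import math
import random


def acceptance_probability(log_post_new, log_post_old,
                           log_q_reverse=0.0, log_q_forward=0.0, log_jacobian=0.0):
    """min(1, posterior ratio * proposal ratio * Jacobian), computed in logs."""
    log_ratio = ((log_post_new - log_post_old)
                 + (log_q_reverse - log_q_forward)
                 + log_jacobian)
    return 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)


def mh_step(current, proposed, log_post, rng, **proposal_terms):
    """Accept the proposal with the computed probability, else keep the current state."""
    alpha = acceptance_probability(log_post(proposed), log_post(current), **proposal_terms)
    if rng.random() < alpha:
        return proposed, True
    return current, False


def log_post(theta):
    # Toy target for the demo: standard normal log-density up to a constant.
    return -0.5 * theta * theta


rng = random.Random(0)
state, accepted = mh_step(2.0, 0.5, log_post, rng)
```

Working with log densities avoids numerical underflow when the likelihood is a product over many stock records.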
Step S4, judging whether the number of adjusted Bayesian tree models of the Markov chains whose acceptance probability is greater than the preset probability exceeds the preset acceptance number.
For the randomly sampled Markov chains in the tree structure, it is checked whether the acceptance probability of each corresponding adjusted Bayesian tree model exceeds the preset probability; if the number of models for which it does exceeds the preset number, the Bayesian tree model is judged to meet the requirement.
The preset acceptance number is the number of adjusted Bayesian tree models, across the Markov chains, whose acceptance probability must exceed the preset probability. For example: if the number of Bayesian tree models corresponding to 5 Markov chains is 10, the acceptance number for the acceptance probability of the Bayesian tree model corresponding to each Markov chain is 10.
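The check in step S4 amounts to counting how many acceptance probabilities exceed the preset probability and comparing that count with the preset acceptance number. A minimal sketch, with hypothetical function names and example values:

```python
def count_accepted(acceptance_probs, preset_probability):
    """Number of adjusted models whose acceptance probability exceeds the threshold."""
    return sum(1 for p in acceptance_probs if p > preset_probability)


def meets_requirement(acceptance_probs, preset_probability, preset_count):
    """True when the count of accepted models exceeds the preset acceptance number."""
    return count_accepted(acceptance_probs, preset_probability) > preset_count


probs = [0.9, 0.2, 0.7, 0.65, 0.1]   # illustrative values only
n_accepted = count_accepted(probs, 0.6)
```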
Step S5, if the number exceeds the preset acceptance number, obtaining a prediction data expression according to the adjusted Bayesian tree model, and predicting the stock price according to the prediction data expression.
If the number of accepted Bayesian tree models of the Markov chains determined in step S4 exceeds the preset acceptance number, a prediction data expression is fitted according to the adjusted Bayesian tree model, and the stock price is predicted according to the prediction data expression.
Specifically, after the step of determining whether the number of adjusted Bayesian tree models of each Markov chain whose acceptance probability is greater than the preset probability exceeds the preset acceptance number, the method further includes:
saving the log joint likelihood value and the estimation error corresponding to each adjusted Bayesian tree model;
determining, from the log joint likelihood values and the estimation errors, whether the Markov chain corresponding to each adjusted Bayesian tree model has converged;
if the Markov chains have converged, selecting several samples at the end of each Markov chain as posterior samples;
and fitting the prediction data expression from the selected posterior samples.
The Gelman-Rubin method is used to diagnose whether the K Markov chains have converged: a Gelman-Rubin diagnostic value is computed from the log joint likelihood sequences of the K Markov chains, and the chains are considered to have converged if this value is sufficiently close to 1. If they have converged, the final samples of each chain are selected as stable posterior sample estimates, for example the last third of the samples, that is, the Bayesian tree models in the last third of each Markov chain; alternatively, the expression with the smallest fitting error is selected as the best expression. The last third of the samples can be regarded as samples drawn after the sampling has stabilized and can represent an approximate posterior distribution.
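The convergence check described above can be sketched with the standard potential scale reduction factor. The function names `gelman_rubin` and `last_third` are hypothetical, and the text does not state how close to 1 is "sufficiently close"; a threshold such as 1.1 is a common rule of thumb and is only an assumption here.

```python
import numpy as np


def gelman_rubin(chains):
    """Gelman-Rubin diagnostic (R-hat) for K chains of equal length n.

    chains: array-like of shape (K, n), e.g. one log joint likelihood
    sequence per Markov chain.
    """
    x = np.asarray(chains, dtype=float)
    k, n = x.shape
    chain_means = x.mean(axis=1)
    within = x.var(axis=1, ddof=1).mean()         # W: mean within-chain variance
    between = n * chain_means.var(ddof=1)         # B: between-chain variance
    var_hat = (n - 1) / n * within + between / n  # pooled variance estimate
    return float(np.sqrt(var_hat / within))


def last_third(samples):
    """Keep the final third of a chain as approximate posterior samples."""
    start = len(samples) - len(samples) // 3
    return samples[start:]
```

Chains that explore the same distribution give a value near 1, while chains stuck in different regions give a clearly larger value because the between-chain variance dominates.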
The Bayesian tree models serving as posterior samples are equivalently converted into a mathematical expression. The prediction mathematical expression corresponding to a Bayesian tree model is obtained from the stock characteristic information represented by each node and the mathematical operator represented by each node: the stock characteristic information represented by the nodes supplies the operands of the prediction mathematical expression, that is, the input data; the operators represented by the nodes define the operations to be applied to the input data; and evaluating these operations on the input data gives the output of the prediction mathematical expression. Once the Bayesian tree model has been equivalently converted into the prediction mathematical expression, the expression can therefore be used to predict the stock price for the corresponding input stock characteristic data.
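The conversion from a tree to an expression, and its evaluation on input features, can be sketched as follows. The `Node` class, the operator tables, and the feature naming `x0`, `x1` are hypothetical illustrations; the parameterized linear operator lt() of the embodiment is omitted for brevity.

```python
import math
from dataclasses import dataclass
from typing import Optional

# Hypothetical operator tables; the operator set of the embodiment may differ.
BINARY = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}
UNARY = {"exp": math.exp, "sin": math.sin}


@dataclass
class Node:
    op: Optional[str] = None        # operator symbol; None marks a leaf
    feature: Optional[int] = None   # index of the input feature (leaf only)
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def to_expression(node):
    """Render the tree as a readable mathematical expression."""
    if node.op is None:
        return f"x{node.feature}"
    if node.op in UNARY:
        return f"{node.op}({to_expression(node.left)})"
    return f"({to_expression(node.left)} {node.op} {to_expression(node.right)})"


def evaluate(node, x):
    """Evaluate the tree on one feature vector x."""
    if node.op is None:
        return x[node.feature]
    if node.op in UNARY:
        return UNARY[node.op](evaluate(node.left, x))
    return BINARY[node.op](evaluate(node.left, x), evaluate(node.right, x))


tree = Node("+",
            left=Node("sin", left=Node(feature=0)),
            right=Node("*", left=Node(feature=0), right=Node(feature=1)))
print(to_expression(tree))  # (sin(x0) + (x0 * x1))
```

Predicting for a new data point then amounts to calling `evaluate(tree, features)` with the feature values of that point.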
Once the prediction mathematical expression has been obtained, the step of predicting the stock price according to the prediction data expression comprises:
acquiring a stock data set to be predicted;
inputting the stock characteristic information contained in the stock data set to be predicted into the prediction data expression to obtain the predicted stock price corresponding to the stock data set to be predicted.
The stock data set to be predicted is a set of stock information for which the stock price is not yet known. This stock information is input into the prediction mathematical expression, which outputs the predicted stock price corresponding to the stock data set to be predicted.
In the method disclosed in this embodiment, a data set with multiple variables is input, the system automatically performs Markov chain Monte Carlo sampling, and a fitted expression is output in which the data label is expressed as a mathematical expression of the variables. In this system, a mathematical expression can be converted into an equivalent tree structure, and a Bayesian tree model is built for the tree structure (including leaf nodes and intermediate nodes) so that prior knowledge about the operators and the expression structure can be introduced. A Markov chain Monte Carlo method is used to sample the posterior distribution over tree structures, and an effective proposal distribution function raises the acceptance rate and improves the efficiency and accuracy of fitting the function expression. After the expression used for prediction is obtained, new data are substituted into the mathematical expression, and an estimate of the future stock price is calculated.
The steps of the above method in a practical application are further described below through an embodiment of the present invention.
1) Input the data and the corresponding label values [formula image], where each piece of data [formula image] has d features. A piece of data represents, for example, a certain stock on a certain day, with features such as closing price and trading volume; the label value is the target to be fitted from the d features, for example the rate of return of that stock on that day.
2) Set the preset acceptance number N, the number of Markov chains K, the prior distribution of the tree structure and parameters, and the proposal function as the settings of the system.
For each Markov chain, the structure and parameters are generated by random sampling from the prior distribution. The proposal function [formula image] and the linear sampling method are then applied repeatedly to obtain a new structure and new parameters, which are accepted with an acceptance probability calculated by the reversible jump MCMC method; otherwise the current structure and parameters are retained.
3) If the new structure and new parameters are accepted, record the log joint likelihood value and the estimation error [formula image] so that convergence can be diagnosed afterwards; repeat until the acceptance number N is reached.
4) Use the Gelman-Rubin method to diagnose whether the K Markov chains have converged: compute a Gelman-Rubin diagnostic value from the log joint likelihood sequences of the K Markov chains, and consider the chains converged if this value is sufficiently close to 1. If they have converged, select the last third of the samples of each chain as stable posterior sample estimates, or select the expression with the smallest fitting error as the best expression. The last third of the samples can generally be regarded as samples drawn after the sampling has stabilized and can represent an approximate posterior distribution.
5) After the expression is obtained, substitute new data (features) into it to obtain a prediction of the future stock price and thereby guide trading.
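The bookkeeping of steps 2) and 3) can be sketched as a loop that runs each of the K chains until N proposals have been accepted, recording the log joint likelihood and the estimation error at every acceptance. This is only a sketch under stated assumptions: the tree model is replaced by a toy one-parameter model y = a * x, the proposal is a symmetric random walk, and all names (`run_chain`, `log_joint`, `mse`) are hypothetical.

```python
import math
import random


def run_chain(init, propose, log_joint, error, n_accept, rng, max_iter=100_000):
    """Run one chain until n_accept proposals are accepted (or max_iter is hit)."""
    state = init(rng)
    current = log_joint(state)
    log_joints, errors = [], []
    for _ in range(max_iter):
        if len(log_joints) >= n_accept:
            break
        candidate = propose(state, rng)
        cand = log_joint(candidate)
        # Symmetric proposal assumed, so only the ratio of joint densities matters.
        if rng.random() < math.exp(min(0.0, cand - current)):
            state, current = candidate, cand
            log_joints.append(current)    # kept for the convergence diagnosis
            errors.append(error(state))   # estimation error of the accepted model
    return state, log_joints, errors


# Toy stand-in for the tree model (assumption): a single slope a in y = a * x.
data_rng = random.Random(0)
xs = [data_rng.uniform(0.0, 6.0) for _ in range(50)]
ys = [2.0 * x + data_rng.gauss(0.0, 0.1) for x in xs]


def mse(a):
    return sum((y - a * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def log_joint(a):
    log_lik = -0.5 * len(xs) * mse(a) / 0.1 ** 2   # Gaussian noise with sd 0.1
    log_prior = -0.5 * a * a / 10.0 ** 2           # N(0, 10^2) prior on a
    return log_lik + log_prior


N, K = 50, 5
chains = [
    run_chain(
        init=lambda r: r.gauss(0.0, 10.0),            # draw the start from the prior
        propose=lambda a, r: a + r.gauss(0.0, 0.05),  # random-walk proposal
        log_joint=log_joint,
        error=mse,
        n_accept=N,
        rng=random.Random(seed),
    )
    for seed in range(1, K + 1)
]
```

The recorded `log_joints` sequences, one per chain, are the input that the Gelman-Rubin diagnosis in step 4) operates on.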
By defining the prior distribution function [formula image] and the search proposal function [formula image] according to search preferences, the invention allows the tree size, the specific operators, and the structural preferences to be adjusted flexibly, so that the adjusted Bayesian tree model fits the mathematical expression to be predicted more accurately, and a more accurate stock price prediction can be obtained to better guide trading.
A prediction example of this embodiment is given below:
Simulated input data: there are 200 pieces of data, and each piece of data x has two features, [formula image] and [formula image], each generated independently from a uniform distribution on the range [0, 6];
Simulated output data: computed for the 200 pieces of simulated input data according to the mathematical expression [formula image];
Model settings: acceptance number N = 50, number of Markov chains K = 5;
Model output: the finally fitted expression is [formula image].
The mean square error of the fit was 0.02 (for comparison, the mean of the absolute values of the input data was 63).
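The setup of this simulated example can be sketched as follows. The generating expression and the fitted expression are given in the original only as images, so `placeholder_target` below is a purely hypothetical stand-in and does not reproduce the reported result; only the sample size, the feature range, N, and K are taken from the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# 200 data points with two features, each drawn independently from a
# uniform distribution on [0, 6).
X = rng.uniform(0.0, 6.0, size=(200, 2))


def placeholder_target(X):
    """Hypothetical generating expression (the patent's expression is not reproduced)."""
    return X[:, 0] ** 2 + np.sin(X[:, 1])


y = placeholder_target(X)


def mean_squared_error(expression, X, y):
    """Mean squared error of a candidate expression against the labels."""
    return float(np.mean((expression(X) - y) ** 2))


N_ACCEPT = 50   # acceptance number N from the example
K_CHAINS = 5    # number of Markov chains K from the example
```

A fitted expression would be scored by passing it to `mean_squared_error` in place of `placeholder_target`.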
On the basis of the above embodiment, the present invention also discloses a financial signal mining system based on the Monte Carlo search algorithm which, as shown in Fig. 3, includes:
a historical data collection module 301, configured to acquire a historical stock data set, where each group of stock data in the stock data set contains stock characteristic information and stock price information; its function is as described in step S1;
a model establishing module 302, configured to generate an initial Bayesian tree model for each Markov chain in the tree structure according to a preset tree structure and a prior distribution of parameters, and to adjust the initial Bayesian tree model according to a preset proposal function to obtain an adjusted Bayesian tree model and parameters; its function is as described in step S2;
an acceptance probability calculation module 303, configured to substitute each group of stock data in the stock data set into the adjusted Bayesian tree model and to calculate the acceptance probability of the adjusted Bayesian tree model of each Markov chain; its function is as described in step S3;
an acceptance number calculation module 304, configured to determine whether the number of adjusted Bayesian tree models of each Markov chain whose acceptance probability is greater than the preset probability exceeds the preset acceptance number; its function is as described in step S4;
an input prediction module 305, configured to, when the number of adjusted Bayesian tree models of each Markov chain whose acceptance probability is greater than the preset probability exceeds the preset acceptance number, obtain a prediction data expression from the adjusted Bayesian tree model and predict the stock price according to the prediction data expression; its function is as described in step S5.
Yet another embodiment of the present invention is a computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method when executing the computer program.
Yet another embodiment of the invention is a computer-readable storage medium having a computer program stored thereon, wherein the computer program realizes the steps of the method when executed by a processor.
The invention provides a financial signal mining method and system based on a Monte Carlo search algorithm. A Bayesian tree model is constructed for each Markov chain in the tree structure and adjusted using a proposal function to obtain an adjusted Bayesian tree model and parameters; each group of stock data in the stock data set is substituted into the adjusted Bayesian tree model, and a data expression for stock price prediction is obtained from the adjusted Bayesian tree model. In the method, the data label of the stock information is expressed as a mathematical expression, the mathematical expression is converted into an equivalent tree structure, a Bayesian tree model is established for the tree structure and used to build a symbolic regression system, the symbolic model is sampled with a Markov chain Monte Carlo algorithm, the optimal model is selected from the samples, and the fitting result is used to predict future stock prices, so that future stock prices can be estimated accurately.
It should be understood that equivalents and modifications of the technical solution and inventive concept thereof may occur to those skilled in the art, and all such modifications and alterations should fall within the scope of the appended claims.

Claims (10)

1. A financial signal mining method based on a Monte Carlo search algorithm is characterized by comprising the following steps:
acquiring a historical stock data set; each group of stock data contained in the stock data set contains stock characteristic information and stock price information;
generating an initial Bayesian tree model for each Markov chain in the tree structure according to a preset tree structure and prior distribution of parameters, and adjusting the initial Bayesian tree model according to a preset proposal function to obtain an adjusted Bayesian tree model and parameters;
substituting each group of stock data in the stock data set into the adjusted Bayesian tree model, and calculating the acceptance probability of the adjusted Bayesian tree model of each Markov chain;
determining whether the number of adjusted Bayesian tree models of each Markov chain whose acceptance probability is greater than the preset probability exceeds the preset acceptance number;
if so, obtaining a prediction data expression according to the adjusted Bayesian tree model, and predicting the stock price according to the prediction data expression.
2. The financial signal mining method based on a Monte Carlo search algorithm according to claim 1, wherein the step of generating an initial Bayesian tree model for each Markov chain in the tree structure according to a preset tree structure and a prior distribution of parameters comprises:
generating prior distribution of the tree structure by using a computer random number and preset probability distribution;
and generating an initial Bayesian tree model according to the prior distribution by using a random sampling method.
3. The financial signal mining method based on a Monte Carlo search algorithm according to any one of claims 1-2, wherein the step of adjusting the initial Bayesian tree model according to a preset proposal function to obtain an adjusted Bayesian tree model and parameters comprises:
adjusting parameters of at least one node in the initial Bayesian tree model to obtain an adjusted Bayesian tree model;
and/or adjusting at least one leaf node in the initial Bayesian tree model into a child node, and growing the leaf node according to the prior distribution;
and/or adjusting at least one child node in the initial Bayesian tree model to be a leaf node, and setting the probability value adjusted to be the leaf node as the probability mean value of each node;
and/or adjusting the operators represented by the nodes in the initial Bayesian tree model;
and/or adjusting the input stock characteristic information corresponding to at least one leaf node in the initial Bayesian tree model.
4. The method for mining financial signals based on the Monte Carlo search algorithm according to any one of claims 1-2, wherein the step of substituting each set of stock data in the stock data set into the adjusted Bayesian tree model and calculating the acceptance probability of each Markov chain adjusted Bayesian tree model comprises:
calculating the acceptance probability of the adjusted Bayesian tree model of each Markov chain by using reversible jump Markov chain Monte Carlo.
5. The method for mining financial signals based on the Monte Carlo search algorithm according to any one of claims 1-2, wherein, after the step of determining whether the number of adjusted Bayesian tree models of each Markov chain whose acceptance probability is greater than the preset probability exceeds the preset acceptance number, the method further comprises:
and saving the log joint likelihood value and the estimation error corresponding to each adjusted Bayes tree model.
6. The method of claim 5, wherein the step of, if so, obtaining a prediction data expression according to the adjusted Bayesian tree model comprises:
determining, from the log joint likelihood values and the estimation errors, whether the Markov chain corresponding to each adjusted Bayesian tree model has converged;
if the Markov chains have converged, selecting several samples at the end of each Markov chain as posterior samples;
and fitting the prediction data expression from the selected posterior samples.
7. The method of Monte Carlo search algorithm-based financial signal mining of claim 5, wherein said step of making a stock price prediction from said prediction data expression comprises:
acquiring a stock data set to be predicted;
inputting the stock characteristic information contained in the stock data set to be predicted into the prediction data expression to obtain the predicted stock price corresponding to the stock data set to be predicted.
8. A system for mining financial signals based on a monte carlo search algorithm, comprising:
the historical data collection module is used for acquiring a historical stock data set; each group of stock data contained in the stock data set contains stock characteristic information and stock price information;
the model establishing module is used for generating an initial Bayesian tree model for each Markov chain in the tree structure according to a preset tree structure and the prior distribution of the parameters, and adjusting the initial Bayesian tree model according to a preset proposal function to obtain an adjusted Bayesian tree model and parameters;
the acceptance probability calculation module is used for substituting each group of stock data in the stock data set into the adjusted Bayesian tree model and calculating the acceptance probability of the adjusted Bayesian tree model of each Markov chain;
the acceptance number calculation module is used for determining whether the number of adjusted Bayesian tree models of each Markov chain whose acceptance probability is greater than the preset probability exceeds the preset acceptance number;
and the input prediction module is used for obtaining a prediction data expression according to the adjusted Bayesian tree model and predicting the stock price according to the prediction data expression when the number of adjusted Bayesian tree models of each Markov chain whose acceptance probability is greater than the preset probability exceeds the preset acceptance number.
9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
CN201910708355.9A 2019-08-01 2019-08-01 Financial signal mining method and system based on Monte Carlo search algorithm Pending CN110633728A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910708355.9A CN110633728A (en) 2019-08-01 2019-08-01 Financial signal mining method and system based on Monte Carlo search algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910708355.9A CN110633728A (en) 2019-08-01 2019-08-01 Financial signal mining method and system based on Monte Carlo search algorithm

Publications (1)

Publication Number Publication Date
CN110633728A true CN110633728A (en) 2019-12-31

Family

ID=68969151

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910708355.9A Pending CN110633728A (en) 2019-08-01 2019-08-01 Financial signal mining method and system based on Monte Carlo search algorithm

Country Status (1)

Country Link
CN (1) CN110633728A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117216720A (en) * 2023-11-07 2023-12-12 天津市普迅电力信息技术有限公司 Multi-system data fusion method for distributed photovoltaic active power
CN117216720B (en) * 2023-11-07 2024-02-23 天津市普迅电力信息技术有限公司 Multi-system data fusion method for distributed photovoltaic active power

Similar Documents

Publication Publication Date Title
Elmaz et al. CNN-LSTM architecture for predictive indoor temperature modeling
Jin et al. Bayesian symbolic regression
Ma et al. A hybrid attention-based deep learning approach for wind power prediction
Chung et al. Empirical evaluation of gated recurrent neural networks on sequence modeling
Onken et al. Discretize-optimize vs. optimize-discretize for time-series regression and continuous normalizing flows
CN111860982A (en) Wind power plant short-term wind power prediction method based on VMD-FCM-GRU
US11650968B2 (en) Systems and methods for predictive early stopping in neural network training
Oudelha et al. HMM parameters estimation using hybrid Baum-Welch genetic algorithm
CN111433689B (en) Generation of control systems for target systems
KR20220059120A (en) System for modeling automatically of machine learning with hyper-parameter optimization and method thereof
Klusowski Sparse learning with CART
CN112215412A (en) Dissolved oxygen prediction method and device
CN115034430A (en) Carbon emission prediction method, device, terminal and storage medium
CN112163671A (en) New energy scene generation method and system
Larios-Cárdenas et al. A hybrid inference system for improved curvature estimation in the level-set method using machine learning
CN110633728A (en) Financial signal mining method and system based on Monte Carlo search algorithm
Nikolaev et al. A regime-switching recurrent neural network model applied to wind time series
CN111161238A (en) Image quality evaluation method and device, electronic device, and storage medium
Chang Latent variable modeling for generative concept representations and deep generative models
CN113780394B (en) Training method, device and equipment for strong classifier model
Heiner et al. Bayesian nonparametric density autoregression with lag selection
CN115879536A (en) Learning cognition analysis model robustness optimization method based on causal effect
JP6648828B2 (en) Information processing system, information processing method, and program
Dmitrieva Forecasting of a hydropower plant energy production
Bales et al. Selecting the metric in hamiltonian monte carlo

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191231