CN115482101A - Stock prediction method based on historical data screening and momentum overflow effect - Google Patents

Info

Publication number
CN115482101A
Authority
CN
China
Prior art keywords
stock
network
ith
stocks
historical
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210943495.6A
Other languages
Chinese (zh)
Inventor
高忠科
苏静钰
郭嘉仪
田源
薄地阔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Chunda Asset Management Co ltd
Tianjin University
Original Assignee
Shanghai Chunda Asset Management Co ltd
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Chunda Asset Management Co ltd and Tianjin University
Priority to CN202210943495.6A
Publication of CN115482101A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00: Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04: Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • Human Resources & Organizations (AREA)
  • Accounting & Taxation (AREA)
  • General Business, Economics & Management (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Game Theory and Decision Science (AREA)
  • Technology Law (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

A stock forecasting method based on historical data screening and momentum overflow effect comprises the following steps: collecting historical trading data and text information for a target stock and the stocks related to it; preprocessing the data, namely converting the historical trading data and the text information into two-dimensional feature matrices respectively; constructing a relation network matrix among N stocks; constructing a fusion model that combines a temporal convolutional network with a channel-domain attention mechanism and a graph convolutional network; and training the fusion model under different hyperparameter combinations, finally selecting the combination with the minimum cross-entropy loss as the optimal hyperparameter combination. By collecting text information in addition to the historical trading data of the target stock and its related stocks, the invention fuses multi-source information; by screening the historical data with a deep learning method and predicting the rise and fall of stock prices, it improves the precision of rise-and-fall prediction.

Description

Stock prediction method based on historical data screening and momentum overflow effect
Technical Field
The invention relates to a stock forecasting method, and in particular to a stock forecasting method based on historical data screening and momentum overflow effect.
Background
The stock market is an important component of the financial market, and investing by buying and selling stocks has become commonplace in daily life. Predicting the trend of stock prices with scientific methods is therefore very important: it gives investors a basis for making investment plans and helps them avoid risk to a certain extent. In recent years, scholars in finance and computer science have proposed many ideas and implementation methods for stock forecasting.
In terms of feature selection, stock forecasting has long relied on the analysis of historical stock data such as the opening price, highest price, lowest price, trading volume and price-to-earnings ratio. However, a stock price series is influenced by trading behavior, market information and other factors, and is nonlinear, volatile and chaotic. Today the investment behavior of shareholders is highly susceptible to public opinion: both related news reports and the opinions posted by other investors can influence the trading strategies of some investors. The large amount of text on channels such as investor forums and financial news platforms not only reflects stock price changes but also reveals the direct opinions of investors. Mining, fusing and analyzing the features of this massive information has become one of the keys to analyzing stock price changes.
In terms of model selection, prediction models have evolved from traditional time series models to machine learning models and then to deep learning models. Practice has repeatedly shown that deep learning models achieve better results on stock forecasting problems owing to their strong adaptability, portability and data analysis capability, and recurrent neural networks and their variants have become widely used methods. However, a recurrent neural network processes a sequence step by step with shared weights, so it occupies a large amount of memory when the data volume is large, and it gives the same attention to every trading day used for prediction. A better choice is therefore a network that processes data in parallel and incorporates an attention mechanism.
The momentum spillover effect (the momentum overflow effect of the title) states that the future return of a company can be predicted from the past returns of its related companies, because stock prices are generally slow to react to news about associated companies. Analysis of inter-enterprise relationships is therefore an essential part of prediction: a change in one stock may affect other stocks in the same industry to different degrees.
Disclosure of Invention
To overcome the deficiencies of the prior art, the technical problem to be solved by the invention is to provide a stock forecasting method based on historical data screening and momentum overflow effect that improves prediction accuracy.
The technical solution adopted by the invention is a stock forecasting method based on historical data screening and momentum overflow effect, characterized by comprising the following steps:
1) Collecting historical trading data and text information of the target stock and stocks related to the target stock;
2) Data preprocessing, namely converting historical transaction data and text information into two-dimensional feature matrixes respectively;
3) Constructing a relationship network matrix among N stocks, comprising the following steps:
(3.1) constructing a space-time dynamic hierarchical complex network by using each trading day in each historical trading data to obtain a stock relation network under the span of T trading days;
(3.2) calculating degree correlation between layers in the N-layer stock relation network;
(3.3) calculating the clustering coefficient correlation between layers in the N layers of stock relation networks;
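(3.4) defining the average of the degree correlation and the clustering coefficient correlation between stocks as the stock correlation $R_{i,j}$, expressed as
$$R_{i,j} = \frac{1}{2}\left(I^{k}_{i,j} + I^{z}_{i,j}\right),$$
where $I^{k}_{i,j}$ and $I^{z}_{i,j}$ denote the degree correlation and the clustering coefficient correlation between the ith and jth stocks;
normalizing the stock correlation to obtain the normalized stock correlation
Figure BDA0003786739360000022
thereby obtaining the relation network matrix of the N stocks
Figure BDA0003786739360000023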
4) Constructing a fusion model of a temporal convolutional network with a channel-domain attention mechanism and a graph convolutional network;
5) Training the fusion model of the temporal convolutional network with the channel-domain attention mechanism and the graph convolutional network repeatedly while varying its hyperparameters: in each run, the cross-entropy loss between the predicted value and the true value of the rise or fall of the target stock closing price is calculated and the parameters of the fusion model are continuously updated; finally, according to the training results, the hyperparameter combination with the minimum cross-entropy loss is selected as the optimal hyperparameter combination.
By collecting text information in addition to the historical trading data of the target stock and its related stocks, the stock forecasting method based on historical data screening and momentum overflow effect takes investor sentiment into account and fuses multi-source information. Based on the principle of the momentum overflow effect, it adds a stock relation network built with the space-time hierarchical complex network method, which incorporates the influence of related stocks' information on the target stock and better conforms to the overall development pattern of the industry. By screening the historical data with a deep learning method when predicting the rise and fall of stock prices, it gives different degrees of attention to different trading days and improves the precision of rise-and-fall prediction.
Drawings
FIG. 1 is a flow chart of a stock forecasting method based on historical data screening and momentum overflow effect according to the present invention;
FIG. 2 is an architecture diagram of the fusion model combining a temporal convolutional network with a channel-domain attention mechanism and a graph convolutional network;
FIG. 3 is a schematic diagram of a single stock network constructed with the space-time dynamic hierarchical complex network when the limited penetrable distance d = 1;
FIG. 4 is a schematic diagram of the node relationships of the single stock network when the limited penetrable distance d = 1.
Detailed Description
The stock forecasting method based on historical data screening and momentum overflow effect of the invention is described in detail with reference to the embodiments and the accompanying drawings.
The inventive step of the stock forecasting method based on historical data screening and momentum overflow effect lies in fusing multi-source data with a fusion model that combines a temporal convolutional network with channel-domain attention and a graph convolutional network, and in constructing a stock relation network according to the principle of the momentum overflow effect, so that the rise and fall of the closing price of the target stock can be predicted. To this end, the invention collects the historical trading data of the target stock and of the stocks related to it, and obtains text information from financial news and investor forums. The historical trading data features and the text information features are fused into a fusion feature vector, which is then processed by the channel-domain attention mechanism and the temporal convolutional network; this screens the historical data and processes it in parallel, so that more attention is given to important trading days while the training cost is reduced.
For the analysis of inter-enterprise relationships, the invention constructs a relation network among multiple stocks by means of a space-time dynamic hierarchical complex network, and then uses a graph convolutional network to fuse the depth fusion feature vectors with the stock relation network. Finally, a feedforward neural network serves as the model output layer to predict the rise and fall trend of the target stock price.
In the method, historical trading data and text information are collected for a target stock and the N-1 stocks related to it and then preprocessed, while a stock relation network is obtained by constructing a space-time dynamic hierarchical complex network. By training the fusion model of the temporal convolutional network with channel-domain attention and the graph convolutional network, the combined historical trading data features and text information features are screened, further features are extracted in combination with the stock relation network, and the rise or fall of the closing price of the target stock is finally predicted. As shown in FIG. 1, the method comprises the following steps:
1) Collecting historical trading data and text information of the target stock and stocks related to the target stock;
If the closing price of the target stock on the t-th trading day is higher than that of the previous trading day, the day is classified as rising; if it is lower, the day is classified as falling. Historical trading data and text information are collected for N stocks, including the target stock.
The historical trading data comprise five indicators: opening price, closing price, transaction amount, highest price and lowest price. The text information comprises financial news about the target stock and its related stocks, and comments posted by investors on the East Money (东方财富网) forum.
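The rise/fall labeling rule above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `label_moves` is hypothetical, and the handling of an unchanged closing price (returned as `None`) is an assumption, since the text only defines the rising and falling cases.

```python
def label_moves(closes):
    """Label each trading day from the second one on by comparing closing prices.

    Returns 1 for a rise, 0 for a fall, and None when the close is unchanged
    (the unchanged case is not defined in the source text; None is an assumption).
    """
    labels = []
    for prev, cur in zip(closes, closes[1:]):
        if cur > prev:
            labels.append(1)
        elif cur < prev:
            labels.append(0)
        else:
            labels.append(None)
    return labels


if __name__ == "__main__":
    print(label_moves([10.0, 10.5, 10.2, 10.2]))
```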
2) Data preprocessing, namely converting historical transaction data and text information into two-dimensional feature matrixes respectively; the method comprises the following steps:
performing data cleaning on the collected historical trading data (data cleaning is the final procedure for finding and correcting identifiable errors in a data file, and includes checking data consistency and handling invalid and missing values); quantizing the collected text information and obtaining a daily sentiment tendency vector for each stock through a sentiment dictionary, which forms the daily text information feature of each stock. The historical trading data feature of the ith stock on the t-th trading day is denoted $p_i^t \in \mathbb{R}^{L'}$, where L' is the feature dimension of the historical trading data; similarly, the text information feature of the ith stock on the t-th trading day is denoted $q_i^t \in \mathbb{R}^{L}$,
where L is the feature dimension of the text information feature.
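As an illustration of the text quantization step, the sketch below turns one day's tokenized comments into a small sentiment tendency vector using a dictionary lookup. Everything specific here is an assumption: the toy lexicon, the function name `sentiment_vector`, and the choice of a three-component vector (share of positive words, share of negative words, net polarity), so L = 3 in this example. The source does not specify the dictionary or the vector layout.

```python
def sentiment_vector(tokens, lexicon):
    """Map a list of tokens to [positive share, negative share, net polarity].

    `lexicon` maps a word to a positive or negative score; words that are
    missing from it count as neutral. The layout is illustrative only.
    """
    pos = sum(1 for t in tokens if lexicon.get(t, 0) > 0)
    neg = sum(1 for t in tokens if lexicon.get(t, 0) < 0)
    total = len(tokens) or 1  # avoid division by zero on an empty day
    return [pos / total, neg / total, (pos - neg) / total]


if __name__ == "__main__":
    toy_lexicon = {"surge": 1, "gain": 1, "loss": -1}  # hypothetical entries
    print(sentiment_vector(["surge", "loss", "report", "gain"], toy_lexicon))
```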
3) Constructing a relationship network matrix among N stocks, comprising the following steps:
(3.1) constructing a space-time dynamic hierarchical complex network by using each trading day in each historical trading data to obtain a stock relation network under the span of T trading days; the method comprises the following steps:
Each trading day is regarded as a node of a single stock network, and the closing price of that day as the height of a vertical bar. A horizontal connecting line is constructed between the vertical bars of two nodes $v^{i}_{t_m}$ and $v^{i}_{t_n}$, where node $v^{i}_{t_m}$ represents the ith stock on the $t_m$-th trading day with closing price $c^{i}_{t_m}$, and node $v^{i}_{t_n}$ represents the ith stock on the $t_n$-th trading day with closing price $c^{i}_{t_n}$. The height of the connecting line is the smaller of the two bars, i.e. $\min(c^{i}_{t_m}, c^{i}_{t_n})$, where $t_m, t_n \in [t-T, t)$ denote the $t_m$-th and $t_n$-th trading days within the range from the (t-T)-th trading day to the t-th trading day. The limited penetrable distance is set to d: if the horizontal connecting line between the two nodes $v^{i}_{t_m}$ and $v^{i}_{t_n}$ intersects d or fewer intermediate nodes, a connecting edge exists between $v^{i}_{t_m}$ and $v^{i}_{t_n}$; otherwise no connecting edge exists. FIG. 3 shows the construction of a single stock network with the space-time dynamic hierarchical complex network when the limited penetrable distance d = 1. It marks all pairs of network nodes joined by a connecting edge: a solid black line indicates that the horizontal connecting line between two bars crosses no intermediate node, and a dashed black line indicates that it crosses one intermediate node. FIG. 4 shows the corresponding relationships among the network nodes of FIG. 3. A single-stock network layer is constructed for each of the N stocks, giving the N-layer stock relation network
Figure BDA00037867393600000314
where N is the total number of the target stock and the stocks related to it.
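The edge rule of step (3.1) can be sketched as below. The construction resembles what the complex-network literature calls a limited penetrable horizontal visibility graph; that name, the function names, and the tie-handling rule (an intermediate bar counts as crossed when its closing price is greater than or equal to the line height) are assumptions not stated in the source.

```python
def lphvg_edges(closes, d=1):
    """Build the edge set of one single-stock network from closing prices.

    Two trading days m < n are connected when the horizontal line drawn at
    height min(closes[m], closes[n]) is crossed by at most d intermediate bars.
    """
    edges = set()
    for m in range(len(closes)):
        for n in range(m + 1, len(closes)):
            height = min(closes[m], closes[n])
            # Count the intermediate bars that reach the connecting line.
            crossed = sum(1 for c in closes[m + 1:n] if c >= height)
            if crossed <= d:
                edges.add((m, n))
    return edges


def degrees(num_nodes, edges):
    """Degree of every node, i.e. the degree sequence used in step (3.2)."""
    deg = [0] * num_nodes
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


if __name__ == "__main__":
    prices = [3.0, 1.0, 2.0, 4.0]
    e0 = lphvg_edges(prices, d=0)
    print(sorted(e0), degrees(len(prices), e0))
```

With d = 0 the rule reduces to an ordinary horizontal visibility test; raising d lets a line pass through up to d bars, which adds longer-range edges.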
(3.2) calculating degree correlation between layers in the N-layer stock relation network; the method comprises the following steps:
According to the degree $k_i$ of each node in the ith stock network and the degree $k_j$ of each node in the jth stock network, the mutual information between the degree sequence of the ith stock network and the degree sequence of the jth stock network is calculated. It represents the degree correlation $I^{k}_{i,j}$ of the ith and jth stocks over the T trading days, namely the degree correlation between layers of the N-layer stock relation network, and is calculated as:
$$I^{k}_{i,j} = \sum_{k_i}\sum_{k_j} p(k_i, k_j)\,\log\frac{p(k_i, k_j)}{p(k_i)\,p(k_j)}$$
where $p(k_i)$ is the degree distribution of the ith stock, $p(k_j)$ is the degree distribution of the jth stock, and $p(k_i, k_j)$ is the joint degree distribution of the ith and jth stocks.
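A small sketch of the mutual information computation between two degree sequences follows. It estimates the marginal and joint distributions empirically by pairing the nodes of the two layers that belong to the same trading day; that pairing, the natural logarithm, and the function name `mutual_information` are assumptions, as the source gives only the distributions involved.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information of two equally long discrete sequences."""
    n = len(xs)
    px = Counter(xs)             # marginal counts for the first layer
    py = Counter(ys)             # marginal counts for the second layer
    pxy = Counter(zip(xs, ys))   # joint counts, paired by trading day
    mi = 0.0
    for (x, y), count in pxy.items():
        p_joint = count / n
        mi += p_joint * math.log(p_joint / ((px[x] / n) * (py[y] / n)))
    return mi


if __name__ == "__main__":
    print(mutual_information([3, 2, 3, 2], [3, 2, 3, 2]))  # identical layers
    print(mutual_information([1, 1, 2, 2], [1, 2, 1, 2]))  # independent layers
```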
(3.3) calculating the clustering coefficient correlation between layers in the N layers of stock relation networks; the method comprises the following steps:
According to the clustering coefficient $z_i$ of each node in the ith-layer single stock network and the clustering coefficient $z_j$ of each node in the jth-layer single stock network, the mutual information between the clustering coefficient sequence of the ith-layer single stock network and the clustering coefficient sequence of the jth-layer single stock network is calculated. It represents the clustering coefficient correlation $I^{z}_{i,j}$ of the ith and jth stocks over the T trading days, and is calculated as:
$$I^{z}_{i,j} = \sum_{z_i}\sum_{z_j} p(z_i, z_j)\,\log\frac{p(z_i, z_j)}{p(z_i)\,p(z_j)}$$
where $p(z_i)$ is the clustering coefficient distribution of the ith stock, $p(z_j)$ is the clustering coefficient distribution of the jth stock, and $p(z_i, z_j)$ is the joint clustering coefficient distribution of the ith and jth stocks.
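For step (3.3), the per-node quantity is the local clustering coefficient of each trading-day node. The sketch below computes it from an edge set using the standard definition (links among a node's neighbours divided by the number of possible links); the function name is hypothetical. Because clustering coefficients are real-valued, they would presumably need to be binned or rounded before a discrete mutual information estimate is applied; the source does not say how, so that step is left out.

```python
def clustering_coefficients(num_nodes, edges):
    """Local clustering coefficient of every node of an undirected graph."""
    neighbours = [set() for _ in range(num_nodes)]
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    coeffs = []
    for v in range(num_nodes):
        k = len(neighbours[v])
        if k < 2:
            coeffs.append(0.0)  # fewer than two neighbours: no triangle possible
            continue
        ns = sorted(neighbours[v])
        # Count the edges that exist between pairs of neighbours of v.
        links = sum(1 for i, a in enumerate(ns) for b in ns[i + 1:] if b in neighbours[a])
        coeffs.append(2 * links / (k * (k - 1)))
    return coeffs


if __name__ == "__main__":
    example_edges = {(0, 1), (1, 2), (2, 3), (0, 2), (0, 3)}
    print(clustering_coefficients(4, example_edges))
```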
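(3.4) defining the average of the degree correlation and the clustering coefficient correlation between stocks as the stock correlation $R_{i,j}$, expressed as
$$R_{i,j} = \frac{1}{2}\left(I^{k}_{i,j} + I^{z}_{i,j}\right),$$
where $I^{k}_{i,j}$ and $I^{z}_{i,j}$ denote the degree correlation and the clustering coefficient correlation between the ith and jth stocks;
normalizing the stock correlation to obtain the normalized stock correlation
Figure BDA00037867393600000418
thereby obtaining the relation network matrix of the N stocks
Figure BDA00037867393600000419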
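Step (3.4) can be sketched as an element-wise average followed by a normalization. The averaging follows the text directly; the normalization formula is not recoverable from the source, so the min-max scaling used here is purely an assumption, as is the function name `relation_matrix`.

```python
def relation_matrix(degree_corr, cluster_corr):
    """Average two N x N correlation matrices and rescale the result to [0, 1].

    The average is the stock correlation R[i][j] described in the text.
    Min-max scaling stands in for the unspecified normalization (assumption).
    """
    n = len(degree_corr)
    r = [[(degree_corr[i][j] + cluster_corr[i][j]) / 2 for j in range(n)]
         for i in range(n)]
    flat = [v for row in r for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # guard against a constant matrix
    return [[(v - lo) / span for v in row] for row in r]


if __name__ == "__main__":
    deg = [[1.0, 0.2], [0.2, 1.0]]
    clu = [[1.0, 0.4], [0.4, 1.0]]
    print(relation_matrix(deg, clu))
```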
4) Constructing a fusion model of a temporal convolutional network with a channel-domain attention mechanism and a graph convolutional network; as shown in FIG. 2, this comprises:
(4.1) fusing the historical trading data feature $p_i^t$ and the text information feature $q_i^t$ of each stock by means of a K-dimensional bilinear tensor product term ${p_i^t}^{\top}\Gamma^{[1:K]} q_i^t$, whose k-th entry is computed from the k-th tensor slice as:
$$\left({p_i^t}^{\top}\Gamma^{[1:K]} q_i^t\right)_k = \sum_{l'=1}^{L'}\sum_{l=1}^{L} \Gamma_k^{(l',l)}\, p_{i,l'}^t\, q_{i,l}^t$$
where $\Gamma^{[1:K]} \in \mathbb{R}^{L' \times L \times K}$ is a third-order tensor, $\Gamma^{[1:K]} = [\Gamma_1, \ldots, \Gamma_k, \ldots, \Gamma_K]$, L' is the dimension of the historical trading data feature, L is the dimension of the text information feature, K is the dimension of the bilinear tensor product term, $p_{i,l'}^t$ is the l'-th dimension of the historical trading data feature of the ith stock, and $q_{i,l}^t$ is the l-th dimension of the text information feature of the ith stock. The historical trading data feature $p_i^t$ and the text information feature $q_i^t$ are concatenated and linearly transformed by a weight matrix $W$ to obtain the fusion feature vector $x_i^t$ of the ith stock:
$$x_i^t = \tanh\left({p_i^t}^{\top}\Gamma^{[1:K]} q_i^t + W\left[p_i^t \,\|\, q_i^t\right] + b\right)$$
where $\|$ denotes concatenation, $b$ denotes the bias, and tanh is the activation function;
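A minimal numerical sketch of the fusion in step (4.1) is given below. It assumes the bilinear tensor product term and the linear transformation of the concatenated features are summed inside the tanh, as in neural tensor network style fusion; the source formula image is not available, so this combination, the parameter shapes, and the function name `fuse` are assumptions.

```python
import math


def fuse(p, q, gamma, w, b):
    """Fuse a trading feature p (length L') and a text feature q (length L).

    gamma: K slices, each an L' x L matrix (the bilinear tensor).
    w:     K x (L' + L) weight matrix applied to the concatenation [p || q].
    b:     bias vector of length K.
    Returns the K-dimensional fusion feature vector.
    """
    concat = p + q  # list concatenation plays the role of ||
    fused = []
    for k in range(len(gamma)):
        bilinear = sum(p[a] * gamma[k][a][c] * q[c]
                       for a in range(len(p)) for c in range(len(q)))
        linear = sum(w[k][j] * concat[j] for j in range(len(concat)))
        fused.append(math.tanh(bilinear + linear + b[k]))
    return fused


if __name__ == "__main__":
    p = [1.0, 2.0]                      # L' = 2
    q = [3.0]                           # L = 1
    gamma = [[[0.1], [0.0]]]            # K = 1
    w = [[0.0, 0.0, 0.0]]
    b = [0.0]
    print(fuse(p, q, gamma, w, b))
```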
(4.2) processing the fusion feature vectors of each stock with a channel attention mechanism, which assigns a weight to the fusion feature vector of each trading day according to its degree of influence on the current trading day and thereby screens the historical data. The processing formula is:
$$A_i = \mathrm{CA}(X_i)$$
where $A_i \in \mathbb{R}^{T \times f}$ is the channel attention feature vector of the ith stock over the T trading days, f is the feature dimension of each trading day output by the channel attention mechanism, $X_i$ is the set of fusion feature vectors of the ith stock over the T trading days, and CA denotes the channel attention mechanism;
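One common way to realize channel attention is a squeeze-and-excitation style block. The sketch below treats each trading day as a channel, squeezes each day to a scalar, passes the scalars through a small bottleneck, and rescales each day by the resulting weight. The source does not give the internal form of its channel attention mechanism, so this structure, the weight shapes, and the function name are all assumptions.

```python
import math


def channel_attention(x, w1, w2):
    """Reweight T trading days of features (x is T x f).

    w1: r x T bottleneck weights, w2: T x r expansion weights (hypothetical).
    Returns (weights, reweighted features); each weight lies in (0, 1).
    """
    # Squeeze: one summary scalar per trading day.
    squeeze = [sum(row) / len(row) for row in x]
    # Excitation: ReLU bottleneck followed by a sigmoid per trading day.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeeze))) for row in w1]
    weights = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
               for row in w2]
    # Scale: multiply every feature of a day by that day's weight.
    scaled = [[a * v for v in row] for a, row in zip(weights, x)]
    return weights, scaled


if __name__ == "__main__":
    x = [[1.0, 3.0], [2.0, 2.0]]   # T = 2 trading days, f = 2 features
    w1 = [[1.0, 0.0]]              # r = 1
    w2 = [[0.0], [1.0]]
    print(channel_attention(x, w1, w2))
```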
(4.3) extracting high-order features from the channel attention feature vector of each stock through a temporal convolutional network model, according to:
$$e_i^t = \mathrm{TCN}(A_i)$$
where $A_i$ is the channel attention feature vector of the ith stock, $e_i^t \in \mathbb{R}^{D}$ is the depth fusion feature vector of the ith stock on the t-th trading day output by the temporal convolutional network model, D is the dimension of $e_i^t$, and TCN denotes the temporal convolutional network model;
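The core operation of a temporal convolutional network is a causal, dilated one-dimensional convolution: the output at day t depends only on day t and earlier days, and the dilation widens the look-back window without adding weights. The sketch below shows that single operation on a scalar series; a full TCN stacks such layers with residual connections, and the kernel size and dilation used here are illustrative assumptions rather than values from the source.

```python
def causal_dilated_conv(seq, kernel, dilation=1):
    """Causal dilated 1-D convolution over a list of numbers.

    out[t] = sum_j kernel[j] * seq[t - j * dilation], where positions before
    the start of the sequence are treated as zero (left zero padding).
    """
    out = []
    for t in range(len(seq)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = t - j * dilation
            if idx >= 0:
                acc += weight * seq[idx]
        out.append(acc)
    return out


if __name__ == "__main__":
    print(causal_dilated_conv([1.0, 2.0, 3.0, 4.0], [1.0, 1.0], dilation=2))
```

Because every output position can be computed independently, the whole window of trading days is processed in parallel, which is the property the description contrasts with recurrent networks.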
(4.4) extracting features from the depth fusion feature vectors of the N stocks and the N-layer stock relation network
Figure BDA0003786739360000059
with a graph convolutional network with a gating mechanism; the gating mechanism makes it possible to discard small price changes that do not affect stock price fluctuation. The relationship feature vector between the target stock $s_a$ and the stocks related to $s_a$,
Figure BDA00037867393600000510
is expressed as follows:
Figure BDA00037867393600000511
Figure BDA00037867393600000512
where
Figure BDA00037867393600000513
denotes the normalized stock correlation on the t-th trading day between the target stock $s_a$ and a stock $s_b$ related to $s_a$; the larger the value of
Figure BDA00037867393600000514
the stronger the correlation between the two stocks, and the smaller the value, the weaker the correlation;
Figure BDA00037867393600000515
is a weight matrix shared by the target stock $s_a$ and the stocks related to it, where D' is the dimension of the output relationship feature vector; σ denotes the sigmoid function;
Figure BDA00037867393600000516
denotes concatenation; p(·) denotes the gating mechanism;
Figure BDA00037867393600000517
is a weight matrix in the gating mechanism; and
Figure BDA00037867393600000518
is a bias;
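The exact formulas of step (4.4) are not recoverable from the source, so the following is only one plausible reading of the ingredients it names (normalized correlation, shared weight matrix, sigmoid, concatenation, gate with its own weights and bias). In this sketch the gate is a scalar sigmoid of the concatenated target and neighbour features, each neighbour's projected feature is scaled by the gate and by the normalized correlation, and the sum is passed through a sigmoid. All names and shapes are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gated_relation_vector(a, feats, rel, w, w_gate, b_gate):
    """Aggregate related stocks' features into a relation vector for stock a.

    feats:  N depth fusion feature vectors, each of length D.
    rel:    N x N normalized stock correlation matrix.
    w:      D' x D weight matrix shared by all stocks.
    w_gate: gate weights of length 2 * D, b_gate: scalar gate bias.
    """
    e_a = feats[a]
    agg = [0.0] * len(w)
    for b, e_b in enumerate(feats):
        if b == a:
            continue
        # Gate decides how much of stock b's information passes to stock a.
        gate = sigmoid(sum(g * v for g, v in zip(w_gate, e_a + e_b)) + b_gate)
        projected = [sum(wi * v for wi, v in zip(row, e_b)) for row in w]
        for i, v in enumerate(projected):
            agg[i] += rel[a][b] * gate * v
    return [sigmoid(v) for v in agg]


if __name__ == "__main__":
    feats = [[1.0], [2.0], [4.0]]
    rel = [[1.0, 0.5, 0.0], [0.5, 1.0, 0.2], [0.0, 0.2, 1.0]]
    print(gated_relation_vector(0, feats, rel, [[1.0]], [0.0, 0.0], 0.0))
```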
(4.5) feeding the obtained relationship feature vector and depth fusion feature vector into an output layer, which is a two-layer feedforward neural network with a softmax function; its output is the prediction of the future rise or fall of the target stock:
Figure BDA00037867393600000519
where
Figure BDA00037867393600000520
is the relationship feature vector between the target stock $s_a$ and the stocks related to $s_a$,
Figure BDA00037867393600000521
is the depth fusion feature vector of the target stock $s_a$,
Figure BDA00037867393600000522
is a weight matrix obtained by neural network training, P is the number of classes,
Figure BDA00037867393600000523
is the bias, and
Figure BDA00037867393600000524
is the predicted future rise or fall of the target stock $s_a$, which is also the final output of the fusion model of the temporal convolutional network with the channel-domain attention mechanism and the graph convolutional network.
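The output layer of step (4.5) can be sketched as a two-layer feedforward network ending in a softmax over P classes (P = 2 for rise and fall). The hidden activation (tanh), the parameter shapes, and the function names are assumptions; the source states only that the layer is a two-layer feedforward network with a softmax function applied to the relationship and depth fusion feature vectors.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def predict(relation_vec, fusion_vec, w1, b1, w2, b2):
    """Two-layer feedforward output layer on the concatenated features.

    w1: H x (D' + D), b1: H, w2: P x H, b2: P (all hypothetical shapes).
    Returns P class probabilities.
    """
    x = relation_vec + fusion_vec  # concatenation of the two feature vectors
    hidden = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    logits = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(w2, b2)]
    return softmax(logits)


if __name__ == "__main__":
    probs = predict([1.0], [0.0], [[1.0, 0.0]], [0.0], [[1.0], [-1.0]], [0.0, 0.0])
    print(probs)
```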
5) Training the fusion model of the temporal convolutional network with the channel-domain attention mechanism and the graph convolutional network repeatedly while varying its hyperparameters: in each run, the cross-entropy loss between the predicted value and the true value of the rise or fall of the target stock closing price is calculated and the parameters of the fusion model are continuously updated; finally, according to the training results, the hyperparameter combination with the minimum cross-entropy loss is selected as the optimal hyperparameter combination.
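Step 5) amounts to a hyperparameter search scored by cross-entropy loss. The sketch below shows a plain exhaustive grid search; the search strategy, the hyperparameter names (`lr`, `kernel`), and the stand-in `train_and_eval` callback are assumptions, since the source says only that the hyperparameters are varied and the combination with the minimum cross-entropy loss is kept.

```python
import math
from itertools import product


def cross_entropy(probs, labels):
    """Mean cross-entropy of predicted class probabilities against true labels."""
    eps = 1e-12  # avoid log(0)
    return -sum(math.log(max(p[y], eps)) for p, y in zip(probs, labels)) / len(labels)


def grid_search(grid, train_and_eval):
    """Try every hyperparameter combination and keep the lowest loss.

    grid:           dict mapping a hyperparameter name to candidate values.
    train_and_eval: callable taking a dict of hyperparameters and returning
                    the cross-entropy loss of a model trained with them.
    Returns (best_params, best_loss).
    """
    best = None
    for values in product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        loss = train_and_eval(params)
        if best is None or loss < best[1]:
            best = (params, loss)
    return best


if __name__ == "__main__":
    # A stand-in for real training: the loss is smallest at lr=0.01, kernel=3.
    fake_training = lambda p: abs(p["lr"] - 0.01) + abs(p["kernel"] - 3)
    print(grid_search({"lr": [0.1, 0.01], "kernel": [2, 3]}, fake_training))
```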
The above description of the invention and its embodiments is not limiting; the embodiments described are only some of the ways of implementing the invention, and any structure or embodiment similar to this technical solution that is devised without inventive effort and without departing from the spirit of the invention falls within the protection scope of the invention.

Claims (7)

1. A stock forecasting method based on historical data screening and momentum overflow effect is characterized by comprising the following steps:
1) Collecting historical trading data and text information of the target stock and stocks related to the target stock;
2) Data preprocessing, namely converting historical transaction data and text information into two-dimensional feature matrixes respectively;
3) Constructing a relationship network matrix among N stocks, comprising the following steps:
(3.1) constructing a space-time dynamic hierarchical complex network by using each trading day in each historical trading data to obtain a stock relation network under the span of T trading days;
(3.2) calculating degree correlation between layers in the N-layer stock relation network;
(3.3) calculating the clustering coefficient correlation between layers in the N layers of stock relation networks;
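(3.4) defining the average of the degree correlation and the clustering coefficient correlation between stocks as the stock correlation $R_{i,j}$, expressed as
$$R_{i,j} = \frac{1}{2}\left(I^{k}_{i,j} + I^{z}_{i,j}\right),$$
where $I^{k}_{i,j}$ and $I^{z}_{i,j}$ denote the degree correlation and the clustering coefficient correlation between the ith and jth stocks;
normalizing the stock correlation to obtain the normalized stock correlation
Figure FDA0003786739350000012
thereby obtaining the relation network matrix of the N stocks
Figure FDA0003786739350000013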
4) Constructing a fusion model of a temporal convolutional network with a channel-domain attention mechanism and a graph convolutional network;
5) Training the fusion model of the temporal convolutional network with the channel-domain attention mechanism and the graph convolutional network repeatedly while varying its hyperparameters: in each run, the cross-entropy loss between the predicted value and the true value of the rise or fall of the target stock closing price is calculated and the parameters of the fusion model are continuously updated; finally, according to the training results, the hyperparameter combination with the minimum cross-entropy loss is selected as the optimal hyperparameter combination.
2. The stock forecasting method based on historical data screening and momentum overflow effect as claimed in claim 1, wherein the historical trading data of step 1) comprise five indicators: opening price, closing price, transaction amount, highest price and lowest price; and the text information comprises financial news about the target stock and the stocks related to it, and comments posted by investors on the East Money (东方财富网) forum.
3. The stock prediction method based on historical data screening and momentum overflow effect as claimed in claim 1, wherein step 2) comprises:
performing data cleaning on the collected historical transaction data; quantizing the collected text information, and obtaining a daily sentiment tendency vector for each stock by means of a sentiment dictionary, so as to form the daily text information features of each stock; denoting the historical transaction data features of the i-th stock on the t-th trading day as
Figure FDA0003786739350000014
wherein L' is the dimension of the historical transaction data features; similarly, denoting the text information features of the i-th stock on the t-th trading day as
Figure FDA0003786739350000015
wherein L is the dimension of the text information features.
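To make the text quantization of claim 3 concrete, the following Python sketch turns one stock's texts for one trading day into a small sentiment tendency vector using a dictionary lookup. It is a toy example under assumptions: the word lists `POSITIVE` and `NEGATIVE`, the function name `sentiment_vector`, whitespace tokenization, and the choice of a three-dimensional vector (L = 3) are all hypothetical; the claim does not specify the dictionary or the vector layout.

```python
# Hypothetical miniature sentiment dictionary (English words for illustration only).
POSITIVE = {"surge", "profit", "growth", "beat"}
NEGATIVE = {"loss", "decline", "lawsuit", "miss"}

def sentiment_vector(texts):
    """Return [positive share, negative share, neutral share] of tokens for one stock-day."""
    pos = neg = total = 0
    for text in texts:
        for token in text.lower().split():
            token = token.strip(".,!?;:")
            if not token:
                continue
            total += 1
            if token in POSITIVE:
                pos += 1
            elif token in NEGATIVE:
                neg += 1
    if total == 0:
        # No text on this day: treat the day as fully neutral.
        return [0.0, 0.0, 1.0]
    return [pos / total, neg / total, (total - pos - neg) / total]

day_texts = ["Profit growth beat forecasts", "Lawsuit risk"]
print(sentiment_vector(day_texts))
```

Each trading day then contributes one such vector per stock, which plays the role of the daily text information feature.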
4. The stock prediction method based on historical data screening and momentum overflow effect as claimed in claim 1, wherein step (3.1) of step 3) comprises:
regarding each trading day as a node of a single-stock network, and, for two nodes
Figure FDA0003786739350000016
and
Figure FDA0003786739350000017
constructing a horizontal connecting line between the vertical bars of the two nodes, wherein node
Figure FDA0003786739350000018
denotes the i-th stock on the t_m-th trading day, whose closing price is
Figure FDA0003786739350000019
node
Figure FDA00037867393500000110
denotes the i-th stock on the t_n-th trading day, whose closing price is
Figure FDA0003786739350000021
and the height of the connecting line is the smaller of the two bar heights, i.e.
Figure FDA0003786739350000022
where t denotes a trading day lying between the t_m-th trading day and the t_n-th trading day; the limited penetrable visual distance is set to d, and if, for the two nodes
Figure FDA0003786739350000023
and
Figure FDA0003786739350000024
the horizontal connecting line between them is intersected by no more than d intermediate nodes, then
Figure FDA0003786739350000025
and
Figure FDA0003786739350000026
are joined by a connecting edge; otherwise, no connecting edge exists; a single-stock network layer is constructed in this way for each of the N stocks, giving the N-layer stock relation network
Figure FDA0003786739350000027
wherein N is the total number of stocks, comprising the target stock and the stocks related to the target stock.
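The construction in claim 4 corresponds to a limited penetrable horizontal visibility graph over the closing-price series. A minimal Python sketch follows, assuming that an intermediate trading day "intersects" the connecting line when its closing price is at least the line height (the smaller of the two end prices); the function names `lphvg_adjacency` and `degrees` are hypothetical.

```python
def lphvg_adjacency(prices, d=1):
    """Adjacency matrix of a limited penetrable horizontal visibility graph.

    Nodes are trading days. Days m < n are linked when at most `d` intermediate
    days have a closing price reaching the horizontal line drawn at
    min(prices[m], prices[n]).
    """
    n_days = len(prices)
    adj = [[0] * n_days for _ in range(n_days)]
    for m in range(n_days - 1):
        for n in range(m + 1, n_days):
            height = min(prices[m], prices[n])
            # Count intermediate bars that the horizontal line runs into.
            blocked = sum(1 for q in range(m + 1, n) if prices[q] >= height)
            if blocked <= d:
                adj[m][n] = adj[n][m] = 1
    return adj

def degrees(adj):
    """Degree of every node, i.e. the degree-value sequence of the network."""
    return [sum(row) for row in adj]

closing = [3, 1, 2, 4]
print(degrees(lphvg_adjacency(closing, d=0)))  # [3, 2, 3, 2]
print(degrees(lphvg_adjacency(closing, d=1)))  # [3, 3, 3, 3]
```

With d = 0 days 1 and 3 are not linked because day 2 blocks the line at height 1; allowing one penetration (d = 1) adds that edge. Building one such graph per stock yields the N layers of the stock relation network.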
5. The stock prediction method based on historical data screening and momentum overflow effect as claimed in claim 1, wherein step (3.2) of step 3) comprises:
according to the node of the i-th stock network
Figure FDA0003786739350000028
and its degree value
Figure FDA0003786739350000029
and the node of the j-th stock network
Figure FDA00037867393500000210
and its degree value
Figure FDA00037867393500000211
calculating the mutual information between the degree-value sequence of the i-th stock network
Figure FDA00037867393500000212
and the degree-value sequence of the j-th stock network
Figure FDA00037867393500000213
the mutual information being used to represent the degree correlation between the i-th stock and the j-th stock over T trading days
Figure FDA00037867393500000214
namely the inter-layer degree correlation in the N-layer stock relation network, the calculation formula being as follows:
Figure FDA00037867393500000215
wherein p(k_i) is the degree distribution of the i-th stock, p(k_j) is the degree distribution of the j-th stock, and p(k_i, k_j) is the joint degree distribution of the i-th stock and the j-th stock.
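The mutual information of claim 5 can be estimated from two degree-value sequences by counting empirical frequencies for p(k_i), p(k_j) and p(k_i, k_j). The Python sketch below uses the standard definition of mutual information, sum of p(k_i, k_j) log[p(k_i, k_j) / (p(k_i) p(k_j))]; the natural logarithm and the function name `mutual_information` are assumptions, since the formula image is not reproduced in this text.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two equal-length discrete sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(xs)
    count_x = Counter(xs)          # marginal counts, e.g. degree distribution of stock i
    count_y = Counter(ys)          # marginal counts, e.g. degree distribution of stock j
    count_xy = Counter(zip(xs, ys))  # joint counts over the same trading days
    mi = 0.0
    for (x, y), c in count_xy.items():
        p_joint = c / n
        mi += p_joint * math.log(p_joint / ((count_x[x] / n) * (count_y[y] / n)))
    return mi

degrees_i = [3, 2, 3, 2]
degrees_j = [3, 3, 3, 3]
print(mutual_information(degrees_i, degrees_i))  # equals the entropy of degrees_i
print(mutual_information(degrees_i, degrees_j))  # a constant sequence carries no information
```

A sequence compared with itself gives its own entropy, while a constant or statistically independent sequence gives zero, which matches the intended reading of the value as a correlation strength.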
6. The stock prediction method based on historical data screening and momentum overflow effect as claimed in claim 1, wherein step (3.3) of step 3) comprises:
according to the node of the i-th layer single-stock network
Figure FDA00037867393500000216
and its clustering coefficient
Figure FDA00037867393500000217
and the node of the j-th layer single-stock network
Figure FDA00037867393500000218
and its clustering coefficient
Figure FDA00037867393500000219
calculating the mutual information between the clustering coefficient sequence of the i-th layer single-stock network
Figure FDA00037867393500000220
and the clustering coefficient sequence of the j-th layer single-stock network
Figure FDA00037867393500000221
the mutual information being used to represent the clustering coefficient correlation between the i-th stock and the j-th stock over T trading days
Figure FDA00037867393500000222
the calculation formula being as follows:
Figure FDA00037867393500000223
wherein p(z_i) is the clustering coefficient distribution of the i-th stock, p(z_j) is the clustering coefficient distribution of the j-th stock, and p(z_i, z_j) is the joint clustering coefficient distribution of the i-th stock and the j-th stock.
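The clustering coefficient sequence used in claim 6, and the combination of the two correlations described in step (3.4) of claim 1, can be sketched in Python as follows. The local clustering coefficient is the standard one (links among a node's neighbours divided by the possible number of such links). The min-max normalization in `relation_matrix` is an assumption, because the normalization formula is not reproduced in this text; both function names are hypothetical.

```python
def clustering_coefficients(adj):
    """Local clustering coefficient of every node of an undirected graph."""
    n = len(adj)
    coeffs = []
    for v in range(n):
        nbrs = [u for u in range(n) if adj[v][u]]
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)  # fewer than two neighbours: no triangle possible
            continue
        links = sum(1 for a in range(k) for b in range(a + 1, k)
                    if adj[nbrs[a]][nbrs[b]])
        coeffs.append(2.0 * links / (k * (k - 1)))
    return coeffs

def relation_matrix(degree_corr, clustering_corr):
    """Average two N x N correlation matrices, then min-max normalize (assumed scheme)."""
    n = len(degree_corr)
    avg = [[(degree_corr[i][j] + clustering_corr[i][j]) / 2.0 for j in range(n)]
           for i in range(n)]
    flat = [v for row in avg for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0] * n for _ in range(n)]
    return [[(v - lo) / (hi - lo) for v in row] for row in avg]

# Single-stock network with edges 0-1, 0-2, 0-3, 1-2, 2-3.
adj = [[0, 1, 1, 1],
       [1, 0, 1, 0],
       [1, 1, 0, 1],
       [1, 0, 1, 0]]
print(clustering_coefficients(adj))
print(relation_matrix([[1.0, 0.2], [0.2, 1.0]], [[0.8, 0.4], [0.4, 0.8]]))
```

Because clustering coefficients are real-valued, an empirical mutual information estimate would normally require binning or rounding them first; that step is left out of this sketch.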
7. The stock prediction method based on historical data screening and momentum overflow effect as claimed in claim 1, wherein step 4) comprises:
(4.1) fusing the historical transaction data features and the text information features of each stock by means of a K-dimensional bilinear tensor product term
Figure FDA00037867393500000224
wherein the k-th term of the K-dimensional bilinear tensor product is calculated by tensor slicing, the calculation formula being as follows:
Figure FDA00037867393500000225
wherein
Figure FDA0003786739350000031
is a third-order tensor, Γ^[1:K] = [Γ_1, ..., Γ_k, ..., Γ_K], L' is the dimension of the historical transaction data features, L is the dimension of the text information features, and K is the dimension of the bilinear tensor product term;
Figure FDA0003786739350000032
represents a single dimension of the historical transaction data features of the i-th stock, and
Figure FDA0003786739350000033
represents a single dimension of the text information features of the i-th stock; the historical transaction data features
Figure FDA0003786739350000034
and the text information features
Figure FDA0003786739350000035
are concatenated and linearly transformed by a weight matrix
Figure FDA0003786739350000036
to obtain the fused feature vector of the i-th stock
Figure FDA0003786739350000037
Figure FDA0003786739350000038
wherein || denotes concatenation,
Figure FDA0003786739350000039
denotes the bias, and tanh is the activation function;
(4.2) processing the fused feature vector of each stock with a channel attention mechanism to complete the screening of historical data, the processing formula being as follows:
Figure FDA00037867393500000310
wherein
Figure FDA00037867393500000311
is the channel attention feature vector of the i-th stock over T trading days, f is the feature dimension of each trading day output by the channel attention mechanism,
Figure FDA00037867393500000312
denotes the set of fused feature vectors of the i-th stock over T trading days, and CA denotes the channel attention mechanism;
(4.3) performing high-order feature extraction on the channel attention feature vector of each stock through a time convolution network model, the extraction formula being as follows:
Figure FDA00037867393500000313
wherein
Figure FDA00037867393500000314
denotes the deep fusion feature vector of the i-th stock on the t-th trading day output by the time convolution network model, D is the dimension of
Figure FDA00037867393500000315
and TCN denotes the time convolution network model;
(4.4) performing feature extraction on the deep fusion feature vectors of the N stocks and the N-layer stock relation network
Figure FDA00037867393500000316
using a graph convolution network with a gating mechanism, wherein the relationship feature vector between the target stock s_a and the stocks related to the target stock s_a
Figure FDA00037867393500000317
is expressed as follows:
Figure FDA00037867393500000318
Figure FDA00037867393500000319
wherein
Figure FDA00037867393500000320
denotes the normalized stock correlation, on the t-th trading day, between the target stock s_a and a stock s_b related to the target stock s_a, and for
Figure FDA00037867393500000321
the larger the value, the stronger the correlation between the two stocks, and the smaller the value, the weaker the correlation;
Figure FDA00037867393500000322
is the weight matrix shared by the target stock s_a and the stocks related to the target stock, wherein D' is the dimension of the output relationship feature vector; σ denotes the sigmoid function;
Figure FDA00037867393500000323
denotes concatenation; p(·) denotes the gating mechanism,
Figure FDA00037867393500000324
is the weight matrix in the gating mechanism; and
Figure FDA00037867393500000325
is the bias;
(4.5) feeding the obtained relationship feature vector and the deep fusion feature vector into an output layer, the output layer being a two-layer feedforward neural network with a softmax function, whose output is the prediction of the future rise or fall of the target stock:
Figure FDA0003786739350000041
wherein
Figure FDA0003786739350000042
is the relationship feature vector between the target stock s_a and the stocks related to the target stock s_a,
Figure FDA0003786739350000043
is the deep fusion feature vector of the target stock s_a,
Figure FDA0003786739350000044
is a weight matrix obtained by neural network training, P denotes the number of classes,
Figure FDA0003786739350000045
denotes the bias, and
Figure FDA0003786739350000046
is the predicted future rise or fall of the target stock s_a, which is also the final output of the fusion model of the time convolution network with a channel domain attention mechanism and the graph convolution network.
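The bilinear fusion of step (4.1) in claim 7 can be illustrated with a small pure-Python sketch. It assumes the usual neural tensor layer form: for each k, the slice Γ_k gives a bilinear term x^T Γ_k e, which is added to a linear transform of the concatenation [x || e] and a bias before tanh is applied. That exact arrangement, and the names `bilinear_fusion`, `gamma`, `W` and `b`, are assumptions, because the formula images are not reproduced in this text.

```python
import math

def bilinear_fusion(x, e, gamma, W, b):
    """Fuse a price feature vector x (length L') with a text feature vector e (length L).

    gamma: K slices, each an L' x L matrix (the third-order tensor, slice by slice)
    W:     K x (L' + L) weight matrix applied to the concatenation [x || e]
    b:     bias vector of length K
    Returns a fused feature vector of length K.
    """
    concat = list(x) + list(e)
    fused = []
    for k in range(len(gamma)):
        # k-th bilinear term: x^T * Gamma_k * e
        bilinear = sum(x[i] * gamma[k][i][j] * e[j]
                       for i in range(len(x)) for j in range(len(e)))
        # Linear transform of the concatenated features
        linear = sum(W[k][c] * concat[c] for c in range(len(concat)))
        fused.append(math.tanh(bilinear + linear + b[k]))
    return fused

# Toy sizes: L' = 2, L = 1, K = 2 (all values hypothetical).
x = [1.0, 2.0]
e = [3.0]
gamma = [[[0.1], [0.0]],
         [[0.0], [0.1]]]
W = [[0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0]]
b = [0.0, 0.0]
print(bilinear_fusion(x, e, gamma, W, b))
```

The bilinear term lets every price feature interact multiplicatively with every text feature, which a plain concatenation followed by a linear layer cannot express; in a trained model gamma, W and b would be learned parameters rather than fixed values.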
CN202210943495.6A 2022-08-08 2022-08-08 Stock prediction method based on historical data screening and momentum overflow effect Pending CN115482101A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210943495.6A CN115482101A (en) 2022-08-08 2022-08-08 Stock prediction method based on historical data screening and momentum overflow effect

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210943495.6A CN115482101A (en) 2022-08-08 2022-08-08 Stock prediction method based on historical data screening and momentum overflow effect

Publications (1)

Publication Number Publication Date
CN115482101A true CN115482101A (en) 2022-12-16

Family

ID=84421733

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210943495.6A Pending CN115482101A (en) 2022-08-08 2022-08-08 Stock prediction method based on historical data screening and momentum overflow effect

Country Status (1)

Country Link
CN (1) CN115482101A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination