CN110659767A - Stock trend prediction method based on LSTM-CNN deep learning model - Google Patents
- Publication number
- CN110659767A, CN201910723665.8A
- Authority
- CN
- China
- Prior art keywords
- trend
- lstm
- turning point
- price
- prediction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2465—Query processing support for facilitating data mining operations in structured databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/04—Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
Abstract
A stock trend prediction method based on an LSTM-CNN deep learning model comprises the following steps: step 1: data preprocessing; step 2: predicting the stock price trend with an LSTM-CNN prediction model; and step 3: evaluation criteria, wherein the prediction targets are the trend position and the bottom turning point, and the accuracy of each is assessed separately. The invention provides an LSTM-CNN neural network model and a weighted loss function, optimizes the normalization algorithm and the trend-position prediction target, and accurately predicts the stock fluctuation trend.
Description
Technical Field
The invention relates to research on financial time series, and in particular to a method for predicting stock trends by combining two neural networks, LSTM and CNN.
Background
With the development of the global economy, the financial market in China faces greater opportunities and challenges. The financial market covers a wide range of products, mainly bonds, futures, foreign exchange and stocks, and the fluctuation of their prices is a key concern.
Stocks are an important part of the financial market and serve as its barometer. Fluctuations of the stock market are closely tied to the interests of investors and have become a focus of attention for many investors and financial institutions.
Quantitative analysis of historical stock price fluctuations, transaction data and the like, using technologies such as data mining and artificial intelligence, provides a reference that is of considerable value to investors. In recent years, technological breakthroughs in artificial intelligence have once again drawn public attention to the field.
Hochreiter S and Schmidhuber J proposed the LSTM neural network model, which is suited to feature analysis of data with time-series characteristics. In the early days of CNNs, Fukushima K proposed the prototype of the convolutional neural network, which is geared toward recognizing static features. This property also suits financial data, in which values at adjacent times are strongly related, so that local features can be extracted.
Disclosure of Invention
Unlike current approaches that analyze and predict stock data with a single neural network, the invention analyzes stock price fluctuation with an LSTM-CNN network model: time-series and static features are extracted from the stock data by the neural networks, and the turning points of the stock trend and its future direction are predicted.
The technical scheme adopted by the invention for solving the technical problems is as follows:
a stock trend prediction method based on an LSTM-CNN deep learning model comprises the following steps:
step 1: data pre-processing
Stock data of a stock market are selected for analysis and verification, and an integral data set is divided into a training set and a testing set;
step 1.1: feature selection and normalization
The features are transaction data and technical indicators, and the transaction data need to be normalized; the technical indicators are calculated from the transaction data, and their calculation already includes a normalization step;
firstly, the original transaction data are normalized. In stock data, during a continuously rising or continuously falling market, new highs and new lows are common, so separate normalization methods are designed for price and trading volume;
price normalization: the stock data of each time window are quantified. Relative rises and falls are calculated with the first value of the stock sequence as the reference price, and the data in each time window are normalized independently, as shown in formula (1);
for each training sample, the closing prices at times t_0, t_1, …, t_i, …, t_m are P_0, P_1, …, P_i, …, P_m, where μ is the average closing price and m is the sliding window size;
trading volume normalization: the calculation is carried out over the whole time series of a stock, as shown in formula (2):
at time T_i the trading volume is V_i, and the feature value obtained from volume normalization is Vx_i. The constant term m is generally equal to the size of the time window; the first m time points of a stock are used only for data processing and are not included in the samples;
after the closing price and trading volume are normalized, the technical indicators MACD, RSI and KDJ-J of the sequence are calculated, and all of these values are used as input data;
step 1.2: calibrating trending locations
The medium- and long-term price trend of stocks is studied and turning points are defined; the trend is quantified according to the turning points to obtain trend positions, and the data labels are calibrated according to the trend positions;
the trend turning points include a top turning point at a peak and a bottom turning point at a valley. A price at a bottom turning point means that the price is about to rise, and vice versa. In an actual market, prices fluctuate constantly with different amplitudes and produce many turning points, so the turning points to be recognized by the model are quantified, with the specific definition as follows:
in a stock price sequence, the closing prices at times T_1, T_2, T_3 are P_1, P_2, P_3 respectively. A parameter called the trend threshold δ is set; when condition (4) is satisfied and the price at T_2 is the lowest point in the period from T_1 to T_3, T_2 is considered the bottom turning point of a trend;
similarly, when condition (5) is satisfied and the price at T_2 is the highest point in the period from T_1 to T_3, T_2 is considered the top turning point of a trend, as shown in the right diagram of Fig. 1. It should be noted that the calculations for both the bottom turning point and the top turning point are based on the low price; otherwise, under the same threshold δ, the absolute fluctuations required to establish a bottom turning point and a top turning point would differ, which would introduce a bias;
after a certain bottom turning point is determined, the subsequent sequence is defined as an ascending trend until the next top turning point is met, and similarly, after a certain top turning point is determined, the subsequent sequence is defined as a descending trend until the next bottom turning point is met;
after determining the turning points at the bottom and the top of the time sequence, mapping the current stock price to the relative position in the trend, and using the trend position as the direct prediction target of an LSTM-CNN model to realize the prediction of the future price trend, wherein the calibration method of the trend position comprises the following steps:
1.2.1, according to the definition of the turning points, obtaining all bottom turning points and top turning points of a group of sequences, wherein the trend position at the bottom turning point is marked as 0, and the trend position at the top turning point is marked as 100;
1.2.2, the trend position values between the bottom and top turning points are calculated according to formula (6), where P_L and P_H are the prices at a pair of adjacent bottom and top turning points;
for the last trend segment of the time series, the final top or bottom turning point cannot be determined, so the undetermined part is not labeled and is excluded from the training data;
step 2: the stock price trend is predicted by using an LSTM-CNN prediction model, and the method comprises the following steps:
step 2.1: constructing a network model, comprising the following steps:
the LSTM-CNN model comprises three parts: the first is an LSTM time-series feature learning layer; the second is a CNN static feature learning layer that runs in parallel with the LSTM part; and the last is a fully connected output layer, in which a fully connected neural network is built on the concatenated outputs of the first two parts.
The first part is composed of three LSTM layers, as shown in part (a) of Fig. 3. The input matrix of the LSTM has dimensions k and m, where k is the number of training features and m is the size of the sliding time window. As shown in the figure, layer L1 includes m LSTM units, one per time node, and the input of each LSTM unit consists of k features comprising transaction data and technical indicators. The output of each unit in layer L1 serves as input to the units of layer L2, and likewise the output of layer L2 is the input to layer L3. Layers L2 and L3 are trained in the same way as layer L1. Finally, the output of the LSTM unit corresponding to the last time node of layer L3 is taken as the result of part (a);
the second part consists of three convolutional layers and one LSTM layer, as shown in part (b) of Fig. 3. In this part, the input matrix is the same as in part (a); first, local correlation information between the price and the technical indicators is extracted by the three convolutional layers. The convolutional layers are followed by an LSTM layer, which analyzes the temporal relationships in the extracted local information;
the last part consists of three fully connected layers. The temporal and static information extracted by the LSTM part and the CNN part are combined into a new feature sequence, which is the input to the fully connected layers. These layers output the final prediction of the model, namely the trend position. In addition, random deactivation (dropout) is applied between the fully connected layers; it disconnects some nodes between two connected layers so that the final result does not depend entirely on any particular node, which significantly reduces overfitting;
step 2.2: activation function
ReLU is used as the activation function in the LSTM network, as given by formula (7):
φ(x) = max(0, x) (7)
the network structure used for training is shown in Fig. 3. The network model must be configured for training: the activation function and the input and output dimensions are set in the LSTM units, and the activation function and the convolution kernel size are set in the CNN units;
step 2.3: constructing a model training strategy, designing a weighting loss function for the training of the model, and comprising the following steps:
the value range of the trend position is a limited interval [0,100], the application values of different positions corresponding to the interval are different, and the invention performs weighting measurement on the interval to provide a loss function sensitive to the cost of a turning region;
firstly, the target value y and the predicted value ŷ are scaled and translated so that they lie in the interval [-1, 1]; formula (8) is the conversion method, in which c is a small real number that ensures that the argument of the logarithm and the denominator are not zero. The arctanh function, formula (9), is then used to map the target value z and the predicted value ẑ, and the weight assignment is realized through this mapping; Y and Ŷ are the true value and the predicted value after transformation and weighting, respectively;
after the target value and the predicted value have been mapped and weighted, MSE is applied to measure the error between them, giving the loss function of the prediction model, WMSE (10). The weighting can speed up training and at the same time reduce the complexity of the model to a certain extent;
step 3: error evaluation criteria
The prediction targets are the trend position and the bottom turning point, and the accuracy of each is assessed separately;
the accuracy of the trend position is evaluated with the mean absolute error (MAE), and the accuracy of the turning points is evaluated with two purpose-designed indicators, the bottom precision (LP) and the top recall (HR);
precision and recall are common evaluation indicators for binary classification problems. In a classification algorithm, after the prediction result is compared with the real target, the following four labels represent the prediction state;
TP: the true value is the positive class and the predicted value is the positive class;
FN: the true value is the positive class and the predicted value is the negative class;
FP: the true value is the negative class and the predicted value is the positive class;
TN: the true value is the negative class and the predicted value is the negative class.
Here T is True, F is False, P is Positive, and N is Negative. The precision (11) and the recall (12) are calculated from these values:
Precision = TP / (TP + FP) (11)
Recall = TP / (TP + FN) (12)
According to the formulas, precision is the proportion of positive predictions whose true value is positive, and recall is the proportion of true positives that are predicted as positive;
for the bottom turning point, each predicted turning point gives a strong buy signal, so every such prediction should be as accurate as possible, since a wrong one causes investment loss; precision is therefore the quantity of interest;
for the top turning point, each predicted turning point gives a sell signal, so as many actual top turning points as possible should be predicted in order not to miss a selling opportunity, since a missed one causes investment loss; recall is therefore the quantity of interest.
The invention has the following beneficial effects: the invention provides an LSTM-CNN neural network model and a weighting loss function, optimizes a normalization algorithm and a target function trend position, and accurately predicts the stock fluctuation trend.
Drawings
FIG. 1 is a turning point defining diagram of the present invention.
FIG. 2 is an exemplary diagram of a trend position of the present invention.
Fig. 3 is a schematic diagram of the LSTM-CNN network structure of the present invention.
FIG. 4 is a graph of the arctanh mapping of the present invention.
FIG. 5 is a graph of the target weighting effect of the present invention.
FIG. 6 is an LSTM-CNN prediction chart of the trend position of Baosteel shares of the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Referring to fig. 1 to 6, a stock trend prediction method based on an LSTM-CNN deep learning model includes the following steps:
step 1: data pre-processing
Data for a subset of stocks in the A-share market are used for analysis and verification, and the whole data set is divided into a training set and a test set. Constituents of the CSI 300 (Shanghai-Shenzhen 300) index are selected for the training set, and constituents of the SSE 50 index for the test set. New stocks, sub-new stocks, long-suspended stocks, ST stocks and stocks with large abnormal fluctuations are excluded. 218 stocks are included in the training set and 37 in the test set. Stock data from 2004 to December 2018 are selected, and weekly trading data such as opening price, closing price, highest price and trading volume are acquired.
The main features are transaction data and technical indicators. The five training features are Cx, Vx, MACD, KDJ-J and RSI, where Cx and Vx denote the normalized closing price and trading volume and the other three are technical indicators. Each stock in the training set has about 500 time-series points, and the sliding-window step is 1. Each training time point corresponds to one trend position, which is the target of the sample, so the target matrix has as many entries as there are samples.
Step 1.1: feature selection and normalization
The main features are transaction data and technical indicators, and the transaction data need to be normalized. The technical indicators are calculated from the transaction data, and their calculation already includes a normalization step, so they do not need to be processed again.
The raw transaction data are normalized first. In stock data, during a continuously rising or continuously falling market, new highs and new lows are common. Therefore, separate normalization methods are designed for price and trading volume.
Price normalization: the stock data of each time window are quantified. Relative rises and falls are calculated with the first value of the stock sequence as the reference price, and the data in each time window are normalized independently, as shown in formula (1).
For each training sample, the closing prices at times t_0, t_1, …, t_i, …, t_m are P_0, P_1, …, P_i, …, P_m, where μ is the average closing price and the sliding window size is m = 100;
Trading volume normalization: the calculation is carried out over the whole time series of a stock, as shown in formula (2):
at time T_i the trading volume is V_i, and the feature value obtained from volume normalization is Vx_i. The constant term is m = 100, generally equal to the size of the time window; the first m = 100 time points of a stock are used only for data processing and are not included in the samples;
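The two normalizations can be sketched as follows. Formulas (1) and (2) are not reproduced in this text, so the expressions below are assumptions made for illustration only: prices in a window are expressed as the relative change from the window's first closing price, and each volume is divided by the mean volume of the preceding m points. The function names are hypothetical.

```python
def normalize_prices(window):
    """Relative rise or fall of each closing price versus the window's first price.

    Assumption: formula (1) is not shown in the text; (P_i - P_0) / P_0 is used here.
    """
    base = window[0]
    return [(p - base) / base for p in window]


def normalize_volumes(volumes, m):
    """Scale each volume by the mean of the m volumes that precede it.

    Assumption: formula (2) is not shown in the text. As the text describes,
    the first m points only seed the calculation and yield no feature value.
    """
    features = []
    for i in range(m, len(volumes)):
        mean_prev = sum(volumes[i - m:i]) / m
        features.append(volumes[i] / mean_prev)
    return features


closes = [10.0, 11.0, 9.0, 12.0]
print(normalize_prices(closes))
print(normalize_volumes([100.0, 200.0, 300.0, 400.0], m=2))
```

With m = 100 as in the text, a stock with about 500 weekly points would lose its first 100 points to seeding under this sketch.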
closing price, volume, MACD, RSI and KDJ-J are used as input data.
Step 1.2: calibrating trending locations
The medium- and long-term price trend of stocks is studied, and turning points are defined. The trend is quantified according to the turning points to obtain the trend position, and the data labels are calibrated according to the trend position.
The trend turning points include a top turning point at a peak and a bottom turning point at a valley. A price at a bottom turning point means that the price is about to rise, and vice versa. In an actual market, prices fluctuate constantly with different amplitudes and produce many turning points, so the turning points to be recognized by the model are quantified, with the specific definition as follows:
In a stock price sequence, the closing prices at times T_1, T_2, T_3 are P_1, P_2, P_3 respectively. A parameter called the trend threshold δ is set to 60%. When condition (4) is satisfied and the price at T_2 is the lowest point in the period from T_1 to T_3, T_2 is considered the bottom turning point of a trend, as shown in the left diagram of Fig. 1.
Similarly, when condition (5) is satisfied and the price at T_2 is the highest point in the period from T_1 to T_3, T_2 is considered the top turning point of a trend, as shown in the right diagram of Fig. 1. It should be noted that both the bottom turning point and the top turning point are calculated relative to the low price; otherwise, under the same threshold δ, the absolute fluctuations required to establish a bottom turning point and a top turning point would differ, which would introduce a bias.
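The turning-point tests can be sketched in code. Conditions (4) and (5) are not reproduced in this text, so the inequalities below are assumptions: each move into and out of T_2 is measured relative to the lower of the two prices involved, which is one reading of "based on the low price". The function names and index arguments are hypothetical.

```python
def is_bottom(prices, t1, t2, t3, delta):
    """Check whether index t2 is a bottom turning point between t1 and t3."""
    p1, p2, p3 = prices[t1], prices[t2], prices[t3]
    is_lowest = p2 == min(prices[t1:t3 + 1])
    # Assumed form of condition (4): the fall into t2 and the rise out of it,
    # both measured relative to the low price p2, reach the threshold delta.
    return is_lowest and (p1 - p2) / p2 >= delta and (p3 - p2) / p2 >= delta


def is_top(prices, t1, t2, t3, delta):
    """Check whether index t2 is a top turning point between t1 and t3."""
    p1, p2, p3 = prices[t1], prices[t2], prices[t3]
    is_highest = p2 == max(prices[t1:t3 + 1])
    # Assumed form of condition (5): the rise into t2 is measured against the
    # low price p1 and the fall out of it against the low price p3.
    return is_highest and (p2 - p1) / p1 >= delta and (p2 - p3) / p3 >= delta


weekly = [17.0, 12.0, 10.0, 13.0, 17.0]
print(is_bottom(weekly, 0, 2, 4, delta=0.6))
```

Measuring against the low price keeps the two tests symmetric: a move between 10 and 16 is 60% relative to the low price but only 37.5% relative to the high price.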
When a bottom turning point is determined, the subsequent sequence is defined as an ascending trend until the next top turning point is encountered. Similarly, when a top inflection point is identified, the subsequent sequence is defined as a trend of falling until the next bottom inflection point is encountered.
After determining the bottom and top turning points of the time series, the current stock price is mapped to a relative position in the trend. And the trend position is used as a direct prediction target of the LSTM-CNN model to realize the prediction of the future price trend. The calibration method of the trend position comprises the following steps:
1.2.1, according to the definition of the turning points, all the bottom turning points and the top turning points of a group of sequences can be obtained, the trend position of the bottom turning point is marked as 0, and the trend position of the top turning point is marked as 100.
1.2.2, the trend position values between the bottom and top turning points are calculated according to formula (6), where P_L and P_H are the prices at a pair of adjacent bottom and top turning points.
In the last trend of the time series, the top or bottom turning point of the last price can not be determined according to the definition, so that the part which can not be determined is not marked and is not included in the training data. The marking effect of a stock sequence is shown in fig. 2.
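The labeling step can be illustrated with a short sketch. Formula (6) is not reproduced in this text; the code assumes a linear mapping of the price between P_L and P_H onto [0, 100], which matches the stated endpoints (0 at a bottom turning point, 100 at a top turning point). The function names are hypothetical.

```python
def trend_position(price, p_low, p_high):
    """Map a price between adjacent bottom and top turning points onto [0, 100].

    Assumption: linear interpolation, since formula (6) is not shown in the text.
    """
    return (price - p_low) / (p_high - p_low) * 100.0


def label_segment(prices, p_low, p_high):
    """Label every price in one trend segment with its trend position."""
    return [trend_position(p, p_low, p_high) for p in prices]


# A rising segment from a bottom turning point at 10 to a top turning point at 20.
segment = [10.0, 12.5, 15.0, 20.0]
print(label_segment(segment, p_low=10.0, p_high=20.0))
```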
Step 2: predicting the stock price trend by using an LSTM-CNN prediction model, with the following steps:
step 2.1: constructing a network model, comprising the following steps:
the LSTM-CNN model comprises three parts, wherein one part is an LSTM time sequence characteristic learning layer; the other part is a CNN static characteristic learning layer which is synchronously carried out with the LSTM part; and the last part is a fully-connected output layer, and the fully-connected neural network is constructed by serially combining the outputs of the first two models.
The first part is composed of three LSTM layers, as shown in part (a) of Fig. 3. The input matrix of the LSTM has dimensions k and m, where k = 5 is the number of training features and m is the size of the sliding time window. As shown in the figure, layer L1 includes LSTM units corresponding to m = 100 time nodes, and the input of each LSTM unit consists of k = 5 features comprising transaction data and technical indicators. The output of each unit in layer L1 serves as input to the units of layer L2, and likewise the output of layer L2 is the input to layer L3. Layers L2 and L3 are trained in the same way as layer L1. Finally, the output of the LSTM unit corresponding to the last time node of layer L3 is taken as the result of part (a).
The second part consists of three convolutional layers and one LSTM layer, as shown in part (b) of fig. 3. In this section, the input matrix is the same as in section (a). First, the local related information between the price and the technical index is extracted through three convolutional layers. Three convolutional layers are followed by an LSTM layer. The temporal relationship in the extracted local dependency information is analyzed by the LSTM layer.
The last part consists of three fully connected layers. The temporal and static information extracted by the LSTM part and the CNN part are combined into a new feature sequence, which is the input to the fully connected layers. These layers output the final prediction of the model, namely the trend position. In addition, random deactivation (dropout) is applied between the fully connected layers; it disconnects some nodes between two connected layers so that the final result does not depend entirely on any particular node, which significantly reduces overfitting.
Step 2.2: activation function
ReLU is used as the activation function in the LSTM network, as given by formula (7):
φ(x) = max(0, x) (7)
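For reference, formula (7) can be written directly in code; as printed, it is the rectified linear unit. The sketch also includes the exponential linear unit (ELU), a related activation that differs only for negative inputs. The `alpha` parameter and its default are illustrative assumptions and do not come from the text.

```python
import math


def relu(x):
    """Formula (7): phi(x) = max(0, x)."""
    return max(0.0, x)


def elu(x, alpha=1.0):
    """ELU for comparison: x for positive inputs, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


for value in (-2.0, 0.0, 3.0):
    print(value, relu(value), elu(value))
```

The two functions agree for positive inputs; for negative inputs formula (7) returns zero, whereas ELU returns a small negative value.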
The network structure used for training is shown in Fig. 3. The network model must be configured for training: the activation function and the input and output dimensions are set in the LSTM units, and the activation function and the convolution kernel size are set in the CNN units. In the experiments of the invention, the LSTM-CNN network model is trained with the structural parameters shown in Table 1.
TABLE 1
Step 2.3: constructing a model training strategy, designing a weighting loss function for the training of the model, and comprising the following steps:
the value range of the trend position is a limited interval [0,100], the application values of different positions corresponding to the interval are different, and the invention performs weighting measurement on the interval to provide a cost-sensitive loss function of a turning region.
Firstly, the target value y and the predicted value ŷ are scaled and translated so that they lie in the interval [-1, 1]; formula (8) is the conversion method, in which c is a small real number that ensures that the argument of the logarithm and the denominator are not zero. The arctanh function, formula (9), is then used to map the target value z and the predicted value ẑ, and the weight assignment is realized through this mapping; Y and Ŷ are the true value and the predicted value after transformation and weighting, respectively.
The arctanh mapping curve is shown in Fig. 4. As can be seen from the figure, under this mapping the values change more and more steeply toward both ends, while the change in the middle region is relatively small and the curve is smooth overall. After weighting, the error in the two end regions is larger than the original error, and the closer to the ends, the larger the increase, so that different regions receive different weights; the actual weighting effect is shown in Fig. 5.
After the target value and the predicted value are subjected to mapping weighting, MSE is applied to measure errors of the target value and the predicted value, and WMSE (10) which is a loss function of the prediction model is obtained. By weighting, the training speed can be accelerated, and the complexity of the model can be reduced to a certain extent.
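A minimal sketch of the weighted loss follows. Formulas (8) to (10) are not reproduced in this text, so the scaling `(y - 50) / (50 + c)`, the default value of `c`, and the clipping of predictions to [0, 100] are all assumptions; only the overall shape (scale to [-1, 1], apply arctanh, then take the MSE) comes from the description. The function names are hypothetical.

```python
import math


def to_unit_interval(y, c=1e-3):
    """Scale a trend position from [0, 100] into the open interval (-1, 1).

    Assumption: formula (8) is not shown; c keeps the result strictly inside
    (-1, 1) so that arctanh stays finite.
    """
    y = min(max(y, 0.0), 100.0)  # assumed clipping for predictions outside [0, 100]
    return (y - 50.0) / (50.0 + c)


def weighted_mse(targets, predictions, c=1e-3):
    """MSE between arctanh-mapped targets and predictions (a sketch of WMSE)."""
    total = 0.0
    for y, y_hat in zip(targets, predictions):
        mapped_y = math.atanh(to_unit_interval(y, c))
        mapped_y_hat = math.atanh(to_unit_interval(y_hat, c))
        total += (mapped_y - mapped_y_hat) ** 2
    return total / len(targets)


# The same 5-point error costs far more near a turning region than mid-trend.
print(weighted_mse([0.0], [5.0]))
print(weighted_mse([50.0], [55.0]))
```

Because arctanh steepens toward ±1, errors near trend positions 0 and 100 are amplified relative to errors near 50, which is the weighting behavior the text describes.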
Step 3: evaluation criteria
The main prediction targets of the invention are the trend position and the bottom turning point, and the accuracy of each is assessed separately.
The accuracy of the trend position is evaluated with the mean absolute error (MAE), and the accuracy of the turning points is evaluated with two purpose-designed indicators, the bottom precision (LP) and the top recall (HR).
Precision and recall are common evaluation indicators for binary classification problems. In a classification algorithm, after the prediction result is compared with the real target, the prediction state can be represented by the following four labels.
TP: the true value is the positive class and the predicted value is the positive class;
FN: the true value is the positive class and the predicted value is the negative class;
FP: the true value is the negative class and the predicted value is the positive class;
TN: the true value is the negative class and the predicted value is the negative class.
Here T is True, F is False, P is Positive, and N is Negative. The precision (11) and the recall (12) are calculated from these values:
Precision = TP / (TP + FP) (11)
Recall = TP / (TP + FN) (12)
As can be seen from the formulas, precision is the proportion of positive predictions whose true value is positive, and recall is the proportion of true positives that are predicted as positive.
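As a small illustration, the four counts and the two ratios can be computed as below, using the standard definitions (FN: truly positive but predicted negative; TN: truly negative and predicted negative). The function names and the sample labels are hypothetical.

```python
def confusion_counts(truth, predicted):
    """Count TP, FN, FP and TN for boolean true labels and predictions."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    return tp, fn, fp, tn


def precision(tp, fp):
    """Share of positive predictions that are truly positive."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Share of truly positive samples that are predicted positive."""
    return tp / (tp + fn)


truth = [True, True, False, False, True]
predicted = [True, False, True, False, True]
tp, fn, fp, tn = confusion_counts(truth, predicted)
print(tp, fn, fp, tn, precision(tp, fp), recall(tp, fn))
```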
For the bottom turning point, each predicted turning point gives the investor a strong buy signal, so every such prediction should be as accurate as possible, since a wrong one causes investment loss; precision is therefore the quantity of interest.
For the top turning point, each predicted turning point gives the investor a sell signal. If the investor holds the stock, as many actual top turning points as possible should be predicted in order not to miss a selling opportunity, since a missed one causes a loss; recall is therefore the quantity of interest.
In the classification problem, there are definite positive and negative classes, and the prediction is correct and wrong, but the experiment is a regression algorithm, a specific trend position value is obtained, and no definite difference is obtained, so that in order to measure the accuracy of the prediction of the trend turning point by using a similar method, the following method is used for measuring the result.
The bottom and the top are judged from the trend position, and turning points are judged by region: trend position values below 5 form the bottom region and values above 95 form the top region. When the predicted value is less than 5, the prediction is deemed to be near a bottom turning point, and its absolute error from the true value is measured as the bottom precision (LP). Similarly, when the true value is greater than 95, the price is deemed to be near a top turning point, and its absolute error from the predicted value is measured as the top recall (HR).
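The three measures can be sketched in Python as follows. The thresholds 5 and 95 come from the text; averaging the absolute errors over the qualifying points, returning `None` when no point qualifies, and the function names `mean_abs_error` and `evaluate` are assumptions made for illustration.

```python
def mean_abs_error(pairs):
    """Average |true - predicted| over (true, predicted) pairs; None if there are none."""
    pairs = list(pairs)
    if not pairs:
        return None
    return sum(abs(t - p) for t, p in pairs) / len(pairs)


def evaluate(y_true, y_pred, bottom=5.0, top=95.0):
    """Return MAE over all points, LP over predicted-bottom points, HR over true-top points."""
    pairs = list(zip(y_true, y_pred))
    mae = mean_abs_error(pairs)
    # LP conditions on the prediction, mirroring precision.
    lp = mean_abs_error((t, p) for t, p in pairs if p < bottom)
    # HR conditions on the true value, mirroring recall.
    hr = mean_abs_error((t, p) for t, p in pairs if t > top)
    return {"MAE": mae, "LP": lp, "HR": hr}


# Illustrative values only, not data from the experiment.
y_true = [0.0, 4.0, 50.0, 98.0, 100.0]
y_pred = [3.0, 10.0, 46.0, 90.0, 97.0]
print(evaluate(y_true, y_pred))
```

Note that LP selects points by the predicted value and HR selects points by the true value, which is what makes them analogues of precision and recall respectively.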
In classification problems, precision and recall usually conflict, and in extreme cases one is very high while the other is very low. LP and HR therefore evaluate the prediction model only at the turning points or in the turning regions. For an overall measure, because the invention performs regression, the mean absolute error (MAE) mentioned above is used to reflect how well the model fits the trend as a whole. By definition, the trend position is essentially a percentage scale of the trend: the error at a single point is at most 100 and at least 0, so the mean absolute error reflects the prediction quality intuitively.
Prediction results of this example: with time window M = 100 and trend threshold δ = 60%, the error measures LP, HR, and MAE are 8.07%, 11.86%, and 12.72%, respectively. Fig. 6 shows the prediction results for Baosteel stock with M = 100 and δ = 60%.
Claims (4)
1. A stock trend prediction method based on an LSTM-CNN deep learning model is characterized by comprising the following steps:
step 1: data pre-processing
Stock data of a stock market are selected for analysis and verification, and an integral data set is divided into a training set and a testing set;
step 1.1: feature selection and normalization
The features are transaction data and technical indicators; the transaction data need to be normalized, whereas the technical indicators are calculated from the transaction data and their calculation already includes a normalization process;
step 1.2: calibrating trend positions
Studying the medium- and long-term price trend of stocks, defining turning points, quantifying the trend according to the turning points to obtain trend positions, and calibrating the data labels according to the trend positions;
step 2: the stock price trend is predicted by using an LSTM-CNN prediction model, and the method comprises the following steps:
step 2.1: constructing a network model, comprising the following steps:
the LSTM-CNN model comprises three parts: the first part is an LSTM time-series feature learning layer; the second part is a CNN static feature learning layer, which runs in parallel with the LSTM part; the last part is a fully connected output layer, in which the outputs of the first two parts are concatenated and fed into a fully connected neural network;
step 2.2: activation function
ReLU is used as the activation function, as given by formula (7):
φ(x)=max(0,x) (7)
the network model needs to be configured for training: an activation function and the input and output dimensions need to be set in the LSTM unit, and an activation function and the convolution kernel size need to be set in the CNN unit;
step 2.3: constructing a model training strategy, designing a weighting loss function for the training of the model, and comprising the following steps:
the value range of the trend position is a limited interval [0,100], the application values of different positions corresponding to the interval are different, and the invention performs weighting measurement on the interval to provide a loss function sensitive to the cost of a turning region;
first, the target value y and the predicted value ŷ are scaled and translated so that they lie in the interval [-1, 1]; formula (8) gives this conversion, where c is a small real number that ensures the arguments of the logarithm and the denominator are not zero; the arctanh function of formula (9) is then used to map the converted target value z and predicted value ẑ, and the weight assignment is realized through this mapping; Y and Ŷ are respectively the true value and the predicted value after the transformation and weighting;
after the target value and the predicted value are subjected to mapping weighting, MSE is applied to measure errors of the target value and the predicted value, WMSE (10) which is a loss function of the prediction model is obtained, the training speed can be accelerated through weighting, and meanwhile, the complexity of the model is reduced to a certain extent;
step 3: evaluation criteria
The prediction targets are the trend position and the bottom turning point, and the accuracy of each is considered separately;
the accuracy of the trend position is evaluated with the mean absolute error MAE, and the accuracy at the turning points is evaluated with two purpose-designed indexes, the bottom precision LP and the top recall HR;
precision and recall are common evaluation indexes for binary classification problems; in a classification algorithm, after the prediction result is compared with the true target, the prediction state is represented by one of the following four marks;
TP: the true value is the positive class and the predicted value is the positive class;
TN: the true value is the negative class and the predicted value is the negative class;
FP: the true value is the negative class and the predicted value is the positive class;
FN: the true value is the positive class and the predicted value is the negative class;
wherein T is True, F is False, P is Positive and N is Negative, and precision (11) and recall (12) are calculated from these counts; according to the formulas, precision is the proportion of samples predicted positive that are actually positive, and recall is the proportion of actually positive samples that are predicted positive;
for the bottom turning point, each predicted occurrence provides a strong buy signal, so every such prediction should be as accurate as possible, otherwise investment loss is caused, and precision is therefore the quantity of interest;
for the top turning point, each predicted occurrence provides a sell signal, and as many real top turning points as possible should be predicted so that no selling opportunity is missed, otherwise investment loss is caused, and recall is therefore the quantity of interest.
2. The method for predicting stock trend based on LSTM-CNN deep learning model as claimed in claim 1, wherein in step 1.1, the original transaction data are first normalized; because stock market data contain continuously rising or falling markets, and new highs or new lows are a common phenomenon in financial data, normalization methods are designed separately for the price and the trading volume;
price normalization: the stock data of each time window are quantified by taking the first value of the stock sequence as the reference price and calculating the relative rise and fall, and the data in each time window are normalized independently, as shown in formula (1);
for each training sample, the closing prices at times T0, T1, T2, …, Ti, …, Tn are P0, P1, P2, …, Pi, …, Pn; the feature value obtained by normalization at time Ti is Cxi, and C is a constant term;
trading volume normalization: the calculation is carried out over the whole time series of a stock, as shown in formula (2):
at time Ti the trading volume is Vi, and the feature value obtained by volume normalization is Vxi; the constant term n is generally set equal to the size of the time window, and the first n time points of a stock are used only for data processing and are not included in the samples;
after the closing price and the trading volume have been normalized, the technical indicators MACD, RSI and KDJ-J of the sequence are calculated, and all of them are used as input data.
3. The LSTM-CNN deep learning model-based stock trend prediction method of claim 1 or 2, wherein in step 1.2, the trend turning points comprise top turning points at peak positions and bottom turning points at valley positions; a price at a bottom turning point means that the price is about to rise, and vice versa; in an actual market, prices fluctuate constantly with different amplitudes and produce many turning points, so the turning points to be identified by the model are quantified, with the specific definition as follows:
in a stock price sequence, the closing prices at times T1, T2 and T3 are P1, P2 and P3 respectively, and a parameter called the trend threshold δ is set; when condition (4) is satisfied and the price at time T2 is the lowest point in the period from T1 to T3, time T2 is considered to be the bottom turning point of a trend;
similarly, when condition (5) is satisfied and the price at time T2 is the highest point in the period from T1 to T3, time T2 is considered to be the top turning point of a trend; it is worth noting that the calculations for both the bottom turning point and the top turning point are based on the low price; otherwise, under the same threshold δ, the absolute fluctuation required to establish a bottom turning point and a top turning point would differ, introducing bias;
after a certain bottom turning point is determined, the subsequent sequence is defined as an ascending trend until the next top turning point is met, and similarly, after a certain top turning point is determined, the subsequent sequence is defined as a descending trend until the next bottom turning point is met;
after determining the turning points at the bottom and the top of the time sequence, mapping the current stock price to the relative position in the trend, and using the trend position as the direct prediction target of an LSTM-CNN model to realize the prediction of the future price trend, wherein the calibration method of the trend position comprises the following steps:
1.2.1, according to the definition of the turning points, obtaining all bottom turning points and top turning points of a group of sequences, wherein the trend position at the bottom turning point is marked as 0, and the trend position at the top turning point is marked as 100;
1.2.2, the trend position values between the bottom and top turning points are calculated according to formula (6), where PL and PH denote a pair of adjacent bottom and top turning points;
for the last trend of the time series, the top or bottom turning point that ends it cannot yet be determined, so this undetermined part is not labeled and is not included in the training data.
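A minimal Python sketch of the calibration in steps 1.2.1 and 1.2.2 follows. Formula (6) is not reproduced in this text, so the linear form `100 * (P - PL) / (PH - PL)` is an assumption chosen to agree with the stated labels of 0 at a bottom turning point and 100 at a top turning point. The function name `trend_positions` and the `(index, kind)` representation of turning points are hypothetical, and the turning points are taken as given rather than detected with conditions (4) and (5).

```python
def trend_positions(prices, turning_points):
    """Label prices with trend positions in [0, 100].

    turning_points: list of (index, kind) sorted by index, where kind is
    "bottom" or "top" and kinds alternate. Points before the first or after
    the last turning point stay None (unlabeled, excluded from training).
    """
    labels = [None] * len(prices)
    for (i, kind_i), (j, kind_j) in zip(turning_points, turning_points[1:]):
        if kind_i == kind_j:
            raise ValueError("turning points must alternate")
        if kind_i == "bottom":
            low, high = prices[i], prices[j]
        else:
            low, high = prices[j], prices[i]
        if high <= low:
            raise ValueError("top price must exceed bottom price")
        for t in range(i, j + 1):
            # Assumed form of formula (6): linear position between PL and PH.
            pos = 100.0 * (prices[t] - low) / (high - low)
            # Safeguard only; prices between adjacent turning points should
            # already lie within [low, high].
            labels[t] = min(max(pos, 0.0), 100.0)
    return labels


# Illustrative prices only: a bottom at index 1, a top at 4, a bottom at 6.
prices = [12, 10, 13, 16, 20, 17, 14, 15]
points = [(1, "bottom"), (4, "top"), (6, "bottom")]
print(trend_positions(prices, points))
```

The first and last entries stay unlabeled because no turning point bounds them on one side, matching the rule that the undetermined tail of the series is left out of the training data.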
4. The LSTM-CNN deep learning model based stock trend prediction method of claim 1 or 2, wherein in step 2.1, the first part is composed of three LSTM layers; the LSTM input is a matrix whose dimensions are given by k, the number of training features, and M, the size of the sliding time window; layer L1 contains LSTM units corresponding to the M time nodes, and the input of each LSTM unit consists of k features, including transaction data and technical indicators; the output of each trained unit in layer L1 is the input of the corresponding unit in layer L2, and likewise the output of layer L2 is the input of layer L3; layers L2 and L3 are trained in the same way as layer L1; finally, the output of the LSTM unit corresponding to the last time node in layer L3 is taken as the result of the first part;
the second part consists of three convolutional layers and one LSTM layer, and its input matrix is the same as that of the first part; the three convolutional layers extract local correlation information between the price and the technical indicators and are followed by an LSTM layer, which analyzes the temporal relationships in the extracted local dependency information;
the last part consists of three fully connected layers; the temporal and static information extracted by the LSTM and the CNN respectively is combined into a new feature sequence, which is input to the fully connected layers, and the final prediction result of the model, namely the trend position, is output; in addition, a random inactivation (dropout) method is applied between the fully connected layers to disconnect some nodes between two connected layers, so that the final result does not depend entirely on specific nodes, which markedly reduces over-fitting.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910723665.8A CN110659767A (en) | 2019-08-07 | 2019-08-07 | Stock trend prediction method based on LSTM-CNN deep learning model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110659767A true CN110659767A (en) | 2020-01-07 |
Family
ID=69036431
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110659767A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112308306A (en) * | 2020-10-27 | 2021-02-02 | 贵州工程应用技术学院 | Multi-mode input coal and gas outburst risk prediction method |
CN113792258A (en) * | 2021-09-18 | 2021-12-14 | 广东电网有限责任公司广州供电局 | Method for determining contribution rate of power grid enterprise informatization investment |
CN113807951A (en) * | 2021-09-23 | 2021-12-17 | 中国建设银行股份有限公司 | Transaction data trend prediction method and system based on deep learning |
CN113807951B (en) * | 2021-09-23 | 2024-10-15 | 中国建设银行股份有限公司 | Transaction data trend prediction method and system based on deep learning |
CN117725522A (en) * | 2023-12-18 | 2024-03-19 | 易方达基金管理有限公司 | New stock release trend prediction method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20200107 |