CN110659767A

CN110659767A - Stock trend prediction method based on LSTM-CNN deep learning model

Info

Publication number: CN110659767A
Application number: CN201910723665.8A
Authority: CN
Inventors: 张端; 施佳琴; 赵傲
Original assignee: Zhejiang University of Technology ZJUT
Current assignee: Zhejiang University of Technology ZJUT
Priority date: 2019-08-07
Filing date: 2019-08-07
Publication date: 2020-01-07

Abstract

A stock trend prediction method based on an LSTM-CNN deep learning model comprises the following steps: step 1: data preprocessing; step 2: predicting the stock price trend with an LSTM-CNN prediction model; and step 3: evaluation criteria, in which the prediction targets are the trend position and the bottom turning point, and the accuracy of each is assessed separately. The invention proposes an LSTM-CNN neural network model and a weighted loss function, optimizes the normalization algorithm and the trend-position target, and accurately predicts the fluctuation trend of stocks.

Description

A stock trend prediction method based on LSTM-CNN deep learning model

Technical field

The invention relates to research on financial time series, and specifically to predicting stock trends by combining two neural networks, LSTM and CNN.

Background

With the development of the global economy, China's financial market faces greater opportunities and challenges. The financial market covers a wide range of products, mainly bonds, futures, foreign exchange and stocks, and their price fluctuations are a central concern.

As an important part of the financial market, stocks serve as its bellwether. Stock market volatility is closely tied to investors' interests and has become a focus of attention for many investors and financial institutions.

Quantitative analysis of historical stock price fluctuations and trading data, supported by data mining, artificial intelligence and related techniques, provides a valuable reference for investors. In recent years, technical breakthroughs in artificial intelligence have once again drawn public attention to the field.

Hochreiter S and Schmidhuber J proposed the LSTM neural network model, which is suited to feature analysis of data with time-series characteristics. The prototype of the convolutional neural network (CNN) was proposed early on by Fukushima K; a CNN tends to recognize static features. This also suits financial data, where adjacent time points are strongly related and local features can therefore be extracted.

Summary of the invention

Unlike current approaches that apply a single neural network to analyze and predict stock data, the invention analyzes stock price fluctuations with an LSTM-CNN network model, extracts time-series and static features from stock data through the neural network, and predicts the turning points of the stock trend and its future direction.

The technical solution adopted by the invention to solve this technical problem is:

A stock trend prediction method based on an LSTM-CNN deep learning model, comprising the following steps:

Step 1: Data Preprocessing

Stock data from the stock market are selected for analysis and verification, and the overall data set is divided into a training set and a test set;

Step 1.1: Feature Selection and Normalization

The features are trading data and technical indicators. The trading data must be normalized; the technical indicators are calculated from the trading data, and their calculation already includes normalization;

The raw trading data are normalized first. In sustained rising or falling markets, new highs and new lows are common in financial data, so separate normalization methods are designed for price and for trading volume;

Price normalization: the stock data in each time window are quantified. The first value of the stock series in the window is taken as the reference price and the relative rise or fall is calculated against it; as shown in formula (1), the data in each time window are normalized separately;

For each training sample, the closing prices at times t₀, t₁, …, t_i, …, t_m are P₀, P₁, …, P_i, …, P_m, where μ is the mean closing price and m is the sliding window size;
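Formula (1) itself is not reproduced in this text, so the following is only a minimal sketch of per-window price normalization under one assumption: each close is expressed as its relative change against the first close of the window. The function name `normalize_window` is hypothetical, and the role of the mean closing price μ in the original formula is not known, so it is not used here.

```python
def normalize_window(closes):
    """Relative rise or fall of each close against the first close of the window.

    Assumption: formula (1) is (P_i - P_0) / P_0; the source does not show it.
    """
    base = closes[0]
    if base == 0:
        raise ValueError("the first close of the window must be non-zero")
    return [(p - base) / base for p in closes]


if __name__ == "__main__":
    window = [10.0, 10.5, 9.0, 12.0]
    print(normalize_window(window))
```

Because every window is rescaled against its own first value, a stock that keeps making new highs still yields inputs of comparable magnitude from one window to the next.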

Volume normalization: computed over the whole time series of a stock, as shown in formula (2):

At time T_i the trading volume is V_i, and the feature value obtained by volume normalization is Vx_i. The constant m is usually equal to the time window size, so the first m time points of a stock are used only for data processing and are not included in the samples;
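Formula (2) is likewise missing from this text. The sketch below assumes, purely for illustration, that each volume is divided by the mean of the m volumes preceding it; that assumption is at least consistent with the statement that the first m time points serve only for data processing. The name `normalize_volume` and the ratio form are hypothetical.

```python
def normalize_volume(volumes, m):
    """Ratio of each volume to the mean of the m volumes before it.

    Assumption: this trailing-mean ratio stands in for formula (2), which
    the source does not show. The first m points produce no feature value.
    """
    if m <= 0:
        raise ValueError("m must be positive")
    features = []
    for i in range(m, len(volumes)):
        mean_prev = sum(volumes[i - m:i]) / m
        # Guard against a window of zero volume (e.g. a suspended stock).
        features.append(volumes[i] / mean_prev if mean_prev else 0.0)
    return features


if __name__ == "__main__":
    print(normalize_volume([100.0, 200.0, 300.0, 400.0], m=2))
```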

Once the closing price and trading volume have been normalized, the technical indicators MACD, RSI and KDJ-J are calculated for the series, and all of them are used as input data;

Step 1.2: Calibrate the Trend Position

The medium- and long-term price trend of a stock is studied and turning points are defined. The trend is quantified according to the turning points to obtain the trend position, and the data labels are calibrated accordingly;

Trend turning points comprise top turning points at peaks and bottom turning points at troughs. A price at a bottom turning point means the price is about to rise, and vice versa. In a real market the price fluctuates continually with varying amplitude, which produces many turning points, so the turning points to be recognized by the model are quantified. The specific definitions are as follows:

Let the closing prices of the stock price series at times T₁, T₂ and T₃ be P₁, P₂ and P₃, and let a parameter δ be the trend threshold. When condition (4) is satisfied and the price at T₂ is the lowest in the period from T₁ to T₃, T₂ is regarded as the bottom turning point of a trend;

Similarly, when condition (5) is satisfied and the price at T₂ is the highest in the period from T₁ to T₃, T₂ is regarded as the top turning point of a trend, as shown in the right-hand panel of Figure 1. Note that both the bottom and the top turning point calculations should take the lower price as the reference; otherwise, under the same threshold δ, the absolute fluctuation needed to establish a bottom turning point differs from that needed for a top turning point, which introduces a bias;

Once a bottom turning point has been determined, the series that follows is defined as an upward trend until the next top turning point. Likewise, once a top turning point has been determined, the series that follows is defined as a downward trend until the next bottom turning point;
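Conditions (4) and (5) are not reproduced in this text. As a hedged sketch, the checks below assume that both conditions compare the relative price change on each side of T₂ with δ, always dividing by the lower of the two prices, as the note on the low-price reference suggests. The function names and the exact inequalities are assumptions.

```python
def is_bottom(prices, t1, t2, t3, delta):
    """True if T2 is the lowest close in [T1, T3] and the move on both sides,
    measured against the low P2, is at least delta (assumed form of (4))."""
    p1, p2, p3 = prices[t1], prices[t2], prices[t3]
    lowest = p2 == min(prices[t1:t3 + 1])
    return lowest and (p1 - p2) / p2 >= delta and (p3 - p2) / p2 >= delta


def is_top(prices, t1, t2, t3, delta):
    """True if T2 is the highest close in [T1, T3] and the move on both sides,
    measured against the lower prices P1 and P3, is at least delta
    (assumed form of (5))."""
    p1, p2, p3 = prices[t1], prices[t2], prices[t3]
    highest = p2 == max(prices[t1:t3 + 1])
    return highest and (p2 - p1) / p1 >= delta and (p2 - p3) / p3 >= delta


if __name__ == "__main__":
    falling_then_rising = [16.0, 12.0, 10.0, 13.0, 17.0]
    print(is_bottom(falling_then_rising, 0, 2, 4, delta=0.5))
    print(is_top(falling_then_rising, 0, 2, 4, delta=0.5))
```

Dividing by the lower price in both checks means a 10 to 16 rise and a 16 to 10 fall count as the same 60% move, which is the symmetry the text asks for.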

Once the bottom and top turning points of the time series have been determined, the current stock price is mapped to its relative position within the trend. The trend position is the direct prediction target of the LSTM-CNN model, which enables prediction of the future price trend. The trend position is calibrated as follows:

1.2.1, from the definition of turning points, all bottom and top turning points of a series can be obtained; the trend position at a bottom turning point is set to 0 and the trend position at a top turning point is set to 100;

1.2.2, the trend position of a point between a bottom turning point and a top turning point is calculated with formula (6), where P_L and P_H are a pair of adjacent bottom and top turning points;

For the last trend segment of the time series, the final top or bottom turning point cannot be determined, so the undetermined part is left unlabeled and excluded from the training data;

Step 2: Predict the stock price trend with the LSTM-CNN prediction model, as follows:

Step 2.1: Network model construction, as follows:

The LSTM-CNN model has three parts. The first is the LSTM layer group that learns time-series features; the second is the CNN layer group that learns static features and runs in parallel with the LSTM part; the last is the fully connected output part, which concatenates the outputs of the first two parts and feeds them into a fully connected neural network.

The first part consists of three LSTM layers, as shown in part (a) of Figure 3. The LSTM input is a matrix whose two dimensions are k and m, where k is the number of training features and m is the size of the sliding time window. As the figure shows, layer L1 contains the LSTM units corresponding to the m time nodes, and the input of each LSTM unit consists of k features, comprising trading data and technical indicators. After training, the output of each unit in layer L1 is the input of the corresponding unit in layer L2, and likewise the output of layer L2 is the input of layer L3. Layers L2 and L3 are trained in the same way as layer L1. Finally, the output of the LSTM unit at the last time node of layer L3 is taken as the result of part (a);

The second part consists of three convolutional layers and one LSTM layer, as shown in part (b) of Figure 3. Its input matrix is the same as in part (a). The three convolutional layers first extract local correlations between price and technical indicators. They are followed by one LSTM layer, which analyzes the temporal relationships in the extracted local-dependency information;

The last part consists of three fully connected layers. The temporal information extracted by the LSTM part and the static information extracted by the CNN part are merged into a new feature sequence, which is the input of the fully connected layers. The final output is the model's prediction, namely the trend position. In addition, dropout is applied between the fully connected layers; it disconnects some nodes between two connected layers so that the final result does not depend entirely on particular nodes. Dropout markedly reduces overfitting;
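To make the data flow through the three parts concrete, here is a small shape-tracing sketch in plain Python. It does not train anything; it only records the per-sample tensor shape after each stage. The layer widths (`lstm_units`, `conv_filters`, `fc_units`), the use of 1-D convolutions along the time axis, and the "same" padding are all illustrative assumptions, since the configuration table is not reproduced in this text.

```python
def trace_shapes(m=100, k=5, lstm_units=64, conv_filters=32, fc_units=(64, 32, 1)):
    """Return the tensor shape after each stage for one sample (no batch axis).

    All layer sizes are hypothetical placeholders, not values from the source.
    """
    shapes = {"input": (m, k)}

    # Part (a): three stacked LSTM layers. The first two pass on the full
    # sequence; only the output at the last time node of the third is kept.
    shapes["a_lstm1"] = (m, lstm_units)
    shapes["a_lstm2"] = (m, lstm_units)
    shapes["a_lstm3_last"] = (lstm_units,)

    # Part (b): three convolutions along time (assumed 'same' padding, so the
    # m time steps are preserved), followed by one LSTM layer.
    for i in (1, 2, 3):
        shapes[f"b_conv{i}"] = (m, conv_filters)
    shapes["b_lstm_last"] = (lstm_units,)

    # Output part: concatenate both branch vectors, then three fully
    # connected layers (with dropout between them) ending in one value.
    shapes["merged"] = (lstm_units + lstm_units,)
    for i, units in enumerate(fc_units, start=1):
        shapes[f"fc{i}"] = (units,)
    return shapes


if __name__ == "__main__":
    for name, shape in trace_shapes().items():
        print(f"{name:>14}: {shape}")
```

The last fully connected layer has a single unit because the model regresses one number per sample, the trend position.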

Step 2.2: Activation function

The LSTM network uses the activation function given in formula (7). The text names it ELU, although the expression as written, max(0, x), is the rectified linear (ReLU) form:

φ(x) = max(0, x)    (7)
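For reference, the expression max(0, x) is the standard definition of ReLU, whereas ELU in its usual form returns α(eˣ − 1) for non-positive inputs. The sketch below places the two side by side; which one the authors actually trained with cannot be determined from this text, and the default `alpha=1.0` is the conventional value, not one taken from the source.

```python
import math


def relu(x):
    """Rectified linear unit: the expression written in formula (7)."""
    return max(0.0, x)


def elu(x, alpha=1.0):
    """Exponential linear unit in its standard form: identity for x > 0,
    alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 1.5):
        print(x, relu(x), round(elu(x), 4))
```

The two functions agree for positive inputs and differ only for negative ones, where ELU stays slightly below zero instead of clipping to zero.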

The network structure used for training is shown in Figure 3. Training requires the network model to be configured: for the LSTM units, the activation function and the input and output dimensions must be set; for the CNN units, the activation function and the convolution kernel size must be set;

Step 2.3: Model training strategy. A weighted loss function is designed for training the model, as follows:

The trend position takes values in the finite interval [0, 100], and different positions within this interval differ in practical value. The invention therefore weights them and proposes a loss function that is cost-sensitive in the turning regions;

First, the target value y and the predicted value ŷ are scaled and shifted proportionally so that the target value lies in the interval [-1, 1]. Formula (8) gives the conversion, where c is a small real number that keeps the argument of the logarithm and the denominator from being zero. The arctanh function, formula (9), is then applied to the scaled target value z and predicted value ẑ. This mapping performs the weight assignment; Y and Ŷ are the transformed, weighted true value and predicted value, respectively;

After the target and predicted values have been mapped and weighted, MSE is applied to measure the error between them, giving WMSE (10), which is the loss function of the prediction model. The weighting speeds up training and, to some extent, reduces model complexity;
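Formulas (8) to (10) are not reproduced in this text, so the following is a sketch under stated assumptions rather than the patented loss. It assumes the scaling z = (y − 50) / (50 + c), which keeps z strictly inside (−1, 1), then applies arctanh and takes the mean squared error of the mapped values. The names `scale` and `wmse`, the default value of `c`, and the clamping of predictions to [0, 100] are all illustrative choices.

```python
import math


def scale(y, c=1e-6):
    """Map a trend position in [0, 100] into the open interval (-1, 1).

    Assumed form of formula (8); c keeps |z| strictly below 1 so that
    arctanh stays finite.
    """
    return (y - 50.0) / (50.0 + c)


def wmse(y_true, y_pred, c=1e-6):
    """Mean squared error between arctanh-mapped targets and predictions."""
    total = 0.0
    for y, y_hat in zip(y_true, y_pred):
        # Assumption: predictions are clamped to the valid range first.
        y_hat = min(100.0, max(0.0, y_hat))
        diff = math.atanh(scale(y, c)) - math.atanh(scale(y_hat, c))
        total += diff * diff
    return total / len(y_true)


if __name__ == "__main__":
    # The same 3-point miss costs more near a turning region than mid-trend.
    print(wmse([98.0], [95.0]))
    print(wmse([50.0], [47.0]))
```

Because arctanh steepens toward ±1, an error of fixed size is penalized more heavily near 0 or 100 than near 50, which is the cost-sensitive behavior the text describes.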

Step 3: Error Evaluation Criteria

The prediction targets are the trend position and the bottom turning point, and the accuracy of each is assessed separately;

Trend position accuracy is evaluated with the mean absolute error (MAE). For turning point accuracy, two metrics are designed: bottom precision (LP) and top recall (HR);

Precision and recall are common evaluation metrics for binary classification. In a classification algorithm, after the prediction is compared with the true target, the following four labels describe the prediction state;

TP: the true value is positive and the predicted value is positive;

TN: the true value is negative and the predicted value is negative;

FP: the true value is negative and the predicted value is positive;

FN: the true value is positive and the predicted value is negative.

Here T stands for True, F for False, P for Positive and N for Negative. Precision (11) and recall (12) are calculated from these counts. From the formulas, precision is the proportion of cases predicted positive that are actually positive, and recall is the proportion of actually positive cases that are predicted positive;
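Formulas (11) and (12) are not reproduced in this text, but precision and recall have standard definitions: precision = TP / (TP + FP) and recall = TP / (TP + FN). A minimal sketch follows; returning 0.0 when a denominator is zero is a convention chosen here, not something stated in the source.

```python
def precision(tp, fp):
    """Share of predicted positives that are actually positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Share of actual positives that were predicted positive."""
    return tp / (tp + fn) if (tp + fn) else 0.0


if __name__ == "__main__":
    # 8 true positives, 2 false positives, 8 false negatives.
    print(precision(tp=8, fp=2))
    print(recall(tp=8, fn=8))
```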

For bottom turning points, each predicted turning point gives a strong buy signal, so every such prediction should be as accurate as possible; otherwise it leads to investment losses. Precision is therefore the metric of interest;

For top turning points, each predicted turning point gives a sell signal. Every top that actually occurs should be predicted as far as possible, so that selling opportunities are not missed and investors do not incur losses. Recall is therefore the metric of interest.

The beneficial effects of the invention are mainly as follows: the invention proposes an LSTM-CNN neural network model and a weighted loss function, optimizes the normalization algorithm and the trend-position target, and accurately predicts the fluctuation trend of stocks.

Description of drawings

FIG. 1 is a diagram defining the turning points of the invention.

FIG. 2 is an example of trend positions according to the invention.

FIG. 3 is a schematic diagram of the LSTM-CNN network structure of the invention.

FIG. 4 is the arctanh mapping curve of the invention.

FIG. 5 shows the effect of target weighting in the invention.

FIG. 6 shows the LSTM-CNN prediction of the trend position for Baosteel according to the invention.

Detailed description

The invention is further described below with reference to the accompanying drawings.

Referring to FIGS. 1 to 6, a stock trend prediction method based on an LSTM-CNN deep learning model comprises the following steps:

Step 1: Data Preprocessing

Data for a subset of stocks in the A-share market are used for analysis and verification, and the overall data set is divided into a training set and a test set. Stocks in the CSI 300 index form the training set and stocks in the SSE 50 index form the test set. Newly listed stocks, recently listed (sub-new) stocks, stocks under long-term suspension, ST stocks and stocks with large abnormal fluctuations are excluded. In total, 218 stocks are included in the training set and 37 stocks in the test set. Stock data from 2004 to December 2018 are selected, and weekly trading data such as the opening price, closing price, highest price and trading volume are obtained.

The main features are trading data and technical indicators. The five training features are Cx, Vx, MACD, KDJ-J and RSI, which denote the closing price, the trading volume, the MACD indicator, the KDJ-J indicator and the RSI indicator, respectively. Each stock in the training set has about 500 time-series points, and the sliding window advances with a step of 1. Each training time point corresponds to a trend position, which is the target of that sample, so the target matrix has as many entries as there are samples.
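The sample construction described above can be sketched as follows. The sketch assumes that each window of m consecutive time points is paired with the trend position of its last time point, and that time points without a label are skipped; the text does not state explicitly which point in the window carries the target. The name `make_samples` and the use of `None` for unlabeled points are illustrative.

```python
def make_samples(features, targets, m):
    """Slide a window of m time points over the series with step 1.

    features: one feature vector per time point (e.g. Cx, Vx, MACD, KDJ-J, RSI).
    targets:  trend position per time point, or None where no label exists.
    Assumption: the target of a window is the label of its last time point.
    """
    samples = []
    for end in range(m, len(features) + 1):
        target = targets[end - 1]
        if target is None:
            continue  # unlabeled points are not used for training
        samples.append((features[end - m:end], target))
    return samples


if __name__ == "__main__":
    feats = [[float(i)] for i in range(6)]
    labels = [None, 10.0, 20.0, 30.0, 40.0, None]
    for window, target in make_samples(feats, labels, m=3):
        print(window, target)
```

With about 500 weekly points per stock and m = 100, this scheme would give at most about 400 windows per stock before unlabeled points are removed.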

The main features are trading data and technical indicators, and the trading data must be normalized. The technical indicators are calculated from the trading data and their calculation already includes normalization, so they need no further processing.

The raw trading data are normalized first. In sustained rising or falling markets, new highs and new lows are common in financial data, so separate normalization methods are designed for price and for trading volume.

Price normalization: the stock data in each time window are quantified. The first value of the stock series in the window is taken as the reference price and the relative rise or fall is calculated against it; as shown in formula (1), the data in each time window are normalized separately.

For each training sample, the closing prices at times t₀, t₁, …, t_i, …, t_m are P₀, P₁, …, P_i, …, P_m, where μ is the mean closing price and the sliding window size is m = 100;

At time T_i the trading volume is V_i, and the feature value obtained by volume normalization is Vx_i. The constant m = 100 is usually equal to the time window size, so the first m = 100 time points of a stock are used only for data processing and are not included in the samples;

The closing price, trading volume, MACD, RSI and KDJ-J are used as input data.
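As an illustration of why an indicator such as RSI needs no extra normalization, the sketch below computes a simple-average RSI, which is bounded in [0, 100] by construction. The text does not give the indicator parameters or the smoothing variant, so the 14-period default, the simple (unsmoothed) averaging, and the neutral value 50 for a flat window are all assumptions.

```python
def rsi(closes, period=14):
    """Simple-average RSI: 100 * gains / (gains + losses) over `period` changes.

    Returns one value per close starting at index `period`.
    Assumption: simple averaging rather than Wilder's smoothing.
    """
    values = []
    for i in range(period, len(closes)):
        gains = 0.0
        losses = 0.0
        for j in range(i - period + 1, i + 1):
            change = closes[j] - closes[j - 1]
            if change > 0:
                gains += change
            else:
                losses -= change
        if gains + losses == 0:
            values.append(50.0)  # flat window: treated as neutral here
        else:
            values.append(100.0 * gains / (gains + losses))
    return values


if __name__ == "__main__":
    print(rsi([10.0, 10.5, 10.2, 10.8, 11.0, 10.9], period=3))
```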

Step 1.2: Calibrate the Trend Position

The medium- and long-term price trend of a stock is studied and turning points are introduced. The trend is quantified according to the turning points to obtain the trend position, and the data labels are calibrated accordingly.

Let the closing prices of the stock price series at times T₁, T₂ and T₃ be P₁, P₂ and P₃, and let the trend threshold parameter be δ = 60%. When condition (4) is satisfied and the price at T₂ is the lowest in the period from T₁ to T₃, T₂ is regarded as the bottom turning point of a trend, as shown in the left-hand panel of Figure 1.

Similarly, when condition (5) is satisfied and the price at T₂ is the highest in the period from T₁ to T₃, T₂ is regarded as the top turning point of a trend, as shown in the right-hand panel of Figure 1. Note that both the bottom and the top turning point calculations should take the lower price as the reference; otherwise, under the same threshold δ, the absolute fluctuation needed to establish a bottom turning point differs from that needed for a top turning point, which introduces a bias.

Once a bottom turning point has been determined, the series that follows is defined as an upward trend until the next top turning point. Likewise, once a top turning point has been determined, the series that follows is defined as a downward trend until the next bottom turning point.

Once the bottom and top turning points of the time series have been determined, the current stock price is mapped to its relative position within the trend. The trend position is the direct prediction target of the LSTM-CNN model, which enables prediction of the future price trend. The trend position is calibrated as follows:

1.2.1, from the definition of turning points, all bottom and top turning points of a series can be obtained; the trend position at a bottom turning point is set to 0 and the trend position at a top turning point is set to 100.

1.2.2, the trend position of a point between a bottom turning point and a top turning point is calculated with formula (6), where P_L and P_H are a pair of adjacent bottom and top turning points.

In the last trend segment of the time series, the final top or bottom turning point cannot, by definition, be determined, so the undetermined part is left unlabeled and excluded from the training data. The labeling of one stock series is shown in Figure 2.
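Formula (6) is not reproduced in this text. The labeling sketch below assumes it is a linear interpolation in price between the adjacent low and high, that is, 100 × (P − P_L) / (P_H − P_L), which gives 0 at a bottom and 100 at a top as required by 1.2.1. The function name, the `(index, kind)` representation of turning points, and the use of `None` for unlabeled points are illustrative.

```python
def label_trend_positions(prices, turning_points):
    """Assign a trend position in [0, 100] to every point between two
    confirmed turning points; everything else stays None.

    turning_points: list of (index, kind) sorted by index, with kind "L"
    (bottom) or "H" (top) alternating.
    Assumption: formula (6) is linear interpolation in price.
    """
    labels = [None] * len(prices)
    for (i0, kind0), (i1, _kind1) in zip(turning_points, turning_points[1:]):
        low_idx, high_idx = (i0, i1) if kind0 == "L" else (i1, i0)
        p_low, p_high = prices[low_idx], prices[high_idx]
        for t in range(i0, i1 + 1):
            labels[t] = 100.0 * (prices[t] - p_low) / (p_high - p_low)
    return labels


if __name__ == "__main__":
    prices = [12.0, 10.0, 15.0, 20.0, 14.0, 11.0, 13.0]
    points = [(1, "L"), (3, "H"), (5, "L")]
    print(label_trend_positions(prices, points))
```

In this sketch the points before the first turning point and after the last one remain `None`; the tail corresponds to the undetermined final segment that the text excludes from training.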

Step 2: Predict the stock price trend with the LSTM-CNN prediction model, as follows:

The first part consists of three LSTM layers, as shown in part (a) of Figure 3. The LSTM input is a matrix whose two dimensions are k and m, where k = 5 is the number of training features and m is the size of the sliding time window. As the figure shows, layer L1 contains the LSTM units corresponding to the m = 100 time nodes, and the input of each LSTM unit consists of k = 5 features, comprising trading data and technical indicators. After training, the output of each unit in layer L1 is the input of the corresponding unit in layer L2, and likewise the output of layer L2 is the input of layer L3. Layers L2 and L3 are trained in the same way as layer L1. Finally, the output of the LSTM unit at the last time node of layer L3 is taken as the result of part (a).

The second part consists of three convolutional layers and one LSTM layer, as shown in part (b) of Figure 3. Its input matrix is the same as in part (a). The three convolutional layers first extract local correlations between price and technical indicators. They are followed by one LSTM layer, which analyzes the temporal relationships in the extracted local-dependency information.

The last part consists of three fully connected layers. The temporal information extracted by the LSTM part and the static information extracted by the CNN part are merged into a new feature sequence, which is the input of the fully connected layers. The final output is the model's prediction, namely the trend position. In addition, dropout is applied between the fully connected layers; it disconnects some nodes between two connected layers so that the final result does not depend entirely on particular nodes. Dropout markedly reduces overfitting.

Step 2.2: Activation function

φ(x) = max(0, x)    (7)

The network structure used for training in the invention is shown in Figure 3. Training requires the network model to be configured: for the LSTM units, the activation function and the input and output dimensions must be set; for the CNN units, the activation function and the convolution kernel size must be set. In the experiments of the invention, the LSTM-CNN network model is trained with the structural parameters listed in Table 1.

Table 1

The trend position takes values in the finite interval [0, 100], and different positions within this interval differ in practical value. The invention therefore weights them and proposes a loss function that is cost-sensitive in the turning regions.

First, the target value y and the predicted value ŷ are mapped. This mapping performs the weight assignment; Y and Ŷ are the transformed, weighted true value and predicted value, respectively.

The arctanh mapping curve is shown in Figure 4. As the figure shows, the mapping makes values change progressively faster toward both ends while changing relatively little in the middle region, and it is smooth overall. After weighting, the error in the regions near the two ends is larger than the original error, and the closer a point is to either end, the larger the increase. This assigns different weights to different regions. The practical effect of the weighting on actual targets is shown in Figure 5.

After the target and predicted values have been mapped and weighted, MSE is applied to measure the error between them, giving WMSE (10), which is the loss function of the prediction model. The weighting speeds up training and, to some extent, reduces model complexity.

Step 3: Evaluation Criteria

The main prediction targets of the invention are the trend position and the bottom turning point, and the accuracy of each is assessed separately.

Trend position accuracy is evaluated with the mean absolute error (MAE). For turning point accuracy, two metrics are designed: bottom precision (LP) and top recall (HR).

Precision and recall are common evaluation metrics for binary classification. In a classification algorithm, after the prediction is compared with the true target, the prediction state can be described by four labels: TP, TN, FP and FN.

Here T stands for True, F for False, P for Positive and N for Negative. Precision (11) and recall (12) are calculated from these counts. From the formulas, precision is the proportion of cases predicted positive that are actually positive, and recall is the proportion of actually positive cases that are predicted positive.

对于底部转折点，当每次预测出现转折点时，可以为投资者提供强烈的买入信号，所以我们希望每次预测为转折点时，尽可能地准确，否则会造成投资损失，因此需要关注的为查准率。For the turning point at the bottom, when a turning point is predicted each time, it can provide investors with a strong buy signal, so we hope that every time the prediction is a turning point, it will be as accurate as possible, otherwise it will cause investment losses, so you need to pay attention to checking accuracy.

For the top turning point, each predicted turning point gives investors a sell signal. An investor holding the stock wants every top to be predicted when it actually occurs, so that no selling opportunity is missed; a missed top causes losses. The metric of interest is therefore recall.

A classification problem has well-defined positive and negative classes, and each prediction is either right or wrong. This experiment, however, uses a regression algorithm that outputs a specific trend position value, with no clear distinction between right and wrong. To measure the accuracy of turning point prediction in a comparable way, the following method is used.

The bottom and top are determined from the trend position, and a turning point is judged by region: a trend position below 5 is the bottom region, and a trend position above 95 is the top region. When the predicted value is below 5, the point is considered close to a bottom turning point, and the absolute error between the predicted value and the true value is measured; this is the bottom precision (Low Precision, LP). Similarly, when the true value is above 95, the point is considered close to a top turning point, and the absolute error between the true value and the predicted value is measured; this is the top recall (High Recall, HR).
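
The two indicators can be sketched in Python as follows. The text states which points are selected and that the absolute error is measured; averaging those errors over the selected points is an assumption of this sketch, as are the function names and the choice to return `None` when no point qualifies. Note that both indicators are error measures, so lower values are better.

```python
def low_precision(actual, predicted, bottom=5.0):
    """Mean absolute error over points whose PREDICTED position is below `bottom`."""
    errors = [abs(p - a) for a, p in zip(actual, predicted) if p < bottom]
    return sum(errors) / len(errors) if errors else None

def high_recall(actual, predicted, top=95.0):
    """Mean absolute error over points whose TRUE position is above `top`."""
    errors = [abs(p - a) for a, p in zip(actual, predicted) if a > top]
    return sum(errors) / len(errors) if errors else None

actual = [2.0, 10.0, 50.0, 97.0, 99.0]
predicted = [4.0, 3.0, 55.0, 90.0, 98.0]
lp = low_precision(actual, predicted)  # uses points 0 and 1 (predicted below 5)
hr = high_recall(actual, predicted)    # uses points 3 and 4 (true value above 95)
```

LP conditions on the prediction, mirroring precision, while HR conditions on the true value, mirroring recall.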

In classification problems, precision and recall usually conflict with each other, and in extreme cases one is very high while the other is very low. Moreover, LP and HR only evaluate the prediction model at turning points or within turning regions. For an overall measure, because the invention performs a regression calculation, the mean absolute error (MAE) described above can be used to reflect how well the model fits the overall trend. By definition, the trend position is essentially the trend expressed as a percentage, so the error at a single point is at most 100 and at least 0; the mean absolute error therefore reflects the prediction quality intuitively.
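
For reference, a short sketch of the MAE over trend positions, using the standard definition; the function name and sample values are illustrative only.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all points."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Trend positions lie in [0, 100], so every per-point error lies in [0, 100].
mae = mean_absolute_error([0.0, 25.0, 50.0, 100.0], [10.0, 20.0, 50.0, 80.0])
worst_case = mean_absolute_error([0.0, 100.0], [100.0, 0.0])
```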

Prediction results of this embodiment: with time window M = 100 and trend threshold δ = 60%, the error measures LP, HR, and MAE are 8.07%, 11.86%, and 12.72%, respectively. Taking Baosteel Co., Ltd. (宝钢股份) as an example, the prediction results for M = 100 and δ = 60% are shown in Figure 6.

Claims

1. A stock trend prediction method based on an LSTM-CNN deep learning model, characterized by comprising the following steps:

Step 1: Data preprocessing

Stock data from a stock market are selected for analysis and verification, and the whole data set is divided into a training set and a test set;

Step 1.1: Feature selection and normalization

The features are transaction data and technical indicators; the transaction data need to be normalized, while the technical indicators are calculated from the transaction data and their calculation already includes normalization;

Step 1.2: Calibrating the trend position

The medium- and long-term price trend of stocks is studied, turning points are defined, the trend is quantified according to the turning points to obtain trend positions, and the data labels are calibrated according to the trend positions;

Step 2: The stock price trend is predicted using the LSTM-CNN prediction model, as follows:

Step 2.1: Constructing the network model, as follows:

the LSTM-CNN model comprises three parts: the first is an LSTM time-series feature learning layer; the second is a CNN static feature learning layer, which runs in parallel with the LSTM part; the last is a fully connected output layer, in which the outputs of the first two parts are concatenated and fed into a fully connected neural network;

Step 2.2: Activation function

Using the ELU as the activation function, the formula is (7):

φ(x)＝max(0,x) (7)

the network model needs to be configured for training: the activation function and the input and output dimensions are set in the LSTM units, and the activation function and the convolution kernel size are set in the CNN units;

Step 2.3: Constructing the model training strategy, in which a weighted loss function is designed for training the model, as follows:

the trend position takes values in the bounded interval [0,100], and different positions within the interval have different practical value; the invention weights the interval to provide a loss function that is cost-sensitive in the turning regions;

first, the target value y and the predicted value ŷ are scaled and translated so that they lie in the interval [-1,1]; formula (8) is the conversion method, where c is a small real number used to ensure that the argument of the logarithm and the denominator are not zero; the arctanh function, formula (9), is then used to map the converted target value z and predicted value ẑ, and the weighting is realized through this mapping; Y and Ŷ are the true value and the predicted value, respectively, after the transformation and weighting;

after the target value and the predicted value have been mapped and weighted, MSE is applied to measure the error between them, which gives WMSE (10), the loss function of the prediction model; the weighting can speed up training and, to some extent, reduce model complexity;

Step 3: Evaluation criteria

The prediction targets are the trend position and the bottom turning point, and the accuracy of each is considered separately;

the accuracy of the trend position is evaluated using the mean absolute error MAE, and the accuracy of the turning points is evaluated using two designed indicators, the bottom precision LP and the top recall HR;

precision and recall are common evaluation metrics for binary classification problems; in a classification algorithm, after the prediction result is compared with the true target, the following four labels represent the prediction state;

TP: when the true value is the positive class, the predicted value is the positive class;

TN: when the true value is the negative class, the predicted value is the negative class;

FP: when the true value is the negative class, the predicted value is the positive class;

FN: when the true value is the positive class, the predicted value is the negative class;

wherein T is True, F is False, P is Positive, and N is Negative, and precision (11) and recall (12) are calculated from these counts; according to the formulas, precision is the proportion of cases predicted as true that are actually true, and recall is the proportion of actually true cases that are also predicted as true;

for the bottom turning point, each predicted turning point provides a strong buy signal, so every such prediction should be as accurate as possible, since otherwise investment losses result; the metric of interest is therefore precision;

for the top turning point, each predicted turning point provides a sell signal, and every top should be predicted when it actually occurs so that no selling opportunity is missed, since otherwise investment losses result; the metric of interest is therefore recall.

2. The stock trend prediction method based on an LSTM-CNN deep learning model as claimed in claim 1, wherein in step 1.1 the original transaction data are first normalized; stock market data contain sustained rising or falling markets, and new highs or new lows are a common phenomenon in financial data, so separate normalization methods are designed for price and trading volume;

price normalization: the stock data of each time window are quantified, the relative rise and fall are calculated with the first value of the stock sequence as the reference price, and the data in each time window are normalized independently, as shown in formula (1);

for each training sample, the closing prices at times T₀, T₁, T₂, …, T_i, …, T_n are P₀, P₁, P₂, …, P_i, …, P_n; the feature value obtained by normalization at time T_i is Cx_i, and C is a constant term;

trading volume normalization: the calculation is carried out over the whole time series of a stock, as shown in formula (2):

at time T_i the trading volume is V_i, and the feature value obtained by volume normalization is Vx_i; the constant term n is generally equal to the size of the time window, so the first n time points of a stock are used only for data processing and are not included in the samples;

after the closing price and the trading volume have been normalized, the technical indicators MACD, RSI, and KDJ-J of the sequence are calculated, and all of these values are used as input data.

3. The LSTM-CNN deep learning model-based stock trend prediction method of claim 1 or 2, wherein in step 1.2 the trend turning points comprise a top turning point at a peak and a bottom turning point at a valley; a price at a bottom turning point means that the price is about to rise, and vice versa; in an actual market, prices fluctuate constantly with different amplitudes and produce many turning points, so the turning points to be identified by the model are quantified, with the following specific definition:

a stock price series has closing prices P₁, P₂, P₃ at times T₁, T₂, T₃; a parameter called the trend threshold δ is set; when condition (4) is satisfied and the price at time T₂ is the lowest point in the period from T₁ to T₃, T₂ is considered the bottom turning point of a trend;

similarly, when condition (5) is satisfied and the price at time T₂ is the highest point in the period from T₁ to T₃, T₂ is considered the top turning point of a trend; it is worth noting that the calculations for both the bottom and the top turning point are based on the low price, since otherwise, under the same threshold δ, the absolute fluctuation required to establish a bottom turning point would differ from that required for a top turning point, introducing bias;

after a certain bottom turning point is determined, the subsequent sequence is defined as an ascending trend until the next top turning point is met, and similarly, after a certain top turning point is determined, the subsequent sequence is defined as a descending trend until the next bottom turning point is met;

after the bottom and top turning points of the time series are determined, the current stock price is mapped to its relative position in the trend, and this trend position is used as the direct prediction target of the LSTM-CNN model to predict the future price trend; the trend position is calibrated as follows:

1.2.1, according to the definition of the turning points, obtaining all bottom turning points and top turning points of a group of sequences, wherein the trend position at the bottom turning point is marked as 0, and the trend position at the top turning point is marked as 100;

1.2.2, the trend position value between a bottom and a top turning point is calculated according to equation (6), where P_L and P_H are a pair of adjacent bottom and top turning points;

for the final trend of the time series, the top or bottom turning point following the last price cannot be determined, so this undetermined portion is left unlabelled and is excluded from the training data.

4. The LSTM-CNN deep learning model-based stock trend prediction method of claim 1 or 2, wherein in step 2.1 the first part is composed of three LSTM layers; the input matrix of the LSTM has a size determined by k and M, where k is the number of training features and M is the size of the sliding time window; layer L1 contains LSTM units corresponding to the M time nodes, and the input of each LSTM unit is composed of k features, including transaction data and technical indicators; the output of each trained unit in layer L1 serves as the input of the corresponding unit in layer L2, and likewise the output of layer L2 serves as the input of layer L3; layers L2 and L3 are trained in the same way as layer L1; finally, the output of the LSTM unit corresponding to the last time node in layer L3 is taken as the result of the first part;

the second part consists of three convolutional layers and one LSTM layer, and its input matrix is the same as in the first part; the three convolutional layers extract local correlation information between prices and technical indicators and are followed by an LSTM layer, which analyzes the temporal relationships in the extracted local dependency information;

the last part consists of three fully connected layers; the temporal and static information extracted by the LSTM and the CNN, respectively, is combined into a new feature sequence, which is the input of the fully connected layers, and the final prediction result of the model, namely the trend position, is output; in addition, dropout (random inactivation) is applied between the fully connected layers to disconnect some nodes between two layers, so that the final result does not depend entirely on specific nodes, which significantly reduces over-fitting.
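
The trend position calibration of steps 1.2.1 and 1.2.2 in claim 3 can be illustrated with the Python sketch below. Equation (6) is not reproduced in this passage; the sketch assumes a linear mapping, position = (P - P_L) / (P_H - P_L) × 100, which is consistent with the bottom being labelled 0 and the top 100. Turning points are taken as given inputs, since conditions (4) and (5) are likewise not reproduced here, and the function name and data layout are hypothetical.

```python
def trend_positions(prices, turning_points):
    """Label prices between consecutive turning points with a position in [0, 100].

    `turning_points` is a list of (index, kind) pairs sorted by index, where
    kind is "bottom" or "top" and the kinds alternate. Points before the
    first or after the last turning point stay None (unlabelled).
    """
    labels = [None] * len(prices)
    for (i, kind_i), (j, _) in zip(turning_points, turning_points[1:]):
        # Pick the bottom price P_L and top price P_H of this trend segment.
        if kind_i == "bottom":
            p_low, p_high = prices[i], prices[j]
        else:
            p_low, p_high = prices[j], prices[i]
        for t in range(i, j + 1):
            labels[t] = (prices[t] - p_low) * 100.0 / (p_high - p_low)
    return labels

prices = [10.0, 12.0, 15.0, 20.0, 17.0, 14.0, 16.0]
turns = [(0, "bottom"), (3, "top"), (5, "bottom")]
labels = trend_positions(prices, turns)
```

The final price in the example follows the last known turning point, so it is left unlabelled, matching the rule that the undetermined tail of the series is excluded from the training data.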