CN111445339A - Limit order prediction analysis method and system based on a bilinear attention convolutional neural network - Google Patents
Limit order prediction analysis method and system based on a bilinear attention convolutional neural network
- Publication number
- CN111445339A CN111445339A CN202010312349.4A CN202010312349A CN111445339A CN 111445339 A CN111445339 A CN 111445339A CN 202010312349 A CN202010312349 A CN 202010312349A CN 111445339 A CN111445339 A CN 111445339A
- Authority
- CN
- China
- Prior art keywords
- layer
- neural network
- bilinear
- bilinear attention
- network model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/06—Asset management; Financial planning or analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a limit order prediction analysis method and system based on a bilinear attention convolutional neural network. A bilinear attention mechanism is added on top of a convolutional neural network, strengthening the network's attention to temporal information; a pooling layer and a Dropout layer reduce the number of model parameters, preventing overfitting and reducing the amount of calculation. In the back-propagation of the network training process, the Adam algorithm is used to compute and update the weights, improving on the traditional gradient descent process, and the model parameters are updated under a loss function L, thereby optimizing the bilinear attention convolutional neural network.
Description
Technical Field
The invention relates to the field of limit order analysis, and in particular to a limit order prediction analysis method and system based on a bilinear attention convolutional neural network.
Background
The financial market reflects not only the macroscopic economy of a country but also the future economic trends of companies and goods. It is a complex, changeable system influenced by the market and its participants, and predicting its future fluctuations is a very challenging task.
Research on deep learning has grown explosively, and its applications have developed vigorously. Deep learning has been widely applied to speech, image and video tasks, but it is used far less in the financial field, partly because many practitioners in risk control regard deep models as black boxes with poor interpretability. Convolutional neural networks (CNNs), a representative deep-learning algorithm, have been widely applied to fields such as image action recognition and text sentiment classification, yet are likewise rarely applied to finance. Judging the future trend of a financial market is one of the most difficult tasks for investors, and finding a neural network model suited to the stock market is an equally hard problem. Stock-market data comprise transaction data, financial data, macroeconomic data and so on; transaction data in turn comprise tick-by-tick data, time-sliced data, limit order data and market order data, and the analysis of limit order data is an important problem.
A limit order book contains the buy and sell orders of many investors over a period of time. It can reflect investors' expectations of and disagreements about an instrument, the interaction between the market and investors, and the likely short-term trend. It requires neither complex text analysis nor a deep model of high complexity, making it well suited to analyzing the financial market.
Therefore, a solution is needed that optimizes the convolutional neural network model, strengthens its attention to temporal information, and, on that basis, performs prediction analysis on limit orders.
Disclosure of Invention
The invention aims to provide a limit order prediction analysis method and system based on a bilinear attention convolutional neural network.
In order to solve the technical problem, the invention provides a limit order prediction analysis method based on a bilinear attention convolutional neural network, which comprises the following steps:
step 1: acquiring a data set of stock limit orders and dividing the data set into a training set and a test set;
step 2: adding a bilinear attention mechanism on top of a convolutional neural network to establish a bilinear attention convolutional neural network model;
step 3: inputting the training set into the bilinear attention convolutional neural network model to train it, with the loss function L and the Adam algorithm used to compute and back-propagate updates to the weights and bias values of the whole neural network model; when training finishes, the trained bilinear attention convolutional neural network model is obtained;
step 4: inputting the test set into the trained bilinear attention convolutional neural network model to obtain the corresponding output data, i.e. the prediction result.
Further, the training set and the test set in step 1 are both preprocessed.
Further, the data set is preprocessed with the z-score standardization method to normalize the data; the specific steps are as follows:
step 1-1: acquire a data set I = {x_t, t = 1, ..., T} and its labels {y_t}, where T is the total number of samples and x_t is the D-dimensional sample at time point t (D is the number of features). Each time point t contains the first n levels of quotes; the i-th level comprises the bid price P_i^bid and bid volume V_i^bid and the ask price P_i^ask and ask volume V_i^ask. y_t represents the trend, at time point t + m, of the stock price corresponding to time point t; it is a 3 × 1 matrix whose entries represent rising, falling and unchanged respectively, and m is the future time period predicted by the model;
step 1-2: standardize the raw data so that the processed data have mean 0 and standard deviation 1, using the formula:
x'_{t,d} = (x_{t,d} - μ_d) / σ_d
where μ_d is the mean of the d-th-dimension feature in the raw data and σ_d is its standard deviation;
step 1-3: the standardized data set is then arranged into an image-like data set so that data from a past period of time can be used to predict the price trend of the stock. The image-like sample x'_t is composed of the standardized samples from time point t - Δt + 1 to time point t; its size is Δt × D, where Δt is the time span, and its label is y_t;
step 1-4: after steps 1-1, 1-2 and 1-3, the data set I is divided into a training set U and a test set X.
Further, the bilinear attention convolutional neural network model established in the step 2 comprises an 8-layer convolutional network and a bilinear attention mechanism.
Further, the 8-layer convolutional network is:
- layer 1 is a two-dimensional convolutional layer, kernel size 4 × 4n, 16 channels;
- layer 2 is a Reshape layer;
- layer 3 is a one-dimensional convolutional layer, kernel size 3, 16 channels;
- layer 4 is a one-dimensional max-pooling layer;
- layer 5 is a one-dimensional convolutional layer, kernel size 3, 32 channels;
- layer 6 is a one-dimensional convolutional layer, kernel size 3, 32 channels;
- layer 7 is a one-dimensional max-pooling layer;
- layer 8 is a fully connected layer with 3 neurons.
Further, the bilinear attention mechanism comprises three layers: a bilinear projection layer, a Dropout layer and a bilinear attention layer, which are respectively:
In the bilinear projection layer, the data set U is transformed into U1 by the formula:
U1 = φ(W1 U W2 + B)
where the d-th column of a training sample holds the values of the d-th feature over the time window, and the t-th row holds the values of the D different features at time point t; W1 ∈ R^{Δt'×Δt}, W2 ∈ R^{D×D'}, and B ∈ R^{Δt'×D'} is a bias matrix. The φ function is the ReLU activation, which helps avoid the gradient explosion and gradient vanishing problems. The information of the two dimensions of this layer's input is stored in W1 and W2 respectively, and the output of the layer is finally obtained;
the Dropout layer reduces the number of parameters by randomly dropping some neurons, which reduces the amount of calculation and prevents overfitting;
In the bilinear attention layer, the data set U1 is transformed as follows: F is obtained through W1', E through W2', A from E via the Softmax function, and U2 through W3, B and φ, with the formulas:
F = U1 W1'
E = W2' F
A = Softmax(E)
U2 = φ(W3 A + B)
where W1' ∈ R^{D'×D''}, W2' ∈ R^{Δt'×Δt'}, W3 ∈ R^{Δt''×Δt'}, B ∈ R^{Δt''×D''}, and a_ij and e_ij denote the elements at position (i, j) of A and E respectively. The φ function is the ReLU activation, which helps avoid the gradient explosion and gradient vanishing problems, and the output of the layer is finally obtained. In this process, W1' yields the feature space R^{D'}; the diagonal elements of W2' are all fixed to 1/Δt, leaving the weight of each piece of temporal information unchanged, so that the temporal information corresponding to the d-th feature is preserved and attention to temporal information is strengthened.
Further, the loss function L in step 3 is the cross-entropy loss:
L = - Σ_c a_c log(p_c)
where a_c is an indicator variable that equals 1 if the prediction for class c matches the actual class and 0 otherwise, and p_c is the predicted probability that the sample belongs to class c.
Further, the process of training the bilinear attention convolution neural network model by using the training set in the step 3 is as follows:
step 3-1: input the training set U into the network model for forward propagation; in the backward pass of the training process, use the loss function L and the Adam algorithm to compute and back-propagate updates to the weights and bias values of the whole neural network model;
step 3-2: repeat step 3-1 until a preset training end condition is reached, then stop training; the weight and bias value of each neuron in the network model are fixed at this point, giving the trained bilinear attention convolutional neural network model.
Further, all convolutional layers in the bilinear attention convolutional neural network model of step 3 use the ReLU activation function, and the fully connected layer uses the Softmax function.
The invention also provides a limit order prediction analysis system based on a bilinear attention convolutional neural network, comprising a data preprocessing module, a network training module and a stock limit order trend prediction module:
in the data preprocessing module, the data set is preprocessed to generate a training set and a test set; the training set is input to the network training module and the test set to the stock limit order trend prediction module;
in the network training module, a bilinear attention convolutional neural network model is constructed and optimized using a pooling layer, a Dropout layer, the loss function L and the Adam algorithm; the training set is input into the network model to train it, and the trained bilinear attention convolutional neural network model is obtained when training finishes;
in the stock limit order trend prediction module, the test set is input into the trained bilinear attention convolutional neural network model to obtain the corresponding output data, i.e. the prediction result.
With this scheme, the limit order prediction analysis method based on a bilinear attention convolutional neural network has the following advantages: a bilinear attention mechanism is added on top of a convolutional neural network, strengthening the network's attention to temporal information; a pooling layer and a Dropout layer reduce the number of model parameters, preventing overfitting and reducing the amount of calculation; and in the back-propagation of the network training process, the Adam algorithm computes and updates the weights, improving on the traditional gradient descent process, while the model parameters are updated under the loss function L, thereby optimizing the bilinear attention convolutional neural network.
The foregoing description is only an overview of the technical solutions of the present invention, and in order to make the technical solutions of the present invention more clearly understood and to implement them in accordance with the contents of the description, the following detailed description is given with reference to the preferred embodiments of the present invention and the accompanying drawings.
Drawings
Fig. 1 is a schematic structural diagram of a bilinear attention convolution neural network model in the present invention.
FIG. 2 is a flow chart of the bilinear attention layer in the present invention.
Fig. 3 is a schematic structural diagram of the limit order analysis system based on a bilinear attention convolutional neural network.
Detailed Description
The following detailed description of embodiments of the present invention is provided in connection with the accompanying drawings and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.
This embodiment uses the public FI-2010 limit order book data set (data source: Ntakaris A, Magris M, Kanniainen J, et al. Benchmark dataset for mid-price forecasting of limit order book data with machine learning methods [J]. Journal of Forecasting, 2018, 37(8): 852-866). The data set covers five Finnish stocks traded on Nasdaq Nordic, with 450,000 samples in total. The label of each limit order is divided into 5 types according to the prediction time point; the data set has two parts, one including the auction period and one excluding it, and each sample has 144 features. This embodiment selects 1 of the labels for prediction, excludes the auction period, and uses the first 40 features, which are the bid price and bid volume and the ask price and ask volume of the first ten levels of the limit order book.
Embodiment: a limit order prediction analysis method based on a bilinear attention convolutional neural network, comprising the following steps:
step 1: preprocess the stock limit order data set with the z-score standardization method and divide it into a training set and a test set:
step 1-1: acquire the data set I = {x_t, t = 1, ..., T} and its labels {y_t}, where T is the total number of samples and x_t is the 40-dimensional sample at time point t (40 is the number of features). In the training set, each time point t contains the first 10 levels of quotes; the i-th level comprises the bid price P_i^bid and bid volume V_i^bid and the ask price P_i^ask and ask volume V_i^ask. y_t represents the trend, at time point t + m, of the stock price corresponding to time point t; it is a 3 × 1 matrix whose entries represent rising, falling and unchanged respectively, and m is the future time period predicted by the model;
step 1-2: standardize the raw data so that the processed data have mean 0 and standard deviation 1, using the formula:
x'_{t,d} = (x_{t,d} - μ_d) / σ_d
where μ_d is the mean of the d-th-dimension feature in the raw data and σ_d is its standard deviation;
step 1-3: the standardized data set is then arranged into an image-like data set so that data from a past period of time can be used to predict the price trend of the stock. The image-like sample x'_t is composed of the standardized samples from time point t - Δt + 1 to time point t; its size is Δt × D, where Δt is the time span, and its label is y_t;
step 1-4: after the data set I is processed through steps 1-1, 1-2 and 1-3, this embodiment selects 70% of the preprocessed data set as the training set U and 30% as the test set X, to obtain a better effect and better generalization ability.
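As a concrete illustration of steps 1-1 to 1-4, the following sketch (plain Python with hypothetical helper names, not the patent's own code) standardizes each feature, stacks Δt consecutive standardized rows into image-like samples, and performs a chronological 70/30 split:

```python
import math

def zscore_columns(data):
    """Step 1-2: standardize each feature (column) to zero mean and unit
    standard deviation, x'_d = (x_d - mu_d) / sigma_d."""
    T, D = len(data), len(data[0])
    out = [[0.0] * D for _ in range(T)]
    for d in range(D):
        col = [row[d] for row in data]
        mu = sum(col) / T
        sigma = math.sqrt(sum((v - mu) ** 2 for v in col) / T)
        for t in range(T):
            out[t][d] = (data[t][d] - mu) / sigma if sigma else 0.0
    return out

def to_image_samples(data, labels, dt):
    """Step 1-3: the image-like sample at t stacks the dt standardized
    rows from t - dt + 1 to t; its label is y_t."""
    return [(data[t - dt + 1 : t + 1], labels[t])
            for t in range(dt - 1, len(data))]

def split(samples, train_frac=0.7):
    """Step 1-4: chronological split into training set U and test set X."""
    cut = int(len(samples) * train_frac)
    return samples[:cut], samples[cut:]
```

The split is kept chronological rather than shuffled, since the labels look m steps into the future.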
Step 2: add a bilinear attention mechanism on top of the convolutional neural network and establish a bilinear attention convolutional neural network model comprising an 8-layer convolutional network and the bilinear attention mechanism.
As shown in fig. 1, the 8-layer convolutional network includes:
the layer 1 is a two-dimensional convolution layer, the size of a convolution kernel is 4 × 4n, the number of channels is 16, and the two-dimensional convolution layer is used for learning local features of input data;
the layer 2 is a Reshape layer and is used for changing the dimensionality of data and converting input two-dimensional data into one-dimensional data;
the 3 rd layer is a one-dimensional convolution layer, the size of a convolution kernel is 3, the number of channels is 16, and the 3 rd layer is used for learning the local characteristics of the one-dimensional data output by the upper layer;
the 4 th layer is a one-dimensional maximum pooling layer, the number of parameters output by the upper layer is reduced through a filter of the maximum pooling layer, and an over-fitting phenomenon is prevented;
the 5th layer is a one-dimensional convolution layer, the size of a convolution kernel is 3, the number of channels is 32, and the convolution kernel is used for learning local features output by the pooling layer;
the 6 th layer is a one-dimensional convolution layer, the size of a convolution kernel is 3, the number of channels is 32, the depth of the layer is increased, and the fitting capacity of the model is improved;
the 7 th layer is a one-dimensional maximum pooling layer and is used for reducing the number of parameters output by the previous layer again and optimizing the model;
the 8 th layer is a full connection layer, the number of the neurons is 3, and the neurons are used for realizing classification of prediction results and are divided into 3 types of rising, falling and unchanging.
All convolutional layers use the ReLU function as the activation function; the fully connected layer uses the Softmax function to compute the neurons' output values, which are passed on to the next layer for calculation.
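The layer sizes of the 8-layer stack can be checked with a small shape-bookkeeping sketch. It assumes 'valid' (no-padding) convolutions, pooling windows of 2, and that the 4 × 4n kernel spans the full feature width; none of these details is stated explicitly in the text, so the numbers are illustrative only:

```python
def lob_cnn_shapes(dt=100, n=10):
    """Walk the 8-layer network and return (layer name, output shape)
    pairs for an image-like input of size dt x 4n (1 channel)."""
    shapes = []
    h, w = dt, 4 * n
    # layer 1: 2-D conv, 4 x 4n kernel, 16 channels (assumed 'valid');
    # the kernel spans the full feature width, so the width collapses to 1
    h, w = h - 4 + 1, w - 4 * n + 1
    shapes.append(("conv2d", (h, w, 16)))
    # layer 2: Reshape to a 1-D sequence of length h*w with 16 channels
    L = h * w
    shapes.append(("reshape", (L, 16)))
    # layer 3: 1-D conv, kernel 3, 16 channels
    L = L - 3 + 1; shapes.append(("conv1d", (L, 16)))
    # layer 4: 1-D max pooling (window 2 assumed)
    L //= 2; shapes.append(("maxpool1d", (L, 16)))
    # layer 5: 1-D conv, kernel 3, 32 channels
    L = L - 3 + 1; shapes.append(("conv1d", (L, 32)))
    # layer 6: 1-D conv, kernel 3, 32 channels
    L = L - 3 + 1; shapes.append(("conv1d", (L, 32)))
    # layer 7: 1-D max pooling
    L //= 2; shapes.append(("maxpool1d", (L, 32)))
    # layer 8: fully connected, 3 neurons (rise / fall / unchanged)
    shapes.append(("dense", (3,)))
    return shapes
```

Running `lob_cnn_shapes(100, 10)` traces a 100 × 40 window through the stack down to the 3-way output.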
The bilinear attention mechanism comprises three layers: a bilinear projection layer, a Dropout layer and a bilinear attention layer, which are respectively:
In the bilinear projection layer, the data set U is transformed into U1 by the formula:
U1 = φ(W1 U W2 + B)
where the d-th column of a training sample holds the values of the d-th feature over the time window, and the t-th row holds the values of the D different features at time point t; W1 ∈ R^{Δt'×Δt}, W2 ∈ R^{D×D'}, and B ∈ R^{Δt'×D'} is a bias matrix. The φ function is the ReLU activation, which helps avoid the gradient explosion and gradient vanishing problems. The information of the two dimensions of this layer's input is stored in W1 and W2 respectively, and the output of the layer is finally obtained.
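A minimal numpy sketch of the projection U1 = φ(W1 U W2 + B). The sizes Δt' = 32 and D' = 16 are illustrative only, since the actual sizes are not given in the text:

```python
import numpy as np

def relu(x):
    # phi: ReLU activation
    return np.maximum(x, 0.0)

def bilinear_projection(U, W1, W2, B):
    """U1 = phi(W1 @ U @ W2 + B): W1 mixes the time dimension (rows),
    W2 mixes the feature dimension (columns)."""
    return relu(W1 @ U @ W2 + B)

rng = np.random.default_rng(0)
dt, D = 100, 40          # image-like sample: time span x features
dt2, D2 = 32, 16         # projected sizes dt' and D' (assumed)
U = rng.standard_normal((dt, D))
W1 = rng.standard_normal((dt2, dt)) * 0.1
W2 = rng.standard_normal((D, D2)) * 0.1
B = np.zeros((dt2, D2))
U1 = bilinear_projection(U, W1, W2, B)
```

The left multiplication by W1 and right multiplication by W2 act on the two dimensions of the input independently, which is what stores the time and feature information in the two separate weight matrices.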
The Dropout layer reduces the number of parameters by randomly dropping some neurons, which reduces the amount of calculation and prevents overfitting;
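The effect of the Dropout layer can be sketched as inverted dropout, a common formulation; the patent does not specify the variant or the drop rate, so both are assumptions here:

```python
import random

def dropout(values, p, training=True, rng=random):
    """Inverted dropout: each neuron output is zeroed with probability p
    during training and the survivors are scaled by 1/(1-p), so the
    expected activation is unchanged and no rescaling is needed at
    test time."""
    if not training or p <= 0.0:
        return list(values)
    keep = 1.0 - p
    return [v / keep if rng.random() < keep else 0.0 for v in values]
```

At test time the layer is a no-op, matching the usual convention that dropout is only active during training.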
As shown in FIG. 2, in the bilinear attention layer the data set U1 is transformed as follows: F is obtained through W1', E through W2', A from E via the Softmax function, and U2 through W3, B and φ, with the formulas:
F = U1 W1'
E = W2' F
A = Softmax(E)
U2 = φ(W3 A + B)
where W1' ∈ R^{D'×D''}, W2' ∈ R^{Δt'×Δt'}, W3 ∈ R^{Δt''×Δt'}, B ∈ R^{Δt''×D''}, and a_ij and e_ij denote the elements at position (i, j) of A and E respectively. The φ function is the ReLU activation, which helps avoid the gradient explosion and gradient vanishing problems, and the output of the layer is finally obtained. In this process, W1' yields the feature space R^{D'}; the diagonal elements of W2' are all fixed to 1/Δt, leaving the weight of each piece of temporal information unchanged, so that the temporal information corresponding to the d-th feature is preserved and attention to temporal information is strengthened.
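A numpy sketch of the bilinear attention layer under the stated constraint that the diagonal of W2' is fixed; the Softmax axis (over the time dimension) and all sizes are assumptions made for illustration:

```python
import numpy as np

def softmax(E, axis=0):
    # numerically stable softmax along the given axis
    Z = np.exp(E - E.max(axis=axis, keepdims=True))
    return Z / Z.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(1)
dt1, D1 = 32, 16          # U1: dt' x D'
D2, dt2 = 8, 16           # assumed D'' and dt''
U1 = rng.standard_normal((dt1, D1))
W1p = rng.standard_normal((D1, D2)) * 0.1      # W1'
W2p = rng.standard_normal((dt1, dt1)) * 0.1    # W2'
np.fill_diagonal(W2p, 1.0 / dt1)               # diagonal fixed to 1/dt'
W3 = rng.standard_normal((dt2, dt1)) * 0.1
B = np.zeros((dt2, D2))

F = U1 @ W1p              # F = U1 W1',  dt' x D''
E = W2p @ F               # E = W2' F,   dt' x D''
A = softmax(E, axis=0)    # attention weights over the time axis
U2 = np.maximum(W3 @ A + B, 0.0)   # phi = ReLU
```

Each column of A sums to 1, so A re-weights the time steps of each projected feature before the final mixing by W3.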
Step 3: train the bilinear attention convolutional neural network shown in fig. 1 with the training set U; samples pass in turn through the 8-layer convolutional network and the bilinear attention mechanism, and the results of the two parts are combined into the network model's output value:
step 3-1: input the training set U into the network model for forward propagation, and in the backward pass optimize and update all parameters of the whole neural network model using the loss function L:
L = - Σ_c a_c log(p_c)
where a_c is an indicator variable that equals 1 if the prediction for class c matches the actual class and 0 otherwise, and p_c is the predicted probability that the sample belongs to class c. In the backward update of the training process, the weights and bias values of the whole neural network model are computed and updated through the Adam algorithm;
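The loss and one Adam update can be sketched in numpy as follows. This is a generic Adam step with the usual default hyperparameters; the patent does not give its own settings:

```python
import numpy as np

def cross_entropy(p, a, eps=1e-12):
    """L = -sum_c a_c * log(p_c) for one sample: a is the one-hot
    indicator over {rise, fall, unchanged}, p the predicted
    probabilities from the Softmax output layer."""
    return -float(np.sum(a * np.log(p + eps)))

def adam_step(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update for parameters w with gradient g: exponential
    moving averages of the gradient (m) and squared gradient (v),
    bias correction, then the parameter step."""
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```

Because of the bias correction, the very first step moves each parameter by roughly the learning rate in the direction opposite its gradient, which is what makes Adam less sensitive to gradient scale than plain gradient descent.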
step 3-2: repeat step 3-1 until the training end condition is reached (200 training rounds in this embodiment), then stop training; the weight and bias value of each neuron in the network model are fixed at this point, giving the trained bilinear attention convolutional neural network model.
Step 4: input the test set X into the trained bilinear attention convolutional neural network model to obtain the predicted value of the stock at time point t.
In order to better illustrate the technical effect of the invention, the embodiment was tested. The predicted value at time point t represents the stock's trend over the future interval t to (t + m), classified as rising, falling or unchanged, and each class of the prediction is evaluated one-versus-rest: rising is 1, not rising is 0; falling is 1, not falling is 0; unchanged is 1, otherwise 0. The evaluated predictions are averaged and compared with the true values using four indices, precision, recall, F1 score and accuracy, computed as:
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
F1 = 2 · Precision · Recall / (Precision + Recall)
Accuracy = (TP + TN) / (TP + FP + FN + TN)
where TP, FP, FN and TN are the four evaluation counts: TP means the predicted value is 1 and matches the actual value; FP means the predicted value is 1 and differs from the actual value; TN means the predicted value is 0 and matches the actual value; FN means the predicted value is 0 and differs from the actual value. Table 1 shows the four counts used for evaluation.
|                     | True value (1) | True value (0) |
| Predicted value (1) | TP             | FP             |
| Predicted value (0) | FN             | TN             |

TABLE 1 The four evaluation counts
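The four indices follow directly from the Table 1 counts; a small sketch:

```python
def prf_accuracy(tp, fp, fn, tn):
    """Precision, recall, F1 score and accuracy from the confusion
    counts of Table 1 (one-versus-rest for a single class)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f1, accuracy
```

For the three-class problem these are computed per class (rising, falling, unchanged) and then averaged, as described above.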
In this embodiment, 3 methods are set up for comparative analysis: SVM and MLP (see "Tsantekidis A, Passalis N, Tefas A, et al. Using deep learning to detect price change indications in financial markets [C]//2017 25th European Signal Processing Conference (EUSIPCO). IEEE, 2017: 2511-2515") and CNN (see "Tsantekidis A, Passalis N, Tefas A, et al. Forecasting stock prices from the limit order book using convolutional neural networks [C]//2017 IEEE 19th Conference on Business Informatics (CBI). IEEE, 2017"). SVM and MLP are shallow machine-learning models and CNN is a deep-learning model; the results are reported with the evaluation indices defined above.
Table 2 compares the test results of the method of the invention and the 3 comparison methods.
| Method        | Precision (%) | Recall (%) | F1 score (%) | Accuracy (%) |
| SVM           | 39.62         | 44.92      | 35.88        | -            |
| MLP           | 47.81         | 60.78      | 48.27        | -            |
| CNN           | 67.67         | 56.33      | 60.97        | 76.53        |
| The invention | 74.67         | 63.33      | 68.34        | 80.51        |

TABLE 2 Comparison of test results of the method of the invention with SVM, MLP and CNN
As shown in Table 2, the precision, recall, F1 score and accuracy of the method are all better than the comparison methods under the same conditions. The method optimizes the convolutional network layers and adds a bilinear attention mechanism on that basis, strengthening the network's attention to temporal information; the pooling layer and the Dropout layer reduce the number of model parameters, preventing overfitting and reducing the amount of calculation; and in the back-propagation of the network training process, the Adam algorithm computes and updates the weights, improving on the traditional gradient descent process, while the model parameters are updated under the loss function L, thereby optimizing the bilinear attention convolutional neural network.
As shown in fig. 3, the limit order prediction analysis system based on a bilinear attention convolutional neural network provided in the embodiment of the present invention comprises a data preprocessing module, a network training module and a stock limit order trend prediction module:
the data preprocessing module preprocesses the data set to generate a training set and a test set; the training set is input to the network training module and the test set to the stock limit order trend prediction module;
in the network training module, a bilinear attention convolutional neural network model is constructed and optimized using a pooling layer, a Dropout layer, the loss function L and the Adam algorithm; the training set is input into the network model to train it, and the trained bilinear attention convolutional neural network model is obtained when training finishes;
in the stock limit order trend prediction module, the test set is input into the trained bilinear attention convolutional neural network model to obtain the corresponding output data, i.e. the prediction result.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, it should be noted that, for those skilled in the art, many modifications and variations can be made without departing from the technical principle of the present invention, and these modifications and variations should also be regarded as the protection scope of the present invention.
Claims (10)
1. A price limiting single prediction analysis method for a bilinear attention convolution neural network is characterized by comprising the following steps:
step 1: acquiring a data set of a stock price limit list, and dividing the data set into a training set and a testing set;
step 2: adding a bilinear attention mechanism based on the convolutional neural network, and establishing a bilinear attention convolutional neural network model;
step 3, inputting the training set into a bilinear attention convolution neural network model to train the model, calculating and reversely optimizing and updating the weight and the offset value of the whole neural network model by adopting a loss function L and an Adam algorithm, and finishing training to obtain the trained bilinear attention convolution neural network model;
step 4: inputting the test set into the trained bilinear attention convolutional neural network model to obtain the corresponding output data, namely the prediction result.
2. The limit order prediction analysis method based on a bilinear attention convolutional neural network of claim 1, characterized in that: in step 1, the training set and the test set are preprocessed and the data are normalized.
3. The limit order prediction analysis method based on a bilinear attention convolutional neural network of claim 2, characterized in that: the preprocessing method for the data set is the z-score standardization method, with the following specific steps:
step 1-1: obtainFetch datasetsAnd label thereforWhere T is the total number of samples, xtIs a D-dimensional sample corresponding to the time point t, D is a characteristic number, andeach time point t comprises the first n-th stage market, the ith stage market comprises the pre-purchase price Pi bidAnd pre-buy volume Vi bidAnd a pre-sale price Pi askAnd a pre-sale volume Vi ask,ytRepresenting the trend of the stock price corresponding to the time point t at the time point t + m, is a matrix with the size of 3 × 1, each value in the matrix represents rising, falling and invariant respectively, and m is the future time period predicted by the model;
step 1-2: normalize the raw data so that the processed data has mean 0 and standard deviation 1; the normalization formula is:

x'_{t,d} = (x_{t,d} − μ_d) / σ_d

where μ_d is the mean of the d-th dimension feature in the raw data and σ_d is the standard deviation of the d-th dimension feature of the raw data;
step 1-3: the normalized data set needs to be processed as an image-like data set in order to use data from a past period of time to predict the price trends of stocks. Class image sample x'tIs composed of normalized samples from time point t-delta t +1 to time point t, the size of class image sample is delta t × D, delta t is time span and is labeled yt;
Step 1-4: after the data set I goes through steps 1-1, 1-2 and 1-3, the data set I is divided into a training set U and a testing set X.
4. The limit order prediction analysis method based on a bilinear attention convolutional neural network of claim 1, characterized in that: the bilinear attention convolutional neural network model established in step 2 comprises an 8-layer convolutional network and a bilinear attention mechanism.
5. The limit order prediction analysis method based on a bilinear attention convolutional neural network of claim 4, characterized in that: the 8-layer convolutional network is as follows:
layer 1 is a two-dimensional convolutional layer with convolution kernel size 4 × 4n and 16 channels; layer 2 is a Reshape layer; layer 3 is a one-dimensional convolutional layer with kernel size 3 and 16 channels; layer 4 is a one-dimensional max pooling layer; layer 5 is a one-dimensional convolutional layer with kernel size 3 and 32 channels; layer 6 is a one-dimensional convolutional layer with kernel size 3 and 32 channels; layer 7 is a one-dimensional max pooling layer; layer 8 is a fully-connected layer with 3 neurons.
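For orientation only, the shape of the data as it flows through the eight layers can be traced with a few lines of arithmetic. The claim states only kernel sizes and channel counts, so the padding ('valid'), stride (1), pooling window (2) and the input sizes Δt = 100, n = 10 below are all assumptions.

```python
def conv_out(length, kernel):
    # Output length of a 'valid' convolution with stride 1 (assumed;
    # the claim does not state padding or stride).
    return length - kernel + 1

def pool_out(length, size=2):
    # Output length of max pooling with an assumed window of 2.
    return length // size

dt, n = 100, 10        # assumed time span and number of order-book levels
D = 4 * n              # 4 values per level: bid/ask price and volume
t = conv_out(dt, 4)    # layer 1: 2D conv, kernel 4 x 4n, 16 channels;
                       #   the kernel spans the full feature width, giving (t, 1, 16)
                       # layer 2: Reshape to (t, 16)
t = conv_out(t, 3)     # layer 3: 1D conv, kernel 3, 16 channels
t = pool_out(t)        # layer 4: 1D max pooling
t = conv_out(t, 3)     # layer 5: 1D conv, kernel 3, 32 channels
t = conv_out(t, 3)     # layer 6: 1D conv, kernel 3, 32 channels
t = pool_out(t)        # layer 7: 1D max pooling
num_classes = 3        # layer 8: fully-connected, 3 neurons (rise/fall/flat)
```

Under these assumptions the time axis shrinks from 100 to 21 before the fully-connected layer; any real implementation would fix these hyperparameters explicitly.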
6. The limit order prediction analysis method based on a bilinear attention convolutional neural network of claim 4, characterized in that: the bilinear attention mechanism comprises three layers, namely a bilinear projection layer, a Dropout layer and a bilinear attention layer, wherein:
the bilinear projection layer, on which the data set U is transformed into U1; the related formula is:

U1 = φ(W1·U·W2 + B)

where the d-th column of the training set contains the values of the d-th feature over the T time points and the t-th row contains the values of the D different features at time point t; W1 ∈ R^(Δt'×Δt), W2 ∈ R^(D×D'), and B ∈ R^(Δt'×D') is a bias matrix; the function φ is the ReLU activation function, which avoids the problems of gradient explosion and gradient vanishing; the information of the two dimensions of the layer's input data is stored in W1 and W2 respectively, and the output of the layer is finally obtained;
the Dropout layer, which reduces the number of parameters by randomly dropping some of the neurons, reducing the amount of computation and preventing overfitting;
the bilinear attention layer, on which the data set U1 is passed through W1' to obtain F, through W2' to obtain E, through the Softmax function to obtain A, and through W3, B and φ to obtain U2; the related formulas are:

F = U1·W1'
E = W2'·F
A = Softmax(E)
U2 = φ(W3·A + B)

where W1' ∈ R^(D'×D''), W2' ∈ R^(Δt'×Δt'), W3 ∈ R^(Δt''×Δt'), B ∈ R^(Δt''×D''); a_ij and e_ij denote the elements at position (i, j) of A and E respectively; the function φ is the ReLU activation function, which avoids the problems of gradient explosion and gradient vanishing, and the output of the layer is finally obtained; in this process, a feature space R^(D') is obtained through W1', and the diagonal elements of W2' are all fixed to 1/Δt so that the weight of each piece of time information is not changed, the time information corresponding to the d-th feature is kept unchanged, and the attention to the time information is strengthened.
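Outside the claim language, the two bilinear layers of claim 6 can be sketched in numpy. All sizes below (Δt = 100, Δt' = 50, Δt'' = 25, D = 40, D' = 20, D'' = 10) are hypothetical, and the Softmax normalization axis (column-wise here) is an assumption, since the claim only states that A is obtained from E by Softmax.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def softmax_cols(E):
    # Column-wise softmax (normalization axis assumed).
    Z = np.exp(E - E.max(axis=0, keepdims=True))
    return Z / Z.sum(axis=0, keepdims=True)

def bilinear_projection(U, W1, W2, B):
    # U1 = phi(W1 U W2 + B): projects the time axis (via W1) and the
    # feature axis (via W2) of the input simultaneously.
    return relu(W1 @ U @ W2 + B)

def bilinear_attention(U1, W1p, W2p, W3, Bp):
    # F = U1 W1'; E = W2' F; A = Softmax(E); U2 = phi(W3 A + B)
    F = U1 @ W1p
    E = W2p @ F
    A = softmax_cols(E)
    return relu(W3 @ A + Bp)

U = rng.standard_normal((100, 40))                        # dt x D input
U1 = bilinear_projection(U,
                         rng.standard_normal((50, 100)),  # W1  (dt' x dt)
                         rng.standard_normal((40, 20)),   # W2  (D x D')
                         rng.standard_normal((50, 20)))   # B   (dt' x D')
U2 = bilinear_attention(U1,
                        rng.standard_normal((20, 10)),    # W1' (D' x D'')
                        rng.standard_normal((50, 50)),    # W2' (dt' x dt')
                        rng.standard_normal((25, 50)),    # W3  (dt'' x dt')
                        rng.standard_normal((25, 10)))    # B   (dt'' x D'')
```

In the claimed mechanism the weights would be learned and the diagonal of W2' fixed to 1/Δt; random matrices are used here purely to check that the shapes compose.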
7. The limit order prediction analysis method based on a bilinear attention convolutional neural network of claim 1, characterized in that the loss function L in step 3 is the categorical cross-entropy:

L = −Σ_c a_c·log(p_c)

where a_c is an indicator variable which is 1 if the predicted class is the same as the actual class and 0 otherwise, and p_c is the predicted probability that the sample belongs to class c.
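The loss of claim 7 can be computed in a few lines. The formula image is not reproduced in the published text, so the standard categorical cross-entropy below is a reconstruction inferred from the definitions of a_c and p_c; the probability vector is a made-up example.

```python
import numpy as np

def loss_L(p, a):
    # L = -sum_c a_c * log(p_c): categorical cross-entropy over the
    # three classes, with a the one-hot indicator vector and p the
    # predicted class probabilities.
    return -np.sum(a * np.log(p))

p = np.array([0.7, 0.2, 0.1])   # predicted probabilities for rise/fall/flat
a = np.array([1.0, 0.0, 0.0])   # indicator: the actual class is "rise"
loss = loss_L(p, a)             # equals -log(0.7)
```

Only the true class contributes to the sum, so a confident correct prediction (p_c near 1) drives the loss toward 0.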
8. The limit order prediction analysis method based on a bilinear attention convolutional neural network of claim 1, characterized in that: the process of training the bilinear attention convolutional neural network model with the training set in step 3 is as follows:
step 3-1: inputting the training set U into the network model for forward propagation, and, in the backward pass of the training process, computing and back-propagating to update and optimize the weights and bias values of the whole neural network model using the loss function L and the Adam algorithm;
step 3-2: repeating step 3-1 until a preset training end condition is reached, then stopping training; the weight and bias values of each neuron in the network model are thereby determined, and the trained bilinear attention convolutional neural network model is obtained.
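The Adam update used in step 3-1 can be written out explicitly. The claim only names the algorithm, so the hyperparameters below are the usual defaults (an assumption), and the scalar quadratic loss stands in for the network's cross-entropy purely to illustrate the update rule.

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    # One Adam update of parameter w given its gradient g; m and v are
    # the running first/second moment estimates and t is the step count
    # (1-based, for the bias correction).
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Minimize f(w) = w^2 (gradient 2w) as a stand-in for the network loss.
w, m, v = 5.0, 0.0, 0.0
for t in range(1, 2001):
    w, m, v = adam_step(w, 2.0 * w, m, v, t, lr=0.01)
```

In the claimed method the same per-parameter update is applied to every weight and bias of the network, with gradients obtained by back-propagating the loss L.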
9. The limit order prediction analysis method based on a bilinear attention convolutional neural network of claim 1, characterized in that the activation function used by all convolutional layers in the bilinear attention convolutional neural network model in step 3 is the ReLU function, and the fully-connected layer uses the Softmax function.
10. A limit order prediction analysis system based on a bilinear attention convolutional neural network, comprising a data preprocessing module, a network training module and a stock limit order trend prediction module, characterized in that:
in the data preprocessing module, preprocessing a data set to generate a training set and a test set, wherein the training set is input into the network training module, and the test set is input into the stock limit order trend prediction module;
in the network training module, a bilinear attention convolutional neural network model is constructed, the network model is optimized by using a pooling layer, a Dropout layer, a loss function L and an Adam algorithm, a training set is input into the network model to train the network model, and the trained bilinear attention convolutional neural network model is obtained after training is finished;
in the stock limit order trend prediction module, the test set is input into the trained bilinear attention convolutional neural network model to obtain the corresponding output data, namely the prediction result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010312349.4A CN111445339A (en) | 2020-04-20 | 2020-04-20 | Price limiting single prediction analysis method and system for bilinear attention convolution neural network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111445339A true CN111445339A (en) | 2020-07-24 |
Family
ID=71654241
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010312349.4A Pending CN111445339A (en) | 2020-04-20 | 2020-04-20 | Price limiting single prediction analysis method and system for bilinear attention convolution neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111445339A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113257280A (en) * | 2021-06-07 | 2021-08-13 | 苏州大学 | Speech emotion recognition method based on wav2vec |
WO2023040009A1 (en) * | 2021-09-18 | 2023-03-23 | 苏州大学 | Limit order book trend prediction apparatus and method, device, and storage medium |
CN114489940A (en) * | 2022-01-18 | 2022-05-13 | 武汉理工大学 | Cold start optimization method based on time domain convolution network in server-free computing environment |
CN116205522A (en) * | 2023-01-31 | 2023-06-02 | 中国测绘科学研究院 | Landslide vulnerability evaluation method and system based on multidimensional CNN coupling |
CN116205522B (en) * | 2023-01-31 | 2023-10-20 | 中国测绘科学研究院 | Landslide vulnerability evaluation method and system based on multidimensional CNN coupling |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Huang et al. | Deep learning in finance and banking: A literature review and classification | |
Pawar et al. | Stock market price prediction using LSTM RNN | |
CN111445339A (en) | Price limiting single prediction analysis method and system for bilinear attention convolution neural network | |
Passalis et al. | Temporal bag-of-features learning for predicting mid price movements using high frequency limit order book data | |
CN111738532B (en) | Method and system for acquiring influence degree of event on object | |
Soui et al. | Bankruptcy prediction using stacked auto-encoders | |
Mandhula et al. | Predicting the customer’s opinion on amazon products using selective memory architecture-based convolutional neural network | |
Ebiaredoh-Mienye et al. | Artificial neural network technique for improving prediction of credit card default: A stacked sparse autoencoder approach | |
Ocampo | Decision modeling for manufacturing sustainability with fuzzy analytic hierarchy process | |
US20210406771A1 (en) | Systems and methods for using artificial intelligence to evaluate lead development | |
Meena et al. | Identifying emotions from facial expressions using a deep convolutional neural network-based approach | |
CN114022176A (en) | Method for predicting commodity sales on e-commerce platform and electronic equipment | |
Etemadi et al. | Earnings per share forecast using extracted rules from trained neural network by genetic algorithm | |
CN113408582B (en) | Training method and device for feature evaluation model | |
Jakubik et al. | Incorporating financial news for forecasting Bitcoin prices based on long short-term memory networks | |
Liu et al. | 1D convolutional neural networks for chart pattern classification in financial time series | |
Zehtab-Salmasi et al. | Multimodal price prediction | |
Thesia et al. | A dynamic scenario‐driven technique for stock price prediction and trading | |
John et al. | Stock market prediction based on deep hybrid RNN model and sentiment analysis | |
Kuo et al. | Building Graduate Salary Grading Prediction Model Based on Deep Learning. | |
CN111667307A (en) | Method and device for predicting financial product sales volume | |
Le et al. | Applying LSTM to predict firm performance based on annual reports: an empirical study from the vietnam stock market | |
Olajide Olajide et al. | Bank Term Deposit Service Patronage Forecasting using Machine Learning | |
Mootha et al. | A Stacking Ensemble of Multi Layer Perceptrons to Predict Online Shoppers' Purchasing Intention | |
Khan | Comparison of machine learning approaches for classification of invoices |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||