CN112307760A - Deep learning-based financial report emotion analysis method and device and terminal - Google Patents

Deep learning-based financial report emotion analysis method and device and terminal

Info

Publication number
CN112307760A
Authority
CN
China
Prior art keywords
text
model
vector
input
deep learning
Prior art date
Legal status
Pending
Application number
CN202011271062.8A
Other languages
Chinese (zh)
Inventor
邓蔚
熊香权
李惠莲
姜志豪
汪劲松
杨记军
王鸣晖
Current Assignee
Chengdu Zhiyuan Technology Co ltd
Original Assignee
Chengdu Zhiyuan Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Chengdu Zhiyuan Technology Co ltd
Priority to CN202011271062.8A
Publication of CN112307760A
Status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/049Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/12Accounting
    • G06Q40/125Finance or payroll

Abstract

The invention discloses a financial report emotion analysis method, device and terminal based on deep learning. The analysis method comprises the following steps: acquiring the text information of the management discussion and analysis in a financial report; preprocessing the text information to obtain text vocabulary; vectorizing the text vocabulary; taking the vectorized representation of the text vocabulary as the input of a Word2Vec model, performing word vector training to obtain text word vectors, and extracting the entity features and part-of-speech features of the text vocabulary; constructing a BiLSTM model and adding an attention mechanism to it; and inputting the text word vectors, the entity features and the part-of-speech features into the BiLSTM model, and calculating the emotion analysis result of the text information. The method introduces an attention mechanism on top of the BiLSTM model, distinguishes the semantic importance of each word with respect to the target words, extracts text features quickly and simply, and builds a feature representation of the target words, thereby greatly improving on the performance of existing models.

Description

Deep learning-based financial report emotion analysis method and device and terminal
Technical Field
The invention belongs to the technical field of natural language processing, and particularly relates to a financial report emotion analysis method, device and terminal based on deep learning.
Background
In recent years, with the continued vigorous development of all kinds of industries, market investment enthusiasm has kept rising, and investors cannot do without an enterprise's financial reports when making investment decisions, whether they rely on theory or on past experience. A financial report is an operating analysis report that an enterprise publishes periodically in accordance with market regulations, and its content falls into two parts: financial information and non-financial information.
Financial information is mainly presented as data tables; for investors it is objective, highly standardized and readable, but it only reflects a company's current operating condition and offers no forward-looking insight. Therefore, in order to predict the future performance of an enterprise, investors have begun to mine information such as the enterprise's future development trend and growth potential from the non-financial information, of which the part with the richest and most valuable content is the management discussion and analysis (MD&A). In this part the management describes, in prose, the company's operating condition, investment and shareholding situation, external environment and future development trend; compared with objective financial data, such text is more subjective and carries emotional tendency.
Domestic research on management discussion and analysis started relatively late, so detailed text mining of Chinese-language MD&A is still insufficient, while investors analyze financial reports precisely in the hope of predicting a company's future growth, profitability and risk level. Therefore, studying the relationship between the MD&A text and an enterprise's future performance can help investors better grasp the enterprise's future operating condition and make better judgments in decision making.
At present, the mainstream methods applied to financial report text analysis are still dictionary-based methods and traditional machine learning methods; in particular, methods such as decision trees, random forests and XGBoost are mostly used to predict the future performance of enterprises.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a financial report emotion analysis method, device and terminal based on deep learning.
The purpose of the invention is realized by the following technical scheme: the financial report emotion analysis method based on deep learning comprises the following steps:
acquiring text information discussed and analyzed by a management layer in a financial report;
preprocessing the text information to obtain a text vocabulary;
vectorizing and representing the text vocabulary;
taking the vectorization expression of the text vocabulary as the input of a Word2Vec model, carrying out Word vector training to obtain a text Word vector, and extracting the entity characteristics and the part of speech characteristics of the text vocabulary;
constructing a BiLSTM model, and adding an attention mechanism to the BiLSTM model;
and inputting the text word vector, the entity characteristics and the part-of-speech characteristics into the BiLSTM model, and calculating the emotion analysis result of the text information.
Preferably, the preprocessing the text information to obtain a text vocabulary includes:
sentence dividing is carried out on the text information;
removing interference symbols in the text information after the sentence division;
and performing word segmentation on the text information without the interference symbols to obtain text words.
Preferably, the text vocabulary is vectorized and represented using One-Hot encoding.
Preferably, the Word2Vec model is a CBOW model or a Skip-Gram model.
Preferably, the word vector training comprises:
connecting all One-Hot vectors on an input layer of the CBOW model, and multiplying all the One-Hot vectors by a shared input weight matrix respectively;
adding all One-Hot vectors multiplied by the input weight matrix to average to serve as hidden layer vectors, and transmitting the hidden layer vectors to a hidden layer;
in the transmission process of the hidden layer vector from the hidden layer to the output layer, multiplying the hidden layer vector by the output weight matrix to obtain an output vector;
calculating the output vector through a Softmax activation function layer to obtain probability distribution, wherein a word indicated by the maximum probability value is a predicted target word vector;
calculating the error between the predicted target word vector and the actual One-Hot vector of the text vocabulary;
defining a cross entropy loss function, and updating an input weight matrix and an output weight matrix by using a gradient descent algorithm;
repeating the steps until the error between the predicted target word vector and the actual One-Hot vector of the text vocabulary meets the threshold value;
after training is finished, multiplying each word's One-Hot vector at the input layer by the input weight matrix to obtain its text word vector.
Preferably, adding an attention mechanism to the BiLSTM model comprises the following steps:
setting a model data input placeholder;
defining an L2 regularization loss;
setting a word embedding layer: initializing a word embedding matrix by using a pre-trained word vector, and converting words in input data into word vectors by using the word embedding matrix;
defining a two-layer BiLSTM model structure: defining a forward LSTM structure and a backward LSTM structure, and splicing the output results of the forward and backward LSTM structures either in a dynamic input mode or by taking the full length of the sequence;
acquiring the number of LSTM neurons in the last layer, initializing a weight vector, performing nonlinear conversion on the spliced output of the BiLSTM, performing dimension conversion and normalization processing on the weight, performing weighted summation on the output of the nonlinear conversion by using the normalized weight, performing dimension conversion for output, and performing regularization by using Dropout to obtain the output of attention;
loss calculation with a fully connected layer: calculating the weight L2 loss and the bias L2 loss, and calculating the binary cross-entropy loss; the total loss is the sum of the mean binary cross-entropy loss and the model's L2 regularization loss weighted by a coefficient.
Preferably, calculating the emotion analysis result of the text information comprises: obtaining the hidden-layer states from the BiLSTM model, calculating the influence of each input position on the current position by dot product, feeding the result into a softmax layer to obtain the attention weight distribution, and finally obtaining the emotion analysis result of the text information from the vector produced by weighted summation.
Deep learning based financial report sentiment analysis device, comprising:
the text collection module is used for acquiring text information discussed and analyzed by a management layer in the financial report;
the data processing module is used for preprocessing the text information to obtain text vocabularies and vectorizing and expressing the text vocabularies;
the Word2Vec model is used for performing Word vector training by taking the vectorization expression of the text vocabulary as input to obtain a text Word vector and extracting the entity characteristics and the part of speech characteristics of the text vocabulary;
and the BiLSTM model with an added attention mechanism, which calculates the emotion analysis result of the text information by taking the text word vector, the entity characteristics and the part-of-speech characteristics as input.
The deep learning-based financial reporting emotion analysis terminal comprises a processor and a memory, the processor being coupled with the memory; in operation, the processor executes instructions stored in the memory to implement the deep learning-based financial reporting emotion analysis method.
The invention has the beneficial effects that:
(1) the invention adopts a BiLSTM model; compared with an RNN model and a single LSTM model, the BiLSTM model retains more context information and correspondingly preserves more of the gradient, alleviating the vanishing-gradient problem;
(2) the method introduces an attention mechanism on top of the BiLSTM model, distinguishes the semantic importance of each word with respect to the target words, extracts text features quickly and simply, and builds a feature representation of the target words, thereby greatly improving on the performance of existing models;
(3) financial report texts are highly specialized and uniformly formatted, and a single sample is long, so manual feature extraction is difficult and its results are poor; the invention adopts a deep learning method, whose advantage lies precisely in the model's ability to learn features by itself, which suits high-dimensional big data and solves the problems of difficult and ineffective manual feature extraction;
(4) the BiLSTM model can correlate context semantics and, when processing Chinese sentences, can tightly bind and integrate the relations between word-segmentation vectors, improving prediction accuracy;
(5) based on the abstract features extracted by deep learning, the invention uses Word2Vec to model the relations and meanings between words, thereby improving the accuracy of emotion classification prediction.
Drawings
FIG. 1 is a flow chart of a deep learning based method for sentiment analysis of financial reports;
FIG. 2 is a schematic structural diagram of a CBOW model used for word vector training according to the present invention;
FIG. 3 is a schematic structural diagram of an LSTM model;
FIG. 4 is a schematic structural diagram of a BiLSTM model incorporating an attention mechanism in accordance with the present invention;
FIG. 5 is a schematic structural diagram of a financial report emotion analysis device based on deep learning.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the following embodiments, and it should be understood that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without inventive effort based on the embodiments of the present invention, are within the scope of the present invention.
Referring to fig. 1 to 5, the invention provides a financial report emotion analysis method, device and terminal based on deep learning:
example one
As shown in FIG. 1, the financial report emotion analysis method based on deep learning comprises the following steps:
and S1, acquiring text information discussed and analyzed by the management layer in the financial report.
And S2, preprocessing the text information to obtain text vocabularies.
Specifically, preprocessing the text information to obtain the text vocabulary includes the following steps; an illustrative sketch is given after step S23:
and S21, sentence splitting is carried out on the text information.
And S22, removing interference symbols in the text information after sentence division.
And S23, performing word segmentation on the text information with the interference symbols removed to obtain text words.
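By way of illustration only, the following is a minimal sketch of steps S21-S23; it assumes the jieba package for Chinese word segmentation and an illustrative set of interference symbols, neither of which is prescribed by this embodiment.

```python
# Preprocessing sketch for S21-S23 (assumptions: jieba for Chinese word
# segmentation; the interference-symbol pattern below is illustrative only).
import re
import jieba

INTERFERENCE = r'[0-9a-zA-Z%#@&*()（）《》“”"，、:：;；!！?？\s]+'

def preprocess(md_and_a_text):
    # S21: split the MD&A text into sentences on Chinese sentence-ending marks.
    sentences = [s for s in re.split(r'[。！？]', md_and_a_text) if s]
    # S22: remove interference symbols from each sentence.
    cleaned = [re.sub(INTERFERENCE, '', s) for s in sentences]
    # S23: segment each cleaned sentence into text words.
    return [w for s in cleaned for w in jieba.cut(s) if w.strip()]

print(preprocess("公司报告期内经营状况良好，营业收入同比增长。"))
```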
And S3, vectorizing the text vocabulary. In this embodiment, the text vocabulary is vectorized using One-Hot encoding, and the vectorized representation of each text word is recorded as its One-Hot vector.
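As an illustration of this step, the sketch below builds a vocabulary over the segmented words and produces One-Hot vectors with NumPy; how the vocabulary is constructed is an assumption, since the embodiment does not fix it.

```python
# One-Hot vectorization sketch (assumption: the vocabulary is built directly
# from the segmented training corpus).
import numpy as np

def build_vocab(token_lists):
    vocab = {}
    for tokens in token_lists:
        for w in tokens:
            vocab.setdefault(w, len(vocab))
    return vocab

def one_hot(word, vocab):
    v = np.zeros(len(vocab), dtype=np.float32)
    v[vocab[word]] = 1.0
    return v

corpus = [["公司", "营业收入", "同比", "增长"], ["经营", "状况", "良好"]]
vocab = build_vocab(corpus)
print(one_hot("增长", vocab))
```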
And S4, performing Word vector training by taking the vectorized representation of the text vocabulary as the input of the Word2Vec model, and extracting the entity characteristics and the part of speech characteristics of the text vocabulary.
The Word2Vec model is a CBOW model or a Skip-Gram model.
Specifically, as shown in fig. 2, the vectorized representation of the text vocabulary is used as the input of the Word2Vec model for word vector training, which comprises the following steps (an illustrative sketch is given after step S48):
s41, connecting all One-Hot vectors on an input layer of the CBOW model, and multiplying all the One-Hot vectors by a shared input weight matrix respectively;
s42, adding all the One-Hot vectors multiplied by the input weight matrix to average to obtain hidden vectors, and transmitting the hidden vectors to a hidden layer;
s43, in the process of transferring the hidden layer vector from the hidden layer to the output layer, multiplying the hidden layer vector by the output weight matrix to obtain an output vector;
s44, calculating the output vector through a Softmax activation function layer to obtain probability distribution, wherein a word indicated by the maximum probability value is a predicted target word vector;
s45, calculating the error between the predicted target word vector and the actual One-Hot vector of the text vocabulary;
s46, defining a cross entropy loss function, and updating an input weight matrix and an output weight matrix by using a gradient descent algorithm;
s47, repeating S41-S46 until the error between the predicted target word vector and the actual One-Hot vector of the text vocabulary meets a threshold value;
and S48, multiplying each word of the input layer by the initial weight matrix after training to obtain a text word vector.
S5, constructing a BiLSTM model (Bi-directional Long Short-Term Memory network, BiLSTM for short), and adding an attention mechanism into the BiLSTM model.
As shown in fig. 3, each black line carries a vector passed from the output of one node to the input of the next, circles within the structure represent operations such as vector multiplication or addition, and boxes represent neural network learning layers. In the LSTM architecture, the key to long-term memory is the transfer of information between all cells along the horizontal line. Below this cell line, the gate structures select the information to be passed on or discarded through a Sigmoid neural layer and a point-wise multiplication; the Sigmoid layer outputs values between 0 and 1, which act as the proportion by which cell information is allowed to pass. Bi-LSTM (the BiLSTM model) combines two LSTM layers running in opposite directions: the forward LSTM layer reads the text from left to right and is trained to produce one vector, the backward LSTM layer reads the text from right to left to produce another vector, and the forward and backward hidden vectors are then spliced into a new vector as the output. Some content in a text derives its emotional characteristics jointly from the preceding and following context, so adding a backward LSTM layer improves the prediction on such text.
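To make the gate mechanism concrete, a single-step sketch of one LSTM cell is given below; the weight shapes and random initialization are illustrative assumptions (in the actual model these parameters are learned by backpropagation).

```python
# Single LSTM cell step (assumption: weights/biases are supplied externally;
# in the actual model they are learned parameters).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    z = np.concatenate([h_prev, x])         # previous hidden state + current input
    f = sigmoid(W["f"] @ z + b["f"])        # forget gate: proportion of c_prev to keep
    i = sigmoid(W["i"] @ z + b["i"])        # input gate: proportion of new content to write
    g = np.tanh(W["g"] @ z + b["g"])        # candidate cell content
    c = f * c_prev + i * g                  # cell state carried along the horizontal line
    o = sigmoid(W["o"] @ z + b["o"])        # output gate
    h = o * np.tanh(c)                      # new hidden state
    return h, c

# illustrative usage with random parameters
dim, hid = 4, 3
rng = np.random.default_rng(0)
W = {k: rng.normal(size=(hid, hid + dim)) for k in "figo"}
b = {k: np.zeros(hid) for k in "figo"}
h, c = lstm_step(rng.normal(size=dim), np.zeros(hid), np.zeros(hid), W, b)
```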
The attention mechanism focuses, by computing a probability distribution, on the key input information that has a significant influence on the output of the BiLSTM model, thereby optimizing the BiLSTM model's learning of text features. The attention mechanism is added between the model training layer and the output layer of the BiLSTM model, and assigns a different weight to each input state to determine the final output of each unit state.
As shown in fig. 4, without attention the input information is first encoded by the encoder into fixed-length features, which are then decoded and output by the decoder. This forces all input information to be encoded to the same fixed length, which limits model performance. After the attention mechanism is introduced, it attends selectively to the input information by controlling weights and, at output time, outputs only the attended content; although this increases the amount of computation and lengthens the convergence time, it greatly improves the training effect and makes the text analysis more targeted.
Specifically, adding the attention mechanism to the BiLSTM model comprises the following steps (an illustrative sketch is given after step S56):
s51, setting a model data input placeholder;
s52, defining L2 regularization loss;
s53, setting a word embedding layer: initializing a word embedding matrix by using a pre-trained word vector, and converting words in input data into word vectors by using the word embedding matrix;
s54, defining a model structure of two layers of BiLSTM: defining a forward LSTM structure and a reverse LSTM structure, and splicing output results of the forward LSTM structure and the reverse LSTM structure by adopting a dynamic input mode or a mode of taking the full length of a sequence;
s55, acquiring the number of the last layer of LSTM neurons, initializing weight vectors, performing nonlinear conversion on the spliced output of the Bi-LSTM, performing dimension conversion and normalization processing on the weights, performing weighted summation on the output of the nonlinear conversion by using the normalized weights, performing dimension conversion for output, and performing regularization by using Dropout to obtain the output of Attention;
and S56, performing loss calculation by using the full-connection layer, wherein the loss calculation comprises weight L2 loss and offset L2 loss, and the binary cross entropy loss is calculated, and the total loss is the sum of the average loss of the binary cross entropy and the L2 regular loss of the model with coefficients.
And S6, inputting the word vector training result, the entity characteristics and the part-of-speech characteristics into the BiLSTM model, and calculating the emotion analysis result of the text information.
In step S6, the hidden-layer states are obtained from the BiLSTM model, the influence of each input position on the current position is calculated by dot product, the result is fed into a softmax layer to obtain the attention weight distribution, and the emotion analysis result of the text information is finally obtained from the vector produced by weighted summation.
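The dot-product weighting described in this step can be illustrated in a few lines of NumPy; treating the current position's hidden state as the query vector is an assumption, since the embodiment only specifies dot-product scoring, a softmax layer and a weighted sum.

```python
# Dot-product attention sketch (assumption: the current position's hidden state
# is used as the query against all BiLSTM hidden states).
import numpy as np

def attention_pool(H, query):
    # H: (seq_len, hidden_dim) BiLSTM hidden states; query: (hidden_dim,)
    scores = H @ query                        # influence of each input position
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                  # softmax -> attention weight distribution
    return weights @ H                        # weighted sum -> vector used for the result

H = np.random.default_rng(1).normal(size=(6, 8))
context = attention_pool(H, H[-1])            # score every position against the last one
```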
In order to verify the effect of this embodiment, a classic long-text classification model, TextRCNN, was also built and trained on the sample data, and the results of a Logistic regression model, TextRCNN and the Bi-LSTM + Attention model were compared with the model prediction results of other studies; the prediction effect of the Bi-LSTM model is markedly better than that of the traditional machine learning methods. The accuracy of the Bi-LSTM model is 91.70%, and its classification effect is better than that of a single LSTM, since the two-layer LSTM model extracts more text features. After the attention mechanism is introduced into the Bi-LSTM model, the model accuracy reaches 98.30%, the classification effect is better than that of both the TextRCNN and Bi-LSTM models, and the accuracy is 15.23% higher than that of the traditional machine learning method.
Example two
As shown in FIG. 5, the financial report emotion analysis device based on deep learning comprises a text collection module, a data processing module, a Word2Vec model and a BiLSTM model with an added attention mechanism.
The text collection module is used for acquiring text information discussed and analyzed by a management layer in the financial report.
The data processing module is used for preprocessing the text information to obtain the text vocabulary and for vectorizing the text vocabulary. The preprocessing comprises: splitting the text information into sentences; removing interference symbols from the split text; and performing word segmentation on the text with the interference symbols removed to obtain text words.
The Word2Vec model is used for performing Word vector training by taking the vectorized representation of the text vocabulary as input to obtain a text Word vector, and is used for extracting the entity characteristics and the part of speech characteristics of the text vocabulary.
The BiLSTM model with the added attention mechanism is used for calculating the emotion analysis result of the text information by taking the text word vector, the entity characteristics and the part-of-speech characteristics as input.
EXAMPLE III
The deep learning-based financial reporting emotion analysis terminal comprises a processor and a memory, the processor being coupled with the memory and configured to execute instructions stored in the memory so as to implement the deep learning-based financial reporting emotion analysis method of the first embodiment.
The processor may also be referred to as a CPU (central processing unit). The processor may be an integrated circuit chip having signal processing capabilities. The processor may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor, but is not limited thereto.
The foregoing is illustrative of the preferred embodiments of this invention. It is to be understood that the invention is not limited to the precise forms disclosed herein, and that various other combinations, modifications and environments may be resorted to within the scope of the inventive concept disclosed herein, whether in accordance with the above teachings or with the skill or knowledge of the relevant art. Such modifications and variations, made by those skilled in the art without departing from the spirit and scope of the invention, are intended to fall within the scope of the appended claims.

Claims (9)

1. The financial report emotion analysis method based on deep learning is characterized by comprising the following steps:
acquiring text information discussed and analyzed by a management layer in a financial report;
preprocessing the text information to obtain a text vocabulary;
vectorizing and representing the text vocabulary;
taking the vectorization expression of the text vocabulary as the input of a Word2Vec model, carrying out Word vector training to obtain a text Word vector, and extracting the entity characteristics and the part of speech characteristics of the text vocabulary;
constructing a BiLSTM model, and adding an attention mechanism to the BiLSTM model;
and inputting the text word vector, the entity characteristics and the part-of-speech characteristics into the BiLSTM model, and calculating the emotion analysis result of the text information.
2. The deep learning based financial reporting emotion analysis method of claim 1, wherein preprocessing the textual information to obtain a textual vocabulary comprises:
sentence dividing is carried out on the text information;
removing interference symbols in the text information after the sentence division;
and performing word segmentation on the text information without the interference symbols to obtain text words.
3. The deep learning-based financial report emotion analysis method of claim 1, wherein the text vocabulary is vectorized and represented by using One-Hot coding.
4. The deep learning-based financial report sentiment analysis method of claim 3 wherein the Word2Vec model is a CBOW model or a Skip-Gram model.
5. The deep learning based financial reporting emotion analysis method of claim 4, wherein word vector training comprises:
connecting all One-Hot vectors on an input layer of the CBOW model, and multiplying all the One-Hot vectors by a shared input weight matrix respectively;
adding all One-Hot vectors multiplied by the input weight matrix to average to serve as hidden layer vectors, and transmitting the hidden layer vectors to a hidden layer;
in the transmission process of the hidden layer vector from the hidden layer to the output layer, multiplying the hidden layer vector by the output weight matrix to obtain an output vector;
calculating the output vector through a Softmax activation function layer to obtain probability distribution, wherein a word indicated by the maximum probability value is a predicted target word vector;
calculating the error between the predicted target word vector and the actual One-Hot vector of the text vocabulary;
defining a cross entropy loss function, and updating an input weight matrix and an output weight matrix by using a gradient descent algorithm;
repeating the steps until the error between the predicted target word vector and the actual One-Hot vector of the text vocabulary meets the threshold value;
after training is finished, multiplying each word's One-Hot vector at the input layer by the input weight matrix to obtain its text word vector.
6. The deep learning based financial reporting emotion analysis method of claim 1, wherein adding an attention mechanism to the BiLSTM model comprises:
setting a model data input placeholder;
defining an L2 regularization loss;
setting a word embedding layer: initializing a word embedding matrix by using a pre-trained word vector, and converting words in input data into word vectors by using the word embedding matrix;
defining a two-layer BiLSTM model structure: defining a forward LSTM structure and a backward LSTM structure, and splicing the output results of the forward and backward LSTM structures either in a dynamic input mode or by taking the full length of the sequence;
acquiring the number of LSTM neurons in the last layer, initializing a weight vector, performing nonlinear conversion on the spliced output of the BiLSTM, performing dimension conversion and normalization processing on the weight, performing weighted summation on the output of the nonlinear conversion by using the normalized weight, performing dimension conversion for output, and performing regularization by using Dropout to obtain the output of attention;
loss calculation with a fully connected layer: calculating the weight L2 loss and the bias L2 loss, and calculating the binary cross-entropy loss; the total loss is the sum of the mean binary cross-entropy loss and the model's L2 regularization loss weighted by a coefficient.
7. The deep learning-based financial report emotion analysis method of claim 1, wherein calculating the emotion analysis result of the text information comprises: obtaining the hidden-layer states from the BiLSTM model, calculating the influence of each input position on the current position by dot product, feeding the output into a softmax layer to obtain the attention weight distribution, and finally obtaining the emotion analysis result of the text information from the vector produced by weighted summation.
8. Financial report emotion analysis device based on deep learning, characterized by including:
the text collection module is used for acquiring text information discussed and analyzed by a management layer in the financial report;
the data processing module is used for preprocessing the text information to obtain text vocabularies and vectorizing and expressing the text vocabularies;
the Word2Vec model is used for performing Word vector training by taking the vectorization expression of the text vocabulary as input to obtain a text Word vector and extracting the entity characteristics and the part of speech characteristics of the text vocabulary;
and the BiLSTM model with an added attention mechanism, which calculates the emotion analysis result of the text information by taking the text word vector, the entity characteristics and the part-of-speech characteristics as input.
9. Deep learning based financial reporting sentiment analysis terminal comprising a processor and a memory, the processor being coupled to the memory, the processor being operable to execute instructions stored in the memory to implement the deep learning based financial reporting sentiment analysis method of any one of claims 1 to 6.
CN202011271062.8A, filed 2020-11-13 (priority date 2020-11-13): Deep learning-based financial report emotion analysis method and device and terminal. Status: Pending. Publication: CN112307760A (en).

Priority Applications (1)

Application Number: CN202011271062.8A; Priority Date: 2020-11-13; Filing Date: 2020-11-13; Title: Deep learning-based financial report emotion analysis method and device and terminal; Publication: CN112307760A (en)

Applications Claiming Priority (1)

Application Number: CN202011271062.8A; Priority Date: 2020-11-13; Filing Date: 2020-11-13; Title: Deep learning-based financial report emotion analysis method and device and terminal; Publication: CN112307760A (en)

Publications (1)

Publication Number: CN112307760A (en); Publication Date: 2021-02-02

Family

Family ID: 74334517

Family Applications (1)

Application Number: CN202011271062.8A; Title: Deep learning-based financial report emotion analysis method and device and terminal; Priority Date: 2020-11-13; Filing Date: 2020-11-13; Status: Pending; Publication: CN112307760A (en)

Country Status (1)

Country Link
CN (1) CN112307760A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113110984A (en) * 2021-04-19 2021-07-13 中国工商银行股份有限公司 Report processing method, report processing device, computer system and readable storage medium
CN113298179A (en) * 2021-06-15 2021-08-24 南京大学 Customs commodity abnormal price detection method and device
CN113380418A (en) * 2021-06-22 2021-09-10 浙江工业大学 System for analyzing and identifying depression through dialog text
CN115994217A (en) * 2022-11-29 2023-04-21 南京审计大学 Financial report fraud detection method and system

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109710761A (en) * 2018-12-21 2019-05-03 中国标准化研究院 The sentiment analysis method of two-way LSTM model based on attention enhancing
CN111209401A (en) * 2020-01-03 2020-05-29 西安电子科技大学 System and method for classifying and processing sentiment polarity of online public opinion text information

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109710761A (en) * 2018-12-21 2019-05-03 中国标准化研究院 The sentiment analysis method of two-way LSTM model based on attention enhancing
CN111209401A (en) * 2020-01-03 2020-05-29 西安电子科技大学 System and method for classifying and processing sentiment polarity of online public opinion text information

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
XINGLIANG MAO et al., "Sentiment-Aware Word Embedding for Emotion Classification", Applied Sciences, pages 1-14 *
史振杰 et al., "基于BiLSTM-Attention的电商评论情感分析" (Sentiment analysis of e-commerce reviews based on BiLSTM-Attention), 《河北省科学院学报》 (Journal of The Hebei Academy of Sciences), vol. 37, no. 2, pages 12-17 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113110984A (en) * 2021-04-19 2021-07-13 中国工商银行股份有限公司 Report processing method, report processing device, computer system and readable storage medium
CN113110984B (en) * 2021-04-19 2024-03-08 中国工商银行股份有限公司 Report processing method, report processing device, computer system and readable storage medium
CN113298179A (en) * 2021-06-15 2021-08-24 南京大学 Customs commodity abnormal price detection method and device
CN113380418A (en) * 2021-06-22 2021-09-10 浙江工业大学 System for analyzing and identifying depression through dialog text
CN115994217A (en) * 2022-11-29 2023-04-21 南京审计大学 Financial report fraud detection method and system
CN115994217B (en) * 2022-11-29 2024-01-23 南京审计大学 Financial report fraud detection method and system

Similar Documents

Publication Publication Date Title
CN109284506B (en) User comment emotion analysis system and method based on attention convolution neural network
CN110059188B (en) Chinese emotion analysis method based on bidirectional time convolution network
CN108319666B (en) Power supply service assessment method based on multi-modal public opinion analysis
CN107239529B (en) Public opinion hotspot category classification method based on deep learning
CN112307760A (en) Deep learning-based financial report emotion analysis method and device and terminal
CN112667818B (en) GCN and multi-granularity attention fused user comment sentiment analysis method and system
CN111506732B (en) Text multi-level label classification method
CN111382565A (en) Multi-label-based emotion-reason pair extraction method and system
CN109933792B (en) Viewpoint type problem reading and understanding method based on multilayer bidirectional LSTM and verification model
CN107688576B (en) Construction and tendency classification method of CNN-SVM model
CN112818861A (en) Emotion classification method and system based on multi-mode context semantic features
CN115952291B (en) Financial public opinion classification method and system based on multi-head self-attention and LSTM
CN111597340A (en) Text classification method and device and readable storage medium
CN110472245B (en) Multi-label emotion intensity prediction method based on hierarchical convolutional neural network
CN112561718A (en) Case microblog evaluation object emotion tendency analysis method based on BilSTM weight sharing
CN113111152A (en) Depression detection method based on knowledge distillation and emotion integration model
CN114648031A (en) Text aspect level emotion recognition method based on bidirectional LSTM and multi-head attention mechanism
CN110727758A (en) Public opinion analysis method and system based on multi-length text vector splicing
Zhao et al. Knowledge-aware bayesian co-attention for multimodal emotion recognition
CN114022192A (en) Data modeling method and system based on intelligent marketing scene
CN114416969A (en) LSTM-CNN online comment sentiment classification method and system based on background enhancement
CN113239690A (en) Chinese text intention identification method based on integration of Bert and fully-connected neural network
CN116543289B (en) Image description method based on encoder-decoder and Bi-LSTM attention model
CN112528168A (en) Social network text emotion analysis method based on deformable self-attention mechanism
CN111309849A (en) Fine-grained numerical information extraction method based on joint learning model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination