CN113052395A - Method for predicting financial data by neural network fusing network characteristics - Google Patents

Publication number
CN113052395A
Authority
CN
China
Prior art keywords
data
network
node
nodes
enterprise
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110405540.8A
Other languages
Chinese (zh)
Inventor
黄泽宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Ziping Information Technology Service Co ltd
Original Assignee
Shandong Ziping Information Technology Service Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Ziping Information Technology Service Co ltd filed Critical Shandong Ziping Information Technology Service Co ltd
Priority to CN202110405540.8A priority Critical patent/CN113052395A/en
Publication of CN113052395A publication Critical patent/CN113052395A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9536 Search customisation based on social or collaborative filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Abstract

The invention discloses a method for predicting financial data by a neural network fusing network characteristics, which comprises the following steps: S1: collecting data; S2: preprocessing the data set; S3: constructing an enterprise relationship complex network from enterprise stock price time series data; S4: dividing the network into communities using the Louvain algorithm to obtain the community property of each node; S5: calculating the importance of the nodes in the enterprise relevance network using the PageRank algorithm to obtain the PageRank value of each node; S6: performing secondary calculation on the static data according to the corresponding indexes to construct dependent variable data; S7: predicting and evaluating the income capacity of the enterprise using a neural network algorithm. The method yields a more accurate final prediction, enhances the objectivity, scientific rigor, and accuracy of the evaluation work, addresses the lack of quantitative analysis in income prediction during enterprise value evaluation, reduces the influence of analysts' subjective factors on income prediction, and improves the interpretability of the evaluation work.

Description

Method for predicting financial data by neural network fusing network characteristics
Technical Field
The invention relates to a complex network and financial data mining technology, in particular to a method for predicting financial data by a neural network fusing network characteristics.
Background
In traditional business investment, analysts judge the future income of an enterprise on the basis of qualitative analysis. Predictions differ widely between analysts, and the evaluation result depends heavily on each analyst's professional level, so its accuracy is difficult to control. With the arrival of the information age, the internet finance industry has undergone sweeping changes. Information technology centered on the internet, cloud computing, big data, artificial intelligence, and blockchain is developing rapidly, is being applied on a large scale across the economy and society, and has become an important driving force for the transformation and upgrading of many industries. Processing data algorithmically can provide decision makers with useful reference information and assist their decisions. The asset assessment industry is an important participant in the market economy. Faced with massive data, building an income prediction model from that data to assist analysts and to provide a reference valuation forecast has become an emerging solution in the industry.
Because the traditional judgment of an enterprise's future income usually rests on qualitative analysis, the differences between analysts and the influence of their professional level on the evaluation result are difficult to control.
The Chinese patent application CN201910237356.X provides a quantitative calculation method for investment value of listed enterprises, which comprises the steps of firstly obtaining financial data of a target enterprise to obtain a classification result, and then calculating the classification result according to a preset algorithm formed by the weight of influence factors of the investment value of the listed enterprises to obtain a quantitative calculation result for the investment value.
The chinese patent application CN201810742756.1 proposes an income prediction method, which determines a feature value corresponding to a relevant feature of a user performing an operation on a target object by obtaining relevant information of each user performing the operation on the target object in a prediction period in a group to be predicted, and predicts the income of the target object under each channel according to a group feature value corresponding to the relevant feature of each channel and a pre-trained channel income prediction model.
The Chinese patent application CN202010081874.X provides a method for establishing respective local evaluation models based on longitudinal federal learning to determine the investment value of an enterprise to be evaluated, so that the enterprise can be comprehensively and reliably evaluated.
Although these patent applications can evaluate the value of an enterprise to some extent, they have several disadvantages. First, the enterprise to be forecast is not evaluated from the perspective of the whole market, so the forecasting model is relatively one-sided. Second, the assessment models do not take inter-enterprise relationships into account, such as strong cohesion between enterprises. Third, the parameters of these assessment models are obtained only by secondary calculation on common data and do not reflect the position of the enterprise in the market hierarchy.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a method for predicting financial data by a neural network fusing network characteristics.
In order to achieve the purpose, the invention adopts the following technical scheme:
A method for predicting financial data by a neural network fusing network characteristics comprises the following steps:
S1: data acquisition, including dynamic data and static data;
S2: data set preprocessing, including processing missing values and outliers of the data;
S3: constructing an enterprise relationship complex network from enterprise stock price time series data;
S4: dividing the network into communities using the Louvain algorithm (a community detection algorithm for large-scale networks) to obtain the community property of each node;
S5: calculating the importance of the nodes in the enterprise relevance network using the PageRank (web page ranking) algorithm to obtain the PageRank value of each node;
S6: performing secondary calculation on the static data according to the corresponding indexes to construct dependent variable data, specifically comprising the following steps:
S6.1: adding the community property of each enterprise and expanding the variable dimensions using One-Hot encoding;
S6.2: adding the PageRank value of each enterprise and expanding the variable dimensions;
S7: predicting and evaluating the income capacity of the enterprise using a neural network algorithm.
In step S1, the dynamic data is the market value of all listed enterprises in the stock market over a set period, and the static data is the financial data of the enterprises together with overall market data (i.e. the macroscopic factors among the financial indexes of the listed enterprises; the specific data are shown in the macroscopic factor part of Table 1).
In step S2, null values in the sample data are filled with the median, and the data are normalized by the max-min method according to formula (1):
$$z'_{ij} = \frac{z_{ij} - z_{i\min}}{z_{i\max} - z_{i\min}} \quad (1)$$
where $z_{ij}$ is the j-th row of data in the i-th characteristic index of the sample data set, $z'_{ij}$ is its normalized value, i and j are integers greater than or equal to 1, and $z_{i\min}$ and $z_{i\max}$ are the minimum and maximum values of the i-th characteristic index of the sample data set.
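The preprocessing in step S2 can be illustrated with a minimal sketch. It assumes that missing values are represented as `None` and that each characteristic index is held as a plain list; the function names `fill_missing_with_median` and `min_max_scale` are hypothetical and do not come from the patent.

```python
from statistics import median


def fill_missing_with_median(column):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in column if v is not None]
    m = median(observed)
    return [m if v is None else v for v in column]


def min_max_scale(column):
    """Scale values to [0, 1] with (z - z_min) / (z_max - z_min)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Constant column: the formula is undefined here, so this sketch
        # maps every value to 0.0 (an assumption, not stated in the text).
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


raw = [10.0, None, 30.0, 20.0]          # one characteristic index with a gap
filled = fill_missing_with_median(raw)  # the gap becomes the median, 20.0
scaled = min_max_scale(filled)          # [0.0, 0.5, 1.0, 0.5]
```

Scaling each index separately keeps indexes with large value ranges from dominating the prediction model, which is the motivation the description gives for this step.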
The step S3 includes the following sub-steps:
S3.1: The stock price time series is taken as the dynamic characteristic of each listed enterprise, and the correlation of this characteristic between listed enterprises is calculated using the Pearson correlation coefficient. Let $v_i(t)$ be the closing price of stock i at time t; the value gain $\Delta v_i(t)$ of stock i at that time is given by formula (2):
[Formula (2): image not reproduced in this text]
where t denotes time, $\Delta t$ is the time period over which the gain is obtained, and i and j are integers greater than or equal to 1. The Pearson correlation coefficient $p_{ij}$ between any two stocks i and j is the ratio of the covariance of the two variables $v_i$ and $v_j$ to the product of their standard deviations, as in formula (3):
$$p_{ij} = \frac{\operatorname{cov}(v_i, v_j)}{\sigma_{v_i}\,\sigma_{v_j}} \quad (3)$$
S3.2: Step S3.1 yields the set of connection relations between pairs of listed enterprises, $E = \{e_{11}, e_{12}, \ldots, e_{ij}\}$, where $p_{ij}$ is the Pearson correlation coefficient between stocks i and j, and $e_{ij}$, the basic connection coefficient, indicates that an association exists between listed enterprises i and j. This gives the relationship network $G = (V, E, W)$, where V is the set of network nodes (the listed enterprises), E is the set of connecting edges between pairs of listed enterprises, and W is the set of weights between listed enterprises. The weight $w_{ij}$ between enterprises i and j is calculated by formula (4):
$$w_{ij} = e_{ij} \cdot p_{ij} \quad (4)$$
Under this formula, strongly correlated enterprises are joined by heavily weighted edges and weakly correlated enterprises by lightly weighted edges.
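A minimal sketch of step S3 follows. Several details are assumptions made only for illustration: the value gain is taken as the simple price difference over `dt` periods, the basic connection coefficient e_ij is set to 1 when |p_ij| meets a hypothetical `threshold` and to 0 otherwise, and the names `returns`, `pearson`, and `build_network` are invented here.

```python
from math import sqrt


def returns(prices, dt=1):
    """Price change over dt periods (assumed form of the value gain)."""
    return [prices[t] - prices[t - dt] for t in range(dt, len(prices))]


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / (sx * sy)


def build_network(price_series, threshold=0.5):
    """Return {(i, j): w_ij} with w_ij = e_ij * p_ij for every connected pair."""
    names = sorted(price_series)
    rets = {name: returns(price_series[name]) for name in names}
    weights = {}
    for idx, a in enumerate(names):
        for b in names[idx + 1:]:
            p = pearson(rets[a], rets[b])
            e = 1 if abs(p) >= threshold else 0  # assumed rule for e_ij
            if e:
                weights[(a, b)] = e * p
    return weights


prices = {
    "A": [10, 11, 13, 12, 15],
    "B": [20, 22, 26, 24, 30],  # moves in lockstep with A
    "C": [5, 8, 9, 10, 10],     # only weakly related to A and B
}
weights = build_network(prices, threshold=0.5)
```

With these toy prices only the pair (A, B) clears the threshold, so the resulting network has a single edge whose weight is close to 1.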
The step S4 specifically includes the following substeps:
S4.1: Initialize the nodes and communities by regarding each node in the network as an independent community, so that the number of communities equals the number of nodes;
S4.2: For each node m (m being a positive integer), try in turn to assign node m to the community of each of its neighbor nodes, calculate the modularity change ΔQ before and after the assignment, and record the neighbor node with the maximum ΔQ; if max ΔQ is greater than 0, assign node m to the community of that neighbor node, otherwise leave node m unchanged;
S4.3: Repeat S4.2 until the community attributes of the nodes no longer change;
S4.4: Regard all nodes in the same community as one new node, convert the weights of the edges between nodes inside the community into the weight of a self-loop on the new node, and convert the edge weights between communities into the weights of the edges between the new nodes;
S4.5: Repeat from S4.1 until the modularity Q of the whole network no longer changes.
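The local-moving part of the procedure (S4.1 to S4.3) can be sketched as follows. This is a simplified illustration, not the patent's implementation: ΔQ is obtained by recomputing the modularity Q from scratch rather than with the usual incremental formula, the aggregation step S4.4 is omitted, and the names `modularity` and `local_moving` are hypothetical.

```python
def modularity(edges, community):
    """Weighted modularity Q; edges is {(u, v): w} with each edge listed once."""
    total = sum(edges.values())
    degree, internal = {}, {}
    for (u, v), w in edges.items():
        degree[u] = degree.get(u, 0.0) + w
        degree[v] = degree.get(v, 0.0) + w
        if community[u] == community[v]:
            c = community[u]
            internal[c] = internal.get(c, 0.0) + w
    comm_degree = {}
    for node, k in degree.items():
        c = community[node]
        comm_degree[c] = comm_degree.get(c, 0.0) + k
    return sum(internal.get(c, 0.0) / total - (k / (2 * total)) ** 2
               for c, k in comm_degree.items())


def local_moving(edges, nodes):
    community = {n: n for n in nodes}           # S4.1: one community per node
    neighbors = {n: set() for n in nodes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    improved = True
    while improved:                             # S4.3: repeat until stable
        improved = False
        for n in nodes:                         # S4.2: try neighbor communities
            current = community[n]
            base_q = modularity(edges, community)
            best_gain, best_comm = 0.0, current
            targets = {community[x] for x in neighbors[n]} - {current}
            for target in sorted(targets):
                community[n] = target
                gain = modularity(edges, community) - base_q
                if gain > best_gain + 1e-12:    # keep only strict improvements
                    best_gain, best_comm = gain, target
            community[n] = best_comm
            if best_comm != current:
                improved = True
    return community


# Two triangles joined by the single bridge edge (2, 3).
edges = {(0, 1): 1.0, (0, 2): 1.0, (1, 2): 1.0,
         (3, 4): 1.0, (3, 5): 1.0, (4, 5): 1.0, (2, 3): 1.0}
community = local_moving(edges, nodes=[0, 1, 2, 3, 4, 5])
q = modularity(edges, community)
```

On this toy graph the two triangles end up as separate communities. Recomputing Q for every trial move is quadratic in practice and is chosen here only for readability.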
The step S5 specifically includes the following substeps:
S5.1: Initialize the PageRank value of each node as $\mathrm{PageRank}(d_i)^{1}$, where i indexes the nodes of the network;
S5.2: Traverse the nodes in the network and update the PageRank value of each node according to the PageRank value of the target node and the weights of its neighbor nodes, using formula (5):
[Formula (5): image not reproduced in this text]
where $\mathrm{PageRank}(d_i)^{k+1}$ is the PageRank value of node $d_i$ at iteration k+1, $M(d_i)$ is the set of neighbor nodes of node $d_i$, and $d_v$ is a neighbor node of node $d_i$;
S5.3: Calculate the update quantity δ of the PageRank values of all nodes in the network, using formula (6):
[Formula (6): image not reproduced in this text]
where D is the set of all nodes, N is the number of nodes, and $\mathrm{PageRank}(d_i)^{k+1}$ is the PageRank value of node $d_i$ at iteration k+1;
S5.4: Stop iterating when δ ≤ ε, where ε is a constant; otherwise, repeat step S5.2.
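The iteration in S5.1 to S5.4 can be illustrated with a common weighted PageRank variant. Because formulas (5) and (6) are not available in this text, the sketch rests on assumptions: a damping factor of 0.85, initial values of 1/N, positive edge weights, and δ taken as the mean absolute change per node. The name `weighted_pagerank` is hypothetical.

```python
def weighted_pagerank(weights, nodes, damping=0.85, eps=1e-8, max_iter=1000):
    """Iterate PageRank on an undirected weighted graph {(u, v): w}, w > 0."""
    n = len(nodes)
    neighbors = {u: {} for u in nodes}
    strength = {u: 0.0 for u in nodes}        # sum of incident edge weights
    for (u, v), w in weights.items():
        neighbors[u][v] = w
        neighbors[v][u] = w
        strength[u] += w
        strength[v] += w
    rank = {u: 1.0 / n for u in nodes}        # S5.1: assumed initial value
    for _ in range(max_iter):
        new_rank = {}
        for u in nodes:                        # S5.2: update from neighbors
            incoming = sum(rank[v] * w / strength[v]
                           for v, w in neighbors[u].items())
            new_rank[u] = (1 - damping) / n + damping * incoming
        # S5.3: assumed form of the update quantity (mean absolute change)
        delta = sum(abs(new_rank[u] - rank[u]) for u in nodes) / n
        rank = new_rank
        if delta <= eps:                       # S5.4: stopping rule
            break
    return rank


# Star graph: "A" is linked to every other node, so it should rank highest.
star = {("A", "B"): 1.0, ("A", "C"): 1.0, ("A", "D"): 1.0}
rank = weighted_pagerank(star, nodes=["A", "B", "C", "D"])
```

Each neighbor passes on a share of its rank proportional to the edge weight, so an enterprise strongly correlated with many others receives a higher value. Nodes without any edge would need separate handling, which this sketch leaves out.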
In step S6, the index system is constructed with reference to the index evaluation principle of multivariate statistical theory, and the network characteristic indexes extracted in steps S4 and S5 are fused into it. The step specifically includes the following substeps:
S6.1: Add the community property of each listed enterprise: use One-Hot encoding to convert the community attribute of each listed enterprise node into multi-bit binary data, expanding the variable dimensions.
S6.2: Add the PageRank value of each listed enterprise, expanding the variable dimensions.
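The feature fusion in S6.1 and S6.2 amounts to appending columns to each enterprise's feature row, as in the sketch below. The names `one_hot` and `fuse_features` and the toy values are hypothetical.

```python
def one_hot(labels):
    """Encode each community label as a 0/1 vector, one position per community."""
    categories = sorted(set(labels))
    position = {c: i for i, c in enumerate(categories)}
    return [[1 if position[label] == i else 0 for i in range(len(categories))]
            for label in labels]


def fuse_features(financial, community, pagerank):
    """Append the one-hot community columns and the PageRank value to each row."""
    encoded = one_hot(community)
    return [row + oh + [pr]
            for row, oh, pr in zip(financial, encoded, pagerank)]


financial = [[0.2, 0.7], [0.5, 0.1], [0.9, 0.4]]  # normalized financial indexes
community = [1, 5, 1]                              # community label per enterprise
pagerank = [0.3, 0.5, 0.2]                         # PageRank value per enterprise
fused = fuse_features(financial, community, pagerank)
```

One-Hot encoding is used because community labels are categorical: treating them as numbers would impose an ordering between communities that has no meaning.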
In step S7, the neural network model comprises forward propagation and backpropagation. During forward propagation, the output of the previous layer serves as the input of the next layer; the inputs are weighted and summed, a bias is added, and the result is passed through the activation function, as in formula (7):
[Formula (7): image not reproduced in this text]
where f is the activation function (the Sigmoid function in this neural network), $w_{pq}$ is the weight between layer p and layer q of the neural network, $x_p$ is the input at layer p of the neural network, and $b_p$ is the bias of layer p of the neural network.
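The forward-propagation step described above (weighted sum, plus bias, through a Sigmoid) can be sketched for one fully connected layer. The indexing is an assumption, since formula (7) is not available in this text: `weights[p][q]` links input p to output unit q, and there is one bias per output unit. The name `dense_forward` is hypothetical.

```python
from math import exp


def sigmoid(z):
    """Sigmoid activation: 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + exp(-z))


def dense_forward(inputs, weights, biases):
    """Compute f(sum_p x_p * w_pq + b_q) for every output unit q."""
    return [
        sigmoid(sum(inputs[p] * weights[p][q] for p in range(len(inputs)))
                + biases[q])
        for q in range(len(biases))
    ]


x = [1.0, 2.0]
hidden = dense_forward(x, weights=[[0.5, -1.0], [0.25, 0.5]], biases=[-1.0, 0.0])
output = dense_forward(hidden, weights=[[1.0], [1.0]], biases=[-1.0])
```

The output of the first call is fed directly into the second, which mirrors the statement that each layer's output is the next layer's input. Training by backpropagation is not shown.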
A complex network is a way of representing various kinds of real-world relationships by abstract nodes and connecting edges. As an important research tool in many disciplines, it allows the topological characteristics of the specific network in a real problem to be obtained through a data structure such as a network graph, and the problem is then solved using those characteristics. A concrete network can be abstracted as a graph $G = (V, E, W)$ consisting of a set of nodes V and a set of connecting edges E, where V contains all the nodes and E is the collection of edges. $v_i \in V$ denotes a node in the network, and $e_{ij} \in E$ denotes the connecting edge between node $v_i$ and node $v_j$. $w_{ij} \in W$ is the weight coefficient of the edge and measures how tightly $v_j$ and $v_i$ are connected.
Neural networks, first proposed by psychologists and neurobiologists as a relatively simple approach to solving complex problems, have received increasing attention in recent years. There are many neural network models, which describe and simulate biological nervous systems at different levels and from different perspectives. Representative models include BP networks, RBF networks, Hopfield networks, and self-organizing feature map networks.
The invention uses the stock price time series data of listed enterprises to construct an enterprise relationship complex network, divides the whole enterprise network into communities following the idea of clustering, quantifies the importance of each single enterprise in the whole network, and quantifies the association between enterprises across the whole market through community division. Quantifying the characteristics of the enterprises in the complex network through community division and importance ranking of nodes lets an analyst see the position of a target enterprise in the network conveniently and intuitively. The income capacity of the enterprise is then predicted and evaluated using a neural network algorithm that combines the complex network indexes with the financial indexes of the listed enterprises.
The invention uses the stock market value and basic financial information of listed enterprises and applies the neural network prediction model after adding the network characteristic indexes. The final prediction is more accurate, which enhances the objectivity, scientific rigor, and accuracy of the evaluation work, addresses the lack of quantitative analysis in income prediction during enterprise value evaluation, reduces the influence of analysts' subjective factors on income prediction, improves the interpretability of the evaluation work, and meets the needs of practical use. In the experiment on real data, the model improved markedly after the network characteristic parameters were fused: the mean square error (MSE) of the model output, the business income growth rate (%), fell from 0.226 to 0.184, a reduction of about 18%. FIG. 7 shows the deviation between the true values and the predicted values. These results indicate that the method proposed in this patent has good prediction capability on financial data.
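The reported drop in MSE from 0.226 to 0.184 can be checked with a short calculation. The helper names `mse` and `relative_reduction` are hypothetical, and the toy vectors passed to `mse` are not data from the experiment.

```python
def mse(y_true, y_pred):
    """Mean square error between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_reduction(before, after):
    """Fraction by which a metric fell, relative to its starting value."""
    return (before - after) / before


toy_error = mse([1.0, 2.0], [1.0, 4.0])        # (0 + 4) / 2 = 2.0
reduction = relative_reduction(0.226, 0.184)   # roughly 0.186
```

The relative reduction works out to roughly 18.6%, in line with the figure of about 18% stated in the description.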
Drawings
FIG. 1 is a flow chart of the modeling steps of the present invention;
FIG. 2 is a diagram illustrating a complex enterprise relationship network constructed using time-series data of stock prices for a listed enterprise;
FIG. 3 is a flow chart of Louvain algorithm community discovery;
FIG. 4 is a flowchart of the PageRank algorithm;
FIG. 5 is a schematic diagram of a neural network algorithm;
FIG. 6 is a graph of iterative Loss drop of neural network algorithm fit;
FIG. 7 is a graph of the fit before and after adding a network feature.
Detailed Description
The invention is further illustrated with reference to the following figures and examples.
The structures, proportions, and sizes shown in the drawings serve only to accompany the content disclosed in the specification so that those skilled in the art can understand and read it. They are not intended to limit the conditions under which the invention can be implemented and therefore have no substantive technical significance. Any structural modification, change in proportion, or adjustment of size that does not affect the effects and purposes achievable by the invention still falls within the scope of the disclosed technical content. In addition, terms such as "upper", "lower", "left", "right", "middle", and "one" are used in this specification only for clarity of description and are not intended to limit the implementable scope of the invention; changes or adjustments of their relative relationships, without substantive changes to the technical content, are likewise regarded as within the implementable scope of the invention.
Referring to FIGS. 1 to 7, the invention takes A-share listed enterprises in China as an example and performs predictive modeling analysis of the income capacity of different enterprises. A-share listed companies from 2016 to 2018 are used as the test sample (because income data for 2020 had not yet been published, the enterprise financial data for 2019 could not be used). Companies with more than 20% of their index-system data missing, companies whose main business changed substantially, and ST companies are removed.
As shown in FIG. 1, the method for predicting the income of listed enterprises by a neural network fusing network characteristics comprises the following steps:
S1: Collection of the data set, which consists of two parts: dynamic data and static data. The dynamic data is the market value of the different listed enterprises, and the static data is the financial data of the listed enterprises together with overall market data.
Both the dynamic data and the static data of the listed enterprises need to be collected, in the following substeps.
S1.1: Collect stock market data of the listed enterprises. The data used by the model come from the China Guotai'an (CSMAR) database and consist of the weekly closing prices of all A-share listed enterprises from 2016 to 2018.
S1.2: Collect market data and financial data of the listed enterprises. The financial indexes of the listed enterprises include overall market data, such as gross domestic product, China's balance of international payments, and completed fixed asset investment, as well as enterprise financial data, such as accounts receivable turnover rate and accounts receivable turnover days, as shown in Table 1.
Table 1: Financial indexes of listed enterprises
[Table 1: provided as images in the original publication; not reproduced in this text]
S2: Preprocessing of the data set, in which missing values and abnormal values of the data are handled. Null values in the sample data are filled with the median. In addition, because the measurement units and value ranges of the indexes in the index system differ, the data are normalized by the Min-Max scaling method so that differences in value range do not unduly influence the result of the prediction model. The calculation formula (1) is as follows:
$$z'_{ij} = \frac{z_{ij} - z_{i\min}}{z_{i\max} - z_{i\min}} \quad (1)$$
where $z_{ij}$ is the j-th row of data in the i-th characteristic index of the sample data set, $z'_{ij}$ is its normalized value, i and j are integers greater than or equal to 1, and $z_{i\min}$ and $z_{i\max}$ are the minimum and maximum values of the i-th characteristic index of the sample data set.
S3: Constructing the enterprise relationship complex network from the stock price time series data of listed enterprises, comprising the following substeps:
S3.1: The stock price time series is taken as the dynamic characteristic of each listed enterprise, and the correlation of this characteristic between listed enterprises is calculated using the Pearson correlation coefficient. Let $v_i(t)$ be the closing price of stock i at time t; the value gain $\Delta v_i(t)$ of stock i at that time is given by formula (2):
[Formula (2): image not reproduced in this text]
where t denotes time, $\Delta t$ is the time period over which the gain is obtained, and i and j are integers greater than or equal to 1. The Pearson correlation coefficient $p_{ij}$ between any two stocks i and j is the ratio of the covariance of the two variables $v_i$ and $v_j$ to the product of their standard deviations, as in formula (3):
$$p_{ij} = \frac{\operatorname{cov}(v_i, v_j)}{\sigma_{v_i}\,\sigma_{v_j}} \quad (3)$$
S3.2: Step S3.1 yields the set of connection relations between pairs of listed enterprises, $E = \{e_{11}, e_{12}, \ldots, e_{ij}\}$, where $p_{ij}$ is the Pearson correlation coefficient between stocks i and j, and $e_{ij}$, the basic connection coefficient, indicates that an association exists between listed enterprises i and j. As shown in FIG. 2, this gives the relationship network $G = (V, E, W)$, where V is the set of network nodes (the listed enterprises), E is the set of connecting edges between pairs of listed enterprises, and W is the set of weights between listed enterprises. The weight $w_{ij}$ between enterprises i and j is calculated by formula (4):
$$w_{ij} = e_{ij} \cdot p_{ij} \quad (4)$$
Under this formula, strongly correlated enterprises are joined by heavily weighted edges and weakly correlated enterprises by lightly weighted edges.
S4: Dividing the network into communities using the Louvain algorithm to obtain the community property of each node. The flow is shown in FIG. 3 and comprises the following substeps:
S4.1: Initialize the nodes and communities by regarding each node in the network as an independent community, so that the number of communities equals the number of nodes.
S4.2: For each node m (m being a positive integer), try in turn to assign node m to the community of each of its neighbor nodes, calculate the modularity change ΔQ before and after the assignment, and record the neighbor node with the maximum ΔQ; if max ΔQ is greater than 0, assign node m to the community of that neighbor node, otherwise leave node m unchanged.
S4.3: Repeat S4.2 until the community attributes of the nodes no longer change.
S4.4: Regard all nodes in the same community as one new node, convert the weights of the edges between nodes inside the community into the weight of a self-loop on the new node, and convert the edge weights between communities into the weights of the edges between the new nodes.
S4.5: Repeat from S4.1 until the modularity Q of the whole network no longer changes.
S5: Calculating the importance of the nodes in the listed enterprise relevance network using the PageRank algorithm to obtain the PageRank value of each node. The flow is shown in FIG. 4 and comprises the following substeps:
S5.1: Initialize the PageRank value of each node as $\mathrm{PageRank}(d_m)^{1}$, where m is a node of the network;
S5.2: Traverse the nodes in the network and update the PageRank value of each node according to the PageRank value of the target node and the weights of its neighbor nodes, using formula (5):
[Formula (5): image not reproduced in this text]
where $\mathrm{PageRank}(d_i)^{k+1}$ is the PageRank value of node $d_i$ at iteration k+1, $M(d_i)$ is the set of neighbor nodes of node $d_i$, and $d_v$ is a neighbor node of node $d_i$;
S5.3: Calculate the update quantity δ of the PageRank values of all nodes in the network, using formula (6):
[Formula (6): image not reproduced in this text]
where D is the set of all nodes, N is the number of nodes, and $\mathrm{PageRank}(d_i)^{k+1}$ is the PageRank value of node $d_i$ at iteration k+1;
S5.4: Stop iterating when δ ≤ ε, where ε is a constant; otherwise, repeat step S5.2.
S6: Performing secondary calculation on the static data according to the corresponding indexes to construct dependent variable data. The index system is constructed with reference to the index evaluation principle of multivariate statistical theory, and the specific calculation is shown in Table 1. In addition, the network characteristic indexes extracted in S4 and S5 are fused into the index system, in the following substeps:
S6.1: Add the community property of each listed enterprise: use One-Hot encoding to convert the community attribute of each listed enterprise node into multi-bit binary data, expanding the variable dimensions.
S6.2: Add the PageRank value of each listed enterprise, expanding the variable dimensions.
S7: Predicting and evaluating the income capacity of the enterprise using a neural network algorithm. As shown in FIG. 5, the neural network model comprises forward propagation and backpropagation. During forward propagation, the output of the previous layer serves as the input of the next layer; the inputs are weighted and summed, a bias is added, and the result is passed through the activation function, as in formula (7):
[Formula (7): image not reproduced in this text]
where f is the activation function (the Sigmoid function in this neural network), $w_{pq}$ is the weight between layer p and layer q of the neural network, $x_p$ is the input at layer p of the neural network, and $b_p$ is the bias of layer p of the neural network.
The above describes an embodiment, in the field of financial data mining, of the method for predicting the income of listed enterprises by a neural network fusing network characteristics. The method uses the stock market value and basic financial information of listed enterprises from 2016 to 2018 and applies the neural network prediction model after adding the network characteristic indexes. The final prediction is more accurate, which enhances the objectivity, scientific rigor, and accuracy of the evaluation work, addresses the lack of quantitative analysis in income prediction during enterprise value evaluation, reduces the influence of analysts' subjective factors on income prediction, improves the interpretability of the evaluation work, and meets the needs of practical use. In the experiment on real data, the model improved markedly after the network characteristic parameters were fused: the mean square error (MSE) of the model output, the business income growth rate (%), fell from 0.226 to 0.184. FIG. 7 shows the deviation between the true values and the predicted values.
Although the embodiments of the present invention have been described with reference to the accompanying drawings, this description is not intended to limit the scope of the present invention. Those skilled in the art should understand that various modifications and variations that can be made without inventive effort on the basis of the technical solution of the present invention remain within its scope of protection.

Claims (7)

1. A method for predicting financial data by a neural network fusing network characteristics, characterized by comprising the following steps:
S1: data acquisition, including dynamic data and static data;
S2: data set preprocessing, including processing missing values and outliers in the data;
S3: constructing a complex network of enterprise relationships from time-series data of enterprise stock prices;
S4: dividing the network into communities using the Louvain algorithm to obtain the community attribute of each node;
S5: calculating the importance of each node in the enterprise relationship network using the PageRank algorithm to obtain the PageRank value of each node;
S6: performing secondary calculation on the static data according to the corresponding indexes and constructing the dependent variable data; when constructing the index system, referring to the index evaluation principles of multivariate statistical theory and fusing in the network characteristic indexes extracted in steps S4 and S5, specifically comprising the following sub-steps:
S6.1: adding the community attribute of each listed enterprise, using One-Hot encoding to convert the community attribute of the listed enterprise node into multi-bit binary data, thereby expanding the variable dimensions;
S6.2: adding the PageRank value of each listed enterprise, thereby expanding the variable dimensions;
S7: predicting and evaluating the revenue capacity of the enterprise using a neural network algorithm.
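The index fusion of sub-steps S6.1 and S6.2 can be sketched as follows. This is only an illustration under stated assumptions: the function names `one_hot` and `fuse_features` are hypothetical, and the financial rows, community labels and PageRank values are made-up numbers.

```python
def one_hot(labels):
    """Map each community label to a 0/1 vector with one position per community."""
    categories = sorted(set(labels))
    index = {c: k for k, c in enumerate(categories)}
    vectors = []
    for label in labels:
        vec = [0] * len(categories)
        vec[index[label]] = 1
        vectors.append(vec)
    return categories, vectors


def fuse_features(financial_rows, communities, pagerank_values):
    """Append the one-hot community bits (S6.1) and the PageRank value (S6.2)."""
    _, encoded = one_hot(communities)
    return [
        list(row) + bits + [pr]
        for row, bits, pr in zip(financial_rows, encoded, pagerank_values)
    ]


# Hypothetical inputs for three enterprises (all values are made up).
financial_rows = [[0.12, 0.80], [0.45, 0.10], [0.33, 0.57]]
communities = [2, 0, 2]
pagerank_values = [0.41, 0.22, 0.37]
fused = fuse_features(financial_rows, communities, pagerank_values)
```

Each enterprise row grows by one column per community plus one column for the PageRank value, which is the variable dimension expansion the claim refers to.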
2. The method for predicting financial data by a neural network fusing network characteristics as claimed in claim 1, wherein in step S1 the dynamic data are the market values of all listed enterprises in the stock market during a set period, and the static data are the financial data of the enterprises and the macroscopic factors in the overall market data, namely the financial indexes of the listed enterprises.
3. The method for predicting financial data by a neural network fusing network characteristics as claimed in claim 1, wherein in step S2 null values in the sample data are filled with the median and the data are normalized by the max-min method, calculated as in formula (1):
$$z'_{ij} = \frac{z_{ij} - z_{i\min}}{z_{i\max} - z_{i\min}} \quad (1)$$
where $z_{ij}$ represents the j-th row of data in the i-th characteristic index of the sample data set, i and j are integers greater than or equal to 1, and $z_{i\min}$ and $z_{i\max}$ respectively represent the minimum value and the maximum value of the i-th characteristic index of the sample data set.
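The preprocessing of this claim can be sketched as below, assuming missing values are represented as `None` and that each characteristic index is held as one list. The function names are hypothetical, and returning zeros for a constant column is an assumption the claim does not address.

```python
from statistics import median


def fill_missing_with_median(column):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in column if v is not None]
    m = median(observed)
    return [m if v is None else v for v in column]


def min_max_normalize(column):
    """Scale values into [0, 1] using (z - min) / (max - min)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant column carries no information, so map it to 0.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


# One characteristic index with a missing entry (values are made up).
raw = [10.0, None, 30.0, 20.0]
filled = fill_missing_with_median(raw)
normalized = min_max_normalize(filled)
```

Here the missing entry is replaced by the median of the three observed values before scaling, so the fill does not shift the column's minimum or maximum.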
4. The method for predicting financial data by a neural network fusing network characteristics as claimed in claim 1, wherein step S3 comprises the following sub-steps:
S3.1: taking the stock price time series of each listed enterprise as its dynamic characteristic and calculating the correlation of this characteristic between listed enterprises using the Pearson correlation coefficient; assuming that $v_i(t)$ is the closing price of stock i at time t, the value gain $\Delta v_i(t)$ of stock i at that time is given by formula (2):
Figure FDA0003022135770000021
where t represents time, $\Delta t$ represents the time period over which the gain is obtained, and i and j are each integers greater than or equal to 1; the Pearson correlation coefficient $p_{ij}$ between any two stocks i and j is calculated as the ratio of the covariance of the two variables $v_i$ and $v_j$ to the product of their standard deviations, as given by formula (3):
$$p_{ij} = \frac{\operatorname{cov}(v_i, v_j)}{\sigma_{v_i}\,\sigma_{v_j}} \quad (3)$$
S3.2: from step S3.1, the connection relationships $E = \{e_{11}, e_{12}, \ldots, e_{ij}\}$ between pairs of listed enterprises are obtained, where $p_{ij}$ represents the Pearson correlation coefficient between stocks i and j, $e_{ij}$ is the association relationship between listed enterprises i and j, and E is the set of basic connection coefficients; this yields the relationship network G = (V, E, W), where V represents the network node entities of the listed enterprises, E represents the connecting-edge relationships between pairs of listed enterprises, and W represents the weight relationships between listed enterprises; the weight $w_{ij}$ between enterprises i and j is calculated as in formula (4):
$$w_{ij} = e_{ij} \cdot p_{ij} \quad (4)$$
Under this formula, strongly correlated enterprises are joined by strongly weighted edges and weakly correlated enterprises by weakly weighted edges.
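The network construction of step S3 can be sketched as follows. Several details are assumptions, because the images of formulas (2) and (3) are not reproduced and the claim does not say how $e_{ij}$ is chosen: the gain is taken as a simple price difference, the Pearson coefficient is computed on the gain series, and $e_{ij}$ is set to 1 when the correlation reaches a hypothetical `threshold` and to 0 otherwise. The closing prices are made up.

```python
import math


def gains(prices, dt=1):
    """Price change over a window of dt steps (assumed form of the gain)."""
    return [prices[t] - prices[t - dt] for t in range(dt, len(prices))]


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)  # undefined for a constant series


def build_network(series, threshold=0.5):
    """Return {(i, j): w_ij} with w_ij = e_ij * p_ij for each pair i < j."""
    names = sorted(series)
    weights = {}
    for a in range(len(names)):
        for b in range(a + 1, len(names)):
            i, j = names[a], names[b]
            p = pearson(series[i], series[j])
            e = 1 if p >= threshold else 0  # assumed rule for the edge indicator
            if e:
                weights[(i, j)] = e * p
    return weights


# Made-up closing prices for three stocks.
closing = {
    "A": [10.0, 11.0, 13.0, 12.0, 15.0],
    "B": [20.0, 22.0, 26.0, 24.0, 30.0],
    "C": [5.0, 5.5, 5.2, 5.9, 5.6],
}
gain_series = {name: gains(prices) for name, prices in closing.items()}
network = build_network(gain_series)
```

Keeping only pairs above a positive threshold is one possible design choice; it avoids negative edge weights, which community detection and PageRank do not handle directly.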
5. The method for predicting financial data by a neural network fusing network characteristics as claimed in claim 1, wherein step S4 comprises the following sub-steps:
S4.1: initializing the nodes and communities by regarding each node in the network as an independent community, so that the number of communities equals the number of nodes;
S4.2: for each node m, where m is a positive integer, tentatively assigning node m in turn to the community of each of its neighbor nodes, calculating the modularity change ΔQ before and after the assignment, and recording the community attribute of the neighbor node that gives the largest ΔQ; if max ΔQ > 0, assigning node m to the community of that neighbor node, and otherwise leaving node m unchanged;
S4.3: repeating S4.2 until the community attributes of the nodes no longer change;
S4.4: treating all nodes in the same community as one new node, converting the weights of the edges between nodes inside the community into the weight of a self-loop on the new node, and converting the edge weights between communities into the weights of the edges between the new nodes;
S4.5: repeating from S4.1 until the modularity Q of the whole network no longer changes.
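Sub-steps S4.1 to S4.3 can be sketched as below. This is a deliberately simple illustration, not an efficient Louvain implementation: ΔQ is obtained by recomputing the full modularity before and after each tentative move, and the aggregation of S4.4 and S4.5 is omitted. The graph representation (`adj[u][v]` holding the edge weight, stored symmetrically) and the tolerance `tol` are assumptions.

```python
def modularity(adj, community):
    """Newman modularity Q of a weighted undirected graph."""
    two_m = sum(w for u in adj for w in adj[u].values())  # each edge counted twice
    degree = {u: sum(adj[u].values()) for u in adj}
    q = 0.0
    for u in adj:
        for v in adj:
            if community[u] == community[v]:
                q += adj[u].get(v, 0.0) - degree[u] * degree[v] / two_m
    return q / two_m


def local_moving(adj, tol=1e-12):
    """Greedy node moves (S4.1 to S4.3) until no move increases Q."""
    community = {u: u for u in adj}           # S4.1: one community per node
    moved = True
    while moved:                               # S4.3: repeat until stable
        moved = False
        for m in adj:                          # S4.2: try each neighbor's community
            original = community[m]
            base_q = modularity(adj, community)
            best_gain, best_comm = 0.0, original
            for neighbor in adj[m]:
                candidate = community[neighbor]
                if candidate == original:
                    continue
                community[m] = candidate
                gain = modularity(adj, community) - base_q  # delta Q of this move
                if gain > best_gain + tol:
                    best_gain, best_comm = gain, candidate
            community[m] = best_comm           # keep the best move, or stay put
            if best_comm != original:
                moved = True
    return community


def make_graph(edges):
    """Build a symmetric adjacency dict from (u, v, weight) triples."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


# Two triangles joined by a single edge (made-up example).
edges = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0), (2, 3, 1.0),
         (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0)]
graph = make_graph(edges)
communities = local_moving(graph)
```

On this toy graph the local moving phase groups each triangle into its own community. A production implementation would compute ΔQ incrementally instead of recomputing Q for every tentative move.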
6. The method for predicting financial data by a neural network fusing network characteristics as claimed in claim 1, wherein step S5 comprises the following sub-steps:
S5.1: initializing the PageRank value of each node, denoted $\mathrm{PageRank}(d_m)^{1}$ (m is a node of the network, as above);
S5.2: traversing the nodes in the network and updating the PageRank value of each node according to the PageRank value of the target node and the weights of its neighbor nodes, calculated as in formula (5):
Figure FDA0003022135770000031
where $\mathrm{PageRank}(d_i)^{k+1}$ represents the PageRank value of node $d_i$ at iteration k+1, $M(d_i)$ represents the set of neighbor nodes of node $d_i$, and $d_v$ represents a neighbor node of node $d_i$;
S5.3: calculating the PageRank update amount δ over all nodes in the network, as in formula (6):
Figure FDA0003022135770000032
where D represents the set of all nodes, N represents the number of nodes, and $\mathrm{PageRank}(d_i)^{k+1}$ represents the PageRank value of node $d_i$ at iteration k+1;
S5.4: stopping the iteration when δ ≤ ε, where ε is a constant; otherwise, repeating step S5.2.
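The iteration of sub-steps S5.1 to S5.4 can be sketched as below. Because the images of formulas (5) and (6) are not reproduced, the update rule here is an assumption: it is the common weighted PageRank with a damping factor (0.85, a conventional default), and δ is taken as the mean absolute change per node. The names `damping`, `epsilon` and `max_iter` are hypothetical parameters.

```python
def weighted_pagerank(adj, damping=0.85, epsilon=1e-8, max_iter=1000):
    """Iterate weighted PageRank on a symmetric adjacency dict until delta <= epsilon."""
    nodes = list(adj)
    n = len(nodes)
    strength = {u: sum(adj[u].values()) for u in nodes}  # total edge weight per node
    rank = {u: 1.0 / n for u in nodes}                    # S5.1: uniform start (assumed)
    for _ in range(max_iter):
        new_rank = {}
        for i in nodes:                                   # S5.2: update every node
            incoming = sum(
                rank[v] * adj[v][i] / strength[v]
                for v in adj[i]
                if strength[v] > 0
            )
            new_rank[i] = (1.0 - damping) / n + damping * incoming
        # S5.3: update amount, here the mean absolute change (assumed form)
        delta = sum(abs(new_rank[u] - rank[u]) for u in nodes) / n
        rank = new_rank
        if delta <= epsilon:                              # S5.4: stop when converged
            break
    return rank


# Star-shaped toy network: enterprise "A" is linked to all others (made up).
star = {
    "A": {"B": 1.0, "C": 1.0, "D": 1.0},
    "B": {"A": 1.0},
    "C": {"A": 1.0},
    "D": {"A": 1.0},
}
ranks = weighted_pagerank(star)
```

In the star example the hub receives the largest value, which matches the intuition that PageRank measures a node's importance in the relationship network.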
7. The method for predicting financial data by a neural network fusing network characteristics as claimed in claim 1, wherein in step S7 the neural network model is divided into forward propagation and backward propagation; during forward propagation, the output of the previous layer serves as the input of the next layer, the input is weighted and summed, a bias is added, and the result is passed through the activation function, as given by formula (7):
Figure FDA0003022135770000033
where f is the activation function (the Sigmoid function in this neural network), $w_{pq}$ represents the weight value between layer p and layer q of the neural network, $x_p$ represents the input at layer p of the neural network, and $b_p$ represents the bias of layer p of the neural network.
CN202110405540.8A 2021-04-15 2021-04-15 Method for predicting financial data by neural network fusing network characteristics Pending CN113052395A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110405540.8A CN113052395A (en) 2021-04-15 2021-04-15 Method for predicting financial data by neural network fusing network characteristics

Publications (1)

Publication Number Publication Date
CN113052395A true CN113052395A (en) 2021-06-29

Family

ID=76520380

Country Status (1)

Country Link
CN (1) CN113052395A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20230027903A (en) * 2021-08-20 2023-02-28 유한책임회사 블루바이저시스템즈 Apparatus for predicting fluctuation of stock price based on learning model
KR102614106B1 (en) 2021-08-20 2023-12-14 유한책임회사 블루바이저시스템즈 Apparatus for predicting fluctuation of stock price based on learning model
CN113837879A (en) * 2021-09-14 2021-12-24 上证所信息网络有限公司 Abnormal detection method for index quotation
CN113837879B (en) * 2021-09-14 2023-12-19 上证所信息网络有限公司 Abnormality detection method for index quotation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination