CN111798061A

CN111798061A - Financial data prediction system based on block chain and cloud computing

Info

Publication number: CN111798061A
Application number: CN202010652720.1A
Authority: CN
Inventors: 刘星
Original assignee: Yangpu Minoan Electronic Technology Co ltd
Current assignee: Shenzhen rongyida Technology Development Co.,Ltd.
Priority date: 2020-07-08
Filing date: 2020-07-08
Publication date: 2020-10-20
Anticipated expiration: 2040-07-08
Also published as: CN111798061B

Abstract

The financial data prediction system comprises a first data acquisition module, a second data acquisition module, a financial data preprocessing module, a block chain storage module, a financial data prediction module and a visualization module. The first data acquisition module acquires historical financial time series data, the second data acquisition module acquires real-time financial time series data, the financial data preprocessing module processes the financial data, and the block chain storage module stores the processed historical financial time series data. The system determines the degree of abnormality of the financial data under test in two ways: globally, by using a distribution detection coefficient to measure the distribution relationship between the financial data under test and its neighborhood financial data; and locally, by using a neighborhood detection coefficient to measure the relationship between the financial data under test and the subclass in its neighborhood detection set. This effectively improves the detection accuracy for noise data.

Description

Financial data prediction system based on block chain and cloud computing

Technical Field

The invention relates to the field of financial data processing, in particular to a financial data prediction system based on a block chain and cloud computing.

Background

In recent years, with rapid economic development, the changing trends of the financial market have become increasingly complex. Understanding the patterns of financial activity and predicting how it will develop and change are key research topics in academia and industry. Predicting the future trend of financial data helps in understanding the development of the financial market at the macroscopic level, and provides a basis for investors and companies to make trading decisions and plans at the microscopic level.

Disclosure of Invention

In view of the above problems, the present invention aims to provide a financial data prediction system based on a block chain and cloud computing.

The two algorithmic improvements of the invention lack unity of invention but belong to the same technical framework, so the applicant filed two related applications on the same day. This application and the applicant's same-day application, "A financial data prediction system based on a block chain and big data", share part of their content. This application mainly seeks protection for the algorithm improvement of the preprocessing module and for the system as a whole; the other application mainly seeks protection for the improvement of the neural network algorithm of the prediction module.

The purpose of the invention is realized by the following technical scheme:

A financial data prediction system based on a block chain and cloud computing comprises a first data acquisition module, a second data acquisition module, a financial data preprocessing module, a block chain storage module, a financial data prediction module and a visualization module. The first data acquisition module collects historical financial time series data and inputs it to the financial data preprocessing module for processing; the processed historical financial time series data is transmitted to the block chain storage module for storage. The second data acquisition module collects real-time financial time series data and inputs it to the financial data preprocessing module for processing; the processed real-time financial time series data is input to the financial data prediction module. The financial data prediction module predicts the future trend of the current financial data according to the input real-time financial time series data, and the visualization module displays the prediction result of the financial data prediction module.

Preferably, the financial data preprocessing module is configured to normalize the input financial time series data and remove noise data from the normalized financial time series data. Let R denote the normalized financial time series data set, R = {r(s), s = 1, 2, ..., m}, where r(s) denotes the s-th financial data in the set R and m denotes the total number of financial data in R. Let C(s) denote the neighborhood data set corresponding to the financial data r(s): given a positive integer n with 0 < n < m, the n financial data in R closest to r(s) are selected and added to the set C(s), so that C(s) = {r(e, s), e = 1, 2, ..., n}, where r(e, s) denotes the e-th financial data in C(s). The financial data in the set C(s) is then screened and classified. Let C_u(s) denote the u-th subclass obtained by screening and classifying the financial data in C(s); financial data in C(s) is selected and added to subclass C_u(s) in the following way, specifically:

Given a distance threshold D, let r(p, s) denote the p-th financial data in the set C(s) that has not been added to any subclass, and let r(q, s) denote the q-th financial data in the set C(s) that has not been added to any subclass; then

[formula not reproduced]

One financial data that has not been added to any subclass is randomly selected from the set C(s) and added to subclass C_u(s), and the financial data in C(s) that has not been added to any subclass continues to be screened and classified. Let r(v, s) denote the v-th financial data in C(s) that has not been added to any subclass, and let r_u(d, s) denote the d-th financial data in subclass C_u(s). When the financial data r(v, s) satisfies:

[formula not reproduced]

then the financial data r(v, s) is added to subclass C_u(s), and the financial data in C(s) that has not been added to any subclass continues to be screened and classified. When no new financial data is added to subclass C_u(s), the selection of financial data from C(s) into subclass C_u(s) stops. Here η(r(v, s), r_u(d, s)) denotes a value function: when |r_u(d, s) - r(v, s)| ≤ D, η(r(v, s), r_u(d, s)) = 0; when |r_u(d, s) - r(v, s)| > D, η(r(v, s), r_u(d, s)) = 1;

Let m_u(s) denote the number of financial data in subclass C_u(s). When m_u(s) ≥ 2, subclass C_u(s) is retained; when m_u(s) = 1, the financial data in subclass C_u(s) is deleted from the set C(s). The above method is then applied again to select financial data from C(s) into a subclass C_u(s), and the screening and classification of the financial data in C(s) stops once all financial data in C(s) has been added to subclasses;
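The neighborhood construction and subclass screening described above can be sketched as follows. This is a minimal illustration, not the patented procedure: min-max scaling is an assumed choice of normalization, the join rule (a value joins a subclass when it lies within D of at least one member) is an assumption because the exact condition is not reproduced in the text, and the function names are hypothetical. The seed of each subclass is taken in order rather than at random so that the result is reproducible.

```python
import numpy as np

def min_max_normalize(values):
    """Scale a 1-D series into [0, 1] (assumed normalization; the text does not name one)."""
    x = np.asarray(values, dtype=float)
    span = x.max() - x.min()
    return np.zeros_like(x) if span == 0 else (x - x.min()) / span

def neighborhood(R, s, n):
    """Indices of the n values in R closest to R[s], excluding s itself: the set C(s)."""
    dist = np.abs(R - R[s])
    dist[s] = np.inf  # never pick the point itself
    return np.argsort(dist, kind="stable")[:n]

def screen_subclasses(values, D):
    """Group values into subclasses and drop subclasses with a single member.

    Assumption: a value joins a subclass when it lies within D of at least
    one current member; the exact join condition is not given in the text.
    """
    remaining = list(values)
    kept = []
    while remaining:
        cluster = [remaining.pop(0)]  # seed (the text selects the seed at random)
        grew = True
        while grew:  # repeat until no new value joins the subclass
            grew = False
            for v in remaining[:]:
                if any(abs(v - c) <= D for c in cluster):
                    cluster.append(v)
                    remaining.remove(v)
                    grew = True
        if len(cluster) >= 2:  # m_u(s) >= 2: retain; m_u(s) == 1: delete
            kept.append(sorted(cluster))
    return kept
```

For example, `screen_subclasses([0.0125, 0.025, 0.5, 0.0375], D=0.02)` keeps the three nearby values as one subclass and discards the isolated value 0.5.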

Let C'(s) denote the set of financial data in C(s) that has not been deleted after screening and classification, and let C_θ(s) denote the θ-th subclass obtained by screening and classifying the financial data in C(s), where C_θ(s) is the subclass closest to the financial data r(s) among the subclasses obtained, and

[formula not reproduced]

where m(s) denotes the number of subclasses into which the financial data in C(s) is divided, and r_u(g, s) denotes the g-th financial data in subclass C_u(s). A neighborhood threshold H(θ) is given, and

[formula not reproduced]

where r_θ(ψ, s) denotes the ψ-th financial data in subclass C_θ(s). Let C"(s) denote the neighborhood detection set of the financial data r(s), and let r'(a, s) denote the a-th financial data in the set C'(s). When the financial data r'(a, s) satisfies |r'(a, s) - r(s)| ≤ H(θ), r'(a, s) is added to the set C"(s), so that C"(s) = {r"(a, s), a = 1, 2, ..., m"(s)}, where r"(a, s) denotes the a-th financial data in C"(s) and m"(s) denotes the total number of financial data in C"(s). An abnormal value y(s) corresponding to the financial data r(s) is defined, and the expression of y(s) is:

y(s)＝y₁(s)*y₂(s)

where y₁(s) denotes the distribution detection coefficient of the financial data r(s) in the set C'(s), y₂(s) denotes the neighborhood detection coefficient of r(s) in the set C'(s), and ρ(s) denotes the distribution coefficient of r(s) in the set C"(s), and

[formula not reproduced]

where R(max) and R(min) denote the maximum and minimum financial data values in the financial time series data set R, and f₁(ρ(s), H(ρ)) denotes the first judgment function corresponding to the financial data r(s): when ρ(s) > H(ρ), f₁(ρ(s), H(ρ)) = 1, and when ρ(s) ≤ H(ρ), f₁(ρ(s), H(ρ)) = 0. Let C"(a, s) denote the neighborhood detection set of the financial data r"(a, s), and let ρ(a, s) denote the distribution coefficient of r"(a, s) in the set C"(a, s),

[formula not reproduced]

where r'(b, a, s) denotes the b-th financial data in the set C'(a, s), m'(a, s) denotes the total number of financial data in the set C'(a, s), and f₁(ρ(a, s), H(ρ)) denotes the first judgment function corresponding to the financial data r"(a, s): when ρ(a, s) > H(ρ), f₁(ρ(a, s), H(ρ)) = 1, and when ρ(a, s) ≤ H(ρ), f₁(ρ(a, s), H(ρ)) = 0, where H(ρ) is a given distribution detection threshold,

The second judgment function takes one of two values according to the following conditions:

[formula not reproduced]

where r_θ(L, s) denotes the L-th financial data in subclass C_θ(s), and D(C_θ(s)) denotes the class detection threshold corresponding to subclass C_θ(s), and

[formula not reproduced]

where r_θ(K, s) denotes the K-th financial data in subclass C_θ(s);

When the abnormal value y(s) corresponding to the financial data r(s) is greater than 1, r(s) is judged to be noise data and is deleted from the financial time series data set R; when y(s) is less than or equal to 1, r(s) is judged to be normal data and is retained in R.
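The decision rule above can be sketched as a simple filter. In this sketch `anomaly_value` is a hypothetical callable standing in for y(s) = y₁(s)*y₂(s), whose two factors are not reproduced in the text; `toy_anomaly_value` is a made-up score used only to exercise the rule and is not the patented coefficient.

```python
def remove_noise(series, anomaly_value):
    """Keep r(s) when y(s) <= 1 and drop it when y(s) > 1.

    `anomaly_value(series, s)` is a hypothetical stand-in for y(s).
    All scores are computed on the original series before anything is removed.
    """
    scores = [anomaly_value(series, s) for s in range(len(series))]
    return [r for r, y in zip(series, scores) if y <= 1]

def toy_anomaly_value(series, s):
    """Toy score (assumption): distance to the median divided by 0.5."""
    ordered = sorted(series)
    median = ordered[len(ordered) // 2]
    return abs(series[s] - median) / 0.5

cleaned = remove_noise([0.10, 0.12, 0.11, 0.95, 0.13], toy_anomaly_value)
```

Scoring every point before deleting any of them is a design choice of this sketch; the text does not state whether deletion of one point affects the neighborhoods of the others.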

Preferably, the financial data prediction module uses a BP (back-propagation) neural network to predict the future trend of the financial time series data. The module retrieves the historical financial time series data stored in the block chain storage module and uses it to train the BP neural network to predict the future trend of financial data. The processed real-time financial time series data is then used as the input of the trained BP neural network, and the output of the trained BP neural network is the predicted future trend of the financial data.
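To make the training and prediction flow concrete, the following is a minimal sketch of a one-hidden-layer network trained by back-propagation on sliding windows of a series. The window width, hidden size, learning rate, tanh activation and all function names are assumptions for illustration; the text does not specify the network architecture or the training settings.

```python
import numpy as np

def make_windows(series, width):
    """Turn a 1-D series into (inputs, targets): `width` past values -> next value."""
    x = np.asarray(series, dtype=float)
    X = np.stack([x[i:i + width] for i in range(len(x) - width)])
    return X, x[width:]

def train_bp(X, y, hidden=8, lr=0.05, epochs=2000, seed=0):
    """Train a one-hidden-layer network by back-propagation (full-batch gradient descent)."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0, 0.5, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.5, hidden)
    b2 = 0.0
    for _ in range(epochs):
        h = np.tanh(X @ W1 + b1)                   # forward pass, hidden layer
        err = h @ W2 + b2 - y                      # prediction error per sample
        grad_h = np.outer(err, W2) * (1 - h ** 2)  # error propagated back through tanh
        W2 -= lr * h.T @ err / len(y)
        b2 -= lr * err.mean()
        W1 -= lr * X.T @ grad_h / len(y)
        b1 -= lr * grad_h.mean(axis=0)
    return W1, b1, W2, b2

def predict(params, X):
    """Run the trained network on input windows X."""
    W1, b1, W2, b2 = params
    return np.tanh(X @ W1 + b1) @ W2 + b2
```

In use, the windows built from the stored historical series would be passed to `train_bp`, and windows built from the preprocessed real-time series would be passed to `predict`.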

Preferably, during training of the BP neural network, the initial weights and thresholds of the BP neural network are optimized by a particle swarm algorithm.

Preferably, the particles in the particle swarm are updated using the following formulas:

V_i(t+1)＝ω_i(t)V_i(t)+c₁rand()(P_i(t)-X_i(t))+c₂rand()(G(t)-X_i(t))

X_i(t+1)＝X_i(t)+V_i(t+1)

where X_i(t+1) and V_i(t+1) denote the position and step size of particle i at the (t+1)-th iteration, X_i(t) and V_i(t) denote the position and step size of particle i at the t-th iteration, c₁ and c₂ are learning factors, rand() is a randomly generated number in (0, 1), P_i(t) denotes the historical optimal solution of particle i at the t-th iteration, ω_i(t) denotes the inertia weight factor of particle i at the t-th iteration, and G(t) denotes the global reference solution of the particle swarm at the t-th iteration. G(t) is determined in the following manner:

Let P(t) denote the set of historical optimal solutions of the particles in the particle swarm at the t-th iteration, P(t) = {P_i(t), i = 1, 2, ..., N}, where N denotes the number of particles in the swarm. The historical optimal solutions in P(t) are screened: when identical historical optimal solutions exist in P(t), only one of them is retained and the others are deleted. The screened set is denoted P'(t), with P'(t) = {P(j, t), j = 1, 2, ..., N'(t)}, where P(j, t) denotes the j-th historical optimal solution in P'(t) and N'(t) denotes the number of historical optimal solutions in P'(t). The neighborhood detection distance corresponding to the historical optimal solutions in P'(t) is defined as d(t), and the expression of d(t) is:

[formula not reproduced]

where d₀ denotes a given initial neighborhood detection distance and T_max denotes a given maximum number of iterations;

The historical optimal solutions in the set P'(t) are then examined. Let O(j, t) denote the neighborhood detection range corresponding to the historical optimal solution P(j, t), where O(j, t) is the circular region centered on P(j, t) with radius d(t). A global reference value corresponding to the historical optimal solution P(j, t) is defined, and its expression is:

[formula not reproduced]

where f₃(h(P(j, t))) denotes a third value function corresponding to the historical optimal solution P(j, t); h(P(j, t)) denotes the fitness function value corresponding to P(j, t); a further symbol (not reproduced) denotes the mean of the fitness function values corresponding to the historical optimal solutions in the set P'(t); f₃(h(P(j, t))) = 1 in one case and f₃(h(P(j, t))) = +∞ in the other (the two conditions are not reproduced); h(min, t) and h(max, t) denote the minimum and maximum fitness function values corresponding to the particles in the swarm at the t-th iteration; a count over O(j, t) denotes the number of historical optimal solutions of P'(t) lying within the neighborhood detection range O(j, t); and α(t) and β(t) are weight coefficients, and

[formula not reproduced]

The historical optimal solution with the minimum global reference value in the set P'(t) is selected as the global reference solution G(t).
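The selection of G(t) can be sketched as below. De-duplicating the historical optimal solutions and counting the other solutions inside the radius d(t) follow the description; the score itself is only an illustrative stand-in, because the formula for the global reference value is not reproduced in the text. In particular, the fixed weights `alpha` and `beta`, the normalizations, and the rule that a solution with above-average fitness gets an infinite score are all assumptions, and minimization of the fitness function is assumed.

```python
import numpy as np

def select_global_reference(pbest, fitness, d, alpha=0.5, beta=0.5):
    """Pick G(t) from the particles' historical best positions.

    Assumption: score = alpha * normalized fitness + beta * normalized crowding,
    an illustrative stand-in for the global reference value.
    """
    # P'(t): keep one copy of each identical historical optimal solution
    P, idx = np.unique(np.asarray(pbest, dtype=float), axis=0, return_index=True)
    h = np.asarray(fitness, dtype=float)[idx]
    # number of OTHER historical optimal solutions inside O(j, t)
    dist = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)
    crowd = (dist <= d).sum(axis=1) - 1
    span = h.max() - h.min()
    h_norm = (h - h.min()) / span if span > 0 else np.zeros_like(h)
    c_norm = crowd / max(len(P) - 1, 1)
    score = alpha * h_norm + beta * c_norm
    score[h > h.mean()] = np.inf  # assumed rule for the +infinity case of f3
    return P[np.argmin(score)]
```

With historical bests at (0, 0), (0, 0), (0.1, 0), (5, 5) and (9, 9), fitness values 1.0, 1.0, 1.2, 1.1 and 9.0, and d = 1.0, this sketch returns (5, 5): it is nearly as fit as (0, 0) but has no other historical optimal solution in its detection range.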

Preferably, the expression of the inertia weight factor ω_i(t) of particle i at the t-th iteration is:

[formula not reproduced]

where ω(start) denotes the initial inertia weight factor value, with ω(start) = 0.9; ω(end) denotes the inertia weight factor value when the particle swarm has evolved to the maximum number of iterations T_max, with ω(end) = 0.4; h(X_i(t)) denotes the fitness function value corresponding to the position X_i(t) of particle i at the t-th iteration; and h(G(t)) denotes the fitness function value corresponding to the global reference solution G(t) of the particle swarm at the t-th iteration.
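The update formulas for V_i(t+1) and X_i(t+1) can be sketched as one vectorized step. The per-particle inertia weights are passed in as an array because the expression for ω_i(t) is not reproduced in the text; values between ω(end) = 0.4 and ω(start) = 0.9 would be expected. The defaults c₁ = c₂ = 2.0 and the drawing of a separate random number for every dimension are assumptions of this sketch.

```python
import numpy as np

def pso_step(X, V, P, G, omega, c1=2.0, c2=2.0, rng=None):
    """One update of all particles.

    X, V, P: arrays of shape (N, dim) with positions, step sizes and
    historical optimal solutions; G: global reference solution, shape (dim,);
    omega: per-particle inertia weight factors, shape (N,).
    """
    rng = np.random.default_rng() if rng is None else rng
    r1 = rng.random(X.shape)  # rand() for the term pulling towards P_i(t)
    r2 = rng.random(X.shape)  # rand() for the term pulling towards G(t)
    V_next = omega[:, None] * V + c1 * r1 * (P - X) + c2 * r2 * (G - X)
    return X + V_next, V_next
```

When a particle already sits on both its historical optimal solution and G(t), the two attraction terms vanish and the step reduces to ω_i(t)V_i(t), which is a quick sanity check for the implementation.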

The beneficial effects of the invention are as follows.

A financial data prediction system based on a block chain and cloud computing is provided, in which a BP neural network predicts the future trend of financial data, so that the development and change of the financial market can be known in time.

A financial data preprocessing module is provided to remove noise data from the financial time series data. A certain number of financial data closest to the financial data under test is selected to construct the neighborhood data set corresponding to the financial data under test. The financial data in the neighborhood data set is screened and classified, which removes isolated data and yields subclasses of similar data. The subclass closest to the financial data under test is then selected to construct the neighborhood detection set of the financial data under test. This effectively prevents noise data in the neighborhood detection set from degrading the accuracy of noise detection on the financial data under test, and ensures similarity between the financial data under test and the neighborhood financial data in its neighborhood detection set.

An abnormal value is defined for the financial data under test; through the distribution detection coefficient and the neighborhood detection coefficient, it expresses the degree of abnormality of the financial data under test relative to its neighborhood financial data. The distribution detection coefficient uses the distribution coefficient to measure the distribution relationship between the financial data and its neighborhood financial data. When the distribution coefficient of the financial data under test is smaller than the given distribution detection threshold, the financial data lies close to its neighborhood financial data, so it is directly judged to be normal data. When the distribution coefficient is larger than the given distribution detection threshold, the financial data lies farther from its neighborhood financial data. Because the financial data may itself lie in a region with a large density distribution, the distribution coefficients of its neighborhood financial data are calculated to determine whether this is the case. When many of its neighborhood financial data have distribution coefficients larger than the given distribution detection threshold, the financial data under test lies in a region with a large density distribution and is judged to be normal data. When only a few of its neighborhood financial data have distribution coefficients larger than the given distribution detection threshold, its abnormality is further determined through the neighborhood detection coefficient. The neighborhood detection coefficient measures the abnormality of the financial data under test with respect to the subclass closest to it: when the minimum distance between the financial data under test and the financial data in that subclass is larger than the class detection threshold corresponding to the subclass, the financial data under test lies outside the subclass and is judged to be noise data.

In summary, the financial data preprocessing module uses the distribution detection coefficient to measure, globally, the distribution relationship between the financial data under test and its neighborhood financial data, and the distribution detection coefficient can effectively detect abnormal financial data in regions of different density. It uses the neighborhood detection coefficient to measure, locally, the relationship between the data under test and the subclass in its neighborhood detection set, thereby effectively improving the detection accuracy for noise data.

The initial weights and thresholds of the BP neural network are optimized by a particle swarm algorithm, which can effectively improve the prediction accuracy of the BP neural network. In the traditional particle swarm update, the global optimal solution directly influences the update of the positions of the next generation of particles, so the selection of the global optimal solution has an important influence on the optimization result. Instead of having the particles learn directly from the global optimal solution, the preferred embodiment introduces a global reference solution to replace it. The global reference solution is selected from the historical optimal solutions of the particles at the current iteration. A global reference value is defined for each historical optimal solution and is calculated within the neighborhood detection range of that solution. The neighborhood detection distance is determined by the iteration number of the particle swarm and the number of distinct historical optimal solutions in the swarm. As the number of iterations increases, the neighborhood detection distance decreases, which effectively balances the local search and the global search of the particle swarm algorithm. In addition, when the number of distinct historical optimal solutions in the swarm is small, the neighborhood detection distance is increased, so that the detection of historical optimal solutions covers the solution space more comprehensively and the global search capability of the swarm increases.

When the global reference value of a historical optimal solution is calculated, both the level of its fitness function value and the number of historical optimal solutions within its neighborhood detection range are considered, and the historical optimal solution with the minimum global reference value is selected as the global reference solution. That is, the selected solution has a smaller fitness function value and fewer other historical optimal solutions within its neighborhood detection range. Selecting a solution with a smaller fitness function value ensures that the particles advance towards the target solution. Selecting a solution with fewer other historical optimal solutions within its neighborhood detection range increases the diversity of the swarm's solutions and prevents the algorithm from becoming trapped in a local optimum: when the neighborhood detection range of a historical optimal solution contains many other historical optimal solutions, taking it as the global reference solution makes the swarm prone to falling into a local optimum. In addition, because the historical optimal solutions of many particles lie around the global optimal solution in the later iterations, the preferred embodiment introduces weight coefficients into the global reference value. In the early iterations, the weight coefficients make the selection pay more attention to the number of other historical optimal solutions within the neighborhood detection range, that is, to the diversity of solutions, which prevents the swarm from falling into a local optimum. In the later iterations, they make the selection pay more attention to the fitness function value level, which accelerates the swarm's advance towards the target solution and thereby improves the convergence speed of the particle swarm algorithm.

The inertia weight factor of each particle is adjusted adaptively according to the difference in fitness function value between the particle and the global reference solution. When the difference between the fitness function value of the particle and that of the global reference solution is large, the inertia weight factor is large, so the particle advances towards the global reference solution with a large step and pays more attention to global optimization. When the difference is small, the inertia weight factor is small, so the particle advances towards the global reference solution with a small step and pays more attention to local optimization. The inertia weight factor adopted by the preferred embodiment thus effectively balances the global optimization and the local optimization of the particles.

Drawings

The invention is further described with reference to the accompanying drawing. The embodiment shown in the drawing does not limit the invention in any way, and a person skilled in the art can derive further drawings from the following figure without inventive effort.

FIG. 1 is a schematic diagram of the present invention.

Detailed Description

The invention is further described with reference to the following examples.

Referring to FIG. 1, the financial data prediction system based on a block chain and cloud computing in this embodiment includes a first data acquisition module, a second data acquisition module, a financial data preprocessing module, a block chain storage module, a financial data prediction module, and a visualization module. The first data acquisition module collects historical financial time series data and inputs it to the financial data preprocessing module for processing; the processed historical financial time series data is transmitted to the block chain storage module for storage. The second data acquisition module collects real-time financial time series data and inputs it to the financial data preprocessing module for processing; the processed real-time financial time series data is input to the financial data prediction module. The financial data prediction module predicts the future trend of the current financial data according to the input real-time financial time series data, and the visualization module displays the prediction result of the financial data prediction module.

The preferred embodiment provides a financial data prediction system based on a block chain and cloud computing, and the future trend of financial data is predicted by adopting a BP neural network, so that the development and change of a financial market can be known in time.
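The text does not specify how the block chain storage module organizes the processed historical data. As a minimal sketch under that caveat, the following shows one common way to obtain tamper-evident storage: each block stores a batch of records together with the hash of the previous block. The block layout and the function names are hypothetical, and a real deployment would add consensus, signatures and distribution across nodes.

```python
import hashlib
import json

def _digest(prev_hash, records):
    """SHA-256 over the previous hash and the records, serialized deterministically."""
    payload = json.dumps({"prev": prev_hash, "records": records}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def append_block(chain, records):
    """Append a block whose hash covers its records and the previous block's hash."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    block = {"prev": prev_hash, "records": records, "hash": _digest(prev_hash, records)}
    chain.append(block)
    return block

def verify_chain(chain):
    """Return True when every block's hash and back-link are consistent."""
    prev_hash = "0" * 64
    for block in chain:
        if block["prev"] != prev_hash or block["hash"] != _digest(block["prev"], block["records"]):
            return False
        prev_hash = block["hash"]
    return True
```

Because every hash depends on the previous one, changing a stored value in any earlier block makes `verify_chain` fail, which is the property that makes such storage suitable for training data that must not be altered silently.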

Preferably, the financial data preprocessing module is configured to normalize the input financial time series data and remove noise data from the normalized financial time series data. Let R denote the normalized financial time series data set, R = {r(s), s = 1, 2, ..., m}, where r(s) denotes the s-th financial data in the set R and m denotes the total number of financial data in R. Let C(s) denote the neighborhood data set corresponding to the financial data r(s): given a positive integer n with 0 < n < m, the n financial data in R closest to r(s) are selected and added to the set C(s), so that C(s) = {r(e, s), e = 1, 2, ..., n}, where r(e, s) denotes the e-th financial data in C(s). The financial data in the set C(s) is then screened and classified. Let C_u(s) denote the u-th subclass obtained by screening and classifying the financial data in C(s); financial data in C(s) is selected and added to subclass C_u(s) in the following way, specifically:

then the financial data r(v, s) is added to subclass C_u(s), and the financial data in C(s) that has not been added to any subclass continues to be screened and classified. When no new financial data is added to subclass C_u(s), the selection of financial data from C(s) into subclass C_u(s) stops. Here η(r(v, s), r_u(d, s)) denotes a value function: when |r_u(d, s) - r(v, s)| ≤ D, η(r(v, s), r_u(d, s)) = 0; when |r_u(d, s) - r(v, s)| > D, η(r(v, s), r_u(d, s)) = 1;

where r_θ(ψ, s) denotes the ψ-th financial data in subclass C_θ(s). Let C"(s) denote the neighborhood detection set of the financial data r(s), and let r'(a, s) denote the a-th financial data in the set C'(s). When the financial data r'(a, s) satisfies |r'(a, s) - r(s)| ≤ H(θ), r'(a, s) is added to the set C"(s), so that C"(s) = {r"(a, s), a = 1, 2, ..., m"(s)}, where r"(a, s) denotes the a-th financial data in C"(s) and m"(s) denotes the total number of financial data in C"(s). An abnormal value y(s) corresponding to the financial data r(s) is defined, and the expression of y(s) is:

y(s)＝y₁(s)*y₂(s)

where R(max) and R(min) denote the maximum and minimum financial data values in the financial time series data set R, and f₁(ρ(s), H(ρ)) denotes the first judgment function corresponding to the financial data r(s): when ρ(s) > H(ρ), f₁(ρ(s), H(ρ)) = 1, and when ρ(s) ≤ H(ρ), f₁(ρ(s), H(ρ)) = 0. Let C"(a, s) denote the neighborhood detection set of the financial data r"(a, s), and let ρ(a, s) denote the distribution coefficient of r"(a, s) in the set C"(a, s),

The second judgment function takes one of two values according to the following conditions:

[formula not reproduced]

where r_θ(K, s) denotes the K-th financial data in subclass C_θ(s);
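The first judgment function f₁ defined above is a simple threshold indicator, and it can be sketched directly. The helper that counts how many neighbors exceed the threshold is an assumption about how the indicator values are aggregated, since the formula that combines them is not reproduced in the text; the function names are hypothetical.

```python
def f1(rho, H):
    """First judgment function: 1 when rho > H, otherwise 0."""
    return 1 if rho > H else 0

def count_exceeding(neighbor_rhos, H):
    """Number of neighbors whose distribution coefficient exceeds H (assumed aggregation)."""
    return sum(f1(rho, H) for rho in neighbor_rhos)
```

A point whose own coefficient gives `f1(rho, H) == 0` would be treated as normal immediately; for a point with `f1(rho, H) == 1`, a high count among its neighbors suggests that the whole region is spread out rather than that the point is noise.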

The preferred embodiment removes noise data from the financial time series data. A given number of financial data closest to the financial data to be detected are selected to construct the neighborhood data set corresponding to the financial data to be detected. The financial data in the neighborhood data set are then screened and classified: isolated data are removed and subclasses of similar data are obtained. The subclasses closest to the financial data to be detected are selected to construct the neighborhood detection set of the financial data to be detected. This effectively prevents noise data in the neighborhood detection set from degrading the accuracy of noise detection on the financial data to be detected, and ensures the similarity between the financial data to be detected and the neighborhood financial data in the neighborhood detection set.

An abnormal value is defined for the financial data to be detected, and the abnormality of the financial data to be detected relative to its neighborhood financial data is determined through the distribution detection coefficient and the neighborhood detection coefficient. The distribution detection coefficient uses the distribution coefficient to measure the distribution relation between the financial data and its neighborhood financial data. When the distribution coefficient of the financial data to be detected is smaller than the given distribution detection threshold, the financial data is close in distribution distance to its neighborhood financial data and is therefore directly judged to be normal data. When the distribution coefficient of the financial data to be detected is larger than the given distribution detection threshold, the financial data is farther in distribution distance from its neighborhood financial data. Because the financial data may simply lie in a region with a larger density distribution, the distribution coefficients of its neighborhood financial data are calculated to determine whether this is the case. When many of the neighborhood financial data have distribution coefficients larger than the given distribution detection threshold, the financial data to be detected lies in a region with a large density distribution and is judged to be normal data. Otherwise, the abnormal condition of the financial data to be detected is further determined through the neighborhood detection coefficient. The neighborhood detection coefficient measures the abnormality of the financial data to be detected relative to the subclass closest to it: when the minimum distance between the financial data to be detected and the financial data in the closest subclass is larger than the class detection threshold corresponding to that subclass, the financial data to be detected lies outside the closest subclass and is judged to be noise data.

In summary, the preferred embodiment uses the distribution detection coefficient to determine the abnormal degree of the financial data to be detected by measuring the overall distribution relation between the financial data to be detected and its neighborhood financial data, and the distribution detection coefficient can effectively detect abnormal financial data in regions of different density. The neighborhood detection coefficient determines the abnormal degree of the financial data to be detected by measuring the local relation between the financial data to be detected and the subclasses in the neighborhood detection set, so that the detection precision of the noise data is effectively improved.
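The first step of this procedure, together with the first judgment function f₁ described in the claims, can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function names `neighborhood_set` and `f1` are hypothetical, closeness is assumed to be the absolute difference between values, the data point itself is assumed to be excluded from its own neighborhood, and indices are 0-based whereas the text counts from 1.

```python
def neighborhood_set(R, s, n):
    """Return the n values in R closest to R[s], excluding R[s] itself.

    Assumption: closeness is the absolute difference between values.
    """
    if not 0 < n < len(R):
        raise ValueError("n must satisfy 0 < n < len(R)")
    others = [R[i] for i in range(len(R)) if i != s]
    # sorted() is stable, so equally distant values keep their time order.
    return sorted(others, key=lambda x: abs(x - R[s]))[:n]


def f1(rho, h_rho):
    """First judgment function: 1 if rho exceeds the threshold H(rho), else 0."""
    return 1 if rho > h_rho else 0


# Normalized series with one suspicious spike at index 3.
R = [0.10, 0.12, 0.11, 0.95, 0.13, 0.09]
C = neighborhood_set(R, s=3, n=3)
```

For the spike at index 3, the three nearest values are all far from 0.95, which is the situation the distribution coefficient is meant to expose.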

Preferably, the financial data prediction module predicts the future trend of the financial time series data by using a BP neural network, the financial data prediction module calls historical financial time series data stored in the block chain storage module to train the BP neural network for future trend prediction of the financial data, the processed real-time financial time series data is used as an input value of the trained BP neural network, and an output value of the trained BP neural network is the predicted future trend of the financial data.
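The train-then-predict flow described above can be illustrated with a very small back-propagation network. The sketch below is an assumption-laden example, not the patented network: the class `TinyBP`, the helper `make_windows`, the single hidden layer, the sliding-window input of width 4, the learning rate and the synthetic sine series standing in for stored historical data are all hypothetical choices. Initial weights are random here, whereas the text optimizes them with a particle swarm algorithm.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyBP:
    """One-hidden-layer network trained by backpropagation (illustrative only)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # Initial weights and thresholds (biases); random in this sketch.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = sum(w * hi for w, hi in zip(self.w2, h)) + self.b2
        return h, y

    def predict(self, x):
        return self.forward(x)[1]

    def train_step(self, x, target, lr=0.1):
        h, y = self.forward(x)
        err = y - target  # gradient of 0.5 * (y - target)^2 with respect to y
        for j, hj in enumerate(h):
            # Error propagated back to hidden unit j (uses w2[j] before its update).
            grad_h = err * self.w2[j] * hj * (1.0 - hj)
            self.w2[j] -= lr * err * hj
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * grad_h * xi
            self.b1[j] -= lr * grad_h
        self.b2 -= lr * err
        return 0.5 * err * err


def make_windows(series, width):
    """Pair each run of `width` values with the value that follows it."""
    return [(series[i:i + width], series[i + width])
            for i in range(len(series) - width)]


# Synthetic stand-in for processed historical data from the storage module.
history = [0.5 + 0.4 * math.sin(0.3 * t) for t in range(60)]
samples = make_windows(history, width=4)

net = TinyBP(n_in=4, n_hidden=6)
first_loss = sum(net.train_step(x, y) for x, y in samples)
for _ in range(200):
    last_loss = sum(net.train_step(x, y) for x, y in samples)

# The most recent processed values act as the real-time input.
forecast = net.predict(history[-4:])
```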

Preferably, in the training process of the BP neural network, the particle swarm algorithm is adopted to optimize the initial weight and the threshold of the BP neural network, and a fitness function of the particle swarm algorithm is defined as:

wherein M is the number of training samples and Y_p is the output value of the p-th sample;

the smaller the fitness function value of the particles in the particle swarm is, the better the optimization result of the particles is.

V_i(t+1) = ω_i(t)·V_i(t) + c₁·rand()·(P_i(t) - X_i(t)) + c₂·rand()·(G(t) - X_i(t))

X_i(t+1) = X_i(t) + V_i(t+1)
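The two update equations can be written as a short routine. This is a sketch under stated assumptions: the function name `update_particle` is hypothetical, rand() is assumed to be drawn independently for every dimension, the default values c₁ = c₂ = 2.0 are a common convention rather than values given in the text, and G(t) is taken to be the global reference solution that this embodiment uses in place of the global optimal solution.

```python
import random


def update_particle(x, v, p_best, g_ref, omega, c1=2.0, c2=2.0, rng=random):
    """Apply one velocity and position update to a single particle.

    x, v, p_best and g_ref are equal-length lists of floats;
    omega is the inertial weight factor of this particle.
    """
    new_v = [
        omega * vi
        + c1 * rng.random() * (pi - xi)
        + c2 * rng.random() * (gi - xi)
        for xi, vi, pi, gi in zip(x, v, p_best, g_ref)
    ]
    new_x = [xi + nvi for xi, nvi in zip(x, new_v)]
    return new_x, new_v


# A particle already sitting on both reference points only keeps its inertia term.
x0 = [1.0, 2.0]
new_x, new_v = update_particle(x0, [0.5, -0.5], p_best=x0, g_ref=x0, omega=0.5)
```

When the particle coincides with both P_i(t) and G(t), the two attraction terms vanish and the velocity is simply scaled by ω_i(t), which makes the role of the inertial weight factor easy to see.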

Let P(t) denote the set of historical optimal solutions of the particles in the particle swarm at the t-th iteration, P(t) = {P_i(t), i = 1, 2, ..., N}, where N denotes the number of particles in the particle swarm. The historical optimal solutions in the set P(t) are screened: when identical historical optimal solutions exist in the set P(t), only one of them is retained and the others are deleted. The screened set is denoted P'(t), P'(t) = {P(j, t), j = 1, 2, ..., N'(t)}, where P(j, t) denotes the j-th historical optimal solution in the set P'(t) and N'(t) denotes the number of historical optimal solutions in the set P'(t). The neighborhood detection distance corresponding to the historical optimal solutions in the set P'(t) is defined as d(t), and d(t) is expressed as:

The historical optimal solutions in the set P'(t) are detected, where O(j, t) denotes the neighborhood detection range corresponding to the historical optimal solution P(j, t), namely the circular region centered on P(j, t) with radius d(t); the global reference value corresponding to the historical optimal solution P(j, t) is defined as

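[Equations not recoverable from the source text: the expression of the global reference value of P(j, t), and the piecewise definition of the judgment function f₃(h(P(j, t))), which equals 1 under the first of its conditions.]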

The preferred embodiment determines a global reference solution in the particle swarm updating process. In the traditional updating mode of the particle swarm, the global optimal solution directly influences the update of the next-generation particle positions, so the selection of the global optimal solution has an important influence on the optimization result of the particle swarm. Instead of having the particles learn directly from the global optimal solution during updating, the preferred embodiment introduces a global reference solution to replace the traditional global optimal solution. The global reference solution is selected from the historical optimal solutions of the particles at the current iteration: a global reference value is defined for each historical optimal solution and is calculated within the neighborhood detection range corresponding to that historical optimal solution.

The neighborhood detection distance is determined by the iteration number of the particle swarm and the number of distinct historical optimal solutions in the particle swarm. As the number of iterations increases, the neighborhood detection distance decreases, which effectively balances the local search and the global search of the particle swarm algorithm. In addition, when the number of distinct historical optimal solutions in the particle swarm is small, the neighborhood detection distance increases, so that detection of the historical optimal solutions covers the solution space more comprehensively and the global search capability of the particle swarm is increased.

When the global reference value of a historical optimal solution is calculated, both the fitness function value of the historical optimal solution and the number of historical optimal solutions within its neighborhood detection range are considered, and the historical optimal solution with the minimum global reference value in the set is selected as the global reference solution. That is, the selected historical optimal solution has a smaller fitness function value and fewer other historical optimal solutions within its neighborhood detection range. Selecting a historical optimal solution with a smaller fitness function value ensures that the particles advance toward the target solution. Selecting a historical optimal solution with fewer other historical optimal solutions in its neighborhood detection range increases the diversity of the particle swarm solutions and avoids the defect of the particle swarm algorithm becoming trapped in a local optimum; when the neighborhood detection range of a historical optimal solution contains many other historical optimal solutions, taking that solution as the global reference solution makes the particle swarm prone to falling into a local optimum.

In addition, considering that in the later iteration stage of the particle swarm algorithm the historical optimal solutions of many particles lie around the global optimal solution, the preferred embodiment introduces a weight coefficient into the global reference value of the historical optimal solution. In the early iteration stage, the weight coefficient makes the selection of the global reference solution pay more attention to the number of other historical optimal solutions contained in the neighborhood detection range, that is, to the diversity of the particle swarm solutions, which prevents the particle swarm from falling into a local optimum. In the later iteration stage, the weight coefficient makes the selection pay more attention to the fitness function value of the historical optimal solution, which accelerates the movement of the particle swarm toward the target solution and improves the convergence speed of the particle swarm algorithm.
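Two ingredients of this selection, screening out identical historical optimal solutions and counting how many other solutions fall inside a neighborhood detection range, can be sketched as follows. The sketch makes several assumptions: the function names are hypothetical, solutions are compared for exact equality, distance is Euclidean, and the radius d is passed in as a parameter because its expression is not available here. How the count is combined with the fitness function value and the weight coefficient into the global reference value is deliberately left out for the same reason.

```python
import math


def deduplicate(p_best_list):
    """Keep one copy of each distinct historical optimal solution (order preserved)."""
    seen = set()
    unique = []
    for p in p_best_list:
        key = tuple(p)
        if key not in seen:
            seen.add(key)
            unique.append(list(p))
    return unique


def neighbors_within(unique, j, d):
    """Count the other solutions lying within distance d of unique[j]."""
    return sum(
        1
        for k, q in enumerate(unique)
        if k != j and math.dist(unique[j], q) <= d
    )


P = [[0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
P_screened = deduplicate(P)
counts = [neighbors_within(P_screened, j, d=1.5) for j in range(len(P_screened))]
```

In this toy swarm the isolated solution at (5, 5) has no neighbors, so a rule that favors sparsely surrounded solutions would lean toward it, provided its fitness function value is also competitive.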

The preferred embodiment determines the inertial weight factor of each particle in the particle swarm. The inertial weight factor is adaptively adjusted according to the difference between the fitness function value of the particle and the fitness function value of the global reference solution. When this difference is large, the inertial weight factor of the particle is large, so the particle advances toward the global reference solution with a larger step and focuses more on global optimization. When this difference is small, the inertial weight factor of the particle is small, so the particle advances toward the global reference solution with a smaller step and focuses more on local optimization. The inertial weight factor adopted in the preferred embodiment can therefore effectively balance the global optimization and the local optimization of the particles.
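The qualitative behavior described above (a larger fitness gap gives a larger inertial weight factor) can be illustrated with a simple monotone mapping. The formula used by the embodiment is not available in this text, so the linear scaling below, the function name `adaptive_inertia` and the bounds 0.4 and 0.9 are purely hypothetical stand-ins chosen only to show the idea.

```python
def adaptive_inertia(fitness, f_ref, w_min=0.4, w_max=0.9):
    """Map each particle's fitness gap to an inertial weight in [w_min, w_max].

    Hypothetical rule: the weight grows linearly with |fitness - f_ref|,
    scaled by the largest gap in the swarm.
    """
    gaps = [abs(f - f_ref) for f in fitness]
    largest = max(gaps)
    if largest == 0.0:
        # Every particle matches the reference: use the smallest step.
        return [w_min for _ in gaps]
    return [w_min + (w_max - w_min) * g / largest for g in gaps]


# Three particles; the first already matches the global reference solution.
weights = adaptive_inertia([1.0, 3.0, 5.0], f_ref=1.0)
```

Any other increasing function of the gap would show the same trade-off between large exploratory steps and small refining steps.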

Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention and not to limit its protection scope. Although the present invention is described in detail with reference to the preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions can be made to the technical solutions of the present invention without departing from their spirit and scope.

Claims

1. A financial data prediction system based on a block chain and cloud computing is characterized by comprising a first data acquisition module, a second data acquisition module, a financial data preprocessing module, a block chain storage module, a financial data prediction module and a visualization module, wherein the first data acquisition module is used for acquiring historical financial time sequence data and inputting the acquired historical financial time sequence data into the financial data preprocessing module for processing, the processed historical financial time sequence data is transmitted to the block chain storage module for storage, the second data acquisition module is used for acquiring real-time financial time sequence data and inputting the acquired real-time financial time sequence data into the financial data preprocessing module for processing, the processed real-time financial time sequence data is input into the financial data prediction module, and the financial data prediction module predicts the future trend of the current financial data according to the input real-time financial time sequence data, the visualization module is used for displaying the prediction result of the financial data prediction module.

2. The system according to claim 1, wherein the financial data preprocessing module is configured to normalize the input financial time series data and remove noise data from the normalized financial time series data; let R denote the normalized financial time series data set, R = {r(s), s = 1, 2, ..., m}, where r(s) denotes the s-th financial data in the financial time series data set R, and m denotes the total number of financial data in the financial time series data set R; let C(s) denote the neighborhood data set corresponding to the financial data r(s); given a positive integer n with 0 < n < m, the n financial data closest to the financial data r(s) in the financial time series data set R are selected and added to the set C(s), C(s) = {r(e, s), e = 1, 2, ..., n}, where r(e, s) denotes the e-th financial data in the set C(s); the financial data in the set C(s) are screened and classified, C_u(s) denotes the u-th subclass obtained by screening and classifying the financial data in the set C(s), and the financial data in the set C(s) are selected and added to subclass C_u(s) in the following way, specifically:

then the financial data r(v, s) is added to subclass C_u(s), and screening and classification continue over the financial data in the set C(s) that have not been added to any subclass; when no new financial data is added to subclass C_u(s), the selection of financial data from the set C(s) into subclass C_u(s) stops; wherein η(r(v, s), r_u(d, s)) denotes a value function: when |r_u(d, s) - r(v, s)| ≤ D, η(r(v, s), r_u(d, s)) = 0, and when |r_u(d, s) - r(v, s)| > D, η(r(v, s), r_u(d, s)) = 1;

wherein r_θ(ψ, s) denotes the ψ-th financial data in subclass C_θ(s); let C″(s) denote the constructed neighborhood detection set of the financial data r(s), and let r′(a, s) denote the a-th financial data in the set C′(s); when the financial data r′(a, s) satisfies |r′(a, s) - r(s)| ≤ H(θ), the financial data r′(a, s) is added to the set C″(s), C″(s) = {r″(a, s), a = 1, 2, ..., m″(s)}, where r″(a, s) denotes the a-th financial data in the set C″(s) and m″(s) denotes the total number of financial data in the set C″(s); the abnormal value corresponding to the financial data r(s) is defined as y(s), and the expression of y(s) is:

y(s) = y₁(s) · y₂(s)

where r′(b, a, s) denotes the b-th financial data in the set C′(a, s), m′(a, s) denotes the total number of financial data in the set C′(a, s), and f₁(ρ(a, s), H(ρ)) denotes the first judgment function corresponding to the financial data r″(a, s): when ρ(a, s) > H(ρ), f₁(ρ(a, s), H(ρ)) = 1, and when ρ(a, s) ≤ H(ρ), f₁(ρ(a, s), H(ρ)) = 0, where H(ρ) is a given distribution detection threshold,

[Expressions not recoverable from the source text: the symbol of the second judgment function and the two conditions, with their corresponding values, that define it.]

r_θ(L, s) denotes the L-th financial data in subclass C_θ(s), D(C_θ(s)) denotes the class detection threshold corresponding to subclass C_θ(s), and

wherein r_θ(k, s) denotes the k-th financial data in subclass C_θ(s);

3. The financial data prediction system based on the block chain and cloud computing as claimed in claim 2, wherein the financial data prediction module predicts the future trend of the financial time series data by using a BP neural network, the financial data prediction module calls historical financial time series data stored in the block chain storage module to train the BP neural network for future trend prediction of the financial data, the processed real-time financial time series data is used as an input value of the trained BP neural network, and an output value of the trained BP neural network is the predicted future trend of the financial data.

4. The system of claim 3, wherein in the training process of the BP neural network, the initial weight and the threshold of the BP neural network are optimized by a particle swarm algorithm.