CN112561179A - Stock tendency prediction method and device, computer equipment and storage medium - Google Patents

Stock tendency prediction method and device, computer equipment and storage medium

Info

Publication number: CN112561179A
Authority: CN (China)
Prior art keywords: stock, learning machine, trained, extreme learning, historical
Legal status: Pending (an assumption, not a legal conclusion)
Application number: CN202011518772.6A
Other languages: Chinese (zh)
Inventors: 陈素冬, 王熙照, 陈思宏, 沈浩靖
Current Assignee: Shenzhen University (the listed assignees may be inaccurate)
Original Assignee: Shenzhen University
Application filed by: Shenzhen University
Priority: CN202011518772.6A
Publication: CN112561179A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange

Abstract

The embodiment of the invention discloses a stock tendency prediction method, a stock tendency prediction device, computer equipment and a storage medium. The method comprises the following steps: acquiring at least two historical stock data sets, and dividing each historical stock data set into a training set and a testing set respectively, wherein the at least two historical stock data sets comprise a target historical stock data set of stocks to be predicted; establishing a seemingly unrelated regression model according to each training set and each extreme learning machine model to be trained corresponding to each historical stock data set; determining hidden layer output weights of all extreme learning machine models to be trained according to the seemingly unrelated regression models to obtain all trained extreme learning machine models; and inputting a test set obtained by dividing the target historical stock data set into a corresponding trained extreme learning machine model so as to predict the tendency of the stock to be predicted. By considering the correlation among different data sets, the classification performance of the prediction model is improved, and the accuracy of stock prediction is improved.

Description

Stock tendency prediction method and device, computer equipment and storage medium
Technical Field
The embodiment of the invention relates to the technical field of financial analysis, in particular to a stock tendency prediction method, a stock tendency prediction device, computer equipment and a storage medium.
Background
Since the financial securities market moved to electronic trading, a large amount of trading data has accumulated, including stock market data, corporate financial information, and trade records. In recent years in particular, the financial stock market in China has developed rapidly, the trading scale has expanded quickly, and trading data has grown geometrically. How to make use of this mass of data has become extremely important, and one use is to predict future stock prices from it.
Because the stock market is a complex dynamic system with a certain degree of uncertainty, prediction requires processing a large amount of information and computation. Existing stock prediction methods essentially train a model on a single data set and predict with it, so they generally do not achieve an ideal prediction effect. The relationships among the various factors in the stock market are intricate, their relative importance changes, and quantitative relationships are hard to extract. Quantitative analysis of the stock market with conventional prediction methods is therefore very difficult, and the prediction results are often not accurate enough.
Disclosure of Invention
The embodiment of the invention provides a stock tendency prediction method, a stock tendency prediction device, computer equipment and a storage medium, which are used for improving the accuracy of stock tendency prediction so as to help a user make a decision better.
In a first aspect, an embodiment of the present invention provides a stock tendency prediction method, where the method includes:
acquiring at least two historical stock data sets, and dividing each historical stock data set into a training set and a testing set respectively, wherein the at least two historical stock data sets comprise a target historical stock data set of stocks to be predicted;
establishing a seemingly unrelated regression model according to the training sets and the extreme learning machine models to be trained corresponding to the historical stock data sets;
determining hidden layer output weights of the extreme learning machine models to be trained according to the seemingly unrelated regression model to obtain the trained extreme learning machine models;
and inputting a test set obtained by dividing the target historical stock data set into a corresponding trained extreme learning machine model so as to predict the tendency of the stock to be predicted.
In a second aspect, an embodiment of the present invention further provides a stock tendency prediction apparatus, where the apparatus includes:
the stock data acquisition module is used for acquiring at least two historical stock data sets and dividing each historical stock data set into a training set and a testing set respectively, wherein the at least two historical stock data sets comprise a target historical stock data set of stocks to be predicted;
the regression model establishing module is used for establishing a seemingly unrelated regression model according to each training set and each extreme learning machine model to be trained corresponding to each historical stock data set;
the model training module is used for determining hidden layer output weights of all extreme learning machine models to be trained according to the seemingly unrelated regression model so as to obtain all trained extreme learning machine models;
and the trend prediction module is used for inputting a test set obtained by dividing the target historical stock data set into a corresponding trained extreme learning machine model so as to predict the trend of the stock to be predicted.
In a third aspect, an embodiment of the present invention further provides a computer device, where the computer device includes:
one or more processors;
a memory for storing one or more programs;
when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the stock tendency prediction method provided by any of the embodiments of the invention.
In a fourth aspect, the embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the computer program implements the stock tendency prediction method provided by any embodiment of the present invention.
The embodiment of the invention provides a stock tendency prediction method. First, at least two historical stock data sets are obtained and each is divided into a training set and a testing set, one of the historical stock data sets being the target historical stock data set of the stock to be predicted. An extreme learning machine model is then set up for each stock, a seemingly unrelated regression model is established from the training sets and the extreme learning machine models, and the hidden layer output weights of each extreme learning machine model are determined from the seemingly unrelated regression model, yielding the trained extreme learning machine models. Finally, the target historical stock data set is input into the corresponding trained extreme learning machine model, which outputs a prediction result for the tendency of the stock to be predicted. Because the hidden layer output weights of the extreme learning machine model are estimated with the seemingly unrelated regression model, which takes the correlation among different data sets into account, the estimates are more accurate. This improves the classification performance of the extreme learning machine model and the accuracy of stock tendency prediction, and effectively helps the user make better decisions.
Drawings
FIG. 1 is a flowchart of a stock tendency prediction method according to a first embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a stock tendency prediction apparatus according to a second embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a computer device according to a third embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the steps as a sequential process, many of the steps can be performed in parallel, concurrently or simultaneously. In addition, the order of the steps may be rearranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figure. The processes may correspond to methods, functions, procedures, subroutines, and the like.
Embodiment One
Fig. 1 is a flowchart of a stock trend prediction method according to an embodiment of the present invention. The embodiment is applicable to the case of predicting the future tendency according to the historical data of the stock, and the method can be executed by the stock tendency prediction device provided by the embodiment of the invention, the device can be realized by hardware and/or software, and can be generally integrated in computer equipment. As shown in fig. 1, the method specifically comprises the following steps:
S11: At least two historical stock data sets are obtained and each historical stock data set is divided into a training set and a testing set, wherein the at least two historical stock data sets comprise a target historical stock data set of the stock to be predicted.
Each historical stock data set may be the set of historical data related to the stock of one company. To exploit the correlation between different data sets, the historical stock data sets of at least two companies need to be obtained, one of which is the target historical stock data set of the stock to be predicted. Once obtained, each historical stock data set can be divided into a training set and a testing set. In this way, for each prediction the prediction model is trained on the latest training set, so that the trained model better fits the current stock trend, and the future trend is predicted from the latest testing set, which makes the prediction result more accurate. Optionally, in chronological order, the first seventy percent of each historical stock data set can be used as the training set and the remaining thirty percent as the testing set.
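The chronological 70/30 split described above can be sketched as follows. This is a minimal illustration, assuming each data set is stored oldest-first as an array of rows; the function name `chronological_split` and the demo data are hypothetical and do not come from the patent.

```python
import numpy as np

def chronological_split(data, train_fraction=0.7):
    """Split time-ordered rows into (train, test) without shuffling."""
    data = np.asarray(data)
    # round() guards against floating-point error in len * fraction
    n_train = int(round(len(data) * train_fraction))
    return data[:n_train], data[n_train:]

# Hypothetical example: 10 trading days, oldest first
days = np.arange(10)
train, test = chronological_split(days)
print(train, test)
```

No shuffling is applied, so every test row is later in time than every training row.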
Optionally, the historical stock data set includes dependent variable data and corresponding independent variable data, and the dependent variable data includes a closing price. Correspondingly, after the at least two historical stock data sets are obtained, the method further comprises: dividing the closing prices in each historical stock data set into a plurality of price intervals and setting a discrete label for each price interval; digitizing the discrete label of each price interval; updating the corresponding dependent variable data with the digitized value of each closing price in each historical stock data set; and normalizing the independent variable data in each historical stock data set.
Specifically, stock tendency prediction can be realized by predicting the closing price of the stock. The closing price is taken as the dependent variable data, and the corresponding independent variable data can be the information in the historical stock data set other than the closing price, for example the trading volume, opening price, highest price, and lowest price. The prediction model can then be trained on the historical dependent variable data and the corresponding independent variable data.
Since predicting the exact closing price of the stock to be predicted is unreliable, the closing price range can be divided into a plurality of price intervals, which turns the task into a discrete price-interval prediction problem. For example, to divide the closing price of the $i$-th company into $L_i$ price intervals, the divisible range of the closing price can be defined as $[B_{i\_min} - B_{i\_min} \times 10\%,\ B_{i\_max} + B_{i\_max} \times 10\%]$; that is, the lowest and highest closing prices are each extended outward by a margin to cover closing prices near the lowest or highest price. Here $B_{i\_min}$ is the lowest closing price in the historical stock data set of the $i$-th company and $B_{i\_max}$ is the highest closing price in that data set. Optionally, the range is divided into equal intervals. After the closing price range is divided into price intervals, a discrete label can be set for each price interval to identify its trend information. For example, from low price to high price the discrete labels can include big drop, small fluctuation, small rise, big rise, and so on, and can be changed according to actual requirements. Once the discrete labels are set, each closing price in the historical stock data set can be converted into a discrete label according to the correspondence between price intervals and discrete labels.
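The interval division can be illustrated with a short sketch. It assumes equal-width intervals over the range extended by 10% on both sides and positive prices; the function name `label_closing_prices`, the default of five intervals, and the sample prices are illustrative assumptions.

```python
import numpy as np

def label_closing_prices(close, n_intervals=5, margin=0.10):
    """Map closing prices to integer interval labels 1..n_intervals."""
    close = np.asarray(close, dtype=float)
    low = close.min() * (1.0 - margin)    # B_min - B_min * 10%
    high = close.max() * (1.0 + margin)   # B_max + B_max * 10%
    edges = np.linspace(low, high, n_intervals + 1)  # equal-width intervals
    # Interior edges give bin indices 0..n_intervals-1; shift to 1..n_intervals
    labels = np.digitize(close, edges[1:-1]) + 1
    return labels, edges

# Hypothetical closing prices
labels, edges = label_closing_prices([10.0, 12.0, 15.0, 18.0, 20.0])
print(labels)
```

Returning the interval edges alongside the labels lets later prices be mapped with the same boundaries.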
To facilitate calculation, the discrete labels also need to be digitized. Specifically, the numerical values can be assigned according to the number of price intervals: if the $i$-th company has $S_i$ price intervals, that is, $S_i$ discrete labels, the discrete labels can be converted into the numerical values $\{1, 2, 3, \dots, S_i\}$. Each closing price in the historical stock data set is thus converted into the numerical value of the price interval in which it lies, and the corresponding dependent variable data is updated with the converted values. Meanwhile, the different kinds of independent variable data differ greatly in magnitude and fluctuate widely, so after the historical stock data set is obtained, its independent variable data can be normalized. The stock tendency prediction method of this embodiment then proceeds with the updated dependent variable data and the normalized independent variable data.
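The text does not say which normalization is used. The sketch below assumes column-wise min-max scaling to [0, 1], with the statistics fitted on the training set and reused for the testing set; both helper names are hypothetical.

```python
import numpy as np

def fit_min_max(X_train):
    """Compute per-column minimum and range on the training data."""
    X_train = np.asarray(X_train, dtype=float)
    col_min = X_train.min(axis=0)
    col_range = X_train.max(axis=0) - col_min
    col_range[col_range == 0] = 1.0  # avoid division by zero for constant columns
    return col_min, col_range

def apply_min_max(X, col_min, col_range):
    """Scale features with previously fitted statistics."""
    return (np.asarray(X, dtype=float) - col_min) / col_range

# Hypothetical features: [opening price, trading volume]
X_train = [[100.0, 1e6], [200.0, 3e6], [150.0, 2e6]]
col_min, col_range = fit_min_max(X_train)
X_scaled = apply_min_max(X_train, col_min, col_range)
print(X_scaled)
```

Scaling puts price and volume columns on a comparable footing before they enter the hidden layer.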
S12: A seemingly unrelated regression model is established according to the training sets and the extreme learning machine models to be trained corresponding to the historical stock data sets.
An Extreme Learning Machine (ELM) model is a machine learning method built on a feedforward neural network and can be used for both regression problems and classification problems. As a single-hidden-layer feedforward neural network with random weights, it obtains its hidden layer output matrix once the number of hidden layer nodes is set and the input weights and biases are randomly initialized. The original extreme learning machine model then solves the weight coefficients from the hidden layer to the output layer by the least squares method to obtain the final prediction result, and no iterative computation is required. Compared with conventional gradient-descent-based algorithms such as the BP neural network, the extreme learning machine model learns faster and generalizes better, so it can serve as the prediction model. Specifically, let $(w, b)$ denote the random weights and random biases of the hidden layer and $g(\cdot)$ the activation function. Then, for $N$ samples $x_1, \dots, x_N$ and $\tilde{N}$ hidden neuron nodes, the hidden layer output matrix $H$ is:
$$H = \begin{bmatrix} g(w_1 \cdot x_1 + b_1) & \cdots & g(w_{\tilde{N}} \cdot x_1 + b_{\tilde{N}}) \\ \vdots & \ddots & \vdots \\ g(w_1 \cdot x_N + b_1) & \cdots & g(w_{\tilde{N}} \cdot x_N + b_{\tilde{N}}) \end{bmatrix}_{N \times \tilde{N}}$$
Therefore, the mathematical model of the extreme learning machine model from the hidden layer to the output layer can be expressed as a regression equation:
$$H \beta = Y$$
where $\beta$ denotes the hidden layer output weights and $Y$ denotes the output values of the extreme learning machine model. Once the random weights and random biases of the hidden layer are fixed, the hidden layer output matrix is uniquely determined. The extreme learning machine model itself can then solve for $\beta$ in the above equation by the least squares method, i.e.:
$$\beta = H^{+} Y$$
where $H^{+}$ denotes the Moore-Penrose generalized inverse of the hidden layer output matrix $H$. In this embodiment, one extreme learning machine model is trained for each of at least two stocks, so there are several regression equations from the hidden layer to the output layer. Correlation exists between the different data sets, and a seemingly unrelated regression model can exploit this correlation when solving for the final hidden layer output weights, which improves the accuracy of the result.
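For reference, the plain extreme learning machine described above (random hidden layer, output weights from the Moore-Penrose inverse) can be sketched in a few lines. The sigmoid activation, the uniform initialization range, the number of hidden nodes, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def hidden_output(X, W, b):
    """Hidden layer output matrix H (N x n_hidden) with a sigmoid activation."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def train_elm(X, Y, n_hidden=4, seed=0):
    """Randomly initialize (W, b), then solve beta = pinv(H) @ Y."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))  # random input weights
    b = rng.uniform(-1.0, 1.0, size=n_hidden)                # random biases
    H = hidden_output(X, W, b)
    beta = np.linalg.pinv(H) @ Y                             # least squares solution
    return W, b, beta

# Synthetic demo data: 50 samples, 3 features, numeric labels 1..5
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(50, 3))
Y_demo = rng.integers(1, 6, size=50).astype(float)
W, b, beta = train_elm(X_demo, Y_demo)
H_demo = hidden_output(X_demo, W, b)
print(beta.shape)
```

Only `beta` is fitted; `W` and `b` stay at their random initial values, which is why no iterative training is needed.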
A Seemingly Unrelated Regression (SUR) model captures the correlation between data sets from the perspective of the error space: after the variance-covariance matrix of the errors is computed, the model parameters are estimated by the generalized least squares method. By itself, the seemingly unrelated regression model can therefore only solve regression problems. This embodiment proposes, for the first time, a SUELM model for classification problems, which combines the seemingly unrelated regression model, exploiting the error correlation among data sets, with the extreme learning machine model. In other words, a machine learning algorithm is applied to a classification problem from the perspective of error-term correlation for the first time, in order to improve the accuracy of the prediction result. A seemingly unrelated regression model consists of several individual equations that appear unrelated to one another but whose random error terms are correlated. On this basis, for $m$ data sets with the same number of samples $N$ (each with independent variables $X$ and dependent variable $Y$), the following system of equations can be constructed, where $\beta$ is the model parameter to be estimated:
$$Y_i = X_i \beta_i + u_i, \quad i = 1, \dots, m$$
and the error terms $u$ satisfy the following two conditions:
$$E[u_i] = 0, \quad E[u_i u_j'] = \sigma_{ij} I_N, \quad i, j = 1, \dots, m$$
Based on the seemingly unrelated regression model, the model parameters can be estimated by the Aitken estimation method. Since the variance-covariance matrix of the error terms is generally unknown, the dependent variable can first be estimated by the least squares method and its error from the true value computed, which gives the following formula for the elements of the variance-covariance matrix:
$$\hat{\sigma}_{ij} = \frac{1}{N} (Y_i - \hat{Y}_i)' (Y_j - \hat{Y}_j)$$
where $Y_i$ denotes the true value of the dependent variable and $\hat{Y}_i$ denotes the estimate of the dependent variable. This formula captures the correlation between the error terms. Determining the model parameters with the resulting variance-covariance matrix improves the estimation efficiency of the equations, and the improvement is greater the higher the correlation. After the variance-covariance matrix $\hat{\Sigma} = [\hat{\sigma}_{ij}]$ is obtained, the model parameters $\beta$ are estimated by the following formula, which can be derived from the seemingly unrelated regression model, with $X$ the block-diagonal matrix of $X_1, \dots, X_m$ and $Y$ the stacked vector of $Y_1, \dots, Y_m$:
$$\hat{\beta} = \left[ X' (\hat{\Sigma}^{-1} \otimes I_N) X \right]^{-1} X' (\hat{\Sigma}^{-1} \otimes I_N) Y$$
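A short worked case shows why the error correlation matters. The stacked notation used here ($X$ block-diagonal in $X_1, \dots, X_m$, $Y$ the stacked $Y_1, \dots, Y_m$, and the generalized least squares estimator written with a Kronecker product) is the usual SUR convention and is an assumption about how the patent's formula is laid out. If the estimated error terms of different data sets are uncorrelated, the joint estimator collapses to separate least squares fits:

$$
\begin{aligned}
\hat{\beta} &= \left[ X' (\hat{\Sigma}^{-1} \otimes I_N) X \right]^{-1} X' (\hat{\Sigma}^{-1} \otimes I_N) Y, \\
\hat{\Sigma} = \operatorname{diag}(\hat{\sigma}_{11}, \dots, \hat{\sigma}_{mm})
&\;\Longrightarrow\; X' (\hat{\Sigma}^{-1} \otimes I_N) X = \operatorname{diag}\!\left( \hat{\sigma}_{ii}^{-1} X_i' X_i \right), \\
&\;\Longrightarrow\; \hat{\beta}_i = \left( \hat{\sigma}_{ii}^{-1} X_i' X_i \right)^{-1} \hat{\sigma}_{ii}^{-1} X_i' Y_i = (X_i' X_i)^{-1} X_i' Y_i .
\end{aligned}
$$

So the joint estimate differs from the per-data-set least squares solution only when some off-diagonal $\hat{\sigma}_{ij}$ is non-zero, which is consistent with the statement that the gain grows with the correlation.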
Optionally, the historical stock data set includes dependent variable data and corresponding independent variable data, and the dependent variable data includes a closing price. Correspondingly, establishing the seemingly unrelated regression model according to each training set and each extreme learning machine model to be trained corresponding to each historical stock data set comprises: inputting the independent variable data in each training set into the corresponding extreme learning machine model to be trained to obtain the hidden layer output matrix corresponding to the independent variable data in each training set; and establishing the seemingly unrelated regression model according to the dependent variable data in each training set and the hidden layer output matrix corresponding to the independent variable data in each training set.
Specifically, in this embodiment, the independent variable data in each training set is input into its own extreme learning machine model to be trained, and the hidden layer output matrix corresponding to the independent variable data in each training set is obtained from the formula for the hidden layer output matrix. Each $X$ in the equation system of the seemingly unrelated regression model is then replaced with the corresponding hidden layer output matrix, and each $Y$ with the dependent variable data of the corresponding training set, which establishes the seemingly unrelated regression model used in the method of this embodiment. In this model, $\beta$ denotes the hidden layer output weights of each extreme learning machine model and $u$ denotes the error term of each training set.
S13: The hidden layer output weights of each extreme learning machine model to be trained are determined according to the seemingly unrelated regression model to obtain each trained extreme learning machine model.
Specifically, after the seemingly unrelated regression model is established, its model parameter $\beta$ can be solved by the calculation process described above, which gives the hidden layer output weights of each extreme learning machine model to be trained and completes the training of the extreme learning machine models. That is, determining the hidden layer output weights of each extreme learning machine model to be trained according to the seemingly unrelated regression model may comprise: determining an estimated value of the dependent variable data in each training set by the least squares method; determining the variance-covariance matrix of the error terms based on the seemingly unrelated regression model and the estimated values; and determining the hidden layer output weights of each extreme learning machine model to be trained according to the seemingly unrelated regression model and the variance-covariance matrix of the error terms.
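The three steps just listed (least squares estimates, variance-covariance matrix of the error terms, generalized least squares for the output weights) can be sketched as below. This is an illustrative reading of the procedure, not the patent's own code; the function name `sur_output_weights`, the 1/N scaling of the covariance, and the synthetic data are assumptions.

```python
import numpy as np

def sur_output_weights(H_list, T_list):
    """Jointly estimate output weights for several ELMs via feasible GLS."""
    m, N = len(H_list), H_list[0].shape[0]
    # Step 1: per-data-set least squares estimates and their residuals
    resid = [T - H @ (np.linalg.pinv(H) @ T) for H, T in zip(H_list, T_list)]
    U = np.column_stack(resid)                 # N x m residual matrix
    # Step 2: variance-covariance matrix of the error terms
    sigma = (U.T @ U) / N                      # m x m
    # Step 3: generalized least squares on the stacked system
    sizes = [H.shape[1] for H in H_list]
    H_big = np.zeros((m * N, sum(sizes)))      # block-diagonal design matrix
    col = 0
    for i, H in enumerate(H_list):
        H_big[i * N:(i + 1) * N, col:col + sizes[i]] = H
        col += sizes[i]
    T_big = np.concatenate(T_list)
    omega_inv = np.kron(np.linalg.inv(sigma), np.eye(N))
    beta = np.linalg.solve(H_big.T @ omega_inv @ H_big,
                           H_big.T @ omega_inv @ T_big)
    return np.split(beta, np.cumsum(sizes)[:-1]), sigma

# Synthetic demo: two data sets whose errors share a common shock
rng = np.random.default_rng(0)
N = 40
H1, H2 = rng.normal(size=(N, 3)), rng.normal(size=(N, 4))
shared = rng.normal(size=N)
T1 = H1 @ np.array([1.0, -2.0, 0.5]) + shared + 0.1 * rng.normal(size=N)
T2 = H2 @ np.array([0.3, 0.0, 1.5, -1.0]) + shared + 0.1 * rng.normal(size=N)
betas, sigma = sur_output_weights([H1, H2], [T1, T2])
print(sigma)
```

The Kronecker product builds an mN x mN matrix, so this direct form is only practical for modest sample counts; it is written this way to mirror the formula rather than for efficiency.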
Further optionally, the number of historical stock data sets is two, and the seemingly unrelated regression model is:
$$\begin{cases} T_1 = H_1 \beta'_1 + u_1 \\ T_2 = H_2 \beta'_2 + u_2 \end{cases}$$
where $T_1$ and $T_2$ denote the dependent variable data in each training set, $H_1$ and $H_2$ denote the hidden layer output matrices corresponding to the independent variable data in each training set, $\beta'_1$ and $\beta'_2$ denote the hidden layer output weights of each extreme learning machine model to be trained, and $u_1$ and $u_2$ denote the error terms of each training set;
determining the estimated value of the dependent variable data in each training set by the least squares method comprises:
$$\hat{T}_i = H_i (H_i^{\top} H_i)^{-1} H_i^{\top} T_i$$
where $\hat{T}_i$ denotes the estimated value of the dependent variable data in the $i$-th training set, $i = 1, 2$, $H_i^{\top}$ denotes the transpose of the matrix $H_i$, and $(H_i^{\top} H_i)^{-1}$ denotes the inverse of the matrix $H_i^{\top} H_i$;
determining the variance-covariance matrix of the error terms based on the seemingly unrelated regression model and the estimated values comprises:
$$\hat{\sigma}_{ij} = \frac{1}{N} (T_i - \hat{T}_i)^{\top} (T_j - \hat{T}_j)$$
where $\hat{\Sigma} = [\hat{\sigma}_{ij}]$ denotes the variance-covariance matrix of the error terms, $N$ denotes the number of samples in the training set, and $j = 1, 2$;
determining the hidden layer output weights of each extreme learning machine model to be trained according to the seemingly unrelated regression model and the variance-covariance matrix of the error terms comprises:
$$\hat{\beta} = \left[ H^{\top} (\hat{\Sigma}^{-1} \otimes I_N) H \right]^{-1} H^{\top} (\hat{\Sigma}^{-1} \otimes I_N) T$$
where $H = \begin{bmatrix} H_1 & 0 \\ 0 & H_2 \end{bmatrix}$, $T = \begin{bmatrix} T_1 \\ T_2 \end{bmatrix}$, $\hat{\beta} = \begin{bmatrix} \hat{\beta}'_1 \\ \hat{\beta}'_2 \end{bmatrix}$, $I_N$ denotes the $N \times N$ identity matrix, and $\otimes$ denotes the Kronecker product of matrices. The above takes the case of two historical stock data sets as an example to detail how the seemingly unrelated regression model, which exploits the error correlation between data sets, is combined with the extreme learning machine model and applied to the stock tendency prediction process of this embodiment, so as to improve the accuracy of the hidden layer output weight estimates.
S14: The test set obtained by dividing the target historical stock data set is input into the corresponding trained extreme learning machine model so as to predict the tendency of the stock to be predicted.
After the trained extreme learning machine models are obtained, the future tendency can be predicted from the historical data of the stock to be predicted. Specifically, the independent variable data in the test set obtained by dividing the target historical stock data set is first input into the trained extreme learning machine model corresponding to the stock to be predicted to obtain a target hidden layer output matrix. The target hidden layer output matrix is then multiplied by the estimated hidden layer output weights $\hat{\beta}$ of that model to obtain the predicted value of the tendency of the stock to be predicted. When the dependent variable data is the closing price, the predicted value is a prediction of the closing price.
Further optionally, inputting the test set obtained by dividing the target historical stock data set into the corresponding trained extreme learning machine model so as to predict the tendency of the stock to be predicted comprises: inputting the test set obtained by dividing the target historical stock data set into the corresponding trained extreme learning machine model to obtain a predicted discrete label value; rounding the discrete label value; and determining the predicted target discrete label from the rounding result and the correspondence between discrete labels and numerical values established during digitization, so as to predict the trend of the stock to be predicted according to the target discrete label.
Specifically, this mirrors the earlier steps of setting and digitizing the discrete labels. The predicted discrete label value (i.e., the predicted value) is continuous, so to obtain the final discrete prediction it can be rounded to the nearest integer. The corresponding target discrete label is then looked up from the rounding result using the correspondence between discrete labels and numerical values established during digitization, and the future trend of the stock to be predicted can be evaluated from the target discrete label.
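The rounding and label lookup can be sketched as follows. The four label names are the ones listed in the text; clipping out-of-range values to the nearest valid label and the round-half-up rule are assumptions added here, and the matrices in the example are made up for illustration.

```python
import numpy as np

# Correspondence between numeric values and discrete labels (from digitization)
LABELS = {1: "big drop", 2: "small fluctuation", 3: "small rise", 4: "big rise"}

def predict_labels(H_test, beta, label_names):
    """Multiply the test hidden layer output by beta, round, and map to labels."""
    raw = H_test @ beta                               # continuous label values
    rounded = np.floor(raw + 0.5).astype(int)         # round half up
    clipped = np.clip(rounded, 1, len(label_names))   # assumption: clamp to valid range
    return [label_names[int(v)] for v in clipped]

# Hypothetical hidden layer outputs and output weights
H_test = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
beta = np.array([1.2, 3.6])
predicted = predict_labels(H_test, beta, LABELS)
print(predicted)
```

Here the raw outputs are 1.2, 3.6, and 4.8; the last one rounds to 5, which has no label, so it is clamped to the highest label.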
On the basis of the above technical solution, optionally, before the seemingly unrelated regression model is established according to the training sets and the extreme learning machine models to be trained corresponding to the historical stock data sets, the method further comprises: determining an estimated value of the dependent variable data in each training set by the least squares method; determining the error term of each training set from the estimated value and the corresponding true value; determining the correlation coefficient between the error term of the target training set, obtained by dividing the target historical stock data set, and the error term of each training set other than the target training set; and screening the historical stock data sets according to the correlation coefficients.
Specifically, the seemingly unrelated regression model improves model performance more the higher the correlation between the data sets. The historical stock data sets to be used can therefore be screened by their correlation, selecting those more strongly correlated with the target historical stock data set; in particular, a preset correlation threshold can be set. For example, suppose historical stock data sets of companies A, B, C, and D are obtained initially and the stock tendency of company A is to be predicted. The correlations between A and B, between A and C, and between A and D are calculated separately. If the correlation between A and B is below the preset correlation threshold while those between A and C and between A and D are above it, the historical stock data set of company B is eliminated, and the stock tendency of company A is predicted using only the historical stock data sets of companies A, C, and D.
Specifically, the evaluation may be performed by calculating a correlation coefficient between an error item of a target training set obtained by dividing the target historical stock data set and an error item of each training set other than the target training set. Firstly, determining an error term, determining an estimated value of the dependent variable data in each training set by using a least square method, and then determining the error term of each training set according to the estimated value and a corresponding true value, wherein the calculation process of the estimated value of the dependent variable data can be as above and is not described repeatedly. Then, for the calculation of the correlation coefficient, a Pearson correlation coefficient, a Spearman correlation coefficient, a Kendall correlation coefficient, a cosine similarity and other calculation methods can be adopted to implement the calculation, and taking a Pearson correlation coefficient method as an example, the statistical quantity of the similarity degree between the two random variables X, Y can be measured, and the calculation formula is as follows:
$$\rho_{X,Y} = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{E[(X - \mu_X)(Y - \mu_Y)]}{\sigma_X \sigma_Y}$$
wherein $\operatorname{cov}(X, Y)$ represents the covariance of X and Y, $\sigma_X$ and $\sigma_Y$ represent the standard deviations of X and Y respectively, $\mu_X$ and $\mu_Y$ represent the means of X and Y, and $E[\cdot]$ represents the expectation of the variable in square brackets. Accordingly, in the method of this embodiment, the correlation coefficient between the error terms of every two training sets can be obtained by the following formula:
$$\rho_{u_i,u_j} = \frac{\operatorname{cov}(u_i, u_j)}{\sqrt{D(u_i)\,D(u_j)}}$$
wherein $u_i$ and $u_j$ represent the error terms of the i-th and j-th training sets, and $D(\cdot)$ represents the variance of the variable in brackets. The larger the absolute value of the correlation coefficient, the stronger the correlation: the correlation is stronger when the coefficient is closer to 1 or -1, and weaker when it is closer to 0. Screening the historical stock data sets according to the correlation coefficient eliminates data with low correlation, which avoids wasting computing resources on data that contributes little to prediction performance. The amount of calculation is thus reduced while the improvement in model prediction performance is retained, and the prediction process becomes more efficient.
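The screening step described above can be sketched in Python with NumPy. This is a minimal illustration, not the patent's implementation: the function names, the synthetic data, the threshold value of 0.5, and the choice to compare the absolute value of the coefficient against the threshold are all assumptions made for the example.

```python
import numpy as np

def ols_residuals(H, T):
    """Error term of one training set: T minus its least-squares fit on H."""
    beta, *_ = np.linalg.lstsq(H, T, rcond=None)
    return T - H @ beta

def pearson(u_i, u_j):
    """Pearson coefficient cov(u_i, u_j) / sqrt(D(u_i) * D(u_j))."""
    d_i, d_j = u_i - u_i.mean(), u_j - u_j.mean()
    return float(d_i @ d_j / np.sqrt((d_i @ d_i) * (d_j @ d_j)))

def screen_datasets(residuals, target, threshold):
    """Keep the target plus every data set whose |correlation| with it reaches the threshold."""
    kept = [target]
    for name, u in residuals.items():
        if name != target and abs(pearson(residuals[target], u)) >= threshold:
            kept.append(name)
    return kept

# Synthetic demo (assumed data): A, C and D share a common shock, B does not.
rng = np.random.default_rng(0)
N = 200
shared = rng.normal(size=N)

def synthetic_residual(noise):
    H = rng.normal(size=(N, 3))                    # stand-in regressor matrix
    T = H @ np.array([1.0, -2.0, 0.5]) + noise     # stand-in dependent variable
    return ols_residuals(H, T)

residuals = {
    "A": synthetic_residual(shared + 0.1 * rng.normal(size=N)),
    "B": synthetic_residual(rng.normal(size=N)),
    "C": synthetic_residual(shared + 0.3 * rng.normal(size=N)),
    "D": synthetic_residual(-shared + 0.3 * rng.normal(size=N)),
}
kept = screen_datasets(residuals, target="A", threshold=0.5)
print(kept)
```

Comparing the absolute value keeps strongly negatively correlated data sets such as D, in line with the statement that correlation strength is measured by the absolute value of the coefficient.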
In the technical solution provided by this embodiment of the invention, at least two historical stock data sets are first acquired, and each is divided into a training set and a test set, one of the historical stock data sets being the target historical stock data set of the stock to be predicted. An extreme learning machine model is then trained for each stock: a seemingly unrelated regression model is established according to the training set and the extreme learning machine model of each stock, and the hidden layer output weights of each extreme learning machine model are determined according to the seemingly unrelated regression model, yielding the trained extreme learning machine models. Finally, the test set obtained by dividing the target historical stock data set is input into the corresponding trained extreme learning machine model, which outputs a prediction result for the tendency of the stock to be predicted. Because the hidden layer output weights are estimated with the seemingly unrelated regression model, which takes the correlation between different data sets into account, the estimate is more accurate. This improves the classification performance of the extreme learning machine model and the accuracy of stock tendency prediction, and helps the user make better decisions.
Example two
Fig. 2 is a schematic structural diagram of a stock tendency prediction apparatus according to a second embodiment of the present invention, which may be implemented by hardware and/or software, and may be generally integrated into a computer device. As shown in fig. 2, the apparatus includes:
the stock data acquisition module 21 is configured to acquire at least two historical stock data sets, and divide each historical stock data set into a training set and a test set, respectively, where the at least two historical stock data sets include a target historical stock data set of a stock to be predicted;
the regression model establishing module 22 is configured to establish a seemingly unrelated regression model according to each training set and each extreme learning machine model to be trained corresponding to each historical stock data set;
the model training module 23 is configured to determine hidden layer output weights of the extreme learning machine models to be trained according to the seemingly unrelated regression model to obtain the trained extreme learning machine models;
and the tendency prediction module 24 is configured to input a test set obtained by dividing the target historical stock data set into a corresponding trained extreme learning machine model so as to predict the tendency of the stock to be predicted.
In the technical solution provided by this embodiment of the invention, at least two historical stock data sets are first acquired, and each is divided into a training set and a test set, one of the historical stock data sets being the target historical stock data set of the stock to be predicted. An extreme learning machine model is then trained for each stock: a seemingly unrelated regression model is established according to the training set and the extreme learning machine model of each stock, and the hidden layer output weights of each extreme learning machine model are determined according to the seemingly unrelated regression model, yielding the trained extreme learning machine models. Finally, the test set obtained by dividing the target historical stock data set is input into the corresponding trained extreme learning machine model, which outputs a prediction result for the tendency of the stock to be predicted. Because the hidden layer output weights are estimated with the seemingly unrelated regression model, which takes the correlation between different data sets into account, the estimate is more accurate. This improves the classification performance of the extreme learning machine model and the accuracy of stock tendency prediction, and helps the user make better decisions.
On the basis of the above technical solution, optionally, the historical stock data set includes dependent variable data and corresponding independent variable data, and the dependent variable data includes closing price;
accordingly, the regression model building module 22 includes:
the output matrix obtaining unit is used for inputting the independent variable data in each training set into the corresponding extreme learning machine model to be trained to obtain a hidden layer output matrix corresponding to the independent variable data in each training set;
and the regression model establishing unit is used for establishing the seemingly unrelated regression model according to the dependent variable data in each training set and the hidden layer output matrix corresponding to the independent variable data in each training set.
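As an illustration of what the output matrix obtaining unit computes, the following Python sketch builds a hidden layer output matrix for each of two training sets. It assumes the usual extreme learning machine construction, in which input weights and biases are drawn at random and then kept fixed; the sigmoid activation, the uniform range [-1, 1], the layer sizes, and the function names are assumptions for the example rather than details taken from this text.

```python
import numpy as np

def make_hidden_layer(n_features, n_hidden, rng):
    """Draw random input weights and biases once; they are not trained afterwards."""
    W = rng.uniform(-1.0, 1.0, size=(n_features, n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    return W, b

def hidden_output(X, W, b):
    """Hidden layer output matrix H of shape (N, L), using a sigmoid activation."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

rng = np.random.default_rng(0)
X1 = rng.random((50, 4))   # normalized independent variable data, training set 1
X2 = rng.random((50, 4))   # normalized independent variable data, training set 2

# One extreme learning machine model per historical stock data set.
W1, b1 = make_hidden_layer(4, 10, rng)
W2, b2 = make_hidden_layer(4, 10, rng)
H1 = hidden_output(X1, W1, b1)
H2 = hidden_output(X2, W2, b2)
print(H1.shape, H2.shape)
```

Each matrix then plays the role of the regressor matrix in one equation of the seemingly unrelated regression model, with the dependent variable data of the same training set on the left-hand side.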
On the basis of the above technical solution, optionally, the model training module 23 includes:
an estimated value determining unit for determining an estimated value of the dependent variable data in each training set by using a least square method;
a covariance matrix determination unit for determining a variance-covariance matrix of the error terms based on the seemingly unrelated regression model and the estimated values;
and the output weight determining unit is used for determining the hidden layer output weights of the extreme learning machine models to be trained according to the seemingly unrelated regression model and the variance-covariance matrix of the error terms.
On the basis of the above technical solution, optionally, the number of the historical stock data sets is two, and the seemingly unrelated regression model is:
$$\begin{cases} T_1 = H_1 \beta'_1 + u_1 \\ T_2 = H_2 \beta'_2 + u_2 \end{cases}$$
wherein $T_1$ and $T_2$ represent the dependent variable data in each training set, $H_1$ and $H_2$ represent the hidden layer output matrices corresponding to the independent variable data in each training set, $\beta'_1$ and $\beta'_2$ represent the hidden layer output weights of each extreme learning machine model to be trained, and $u_1$ and $u_2$ represent the error terms of each training set;
the estimation value determination unit is specifically configured to:
$$\hat{T}_i = H_i \left(H_i^{\top} H_i\right)^{-1} H_i^{\top} T_i$$
wherein $\hat{T}_i$ represents the estimated value of the dependent variable data in the i-th training set, $i = 1, 2$, $H_i^{\top}$ represents the transpose of the matrix $H_i$, and $\left(H_i^{\top} H_i\right)^{-1}$ represents the inverse of the matrix $H_i^{\top} H_i$;
the covariance matrix determination unit is specifically configured to:
$$\hat{\Sigma} = \begin{pmatrix} \hat{\sigma}_{11} & \hat{\sigma}_{12} \\ \hat{\sigma}_{21} & \hat{\sigma}_{22} \end{pmatrix}, \qquad \hat{\sigma}_{ij} = \frac{1}{N} \left(T_i - \hat{T}_i\right)^{\top} \left(T_j - \hat{T}_j\right)$$
wherein $\hat{\Sigma}$ represents the variance-covariance matrix of the error terms, N represents the number of samples in the training set, and $j = 1, 2$;
the output weight determination unit is specifically configured to:
$$\hat{\beta}' = \left[H^{\top} \left(\hat{\Sigma}^{-1} \otimes I_N\right) H\right]^{-1} H^{\top} \left(\hat{\Sigma}^{-1} \otimes I_N\right) T$$
wherein $\hat{\beta}' = \begin{pmatrix} \beta'_1 \\ \beta'_2 \end{pmatrix}$, $H = \begin{pmatrix} H_1 & 0 \\ 0 & H_2 \end{pmatrix}$, $T = \begin{pmatrix} T_1 \\ T_2 \end{pmatrix}$, $I_N$ represents an N x N identity matrix, and $\otimes$ represents the Kronecker product of matrices.
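The three units above (least-squares estimates, variance-covariance matrix of the error terms, output weights via the Kronecker product with $I_N$) can be sketched in Python as follows. The sketch assumes the standard feasible generalized least squares form of the seemingly unrelated regression estimator, with the error covariance taken as residual cross-products divided by N. The function name `sur_output_weights` and the synthetic data are hypothetical.

```python
import numpy as np

def sur_output_weights(H_list, T_list):
    """Estimate stacked output weights for several equations under a SUR model."""
    M, N = len(H_list), H_list[0].shape[0]

    # Step 1: equation-by-equation least squares gives the error terms.
    resid = []
    for H, T in zip(H_list, T_list):
        beta_ols, *_ = np.linalg.lstsq(H, T, rcond=None)
        resid.append(T - H @ beta_ols)
    U = np.column_stack(resid)                     # N x M residual matrix

    # Step 2: variance-covariance matrix of the error terms.
    sigma = U.T @ U / N                            # M x M

    # Step 3: generalized least squares on the stacked block-diagonal system.
    sizes = [H.shape[1] for H in H_list]
    H_big = np.zeros((M * N, sum(sizes)))
    col = 0
    for m, H in enumerate(H_list):
        H_big[m * N:(m + 1) * N, col:col + H.shape[1]] = H
        col += H.shape[1]
    T_big = np.concatenate(T_list)
    omega_inv = np.kron(np.linalg.inv(sigma), np.eye(N))   # Sigma^{-1} (x) I_N
    lhs = H_big.T @ omega_inv @ H_big
    rhs = H_big.T @ omega_inv @ T_big
    beta = np.linalg.solve(lhs, rhs)
    return np.split(beta, np.cumsum(sizes)[:-1]), sigma

# Synthetic demo (assumed data): two equations with correlated error terms.
rng = np.random.default_rng(1)
N = 100
H1, H2 = rng.normal(size=(N, 3)), rng.normal(size=(N, 3))
beta1_true = np.array([1.0, -1.0, 0.5])
beta2_true = np.array([0.3, 2.0, -0.7])
noise = rng.multivariate_normal([0.0, 0.0], [[0.01, 0.008], [0.008, 0.01]], size=N)
T1 = H1 @ beta1_true + noise[:, 0]
T2 = H2 @ beta2_true + noise[:, 1]

(beta1, beta2), sigma = sur_output_weights([H1, H2], [T1, T2])
print(beta1, beta2)
```

Forming the full Kronecker product is the most direct transcription of the formula but creates an (MN) x (MN) matrix; for large training sets an implementation would more likely exploit the block structure instead.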
On the basis of the above technical solution, optionally, the stock tendency prediction apparatus further includes:
the estimation value determining module is used for determining the estimation value of the dependent variable data in each training set by using a least square method before establishing a seemingly unrelated regression model according to each training set and each extreme learning machine model to be trained corresponding to each historical stock data set;
the error item determining module is used for determining the error items of each training set according to the estimated values and the corresponding real values;
the correlation coefficient determining module is used for respectively determining the correlation coefficients between the error items of the target training set obtained by dividing the target historical stock data set and the error items of other training sets except the target training set;
and the data set screening module is used for screening the historical stock data set according to the correlation coefficient.
On the basis of the above technical solution, optionally, the historical stock data set includes dependent variable data and corresponding independent variable data, and the dependent variable data includes closing price;
correspondingly, the stock tendency prediction device further comprises:
the discrete label setting module is used for dividing the closing price in each historical stock data set into a plurality of price intervals after at least two historical stock data sets are obtained, and setting discrete labels for the price intervals;
the discrete label numeralization module is used for converting the discrete label of each price interval into a numerical value;
the dependent variable data updating module is used for updating the corresponding dependent variable data according to the numerical values obtained for the closing prices in each historical stock data set;
and the independent variable data normalization module is used for normalizing the independent variable data in each historical stock data set.
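The preprocessing modules listed above can be illustrated with a short Python sketch. The interval boundaries, label names, and sample prices are invented for the example, and min-max scaling is only one possible normalization, chosen here as an assumption because the text does not name a specific method.

```python
import numpy as np

def discretize_prices(close, edges):
    """Map each closing price to the integer index of the price interval it falls in."""
    return np.digitize(close, edges)

def min_max_normalize(X):
    """Scale each independent-variable column to the range [0, 1]."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)   # avoid division by zero for constant columns
    return (X - lo) / span

# Hypothetical price intervals: below 10, [10, 11), [11, 12), 12 and above.
edges = np.array([10.0, 11.0, 12.0])
labels = ["low", "mid-low", "mid-high", "high"]
value_to_label = dict(enumerate(labels))     # numerical value -> discrete label

close = np.array([9.5, 10.2, 11.8, 12.4, 10.9])
T = discretize_prices(close, edges)          # updated dependent variable data

X = np.array([[1.0, 100.0], [2.0, 300.0], [3.0, 200.0]])
Xn = min_max_normalize(X)
print(T, [value_to_label[int(v)] for v in T])
```

Keeping the `value_to_label` mapping is what later allows a predicted numerical value to be translated back into a discrete label.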
On the basis of the above technical solution, optionally, the tendency prediction module 24 includes:
the discrete label value obtaining module is used for inputting a test set obtained by dividing the target historical stock data set into a corresponding trained extreme learning machine model to obtain a predicted discrete label value;
the rounding module is used for rounding the discrete label numerical value;
and the discrete label determining module is used for determining a predicted target discrete label according to the corresponding relation between the discrete label and each numerical value in the numeralization process and the rounding result so as to predict the trend of the stock to be predicted according to the target discrete label.
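The rounding and label lookup performed by these modules can be sketched as follows. The mapping and the raw model outputs are invented for the example. Clipping out-of-range values to the nearest valid label is an added assumption, since the text does not say how such values are handled, and the sketch further assumes that labels were assigned consecutive integers.

```python
import numpy as np

def decode_predictions(raw, value_to_label):
    """Round raw model outputs and map them back to discrete labels."""
    values = sorted(value_to_label)
    rounded = np.rint(raw).astype(int)                   # round to nearest integer
    clipped = np.clip(rounded, values[0], values[-1])    # assumption: clamp to valid range
    return [value_to_label[int(v)] for v in clipped]

# Hypothetical correspondence fixed during numeralization.
value_to_label = {0: "low", 1: "mid-low", 2: "mid-high", 3: "high"}

raw = [0.2, 1.7, 2.9, 3.6, -0.4]   # continuous outputs of a trained model
predicted = decode_predictions(raw, value_to_label)
print(predicted)
```

Note that `np.rint` rounds exact halves to the nearest even integer, so a different tie-breaking rule would need an explicit implementation.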
The stock tendency prediction apparatus provided by the embodiment of the invention can execute the stock tendency prediction method provided by any embodiment of the invention, and has the functional modules and beneficial effects corresponding to the executed method.
It should be noted that, in the embodiment of the stock tendency prediction apparatus, the included units and modules are only divided according to the functional logic, but are not limited to the above division as long as the corresponding functions can be realized; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.
Example three
Fig. 3 is a schematic structural diagram of a computer device provided in the third embodiment of the present invention, and shows a block diagram of an exemplary computer device suitable for implementing the embodiment of the present invention. The computer device shown in fig. 3 is only an example, and should not impose any limitation on the function and the scope of use of the embodiments of the present invention. As shown in fig. 3, the computer device includes a processor 31, a memory 32, an input device 33, and an output device 34. The number of processors 31 in the computer device may be one or more; one processor 31 is taken as an example in fig. 3. The processor 31, the memory 32, the input device 33 and the output device 34 in the computer device may be connected by a bus or in other ways; connection by a bus is taken as an example in fig. 3.
The memory 32 is a computer-readable storage medium, and can be used for storing software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the stock tendency prediction method in the embodiment of the present invention (for example, the stock data acquisition module 21, the regression model establishment module 22, the model training module 23, and the tendency prediction module 24 in the stock tendency prediction apparatus). The processor 31 executes various functional applications and data processing of the computer device by executing software programs, instructions and modules stored in the memory 32, that is, implements the stock tendency prediction method described above.
The memory 32 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function; the data storage area may store data created according to use of the computer device, and the like. Further, the memory 32 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, the memory 32 may further include memory located remotely from the processor 31, which may be connected to the computer device over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 33 may be used to acquire historical stock data sets of a plurality of company stocks and to generate key signal inputs related to user settings and function controls of the computer apparatus, etc. The output device 34 may include a display screen or the like, and may be used to present the predicted outcome of the stock to the user.
Example four
A fourth embodiment of the present invention further provides a storage medium containing computer-executable instructions which, when executed by a computer processor, perform a stock tendency prediction method, the method comprising:
acquiring at least two historical stock data sets, and dividing each historical stock data set into a training set and a testing set respectively, wherein the at least two historical stock data sets comprise a target historical stock data set of stocks to be predicted;
establishing a seemingly unrelated regression model according to each training set and each extreme learning machine model to be trained corresponding to each historical stock data set;
determining hidden layer output weights of all extreme learning machine models to be trained according to the seemingly unrelated regression models to obtain all trained extreme learning machine models;
and inputting a test set obtained by dividing the target historical stock data set into a corresponding trained extreme learning machine model so as to predict the tendency of the stock to be predicted.
The storage medium may be any of various types of memory devices or storage devices. The term "storage medium" is intended to include: installation media such as CD-ROM, floppy disk, or tape devices; computer system memory or random access memory such as DRAM, DDR RAM, SRAM, EDO RAM, Rambus RAM, etc.; non-volatile memory such as flash memory, magnetic media (e.g., hard disk or optical storage); registers or other similar types of memory elements, etc. The storage medium may also include other types of memory or combinations thereof. In addition, the storage medium may be located in the computer system in which the program is executed, or may be located in a different second computer system connected to the computer system through a network (such as the internet). The second computer system may provide the program instructions to the computer for execution. The term "storage medium" may include two or more storage media that may reside in different locations, such as in different computer systems that are connected by a network. The storage medium may store program instructions (e.g., embodied as a computer program) that are executable by one or more processors.
Of course, the storage medium provided by the embodiment of the present invention contains computer-executable instructions, and the computer-executable instructions are not limited to the method operations described above, and may also perform related operations in the stock tendency prediction method provided by any embodiment of the present invention.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly, can also be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A stock tendency prediction method, comprising:
acquiring at least two historical stock data sets, and dividing each historical stock data set into a training set and a testing set respectively, wherein the at least two historical stock data sets comprise a target historical stock data set of stocks to be predicted;
establishing a seemingly unrelated regression model according to the training sets and the extreme learning machine models to be trained corresponding to the historical stock data sets;
determining hidden layer output weights of the extreme learning machine models to be trained according to the seemingly unrelated regression model to obtain the trained extreme learning machine models;
and inputting a test set obtained by dividing the target historical stock data set into a corresponding trained extreme learning machine model so as to predict the tendency of the stock to be predicted.
2. The stock tendency prediction method of claim 1, wherein the historical stock data set includes dependent variable data and corresponding independent variable data, the dependent variable data including closing prices;
correspondingly, the establishing of the seemingly unrelated regression model according to each training set and each extreme learning machine model to be trained corresponding to each historical stock data set includes:
inputting the independent variable data in each training set into a corresponding extreme learning machine model to be trained to obtain a hidden layer output matrix corresponding to the independent variable data in each training set;
and establishing the seemingly unrelated regression model according to the dependent variable data in each training set and the hidden layer output matrix corresponding to the independent variable data in each training set.
3. The stock tendency prediction method of claim 2, wherein the determining hidden layer output weights of the extreme learning machine models to be trained according to the seemingly unrelated regression model comprises:
determining an estimated value of the dependent variable data in each training set by using a least squares method;
determining a variance-covariance matrix of error terms based on the seemingly unrelated regression model and the estimated values;
and determining hidden layer output weights of the extreme learning machine models to be trained according to the seemingly unrelated regression model and the variance-covariance matrix of the error terms.
4. The stock tendency prediction method of claim 3, wherein the number of historical stock data sets is two, and the seemingly unrelated regression model is:
$$\begin{cases} T_1 = H_1 \beta'_1 + u_1 \\ T_2 = H_2 \beta'_2 + u_2 \end{cases}$$
wherein $T_1$ and $T_2$ represent the dependent variable data in each of said training sets, $H_1$ and $H_2$ represent the hidden layer output matrices corresponding to the independent variable data in each training set, $\beta'_1$ and $\beta'_2$ represent the hidden layer output weights of each extreme learning machine model to be trained, and $u_1$ and $u_2$ represent the error terms of respective ones of the training sets;
the determining an estimated value of dependent variable data in each of the training sets using a least squares method includes:
$$\hat{T}_i = H_i \left(H_i^{\top} H_i\right)^{-1} H_i^{\top} T_i$$
wherein $\hat{T}_i$ represents the estimated value of the dependent variable data in the i-th said training set, $i = 1, 2$, $H_i^{\top}$ represents the transpose of the matrix $H_i$, and $\left(H_i^{\top} H_i\right)^{-1}$ represents the inverse of the matrix $H_i^{\top} H_i$;
said determining a variance-covariance matrix of error terms based on said seemingly unrelated regression model and said estimated values comprises:
$$\hat{\Sigma} = \begin{pmatrix} \hat{\sigma}_{11} & \hat{\sigma}_{12} \\ \hat{\sigma}_{21} & \hat{\sigma}_{22} \end{pmatrix}, \qquad \hat{\sigma}_{ij} = \frac{1}{N} \left(T_i - \hat{T}_i\right)^{\top} \left(T_j - \hat{T}_j\right)$$
wherein $\hat{\Sigma}$ represents the variance-covariance matrix of the error terms, N represents the number of samples in the training set, and $j = 1, 2$;
the determining hidden layer output weights of the extreme learning machine models to be trained according to the seemingly unrelated regression model and the variance-covariance matrix of the error terms comprises:
$$\hat{\beta}' = \left[H^{\top} \left(\hat{\Sigma}^{-1} \otimes I_N\right) H\right]^{-1} H^{\top} \left(\hat{\Sigma}^{-1} \otimes I_N\right) T$$
wherein $\hat{\beta}' = \begin{pmatrix} \beta'_1 \\ \beta'_2 \end{pmatrix}$, $H = \begin{pmatrix} H_1 & 0 \\ 0 & H_2 \end{pmatrix}$, $T = \begin{pmatrix} T_1 \\ T_2 \end{pmatrix}$, $I_N$ represents an N x N identity matrix, and $\otimes$ represents the Kronecker product of matrices.
5. The stock tendency prediction method according to claim 1, further comprising, before the establishing a seemingly unrelated regression model according to the training sets and the extreme learning machine models to be trained corresponding to the historical stock data sets:
determining an estimated value of the dependent variable data in each training set by using a least squares method;
determining an error term of each training set according to the estimated value and the corresponding true value;
respectively determining correlation coefficients between the error term of a target training set obtained by dividing the target historical stock data set and the error terms of the other training sets except the target training set;
and screening the historical stock data sets according to the correlation coefficients.
6. The stock tendency prediction method of claim 1, wherein the historical stock data set includes dependent variable data and corresponding independent variable data, the dependent variable data including closing prices;
correspondingly, after the acquiring at least two historical stock data sets, the method further includes:
dividing the closing price in each historical stock data set into a plurality of price intervals, and setting discrete labels for each price interval;
converting the discrete label of each price interval into a numerical value;
updating the corresponding dependent variable data according to the numerical values obtained for the closing prices in each historical stock data set;
and normalizing the independent variable data in each historical stock data set.
7. The stock tendency prediction method according to claim 6, wherein the inputting a test set obtained by dividing the target historical stock data set into a corresponding trained extreme learning machine model to predict the tendency of the stock to be predicted comprises:
inputting a test set obtained by dividing the target historical stock data set into a corresponding trained extreme learning machine model to obtain a predicted discrete label value;
rounding the discrete label value;
and determining a predicted target discrete label according to the corresponding relation between the discrete label and each numerical value in the digitization process and the rounding result so as to predict the trend of the stock to be predicted according to the target discrete label.
8. A stock tendency prediction apparatus, comprising:
the stock data acquisition module is used for acquiring at least two historical stock data sets and dividing each historical stock data set into a training set and a testing set respectively, wherein the at least two historical stock data sets comprise a target historical stock data set of stocks to be predicted;
the regression model establishing module is used for establishing a seemingly unrelated regression model according to each training set and each extreme learning machine model to be trained corresponding to each historical stock data set;
the model training module is used for determining hidden layer output weights of all extreme learning machine models to be trained according to the seemingly unrelated regression model so as to obtain all trained extreme learning machine models;
and the trend prediction module is used for inputting a test set obtained by dividing the target historical stock data set into a corresponding trained extreme learning machine model so as to predict the trend of the stock to be predicted.
9. A computer device, comprising:
one or more processors;
a memory for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the stock tendency prediction method of any of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out a stock tendency prediction method according to any one of claims 1 to 7.
CN202011518772.6A 2020-12-21 2020-12-21 Stock tendency prediction method and device, computer equipment and storage medium Pending CN112561179A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011518772.6A CN112561179A (en) 2020-12-21 2020-12-21 Stock tendency prediction method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011518772.6A CN112561179A (en) 2020-12-21 2020-12-21 Stock tendency prediction method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112561179A true CN112561179A (en) 2021-03-26

Family

ID=75030668

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011518772.6A Pending CN112561179A (en) 2020-12-21 2020-12-21 Stock tendency prediction method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112561179A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022222230A1 (en) * 2021-04-23 2022-10-27 平安科技(深圳)有限公司 Indicator prediction method and apparatus based on machine learning, and device and storage medium
CN113344710A (en) * 2021-06-25 2021-09-03 未鲲(上海)科技服务有限公司 Model training method, device and storage medium
WO2022267162A1 (en) * 2021-06-25 2022-12-29 未鲲(上海)科技服务有限公司 Model training method and apparatus, and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 2021-03-26