CN109523386A - Investment portfolio risk prediction method combining GMM and LSTM - Google Patents

Publication number
CN109523386A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811217293.3A
Other languages
Chinese (zh)
Inventor
程良伦
黄振杰
吴梓宏
王卓薇
邱安波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-10-18
Filing date
2018-10-18
Publication date
2019-03-26

Classifications

    • G PHYSICS
        • G06 COMPUTING; CALCULATING OR COUNTING
            • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
                • G06N3/00 Computing arrangements based on biological models
                    • G06N3/02 Neural networks
                        • G06N3/04 Architecture, e.g. interconnection topology
                            • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
                        • G06N3/08 Learning methods
            • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
                • G06Q10/00 Administration; Management
                    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
                    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
                        • G06Q10/063 Operations research, analysis or management
                            • G06Q10/0635 Risk analysis of enterprise or organisation activities
                • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
                    • G06Q40/06 Asset management; Financial planning or analysis

Abstract

The invention discloses an investment portfolio risk prediction method combining GMM and LSTM. The method comprises: collecting stock data, selecting the non-noise stock series that satisfy the stationarity condition, and preprocessing them; reducing the dimensionality of the data with PCA and clustering the reduced data with a Gaussian mixture model (GMM); selecting a representative of each cluster from the clustering result to form the investment portfolio, and splitting the dataset of the cluster representatives; predicting the portfolio with a three-layer LSTM model, evaluating the return and risk of the prediction result, and generating a high-return, low-risk portfolio through repeated prediction. The invention can improve the efficiency of portfolio selection and the accuracy of risk prediction.

Description

Investment portfolio risk prediction method combining GMM and LSTM
Technical field
The present invention relates to the field of quantitative finance, and more particularly to an investment portfolio risk prediction method combining GMM and LSTM.
Background
In portfolio construction, a primary problem is how to select a suitable combination of assets from hundreds of stocks. When the number of stocks is large, common statistical analysis methods are inefficient and imprecise at selecting a small number of stocks as a suitable combination with respect to risk prediction. Investors need more efficient and more accurate portfolio selection and risk prediction methods to obtain high-return, low-risk stock portfolios.
Artificial intelligence is now widely used in finance. Clustering methods can identify relationships hidden among the features of a dataset, and these relationships can be used to separate thousands of data points into several distinct groups. For a portfolio, the greater the difference between its stocks, the more controllable its risk and the higher its return. When the number of stocks grows to hundreds or thousands, however, the differences between stocks become hard to find. A Gaussian mixture model (GMM) can quantify a dataset precisely: whatever pattern the data exhibit, it can be fitted by a mixture of several single Gaussian models. For long stock return series, appropriate dimensionality reduction improves the efficiency of clustering.
Time-series mining is likewise widely used in financial risk prediction. LSTM is a recurrent neural network that, by opening and closing its input gate, forget gate, and output gate, gives a recurrent neural network (RNN) the ability to remember results from earlier steps. It can therefore process and predict stock datasets with long time lags, making risk controllable and reducing it. Given the above, selecting a portfolio by clustering and predicting the portfolio risk by time-series forecasting can effectively help investors choose low-risk, high-return portfolios. The main open problems are the choice of models and parameters, and how to combine the clustering model with the prediction model.
Summary of the invention
To overcome at least one of the defects of the prior art described above, the present invention provides an investment portfolio risk prediction method combining GMM and LSTM.
The present invention aims to solve the above technical problems at least to some extent.
The primary object of the invention is to provide an investment portfolio risk prediction method combining GMM and LSTM.
To solve the above technical problems, the technical solution of the invention is as follows:
An investment portfolio risk prediction method combining GMM and LSTM, comprising the following steps:
S1: obtain a stock dataset and calculate its rates of return; through data preprocessing, obtain a stock dataset that matches the input format of the GMM model;
S2: determine the number of clusters k from the number of stocks n required by the portfolio; perform principal component analysis on the dataset obtained after preprocessing in S1, examine the correlation of each dimension, and obtain the optimal dimension d; apply PCA dimensionality reduction to the preprocessed stock dataset from S1; estimate the parameters of a GMM on the reduced dataset and train it; then cluster the stocks in the reduced dataset with the trained GMM;
S3: from the clustering result of step S2, find the center of each cluster and take it as the representative stock of that cluster; construct an XY dataset from the non-reduced return series of the representative stocks; split the constructed dataset into two parts in the ratio M:N, with N as the training dataset and M as the test dataset;
S4: build a three-layer LSTM model and train it on the training dataset from step S3; use the trained LSTM model to make portfolio predictions on the test data; obtain the expected return and the risk value by computing the mean of the predicted values and the weighted average of the variance between the predicted and actual values;
S5: judge whether the expected return and risk value obtained in step S4 satisfy the set evaluation criteria; if not, adjust the parameters in step S2 and repeat steps S2-S4 until the expected return and risk value meet the criteria.
Further, the stock rate of return in step S1 is calculated as follows:
where the symbols denote, respectively, the rate of return of the i-th stock on day t, the specific date (date_n), the opening price, and the closing price.
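As a minimal sketch of the return calculation, the snippet below assumes that the daily rate of return is the change from opening price to closing price relative to the opening price. That definition, the function name `daily_returns`, and the sample prices are assumptions for illustration and are not confirmed by the text above.

```python
def daily_returns(open_prices, close_prices):
    """Return one rate of return per trading day for a single stock.

    Assumed definition (not confirmed by the source): (close - open) / open.
    """
    if len(open_prices) != len(close_prices):
        raise ValueError("open and close series must have the same length")
    return [(c - o) / o for o, c in zip(open_prices, close_prices)]


# Hypothetical prices for three trading days of one stock.
opens = [10.0, 11.0, 10.5]
closes = [11.0, 10.45, 10.5]
returns = daily_returns(opens, closes)
```

If the method instead defines the return relative to the previous day's close, only the expression inside the list comprehension would change.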
Further, the preprocessing includes stationarity analysis and unit-root analysis, which select the stock series that are stationary and are not noise.
Further, constructing the dataset in step S3 requires determining the input dimension and the output dimension; the constructed dataset is split with M equal to 4 and N equal to 1, i.e. the test dataset and the training dataset are divided in the ratio 4:1.
Further, building the LSTM model in step S4 includes initializing the parameters of each layer, setting the activation function to the hyperbolic tangent and the loss function to mean squared error, setting adaptive moment estimation (Adam) as the optimization method of the model, and iteratively training the LSTM model with mini-batch stochastic gradient descent.
Further, the expected return and the risk value are calculated as follows:
where the first two symbols are, respectively, the expected return and the risk value, the next is the rate of return predicted by the LSTM model, x is the actual rate of return of the portfolio, and T is the number of prediction days.
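The evaluation step can be sketched as follows. This is one possible reading, not the formula of the method: it assumes the expected return is the plain mean of the T predicted returns and the risk value is the equally weighted mean of the squared differences between predicted and actual returns. The function name and the sample values are hypothetical.

```python
def expected_return_and_risk(predicted, actual):
    """Return (expectation, risk) over T prediction days.

    Assumptions: expectation = mean of the predictions;
    risk = mean squared difference between predictions and actual values.
    """
    T = len(predicted)
    if T == 0 or T != len(actual):
        raise ValueError("need two non-empty series of equal length")
    expectation = sum(predicted) / T
    risk = sum((p - a) ** 2 for p, a in zip(predicted, actual)) / T
    return expectation, risk


# Hypothetical predicted and actual portfolio returns for T = 3 days.
predicted = [1.0, 2.0, 3.0]
actual = [1.0, 1.0, 5.0]
expectation, risk = expected_return_and_risk(predicted, actual)
```

Unequal weights, for example weights that favor recent days, would replace the division by T with a weighted sum.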
Preferably,
Compared with the prior art, the technical solution of the invention has the following beneficial effects:
By analyzing the temporal features of stock series data, the invention builds a GMM on PCA-reduced data to cluster the data, predicts the portfolio with a three-layer LSTM model, and selects the optimal portfolio. The invention improves the efficiency of portfolio selection and the accuracy of portfolio risk prediction.
Brief description of the drawings
Fig. 1 is a framework diagram of the method of an embodiment of the invention.
Fig. 2 is a flowchart of the method of an embodiment of the invention.
Fig. 3 is a schematic diagram of the clustering result of the invention.
Fig. 4 shows the portfolio curve predicted by an embodiment of the invention together with the actual portfolio curve.
Detailed description of the embodiments
The technical solution of the invention is further described below with reference to the drawings and embodiments.
Embodiment 1
The framework of the investment portfolio risk prediction method combining GMM and LSTM provided by the invention is shown in Fig. 1, and its implementation flow is shown in Fig. 2. Suppose a portfolio in which every stock has equal weight is to be selected, and the dataset contains the most recent 100 days of data for 1000 stocks.
Suppose a portfolio of 5 stocks is to be chosen; the number of clusters is then 5, i.e. the GMM has 5 cluster centers. The LSTM model consists of three LSTM layers, where the input dimension of each layer matches the output dimension of the previous layer. The parameters are set as follows: the input layer has dimension 15; the first LSTM layer has output dimension 20, the second 40, and the third 80; each LSTM layer is followed by a dropout layer with a dropout rate of 0.2; the last layer is a fully connected layer with output dimension 1. The activation function of the LSTM model is tanh, the loss function is the mean squared error (MSE), and the optimization algorithm is adaptive moment estimation, i.e. the Adam algorithm.
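The layer sizes above can be traced with a small forward pass. The following is a minimal NumPy sketch, not the implementation of the method: the weights are random and untrained, dropout is omitted because it is inactive at prediction time, and it assumes that an input of dimension 15 means 15 consecutive time steps with one return value each. The names `init_lstm` and `lstm_forward` are illustrative.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_lstm(input_dim, hidden_dim, rng):
    """Random weights for one LSTM layer; the four gates are stacked row-wise."""
    scale = 1.0 / np.sqrt(hidden_dim)
    return {
        "W": rng.uniform(-scale, scale, (4 * hidden_dim, input_dim)),
        "U": rng.uniform(-scale, scale, (4 * hidden_dim, hidden_dim)),
        "b": np.zeros(4 * hidden_dim),
    }


def lstm_forward(params, xs):
    """Run one LSTM layer over xs of shape (steps, input_dim)."""
    hidden_dim = params["U"].shape[1]
    h = np.zeros(hidden_dim)  # hidden state passed to the next step
    c = np.zeros(hidden_dim)  # cell state, the layer's memory
    outputs = []
    for x in xs:
        z = params["W"] @ x + params["U"] @ h + params["b"]
        i, f, o, g = np.split(z, 4)  # input, forget, output gates and candidate
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        outputs.append(h)
    return np.stack(outputs)


rng = np.random.default_rng(0)
layer_dims = [1, 20, 40, 80]  # each layer's input size equals the previous output size
layers = [init_lstm(layer_dims[k], layer_dims[k + 1], rng) for k in range(3)]
w_out = rng.normal(size=80)   # fully connected layer with output dimension 1
b_out = 0.0

window = rng.normal(size=(15, 1))  # hypothetical window of 15 returns
seq = window
shapes = []
for layer in layers:
    seq = lstm_forward(layer, seq)
    shapes.append(seq.shape)
prediction = float(w_out @ seq[-1] + b_out)  # last time step feeds the dense layer
```

The recorded `shapes` show the sequence widening from 20 to 40 to 80 features while keeping 15 time steps; only the final step of the third layer is passed to the fully connected layer.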
The specific implementation steps are as follows:
S1: crawl the most recent 100 days of data for 3000 stocks from the web and compute the daily rate of return of each stock from these data. After the set of stock return series is obtained, perform stationarity analysis and unit-root analysis. The return data of the 1000 stocks whose series are stationary and are not white noise are converted to the XY dataset format and used as the input dataset S.
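The text does not name the tests used for this screening. As one possible illustration, the sketch below uses the Ljung-Box Q statistic to flag series that are not white noise. The choice of test, the 10 lags, the 5% significance level, and the function names are all assumptions. The unit-root check (for example an augmented Dickey-Fuller test from a statistics library) is not shown.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation of the series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom


def ljung_box_q(series, max_lag=10):
    """Ljung-Box Q = n(n+2) * sum over lags of rho_k^2 / (n - k)."""
    n = len(series)
    return n * (n + 2) * sum(
        autocorrelation(series, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


# 95% quantile of the chi-square distribution with 10 degrees of freedom.
# It matches max_lag=10 only; another lag count needs another quantile.
CHI2_95_DF10 = 18.307


def is_white_noise(series, max_lag=10, critical=CHI2_95_DF10):
    """True when the test does not reject the white-noise hypothesis."""
    return ljung_box_q(series, max_lag) <= critical


# A smooth, strongly autocorrelated series is clearly not white noise.
smooth = [math.sin(0.3 * t) for t in range(100)]
smooth_is_noise = is_white_noise(smooth)
```

Under this reading, a series would be kept for clustering only when `is_white_noise` returns `False` and a separate unit-root test indicates stationarity.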
S2: with the number of clusters set to 5, perform principal component analysis (PCA) on the stock data, examine the correlation of each dimension, determine that the dimension d is 20, and reduce the original 100-dimensional data with PCA. Then estimate the parameters of the GMM: first initialize a set of parameters, then train repeatedly until the parameters stabilize. Use the trained model to cluster the dimension-reduced stocks, taking the class with the highest probability as the class of each stock. Fig. 3 shows the clustering result after all data points are mapped to two dimensions; the classes are marked with rectangles, crosses, horizontal bars, vertical bars, and circles respectively. Although the distribution of the data in the 20-dimensional space cannot be visualized, the points mapped to two dimensions still show a certain regularity.
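The reduction and clustering of step S2 can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the implementation of the method: the GMM uses diagonal covariances, the means are initialized by a deterministic farthest-point rule, and the data are synthetic (100 hypothetical stocks in 5 hidden groups instead of 1000 real ones). A library implementation such as scikit-learn's `PCA` and `GaussianMixture` could be used in place of the hand-written functions.

```python
import numpy as np


def pca_reduce(X, d):
    """Project the rows of X onto the top-d principal components."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:d].T


def responsibilities(Z, weights, means, variances):
    """Posterior probability of each component per row, plus the log-likelihood."""
    diff = Z[:, None, :] - means[None, :, :]
    log_pdf = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                      + np.sum(np.log(2 * np.pi * variances), axis=1))
    log_joint = log_pdf + np.log(weights)
    log_norm = np.logaddexp.reduce(log_joint, axis=1)
    return np.exp(log_joint - log_norm[:, None]), log_norm.sum()


def fit_gmm(Z, k, n_iter=100, tol=1e-6):
    """EM for a diagonal-covariance GMM; returns weights, means, variances."""
    n, d = Z.shape
    means = [Z[0]]  # farthest-point initialization of the means
    for _ in range(1, k):
        dist = np.min([np.sum((Z - m) ** 2, axis=1) for m in means], axis=0)
        means.append(Z[np.argmax(dist)])
    means = np.array(means)
    variances = np.full((k, d), Z.var(axis=0).mean())
    weights = np.full(k, 1.0 / k)
    prev = -np.inf
    for _ in range(n_iter):
        resp, loglik = responsibilities(Z, weights, means, variances)  # E-step
        nk = resp.sum(axis=0) + 1e-12                                  # M-step
        weights = nk / n
        means = (resp.T @ Z) / nk[:, None]
        variances = np.maximum((resp.T @ Z ** 2) / nk[:, None] - means ** 2, 1e-6)
        if abs(loglik - prev) < tol:  # stop once the parameters have stabilized
            break
        prev = loglik
    return weights, means, variances


rng = np.random.default_rng(0)
groups = np.repeat(np.arange(5), 20)   # 100 synthetic stocks in 5 hidden groups
patterns = rng.normal(size=(5, 100))   # one 100-day return pattern per group
X = patterns[groups] + 0.05 * rng.normal(size=(100, 100))

Z = pca_reduce(X, d=20)                # 100 dimensions reduced to d = 20
weights, means, variances = fit_gmm(Z, k=5)
resp, _ = responsibilities(Z, weights, means, variances)
labels = resp.argmax(axis=1)           # class with the highest probability
```

On real return data the groups are far less separated than in this synthetic example, so the number of EM iterations and the initialization would matter more.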
S3: compute the distances between the points in each cluster and select the center point as the representative of that cluster; the combination of these representatives is the portfolio to be evaluated. Construct the dataset from the non-reduced return series of the representative stocks. According to the settings above, the input layer dimension is 15 and the fully connected layer outputs 1 value: every 15 consecutive data points form one input X, and the output Y is the 16th point, so 100 days of data yield 85 samples. The dataset is split into two parts in the ratio 4:1, that is, the first 66 samples form the training set and the last 19 samples form the test set.
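Step S3 can be sketched in plain Python as follows. Two points are assumptions: the "center point" is read here as the medoid, the cluster member with the smallest total distance to the other members, and the split is chronological. The function names, the sample series, and the sample points are hypothetical.

```python
import math


def cluster_representative(points):
    """Index of the point with the smallest total distance to the others."""
    totals = [sum(math.dist(p, q) for q in points) for p in points]
    return totals.index(min(totals))


def make_windows(series, window=15):
    """Pair each run of `window` consecutive values (X) with the next value (Y)."""
    count = len(series) - window
    xs = [series[i:i + window] for i in range(count)]
    ys = [series[i + window] for i in range(count)]
    return xs, ys


def chronological_split(xs, ys, n_train):
    """Keep time order: earliest samples for training, the rest for testing."""
    return (xs[:n_train], ys[:n_train]), (xs[n_train:], ys[n_train:])


# Hypothetical 100-day return series of one representative stock.
series = [0.01 * t for t in range(100)]
xs, ys = make_windows(series, window=15)
(train_x, train_y), (test_x, test_y) = chronological_split(xs, ys, n_train=66)

# Hypothetical 2-D cluster members; the third point lies between the other two.
representative = cluster_representative([(0.0, 0.0), (1.0, 0.0), (0.4, 0.1)])
```

With 100 days and a window of 15 this yields 85 samples, of which 66 go to training and 19 to testing, matching the counts stated above.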
S4: build the LSTM model according to the settings above, with 200 iterations; the training method is mini-batch stochastic gradient descent (MSGD) with a batch size of 10. Train the model, then use the trained model to predict the test data and compute the expected return and the risk value. If the expected return of the portfolio is higher than 1 or the risk is lower than 20, the portfolio meets the low-risk, high-return requirement and is output. Otherwise, the result is fed back to S2, d is adjusted, and S2-S4 are iterated until an ideal portfolio is obtained. Fig. 4 shows the portfolio curve predicted by the LSTM model and the actual portfolio curve. By calculation, the expected return is 1.59 and the risk is 18.38, which satisfies the condition, so the code of each stock in the portfolio is output for the investor's reference.
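The feedback between S4 and S2 can be sketched as a simple search over the PCA dimension d. The thresholds 1 and 20 and the accepted result (1.59, 18.38) for d = 20 come from the embodiment; the function names, the candidate list, and the rejected result for d = 10 are made up for illustration, and `evaluate` is a stand-in for running steps S2-S4.

```python
def meets_criteria(expectation, risk, min_expectation=1.0, max_risk=20.0):
    """Acceptance rule as worded in the embodiment: either condition suffices."""
    return expectation > min_expectation or risk < max_risk


def search_dimension(candidate_dims, evaluate):
    """Try each PCA dimension d in turn; `evaluate(d)` stands for steps S2-S4."""
    for d in candidate_dims:
        expectation, risk = evaluate(d)
        if meets_criteria(expectation, risk):
            return d, expectation, risk
    return None  # no candidate met the criteria


# Stand-in results: d = 10 is invented; d = 20 uses the embodiment's figures.
stand_in_results = {10: (0.4, 35.0), 20: (1.59, 18.38)}
result = search_dimension([10, 20], lambda d: stand_in_results[d])
```

A stricter rule that requires both conditions would replace `or` with `and` in `meets_criteria`.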

Claims (6)

1. An investment portfolio risk prediction method combining GMM and LSTM, characterized by comprising the following steps:
S1: obtain a stock dataset and calculate its rates of return; through data preprocessing, obtain a stock dataset that matches the input format of the GMM model;
S2: determine the number of clusters k from the number of stocks n required by the portfolio; perform principal component analysis on the dataset obtained after preprocessing in S1, examine the correlation of each dimension, and obtain the optimal dimension d; apply PCA dimensionality reduction to the preprocessed stock dataset from S1; estimate the parameters of a GMM on the reduced dataset and train it; then cluster the stocks in the reduced dataset with the trained GMM;
S3: from the clustering result of step S2, find the center of each cluster and take it as the representative stock of that cluster; construct an XY dataset from the non-reduced return series of the representative stocks; split the constructed dataset into two parts in the ratio M:N, with N as the training dataset and M as the test dataset;
S4: build a three-layer LSTM model and train it on the training dataset from step S3; use the trained LSTM model to make portfolio predictions on the test data; obtain the expected return and the risk value by computing the mean of the predicted values and the weighted average of the variance between the predicted and actual values;
S5: judge whether the expected return and risk value obtained in step S4 satisfy the set evaluation criteria; if not, adjust the parameters in step S2 and repeat steps S2-S4 until the expected return and risk value meet the criteria.
2. The investment portfolio risk prediction method combining GMM and LSTM according to claim 1, characterized in that the stock rate of return in step S1 is calculated as follows:
where the symbols denote, respectively, the rate of return of the i-th stock on day t, the specific date (date_n), the opening price, and the closing price.
3. The investment portfolio risk prediction method combining GMM and LSTM according to claim 1, characterized in that the preprocessing includes stationarity analysis and unit-root analysis, which select the stock series that are stationary and are not noise.
4. The investment portfolio risk prediction method combining GMM and LSTM according to claim 1, characterized in that constructing the dataset in step S3 requires determining the input dimension and the output dimension, and the constructed dataset is split with M equal to 4 and N equal to 1, i.e. the test dataset and the training dataset are divided in the ratio 4:1.
5. The investment portfolio risk prediction method combining GMM and LSTM according to claim 1, characterized in that building the LSTM model in step S4 includes initializing the parameters of each layer, setting the activation function to the hyperbolic tangent and the loss function to mean squared error, setting adaptive moment estimation (Adam) as the optimization method of the model, and iteratively training the LSTM model with mini-batch stochastic gradient descent.
6. The investment portfolio risk prediction method combining GMM and LSTM according to claim 1, characterized in that the expected return and the risk value are calculated as follows:
where the first two symbols are, respectively, the expected return and the risk value, the next is the rate of return predicted by the LSTM model, x is the actual rate of return of the portfolio, and T is the number of prediction days.
CN201811217293.3A 2018-10-18 2018-10-18 Investment portfolio risk prediction method combining GMM and LSTM Pending CN109523386A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811217293.3A CN109523386A (en) 2018-10-18 2018-10-18 Investment portfolio risk prediction method combining GMM and LSTM

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811217293.3A CN109523386A (en) 2018-10-18 2018-10-18 Investment portfolio risk prediction method combining GMM and LSTM

Publications (1)

Publication Number Publication Date
CN109523386A true CN109523386A (en) 2019-03-26

Family

ID=65770962

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811217293.3A Pending CN109523386A (en) 2018-10-18 2018-10-18 Investment portfolio risk prediction method combining GMM and LSTM

Country Status (1)

Country Link
CN (1) CN109523386A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110070145A (en) * 2019-04-30 2019-07-30 天津开发区精诺瀚海数据科技有限公司 LSTM wheel hub single-item energy consumption prediction based on increment cluster
CN110070145B (en) * 2019-04-30 2021-04-27 天津开发区精诺瀚海数据科技有限公司 LSTM hub single-product energy consumption prediction based on incremental clustering
CN112257974A (en) * 2020-09-09 2021-01-22 北京无线电计量测试研究所 Gas lock well risk prediction model data set, model training method and application

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190326