CN109767333A - Fund selection method, device, electronic equipment and computer readable storage medium
- Publication number
- CN109767333A (application CN201811536792.9A)
- Authority
- CN
- China
- Prior art keywords
- classifier
- fund
- banking operation
- operation data
- data
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The present invention relates to a fund selection method, a device, electronic equipment and a computer readable storage medium. The method includes: reading banking operation data and constructing factors of the banking operation data from the data read; dividing the banking operation data and the factors into a training data set and a test data set; fitting multiple classifiers respectively using the training data set and the test data set; determining an optimal classifier from the fitted classifiers; obtaining current fund data and using the optimal classifier to select a preset number of funds from it as the investment portfolio assets; and applying a major asset class allocation model to the investment portfolio assets to determine the investment weight of each fund in the portfolio. Because the optimal classifier is chosen from the fitted classifiers and then used to select a preset number of funds from the current fund data, the accuracy of fund selection predictions is improved.
Description
Technical field
The present invention relates to the field of financial management, and in particular to a fund selection method, device, electronic equipment and computer readable storage medium based on machine learning algorithms.
Background art
In the prior art, fund selection mainly relies on multi-factor linear models. However, the influence of many factors on next-period returns is not linear, so a nonlinear model must be fitted to historical data in order to make more accurate fund selection predictions.
Summary of the invention
In view of the foregoing, it is necessary to provide a fund selection method, device, electronic equipment and computer readable storage medium that improve the accuracy of fund selection predictions.
A first aspect of the application provides a fund selection method, the method comprising:
reading banking operation data, and constructing factors of the banking operation data according to the banking operation data read, wherein the factors include one or more of annualized return, maximum drawdown, Sharpe ratio, downside standard deviation, Sortino ratio, the size of the fund company, the size of the fund, and the fund manager's years of experience;
dividing the banking operation data and the factors of the banking operation data into a training data set and a test data set;
fitting multiple classifiers respectively using the training data set and the test data set, wherein the classifiers include a logistic regression classifier, a support vector machine classifier, a Gaussian naive Bayes classifier and a random forest classifier;
determining an optimal classifier from the fitted classifiers;
obtaining current fund data, and using the optimal classifier to select a preset number of funds from the obtained current fund data as the investment portfolio assets; and
performing asset allocation on the investment portfolio assets with a major asset class allocation model, thereby determining the investment weights of the funds in the investment portfolio assets.
Preferably, reading the banking operation data and constructing the factors of the banking operation data according to the banking operation data read includes: cleaning the obtained banking operation data, and constructing the factors of the cleaned banking operation data according to the cleaned data.
Preferably, cleaning the obtained banking operation data includes:
removing funds that have no net value from the banking operation data;
removing structured (graded) funds and money market funds from the banking operation data;
removing from the banking operation data funds that had been listed for less than one year as of the current month-end;
removing from the banking operation data funds whose net value went without an update for a continuous stretch exceeding 20% of the trading days in the year preceding the current month-end; and
removing from the banking operation data funds whose net assets at the current month-end are below 10,000,000 yuan, and keeping the remaining funds.
Preferably, dividing the banking operation data and the factors of the banking operation data into a training data set and a test data set includes:
determining whether the next-month return of the banking operation data exceeds the index of funds of the same category, labeling the banking operation data 1 if it does and 0 otherwise, and thereby generating label data; and
randomly dividing the factors of the banking operation data and the label data at a preset ratio to generate the training data set and the test data set.
Preferably, determining the optimal classifier from the fitted classifiers includes:
testing each fitted classifier using the test data set and calculating a first stability coefficient C1 for each classifier, where C1 represents how stable the classifier is when applied to the test data;
testing each fitted classifier using current funds and calculating a second stability coefficient C2 for each classifier, where C2 represents how stable the classifier is when applied to the current funds;
testing each fitted classifier using current funds and calculating the information ratio IR of each classifier's top portfolio;
calculating the final index A of each classifier according to the formula A=C1*C2*IR; and
determining the classifier with the largest final index A as the optimal classifier.
Preferably, the method further includes:
performing the asset allocation of the major asset class allocation model on the investment portfolio assets at a preset period, so that the investment weights of the investment portfolio assets are updated regularly.
Preferably, the major asset class allocation model is a risk parity model.
A second aspect of the application provides a fund selection device, the device comprising:
a building module, configured to read banking operation data and construct factors of the banking operation data according to the banking operation data read, wherein the factors include one or more of annualized return, maximum drawdown, Sharpe ratio, downside standard deviation, Sortino ratio, the size of the fund company, the size of the fund, and the fund manager's years of experience;
a data division module, configured to divide the banking operation data and the factors of the banking operation data into a training data set and a test data set;
a fitting module, configured to fit multiple classifiers respectively using the training data set and the test data set, wherein the classifiers include a logistic regression classifier, a support vector machine classifier, a Gaussian naive Bayes classifier and a random forest classifier;
a classifier determining module, configured to determine an optimal classifier from the fitted classifiers;
an investment portfolio assets determining module, configured to obtain current fund data and use the optimal classifier to select a preset number of funds from the obtained current fund data as the investment portfolio assets; and
an asset allocation module, configured to perform asset allocation on the investment portfolio assets with a major asset class allocation model, thereby determining the investment weights of the funds in the investment portfolio assets.
A third aspect of the application provides electronic equipment. The electronic equipment includes a processor, and the processor implements the fund selection method when executing a computer program stored in a memory.
A fourth aspect of the application provides a computer readable storage medium on which a computer program is stored. The computer program implements the fund selection method when executed by a processor.
The present application determines an optimal classifier from the fitted classifiers and uses it to select a preset number of funds from the obtained current fund data as the investment portfolio assets, thereby improving the accuracy of fund selection predictions.
Brief description of the drawings
Fig. 1 is a flow chart of the fund selection method in an embodiment of the present invention.
Fig. 2 is a structure diagram of the fund selection device in an embodiment of the present invention.
Fig. 3 is a schematic diagram of a preferred embodiment of the electronic equipment of the present invention.
Detailed description of the embodiments
To make the objects, features and advantages of the present invention easier to understand, the invention is described in detail below with reference to the accompanying drawings and specific embodiments. It should be noted that, where no conflict arises, the embodiments of the application and the features in the embodiments can be combined with each other.
Numerous specific details are set forth in the following description to facilitate a full understanding of the invention. The described embodiments are only some of the embodiments of the invention, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the invention, without creative work, fall within the protection scope of the invention.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the technical field of the invention. The terms used in the specification are intended only to describe specific embodiments and are not intended to limit the invention.
Preferably, the fund selection method of the present invention is applied in one or more electronic devices. An electronic device is a device that can automatically perform numerical calculation and/or information processing according to instructions set or stored in advance. Its hardware includes, but is not limited to, a microprocessor, an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (DSP), an embedded device, and the like.
The electronic device may be a computing device such as a desktop computer, a laptop, a tablet computer or a cloud server. The device can interact with a user through a keyboard, mouse, remote control, touch pad, voice-control device or the like.
Embodiment 1
In this embodiment, the method is applied in multiple operation systems. For example, the multiple operation systems 1 may be a first operation system, a second operation system, ..., and an N-th operation system. In this embodiment, the operation systems may be the operation systems of financial service institutions within the same financial system. For example, each bank or securities firm has its own business system, so the first operation system may be the operation system of financial service institution A and the second operation system may be that of financial service institution B. Each operation system includes at least one server, which is connected over a network to the terminal devices of all clients of the financial service institution and to the terminal devices of the institution's staff. In this embodiment, the at least one server is connected to a storage device. The storage device stores the data of all clients of the financial service institution, such as client identity information, account information and asset allocation information. The storage device connected to the at least one server may be a local storage device or a storage device connected over a network.
Fig. 1 is a flow chart of the fund selection method in an embodiment of the present invention. Depending on requirements, the order of the steps in the flow chart may be changed and certain steps may be omitted.
Referring to Fig. 1, the method specifically includes the following steps:
Step S21: read banking operation data, and construct factors of the banking operation data according to the banking operation data read.
In this embodiment, the banking operation data can be read from the server corresponding to each operation system. The banking operation data include, but are not limited to, public funds, net value data of public funds, fund company data, fund manager data, and the like. In this embodiment, factors of the banking operation data are constructed after the data are read. For example, in one embodiment, the factors constructed from the banking operation data may include, but are not limited to, annualized return, maximum drawdown, Sharpe ratio, downside standard deviation, Sortino ratio, the size of the fund company, the size of the fund, and the fund manager's years of experience. The annualized return of a fund is the expected return obtained from purchasing the fund product, converted to an annual rate. The maximum drawdown of a fund is the largest decline in return measured from any historical point in the selected period to the subsequent lowest point of the product's net value; it describes the worst case that could occur after purchasing the fund product. The Sharpe ratio is a standardized index for evaluating fund performance. The downside standard deviation of a fund measures how much the fund's value may fluctuate: the larger the standard deviation, the more the fund's future net value may change, the lower its stability and the higher the investment risk. The Sortino ratio is a measure of a portfolio's relative performance: the higher the ratio, the higher the excess return the fund earns per unit of downside risk borne.
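As a minimal sketch of how several of these factors might be computed from a fund's net value series, the following uses only the Python standard library. The function name `fund_factors`, the default of 252 periods per year and the zero risk-free rate are assumptions made for illustration; the source does not specify the exact formulas.

```python
import math

def fund_factors(nav, periods_per_year=252, risk_free=0.0):
    """Illustrative factors from a list of net values, oldest first (hypothetical helper)."""
    rets = [nav[i] / nav[i - 1] - 1.0 for i in range(1, len(nav))]
    n = len(rets)
    # Annualized return from total growth over the sample.
    annualized = (nav[-1] / nav[0]) ** (periods_per_year / n) - 1.0
    # Maximum drawdown: largest decline from a running peak to a later trough.
    peak, max_dd = nav[0], 0.0
    for v in nav:
        peak = max(peak, v)
        max_dd = max(max_dd, (peak - v) / peak)
    mean = sum(rets) / n
    std = math.sqrt(sum((r - mean) ** 2 for r in rets) / (n - 1))
    # Downside deviation: only returns below the per-period target contribute.
    target = risk_free / periods_per_year
    downside = math.sqrt(sum(min(r - target, 0.0) ** 2 for r in rets) / n)
    scale = math.sqrt(periods_per_year)
    excess = mean - target
    return {
        "annualized_return": annualized,
        "max_drawdown": max_dd,
        "sharpe": excess / std * scale if std > 0 else float("nan"),
        "downside_deviation": downside * scale,
        "sortino": excess / downside * scale if downside > 0 else float("nan"),
    }
```

The size and manager-experience factors are plain attributes of the fund and would simply be joined to this dictionary.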
In this embodiment, step S21, "read banking operation data, and construct factors of the banking operation data according to the banking operation data read", further includes: cleaning the obtained banking operation data, and constructing the factors of the cleaned banking operation data according to the cleaned data. In this embodiment, cleaning the obtained banking operation data includes:
removing funds that have no net value from the banking operation data;
removing structured (graded) funds and money market funds from the banking operation data;
removing from the banking operation data funds that had been listed for less than one year as of the current month-end;
removing from the banking operation data funds whose net value went without an update for a continuous stretch exceeding 20% of the trading days in the year preceding the current month-end;
removing from the banking operation data funds whose net assets at the current month-end are below 10,000,000 yuan.
In this embodiment, removing funds that had been listed for less than one year as of the current month-end means removing funds for which the time from issuance to the current month-end is less than one year. Removing funds whose net value went without an update for a continuous stretch exceeding 20% of the trading days means the following: if, within the year preceding the current month-end, a fund's net value was not updated for a continuous stretch exceeding 20% of the trading days, the fund is removed. In this embodiment, the banking operation data read are cleaned with respect to fund type, fund net value and similar aspects, yielding banking operation data that can be used for study.
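The cleaning rules above can be sketched as a simple filter. The record layout is hypothetical: the field names `latest_nav`, `category`, `list_date`, `stale_day_ratio` (the share of the past year's trading days covered by the longest stretch without a net value update) and `net_assets` are assumptions for illustration, not names taken from the source.

```python
from datetime import date

# Categories excluded by the cleaning rules (labels are hypothetical).
EXCLUDED_CATEGORIES = {"structured", "money_market"}

def clean_funds(funds, month_end):
    """Keep only the fund records that pass all five cleaning rules."""
    kept = []
    for f in funds:
        if f["latest_nav"] is None:                      # no net value
            continue
        if f["category"] in EXCLUDED_CATEGORIES:         # structured or money market
            continue
        if (month_end - f["list_date"]).days < 365:      # listed for less than a year
            continue
        if f["stale_day_ratio"] > 0.20:                  # net value stale too long
            continue
        if f["net_assets"] < 10_000_000:                 # below 10,000,000 yuan
            continue
        kept.append(f)
    return kept
```

Expressing each rule as its own check keeps the thresholds (one year, 20%, 10,000,000 yuan) easy to audit against the text.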
Step S22: divide the banking operation data and the factors of the banking operation data into a training data set and a test data set.
In this embodiment, banking operation data can be obtained from the operation system 1 at a fixed time each month (for example, the 1st of each month) and built into factors of the banking operation data, and the banking operation data are labeled to generate label data. In this embodiment, the label data can be generated as follows: determine whether the next-month return of the banking operation data exceeds the index of funds of the same category; if so, label the corresponding banking operation data 1, otherwise label it 0. The factors of the banking operation data and the generated label data are then randomly divided into a training data set and a test data set.
In this embodiment, the training data set and the test data set are different. The training data set consists of one part of the factors of the banking operation data and the label data, and the test data set consists of another part. In this embodiment, the factors of the banking operation data and the label data are randomly divided at a preset ratio to generate the training data set and the test data set. The preset ratio is the ratio of the amount of data in the training data set to the amount of data in the test data set, and can be determined according to the specific application. In one embodiment, the preset ratio of training data set to test data set is 8:2; that is, the training data set consists of a random 80% of the factors and label data, and the test data set consists of the remaining 20%.
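The labeling rule and the 8:2 random split can be sketched as follows. The function name `label_and_split`, its arguments and the fixed random seed are assumptions for illustration only.

```python
import random

def label_and_split(factors, next_month_returns, peer_index_returns,
                    train_ratio=0.8, seed=0):
    """Label each fund 1 if its next-month return beats its peer index, else 0,
    then split (factors, label) rows at random by the preset ratio."""
    labels = [1 if r > b else 0
              for r, b in zip(next_month_returns, peer_index_returns)]
    rows = list(zip(factors, labels))
    rng = random.Random(seed)      # seeded only so the sketch is reproducible
    rng.shuffle(rows)
    cut = int(round(len(rows) * train_ratio))
    return rows[:cut], rows[cut:]
```

With ten funds and the default ratio, eight rows land in the training data set and two in the test data set.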
Step S23: fit multiple classifiers respectively using the training data set and the test data set.
In this embodiment, the multiple classifiers include a logistic regression classifier, a support vector machine classifier, a Gaussian naive Bayes classifier and a random forest classifier. In this embodiment, at each fixed time point (for example, the 1st of each month), the training data set and the test data set can be used to fit the logistic regression classifier, the support vector machine classifier, the Gaussian naive Bayes classifier and the random forest classifier respectively. In another embodiment, the training data set and the test data set at each fixed time point can also be fitted with different derived classifiers obtained by varying the parameters of each classifier.
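One way to fit the four classifier families named above is with scikit-learn, which the source does not mention; the library choice, the function name `fit_classifiers` and the hyperparameters below are assumptions for illustration. Which rows are passed in as `X` and `y` is left to the caller.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

def fit_classifiers(X, y, seed=0):
    """Fit one model per classifier family and return them keyed by name."""
    models = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        # probability=True so the SVM can output the probability values
        # that the later ranking steps rely on.
        "svm": SVC(probability=True, random_state=seed),
        "gaussian_nb": GaussianNB(),
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    for model in models.values():
        model.fit(X, y)
    return models
```

Each fitted model exposes `predict_proba`, whose second column can serve as the probability that a fund beats its peer index.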
Step S24: determine the optimal classifier from the fitted classifiers.
In this embodiment, each fitted classifier is tested using the test data set and using current funds, the final index A of each classifier is determined, and the classifier with the largest final index A is determined as the optimal classifier. Specifically, the step "determine the optimal classifier from the fitted classifiers" includes:
(S241) Test each fitted classifier using the test data set, and calculate the first stability coefficient C1 of each classifier.
In this embodiment, when each fitted classifier is tested using the test data set, the data in the test data set are divided into five tiers in descending order of the probability value output by the classifier: the first, second, third, fourth and fifth tiers. If the first tier of the data as divided by a classifier has the highest return, the test result for that classifier is recorded as 1; otherwise it is recorded as 0. Each classifier is then tested on the test data sets obtained at the fixed time points (for example, the 1st of each month) over the past several years, and the test result of each classifier is recorded according to the above principle. This yields an array of 1s and 0s, and the first stability coefficient C1 is obtained by averaging the array. The first stability coefficient C1 represents how stable the classifier is when applied to the test data.
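The stability coefficient described above can be sketched in a few lines. The function names are hypothetical, the tier return is taken to be the simple average of the realized returns in the tier (an assumption, since the source does not say how a tier's return is aggregated), and each period is assumed to contain at least five funds.

```python
def top_tier_hit(probabilities, realized_returns, n_tiers=5):
    """Return 1 if the highest-probability tier has the highest mean return, else 0."""
    order = sorted(range(len(probabilities)),
                   key=lambda i: probabilities[i], reverse=True)
    size = len(order) // n_tiers
    tier_means = []
    for t in range(n_tiers):
        start = t * size
        # The last tier absorbs any remainder.
        end = (t + 1) * size if t < n_tiers - 1 else len(order)
        idx = order[start:end]
        tier_means.append(sum(realized_returns[i] for i in idx) / len(idx))
    # A tie for the highest mean is counted as a hit in this sketch.
    return 1 if tier_means[0] == max(tier_means) else 0

def stability_coefficient(periods):
    """Average the 0/1 results over (probabilities, realized_returns) pairs,
    one pair per fixed time point."""
    hits = [top_tier_hit(p, r) for p, r in periods]
    return sum(hits) / len(hits)
```

The same routine applies to both C1 and C2; only the data fed in (test data set versus current funds) differs.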
(S242) Test each fitted classifier using current funds, and calculate the second stability coefficient C2 of each classifier.
In this embodiment, when each fitted classifier is tested using current funds, the current funds are divided into five tiers in descending order of the probability value output by the classifier: the first, second, third, fourth and fifth tiers. If the first tier of the current funds as divided by a classifier has the highest return, the test result for that classifier is recorded as 1; otherwise it is recorded as 0. Each classifier is then tested on the current funds at the fixed time points (the 1st of each month) over the past several years, and the test result of each classifier is recorded according to the above principle. This yields an array of 1s and 0s, and the second stability coefficient C2 is obtained by averaging the array. The second stability coefficient C2 represents how stable the classifier is when used for real prediction.
(S243) Test each fitted classifier using current funds, and calculate the information ratio IR (Information Ratio) of each classifier's top portfolio.
In this embodiment, when each fitted classifier is tested using current funds, the current funds are sorted in descending order of the probability value output by the classifier and the top N funds are taken. Each classifier is then tested on the current funds at the fixed time points over the past several years; at each time point the current funds are sorted in descending order of the classifier's output probability, the top N funds are taken to form the top portfolio, and the information ratio IR of the top portfolio corresponding to each classifier is calculated.
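A minimal sketch of the top-N selection and the information ratio follows. The source does not state which benchmark the IR is measured against, so the `benchmark_returns` argument is an assumption, as are the function names; the IR is computed in its usual form, the mean active return divided by the sample standard deviation of active returns.

```python
import math

def top_n_indices(probabilities, n):
    """Indices of the n funds with the highest classifier probability."""
    order = sorted(range(len(probabilities)),
                   key=lambda i: probabilities[i], reverse=True)
    return order[:n]

def information_ratio(portfolio_returns, benchmark_returns):
    """Mean active return over its sample standard deviation (tracking error)."""
    active = [p - b for p, b in zip(portfolio_returns, benchmark_returns)]
    mean = sum(active) / len(active)
    var = sum((a - mean) ** 2 for a in active) / (len(active) - 1)
    if var == 0:
        return float("nan")   # undefined when the active return never varies
    return mean / math.sqrt(var)
```

Here `portfolio_returns` would hold the top portfolio's return at each fixed time point, for example the equal-weighted average return of the funds picked by `top_n_indices`.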
(S244) Calculate the final index A of each classifier according to the formula A=C1*C2*IR.
(S245) Determine the classifier with the largest final index A as the optimal classifier.
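Steps S244 and S245 reduce to a product and an argmax. In the sketch below the function name `pick_best` and the input layout (classifier name mapped to a `(C1, C2, IR)` tuple) are assumptions for illustration.

```python
def pick_best(scores):
    """scores maps classifier name -> (C1, C2, IR).
    Returns the name with the largest A = C1 * C2 * IR and all final indices."""
    final = {name: c1 * c2 * ir for name, (c1, c2, ir) in scores.items()}
    best = max(final, key=final.get)
    return best, final
```

Because A is a product, a classifier that is unstable on either data source (C1 or C2 near zero) is penalized even if its top portfolio shows a high IR.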
Step S25: obtain current fund data, and use the optimal classifier to select a preset number of funds from the obtained current fund data as the investment portfolio assets.
In this embodiment, the optimal classifier is applied to the current fund data, the current fund data are sorted in descending order of the probability value output by the optimal classifier, and the top N funds are taken as the investment portfolio assets.
Step S26: perform asset allocation on the investment portfolio assets with the major asset class allocation model, thereby determining the investment weights of the investment portfolio assets.
In this embodiment, the major asset class allocation model may be a risk parity model. In this embodiment, performing the asset allocation of the major asset class allocation model on the investment portfolio assets means performing the asset allocation of the risk parity model on the investment portfolio assets to determine their investment weights.
In this embodiment, the risk parity model is based on principal component analysis: linear combinations of the investment portfolio assets form mutually uncorrelated portfolios, the asset allocation of the risk parity model is performed on these uncorrelated portfolios, and the investment weights of the investment portfolio assets are finally determined. For example, in one embodiment, suppose the investment portfolio assets contain N assets with returns $R = [r_1, r_2, \ldots, r_N]'$ and portfolio weights $w = [w_1, w_2, \ldots, w_N]'$. The total return of the portfolio is $R_w = w'R$. The covariance matrix of the assets, $\Sigma = \mathrm{Cov}(R)$, is then computed from the returns of the N assets. Because $\Sigma$ is symmetric, it can be decomposed into N orthogonal eigenvectors: $E \Lambda E' = \Sigma$, where $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_N)$ is the diagonal matrix of the eigenvalues of $\Sigma$ with $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_N$, and $E$ is the matrix whose columns are the eigenvectors $e_i$ corresponding to $\lambda_i$. $E$ is an orthogonal matrix, so $E' = E^{-1}$ and $E'E = I$. The covariance matrix can therefore be decomposed as $\Sigma = \lambda_1 e_1 e_1' + \lambda_2 e_2 e_2' + \cdots + \lambda_N e_N e_N'$. The eigenvectors form N orthogonal portfolios, which are called principal component factors. The returns of the principal component factors are defined as $R_{PC} = E'R$, and $\mathrm{Cov}(R_{PC}) = \mathrm{Cov}(E'R) = E'\,\mathrm{Cov}(R)\,E = E'\Sigma E = E'E\Lambda E'E = \Lambda$. For a single principal component factor, $\mathrm{Var}(R_{PC,i}) = \lambda_i$; for any two distinct principal component factors $R_{PC,i}$ and $R_{PC,j}$, $\mathrm{Cov}(R_{PC,i}, R_{PC,j}) = 0$. The N principal component factors are thus uncorrelated, and their variances equal $\lambda_1, \lambda_2, \ldots, \lambda_N$ respectively. The weights of the principal component factors are linear combinations of the original weights: $w_{PC} = E'w$. In this embodiment, the total return of the principal component factors is $R_w = w_{PC}'R_{PC} = (E'w)'(E'R) = w'EE'R = w'R$.
For the risk parity model of the principal component factors, an expression for each factor's risk contribution $RC_i$ follows from the definition of risk contribution [formula not reproduced in the source text]. The risk parity model of the principal component factors can then be stated as a condition on these risk contributions [formula not reproduced in the source text]. This condition can be converted into an optimization model that is solved for the optimal weights, which defines the principal component risk parity model [formula not reproduced in the source text]. When the objective function equals 0, $RC_i = RC_j$, and the numerical solution gives the weights of the principal component risk parity portfolio.
In this embodiment, the method further includes:
Step S27: perform the asset allocation of the major asset class allocation model on the investment portfolio assets at a preset period, so that the investment weights of the investment portfolio assets are updated regularly. In this embodiment, the preset period is one month, one quarter or one year.
The present application determines an optimal classifier from the fitted classifiers and uses it to select a preset number of funds from the obtained current fund data as the investment portfolio assets, thereby improving the accuracy of fund selection predictions.
Embodiment 2
Fig. 2 is a structure diagram of the fund selection device 10 in an embodiment of the present invention.
In some embodiments, the fund selection device 10 runs in electronic equipment. The fund selection device 10 may include multiple functional modules composed of program code segments. The program code of each program segment in the fund selection device 10 can be stored in a memory and executed by at least one processor to perform the asset allocation function.
In this embodiment, the fund selection device 10 of the electronic equipment can be divided into multiple functional modules according to the functions it performs. Referring to Fig. 2, the fund selection device 10 may include a building module 301, a data division module 302, a fitting module 303, a classifier determining module 304, an investment portfolio assets determining module 305, an asset allocation module 306 and an update module 307. A module, as referred to in the present invention, is a series of computer program segments that can be executed by at least one processor, completes a fixed function, and is stored in a memory. In some embodiments, the function of each module is described in detail in the following embodiments.
The building module 301 is configured to read banking operation data and construct factors of the banking operation data according to the banking operation data read.
In this embodiment, the building module 301 can read the banking operation data from the server 11 corresponding to each operation system 1. The banking operation data include, but are not limited to, public funds, net value data of public funds, fund company data, fund manager data, and the like. In this embodiment, factors of the banking operation data are constructed after the data are read. For example, in one embodiment, the factors of the banking operation data may include, but are not limited to, annualized return, maximum drawdown, Sharpe ratio, downside standard deviation, Sortino ratio, the size of the fund company, the size of the fund, and the fund manager's years of experience. The annualized return of a fund is the expected return obtained from purchasing the fund product, converted to an annual rate. The maximum drawdown of a fund is the largest decline in return measured from any historical point in the selected period to the subsequent lowest point of the product's net value; it describes the worst case that could occur after purchasing the fund product. The Sharpe ratio is a standardized index for evaluating fund performance. The downside standard deviation of a fund measures how much the fund's value may fluctuate: the larger the standard deviation, the more the fund's future net value may change, the lower its stability and the higher the investment risk. The Sortino ratio is a measure of a portfolio's relative performance: the higher the ratio, the higher the excess return the fund earns per unit of downside risk borne.
In this embodiment, the building module 301 is also configured to clean the obtained banking operation data and to construct the factors of the cleaned banking operation data according to the cleaned data. In this embodiment, cleaning the obtained banking operation data includes:
removing funds that have no net value from the banking operation data;
removing structured (graded) funds and money market funds from the banking operation data;
removing from the banking operation data funds that had been listed for less than one year as of the current month-end;
removing from the banking operation data funds whose net value went without an update for a continuous stretch exceeding 20% of the trading days in the year preceding the current month-end;
removing from the banking operation data funds whose net assets at the current month-end are below 10,000,000 yuan.
In this embodiment, removing funds that have been listed for less than one year as of the current month-end means removing from the banking operation data funds for which less than one year has elapsed between issuance and the current month-end. Removing funds whose net value went without updates means removing from the banking operation data funds whose net value was not updated for a consecutive run of more than 20% of the trading days in the year preceding the current month-end. For example, if, within the year preceding the current month-end, a fund's net value goes without an update for more than 20% of the trading days in a row, that fund is removed.
In this embodiment, the banking operation data that have been read are cleaned with respect to fund type, fund net value, and similar attributes, yielding banking operation data that can be used for analysis.
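The cleaning rules above can be sketched as a simple filter. The record layout (the keys `nav`, `fund_type`, `listed_days`, `longest_stale_run_ratio`, and `net_assets`) and the function name `clean_funds` are hypothetical; the sketch assumes that the longest run of trading days without a net value update has already been computed as a fraction of the trading days in the preceding year.

```python
def clean_funds(funds, min_listed_days=365, max_stale_ratio=0.20,
                min_net_assets=10_000_000):
    """Return the fund records that pass all five cleaning rules."""
    excluded_types = {"structured", "money_market"}
    kept = []
    for fund in funds:
        if fund["nav"] is None:                      # no net value
            continue
        if fund["fund_type"] in excluded_types:      # structured / money market
            continue
        if fund["listed_days"] < min_listed_days:    # listed for under one year
            continue
        if fund["longest_stale_run_ratio"] > max_stale_ratio:  # stale net value
            continue
        if fund["net_assets"] < min_net_assets:      # below 10 million yuan
            continue
        kept.append(fund)
    return kept


# Hypothetical records as of the current month-end.
sample = [
    {"id": "A", "nav": 1.2, "fund_type": "equity", "listed_days": 900,
     "longest_stale_run_ratio": 0.01, "net_assets": 5e8},
    {"id": "B", "nav": None, "fund_type": "equity", "listed_days": 900,
     "longest_stale_run_ratio": 0.01, "net_assets": 5e8},
    {"id": "C", "nav": 1.0, "fund_type": "money_market", "listed_days": 900,
     "longest_stale_run_ratio": 0.00, "net_assets": 5e8},
    {"id": "D", "nav": 1.1, "fund_type": "bond", "listed_days": 200,
     "longest_stale_run_ratio": 0.00, "net_assets": 5e8},
    {"id": "E", "nav": 1.1, "fund_type": "bond", "listed_days": 800,
     "longest_stale_run_ratio": 0.30, "net_assets": 5e8},
    {"id": "F", "nav": 1.1, "fund_type": "bond", "listed_days": 800,
     "longest_stale_run_ratio": 0.00, "net_assets": 5e6},
]
cleaned_ids = [fund["id"] for fund in clean_funds(sample)]
```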
The data division module 302 is used to divide the banking operation data and the factors of the banking operation data into a training data set and a test data set.
In this embodiment, after the banking operation data have been obtained from the operation system 1 at a fixed time each month (for example, the 1st of each month) and the factors of the banking operation data have been constructed, the data division module 302 labels the banking operation data to generate label data. In this embodiment, the label data can be generated as follows: determine whether the next-month return of the banking operation data exceeds that of the index of funds in the same category; if it does, label the corresponding banking operation data 1, and otherwise label it 0. The data division module 302 also randomly divides the factors of the banking operation data and the generated label data into a training data set and a test data set.
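The labeling rule can be sketched as follows. The function name `make_labels` and the three dictionaries (next-month return per fund, category per fund, and next-month return of each category's peer index) are hypothetical inputs chosen for illustration.

```python
def make_labels(next_month_returns, categories, peer_index_returns):
    """Label a fund 1 if its next-month return beats its peer index, else 0."""
    labels = {}
    for fund_id, fund_return in next_month_returns.items():
        benchmark = peer_index_returns[categories[fund_id]]
        labels[fund_id] = 1 if fund_return > benchmark else 0
    return labels


# Hypothetical next-month returns for three funds and two peer indexes.
labels = make_labels(
    next_month_returns={"A": 0.031, "B": 0.004, "C": -0.010},
    categories={"A": "equity", "B": "equity", "C": "bond"},
    peer_index_returns={"equity": 0.012, "bond": -0.015},
)
```

Note that a fund whose return merely equals the index return is labeled 0 under this reading, since the text requires the return to exceed the index.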
In this embodiment, the training data set and the test data set are different. The training data set consists of one part of the factors of the banking operation data and the label data, and the test data set consists of another part. In this embodiment, the data division module 302 randomly divides the factors of the banking operation data and the label data at a preset ratio to generate the training data set and the test data set. The preset ratio is the ratio of the amount of data in the training data set to the amount of data in the test data set, and it can be determined according to the specific application. In one embodiment, the preset ratio of the training data set to the test data set is 8:2; that is, the training data set consists of a randomly chosen 80% of the factor and label data, and the test data set consists of the remaining 20%.
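A random split at a preset ratio can be sketched with the standard library alone. The function name `split_dataset` and the fixed seed are illustrative assumptions; each sample stands for one fund's factor values paired with its label.

```python
import random


def split_dataset(samples, train_ratio=0.8, seed=0):
    """Shuffle the samples and cut them into training and test parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)   # seeded for reproducibility
    cut = int(round(len(shuffled) * train_ratio))
    return shuffled[:cut], shuffled[cut:]


# Ten hypothetical (factor_vector, label) samples split 8:2.
samples = [([float(i), float(i) / 10.0], i % 2) for i in range(10)]
train_set, test_set = split_dataset(samples, train_ratio=0.8)
```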
The fitting module 303 fits multiple classifiers using the training data set and the test data set.
In this embodiment, the multiple classifiers include a logistic regression classifier, a support vector machine classifier, a Gaussian naive Bayes classifier, and a random forest classifier. In this embodiment, at each fixed time point (for example, the 1st of each month), the training data set and the test data set can be used to fit the logistic regression classifier, the support vector machine classifier, the Gaussian naive Bayes classifier, and the random forest classifier respectively. In another embodiment, the training data set and the test data set at each fixed time point can also be fitted with different derived classifiers obtained by varying the parameters of each classifier.
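As one possible realization, the four classifiers could be fitted with scikit-learn, which the embodiment does not name; the library choice, the hyperparameters, the synthetic data, and the helper name `fit_classifiers` are all assumptions of this sketch. Each model is fitted on the training part, and the probability of label 1 on the test part is kept for the later ranking steps.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC


def fit_classifiers(X_train, y_train, seed=0):
    """Fit the four classifier families on the same training data."""
    models = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        # probability=True makes the SVM expose predict_proba.
        "svm": SVC(probability=True, random_state=seed),
        "gaussian_nb": GaussianNB(),
        "random_forest": RandomForestClassifier(n_estimators=100,
                                                random_state=seed),
    }
    for model in models.values():
        model.fit(X_train, y_train)
    return models


# Synthetic stand-in for factor values (X) and 0/1 labels (y).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
X_train, X_test, y_train, y_test = X[:160], X[160:], y[:160], y[160:]

models = fit_classifiers(X_train, y_train)
# Probability of label 1 for every test sample, per classifier.
scores = {name: m.predict_proba(X_test)[:, 1] for name, m in models.items()}
```

Derived classifiers, as mentioned in the other embodiment, would correspond to adding further entries to `models` with different parameter settings.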
The classifier determining module 304 determines an optimal classifier from the fitted classifiers.
In this embodiment, the classifier determining module 304 tests each fitted classifier using the test data set and the current funds, determines a final index A for each classifier, and takes the classifier with the largest final index A as the optimal classifier.
In a specific embodiment, the classifier determining module 304 tests each fitted classifier using the test data set and calculates a first stability coefficient C1 for each classifier.
In this embodiment, when each fitted classifier is tested using the test data set, the data in the test data set are sorted by the probability value output by the classifier, from largest to smallest, and divided into five tiers: the first, second, third, fourth, and fifth tiers. If the first tier produced by a classifier has the highest return of the five tiers, the test result for that classifier is recorded as 1; otherwise it is recorded as 0. Each classifier is then tested on the test data sets obtained over a time series of fixed time points in past years (for example, the 1st of each month), and each test result is recorded according to the same rule. This yields an array of 1s and 0s, and averaging the array gives the first stability coefficient C1. The first stability coefficient C1 represents how stable the classifier is when applied to the test data.
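The stability coefficient can be sketched as below. The function names, the use of the mean return of each tier as "the return of the tier", and the handling of ties and of a remainder that does not divide evenly into five tiers are assumptions; the text does not specify them.

```python
def top_tier_is_best(scores, returns, n_tiers=5):
    """Return 1 if the highest-scored tier has the highest mean return."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    size = len(order) // n_tiers          # requires len(scores) >= n_tiers
    tier_means = []
    for t in range(n_tiers):
        start = t * size
        # The last tier absorbs any remainder.
        idx = order[start:] if t == n_tiers - 1 else order[start:start + size]
        tier_means.append(sum(returns[i] for i in idx) / len(idx))
    return 1 if tier_means[0] == max(tier_means) else 0


def stability_coefficient(periods, n_tiers=5):
    """Average the 0/1 results over all fixed time points."""
    hits = [top_tier_is_best(s, r, n_tiers) for s, r in periods]
    return sum(hits) / len(hits)


# Two hypothetical monthly snapshots of ten funds each.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
good_month = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]   # ranking matches returns
bad_month = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]    # ranking is inverted
c1 = stability_coefficient([(scores, good_month), (scores, bad_month)])
```

The same routine applies to the second stability coefficient by passing snapshots of the current funds instead of the test data set.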
The classifier determining module 304 also tests each fitted classifier using the current funds and calculates a second stability coefficient C2 for each classifier.
In this embodiment, when the classifier determining module 304 tests each fitted classifier using the current funds, the current funds are sorted by the probability value output by the classifier, from largest to smallest, and divided into five tiers: the first, second, third, fourth, and fifth tiers. If the first tier of current funds produced by a classifier has the highest return of the five tiers, the test result for that classifier is recorded as 1; otherwise it is recorded as 0. Each classifier is then tested on the current funds over a time series of fixed time points in past years (for example, the 1st of each month), and each test result is recorded according to the same rule. This yields an array of 1s and 0s, and averaging the array gives the second stability coefficient C2. The second stability coefficient C2 represents how stable the classifier is when used for actual prediction.
The classifier determining module 304 also tests each fitted classifier using the current funds and calculates the information ratio IR (Information Ratio) of the top portfolio for each classifier.
In this embodiment, when the classifier determining module 304 tests each fitted classifier using the current funds, the current funds are sorted by the probability value output by the classifier, from largest to smallest, and the top N funds are taken. Each classifier is then tested on the current funds over a time series of fixed time points in past years; at each time point the current funds are sorted in the same way and the top N funds are taken to form the top portfolio, and the information ratio IR of the top portfolio corresponding to each classifier is calculated.
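A sketch of the top-portfolio information ratio follows. The text does not state how the top N funds are weighted or which benchmark the information ratio is measured against, so the equal weighting in `top_n_return` and the benchmark series passed to `information_ratio` are assumptions; the ratio itself is the usual mean active return divided by the standard deviation of the active return.

```python
import statistics


def top_n_return(scores, next_returns, n):
    """Equal-weighted next-period return of the n highest-scored funds."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sum(next_returns[i] for i in order[:n]) / n


def information_ratio(portfolio_returns, benchmark_returns):
    """Mean active return divided by its sample standard deviation."""
    active = [p - b for p, b in zip(portfolio_returns, benchmark_returns)]
    tracking_error = statistics.stdev(active)
    if tracking_error == 0:
        return float("nan")
    return statistics.mean(active) / tracking_error


# One hypothetical time point: three funds, top two selected.
one_period = top_n_return([0.2, 0.9, 0.5], [0.01, 0.03, 0.02], n=2)

# Three hypothetical periods of top-portfolio and benchmark returns.
ir = information_ratio([0.03, 0.01, 0.02], [0.01, 0.01, 0.01])
```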
The classifier determining module 304 calculates the final index A of each classifier according to the formula A = C1 * C2 * IR and takes the classifier with the largest final index A as the optimal classifier.
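Combining the three quantities is a one-line computation per classifier. In the sketch below the function name `select_best` and all numeric values are made up for illustration.

```python
def select_best(metrics):
    """Compute A = C1 * C2 * IR per classifier and return the arg-max."""
    final_index = {name: m["C1"] * m["C2"] * m["IR"]
                   for name, m in metrics.items()}
    best = max(final_index, key=final_index.get)
    return best, final_index


# Hypothetical stability coefficients and information ratios.
best, final_index = select_best({
    "logistic_regression": {"C1": 0.6, "C2": 0.5, "IR": 1.2},
    "svm": {"C1": 0.5, "C2": 0.5, "IR": 0.8},
    "gaussian_nb": {"C1": 0.4, "C2": 0.6, "IR": 0.9},
    "random_forest": {"C1": 0.7, "C2": 0.6, "IR": 1.0},
})
```

Because A is a product, a classifier that is unstable on either data source (C1 or C2 close to 0) or whose top portfolio has a negative IR scores poorly even if its other two terms are high.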
The investment portfolio asset determining module 305 obtains current fund data and uses the optimal classifier to select a preset number of funds from the acquired current fund data to determine the investment portfolio assets.
In this embodiment, the investment portfolio asset determining module 305 tests the current fund data using the optimal classifier, sorts the current fund data by the probability value output by the optimal classifier, from largest to smallest, and takes the top N funds as the investment portfolio assets.
The asset allocation module 306 performs asset allocation on the investment portfolio assets using a major asset class allocation model, thereby determining the investment weights of the investment portfolio assets.
In this embodiment, the major asset class allocation model may be a risk parity model. In this embodiment, performing asset allocation on the investment portfolio assets using the major asset class allocation model means applying the risk parity model to the investment portfolio assets to determine their investment weights.
In this embodiment, the risk parity model is based on principal component analysis: linear combinations of the investment portfolio assets are formed to obtain mutually uncorrelated portfolios, risk parity allocation is performed on these uncorrelated portfolios, and the investment weights of the investment portfolio assets are finally determined. For example, in one embodiment, assume that the investment portfolio assets contain $N$ assets with returns $R = [r_1, r_2, \ldots, r_N]'$. For portfolio weights $w = [w_1, w_2, \ldots, w_N]'$, the total return of the portfolio is $R_w = w'R$. The covariance matrix of the assets, $\Sigma = \mathrm{Cov}(R)$, is then computed from the returns of the $N$ assets. Because $\Sigma$ is symmetric, it can be decomposed using $N$ orthogonal eigenvectors:

$$\Sigma = E \Lambda E',$$

where $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_N)$ is the diagonal matrix of the eigenvalues of $\Sigma$, with $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_N$; $E$ is the eigenvector matrix whose columns are the eigenvectors $e_i$ corresponding to $\lambda_i$; and $E$ is orthogonal, so that $E' = E^{-1}$ and $E'E = I$. The covariance matrix can therefore be written as

$$\Sigma = \lambda_1 e_1 e_1' + \lambda_2 e_2 e_2' + \cdots + \lambda_N e_N e_N'.$$

The eigenvectors form $N$ orthogonal portfolios, which are called principal component factors. The returns of the principal component factors can be defined as $R_{PC} = E'R$, and

$$\mathrm{Cov}(R_{PC}) = \mathrm{Cov}(E'R) = E'\,\mathrm{Cov}(R)\,E = E'\Sigma E = E'E\Lambda E'E = \Lambda.$$

For a single principal component portfolio, $\mathrm{Var}(R_{PC,i}) = e_i'\Sigma e_i = \lambda_i$. For any two principal component factors $R_{PC,i}$ and $R_{PC,j}$ with $i \ne j$, $\mathrm{Cov}(R_{PC,i}, R_{PC,j}) = e_i'\Sigma e_j = 0$. The $N$ principal component factors are thus uncorrelated, and their variances are equal to $\lambda_1, \lambda_2, \ldots, \lambda_N$ respectively. The weights of the principal component factors can therefore be expressed as a linear combination of the original weights, namely $w_{PC} = E'w$. In this embodiment, the total return of the principal component factors is

$$R_w = w_{PC}'R_{PC} = (E'w)'(E'R) = w'EE'R = w'R.$$

For the risk parity model of the principal component factors, the definition of the risk contribution $RC_i$ gives

$$RC_i = w_{PC,i}\,\frac{\partial \sigma_w}{\partial w_{PC,i}} = \frac{w_{PC,i}^2\,\lambda_i}{\sigma_w}, \qquad \sigma_w = \sqrt{w'\Sigma w} = \sqrt{\sum_{k=1}^{N} w_{PC,k}^2\,\lambda_k}.$$

The risk parity model of the principal component factors can then be expressed as

$$RC_i = RC_j \quad \text{for all } i, j.$$

The above equation can be converted into an optimization model to solve for the optimal weights, which defines the principal component risk parity model, namely

$$\min_{w}\; \sum_{i=1}^{N}\sum_{j=1}^{N} \left(RC_i - RC_j\right)^2.$$

When the objective function equals 0, $RC_i = RC_j$ for all $i, j$, and the numerical solution is the weight vector of the principal component risk parity portfolio.
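The derivation above can be checked numerically. Taking the risk contribution of the i-th principal component factor to be proportional to its squared weight times λi, equal contributions require each factor weight to be proportional to 1/√λi in absolute value, so a solution with objective value 0 can be written down directly instead of running a numerical optimizer. The sketch below uses NumPy, which the embodiment does not mention; the function names, the eigenvector sign convention, the normalization of the asset weights to sum to 1, and the absence of a no-short-selling constraint are all assumptions.

```python
import numpy as np


def pc_risk_parity_weights(cov):
    """Asset weights whose principal-component risk contributions are equal.

    Assumes cov is a symmetric positive-definite covariance matrix.
    """
    cov = np.asarray(cov, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]           # reorder so lambda_1 >= ... >= lambda_N
    lam, E = eigvals[order], eigvecs[:, order]
    # Eigenvector signs are arbitrary; fix them so each column sums to >= 0.
    E = E * np.where(E.sum(axis=0) < 0, -1.0, 1.0)
    w_pc = 1.0 / np.sqrt(lam)                   # equalizes w_pc_i^2 * lambda_i
    w = E @ w_pc                                # back to asset weights (w = E w_pc)
    return w / w.sum(), lam, E


def pc_risk_contributions(w, lam, E):
    """Risk contribution of each principal component factor."""
    w_pc = E.T @ w
    sigma = np.sqrt(np.sum(w_pc ** 2 * lam))
    return w_pc ** 2 * lam / sigma


# Hypothetical covariance matrix of two funds.
cov = [[0.04, 0.006], [0.006, 0.01]]
weights, lam, E = pc_risk_parity_weights(cov)
rc = pc_risk_contributions(weights, lam, E)
```

Different sign choices for the factor weights give other solutions with equal contributions, some with negative asset weights; a practical implementation would add whatever weight constraints the application requires and solve the optimization numerically.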
In this embodiment, the device further includes the update module 307.
The update module 307 performs asset allocation on the investment portfolio assets using the major asset class allocation model at a preset period, so that the investment weights of the investment portfolio assets are updated regularly. In this embodiment, the preset period is one month, one quarter, or one year.
Embodiment three
Fig. 3 is a schematic diagram of a preferred embodiment of the electronic equipment 4 of the present invention.
The electronic equipment 4 includes a memory 41, a processor 42, and a computer program 43 that is stored in the memory 41 and can run on the processor 42. When executing the computer program 43, the processor 42 implements the steps of the fund selection method embodiments described above, such as steps S21 to S27 shown in Fig. 1. Alternatively, when executing the computer program 43, the processor 42 implements the functions of the modules/units in the fund selection device embodiment described above, such as modules 301 to 307 in Fig. 2.
Illustratively, the computer program 43 can be divided into one or more modules/units, which are stored in the memory 41 and executed by the processor 42 to carry out the present invention. The one or more modules/units can be a series of computer program instruction segments capable of performing specific functions, the instruction segments describing the execution of the computer program 43 in the electronic equipment 4. For example, the computer program 43 can be divided into the building module 301, the data division module 302, the fitting module 303, the classifier determining module 304, the investment portfolio asset determining module 305, the asset allocation module 306, and the update module 307 in Fig. 2; for the specific functions of each module, see Embodiment two.
The electronic equipment 4 can be a computing device such as a desktop computer, a notebook computer, a palmtop computer, or a cloud server. Those skilled in the art will understand that the schematic diagram is only an example of the electronic equipment 4 and does not limit it; the electronic equipment 4 may include more or fewer components than illustrated, may combine certain components, or may use different components. For example, the electronic equipment 4 can also include input/output devices, network access devices, buses, and the like.
The processor 42 can be a central processing unit (Central Processing Unit, CPU) or another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor can be a microprocessor, or the processor 42 can be any conventional processor. The processor 42 is the control center of the electronic equipment 4 and connects the various parts of the entire electronic equipment 4 through various interfaces and lines.
The memory 41 can be used to store the computer program 43 and/or the modules/units. The processor 42 implements the various functions of the electronic equipment 4 by running or executing the computer program and/or modules/units stored in the memory 41 and by calling the data stored in the memory 41. The memory 41 can mainly include a program storage area and a data storage area: the program storage area can store an operating system and the application programs required by at least one function (such as a sound playback function or an image playback function), and the data storage area can store data created through use of the electronic equipment 4 (such as audio data or a phone book). In addition, the memory 41 may include high-speed random access memory and may also include non-volatile memory, such as a hard disk, internal memory, a plug-in hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, a flash card (Flash Card), at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device.
If the integrated modules/units of the electronic equipment 4 are implemented as software functional modules and sold or used as independent products, they can be stored in a computer-readable storage medium. On this understanding, all or part of the processes in the method embodiments described above can also be completed by a computer program instructing the relevant hardware. The computer program can be stored in a computer-readable storage medium and, when executed by a processor, can implement the steps of each of the method embodiments described above. The computer program includes computer program code, which can be in source code form, object code form, an executable file, certain intermediate forms, and so on. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), an electrical carrier signal, a telecommunication signal, a software distribution medium, and so on. It should be noted that the content included in the computer-readable medium can be increased or decreased as appropriate according to the requirements of legislation and patent practice in a jurisdiction; for example, in certain jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunication signals.
In the several embodiments provided by the present invention, it should be understood that the disclosed electronic equipment and method can be implemented in other ways. For example, the electronic equipment embodiment described above is only schematic; the division of the modules is only a division by logical function, and other divisions are possible in actual implementation.
In addition, the functional modules in each embodiment of the present invention can be integrated in the same processing module, each module can exist physically on its own, or two or more modules can be integrated in the same module. The integrated module can be implemented in the form of hardware or in the form of hardware plus software functional modules.
It is obvious to those skilled in the art that the present invention is not limited to the details of the exemplary embodiments described above and can be realized in other specific forms without departing from its spirit or essential characteristics. Therefore, the embodiments should in every respect be regarded as illustrative and not restrictive; the scope of the present invention is defined by the appended claims rather than by the above description, and all changes that fall within the meaning and range of equivalency of the claims are therefore intended to be embraced in the present invention. No reference sign in the claims should be construed as limiting the claim concerned. Furthermore, the word "comprising" clearly does not exclude other modules or steps, and the singular does not exclude the plural. Multiple modules or items of electronic equipment stated in an electronic equipment claim can also be implemented by one and the same module or item of electronic equipment through software or hardware. Words such as "first" and "second" are used to denote names and do not denote any particular order.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the present invention and not to limit it. Although the present invention has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that the technical solution of the present invention can be modified or equivalently replaced without departing from the spirit and scope of the technical solution of the present invention.
Claims (10)
1. A fund selection method, characterized in that the method comprises:
Reading banking operation data and constructing factors of the banking operation data from the banking operation data read, wherein the factors of the banking operation data include one or more of annualized return, maximum drawdown, Sharpe ratio, downside standard deviation, Sortino ratio, the size of the fund company, the size of the fund, and the years of experience of the fund manager;
Dividing the banking operation data and the factors of the banking operation data into a training data set and a test data set;
Fitting multiple classifiers using the training data set and the test data set, wherein the fitted classifiers include a logistic regression classifier, a support vector machine classifier, a Gaussian naive Bayes classifier, and a random forest classifier;
Determining an optimal classifier from the fitted classifiers;
Obtaining current fund data and using the optimal classifier to select a preset number of funds from the acquired current fund data to determine investment portfolio assets; and
Performing asset allocation on the investment portfolio assets using a major asset class allocation model to determine the investment weights of the funds in the investment portfolio assets.
2. The fund selection method as described in claim 1, characterized in that reading the banking operation data and constructing the factors of the banking operation data from the banking operation data read comprises: cleaning the acquired banking operation data, and constructing the factors of the banking operation data from the cleaned banking operation data.
3. The fund selection method as described in claim 1, characterized in that cleaning the acquired banking operation data comprises:
Removing funds without net values from the banking operation data;
Removing structured funds and money market funds from the banking operation data;
Removing from the banking operation data funds that, as of the current month-end, have been listed for less than one year;
Removing from the banking operation data funds whose net value was not updated for a consecutive run of more than 20% of the trading days in the year preceding the current month-end; and
Removing from the banking operation data funds whose net assets at the current month-end are below 10,000,000 yuan.
4. The fund selection method as described in claim 1, characterized in that dividing the banking operation data and the factors of the banking operation data into a training data set and a test data set comprises:
Determining whether the next-month return of the banking operation data exceeds that of the index of funds in the same category, labeling the banking operation data 1 if it does and 0 otherwise, and thereby labeling the banking operation data to generate label data; and
Randomly dividing the factors of the banking operation data and the label data at a preset ratio to generate the training data set and the test data set.
5. The fund selection method as described in claim 1, characterized in that determining an optimal classifier from the fitted classifiers comprises:
Testing each fitted classifier using the test data set and calculating a first stability coefficient C1 for each classifier, wherein C1 represents how stable the classifier is when applied to the test data;
Testing each fitted classifier using the current funds and calculating a second stability coefficient C2 for each classifier, wherein C2 represents how stable the classifier is when applied to the current funds;
Testing each fitted classifier using the current funds and calculating the information ratio IR of the top portfolio for each classifier;
Calculating the final index A of each classifier according to the formula A = C1 * C2 * IR; and
Taking the classifier with the largest final index A as the optimal classifier.
6. The fund selection method as described in claim 1, characterized in that the method further comprises:
Performing asset allocation on the investment portfolio assets using the major asset class allocation model at a preset period, so that the investment weights of the investment portfolio assets are updated regularly.
7. The fund selection method as described in claim 1, characterized in that the major asset class allocation model is a risk parity model.
8. A fund selection device, characterized in that the device comprises:
A building module, for reading banking operation data and constructing factors of the banking operation data from the banking operation data read, wherein the factors of the banking operation data include one or more of annualized return, maximum drawdown, Sharpe ratio, downside standard deviation, Sortino ratio, the size of the fund company, the size of the fund, and the years of experience of the fund manager;
A data division module, for dividing the banking operation data and the factors of the banking operation data into a training data set and a test data set;
A fitting module, for fitting multiple classifiers using the training data set and the test data set, wherein the classifiers include a logistic regression classifier, a support vector machine classifier, a Gaussian naive Bayes classifier, and a random forest classifier;
A classifier determining module, for determining an optimal classifier from the fitted classifiers;
An investment portfolio asset determining module, for obtaining current fund data and using the optimal classifier to select a preset number of funds from the acquired current fund data to determine investment portfolio assets; and
An asset allocation module, for performing asset allocation on the investment portfolio assets using a major asset class allocation model and determining the investment weights of the funds in the investment portfolio assets.
9. Electronic equipment, characterized in that the electronic equipment comprises a processor, and the processor implements the fund selection method as described in any one of claims 1-7 when executing a computer program stored in a memory.
10. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program implements the fund selection method as described in any one of claims 1-7 when executed by a processor.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811536792.9A CN109767333A (en) | 2018-12-15 | 2018-12-15 | Select based method, device, electronic equipment and computer readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109767333A true CN109767333A (en) | 2019-05-17 |
Family
ID=66451901
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811536792.9A Pending CN109767333A (en) | 2018-12-15 | 2018-12-15 | Select based method, device, electronic equipment and computer readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109767333A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||