CN108304975A - Data prediction system and method - Google Patents

Publication number
CN108304975A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810168243.4A
Other languages
Chinese (zh)
Inventor
李燕伟
段立新
王甲樑
夏耘海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guoxin Youe Data Co Ltd
Original Assignee
Guoxin Youe Data Co Ltd
Application filed by Guoxin Youe Data Co Ltd
Priority to CN201810168243.4A
Publication of CN108304975A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/06 Asset management; Financial planning or analysis

Abstract

The present application provides a data prediction system and method. The system includes: a high-frequency data acquisition module for obtaining the high-frequency data corresponding to the low-frequency data of a period to be predicted; a prediction model extraction module for extracting, from a pre-trained high-frequency data prediction model set and a pre-trained data prediction model set, the high-frequency data prediction model subset and the data prediction model that match the period to be predicted; and a low-frequency data prediction module for obtaining a prediction result for the low-frequency data by using the high-frequency data corresponding to the low-frequency data, the extracted high-frequency data prediction model subset, and the extracted data prediction model. Because the low-frequency data is predicted with the pre-trained high-frequency data prediction model set and data prediction model set, the application avoids the information loss, and the resulting poor prediction accuracy and reliability, caused by predicting from directly averaged data. The likelihood of information loss is reduced, and the accuracy and reliability of the prediction are higher.

Description

Data prediction system and method
Technical field
The present application relates to the technical field of data prediction, and in particular to a data prediction system and method.
Background
Analyzing and forecasting financial markets (such as the stock market) is a core problem in financial investment, and analyzing and forecasting the economy is a core problem in macroeconomic regulation. Data prediction in the financial and economic fields involves data of various frequencies. Such data may include not only low-frequency data, such as quarterly data, but also high-frequency data, such as weekly or monthly data, and even ultra-high-frequency data, such as real-time data. The finer the time scale on which data is collected, the higher its frequency.
To make use of data of these various frequencies, the data prediction method provided in the related art first averages the high-frequency data to the frequency of the low-frequency data, and then predicts the low-frequency data from the averaged high-frequency data.
The related-art method thus converts data by direct averaging. For high-frequency data (such as monthly data) whose observed values vary from month to month, the conversion yields a single fixed value, so information carried by the high-frequency data (such as trend information) is lost, and the accuracy and reliability of the prediction suffer.
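The information loss described above can be illustrated with a minimal sketch. The monthly values below are invented for illustration only and do not come from the application; the trend measure (last month minus first month) is likewise just one assumed way to show what averaging discards.

```python
from statistics import mean

# Hypothetical monthly values for two quarters (illustrative numbers only).
rising = [1.0, 2.0, 3.0]   # upward trend within the quarter
falling = [3.0, 2.0, 1.0]  # downward trend within the quarter

# Direct averaging maps both quarters to the same low-frequency value.
avg_rising = mean(rising)
avg_falling = mean(falling)

# A simple trend measure (last month minus first month) still tells them apart.
trend_rising = rising[-1] - rising[0]
trend_falling = falling[-1] - falling[0]

print(avg_rising, avg_falling)      # the two averages coincide
print(trend_rising, trend_falling)  # the two trends have opposite signs
```

Any predictor that sees only the averaged value cannot distinguish the two quarters, which is the loss the application sets out to avoid.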
Summary of the invention
In view of this, the purpose of the present application is to provide a data prediction system and method that improve the accuracy and reliability of data prediction.
In a first aspect, the present application provides a data prediction system, the system comprising:
a high-frequency data acquisition module, configured to obtain the high-frequency data corresponding to the low-frequency data of a period to be predicted;
a prediction model extraction module, configured to extract, from a pre-trained high-frequency data prediction model set and a pre-trained data prediction model set, the high-frequency data prediction model subset and the data prediction model that match the period to be predicted; and
a low-frequency data prediction module, configured to obtain a prediction result for the low-frequency data by using the high-frequency data corresponding to the low-frequency data, the extracted high-frequency data prediction model subset, and the extracted data prediction model.
With reference to the first aspect, the present application provides a first possible implementation of the first aspect, wherein the system further comprises a high-frequency data prediction model set building module, which comprises:
a data acquisition unit, configured to obtain the low-frequency data of a target time period and multiple classes of high-frequency data of the target time period, each class of high-frequency data corresponding to at least one high-frequency data item;
a single-class high-frequency data prediction model training unit, configured to, for any class of high-frequency data, train with the values of the at least one high-frequency data item of that class as the independent variables and the value of the low-frequency data as the dependent variable, to obtain a single-class high-frequency data prediction model for the target time period; and
a high-frequency data prediction model set construction unit, configured to build the high-frequency data prediction model subset from multiple single-class high-frequency data prediction models, and to build the high-frequency data prediction model set from the high-frequency data prediction model subsets of multiple target time periods.
With reference to the first possible implementation of the first aspect, the present application provides a second possible implementation of the first aspect, wherein the system further comprises a data prediction model set building module, which comprises:
a single-class high-frequency data prediction result acquisition unit, configured to input the values of the at least one high-frequency data item of any class of high-frequency data into the corresponding single-class high-frequency data prediction model in the high-frequency data prediction model subset, to obtain a single-class high-frequency data prediction result; and
a data prediction model set construction unit, configured to train with the multiple single-class high-frequency data prediction results obtained as the independent variables and the value of the low-frequency data as the dependent variable, to obtain a data prediction model for the target time period, and to build the data prediction model set from the data prediction models of multiple target time periods.
With reference to the first aspect and the first and second possible implementations of the first aspect, the present application provides a third possible implementation of the first aspect, wherein the low-frequency data prediction module comprises:
a high-frequency data classification unit, configured to classify the high-frequency data corresponding to the low-frequency data, each class of high-frequency data corresponding to at least one high-frequency data item;
a prediction model matching unit, configured to, for any class of high-frequency data, query the extracted high-frequency data prediction model subset to obtain the single-class high-frequency data prediction model that matches that class;
a prediction result acquisition unit, configured to input the values of the at least one high-frequency data item of any class of high-frequency data into the matched single-class high-frequency data prediction model, to obtain a single-class high-frequency data prediction result; and
a low-frequency data prediction unit, configured to input the multiple single-class high-frequency data prediction results obtained into the extracted data prediction model, to obtain the prediction result for the low-frequency data.
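The prediction flow of the third implementation (classify, match, predict per class, combine) might be sketched as follows. This is only an illustration under stated assumptions: the dictionary layout, the names `model_set`, `combiner_set` and `predict_low_frequency`, the linear model form, and all coefficients are hypothetical and are not specified by the application.

```python
# Hypothetical pre-trained sets, keyed by the period they were trained for.
# Each single-class model is assumed to be y = b0 + b1 * sum_k(w_k * x_k).
model_set = {
    "Q1": {
        "new_credit": {"b0": 1.0, "b1": 2.0, "weights": [0.5, 0.25, 0.25]},
        "export_growth": {"b0": 0.0, "b1": 1.0, "weights": [0.25, 0.25, 0.5]},
    }
}
# The second-stage model is assumed linear: [intercept, coefficient per class],
# with classes taken in sorted name order.
combiner_set = {"Q1": [0.5, 0.5, 0.5]}


def predict_low_frequency(period, high_freq_by_class):
    """Two-stage prediction for one period to be predicted."""
    model_subset = model_set[period]   # extraction: matching model subset
    combiner = combiner_set[period]    # extraction: matching data prediction model
    stage1 = []
    for class_name in sorted(high_freq_by_class):  # one pass per class
        model = model_subset[class_name]           # matching single-class model
        values = high_freq_by_class[class_name]
        weighted = sum(w * x for w, x in zip(model["weights"], values))
        stage1.append(model["b0"] + model["b1"] * weighted)
    # Feed the single-class results into the data prediction model.
    return combiner[0] + sum(c * p for c, p in zip(combiner[1:], stage1))


# Illustrative monthly inputs for the quarter to be predicted.
inputs = {"new_credit": [2.0, 4.0, 4.0], "export_growth": [4.0, 4.0, 8.0]}
result = predict_low_frequency("Q1", inputs)
print(result)
```

Keying both sets by period keeps the extraction step a plain lookup, and sorting the class names fixes the order in which single-class results reach the second-stage model.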
With reference to the first possible implementation of the first aspect, the present application provides a fourth possible implementation of the first aspect, wherein the high-frequency data prediction model set building module further comprises:
a prediction precision judgment module, configured to judge, for any single-class high-frequency data prediction model, whether the prediction precision of that model is greater than a preset threshold;
the high-frequency data prediction model set construction unit being specifically configured to build the high-frequency data prediction model subset from multiple single-class high-frequency data prediction models when the prediction precision of the single-class high-frequency data prediction models is judged to be greater than the preset threshold.
With reference to the first possible implementation of the first aspect, the present application provides a fifth possible implementation of the first aspect, wherein the single-class high-frequency data prediction model training unit is specifically configured to: build an initial single-class high-frequency data prediction model; and input initial weight coefficients and the value of a high-frequency data item of any class of high-frequency data into the initial single-class high-frequency data prediction model;
judge whether the output error of the initial single-class high-frequency data prediction model is less than a preset error and, if not, adjust the initial weight coefficients based on the output error and obtain the output error again from the adjusted model and the value of that high-frequency data item, the next high-frequency data item of that class being input once the output error is less than the preset error; wherein the output error is determined from the current prediction result of the initial single-class high-frequency data prediction model and the value of the low-frequency data of the target time period; and
input the values of the high-frequency data items of that class in turn into the initial single-class high-frequency data prediction model with adjusted weight coefficients, until the output error for the last high-frequency data item is judged to be less than the preset error, whereupon the single-class high-frequency data prediction model is obtained.
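The error-driven weight adjustment of the fifth implementation can be approximated by the following sketch. The application does not state the update rule, so a least-mean-squares (LMS) style update is assumed here; the function name, learning rate, tolerance and sample values are all hypothetical. For brevity the sketch sweeps over all samples repeatedly instead of holding on one sample until its error falls below the preset error.

```python
def train_weights(samples, targets, lr=0.1, tol=1e-6, max_sweeps=10000):
    """Adjust weight coefficients until every output error is below tol.

    samples: list of high-frequency value blocks (e.g. three months each)
    targets: matching low-frequency values
    The LMS update w += lr * err * x is an assumption, not the filed method.
    """
    n = len(samples[0])
    weights = [1.0 / n] * n  # initial weight coefficients
    for _ in range(max_sweeps):
        worst = 0.0
        for x, y in zip(samples, targets):
            pred = sum(w * xi for w, xi in zip(weights, x))
            err = y - pred                      # output error for this sample
            worst = max(worst, abs(err))
            weights = [w + lr * err * xi for w, xi in zip(weights, x)]
        if worst < tol:                         # all errors below preset error
            break
    return weights


# Illustrative data generated from the weights [0.5, 0.3, 0.2].
samples = [[1.0, 0.0, 2.0], [2.0, 1.0, 0.0], [0.0, 2.0, 1.0]]
targets = [0.9, 1.3, 0.8]
learned = train_weights(samples, targets)
print([round(w, 4) for w in learned])
```

With consistent, linearly independent samples such as these, the loop stops once every sample's error is below the tolerance, which mirrors the stopping condition in the text.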
With reference to the first possible implementation of the first aspect, the present application provides a sixth possible implementation of the first aspect, wherein the single-class high-frequency data prediction model is built according to the following formula:
$$y_t = \beta_0 + \beta_1 W(L^{1/m}; \theta)\, x_t^{(m)} + \varepsilon_t$$
where $y_t$ denotes the low-frequency data of the target time period; $x_t^{(m)}$ denotes the high-frequency data item corresponding to any class of high-frequency data of the target time period; $m$ denotes the frequency multiple between the high-frequency data and the low-frequency data; $\beta_0$ and $\beta_1$ denote constants; $\varepsilon_t$ denotes the random error term; $L^{1/m}$ denotes the high-frequency lag operator, with $W(L^{1/m}; \theta) = \sum_{k=0}^{K} W(k; \theta) L^{k/m}$; $K$ denotes the lag order of the high-frequency data; $L^{k/m} x_t^{(m)} = x_{t-k/m}^{(m)}$ denotes the high-frequency data lagged by $k$ orders relative to the $t$-th low-frequency data; and $W(k; \theta)$ denotes the weight coefficients.
With reference to the sixth possible implementation of the first aspect, the present application provides a seventh possible implementation of the first aspect, wherein the weight coefficients are obtained by any one of the following estimation methods: the Almon estimation method, the exponential Almon estimation method, the beta distribution estimation method, and the step function estimation method.
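Two of the named weighting schemes can be sketched as below, using the forms commonly seen in the mixed-frequency regression literature: normalized exponential Almon weights and normalized beta weights. The application does not give these formulas, so the parameterization, the function names, and the choice to evaluate the beta kernel at interior points (to avoid the endpoints 0 and 1) are assumptions.

```python
import math


def exp_almon_weights(theta1, theta2, n_lags):
    """Exponential Almon weights: exp(theta1*k + theta2*k^2), normalized to sum to 1."""
    raw = [math.exp(theta1 * k + theta2 * k * k) for k in range(n_lags)]
    total = sum(raw)
    return [r / total for r in raw]


def beta_weights(a, b, n_lags):
    """Beta-kernel weights x^(a-1) * (1-x)^(b-1), normalized to sum to 1.

    The kernel is evaluated at interior points (k+1)/(n_lags+1); this grid
    is an illustrative choice, not something the application specifies.
    """
    xs = [(k + 1) / (n_lags + 1) for k in range(n_lags)]
    raw = [x ** (a - 1) * (1 - x) ** (b - 1) for x in xs]
    total = sum(raw)
    return [r / total for r in raw]


flat = exp_almon_weights(0.0, 0.0, 3)       # both parameters zero: equal weights
decaying = exp_almon_weights(-0.5, 0.0, 3)  # negative theta1: weights decline with lag
beta_flat = beta_weights(1.0, 1.0, 3)       # a = b = 1: equal weights
print(flat, decaying, beta_flat)
```

Both schemes describe all lag weights with two parameters, so the number of estimated coefficients stays fixed however many high-frequency lags are included.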
In a second aspect, the present application further provides a data prediction method, the method comprising:
obtaining the high-frequency data corresponding to the low-frequency data of a period to be predicted;
extracting, from a pre-trained high-frequency data prediction model set and a pre-trained data prediction model set, the high-frequency data prediction model subset and the data prediction model that match the period to be predicted; and
obtaining a prediction result for the low-frequency data by using the high-frequency data corresponding to the low-frequency data, the extracted high-frequency data prediction model subset, and the extracted data prediction model.
With reference to the second aspect, the present application provides a first possible implementation of the second aspect, wherein building the high-frequency data prediction model set comprises:
obtaining the low-frequency data of a target time period and multiple classes of high-frequency data of the target time period, each class of high-frequency data corresponding to at least one high-frequency data item;
for any class of high-frequency data, training with the values of the at least one high-frequency data item of that class as the independent variables and the value of the low-frequency data as the dependent variable, to obtain a single-class high-frequency data prediction model for the target time period; and
building the high-frequency data prediction model subset from multiple single-class high-frequency data prediction models, and building the high-frequency data prediction model set from the high-frequency data prediction model subsets of multiple target time periods.
In the data prediction system and method provided by the present application, the high-frequency data acquisition module obtains the high-frequency data corresponding to the low-frequency data of the period to be predicted; the prediction model extraction module extracts, from the pre-trained high-frequency data prediction model set and data prediction model set, the high-frequency data prediction model subset and the data prediction model that match the period to be predicted; and the low-frequency data prediction module obtains the prediction result for the low-frequency data by using the high-frequency data corresponding to the low-frequency data, the extracted high-frequency data prediction model subset, and the extracted data prediction model. That is, the low-frequency data is predicted by means of the pre-trained high-frequency data prediction model set and data prediction model set, which avoids the information loss, and the resulting poor prediction accuracy and reliability, caused by predicting from directly averaged data. The likelihood of information loss is greatly reduced, and the accuracy and reliability of the prediction are higher.
To make the above objects, features, and advantages of the present application clearer and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.
Description of the drawings
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings needed in the embodiments are briefly introduced below. It should be understood that the following drawings show only some embodiments of the present application and should therefore not be regarded as limiting its scope; those of ordinary skill in the art can obtain other related drawings from these drawings without creative effort.
Fig. 1 shows a structural schematic diagram of a data prediction system provided by an embodiment of the present application;
Fig. 2 shows a structural schematic diagram of the high-frequency data prediction model set building module in a data prediction system provided by an embodiment of the present application;
Fig. 3 shows a structural schematic diagram of the data prediction model set building module in a data prediction system provided by an embodiment of the present application;
Fig. 4 shows a structural schematic diagram of the low-frequency data prediction module in a data prediction system provided by an embodiment of the present application;
Fig. 5 shows a flow chart of a data prediction method provided by an embodiment of the present application;
Fig. 6 shows a flow chart of another data prediction method provided by an embodiment of the present application;
Fig. 7 shows a flow chart of another data prediction method provided by an embodiment of the present application;
Fig. 8 shows a flow chart of another data prediction method provided by an embodiment of the present application;
Fig. 9 shows a flow chart of another data prediction method provided by an embodiment of the present application;
Fig. 10 shows a structural schematic diagram of a computer device provided by an embodiment of the present application.
Description of main reference numerals:
11, high-frequency data acquisition module; 22, prediction model extraction module; 33, low-frequency data prediction module; 44, high-frequency data prediction model set building module; 55, data prediction model set building module; 331, high-frequency data classification unit; 332, prediction model matching unit; 333, prediction result acquisition unit; 334, low-frequency data prediction unit; 441, data acquisition unit; 442, single-class high-frequency data prediction model training unit; 443, high-frequency data prediction model set construction unit; 444, prediction precision judgment module; 551, single-class high-frequency data prediction result acquisition unit; 552, data prediction model set construction unit.
Detailed description of the embodiments
To make the purposes, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are evidently only some, not all, of the embodiments of the present application. The components of the embodiments generally described and illustrated in the drawings herein can be arranged and designed in a variety of different configurations. The following detailed description of the embodiments provided in the drawings is therefore not intended to limit the claimed scope of the present application, but merely represents selected embodiments. All other embodiments obtained by those skilled in the art from the embodiments of the present application without creative effort fall within the protection scope of the present application.
The related-art data prediction method converts data by direct averaging. For high-frequency data (such as monthly data) whose observed values vary from month to month, the conversion yields a single fixed value, so information carried by the high-frequency data (such as trend information) is lost, and the accuracy and reliability of the prediction suffer. In view of this, an embodiment of the present application provides a data prediction system that improves the accuracy and reliability of data prediction, as described in the following embodiments.
Referring to Fig. 1, a structural schematic diagram of the data prediction system provided by an embodiment of the present application, the data prediction system specifically comprises:
a high-frequency data acquisition module 11, configured to obtain the high-frequency data corresponding to the low-frequency data of a period to be predicted;
a prediction model extraction module 22, configured to extract, from a pre-trained high-frequency data prediction model set and a pre-trained data prediction model set, the high-frequency data prediction model subset and the data prediction model that match the period to be predicted; and
a low-frequency data prediction module 33, configured to obtain a prediction result for the low-frequency data by using the high-frequency data corresponding to the low-frequency data, the extracted high-frequency data prediction model subset, and the extracted data prediction model.
The data prediction system provided by the embodiment of the present application can be applied to various scenarios, especially in the financial and economic fields. The low-frequency data that can be obtained differs between application fields, and so does the corresponding high-frequency data. In the economic field, for example, the high-frequency data in the embodiment of the present application corresponds to economic fluctuation influence indicator data, and the low-frequency data corresponds to economic fluctuation indicator data.
The economic fluctuation indicator data may be the growth rate of gross domestic product (GDP). The corresponding economic fluctuation influence indicator data may be economic stability indicator data, including the national real estate climate index, the gap between the M1 and M2 growth rates, the real effective exchange rate index, and the like; economic pressure indicator data, including new credit data and the like; production and investment indicator data, including power generation, cement output, completed fixed-asset investment, and the like; import and export indicator data, including the year-on-year growth rate of exports, the year-on-year growth rate of imports, the export order index, the Purchasing Managers' Index (PMI), and the like; or financing gap indicator data, including new loans within total social financing, new corporate bonds within total social financing, and the like.
The GDP growth rate is quarterly data, that is, one GDP growth rate can be collected per quarter. New credit is monthly data, that is, one new credit value can be collected per month. For the same period to be predicted, each quarterly data item therefore corresponds to three monthly data items. Monthly data has a finer time scale and hence a higher data frequency, so the monthly data (new credit) corresponds to the high-frequency data in the embodiment of the present application, and the quarterly data (GDP growth rate) corresponds to the low-frequency data.
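The three-to-one correspondence between monthly and quarterly observations can be expressed as a small alignment helper. This is a sketch only; the function name `group_by_quarter` and the sample values are hypothetical, and the series is assumed to be in chronological order and to start at the first month of a quarter.

```python
def group_by_quarter(monthly, m=3):
    """Split a chronologically ordered monthly series into blocks of m months.

    m is the frequency multiple between the high-frequency and the
    low-frequency data (3 for monthly versus quarterly).
    """
    if len(monthly) % m != 0:
        raise ValueError("series length must be a multiple of m")
    return [monthly[i:i + m] for i in range(0, len(monthly), m)]


# Hypothetical new-credit values for six months, i.e. two quarters.
new_credit = [1.1, 0.9, 1.3, 1.0, 1.2, 1.4]
blocks = group_by_quarter(new_credit)
print(blocks)
```

Each block then pairs with exactly one quarterly observation, so the three monthly values can serve as the independent variables for that quarter.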
It is worth noting that the terms low-frequency data and high-frequency data in the embodiment of the present application merely distinguish two kinds of data of different frequencies; what they refer to differs between application scenarios. For example, the high-frequency data may be real-time data and the low-frequency data daily data; as another example, the high-frequency data may be daily data and the low-frequency data monthly data. To make the data prediction system provided by the embodiment of the present application easier to understand, the following description uses new credit as the high-frequency data and the GDP growth rate as the low-frequency data.
In the embodiment of the present application, the obtained high-frequency data corresponding to the low-frequency data is input in turn into the high-frequency data prediction model subset and then the data prediction model to obtain the prediction result for the low-frequency data. The high-frequency data prediction model subset is the model subset, extracted from the pre-trained high-frequency data prediction model set, that matches the period to be predicted; similarly, the data prediction model is the model, extracted from the pre-trained data prediction model set, that matches the period to be predicted. The high-frequency data prediction model set is built after single-class high-frequency data prediction models are trained on the obtained low-frequency data of the target time period and the at least one high-frequency data item corresponding to each class of high-frequency data. The data prediction model set is built after data prediction models are trained on the prediction results of the single-class high-frequency data prediction models and the low-frequency data.
To build the high-frequency data prediction model set, referring to Fig. 2, the data prediction system provided by the embodiment of the present application further comprises a high-frequency data prediction model set building module 44, which specifically comprises:
a data acquisition unit 441, configured to obtain the low-frequency data of a target time period and multiple classes of high-frequency data of the target time period, each class of high-frequency data corresponding to at least one high-frequency data item;
a single-class high-frequency data prediction model training unit 442, configured to, for any class of high-frequency data, train with the values of the at least one high-frequency data item of that class as the independent variables and the value of the low-frequency data as the dependent variable, to obtain a single-class high-frequency data prediction model for the target time period; and
a high-frequency data prediction model set construction unit 443, configured to build the high-frequency data prediction model subset from multiple single-class high-frequency data prediction models, and to build the high-frequency data prediction model set from the high-frequency data prediction model subsets of multiple target time periods.
Taking new credit as the high-frequency data and the GDP growth rate as the low-frequency data, the target time period may be a set of discrete time periods. That is, the low-frequency data corresponding to the target time period may be the GDP growth rates of all first quarters from 2010 to 2017, and the first quarter of each of those years corresponds to three high-frequency data items of one class of high-frequency data (namely new credit). For the new credit class, the values of the three corresponding high-frequency data items are used as the independent variables and the value of the GDP growth rate as the dependent variable for training, which yields the single-class high-frequency data prediction model of the target time period for the new credit class. The high-frequency data prediction model set construction unit 443 then builds the high-frequency data prediction model subset from the single-class high-frequency data prediction models of all classes of high-frequency data. Finally, the high-frequency data prediction model set is built as the collection of the high-frequency data prediction model subsets of the different target time periods.
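The nesting described above (model set, then one subset per target time period, then one single-class model per class) might be organized as in the following sketch. Several simplifications are assumed and are not taken from the application: the lag weights are fixed in advance instead of being estimated, each single-class model is a simple linear regression on the weighted monthly values, and the names `fit_line`, `train_single_class`, `build_model_set` and all data are hypothetical.

```python
def fit_line(xs, ys):
    """Closed-form simple linear regression y = b0 + b1 * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


def train_single_class(blocks, targets, weights):
    """Fit y_t = b0 + b1 * sum_k(w_k * x_{t,k}) for one class and one period."""
    weighted = [sum(w * x for w, x in zip(weights, block)) for block in blocks]
    b0, b1 = fit_line(weighted, targets)
    return {"b0": b0, "b1": b1, "weights": list(weights)}


def build_model_set(history, weights):
    """history[period][class_name] = (blocks, targets) -> model_set[period][class_name]."""
    model_set = {}
    for period, classes in history.items():
        subset = {}  # the model subset for this target time period
        for class_name, (blocks, targets) in classes.items():
            subset[class_name] = train_single_class(blocks, targets, weights)
        model_set[period] = subset
    return model_set


# Illustrative history: three past first quarters of monthly new credit
# and the matching quarterly GDP growth values (invented numbers).
history = {
    "Q1": {
        "new_credit": (
            [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]],
            [4.5, 6.5, 8.5],
        )
    }
}
model_set = build_model_set(history, [0.5, 0.25, 0.25])
print(model_set["Q1"]["new_credit"])
```

Training one model per class and per period keeps each class's effect on the low-frequency variable separate, which is what the second-stage model later combines.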
It is worth noting that the embodiment of the present application can train a single-class high-frequency data prediction model not only for the new credit class of high-frequency data, but also for the year-on-year export growth rate class, and for the other classes of high-frequency data among the economic fluctuation influence indicator data. The embodiment of the present application can thus consider separately the influence of the independent variables of each class of high-frequency data on the dependent variable of the low-frequency data, bringing out as far as possible the effect of each class of high-frequency data on the low-frequency data.
Referring to Fig. 2, the high-frequency data prediction model set building module 44 in the embodiment of the present application further comprises a prediction precision judgment module 444, configured to judge, for any single-class high-frequency data prediction model, whether the prediction precision of that model is greater than a preset threshold. When the prediction precision judgment module 444 judges that the prediction precision of the single-class high-frequency data prediction models is greater than the preset threshold, the high-frequency data prediction model set construction unit 443 builds the high-frequency data prediction model subset from the multiple single-class high-frequency data prediction models. Here, prediction precision refers to the closeness between the prediction result of a single-class high-frequency data prediction model and the actual low-frequency data: the closer the two, the higher the prediction precision.
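The threshold check can be sketched as a filter over candidate single-class models. The application defines precision only as the closeness between prediction and actual value, so the score used here (one minus the mean absolute relative error) is an assumed stand-in, and the names and numbers are hypothetical.

```python
def prediction_precision(predictions, actuals):
    """Assumed precision score: 1 minus the mean absolute relative error."""
    errors = [abs(p - a) / abs(a) for p, a in zip(predictions, actuals)]
    return 1.0 - sum(errors) / len(errors)


def select_models(candidate_predictions, actuals, threshold):
    """Keep the single-class models whose precision exceeds the threshold."""
    return {
        name: preds
        for name, preds in candidate_predictions.items()
        if prediction_precision(preds, actuals) > threshold
    }


# Illustrative actual low-frequency values and per-class predictions.
actuals = [10.0, 20.0]
candidates = {
    "new_credit": [9.0, 21.0],      # close to the actual values
    "export_growth": [5.0, 30.0],   # far from the actual values
}
kept = select_models(candidates, actuals, threshold=0.9)
print(sorted(kept))
```

Only models that pass the check would go on to form the model subset, so a poorly fitting class cannot degrade the second-stage combination.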
To facilitate building the above data prediction model set, referring to Fig. 3, the data prediction system provided by the embodiment of the present application further includes a data prediction model set building module 55, which specifically includes:
A single-class high-frequency data prediction result acquisition unit 551, configured to input the value of the at least one high-frequency data item corresponding to any class of high-frequency data into the corresponding single-class high-frequency data prediction model in the high-frequency data prediction model subset, to obtain a single-class high-frequency data prediction result;
A data prediction model set building unit 552, configured to train with the multiple single-class high-frequency data prediction results obtained as independent variables and the value of the low-frequency data as the dependent variable, to obtain the data prediction model of the target time period, and to build the data prediction model set from the data prediction models of multiple target time periods.
Here, the embodiment of the present application can first input the values of the three high-frequency data items corresponding to the new-credit class into its single-class high-frequency data prediction model, to obtain the single-class high-frequency data prediction result for the new-credit class; and/or input the values of the three high-frequency data items corresponding to the year-on-year export growth class into its single-class high-frequency data prediction model, to obtain the single-class prediction result for that class; and/or input the values of the high-frequency data items corresponding to other classes among the economic fluctuation indicator data into their single-class high-frequency data prediction models, to obtain the single-class prediction results for those classes. The multiple single-class high-frequency data prediction results obtained are then used as independent variables, with the value of the low-frequency data as the dependent variable, for training, which yields the data prediction model of the target time period. Finally, the above data prediction model set is built from the data prediction models corresponding to the different target time periods.
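The second-stage training just described (single-class prediction results as independent variables, the low-frequency value as dependent variable) can be sketched as below. The text does not name the regression technique, so ordinary least squares is an assumption here, and `solve_linear_system`, `fit_combiner`, `combine`, and the sample data are hypothetical names and values used only for illustration.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_combiner(single_class_predictions, low_freq_values):
    """Least-squares fit of low_freq ~ intercept + sum_j coef_j * prediction_j.

    single_class_predictions holds one list per class, aligned by period.
    """
    rows = [[1.0] + list(p) for p in zip(*single_class_predictions)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, low_freq_values)) for i in range(k)]
    return solve_linear_system(xtx, xty)  # normal equations (assumed OLS)


def combine(coefs, predictions_for_period):
    """Apply the fitted data prediction model to one period's single-class results."""
    return coefs[0] + sum(c * p for c, p in zip(coefs[1:], predictions_for_period))


# Synthetic data built so that low_freq = 1 + 0.5 * class_a + 0.5 * class_b exactly.
class_a = [1.0, 2.0, 3.0, 4.0]
class_b = [2.0, 1.0, 4.0, 3.0]
low_freq = [2.5, 2.5, 4.5, 4.5]
coefs = fit_combiner([class_a, class_b], low_freq)
```

One such fitted model would be kept per target time period, and the collection of them would play the role of the data prediction model set.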
In view of the application scenarios of the data prediction system provided by the embodiment of the present application, the embodiment may also let low-frequency data that lags the current target time period take part in training the above data prediction model. That is, when predicting low-frequency data for the target time period corresponding to the second quarter, if the data of the first quarter is associated with the data of the second quarter, the corresponding lagged low-frequency data can likewise be input into the data prediction model for training.
Referring to Fig. 4, the low-frequency data prediction module 33 in the data prediction system provided by the embodiment of the present application specifically includes:
A high-frequency data classification unit 331, configured to classify the high-frequency data corresponding to the low-frequency data, each class of high-frequency data corresponding to at least one high-frequency data item;
A prediction model matching unit 332, configured to query, for any class of high-frequency data, the extracted high-frequency data prediction model subset to obtain the single-class high-frequency data prediction model that matches that class;
A prediction result acquisition unit 333, configured to input the value of the at least one high-frequency data item corresponding to that class into the single-class high-frequency data prediction model obtained, to obtain a single-class high-frequency data prediction result;
A low-frequency data prediction unit 334, configured to input the multiple single-class high-frequency data prediction results obtained into the extracted data prediction model, to obtain the prediction result of the low-frequency data.
As can be seen, the embodiment of the present application first classifies the high-frequency data corresponding to the low-frequency data. Then, for any class of high-frequency data, it queries the extracted high-frequency data prediction model subset to obtain the matching single-class high-frequency data prediction model, and inputs the value of the at least one high-frequency data item corresponding to that class into the model to obtain a single-class high-frequency data prediction result. Finally, it inputs the multiple single-class prediction results obtained into the extracted data prediction model to obtain the prediction result of the low-frequency data. The process of predicting with the models is thus similar to the process of training them, and the details are not repeated here.
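The prediction flow above (classify, match a model per class, predict per class, combine) can be sketched as follows. The callables standing in for trained models are hypothetical fixed-weight functions chosen only so the example runs; they are not models from the application, and the class names and values are made up.

```python
def predict_low_frequency(high_freq_by_class, model_subset, data_model):
    """Run each class through its matched single-class model, then combine.

    Classes are visited in sorted name order so that data_model receives the
    single-class results in a stable, known order.
    """
    single_class_results = []
    for class_name in sorted(high_freq_by_class):
        single_class_model = model_subset[class_name]  # model matching step
        values = high_freq_by_class[class_name]
        single_class_results.append(single_class_model(values))
    return data_model(single_class_results)


# Hypothetical stand-ins for trained models: fixed lag weights per class.
model_subset = {
    "export_growth": lambda v: 0.2 * v[0] + 0.3 * v[1] + 0.5 * v[2],
    "new_credit": lambda v: 0.5 * v[0] + 0.25 * v[1] + 0.25 * v[2],
}


def data_model(results):
    """Hypothetical second-stage model: intercept plus equal coefficients."""
    return 1.0 + 0.5 * results[0] + 0.5 * results[1]


# Already-classified high-frequency inputs (three monthly values per class).
high_freq_by_class = {
    "new_credit": [4.0, 8.0, 4.0],
    "export_growth": [10.0, 10.0, 10.0],
}
forecast = predict_low_frequency(high_freq_by_class, model_subset, data_model)
```

In a fuller version, `model_subset` and `data_model` would first be looked up by the time period to be predicted, as the extraction step describes.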
The data prediction system provided by the embodiment of the present application can build the above single-class high-frequency data prediction model based on the following formula:
where $y_t$ denotes the low-frequency data of the target time period; $x_t^{(m)}$ denotes the high-frequency data corresponding to any class of high-frequency data of the target time period; $m$ denotes the frequency ratio between the high-frequency data and the low-frequency data; $\beta_0$ and $\beta_1$ denote constants; $\varepsilon_t$ denotes the random error term; $L^{1/m}$ denotes the high-frequency lag operator, with $L^{i/m} x_t^{(m)} = x_{t-i/m}^{(m)}$; $K$ denotes the lag order of the high-frequency data; $x_{t-i/m}^{(m)}$ denotes the high-frequency data lagged by $i$ orders relative to the $t$-th low-frequency data; and $W(k;\theta)$ denotes the weight coefficient.
For example, if the embodiment of the present application takes new credit as the high-frequency data and GDP growth as the low-frequency data, then $y_t$ is the GDP growth in period $t$ and $x_{t-i/m}^{(m)}$ is the new credit lagged by $i$ high-frequency periods. The data frequency of new credit is 3 times that of GDP growth, so the corresponding $m$ is 3, and so on for other cases. In addition, the above weight coefficient can be determined by any one of the Almon estimation method, the exponential Almon estimation method, the beta distribution estimation method, the step function estimation method, and similar estimation methods.
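The model formula itself is not reproduced in this text. The symbol definitions given for it match the standard mixed-data sampling (MIDAS) regression, so, purely as a reconstruction under that assumption and not as a quotation of the original formula, the model may be written as:

```latex
y_t = \beta_0 + \beta_1 \, W\!\left(L^{1/m};\theta\right) x_t^{(m)} + \varepsilon_t,
\qquad
W\!\left(L^{1/m};\theta\right) = \sum_{k=1}^{K} W(k;\theta)\, L^{(k-1)/m},
\qquad
L^{(k-1)/m}\, x_t^{(m)} = x_{t-(k-1)/m}^{(m)}
```

With quarterly GDP growth and monthly new credit ($m = 3$), the weighted sum runs over the latest monthly credit value of quarter $t$ and its $K-1$ preceding monthly values. Whether the sum starts at $k = 0$ or $k = 1$ varies between presentations of MIDAS; the indexing chosen here is an assumption.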
In view of the good properties of the exponential Almon estimation method, the embodiment of the present application preferably estimates the weight coefficient with the exponential Almon estimation method. The weight coefficient estimated by this method is given by the following formula:
where $\theta_1 \le 300$ and $\theta_2 < 0$.
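The exponential Almon formula is likewise not reproduced in this text. The commonly used two-parameter form is $W(k;\theta) = \exp(\theta_1 k + \theta_2 k^2) / \sum_{j=1}^{K} \exp(\theta_1 j + \theta_2 j^2)$; assuming that form is the one intended, a minimal sketch of the weight computation is shown below. The function name and the sample parameter values are illustrative only.

```python
import math


def exp_almon_weights(theta1, theta2, num_lags):
    """Normalized exponential Almon weights W(k; theta) for k = 1..num_lags.

    Assumes the common two-parameter form exp(theta1*k + theta2*k^2),
    normalized so that the weights sum to one.
    """
    exponents = [theta1 * k + theta2 * k * k for k in range(1, num_lags + 1)]
    shift = max(exponents)  # subtracting the max avoids overflow in exp()
    raw = [math.exp(e - shift) for e in exponents]
    total = sum(raw)
    return [r / total for r in raw]


# Sample parameters respecting theta1 <= 300 and theta2 < 0.
weights = exp_almon_weights(theta1=0.1, theta2=-0.05, num_lags=3)
```

With these sample parameters the weights are positive, sum to one, and decline with the lag index, so nearer lags contribute more.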
It is worth noting that the data prediction system provided by the embodiment of the present application can not only use the formula of the above single-class high-frequency data prediction model to build a corresponding distributed lag model, which performs single-class high-frequency data prediction from the current value of the high-frequency data of the target time period together with its lagged values over several periods, but can also build a forward prediction model that uses only the lagged values over several periods of the high-frequency data of the target time period. The details are not repeated here.
In addition, besides determining the weight coefficient of the single-class high-frequency data prediction model directly with the above estimation methods, the embodiment of the present application can also use the single-class high-frequency data prediction model training unit 442 to carry out the training of the single-class high-frequency data prediction model.
The above single-class high-frequency data prediction model training unit 442 is specifically configured to: build an initial single-class high-frequency data prediction model; input an initial weight coefficient and the value of the high-frequency data corresponding to any class of high-frequency data into the initial single-class high-frequency data prediction model that was built;
judge whether the output error of the initial single-class high-frequency data prediction model is less than a preset error; if not, adjust the initial weight coefficient based on the output error, and obtain the output error again for the value of the high-frequency data corresponding to that class using the initial single-class high-frequency data prediction model with the adjusted weight coefficient; once the output error is less than the preset error, input the next high-frequency data item corresponding to that class; wherein the output error is determined by the current prediction result of the initial single-class high-frequency data prediction model and the value of the low-frequency data of the target time period;
input the values of the high-frequency data corresponding to that class, one after another, into the initial single-class high-frequency data prediction model with the adjusted weight coefficient, until the output error corresponding to the last high-frequency data item is judged to be less than the preset error, at which point the single-class high-frequency data prediction model is obtained.
In other words, the embodiment of the present application can input the initial weight coefficient and the value of the high-frequency data corresponding to any class of high-frequency data into the initial single-class high-frequency data prediction model built above in order to train it, and adjust the initial weight coefficient according to the comparison between the model's output error and the preset error. Once the output error obtained after adjustment for a given high-frequency data item is less than the preset error, a new high-frequency data item is added and the weight adjustment is performed again. When the output error corresponding to the last high-frequency data item is judged to be less than the preset error, the single-class high-frequency data prediction model is obtained. The above initial weight coefficient can be determined by a random function.
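One possible reading of this iterative procedure is sketched below. The text does not specify how the weights are adjusted from the output error, so the gradient step on the squared error, the learning rate, the iteration cap, and the treatment of each "high-frequency data item" as one training observation are all assumptions; the function name and data are hypothetical.

```python
import random


def train_single_class_model(samples, targets, tolerance=1e-3,
                             learning_rate=0.05, max_updates=10_000, seed=0):
    """Adjust weights observation by observation until each error is small.

    samples[i] holds the high-frequency values for low-frequency period i,
    targets[i] the low-frequency value of that period.
    """
    rng = random.Random(seed)
    # Initial weight coefficients drawn from a random function.
    weights = [rng.uniform(-0.1, 0.1) for _ in samples[0]]
    bias = 0.0
    for x, y in zip(samples, targets):  # feed the observations one after another
        for _ in range(max_updates):
            error = bias + sum(w * v for w, v in zip(weights, x)) - y
            if abs(error) < tolerance:
                break  # error below the preset error: move to the next input
            # Assumed update rule: one gradient step on the squared error.
            weights = [w - learning_rate * error * v for w, v in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


# Made-up illustrative data: three monthly values per quarter, one target each.
samples = [[1.0, 0.5, 0.2], [0.8, 0.6, 0.4], [0.9, 0.7, 0.3]]
targets = [1.2, 1.1, 1.3]
weights, bias = train_single_class_model(samples, targets)
```

Because the procedure only guarantees a small error on the item currently being processed, the finished model fits the last observation within tolerance; earlier observations may drift, which is a property of this reading rather than a claim about the original method.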
After obtaining the low-frequency data and the high-frequency data, the data prediction system provided by the embodiment of the present application can also carry out corresponding data processing, such as data preprocessing, missing-value filling, data conversion, and seasonal adjustment. Data preprocessing can consist of preliminary analysis of the acquired data and operations such as removing duplicate data. Missing-value filling can consist of filling in data points for incomplete data based on the trend shown by a large amount of data. Data conversion can consist of normalizing the data so that data from different sources are brought into one unified frame of reference. Seasonal adjustment can consist of decomposing seasonal data, removing the seasonal factor, and retaining the trend factor or the random factor, so as to reduce the interference of seasonal factors with subsequent model training.
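Three of the processing steps above can be sketched as follows. The text does not fix the concrete techniques, so linear interpolation for missing values and min-max scaling for normalization are assumptions, the function names are hypothetical, and seasonal adjustment is left out because it needs a decomposition method the text does not name.

```python
def drop_duplicates(records):
    """Remove repeated (period, value) records while keeping first-seen order."""
    seen, unique = set(), []
    for rec in records:
        if rec not in seen:
            seen.add(rec)
            unique.append(rec)
    return unique


def fill_missing(values):
    """Fill interior None gaps by linear interpolation between known neighbours.

    Leading or trailing None entries are left untouched.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


def min_max_normalize(values):
    """Scale values to [0, 1]; assumes the series is not constant."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Made-up monthly records with one duplicate and one gap.
records = [("2017-01", 1.0), ("2017-01", 1.0), ("2017-02", None), ("2017-03", 3.0)]
unique_records = drop_duplicates(records)
series = fill_missing([value for _, value in unique_records])
normalized = min_max_normalize(series)
```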
In addition, the data prediction system provided by the embodiment of the present application can also screen the data, that is, select the best classes of high-frequency data from among the multiple classes. The embodiment can perform this screening with a method that combines regression trees with principal component analysis (Principal Component Analysis, PCA), so as to further improve the efficiency and accuracy of subsequent model training.
Based on the same inventive concept, the embodiment of the present application further provides a data prediction method corresponding to the data prediction system. Since the principle by which the method solves the problem is similar to that of the above data prediction system, the implementation of the method may refer to the implementation of the system, and repeated content is omitted. Fig. 5 is a flowchart of the data prediction method provided by the embodiment of the present application. The method includes:
S101, obtaining the high-frequency data corresponding to the low-frequency data of the time period to be predicted;
S102, extracting, from the pre-trained high-frequency data prediction model set and data prediction model set, the high-frequency data prediction model subset and the data prediction model that match the time period to be predicted;
S103, obtaining the prediction result of the low-frequency data by using the high-frequency data corresponding to the low-frequency data, the extracted high-frequency data prediction model subset, and the extracted data prediction model.
In the embodiment of the present application, referring to Fig. 6, building the high-frequency data prediction model set specifically includes the following steps:
S201, obtaining the low-frequency data of the target time period and multiple classes of high-frequency data of the target time period, each class of high-frequency data corresponding to at least one high-frequency data item;
S202, for any class of high-frequency data, training with the value of the at least one high-frequency data item corresponding to that class as the independent variable and the value of the low-frequency data as the dependent variable, to obtain the single-class high-frequency data prediction model of the target time period;
S203, building the high-frequency data prediction model subset from the multiple single-class high-frequency data prediction models, and building the high-frequency data prediction model set from the high-frequency data prediction model subsets of multiple target time periods.
In the embodiment of the present application, referring to Fig. 7, building the data prediction model set specifically includes the following steps:
S301, inputting the value of the at least one high-frequency data item corresponding to any class of high-frequency data into the corresponding single-class high-frequency data prediction model in the high-frequency data prediction model subset, to obtain a single-class high-frequency data prediction result;
S302, training with the multiple single-class high-frequency data prediction results obtained as independent variables and the value of the low-frequency data as the dependent variable, to obtain the data prediction model of the target time period, and building the data prediction model set from the data prediction models of multiple target time periods.
Referring to Fig. 8, the above S103 specifically includes the following steps:
S401, classifying the high-frequency data corresponding to the low-frequency data, each class of high-frequency data corresponding to at least one high-frequency data item;
S402, for any class of high-frequency data, querying the extracted high-frequency data prediction model subset to obtain the single-class high-frequency data prediction model that matches that class;
S403, inputting the value of the at least one high-frequency data item corresponding to that class into the single-class high-frequency data prediction model obtained, to obtain a single-class high-frequency data prediction result;
S404, inputting the multiple single-class high-frequency data prediction results obtained into the extracted data prediction model, to obtain the prediction result of the low-frequency data.
In the embodiment of the present application, after the single-class high-frequency data prediction model of the target time period is obtained and before the high-frequency data prediction model subset is built from the multiple single-class high-frequency data prediction models, the above method further includes:
for any single-class high-frequency data prediction model, judging whether the prediction precision of that model is greater than a preset threshold, and if so, executing the step of building the high-frequency data prediction model subset from the multiple single-class high-frequency data prediction models.
Referring to Fig. 9, the above S202 specifically includes the following steps:
S501, building an initial single-class high-frequency data prediction model;
S502, inputting an initial weight coefficient and the value of the high-frequency data corresponding to any class of high-frequency data into the initial single-class high-frequency data prediction model that was built;
S503, judging whether the output error of the initial single-class high-frequency data prediction model is less than a preset error; if not, adjusting the initial weight coefficient based on the output error, and obtaining the output error again for the value of the high-frequency data corresponding to that class using the initial single-class high-frequency data prediction model with the adjusted weight coefficient; once the output error is less than the preset error, inputting the next high-frequency data item corresponding to that class; wherein the output error is determined by the current prediction result of the initial single-class high-frequency data prediction model and the value of the low-frequency data of the target time period;
S504, inputting the values of the high-frequency data corresponding to that class, one after another, into the initial single-class high-frequency data prediction model with the adjusted weight coefficient, until the output error corresponding to the last high-frequency data item is judged to be less than the preset error, at which point the single-class high-frequency data prediction model is obtained.
In a specific implementation, the single-class high-frequency data prediction model is built according to the following formula:
where $y_t$ denotes the low-frequency data of the target time period; $x_t^{(m)}$ denotes the high-frequency data corresponding to any class of high-frequency data of the target time period; $m$ denotes the frequency ratio between the high-frequency data and the low-frequency data; $\beta_0$ and $\beta_1$ denote constants; $\varepsilon_t$ denotes the random error term; $L^{1/m}$ denotes the high-frequency lag operator, with $L^{k/m} x_t^{(m)} = x_{t-k/m}^{(m)}$; $K$ denotes the lag order of the high-frequency data; $x_{t-k/m}^{(m)}$ denotes the high-frequency data lagged by $k$ orders relative to the $t$-th low-frequency data; and $W(k;\theta)$ denotes the weight coefficient.
The weight coefficient is obtained by any one of the following estimation methods: the Almon estimation method, the exponential Almon estimation method, the beta distribution estimation method, and the step function estimation method.
Corresponding to the data prediction method in Fig. 5 to Fig. 9, the embodiment of the present application further provides a computer device. As shown in Fig. 10, the device includes a memory 1000, a processor 2000, and a computer program stored in the memory 1000 and executable on the processor 2000, wherein the processor 2000 implements the steps of the above data prediction method when executing the computer program.
Specifically, the memory 1000 and the processor 2000 can be a general-purpose memory and processor, which are not specifically limited here. When the processor 2000 runs the computer program stored in the memory 1000, it can execute the above data prediction method, thereby solving the problem that the direct averaging approach currently in use leads to poor prediction accuracy and reliability, and achieving data prediction with high accuracy and high reliability.
Corresponding to the data prediction method in Fig. 5 to Fig. 9, the embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored, the computer program executing the steps of the above data prediction method when run by a processor.
Specifically, the storage medium can be a general-purpose storage medium, such as a removable disk or a hard disk. When the computer program on the storage medium is run, it can execute the above data prediction method, thereby solving the problem that the direct averaging approach currently in use leads to poor prediction accuracy and reliability, and achieving data prediction with high accuracy and high reliability.
The computer program product of the data prediction method and system provided by the embodiment of the present application includes a computer-readable storage medium storing program code. The instructions contained in the program code can be used to execute the method in the foregoing method embodiment; for the specific implementation, refer to the method embodiment, which is not repeated here.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the system and apparatus described above may refer to the corresponding processes in the foregoing method embodiment, and are not repeated here.
If the functions are implemented in the form of software functional units and sold or used as an independent product, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part of it that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), a magnetic disk, or an optical disc.
The above are only specific embodiments of the present application, but the protection scope of the present application is not limited thereto. Any change or replacement that a person skilled in the art can readily conceive of within the technical scope disclosed by the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A data prediction system, characterized in that the system comprises:
a high-frequency data acquisition module, configured to obtain the high-frequency data corresponding to the low-frequency data of a time period to be predicted;
a prediction model extraction module, configured to extract, from a pre-trained high-frequency data prediction model set and a pre-trained data prediction model set, the high-frequency data prediction model subset and the data prediction model that match the time period to be predicted;
a low-frequency data prediction module, configured to obtain the prediction result of the low-frequency data by using the high-frequency data corresponding to the low-frequency data, the extracted high-frequency data prediction model subset, and the extracted data prediction model.
2. The system according to claim 1, characterized in that the system further comprises a high-frequency data prediction model set building module, the high-frequency data prediction model set building module comprising:
a data acquisition unit, configured to obtain the low-frequency data of a target time period and multiple classes of high-frequency data of the target time period, each class of high-frequency data corresponding to at least one high-frequency data item;
a single-class high-frequency data prediction model training unit, configured to train, for any class of high-frequency data, with the value of the at least one high-frequency data item corresponding to that class as the independent variable and the value of the low-frequency data as the dependent variable, to obtain the single-class high-frequency data prediction model of the target time period;
a high-frequency data prediction model set building unit, configured to build the high-frequency data prediction model subset from multiple single-class high-frequency data prediction models, and to build the high-frequency data prediction model set from the high-frequency data prediction model subsets of multiple target time periods.
3. The system according to claim 2, characterized in that the system further comprises a data prediction model set building module, the data prediction model set building module comprising:
a single-class high-frequency data prediction result acquisition unit, configured to input the value of the at least one high-frequency data item corresponding to any class of high-frequency data into the corresponding single-class high-frequency data prediction model in the high-frequency data prediction model subset, to obtain a single-class high-frequency data prediction result;
a data prediction model set building unit, configured to train with the multiple single-class high-frequency data prediction results obtained as independent variables and the value of the low-frequency data as the dependent variable, to obtain the data prediction model of the target time period, and to build the data prediction model set from the data prediction models of multiple target time periods.
4. The system according to any one of claims 1 to 3, characterized in that the low-frequency data prediction module comprises:
a high-frequency data classification unit, configured to classify the high-frequency data corresponding to the low-frequency data, each class of high-frequency data corresponding to at least one high-frequency data item;
a prediction model matching unit, configured to query, for any class of high-frequency data, the extracted high-frequency data prediction model subset to obtain the single-class high-frequency data prediction model that matches that class of high-frequency data;
a prediction result acquisition unit, configured to input the value of the at least one high-frequency data item corresponding to that class of high-frequency data into the single-class high-frequency data prediction model obtained, to obtain a single-class high-frequency data prediction result;
a low-frequency data prediction unit, configured to input the multiple single-class high-frequency data prediction results obtained into the extracted data prediction model, to obtain the prediction result of the low-frequency data.
5. The system according to claim 2, characterized in that the high-frequency data prediction model set building module further comprises:
a prediction precision judgment module, configured to judge, for any single-class high-frequency data prediction model, whether the prediction precision of the single-class high-frequency data prediction model is greater than a preset threshold;
the high-frequency data prediction model set building unit being specifically configured to build the high-frequency data prediction model subset from the multiple single-class high-frequency data prediction models when the prediction precision of the single-class high-frequency data prediction model is judged to be greater than the preset threshold.
6. The system according to claim 2, characterized in that the single-class high-frequency data prediction model training unit is specifically configured to: build an initial single-class high-frequency data prediction model; input an initial weight coefficient and the value of the high-frequency data corresponding to the class of high-frequency data into the initial single-class high-frequency data prediction model that was built;
judge whether the output error of the initial single-class high-frequency data prediction model is less than a preset error; if not, adjust the initial weight coefficient based on the output error, and obtain the output error again for the value of the high-frequency data corresponding to the class of high-frequency data using the initial single-class high-frequency data prediction model with the adjusted weight coefficient; once the output error is less than the preset error, input the next high-frequency data item corresponding to the class of high-frequency data; wherein the output error is determined by the current prediction result of the initial single-class high-frequency data prediction model and the value of the low-frequency data of the target time period;
input the values of the high-frequency data corresponding to the class of high-frequency data, one after another, into the initial single-class high-frequency data prediction model with the adjusted weight coefficient, until the output error corresponding to the last high-frequency data item is judged to be less than the preset error, at which point the single-class high-frequency data prediction model is obtained.
7. The system according to claim 2, characterized in that the single-class high-frequency data prediction model is built according to the following formula:
where $y_t$ denotes the low-frequency data of the target time period; $x_t^{(m)}$ denotes the high-frequency data corresponding to any class of high-frequency data of the target time period; $m$ denotes the frequency ratio between the high-frequency data and the low-frequency data; $\beta_0$ and $\beta_1$ denote constants; $\varepsilon_t$ denotes the random error term; $L^{1/m}$ denotes the high-frequency lag operator, with $L^{i/m} x_t^{(m)} = x_{t-i/m}^{(m)}$; $K$ denotes the lag order of the high-frequency data; $x_{t-i/m}^{(m)}$ denotes the high-frequency data lagged by $i$ orders relative to the $t$-th low-frequency data; and $W(k;\theta)$ denotes the weight coefficient.
8. The system according to claim 7, characterized in that the weight coefficient is obtained by any one of the following estimation methods: the Almon estimation method, the exponential Almon estimation method, the beta distribution estimation method, and the step function estimation method.
9. A data prediction method, characterized in that the method comprises:
obtaining the high-frequency data corresponding to the low-frequency data of a time period to be predicted;
extracting, from a pre-trained high-frequency data prediction model set and a pre-trained data prediction model set, the high-frequency data prediction model subset and the data prediction model that match the time period to be predicted;
obtaining the prediction result of the low-frequency data by using the high-frequency data corresponding to the low-frequency data, the extracted high-frequency data prediction model subset, and the extracted data prediction model.
10. The method according to claim 9, characterized in that building the high-frequency data prediction model set comprises:
Obtaining the low-frequency data of a target time period and multiple classes of high-frequency data of the target time period, each class of high-frequency data corresponding to at least one high-frequency data;
For any class of high-frequency data, training with the values of the at least one high-frequency data corresponding to that class as the independent variables and the value of the low-frequency data as the dependent variable, to obtain a single-class high-frequency data prediction model for the target time period;
Building the high-frequency data prediction model subset from the multiple single-class high-frequency data prediction models, and building the high-frequency data prediction model set from the high-frequency data prediction model subsets of multiple target time periods.
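The construction in claim 10 can be sketched as one regression per class, grouped first by target time period. This is a simplified illustration under stated assumptions: the lag weights are fixed in advance rather than estimated, each class is fitted by ordinary least squares on the weighted aggregate, and the names `fit_single_class` and `build_model_set` and the layout of `training_data` are hypothetical.

```python
def fit_single_class(x_rows, y, weights):
    """OLS of y on the weighted aggregate of each row of high-frequency values.

    x_rows[i] holds the lagged high-frequency values paired with y[i].
    Weights are assumed fixed here; the patent estimates them instead.
    """
    z = [sum(w * x for w, x in zip(weights, row)) for row in x_rows]
    n = len(y)
    z_mean = sum(z) / n
    y_mean = sum(y) / n
    sxx = sum((zi - z_mean) ** 2 for zi in z)
    if sxx == 0:
        raise ValueError("regressor has no variation; slope is undefined")
    sxy = sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, y))
    beta1 = sxy / sxx
    beta0 = y_mean - beta1 * z_mean
    return {"beta0": beta0, "beta1": beta1, "weights": list(weights)}


def build_model_set(training_data, weights):
    """training_data maps period -> (y, {class_name: x_rows}).

    Returns period -> {class_name: fitted model}, that is, one model
    subset per target time period, collected into the model set.
    """
    model_set = {}
    for period, (y, classes) in training_data.items():
        model_set[period] = {name: fit_single_class(rows, y, weights)
                             for name, rows in classes.items()}
    return model_set
```

Each inner dictionary plays the role of a high-frequency data prediction model subset, and the outer dictionary plays the role of the model set.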
CN201810168243.4A 2018-02-28 2018-02-28 A kind of data prediction system and method Pending CN108304975A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810168243.4A CN108304975A (en) 2018-02-28 2018-02-28 A kind of data prediction system and method

Publications (1)

Publication Number Publication Date
CN108304975A true CN108304975A (en) 2018-07-20

Family

ID=62848894

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810168243.4A Pending CN108304975A (en) 2018-02-28 2018-02-28 A kind of data prediction system and method

Country Status (1)

Country Link
CN (1) CN108304975A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109670653A (en) * 2018-12-29 2019-04-23 北京航天数据股份有限公司 A kind of method and device predicted based on industrial model predictive engine
CN110737656A (en) * 2019-10-23 2020-01-31 广发证券股份有限公司 data interpolation method and device and readable storage medium
CN112232197A (en) * 2020-10-15 2021-01-15 武汉微派网络科技有限公司 Juvenile identification method, device and equipment based on user behavior characteristics
CN112669057A (en) * 2020-12-17 2021-04-16 北京五八信息技术有限公司 Data prediction method and device, electronic equipment and storage medium
CN112669057B (en) * 2020-12-17 2022-07-08 北京五八信息技术有限公司 Data prediction method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
Verbraken et al. Development and application of consumer credit scoring models using profit-based classification measures
CN108304975A (en) A kind of data prediction system and method
Llorca et al. Using the latent class approach to cluster firms in benchmarking: An application to the US electricity transmission industry
US8984022B1 (en) Automating growth and evaluation of segmentation trees
CN106384197A (en) Service quality evaluation method and device based on big data
CN105205052B (en) A kind of data digging method and device
CN110930038A (en) Loan demand identification method, loan demand identification device, loan demand identification terminal and loan demand identification storage medium
Li et al. Heterogeneous ensemble learning with feature engineering for default prediction in peer-to-peer lending in China
CN108446984A (en) A kind of investment data management method and device
CN110414627A (en) A kind of training method and relevant device of model
CN108734567A (en) A kind of asset management system and its appraisal procedure based on big data artificial intelligence air control
CN113807469A (en) Multi-energy user value prediction method, device, storage medium and equipment
Long et al. Clustering stock data for multi-objective portfolio optimization
CN103455509B (en) A kind of method and system obtaining time window model parameter
CN106779240A (en) The Forecasting Methodology and system of civil aviaton's market macroscopic view index
CN104598705B (en) For identifying the method and apparatus of subsurface material layer
CN108596765A (en) A kind of Electronic Finance resource recommendation method and device
Ji et al. Portfolio diversification strategy via tail‐dependence clustering and ARMA‐GARCH Vine Copula approach
KR20110114181A (en) Loan underwriting method for improving forecasting accuracy
CN111222993A (en) Fund recommendation method and device
CN111428148B (en) Intelligent optimization algorithm recommendation method suitable for manufacturing process planning
CN110196797A (en) Automatic optimization method and system suitable for credit scoring card system
CN108710994A (en) Investment share-selecting method, device and storage medium based on the public sentiment factor
CN109767263A (en) Business revenue data predication method, device, computer equipment and storage medium
CN114493279A (en) Workflow task prediction method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180720