CN107977754A - Data prediction method, system and computer-readable storage medium - Google Patents


Publication number
CN107977754A
Authority
CN
China
Prior art keywords
feature vector
data
learning algorithm
deep learning
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711363043.6A
Other languages
Chinese (zh)
Inventor
周中和
李晶
汪亚男
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
WeBank Co Ltd
Original Assignee
WeBank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by WeBank Co Ltd
Priority to CN201711363043.6A
Publication of CN107977754A


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00: Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/02: Banking, e.g. interest calculation or account maintenance

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Game Theory and Decision Science (AREA)
  • Technology Law (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a data prediction method, a system and a computer-readable storage medium. The method comprises the following steps: extracting source data from a data system and packaging the extracted source data into feature vectors; loading the feature vectors and building a deep learning algorithm model using hyperparameters; inputting the feature vectors into the deep learning algorithm model to train it; and predicting on feature vectors with the trained deep learning algorithm model to obtain a predicted value. The invention improves the accuracy and convenience of data prediction.

Description

Data prediction method, system and computer-readable storage medium
Technical field
The present invention relates to the field of data analysis, and in particular to a data prediction method, a system and a computer-readable storage medium.
Background
Existing data prediction schemes, such as bank liquidity fund forecasting schemes, are mostly quite simple. The models are largely built from experience and are essentially linear, so the gap between predicted and actual values is large. For data with even somewhat stronger periodicity in particular, the model's predictions are valid only in the short term and diverge widely afterwards, so manual intervention is needed to adjust them. Because the period has to be identified manually and different periods have to be fitted with different, similar function models, the operation is very cumbersome. Existing data prediction approaches therefore have low accuracy and are cumbersome to operate.
Summary of the invention
The main object of the present invention is to provide a data prediction method, a system and a computer-readable storage medium, aiming to solve the technical problem that existing bank liquidity fund forecasting approaches have low accuracy and are cumbersome to operate.
To achieve the above object, the present invention provides a data prediction method, which includes:
extracting source data from a data system and packaging the extracted source data into feature vectors;
loading the feature vectors and building a deep learning algorithm model using hyperparameters;
inputting the feature vectors into the deep learning algorithm model to train the deep learning algorithm model;
predicting on feature vectors based on the trained deep learning algorithm model to obtain a predicted value.
Optionally, the step of packaging the extracted source data into feature vectors includes:
performing data cleaning on the extracted source data;
grouping and aggregating the cleaned source data by date to obtain the feature vectors.
Optionally, after the step of grouping and aggregating the cleaned source data by date to obtain the feature vectors, the method further includes:
checking the feature vectors;
when missing data is detected in a feature vector, completing the feature vector using an interpolation averaging algorithm.
Optionally, after the step of loading the feature vectors, the method further includes:
standardizing the loaded feature vectors by scaling them according to a preset ratio, to obtain standardized feature vectors;
and after the step of predicting on feature vectors based on the trained deep learning algorithm model to obtain a predicted value, the method further includes:
restoring the predicted value according to the preset ratio to obtain the final predicted value.
Optionally, the step of standardizing the loaded feature vectors by scaling them according to a preset ratio to obtain standardized feature vectors includes:
determining the maximum value in the feature vector and adding a preset value to it to obtain a sum;
dividing each value in the feature vector by the sum to obtain the standardized feature vector.
Optionally, the step of standardizing the loaded feature vectors by scaling them according to a preset ratio to obtain standardized feature vectors alternatively includes:
calculating the mean and the standard deviation of the values in the feature vector;
subtracting the mean from each value to obtain the differences;
dividing each difference by the standard deviation to obtain the standardized feature vector.
Optionally, after the step of predicting on feature vectors based on the trained deep learning algorithm model to obtain a predicted value, the method further includes:
obtaining the actual value corresponding to the predicted value when a preset time interval elapses;
calculating the difference between the predicted value and the actual value, and the ratio of that difference to the actual value;
when the calculated ratio is greater than a preset ratio, adjusting the deep learning algorithm model.
Optionally, the feature vectors include calendar feature vectors and index feature vectors.
In addition, to achieve the above object, the present invention also provides a data prediction system. The data prediction system includes a processor, a memory, and a data prediction program that is stored on the memory and can run on the processor. When the data prediction program is executed by the processor, the steps of the data prediction method described above are implemented.
In addition, to achieve the above object, the present invention also provides a computer-readable storage medium. A data prediction program is stored on the computer-readable storage medium and is applied to a data prediction system. When the data prediction program is executed by a processor, the steps of the data prediction method described above are implemented.
In the data prediction method proposed by the present invention, source data is first extracted from a data system and packaged into feature vectors; the feature vectors are then loaded and a deep learning algorithm model is built using hyperparameters; the feature vectors are then input into the deep learning algorithm model to train it; finally, feature vectors are predicted on with the trained deep learning algorithm model to obtain a predicted value. Because this scheme uses a deep learning algorithm model to predict on the feature vectors corresponding to the source data, the specific period of the source data does not need to be known. Even when the characteristics of the source data change, the model can automatically learn the new periodic characteristics and adjust the predicted values accordingly, which ensures the accuracy and convenience of data prediction.
Brief description of the drawings
Fig. 1 is a schematic diagram of the hardware architecture of the data prediction system of the present invention;
Fig. 2 is a schematic flowchart of the first embodiment of the data prediction method of the present invention;
Fig. 3 is a detailed schematic flowchart of step S10 in Fig. 2;
Fig. 4 is a schematic flowchart of the third embodiment of the data prediction method of the present invention.
The realization of the object, the functional features and the advantages of the present invention will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed description
It should be understood that the specific embodiments described herein only illustrate the present invention and are not intended to limit it.
The solution of the embodiments of the present invention is mainly as follows: source data is first extracted from a data system and packaged into feature vectors; the feature vectors are then loaded and a deep learning algorithm model is built using hyperparameters; the feature vectors are then input into the deep learning algorithm model to train it; finally, feature vectors are predicted on with the trained deep learning algorithm model to obtain a predicted value. This solves the problem that existing bank liquidity fund forecasting approaches have low accuracy and convenience.
As shown in Fig. 1, Fig. 1 is a schematic structural diagram of the data prediction system in the hardware operating environment involved in the embodiments of the present invention.
The data prediction system of the embodiments of the present invention may be a PC, a smartphone, a tablet computer or a portable computer, or a device with a display function such as a server or a virtual machine.
As shown in Fig. 1, the data prediction system may include a processor 1001 (for example a CPU), a communication bus 1002, a user interface 1003, a network interface 1004 and a memory 1005. The communication bus 1002 is used to implement connection and communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard; optionally, the user interface 1003 may also include a standard wired interface (for example for connecting a wired keyboard or wired mouse) and a wireless interface (for example for connecting a wireless keyboard or wireless mouse). The network interface 1004 may optionally include a standard wired interface (for connecting to a wired network) and a wireless interface (such as a Wi-Fi, Bluetooth or infrared interface, for connecting to a wireless network). The memory 1005 may be a high-speed RAM memory or a stable non-volatile memory, such as a disk memory. The memory 1005 may optionally also be a storage device independent of the aforementioned processor 1001.
Optionally, the data prediction system may also include a camera, an RF (Radio Frequency) circuit, sensors, an audio circuit, a WiFi module and so on.
Those skilled in the art will understand that the structure shown in Fig. 1 does not limit the data prediction system, which may include more or fewer components than illustrated, combine some components, or arrange the components differently.
As shown in Fig. 1, the memory 1005, as a computer-readable storage medium, may include an operating system, a network communication module, a user interface module and a data prediction program. The operating system is a program that manages and controls the data prediction system and software resources, and supports the running of the network communication module, the user interface module, the data prediction program and other programs or software. The network communication module is used to manage and control the network interface 1004; the user interface module is used to manage and control the user interface 1003.
In the data prediction system shown in Fig. 1, the data prediction system calls, through the processor 1001, the data prediction program stored in the memory 1005 to implement the following steps:
extracting source data from a data system and packaging the extracted source data into feature vectors;
loading the feature vectors and building a deep learning algorithm model using hyperparameters;
inputting the feature vectors into the deep learning algorithm model to train the deep learning algorithm model;
predicting on feature vectors based on the trained deep learning algorithm model to obtain a predicted value.
Further, the data prediction system calls, through the processor 1001, the data prediction program stored in the memory 1005 to implement the step of packaging the extracted source data into feature vectors:
performing data cleaning on the extracted source data;
grouping and aggregating the cleaned source data by date to obtain the feature vectors.
Further, after the step of grouping and aggregating the cleaned source data by date to obtain the feature vectors, the data prediction system calls, through the processor 1001, the data prediction program stored in the memory 1005 to implement the following steps:
checking the feature vectors;
when missing data is detected in a feature vector, completing the feature vector using an interpolation averaging algorithm.
Further, after the step of loading the feature vectors, the data prediction system calls, through the processor 1001, the data prediction program stored in the memory 1005 to implement the following steps:
standardizing the loaded feature vectors by scaling them according to a preset ratio, to obtain standardized feature vectors;
and after the step of predicting on feature vectors based on the trained deep learning algorithm model to obtain a predicted value, the method further includes:
restoring the predicted value according to the preset ratio to obtain the final predicted value.
Further, the data prediction system calls, through the processor 1001, the data prediction program stored in the memory 1005 to implement the step of standardizing the loaded feature vectors by scaling them according to a preset ratio to obtain standardized feature vectors:
determining the maximum value in the feature vector and adding a preset value to it to obtain a sum;
dividing each value in the feature vector by the sum to obtain the standardized feature vector.
Further, the data prediction system calls, through the processor 1001, the data prediction program stored in the memory 1005 to implement, as an alternative, the step of standardizing the loaded feature vectors by scaling them according to a preset ratio to obtain standardized feature vectors:
calculating the mean and the standard deviation of the values in the feature vector;
subtracting the mean from each value to obtain the differences;
dividing each difference by the standard deviation to obtain the standardized feature vector.
Further, after the step of predicting on feature vectors based on the trained deep learning algorithm model to obtain a predicted value, the data prediction system calls, through the processor 1001, the data prediction program stored in the memory 1005 to implement the following steps:
obtaining the actual value corresponding to the predicted value when a preset time interval elapses;
calculating the difference between the predicted value and the actual value, and the ratio of that difference to the actual value;
when the calculated ratio is greater than a preset ratio, adjusting the deep learning algorithm model.
Further, the feature vectors include calendar feature vectors and index feature vectors.
In the technical solution proposed by the present invention, the data prediction system calls, through the processor 1001, the data prediction program stored in the memory 1005 to implement the following: source data is first extracted from a data system and packaged into feature vectors; the feature vectors are then loaded and a deep learning algorithm model is built using hyperparameters; the feature vectors are then input into the deep learning algorithm model to train it; finally, feature vectors are predicted on with the trained deep learning algorithm model to obtain a predicted value. Because this scheme uses a deep learning algorithm model to predict on the feature vectors corresponding to the source data, the specific period of the source data does not need to be known. Even when the characteristics of the source data change, the model can automatically learn the new periodic characteristics and adjust the predicted values accordingly, which ensures the accuracy and convenience of data prediction.
Based on the hardware structure of the above data prediction system, the embodiments of the data prediction method of the present invention are proposed.
Referring to Fig. 2, Fig. 2 is a schematic flowchart of the first embodiment of the data prediction method of the present invention.
In this embodiment, the data prediction method is applied to a data prediction system and includes:
Step S10: extract source data from a data system and package the extracted source data into feature vectors;
Step S20: load the feature vectors and build a deep learning algorithm model using hyperparameters;
Step S30: input the feature vectors into the deep learning algorithm model to train the deep learning algorithm model;
Step S40: predict on feature vectors based on the trained deep learning algorithm model to obtain a predicted value.
In this embodiment, the scheme is applicable to the banking field and is preferably applied to bank fund liquidity forecasting. Where illustration is needed below, bank fund liquidity forecasting is used as the example.
The data prediction system includes data systems that provide data sources, such as a loan system, a position system or a core system, and also includes feature engineering and a prediction engine. The feature engineering generates feature vectors from the source data; the prediction engine trains the deep learning algorithm model with the feature vectors and uses the model to perform data prediction on feature vectors.
The steps of data prediction in this embodiment are described in detail below.
Step S10: extract source data from a data system and package the extracted source data into feature vectors.
In this embodiment, source data is first extracted from data systems such as the loan system, the position system or the core system, and the extracted source data is then packaged into feature vectors for the prediction engine to use. The feature vectors include index feature vectors and calendar feature vectors.
An index feature vector includes a sample id, a data date and a data value, for example:
Loan disbursement amount feature: id=100001, 2017-01-01, 212323433
Deposit balance feature: id=100002, 2017-01-01, 234322344
Early withdrawal amount feature: id=100003, 2017-01-01, 34234233
It should be noted that each index feature vector consists of a run of consecutive dates. Taking the deposit balance feature as an example, the row shown above is only one row of it; together with all the other rows it forms the index feature vector of the deposit balance feature. In other words, the deposit balance feature is the series of daily deposit balances since the bank opened.
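As an illustration only, the row layout described above (sample id, data date, data value) could be represented as follows. The `IndexRow` type and `parse_row` helper are hypothetical names introduced for this sketch and do not come from the patent; the comma-separated text form is likewise an assumption.

```python
from datetime import date
from typing import NamedTuple


class IndexRow(NamedTuple):
    """One row of an index feature vector (hypothetical layout)."""
    sample_id: int
    data_date: date
    value: int


def parse_row(line):
    """Parse a 'sample_id,YYYY-MM-DD,value' line into an IndexRow."""
    sample_id, data_date, value = (part.strip() for part in line.split(","))
    return IndexRow(int(sample_id), date.fromisoformat(data_date), int(value))


# A full index feature vector is then a list of such rows over consecutive dates.
deposit_balance = [parse_row("100002,2017-01-01,234322344")]
```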
A calendar feature vector describes the characteristics of each day along many dimensions. Its main purpose is to let the model recognize during training whether a day is a working day, a holiday, the first working day, the start of a month, the end of a month, the end of a year and so on, so that when a similar situation arises during prediction the model can adjust the prediction result according to the learned weights. A calendar feature vector includes a sample id, a data date and a date feature, for example:
id=110001, 2017-01-01, day of the week X
id=110002, 2017-01-01, whether it is a special holiday
id=110003, 2017-01-01, day x of a holiday
It should be noted that each calendar feature vector also consists of a run of consecutive dates. For example, 110001 contains the day-of-week description of every date from the bank's opening to the last day of the following year.
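The calendar dimensions listed above can be sketched with the Python standard library. This is a minimal illustration under stated assumptions: the `HOLIDAYS` table, the function names and the output keys are all hypothetical, and a real system would load holidays from an authoritative calendar source.

```python
from datetime import date, timedelta

# Hypothetical holiday table, used only for this illustration.
HOLIDAYS = {date(2017, 1, 1), date(2017, 1, 2)}


def calendar_features(day, holidays=HOLIDAYS):
    """Describe one date along several calendar dimensions."""
    # Position of `day` inside a run of consecutive holidays (0 if not a holiday).
    holiday_day = 0
    cursor = day
    while cursor in holidays:
        holiday_day += 1
        cursor -= timedelta(days=1)
    next_day = day + timedelta(days=1)
    return {
        "date": day.isoformat(),
        "weekday": day.isoweekday(),  # 1 = Monday ... 7 = Sunday
        "is_holiday": day in holidays,
        "holiday_day": holiday_day,
        "is_month_start": day.day == 1,
        "is_month_end": next_day.month != day.month,
        "is_year_end": day.month == 12 and day.day == 31,
    }


def calendar_vector(start, end, holidays=HOLIDAYS):
    """Build one feature row per date from start to end, inclusive."""
    days = (end - start).days + 1
    return [calendar_features(start + timedelta(days=i), holidays) for i in range(days)]
```

Each key corresponds to one calendar feature vector in the patent's sense (one sample id per dimension); they are grouped in a single dictionary here only for brevity.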
In this embodiment, the index feature vectors and calendar feature vectors are built in shell scripts using HQL (Hive Query Language) on the big data platform Hive. Related processing is carried out when building them. Referring to Fig. 3, step S10 includes:
Step S11: perform data cleaning on the extracted source data;
Step S12: group and aggregate the cleaned source data by date to obtain the feature vectors.
That is, data cleaning is first performed on the extracted source data, and the cleaned source data is then grouped and aggregated by date to obtain the feature vectors. For example, the feature data dated 2017-01-01 is put in one group and the feature data dated 2017-01-02 in another, yielding each feature vector.
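The patent performs steps S11 and S12 in HQL on Hive. The same idea can be sketched in plain Python for illustration; the cleaning rule (drop records with a missing date or amount) and the aggregation (sum per date) are assumptions, since the source does not specify either.

```python
from collections import defaultdict


def clean(records):
    """Step S11 (assumed rule): drop records with a missing date or amount."""
    return [(day, amount) for day, amount in records
            if day is not None and amount is not None]


def group_by_date(records):
    """Step S12 (assumed aggregation): sum amounts per date, sorted by date."""
    totals = defaultdict(int)
    for day, amount in records:
        totals[day] += amount
    return sorted(totals.items())


raw = [("2017-01-02", 5), ("2017-01-01", 10), ("2017-01-01", 20), ("2017-01-01", None)]
feature_vector = group_by_date(clean(raw))
```

ISO-formatted date strings sort chronologically, which is why a plain `sorted` call is enough here.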
Further, to improve the accuracy of data prediction, after step S12 the method further includes:
Step A: check the feature vectors;
Step B: when missing data is detected in a feature vector, complete the feature vector using an interpolation averaging algorithm.
That is, after the feature vectors are obtained, to prevent missing data from making subsequent predictions inaccurate, this embodiment first checks the feature vectors. Specifically, a non-null check is performed on the feature vector value for each date to determine whether the feature vector has missing data.
If missing data is detected in a feature vector, the feature vector is completed using the interpolation averaging algorithm. Specifically, the missing part of the feature vector is filled in according to pre-stored feature vectors, so that the feature vector becomes complete.
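The patent does not define the interpolation averaging algorithm in detail. One plausible reading, shown here purely as an assumption, is to replace each missing value with the mean of the nearest known values on either side. The function name `fill_missing` is hypothetical.

```python
def fill_missing(values):
    """Fill None entries with the mean of the nearest known neighbours.

    Assumed interpretation of "interpolation averaging"; at the edges only
    one neighbour exists, so that neighbour's value is used as-is.
    """
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        # Nearest known value to the left and to the right in the original data.
        left = next((values[j] for j in range(i - 1, -1, -1)
                     if values[j] is not None), None)
        right = next((values[j] for j in range(i + 1, len(values))
                      if values[j] is not None), None)
        known = [x for x in (left, right) if x is not None]
        if not known:
            raise ValueError("cannot fill a vector with no known values")
        filled[i] = sum(known) / len(known)
    return filled
```

The neighbours are read from the original list rather than the partially filled one, so consecutive gaps all receive the same average instead of drifting.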
In this embodiment, after the extracted source data is packaged into feature vectors, the feature vectors may optionally be placed in the feature engineering.
Step S20: load the feature vectors and build a deep learning algorithm model using hyperparameters.
In this embodiment, the deep learning algorithm model is an algorithm model that the prediction engine builds from hyperparameters in a preset framework using a preset language, so that the data supplied by the feature engineering can subsequently be trained on and predicted. The preset framework includes the TensorFlow framework, the Caffe framework or the Keras framework, and the preset language includes Python, R or Java. The prediction engine in this embodiment includes two core programs: a first program (denoted PredictEngine.py) and a second program (denoted DeepLearning.py). PredictEngine.py is mainly responsible for receiving the hyperparameters during building, preprocessing the feature vector input, and processing the predicted value output. DeepLearning.py interfaces with the TensorFlow framework, creates the model, computes the loss value, and continually optimizes the model using gradient descent. The detailed building process is as follows: the first program loads the feature vectors from the feature engineering using HQL or SQL; after the feature vectors are loaded, the second program builds the deep learning algorithm model using the hyperparameters.
In this embodiment, the hyperparameters are loaded before the deep learning algorithm model is built. They are likewise loaded by the first program from a configuration file; that is, the first program loads the hyperparameters that define the neural network from the configuration file. The hyperparameters include:
Neural network layer structure parameter layer: [65, 30, 2]
Training step size step: 0.01
Activation function activate_func: Relu/Tanh/Sigmoid
Dropout regularization keepprob: 0.8
Number of training iterations training_times: 150000
It should be understood that the above values are merely exemplary and may be set to other values as needed. After the hyperparameters are loaded, the deep learning algorithm model is built from them in the TensorFlow framework.
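A minimal sketch of how such a hyperparameter set might be held and checked before a model is built. The field names follow the list above; the `HyperParams` class, its validation rules and the `weight_shapes` helper are hypothetical, and no deep learning framework is used here.

```python
import math
from dataclasses import dataclass

# Activation functions named in the hyperparameter list.
ACTIVATIONS = {
    "Relu": lambda x: max(0.0, x),
    "Tanh": math.tanh,
    "Sigmoid": lambda x: 1.0 / (1.0 + math.exp(-x)),
}


@dataclass
class HyperParams:
    layer: list           # units per layer, input first
    step: float           # training step size
    activate_func: str    # key into ACTIVATIONS
    keepprob: float       # dropout keep probability
    training_times: int   # number of training iterations

    def validate(self):
        """Reject obviously invalid settings (assumed sanity rules)."""
        if len(self.layer) < 2 or any(n <= 0 for n in self.layer):
            raise ValueError("layer needs at least two positive sizes")
        if self.step <= 0:
            raise ValueError("step must be positive")
        if self.activate_func not in ACTIVATIONS:
            raise ValueError("unknown activation: " + self.activate_func)
        if not 0 < self.keepprob <= 1:
            raise ValueError("keepprob must be in (0, 1]")
        if self.training_times <= 0:
            raise ValueError("training_times must be positive")
        return self


def weight_shapes(params):
    """(inputs, outputs) of each dense layer implied by `layer`."""
    return list(zip(params.layer[:-1], params.layer[1:]))


params = HyperParams([65, 30, 2], 0.01, "Relu", 0.8, 150000).validate()
```

With the example values, `layer` implies two dense layers, one mapping 65 inputs to 30 units and one mapping 30 units to 2 outputs.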
Step S30: input the feature vectors into the deep learning algorithm model to train the deep learning algorithm model.
Step S40: predict on feature vectors based on the trained deep learning algorithm model to obtain a predicted value.
After the deep learning algorithm model has been built, the feature vectors are input into it to train it. Finally, feature vectors are predicted on with the trained deep learning algorithm model to obtain a predicted value.
In the data prediction method proposed by this embodiment, source data is first extracted from a data system and packaged into feature vectors; the feature vectors are then loaded and a deep learning algorithm model is built using hyperparameters; the feature vectors are then input into the deep learning algorithm model to train it; finally, feature vectors are predicted on with the trained deep learning algorithm model to obtain a predicted value. Because this scheme uses a deep learning algorithm model to predict on the feature vectors corresponding to the source data, the specific period of the source data does not need to be known. Even when the characteristics of the source data change, the model can automatically learn the new periodic characteristics and adjust the predicted values accordingly, which ensures the accuracy and convenience of data prediction.
Further, a second embodiment of the data prediction method of the present invention is proposed based on the first embodiment.
The second embodiment of the data prediction method differs from the first embodiment in that, after the step of loading the feature vectors, the data prediction method further includes:
Step C: standardize the loaded feature vectors by scaling them according to a preset ratio, to obtain standardized feature vectors.
In this embodiment, after the first program in the prediction engine has loaded the feature vectors from the feature engineering, the loaded feature vectors are standardized: they are scaled according to a preset ratio to obtain standardized feature vectors.
Specifically, step C may be implemented as follows:
1) Mode one, in which step C includes:
Step C1: determine the maximum value in the feature vector and add a preset value to it to obtain a sum;
Step C2: divide each value in the feature vector by the sum to obtain the standardized feature vector.
In this embodiment, the feature vector xv = [x1, x2, x3, ..., xn] is obtained, and the maximum value max(xv) in it is determined. The preset value is then added to this maximum to obtain the sum; the preset value can be set as needed and is taken here as 0.01, so the sum is max(xv) + 0.01. Each value in the feature vector is then divided by the sum to obtain the standardized feature vector: standardized value = x / (max(xv) + 0.01). The ratio of the standardized feature vector to the original feature vector is the preset ratio mentioned above.
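Mode one can be sketched in a few lines. The function name `scale_by_max` is hypothetical; the default preset of 0.01 is the example value from the text. The denominator is returned alongside the scaled values because it is needed later to restore predictions to their original scale.

```python
def scale_by_max(xv, preset=0.01):
    """Divide every value by max(xv) + preset (mode one).

    Returns the scaled vector and the denominator used.
    """
    denom = max(xv) + preset
    return [x / denom for x in xv], denom


scaled, denom = scale_by_max([1.0, 3.0], preset=1.0)
```

For non-negative inputs, the added preset keeps every scaled value strictly below 1 and avoids division by zero when the vector is all zeros.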
2) Mode two, in which step C includes:
Step C3: calculate the mean and the standard deviation of the values in the feature vector;
Step C4: subtract the mean from each value to obtain the differences;
Step C5: divide each difference by the standard deviation to obtain the standardized feature vector.
In this embodiment, the feature vector xv = [x1, x2, x3, ..., xn] is obtained, and the mean mean(xv) and the standard deviation std(xv) of its values are calculated. The mean is then subtracted from each value in the feature vector to obtain the differences, that is, x - mean(xv). Finally, each difference is divided by the standard deviation to obtain the standardized feature vector: standardized value = (x - mean(xv)) / std(xv). The ratio of the standardized feature vector to the original feature vector is the preset ratio mentioned above.
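Mode two is the familiar z-score transform. The sketch below uses the population standard deviation; the patent does not say whether population or sample standard deviation is meant, so that choice is an assumption, as is the function name `standardize`.

```python
from statistics import mean, pstdev


def standardize(xv):
    """Subtract the mean and divide by the standard deviation (mode two).

    Returns the standardized vector plus the mean and standard deviation,
    which are needed to restore predictions later.
    """
    mu = mean(xv)
    sigma = pstdev(xv)  # population standard deviation (assumed)
    if sigma == 0:
        raise ValueError("all values are identical; standard deviation is zero")
    return [(x - mu) / sigma for x in xv], mu, sigma


z, mu, sigma = standardize([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```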
It will be apparent to a skilled person that two kinds enumerated listed above carry out standard to the feature vector of loading Change what the mode handled was merely exemplary, those skilled in the art are using technological thought of the invention, according to its specific requirements The mode that the other various feature vectors to loading proposed are standardized is within the scope of the present invention, This is without exhaustive one by one.
After the loaded feature vector has been standardized and the standardized feature vector obtained, the standardized feature vector is input into the deep learning algorithm model to train the model. The standardized feature vector is subsequently predicted with the trained deep learning algorithm model to obtain a predicted value.
Further, after step S40, the method further includes:
Step D: restore the predicted value according to the preset ratio, and use the result as the final predicted value.
That is, after the standardized feature vector has been predicted with the trained deep learning algorithm model, the predicted value is inversely transformed: it is restored according to the above preset ratio to obtain a value that carries its original banking business meaning, such as a deposit balance, and the restored predicted value is used as the final predicted value.
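The restoration in step D is the inverse of whichever standardization was applied. The following sketch shows one plausible reading for both modes; the function names, the sample balances and the hypothetical model output of 0.8 are assumptions for illustration only.

```python
import statistics


def restore_from_max_scaling(prediction, denominator):
    """Invert x / (max(xv) + preset): multiply by the same denominator."""
    return prediction * denominator


def restore_from_zscore(prediction, mean, std):
    """Invert (x - mean) / std: multiply by std, then add the mean."""
    return prediction * std + mean


# Hypothetical historical balances used to fit the scaling parameters.
balances = [120.0, 80.0, 200.0, 150.0]
denominator = max(balances) + 0.01
mean = statistics.mean(balances)
std = statistics.pstdev(balances)  # assumption: population standard deviation

# Pretend the trained model returned 0.8 on the max-scaled series.
final_prediction = restore_from_max_scaling(0.8, denominator)

# Round trip in z-score mode: standardize 200.0, then restore it.
z = (200.0 - mean) / std
round_trip = restore_from_zscore(z, mean, std)
print(final_prediction, round_trip)
```

The scaling parameters must be the ones computed when the training vector was standardized; recomputing them from other data would restore the prediction to the wrong scale.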
In this embodiment, standardizing the loaded feature vector prevents data of widely differing scales in the feature vector from producing excessively divergent calculation results and thereby impairing the accuracy of the calculation. This embodiment thus improves the accuracy of data prediction.
Further, with reference to Fig. 4, a third embodiment of the data prediction method of the present invention is proposed on the basis of the first or second embodiment.
The third embodiment of the data prediction method differs from the first or second embodiment in that, after step S40, the data prediction method further includes:
Step S50: obtain the actual value corresponding to the predicted value when a preset time interval elapses;
Step S60: calculate the difference between the predicted value and the actual value, and calculate the ratio of the difference to the actual value;
Step S70: when the calculated ratio is greater than a preset ratio, adjust the deep learning algorithm model.
In this embodiment, after the feature vector has been predicted by the trained deep learning algorithm model and a predicted value obtained, a preset timer monitors the elapsed time, and the actual value corresponding to the predicted value is obtained when the preset time interval elapses. The difference between the predicted value and the actual value is then calculated, together with the ratio of the difference to the actual value; the formula may optionally be: (predicted value - actual value) / actual value. When the calculated ratio is greater than the preset ratio, the hyperparameters are adjusted on the basis of the difference between the actual value and the predicted value, and the deep learning algorithm model is optimized with the adjusted hyperparameters to improve the accuracy of subsequent data prediction.
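The check in steps S60 and S70 can be sketched as below. The formula (predicted value - actual value) / actual value is taken from the text; comparing its absolute value against the threshold, the 0.05 default and the function names are assumptions, and the rule for actually adjusting the hyperparameters is not specified in the text and is therefore left out.

```python
def relative_deviation(predicted, actual):
    """(predicted - actual) / actual, as given in the embodiment."""
    if actual == 0:
        # Assumed guard: the text does not cover a zero actual value.
        raise ValueError("actual value must not be zero")
    return (predicted - actual) / actual


def needs_adjustment(predicted, actual, preset_ratio=0.05):
    """True when the deviation exceeds the preset ratio in either direction.

    Using the absolute value is an assumption; the text only says
    'greater than the preset ratio'.
    """
    return abs(relative_deviation(predicted, actual)) > preset_ratio


# Hypothetical predicted and actual balances.
print(needs_adjustment(106.0, 100.0))  # deviation 0.06
print(needs_adjustment(103.0, 100.0))  # deviation 0.03
```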
The technical solution is preferably applied to bank liquidity prediction: the inflow and outflow of funds of each of the bank's service products over a future period are predicted and, according to the prediction result, situations in which outflow exceeds inflow and the bank has no money to pay out are prevented, ensuring that the bank can meet customers' withdrawal and similar demands at any time.
Experimental testing shows that the measured results of this method in a production environment are better than those of traditional empirical model algorithms. Specifically, no manually identified periodic linear algorithm is required; prediction with this method is entirely data-driven. The deviation between the predicted results and the actual results averages within 5%, and the accuracy of the prediction results is also greatly improved.
The present invention further provides a computer-readable recording medium.
A data prediction program is stored on the computer-readable recording medium, and the following steps are implemented when the data prediction program is executed by a processor:
Extract source data from a data system, and package the extracted source data into a feature vector;
Load the feature vector, and build a deep learning algorithm model using hyperparameters;
Input the feature vector into the deep learning algorithm model to train the deep learning algorithm model;
Predict the feature vector with the trained deep learning algorithm model to obtain a predicted value.
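The four steps above (package, build from hyperparameters, train, predict) can be illustrated end to end with a deliberately tiny stand-in model. Everything in this sketch is an assumption: a one-hidden-layer network written in plain Python replaces the deep learning algorithm model, whose architecture is not fixed by the text here, and the hyperparameter names, sample inputs and targets are invented for the example.

```python
import math
import random


def build_model(n_inputs, hidden_units, seed=0):
    """Build a one-hidden-layer network from hyperparameters (stand-in model)."""
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
               for _ in range(hidden_units)],
        "b1": [0.0] * hidden_units,
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(hidden_units)],
        "b2": 0.0,
    }


def forward(model, x):
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(model["w1"], model["b1"])]
    output = sum(w * h for w, h in zip(model["w2"], hidden)) + model["b2"]
    return hidden, output


def predict(model, x):
    return forward(model, x)[1]


def mse(model, xs, ys):
    return sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(model, xs, ys, learning_rate, epochs):
    """Plain stochastic gradient descent on the squared error."""
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            hidden, output = forward(model, x)
            error = output - y
            for j, h in enumerate(hidden):
                # Gradient through tanh, computed before w2[j] is updated.
                grad_hidden = error * model["w2"][j] * (1.0 - h * h)
                model["w2"][j] -= learning_rate * error * h
                model["b1"][j] -= learning_rate * grad_hidden
                for i, xi in enumerate(x):
                    model["w1"][j][i] -= learning_rate * grad_hidden * xi
            model["b2"] -= learning_rate * error
    return model


# Hypothetical, already standardized feature vectors and targets.
xs = [[0.1, 0.9], [0.4, 0.2], [0.8, 0.5], [0.3, 0.7], [0.9, 0.1]]
ys = [0.5 * a + 0.3 * b for a, b in xs]
hyperparameters = {"hidden_units": 4, "learning_rate": 0.1, "epochs": 300}

model = build_model(n_inputs=2, hidden_units=hyperparameters["hidden_units"])
loss_before = mse(model, xs, ys)
train(model, xs, ys, hyperparameters["learning_rate"], hyperparameters["epochs"])
loss_after = mse(model, xs, ys)
print(loss_before, loss_after, predict(model, [0.5, 0.5]))
```

Keeping the hyperparameters in a separate dictionary mirrors the idea that the model can be rebuilt or retuned later without changing the training code.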
Further, when the data prediction program is executed by the processor, the step of packaging the extracted source data into a feature vector is implemented as follows:
Perform data cleaning on the extracted source data;
Perform grouped statistics on the cleaned source data by date to obtain the feature vector.
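A minimal sketch of cleaning followed by grouping by date is given below. The record layout (ISO date string, amount), the cleaning rule (drop records that fail to parse) and the choice of a daily sum as the grouped statistic are all assumptions; the text only states that cleaning and date-based grouped statistics take place.

```python
from collections import defaultdict
from datetime import date


def clean(records):
    """Keep only records whose date and amount both parse (assumed rule)."""
    cleaned = []
    for raw_date, raw_amount in records:
        try:
            day = date.fromisoformat(raw_date)
            amount = float(raw_amount)
        except (TypeError, ValueError):
            continue  # drop malformed records
        cleaned.append((day, amount))
    return cleaned


def group_by_date(cleaned):
    """Sum the amounts per day and return (sorted days, daily totals)."""
    totals = defaultdict(float)
    for day, amount in cleaned:
        totals[day] += amount
    days = sorted(totals)
    return days, [totals[d] for d in days]


# Hypothetical source records.
records = [
    ("2017-12-01", "100.5"),
    ("2017-12-01", "50"),
    ("bad-date", "1"),
    ("2017-12-02", None),
    ("2017-12-03", "20"),
]
days, feature_vector = group_by_date(clean(records))
print(days, feature_vector)
```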
Further, after the step of performing grouped statistics on the cleaned source data by date to obtain the feature vector, the following steps are also implemented when the data prediction program is executed by the processor:
Check the feature vector;
When missing data is detected in the feature vector, complete the feature vector using an interpolation mean algorithm.
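The text does not define the interpolation mean algorithm. One plausible reading, sketched here as an assumption, fills each missing entry with the mean of the nearest observed values on either side, falling back to the single available neighbour at the ends of the vector. Missing entries are represented by `None`, which is also an assumption.

```python
def fill_missing(values):
    """Fill None entries with the mean of the nearest observed neighbours."""
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        # Nearest observed value to the left and to the right, if any.
        left = next((values[j] for j in range(i - 1, -1, -1)
                     if values[j] is not None), None)
        right = next((values[j] for j in range(i + 1, len(values))
                      if values[j] is not None), None)
        neighbours = [n for n in (left, right) if n is not None]
        if not neighbours:
            raise ValueError("cannot fill a vector with no observed values")
        filled[i] = sum(neighbours) / len(neighbours)
    return filled


# Hypothetical daily series with two gaps.
completed = fill_missing([10.0, None, 20.0, None])
print(completed)
```

Neighbours are looked up in the original list rather than the partly filled one, so a filled value never influences another gap.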
Further, after the step of loading the feature vector, the following steps are also implemented when the data prediction program is executed by the processor:
Standardize the loaded feature vector by scaling the feature vector according to a preset ratio, to obtain a standardized feature vector;
After the step of predicting the feature vector with the trained deep learning algorithm model to obtain a predicted value, the method further includes:
Restore the predicted value according to the preset ratio, and use the result as the final predicted value.
Further, when the data prediction program is executed by the processor, the step of standardizing the loaded feature vector by scaling the feature vector according to a preset ratio to obtain a standardized feature vector is implemented as follows:
Determine the maximum value in the feature vector, and add a preset value to the determined maximum to obtain a sum;
Divide each value in the feature vector by the sum to obtain the standardized feature vector.
Further, when the data prediction program is executed by the processor, the step of standardizing the loaded feature vector by scaling the feature vector according to a preset ratio to obtain a standardized feature vector is implemented as follows:
Calculate the mean and the standard deviation of the values in the feature vector;
Subtract the mean from each value to obtain the respective differences;
Divide each difference by the standard deviation to obtain the standardized feature vector.
Further, after the step of predicting the feature vector with the trained deep learning algorithm model to obtain a predicted value, the following steps are also implemented when the data prediction program is executed by the processor:
Obtain the actual value corresponding to the predicted value when a preset time interval elapses;
Calculate the difference between the predicted value and the actual value, and calculate the ratio of the difference to the actual value;
When the calculated ratio is greater than a preset ratio, adjust the deep learning algorithm model.
Further, the feature vector includes a calendar feature vector and an index feature vector.
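As an illustration of how a calendar feature vector and an index feature vector might be combined, consider the sketch below. The specific calendar features (weekday, day of month, weekend flag, month-end flag) and the index values (a daily inflow and outflow) are hypothetical choices for the example and are not taken from the text.

```python
import calendar
from datetime import date


def calendar_features(day):
    """Hypothetical calendar features derived from a date."""
    last_day = calendar.monthrange(day.year, day.month)[1]
    return [
        float(day.weekday()),        # 0 = Monday ... 6 = Sunday
        float(day.day),              # day of month
        float(day.weekday() >= 5),   # weekend flag
        float(day.day == last_day),  # month-end flag
    ]


def build_feature_vector(day, index_values):
    """Concatenate calendar features with index (business metric) values."""
    return calendar_features(day) + list(index_values)


# Hypothetical index values: daily inflow and outflow.
vector = build_feature_vector(date(2017, 12, 31), [1500.0, 1200.0])
print(vector)
```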
In the technical solution proposed by the present invention, the following steps are implemented when the data prediction program is executed by the processor: source data is first extracted from a data system and packaged into a feature vector; the feature vector is then loaded and a deep learning algorithm model is built using hyperparameters; the feature vector is next input into the deep learning algorithm model to train the model; and finally the feature vector is predicted with the trained deep learning algorithm model to obtain a predicted value. Because this solution uses a deep learning algorithm model to predict the feature vector corresponding to the source data, the specific period of the source data does not need to be known. Even if the characteristics of the source data change, the model can automatically learn the new periodic characteristics and adjust the predicted data values accordingly, ensuring the accuracy and convenience of data prediction.
It should be noted that, herein, the terms "include", "comprise" or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or device that includes the element.
The above embodiments of the present invention are for description only and do not represent the relative merits of the embodiments.
Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to perform the methods described in the embodiments of the present invention.
The above are only preferred embodiments of the present invention and do not limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims (10)

1. A data prediction method, characterized in that the data prediction method includes:
extracting source data from a data system, and packaging the extracted source data into a feature vector;
loading the feature vector, and building a deep learning algorithm model using hyperparameters;
inputting the feature vector into the deep learning algorithm model to train the deep learning algorithm model;
predicting the feature vector with the trained deep learning algorithm model to obtain a predicted value.
2. The data prediction method according to claim 1, characterized in that the step of packaging the extracted source data into a feature vector includes:
performing data cleaning on the extracted source data;
performing grouped statistics on the cleaned source data by date to obtain the feature vector.
3. The data prediction method according to claim 2, characterized in that, after the step of performing grouped statistics on the cleaned source data by date to obtain the feature vector, the method further includes:
checking the feature vector;
when missing data is detected in the feature vector, completing the feature vector using an interpolation mean algorithm.
4. The data prediction method according to claim 1, characterized in that, after the step of loading the feature vector, the method further includes:
standardizing the loaded feature vector by scaling the feature vector according to a preset ratio, to obtain a standardized feature vector;
and, after the step of predicting the feature vector with the trained deep learning algorithm model to obtain a predicted value, the method further includes:
restoring the predicted value according to the preset ratio, and using the result as the final predicted value.
5. The data prediction method according to claim 4, characterized in that the step of standardizing the loaded feature vector by scaling the feature vector according to a preset ratio to obtain a standardized feature vector includes:
determining the maximum value in the feature vector, and adding a preset value to the determined maximum to obtain a sum;
dividing each value in the feature vector by the sum to obtain the standardized feature vector.
6. The data prediction method according to claim 4, characterized in that the step of standardizing the loaded feature vector by scaling the feature vector according to a preset ratio to obtain a standardized feature vector further includes:
calculating the mean and the standard deviation of the values in the feature vector;
subtracting the mean from each value to obtain the respective differences;
dividing each difference by the standard deviation to obtain the standardized feature vector.
7. The data prediction method according to claim 1, characterized in that, after the step of predicting the feature vector with the trained deep learning algorithm model to obtain a predicted value, the method further includes:
obtaining the actual value corresponding to the predicted value when a preset time interval elapses;
calculating the difference between the predicted value and the actual value, and calculating the ratio of the difference to the actual value;
when the calculated ratio is greater than a preset ratio, adjusting the deep learning algorithm model.
8. The data prediction method according to any one of claims 1 to 7, characterized in that the feature vector includes a calendar feature vector and an index feature vector.
9. A data prediction system, characterized in that the data prediction system includes a processor, a memory, and a data prediction program that is stored in the memory and can run on the processor, wherein the steps of the data prediction method according to any one of claims 1 to 8 are implemented when the data prediction program is executed by the processor.
10. A computer-readable recording medium, characterized in that a data prediction program is stored on the computer-readable recording medium, and the steps of the data prediction method according to any one of claims 1 to 8 are implemented when the data prediction program is executed by a processor.
CN201711363043.6A 2017-12-18 2017-12-18 Data predication method, system and computer-readable recording medium Pending CN107977754A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711363043.6A CN107977754A (en) 2017-12-18 2017-12-18 Data predication method, system and computer-readable recording medium

Publications (1)

Publication Number Publication Date
CN107977754A true CN107977754A (en) 2018-05-01

Family

ID=62006683

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711363043.6A Pending CN107977754A (en) 2017-12-18 2017-12-18 Data predication method, system and computer-readable recording medium

Country Status (1)

Country Link
CN (1) CN107977754A (en)

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108897818A (en) * 2018-06-20 2018-11-27 北京三快在线科技有限公司 Determine the method, apparatus and readable storage medium storing program for executing of data handling procedure ageing state
CN109047698A (en) * 2018-09-03 2018-12-21 中冶连铸技术工程有限责任公司 A kind of continuous casting billets of fixed weight scale on-line prediction method
CN109363789A (en) * 2018-10-19 2019-02-22 上海交通大学 For predicting the method and data collection system of sono-explorer
CN110363661A (en) * 2019-08-05 2019-10-22 中国工商银行股份有限公司 Bank liquidity prediction technique and device
CN110445629A (en) * 2018-05-03 2019-11-12 佛山市顺德区美的电热电器制造有限公司 A kind of server concurrency prediction technique and device
CN110634061A (en) * 2018-06-22 2019-12-31 马上消费金融股份有限公司 Capital demand prediction method and device and terminal equipment
CN111090571A (en) * 2019-12-18 2020-05-01 中国建设银行股份有限公司 Information system maintenance method, device and computer storage medium
CN111339072A (en) * 2020-02-23 2020-06-26 中国平安财产保险股份有限公司 User behavior based change value analysis method and device, electronic device and medium
CN111383101A (en) * 2020-03-25 2020-07-07 深圳前海微众银行股份有限公司 Post-loan risk monitoring method, device, equipment and computer-readable storage medium
CN111554094A (en) * 2020-05-20 2020-08-18 四川万网鑫成信息科技有限公司 Journey prediction model based on vehicle journey, driving behavior and traffic information
WO2020164275A1 (en) * 2019-02-15 2020-08-20 平安科技(深圳)有限公司 Processing result prediction method and apparatus based on prediction model, and server
CN112016794A (en) * 2020-07-15 2020-12-01 北京淇瑀信息科技有限公司 Resource quota management method and device and electronic equipment
CN113129127A (en) * 2021-04-21 2021-07-16 建信金融科技有限责任公司 Early warning method and device
CN113487425A (en) * 2021-08-03 2021-10-08 北京神州数字科技有限公司 Method and system for backtracking daytime liquidity condition based on historical data

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110445629A (en) * 2018-05-03 2019-11-12 佛山市顺德区美的电热电器制造有限公司 A kind of server concurrency prediction technique and device
CN108897818A (en) * 2018-06-20 2018-11-27 北京三快在线科技有限公司 Determine the method, apparatus and readable storage medium storing program for executing of data handling procedure ageing state
CN110634061A (en) * 2018-06-22 2019-12-31 马上消费金融股份有限公司 Capital demand prediction method and device and terminal equipment
CN109047698A (en) * 2018-09-03 2018-12-21 中冶连铸技术工程有限责任公司 A kind of continuous casting billets of fixed weight scale on-line prediction method
CN109363789A (en) * 2018-10-19 2019-02-22 上海交通大学 For predicting the method and data collection system of sono-explorer
WO2020164275A1 (en) * 2019-02-15 2020-08-20 平安科技(深圳)有限公司 Processing result prediction method and apparatus based on prediction model, and server
CN110363661A (en) * 2019-08-05 2019-10-22 中国工商银行股份有限公司 Bank liquidity prediction technique and device
CN111090571A (en) * 2019-12-18 2020-05-01 中国建设银行股份有限公司 Information system maintenance method, device and computer storage medium
CN111090571B (en) * 2019-12-18 2024-01-23 中国建设银行股份有限公司 Maintenance method, device and computer storage medium for information system
CN111339072B (en) * 2020-02-23 2023-09-15 中国平安财产保险股份有限公司 User behavior-based change value analysis method and device, electronic equipment and medium
CN111339072A (en) * 2020-02-23 2020-06-26 中国平安财产保险股份有限公司 User behavior based change value analysis method and device, electronic device and medium
CN111383101A (en) * 2020-03-25 2020-07-07 深圳前海微众银行股份有限公司 Post-loan risk monitoring method, device, equipment and computer-readable storage medium
CN111383101B (en) * 2020-03-25 2024-03-15 深圳前海微众银行股份有限公司 Post-credit risk monitoring method, post-credit risk monitoring device, post-credit risk monitoring equipment and computer readable storage medium
CN111554094A (en) * 2020-05-20 2020-08-18 四川万网鑫成信息科技有限公司 Journey prediction model based on vehicle journey, driving behavior and traffic information
CN112016794A (en) * 2020-07-15 2020-12-01 北京淇瑀信息科技有限公司 Resource quota management method and device and electronic equipment
CN112016794B (en) * 2020-07-15 2024-03-01 北京淇瑀信息科技有限公司 Resource quota management method and device and electronic equipment
CN113129127A (en) * 2021-04-21 2021-07-16 建信金融科技有限责任公司 Early warning method and device
CN113487425A (en) * 2021-08-03 2021-10-08 北京神州数字科技有限公司 Method and system for backtracking daytime liquidity condition based on historical data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180501