CN108399248A - Time series data prediction method, apparatus and device - Google Patents
Time series data prediction method, apparatus and device
- Publication number
- CN108399248A CN201810174986.2A
- Authority
- CN
- China
- Prior art keywords
- time series
- series data
- antibody
- sequence
- deep learning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2474—Sequence data queries, e.g. querying versioned data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/12—Computing arrangements based on biological models using genetic models
- G06N3/126—Evolutionary algorithms, e.g. genetic algorithms or genetic programming
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Biophysics (AREA)
- Databases & Information Systems (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Computational Linguistics (AREA)
- Biomedical Technology (AREA)
- Mathematical Analysis (AREA)
- Pure & Applied Mathematics (AREA)
- Probability & Statistics with Applications (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Molecular Biology (AREA)
- Computational Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Mathematical Optimization (AREA)
- Operations Research (AREA)
- Fuzzy Systems (AREA)
- Algebra (AREA)
- Quality & Reliability (AREA)
- Physiology (AREA)
- Genetics & Genomics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a time series data prediction method, apparatus and device. The method includes: obtaining historical time series data, and performing data cleaning and data slicing on the historical time series data to obtain a corresponding time series data sequence; performing a stationarization operation on the time series data sequence, and performing feature reconstruction on the stationarized time series data sequence using an immune genetic feature reconstruction algorithm to obtain a corresponding feature sequence; and obtaining a deep learning model trained on the feature sequence, and performing time series data prediction using the deep learning model. Unlike the prior art, which obtains data set features by sampling, the present application ensures the validity of the obtained time series features through the above data preprocessing, stationarization and feature reconstruction steps, so that the deep learning model can learn the temporal features of the time series data, which ensures the prediction accuracy of the deep learning model.
Description
Technical field
The present invention relates to the field of deep learning technology, and more specifically to a time series data prediction method, apparatus and device.
Background
The exchange rate (ExRate for short), also called the foreign exchange rate or foreign exchange quotation, is the ratio at which one currency is exchanged for another and reflects changes in the relative prices of currencies. With the legalization of the global floating exchange rate system and the strengthening trend toward world economic integration, foreign exchange has become an important component of many capital products; as an important capital element, its prediction has therefore attracted the attention of investors and securities markets.
In recent years, with the development of heuristic algorithms, various machine learning algorithms have been applied to exchange rate forecasting. Specifically, the prior art proposes using a generalized regression neural network (GRNN) model to simulate the nonlinear variation of the exchange rate. However, such a static network model learns the features of the input data set by random sampling, so the model cannot learn the temporal features of exchange rate changes.
In summary, the prior-art technical solutions for exchange rate forecasting suffer from the problem that the model cannot learn the temporal features of exchange rate changes.
Summary of the invention
The object of the present invention is to provide a time series data prediction method, apparatus and device that, by determining the features of the time series data sequence, enable the model to effectively learn the temporal features of exchange rate changes.
To achieve the above object, the present invention provides the following technical solutions:
A time series data prediction method, including:
Obtaining historical time series data, and performing data cleaning and data slicing on the historical time series data to obtain a corresponding time series data sequence;
Performing a stationarization operation on the time series data sequence, and performing feature reconstruction on the stationarized time series data sequence using an immune genetic feature reconstruction algorithm to obtain a corresponding feature sequence;
Obtaining a deep learning model trained on the feature sequence, and performing time series data prediction using the deep learning model.
Preferably, before performing time series data prediction using the deep learning model, the method further includes:
Calculating the prediction accuracy of the deep learning model based on the feature sequence; if the prediction accuracy meets a preset requirement, determining that the deep learning model can be used for time series data prediction; otherwise, determining that the deep learning model cannot be used for time series data prediction.
Preferably, calculating the prediction accuracy of the deep learning model using the feature sequence includes:
Obtaining a training set and a test set, where the training set and the test set are obtained by dividing the feature sequence into multiple subsequences and grouping the subsequences, and the deep learning model is trained on the training set;
Inputting each subsequence in the training set into the deep learning model, and forming a training residual sequence from the residuals between each time series datum output by the deep learning model and the corresponding actual time series datum; inputting each subsequence in the test set into the deep learning model, and forming a test residual sequence from the residuals between each time series datum output by the deep learning model and the corresponding actual time series datum;
Calculating the LB value of the training residual sequence and the LB value of the test residual sequence using the LB test;
Correspondingly, judging whether the prediction accuracy meets the preset requirement includes:
Comparing the LB values corresponding to the training set and the test set with the preset requirement respectively; if the LB values corresponding to both the training set and the test set meet the preset requirement, determining that the deep learning model can be used for time series data prediction; otherwise, determining that the deep learning model cannot be used for time series data prediction.
Preferably, after comparing the LB values corresponding to the training set and the test set with the preset requirement respectively, the method further includes:
If the LB value corresponding to the test set does not meet the preset requirement while the LB value corresponding to the training set does, returning to the step of performing feature reconstruction on the stationarized time series data sequence using the immune genetic feature reconstruction algorithm, until the LB values corresponding to both the training set and the test set meet the preset requirement.
Preferably, performing feature reconstruction on the stationarized time series data sequence using the immune genetic feature reconstruction algorithm to obtain the corresponding feature sequence includes:
Obtaining the residual derivative vector X of the stationarized time series data sequence x according to the following formula:
$X = [x, x', x^{(2)}, \ldots, x^{(n)}]^{T}$;
Obtaining the following formula, which yields the corresponding feature sequence from the derivative vector X:
[formula not reproduced in the source text]
Obtaining the following affinity evaluation function:
[formula not reproduced in the source text]
Wherein A is an n × n square matrix, B is an n × 1 column vector, n is the order of the feature transformation, m is the length of the sequences input to the deep learning model obtained from each training set, [symbol not reproduced in the source text] is the time series data output by the deep learning model after the corresponding subsequence of the training set is input to it, Y denotes the actual time series data corresponding to the time series data output by the deep learning model, and N is the number of subsequences in the training set; deep learning models, training sets and values of the affinity evaluation function are in one-to-one correspondence. The training set is obtained as follows: determining the feature sequence corresponding to each generated group of values of A and B, dividing each feature sequence into multiple subsequences, and grouping the subsequences to obtain the training set and test set corresponding to each feature sequence;
Determining, using the immune genetic feature reconstruction algorithm, the values of A and B that minimize the value of the affinity evaluation function, and determining that the feature sequence corresponding to these values of A and B is the feature sequence finally obtained by feature reconstruction.
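The derivative vector X above can be sketched numerically. The following is a minimal illustration, assuming that the k-th derivative of the stationarized sequence is approximated by its k-th order finite difference and that all rows are aligned to the same time steps; the function name `derivative_vector` and the alignment choice are assumptions made for this sketch, not details taken from the patent. The mapping from X to the feature sequence through A and B is not shown, because its formula is not reproduced in this text.

```python
import numpy as np

def derivative_vector(x, n):
    """Stack x and its finite differences up to order n (one row per order).

    Assumption: the k-th derivative is approximated by the k-th order
    backward difference; rows are trimmed so that every column refers to
    the same time step.
    """
    x = np.asarray(x, dtype=float)
    rows = []
    for k in range(n + 1):
        dk = np.diff(x, n=k)      # k-th order difference, length len(x) - k
        rows.append(dk[n - k:])   # keep the last len(x) - n time steps
    return np.vstack(rows)

# Example: x_t = t^2 for t = 1..5, feature transformation order n = 2.
X = derivative_vector([1.0, 4.0, 9.0, 16.0, 25.0], n=2)
```

Each column of `X` then holds the value, first difference and second difference of the sequence at one time step.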
Preferably, using the immune genetic feature reconstruction algorithm to determine the values of A and B that minimize the value of the affinity evaluation function includes:
Generating multiple antibodies, each antibody corresponding to one group of values of A and B;
Eliminating the antibodies whose corresponding A is a singular matrix, and eliminating the antibodies that match an antibody in the memory library, where the antibodies in the memory library are illegal antibodies obtained from prior tests;
Calculating the value of the affinity evaluation function for each antibody and judging whether any termination condition is met. If so, determining that the antibody with the smallest affinity evaluation function value is the best-matching antibody, and that the values of A and B corresponding to the best-matching antibody are the values that minimize the affinity evaluation function. If not, classifying the antibodies by their first two digits, performing the genetic mutation operation on the antibodies in each class to obtain new antibodies, and returning, with the new antibodies, to the step of eliminating the antibodies whose corresponding A is a singular matrix. The termination conditions include: some antibody has an affinity evaluation function value below a predetermined threshold; the number of returns to the step of eliminating the antibodies whose corresponding A is a singular matrix reaches a preset number; and the smallest affinity evaluation function value determined has not changed for a preset number of consecutive iterations.
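The elimination step above can be illustrated with a short sketch. It assumes that an antibody is stored as a pair (A, B) of NumPy arrays and that "matching" an antibody in the memory library means equality after rounding; both the representation and the helper names `antibody_key` and `eliminate` are hypothetical choices for illustration only.

```python
import numpy as np

def antibody_key(A, B, decimals=6):
    """Hashable fingerprint of an antibody (assumed matching rule: rounded equality)."""
    flat = np.concatenate([np.ravel(A), np.ravel(B)])
    return tuple(np.round(flat, decimals))

def eliminate(antibodies, memory_library):
    """Drop antibodies whose A is singular or that match a known illegal antibody."""
    kept = []
    for A, B in antibodies:
        if np.linalg.matrix_rank(A) < A.shape[0]:
            continue                      # A is singular
        if antibody_key(A, B) in memory_library:
            continue                      # matches an illegal antibody
        kept.append((A, B))
    return kept

good = (np.eye(2), np.ones(2))
singular = (np.ones((2, 2)), np.ones(2))        # rank 1, hence singular
banned = (2.0 * np.eye(2), np.zeros(2))
memory_library = {antibody_key(*banned)}
survivors = eliminate([good, singular, banned], memory_library)
```

Only `good` survives here: `singular` fails the rank check and `banned` is found in the memory library.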
Preferably, performing the genetic mutation operation on the antibodies in each class to obtain new antibodies includes:
Determining that each class of antibodies is one parent, and obtaining the ratios of copying, crossover and mutation as p, q and r respectively;
Selecting from each parent the pK antibodies with the smallest affinity evaluation function values and retaining them in the offspring; randomly selecting qK/2 pairs of antibodies from each parent, exchanging the i-th position and the subsequent sequence of each pair, and retaining the antibodies obtained after the exchange in the offspring; randomly selecting rK antibodies from each parent, replacing the j-th position of each antibody with a random number within a preset value range, and retaining the antibodies obtained after the replacement in the offspring; where K is the number of antibodies in each class, and i and j are random numbers;
Determining that the antibodies contained in all the offspring are the new antibodies obtained by the genetic mutation operation.
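The copy, crossover and mutation steps above can be sketched as follows. This is only an illustration under stated assumptions: each antibody is encoded as one row of a float array (for example the flattened entries of A and B), p + q + r = 1 so that the offspring has the same size as the parent, and the function name `next_generation` and its parameters are hypothetical.

```python
import numpy as np

def next_generation(parent, affinity, p, q, r, low, high, rng):
    """parent: (K, L) array, one antibody per row; affinity: (K,), lower is better."""
    K, L = parent.shape
    n_copy, n_pairs, n_mut = round(p * K), round(q * K / 2), round(r * K)

    # Copy: keep the pK antibodies with the smallest affinity value.
    children = [parent[i].copy() for i in np.argsort(affinity)[:n_copy]]

    # Crossover: for each random pair, exchange position i and everything after it.
    for _ in range(n_pairs):
        pair = parent[rng.choice(K, size=2, replace=False)]  # fancy indexing copies
        i = rng.integers(L)
        pair[0, i:], pair[1, i:] = pair[1, i:].copy(), pair[0, i:].copy()
        children.extend(pair)

    # Mutation: replace position j of rK random antibodies with a random value.
    for idx in rng.choice(K, size=n_mut, replace=False):
        child = parent[idx].copy()
        child[rng.integers(L)] = rng.uniform(low, high)
        children.append(child)

    return np.array(children)

rng = np.random.default_rng(0)
parent = rng.uniform(-1.0, 1.0, size=(10, 6))   # K = 10 antibodies of length 6
affinity = np.arange(10.0)                      # antibody 0 is the best
children = next_generation(parent, affinity, p=0.2, q=0.6, r=0.2,
                           low=-1.0, high=1.0, rng=rng)
```

With these ratios the offspring contains 2 copied, 6 crossed-over and 2 mutated antibodies, so the class size K is preserved from one generation to the next.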
Preferably, obtaining the deep learning model trained on the feature sequence includes:
Obtaining a GRU model trained on the feature sequence.
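For readers unfamiliar with GRU (gated recurrent unit) models, a single forward step of a standard GRU cell can be sketched in NumPy as below. This uses one common textbook formulation of the gates; the patent text here does not give the equations, so the parameter layout (`Wz`, `Uz`, `bz`, and so on), the sizes and the random initialization are all assumptions for illustration, and no training is shown.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def gru_step(x, h, params):
    """One GRU time step. x: (d,) input, h: (k,) previous hidden state."""
    z = sigmoid(params["Wz"] @ x + params["Uz"] @ h + params["bz"])  # update gate
    r = sigmoid(params["Wr"] @ x + params["Ur"] @ h + params["br"])  # reset gate
    h_tilde = np.tanh(params["Wh"] @ x + params["Uh"] @ (r * h) + params["bh"])
    return (1.0 - z) * h + z * h_tilde   # blend old state and candidate state

def init_params(d, k, rng):
    """Small random weights for the three gates (hypothetical initialization)."""
    params = {}
    for gate in "zrh":
        params["W" + gate] = rng.normal(scale=0.1, size=(k, d))
        params["U" + gate] = rng.normal(scale=0.1, size=(k, k))
        params["b" + gate] = np.zeros(k)
    return params

rng = np.random.default_rng(0)
params = init_params(d=3, k=4, rng=rng)
h = np.zeros(4)
for x in rng.normal(size=(5, 3)):    # a subsequence of 5 feature vectors
    h = gru_step(x, h, params)
```

The hidden state `h` carries information forward from one time step to the next, which is what lets a recurrent model of this kind pick up temporal features that a static network fed with randomly sampled inputs would miss.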
A time series data prediction apparatus, including:
A preprocessing module, configured to: obtain historical time series data, and perform data cleaning and data slicing on the historical time series data to obtain a corresponding time series data sequence;
A feature reconstruction module, configured to: perform a stationarization operation on the time series data sequence, and perform feature reconstruction on the stationarized time series data sequence using an immune genetic feature reconstruction algorithm to obtain a corresponding feature sequence;
A prediction module, configured to: obtain a deep learning model trained on the feature sequence, and perform time series data prediction using the deep learning model.
A time series data prediction device, including:
A memory, for storing a computer program;
A processor, for implementing the steps of any one of the above time series data prediction methods when executing the computer program.
The present invention provides a time series data prediction method, apparatus and device. The method includes: obtaining historical time series data, and performing data cleaning and data slicing on the historical time series data to obtain a corresponding time series data sequence; performing a stationarization operation on the time series data sequence, and performing feature reconstruction on the stationarized time series data sequence using an immune genetic feature reconstruction algorithm to obtain a corresponding feature sequence; and obtaining a deep learning model trained on the feature sequence, and performing time series data prediction using the deep learning model. In the above technical solution provided by the present invention, data cleaning and data slicing ensure the integrity and accuracy of the historical time series data; the stationarization operation ensures the stationarity of the time series data sequence; and feature reconstruction of the time series data sequence using the immune genetic feature reconstruction algorithm ensures effective extraction of the features of the time series data sequence. The above technical solution thereby further ensures the accuracy of time series data prediction performed by the deep learning model corresponding to the feature sequence. It can be seen that, unlike the prior art, which obtains data set features by sampling, the present application ensures the validity of the obtained time series features through the above steps, and in turn ensures that the deep learning model can learn the temporal features of the time series data. Moreover, the above technical solution disclosed in the present application can be carried out automatically without manual participation, which avoids the adverse effects of human error and further ensures the accuracy of temporal feature acquisition.
Brief description of the drawings
In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from the provided drawings without creative effort.
Fig. 1 is a flow chart of a time series data prediction method provided by an embodiment of the present invention;
Fig. 2 is a structure diagram of the GRU model in a time series data prediction method provided by an embodiment of the present invention;
Fig. 3 is a network structure diagram of a GRU model unit in a time series data prediction method provided by an embodiment of the present invention;
Fig. 4 is a model fitting curve in a time series data prediction method provided by an embodiment of the present invention;
Fig. 5 is a model accuracy curve in a time series data prediction method provided by an embodiment of the present invention;
Fig. 6 is a scatter plot of the sequence after first-order difference stationarization in a time series data prediction method provided by an embodiment of the present invention;
Fig. 7 is an autocorrelation plot of the sequence after first-order difference stationarization in a time series data prediction method provided by an embodiment of the present invention;
Fig. 8 is a partial autocorrelation plot of the sequence after first-order difference stationarization in a time series data prediction method provided by an embodiment of the present invention;
Fig. 9 is a schematic diagram of ARCH model prediction results in a time series data prediction method provided by an embodiment of the present invention;
Fig. 10 is a schematic diagram of GARCH model prediction results in a time series data prediction method provided by an embodiment of the present invention;
Fig. 11 is a schematic diagram of binarized prediction results of the GARCH model in a time series data prediction method provided by an embodiment of the present invention;
Fig. 12 is a schematic diagram of reconstructed-sequence prediction results based on the GRU model in a time series data prediction method provided by an embodiment of the present invention;
Fig. 13 is a schematic diagram of binarized prediction results based on the GRU model in a time series data prediction method provided by an embodiment of the present invention;
Fig. 14 is a schematic diagram of the residual LB test at each lag order in a time series data prediction method provided by an embodiment of the present invention;
Fig. 15 is a structural schematic diagram of a time series data prediction apparatus provided by an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Referring to Fig. 1, which shows a flow chart of a time series data prediction method provided by an embodiment of the present invention, the method may include the following steps:
S11: Obtain historical time series data, and perform data cleaning and data slicing on the historical time series data to obtain a corresponding time series data sequence.
It should be noted that the time series data in the present application may be asset sequences with time series characteristics, such as exchange rates, stocks, futures and precious metals, as well as other data with time series characteristics, all of which fall within the protection scope of the present invention. In addition, the present application applies both to structured and to unstructured time series data; that is, it is general across time series data of different structures.
Historical time series data are time series data generated before the current moment. The historical time series data obtained in the present application may be the time series data generated within a period before the current moment that is set according to actual needs. Taking the exchange rate as an example: trading on the foreign exchange market is global and continuous, but the opening and closing of the foreign exchange market in each region are local, so the exchange rate data actually obtained are usually large in volume and incompletely recorded, and cannot be used directly for exchange rate forecasting. Therefore, to solve the forecasting problem for complex time series, data preprocessing is first performed on the obtained historical time series data with temporal features, such as the exchange rate. The data preprocessing may include data cleaning and data slicing. Specifically, data cleaning rejects the incomplete data in the historical time series data, or completes the missing data by interpolation, and removes systematic errors from the historical time series data by filtering while retaining the main pattern. In the present invention, data cleaning may be implemented by rejecting incomplete data and removing systematic errors from the historical time series data by mean filtering with a window of 3. Data slicing samples and splits the exchange rate sequence data according to the forecasting requirement, for example taking from the historical time series data one time series datum generated every preset time period (e.g., 1 minute) and forming the time series data sequence from the data so obtained; step S11 therefore yields one time series data sequence. The above data preprocessing ensures the integrity and accuracy of the historical time series data.
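The preprocessing in step S11 can be sketched as follows. This is a minimal illustration, assuming that incomplete records are marked as NaN, that the mean filter is a plain moving average with window 3, and that slicing keeps one sample every fixed number of records; the function names `clean_series` and `slice_series` are hypothetical.

```python
import numpy as np

def clean_series(values, window=3):
    """Reject incomplete (NaN) records, then apply a moving-average filter."""
    x = np.asarray(values, dtype=float)
    x = x[~np.isnan(x)]                           # reject incomplete records
    kernel = np.ones(window) / window
    return np.convolve(x, kernel, mode="valid")   # mean filter with the given window

def slice_series(values, step):
    """Keep one sample every `step` records (for example one quote per minute)."""
    return np.asarray(values)[::step]

raw = [1.0, 2.0, np.nan, 3.0, 4.0, 5.0, 6.0]
smoothed = clean_series(raw)                 # averages of 3 consecutive valid points
sequence = slice_series(smoothed, step=2)    # the resulting time series data sequence
```

Interpolating the missing values instead of rejecting them, which the text also allows, would replace the NaN-dropping line with an interpolation step.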
S12: Perform a stationarization operation on the time series data sequence, and perform feature reconstruction on the stationarized time series data sequence using an immune genetic feature reconstruction algorithm to obtain a corresponding feature sequence.
It should be noted that, after the feature sequence is obtained, it also needs to be converted into a labeled data set suitable for supervised learning; since this step is consistent with the implementation principle of the corresponding prior-art technical solution, it is not repeated here. Because a time series data sequence is usually non-stationary, it must be stationarized before being converted into the data sets (training set and test set) used to train the deep learning model. To determine the sequence length and feature values of the data set finally fed to the deep learning model, the present application uses the immune genetic feature reconstruction algorithm IGFRA to perform feature reconstruction on the time series data sequence.
Regarding the stationarization operation, it should be noted that the non-stationarity of a time series data sequence causes the deep learning model to focus on learning the trend of the sample sequence, while the fluctuations of the time series data are treated as noise and ignored; the sequence is therefore stationarized first during feature construction. A common method is first-order difference stationarization. According to time series analysis theory, the volatility of time series data can be approximated by the difference of the residual, and complex sequence fluctuations can be approximately characterized by a linear combination of higher-order derivatives of the residual. The above stationarization operation is consistent with the implementation principle of the corresponding prior-art technical solution and is not repeated here.
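First-order difference stationarization, the common method named above, can be sketched in a few lines. The sample values and the helper names `first_difference` and `restore` are illustrative assumptions; `restore` shows how a prediction made on the differenced scale can be mapped back to the original scale.

```python
import numpy as np

def first_difference(x):
    """Return x[t] - x[t-1] for t = 1..len(x)-1."""
    return np.diff(np.asarray(x, dtype=float))

def restore(first_value, diffs):
    """Invert first-order differencing given the first original value."""
    return np.concatenate(([first_value], first_value + np.cumsum(diffs)))

rates = np.array([6.30, 6.32, 6.31, 6.35])   # made-up exchange rate quotes
d = first_difference(rates)                  # fluctuations with the trend removed
recovered = restore(rates[0], d)
```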
S13: Obtain a deep learning model trained on the feature sequence, and perform time series data prediction using the deep learning model.
After the deep learning model trained on the feature sequence is obtained, it can be used to predict time series data. Specifically, the time series data to be predicted can be converted into a sequence that the deep learning model can recognize, and this sequence is then input to the deep learning model, which outputs the corresponding prediction result. Since predicting time series data with a deep learning model is consistent with the implementation principle of the corresponding prior-art technical solution, it is not repeated here.
In the technical solution provided by the embodiment of the present invention, data cleaning and data slicing ensure the integrity and accuracy of the historical time series data; the stationarization operation ensures the stationarity of the time series data sequence; and feature reconstruction of the time series data sequence using the immune genetic feature reconstruction algorithm ensures effective extraction of the features of the time series data sequence. The above technical solution thereby further ensures the accuracy of time series data prediction performed by the deep learning model corresponding to the feature sequence. It can be seen that, unlike the prior art, which obtains data set features by sampling, the present application ensures the validity of the obtained time series features through the above steps, and in turn ensures that the corresponding model can learn the temporal features of the time series data. Moreover, the above technical solution disclosed in the present application can be carried out automatically without manual participation, which avoids the adverse effects of human error and further ensures the accuracy of temporal feature acquisition.
In a time series data prediction method provided by an embodiment of the present invention, before time series data prediction is performed using the deep learning model, the method may further include:
Calculating the prediction accuracy of the deep learning model based on the feature sequence; if the prediction accuracy meets a preset requirement, determining that the deep learning model can be used for time series data prediction; otherwise, determining that the deep learning model cannot be used for time series data prediction.
It should be noted that calculating the prediction accuracy of the deep learning model based on the feature sequence may specifically be done by dividing the feature sequence into multiple subsequences, inputting any subsequence to the deep learning model, calculating the difference between the time series data output by the deep learning model and the actual time series data corresponding to the subsequence, and taking the percentage of that difference relative to the corresponding actual time series data as the prediction accuracy. Other ways of calculating the prediction accuracy of the deep learning model may of course be used according to actual needs, all of which fall within the protection scope of the present invention. If the prediction accuracy of the deep learning model meets the preset requirement set in advance according to actual needs (such as an accuracy range), it is determined that the deep learning model can be used for time series data prediction, so that the step of performing time series data prediction using the deep learning model is executed subsequently. Otherwise, the prediction accuracy of the deep learning model is considered not to meet the requirement, the model is determined to be unusable for time series data prediction, and the method returns to the step of reconstructing features using IGFRA. This further ensures the accuracy of time series data prediction.
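The percentage-based accuracy check described above can be sketched as follows. The text does not say how the per-point percentages are aggregated or what the preset requirement is, so averaging them and comparing against a caller-supplied bound are assumptions, as are the names `percentage_error` and `meets_requirement`.

```python
import numpy as np

def percentage_error(predicted, actual):
    """Absolute difference as a percentage of the actual value, per point."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return np.abs(predicted - actual) / np.abs(actual) * 100.0

def meets_requirement(predicted, actual, max_error_pct):
    """Assumed rule: the mean percentage error must not exceed the preset bound."""
    return bool(np.mean(percentage_error(predicted, actual)) <= max_error_pct)

errors = percentage_error([101.0, 198.0], [100.0, 200.0])
accepted = meets_requirement([101.0, 198.0], [100.0, 200.0], max_error_pct=2.0)
rejected = meets_requirement([101.0, 198.0], [100.0, 200.0], max_error_pct=0.5)
```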
In a time series data prediction method provided by an embodiment of the present invention, calculating the prediction accuracy of the deep learning model using the feature sequence may include:
Obtaining a training set and a test set, where the training set and the test set are obtained by dividing the feature sequence into multiple subsequences and grouping the subsequences, and the deep learning model is trained on the training set;
Inputting each subsequence in the training set into the deep learning model, and forming a training residual sequence from the residuals between each time series datum output by the deep learning model and the corresponding actual time series datum; inputting each subsequence in the test set into the deep learning model, and forming a test residual sequence from the residuals between each time series datum output by the deep learning model and the corresponding actual time series datum;
Calculating the LB value of the training residual sequence and the LB value of the test residual sequence using the LB (Ljung-Box) test.
It should be noted that the feature sequence may be divided into multiple subsequences of equal length, and these subsequences are then divided into two groups, one serving as the training set and the other as the test set, the training set being used to train the deep learning model. The residual sequence corresponding to the training set (the training residual sequence) and the residual sequence corresponding to the test set (the test residual sequence) are obtained in the manner described above, and the LB values of the two residual sequences are then obtained with the LB test; the two LB values can serve as the prediction accuracy of the deep learning model. Calculating the LB value of a residual sequence with the LB test is prior art and is not described further here.
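For reference, the LB value of a residual sequence can be computed roughly as follows. This is a sketch under stated assumptions: the LB value is taken to be the p-value of the Ljung-Box Q statistic, the function names are hypothetical, and the number of lags is restricted to even values so that the chi-square tail probability has a simple closed form.

```python
import math


def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom


def ljung_box(x, lags=10):
    """Return (Q, p_value) of the Ljung-Box test; lags must be even."""
    n = len(x)
    if lags <= 0 or lags % 2 != 0 or n <= lags:
        raise ValueError("lags must be a positive even number smaller than len(x)")
    q = n * (n + 2) * sum(
        autocorrelation(x, k) ** 2 / (n - k) for k in range(1, lags + 1)
    )
    # Chi-square survival function for even degrees of freedom 2m:
    # exp(-q/2) * sum_{i=0}^{m-1} (q/2)^i / i!
    half = q / 2.0
    term, total = 1.0, 1.0
    for i in range(1, lags // 2):
        term *= half / i
        total += term
    return q, math.exp(-half) * total
```

A p-value above 0.05 is then read as "the residual sequence is white noise", matching the preset requirement used in this embodiment.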
Corresponding to the above technical solution, judging whether the prediction accuracy meets the preset requirement may include:
comparing the LB values corresponding to the training set and the test set with the preset requirement respectively; if both LB values meet the preset requirement, determining that the deep learning model can be used for time series data prediction; otherwise, determining that the deep learning model cannot be used for time series data prediction.
The preset requirement is the criterion for judging whether the residual sequence corresponding to an LB value is white noise: if the LB value is greater than 0.05, the corresponding residual sequence is considered white noise and the LB value meets the preset requirement; otherwise the corresponding residual sequence is considered non-white noise and the LB value does not meet the preset requirement. Specifically, after the LB values corresponding to the training set and the test set are obtained, if both LB values meet the preset requirement, the deep learning model is considered to have fully learned the sample features (the samples in the embodiments of the present application are the subsequences obtained by dividing the feature sequence), the sample features are considered appropriately chosen, and the general regularity of the time series data sequence is considered accurately expressed; the deep learning model is then determined to be usable for time series data prediction. Otherwise, the deep learning model is considered unusable for time series data prediction. The LB test thus provides an effective judgment of the prediction accuracy of the deep learning model.
In the time series data prediction method provided by an embodiment of the present invention, after the LB values corresponding to the training set and the test set are compared with the preset requirement respectively, the method may further include:
if the LB value corresponding to the test set does not meet the preset requirement while the LB value corresponding to the training set does, returning to the step of performing feature reconstruction, with the immune genetic feature reconstruction algorithm, on the time series data sequence (for example, an exchange rate sequence) after the stationarization operation, until the LB values corresponding to both the training set and the test set meet the preset requirement.
If the LB value corresponding to the test set does not meet the preset requirement while the LB value corresponding to the training set does, that is, the residual sequence corresponding to the training set is white noise and the residual sequence corresponding to the test set is non-white noise, the deep learning model is considered to have learned the sample features sufficiently, but the sample features are considered inappropriately chosen and the general regularity of the time series data sequence inaccurately expressed; the sample features are therefore rebuilt, that is, the process returns to feature reconstruction. If the residual sequence corresponding to the training set is non-white noise, the training of the deep learning model is considered incomplete; the model parameters are then adjusted and the adjusted deep learning model is retrained on the training set, until the LB values corresponding to both the training set and the test set meet the preset requirement. Effective training of the deep learning model is thereby ensured.
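The three outcomes described above can be summarized as a small decision function. The function name and the returned labels are illustrative assumptions; only the 0.05 threshold and the branching logic come from the text.

```python
def next_action(train_lb, test_lb, alpha=0.05):
    """Decide the next step from the LB values of the two residual sequences."""
    train_is_white_noise = train_lb > alpha
    test_is_white_noise = test_lb > alpha
    if train_is_white_noise and test_is_white_noise:
        return "predict"                # model is usable for prediction
    if train_is_white_noise:
        return "reconstruct_features"   # features poorly chosen: rerun IGFRA
    return "retrain_model"              # training incomplete: adjust parameters
```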
In the time series data prediction method provided by an embodiment of the present invention, performing feature reconstruction, with the immune genetic feature reconstruction algorithm, on the time series data sequence after the stationarization operation to obtain the corresponding feature sequence may include:
obtaining, according to the following formula, the vector X of derivatives of each order of the time series data sequence x after the stationarization operation:
$X = [x, x', x^{(2)}, \ldots, x^{(n)}]^T$;
obtaining the following formula, which yields the corresponding feature sequence $\hat{X}$ from the derivative vector X:
obtaining the following affinity evaluation function:
where A is an n × n square matrix, B is an n × 1 column vector, n is the order of the feature transformation, m is the length of the sequences input into the deep learning model trained on each training set, $\hat{Y}$ is the time series data output by the corresponding deep learning model after a subsequence of the training set is input into it, Y denotes the actual time series data corresponding to the time series data output by the deep learning model, and N is the number of subsequences in the training set; the deep learning model, the training set and the value of the affinity evaluation function are in one-to-one correspondence. The training set is obtained as follows: the feature sequence corresponding to each generated group of values of A and B is determined, each feature sequence is divided into multiple subsequences, and the subsequences are grouped to obtain the training set and test set corresponding to each feature sequence;
determining, with the immune genetic feature reconstruction algorithm, the values of A and B that minimize the value of the affinity evaluation function, and determining the feature sequence corresponding to those values of A and B as the feature sequence finally obtained by feature reconstruction.
The reconstruction function is a mapping from the original feature set to a new feature set. With $\hat{X}$ denoting the new feature set (the feature sequence), the reconstruction function can be expressed in the following form:
It should be noted that multiple groups of values of A and B may be generated at random. Each generated group of values of A and B is substituted into the formula to obtain the corresponding feature sequence, that is, each group of values of A and B corresponds to one feature sequence. The feature sequence is then divided into multiple subsequences of length m. For any group of values of A and B, all of its corresponding subsequences are divided into two groups, one being the training set and the other the test set; the corresponding deep learning model is trained on the training set, the value of the corresponding affinity evaluation function is then calculated on that training set, and the optimal values of A and B are finally determined. When the value of the affinity evaluation function is calculated on a training set, the subsequences input into the deep learning model are the subsequences of the training set used to train that model. In other words, each group of values of A and B, its training set, its test set, its deep learning model and its affinity evaluation function value are in one-to-one correspondence, and the above calculation is carried out on the basis of this correspondence: the feature sequence corresponding to one group of values of A and B is divided into subsequences and grouped to obtain the corresponding training set and test set, one deep learning model is trained with one training set, and the value of the affinity evaluation function is calculated from that training set and the deep learning model trained on it.
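The mapping from a group of values of A and B to a feature sequence can be illustrated as follows. This sketch rests on two assumptions that are not confirmed by the text: the derivatives are approximated by repeated backward differences, and the reconstruction function is taken to be the affine map A·X + B, which is consistent with the requirement that A be invertible. The function names are hypothetical, and A and B must match the number of rows of X.

```python
def derivative_rows(x, n):
    """Rows x, x', ..., x^(n), approximated by backward differences and
    truncated to the time steps where every order is defined."""
    if len(x) <= n:
        raise ValueError("sequence too short for the requested order")
    rows = [list(x)]
    for _ in range(n):
        prev = rows[-1]
        rows.append([prev[t] - prev[t - 1] for t in range(1, len(prev))])
    length = len(rows[-1])
    return [row[-length:] for row in rows]


def reconstruct(X, A, B):
    """Assumed affine reconstruction: feature row i is sum_k A[i][k]*X[k] + B[i]."""
    size = len(X)
    if len(A) != size or any(len(r) != size for r in A) or len(B) != size:
        raise ValueError("A must be square and match the rows of X; B likewise")
    steps = len(X[0])
    return [
        [sum(A[i][k] * X[k][t] for k in range(size)) + B[i] for t in range(steps)]
        for i in range(size)
    ]
```

With A equal to the identity matrix and B equal to zero, the feature sequence is simply the derivative vector itself.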
It should further be noted that, when the above steps are implemented in the present application, the subsequences obtained by dividing a feature sequence need not be grouped; that is, after a deep learning model is trained with all the subsequences obtained from one feature sequence, the value of the affinity evaluation function is calculated from that model and all the subsequences used to train it. In this case, once the optimal values of A and B have been determined, the feature sequence corresponding to them must be divided into multiple subsequences and grouped to obtain a training set and a test set, and a deep learning model is trained on the training set; this model is the deep learning model of step S13. If instead the values of A and B are determined through the steps described above, in which the subsequences of each feature sequence are grouped, the feature sequence corresponding to the optimal values of A and B need not be divided again, and the deep learning model of step S13 is the model corresponding to the optimal values of A and B obtained while carrying out those steps. The feature sequence corresponding to the optimal values of A and B may of course also be divided again and a deep learning model trained afresh; all such variants fall within the protection scope of the present invention.
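Dividing a feature sequence into subsequences of length m and grouping them can be sketched as follows. Non-overlapping windows and a chronological 80/20 split are assumptions made for illustration; the text fixes neither choice at this point, and the function names are hypothetical.

```python
def split_into_subsequences(sequence, m):
    """Cut the sequence into consecutive, non-overlapping windows of length m."""
    return [sequence[i:i + m] for i in range(0, len(sequence) - m + 1, m)]


def group_train_test(subsequences, train_ratio=0.8):
    """Put the first train_ratio share of the windows in the training set."""
    cut = int(len(subsequences) * train_ratio)
    return subsequences[:cut], subsequences[cut:]
```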
In addition, since the purpose of feature reconstruction is to improve the accuracy of time series data prediction, the mean squared error (MSE) between the time series data output by the deep learning model and the corresponding actual time series data is used as the affinity evaluation function:
The higher the order and the longer the input sequence, the greater the complexity and computational cost of the deep learning model. To avoid reducing the generalization ability of the deep learning model as feature complexity grows, a sparsity penalty term is introduced to balance the accuracy of feature selection against structural complexity, giving the following affinity evaluation function:
The purpose of the immune genetic feature reconstruction algorithm is to determine suitable values of A and B so that the value of the above affinity evaluation function is minimized.
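An affinity evaluation of this kind can be sketched as follows. The MSE term follows the text; the form of the sparsity penalty is not reproduced in the text, so an L1 penalty on the entries of A and B with a hypothetical weight `lam` is assumed here purely for illustration.

```python
def mse(predicted, actual):
    """Mean squared error between model outputs and actual values."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and equally long")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def affinity(predicted, actual, A, B, lam=0.01):
    """MSE plus an assumed L1 sparsity penalty; smaller values are better."""
    penalty = sum(abs(v) for row in A for v in row) + sum(abs(v) for v in B)
    return mse(predicted, actual) + lam * penalty
```

A larger `lam` favors sparser A and B at the cost of a higher fitting error.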
In the time series data prediction method provided by an embodiment of the present invention, determining, with the immune genetic feature reconstruction algorithm, the values of A and B that minimize the value of the affinity evaluation function may include:
generating multiple antibodies, each antibody corresponding to one group of values of A and B;
eliminating antibodies whose corresponding A is a singular matrix, and eliminating antibodies that match an antibody in the database, the antibodies in the database being illegal antibodies obtained by testing in advance;
calculating the value of the affinity evaluation function for each antibody and judging whether any termination condition is met; if so, determining the antibody with the smallest affinity evaluation function value to be the best matching antibody, and determining the values of A and B corresponding to the best matching antibody to be the values of A and B that minimize the affinity evaluation function; if not, classifying the antibodies according to the values of their first two positions, performing genetic variation operations on the antibodies within each class to obtain new antibodies, and returning with the new antibodies to the step of eliminating antibodies whose corresponding A is a singular matrix; wherein the termination conditions include: there is an antibody whose affinity evaluation function value is less than a preset threshold; the number of times the process has returned to the step of eliminating antibodies whose corresponding A is a singular matrix reaches a preset number; and the antibody with the smallest affinity evaluation function value has remained unchanged for a preset number of consecutive iterations.
It should be noted that the antibodies may be generated at random, each antibody being obtained by merging A and B. Specifically, A and B are merged and encoded as one antibody: the first position of the code is the value of n and the second position is the value of m, these first two positions being encoded as positive integers; the following m × (n+1) positions use real-number encoding and correspond respectively to the m × n elements of A and the elements of B. Accordingly, once the best matching antibody has been determined, it can be decomposed to obtain the corresponding A and B.
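The antibody encoding can be illustrated as follows. The layout used here, [n, m, entries of A in row-major order, entries of B], is an assumption; in particular the gene count n × (n+1) follows from A being n × n and B being n × 1 as defined earlier, whereas the text states m × (n+1). The function names are hypothetical.

```python
def encode(n, m, A, B):
    """Flatten (n, m, A, B) into one antibody: two integers, then real genes."""
    return [n, m] + [v for row in A for v in row] + list(B)


def decode(antibody):
    """Recover (n, m, A, B) from an antibody produced by encode()."""
    n, m = int(antibody[0]), int(antibody[1])
    genes = antibody[2:]
    if len(genes) != n * (n + 1):
        raise ValueError("antibody length does not match n")
    A = [genes[i * n:(i + 1) * n] for i in range(n)]
    B = genes[n * n:]
    return n, m, A, B
```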
Specifically, the purpose of IGFRA is to find suitable A and B so that the value of the affinity evaluation function is minimized. So that the original sample features can be restored, the reconstruction function requires the inverse of A to exist; one rule of negative selection is therefore that an antibody is eliminated if its A is a singular matrix (a singular matrix is a matrix whose determinant is zero). In addition, to ensure that the algorithm evolves in the direction in which the resulting residuals are white noise, illegal antibodies that have already been identified should not be repeated; another rule of negative selection is therefore that an antibody is eliminated if it matches any antibody in the database. Specifically, an illegal antibody may be the antibody corresponding to a deep learning model whose prediction accuracy was determined, as in the above embodiment, not to meet the preset requirement. It should be noted that deep learning models, antibodies and values of A and B correspond to one another, and the eliminated antibodies are referred to as autoimmune antibodies. After the autoimmune antibodies have been eliminated, the remaining antibodies are classified according to the values of their first two positions, antibodies with identical first two values being placed in the same class; genetic variation operations are then performed on the antibodies within each class to obtain new antibodies, on which antibody elimination, antibody selection and the subsequent steps are carried out.
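The elimination and classification rules can be sketched as follows. Antibodies are assumed to be flat lists of the form [n, m, entries of A, entries of B]; "matching" an antibody in the database is assumed to mean exact equality; and singularity is tested with a small tolerance on the determinant. All of these are illustrative choices, not details given in the text.

```python
def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [[float(v) for v in row] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def matrix_of(antibody):
    """Extract the n x n matrix A from a flat antibody."""
    n = int(antibody[0])
    genes = antibody[2:2 + n * n]
    return [genes[i * n:(i + 1) * n] for i in range(n)]


def negative_selection(antibodies, database, tol=1e-9):
    """Drop antibodies with singular A and antibodies found in the database."""
    kept = []
    for ab in antibodies:
        if abs(determinant(matrix_of(ab))) <= tol:
            continue
        if any(list(ab) == list(bad) for bad in database):
            continue
        kept.append(ab)
    return kept


def classify(antibodies):
    """Group antibodies by the values of their first two positions (n, m)."""
    classes = {}
    for ab in antibodies:
        classes.setdefault((ab[0], ab[1]), []).append(ab)
    return classes
```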
For ease of computer simulation, IGFRA terminates when any one of the above three termination conditions is met, and outputs the antibody with the smallest affinity evaluation function value.
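The three termination conditions can be checked with a helper along the following lines. The bookkeeping, one (best affinity value, best antibody) pair recorded per round, and the parameter names are assumptions made for illustration.

```python
def should_terminate(history, threshold, max_rounds, patience):
    """history: one (best_affinity, best_antibody) pair per completed round,
    oldest first. patience is assumed to be at least 2."""
    if not history:
        return False
    best_value, best_antibody = history[-1]
    if best_value < threshold:      # condition 1: affinity below the threshold
        return True
    if len(history) >= max_rounds:  # condition 2: preset number of rounds reached
        return True
    # condition 3: best antibody unchanged over the last `patience` rounds
    recent = [ab for _, ab in history[-patience:]]
    return len(recent) == patience and all(ab == best_antibody for ab in recent)
```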
In the time series data prediction method provided by an embodiment of the present invention, performing genetic variation operations on the antibodies in each class to obtain new antibodies may include:
taking each class of antibodies as one parent generation, and obtaining the proportions of replication, crossover and mutation, p, q and r respectively;
selecting from each parent generation the pK antibodies with the smallest affinity evaluation function values and retaining them in the offspring; randomly selecting qK/2 pairs of antibodies from each parent generation, exchanging the i-th position and the subsequent sequence of each pair, and retaining the antibodies obtained by the exchange in the offspring; randomly selecting rK antibodies from each parent generation, replacing the j-th position of each with a random number within a preset value range, and retaining the antibodies obtained by the replacement in the offspring; where K is the number of antibodies in each class, and i and j are random numbers;
determining the antibodies contained in all the offspring to be the new antibodies obtained by the genetic variation operation.
The number of antibodies in each class may be the same or different; K here merely denotes the number of antibodies in a class and does not imply that every class contains the same number. p, q and r are proportions set according to actual needs and sum to 1. In addition, i > 3 and j > 2, and the value range can be set according to actual needs; it is generally taken as the preset range of possible values of the elements of A and B.
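The replication, crossover and mutation steps for one class can be sketched as follows. Positions i and j are treated as 1-based, so the constraints i > 3 and j > 2 leave the first two positions (n and m) untouched. The function name, the rounding of pK, qK/2 and rK to integers, and the uniform draw from [low, high] are assumptions; a class is assumed to contain at least two antibodies of length at least 4.

```python
import random


def evolve_class(parents, affinity_of, p, q, r, low, high, rng=None):
    """Produce the offspring of one class of antibodies (flat lists)."""
    rng = rng or random.Random()
    K = len(parents)
    length = len(parents[0])
    children = []
    # Replication: keep the pK antibodies with the smallest affinity values.
    ranked = sorted(parents, key=affinity_of)
    children += [list(ab) for ab in ranked[:round(p * K)]]
    # Crossover: qK/2 random pairs swap position i and everything after it.
    for _ in range(round(q * K / 2)):
        a, b = rng.sample(parents, 2)
        cut = rng.randint(4, length) - 1  # 1-based i > 3
        children.append(list(a[:cut]) + list(b[cut:]))
        children.append(list(b[:cut]) + list(a[cut:]))
    # Mutation: rK random antibodies get position j replaced by a random value.
    for ab in rng.sample(parents, round(r * K)):
        j = rng.randint(3, length)  # 1-based j > 2
        child = list(ab)
        child[j - 1] = rng.uniform(low, high)
        children.append(child)
    return children
```

Because every operation preserves the first two positions, offspring stay in the class of their parents.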
In the time series data prediction method provided by an embodiment of the present invention, obtaining the deep learning model trained on the feature sequence may include:
obtaining a GRU (Gated Recurrent Unit) model trained on the feature sequence.
Since recurrent neural networks such as LSTM and GRU are inherently suited to processing time series data and have long-range memory, the present invention, on the basis of experimental comparison, uses a GRU model for time series data prediction. Specifically, so that the GRU model can predict how the time series data changes, a fully connected neural network layer may be provided in the GRU model; and to improve the generalization performance of the GRU model, the present invention also provides a Batch Normalization layer in the GRU model. The GRU model thus comprises, connected in sequence, a GRU model unit, a Batch Normalization layer and a fully connected neural network layer; the overall model structure may be as shown in Figure 2.
It should be noted that the network structure of the GRU model unit may be as shown in Figure 3, where the memory information is generated according to the following formula:
where $r_t$ is the reset signal, which takes the value 1 when the hidden state $h_{t-1}$ of the previous moment is relevant to synthesizing the new memory information, and 0 otherwise. $r_t$ is calculated as follows:
$r_t = \sigma(W_r x_t + U_r h_{t-1})$;
where $\sigma(\cdot)$ is the sigmoid function. The hidden state $h_t$ is a linear combination of the state of the previous moment and the memory information, and is expressed as follows:
where $z_t$ is the update gate signal, which characterizes the degree to which the model passes the hidden state $h_{t-1}$ of the previous moment on to the next moment. It is calculated in the same way as the reset gate signal:
$z_t = \sigma(W_z x_t + U_z h_{t-1})$.
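A single step of a GRU unit can be illustrated with scalar weights as follows. The two gate formulas follow the text; the formulas for the memory information and for the hidden state are not reproduced in the text, so the commonly used GRU forms are assumed here, with the update gate weighting the previous hidden state. The weight names `W` and `U` for the memory information are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gru_step(x_t, h_prev, w):
    """One GRU step with scalar input, scalar hidden state and no biases."""
    r_t = sigmoid(w["Wr"] * x_t + w["Ur"] * h_prev)  # reset gate
    z_t = sigmoid(w["Wz"] * x_t + w["Uz"] * h_prev)  # update gate
    # Assumed form of the memory information (candidate hidden state).
    h_cand = math.tanh(w["W"] * x_t + r_t * w["U"] * h_prev)
    # Assumed combination: z_t is the share of h_prev carried forward.
    return z_t * h_prev + (1.0 - z_t) * h_cand
```

When $z_t$ is close to 1 the previous hidden state passes through almost unchanged, which is what gives the unit its long-range memory.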
In addition, after the reconstruction function has been obtained with IGFRA, the feature sequence needs to be normalized, scaling the value of each of its elements to [-1, 1], because the non-linear transformation functions in the GRU model are more sensitive to changes in the input on the interval [-1, 1].
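One simple way to scale a feature sequence to [-1, 1] is min-max scaling, sketched here. The choice of min-max scaling and the handling of a constant sequence are assumptions; the text only states the target interval.

```python
def scale_to_unit_range(values):
    """Linearly map values so that the minimum becomes -1 and the maximum 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant sequence: map to the midpoint
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]
```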
The present invention is further described below with reference to a specific embodiment. It should be understood that this embodiment serves only to illustrate the present invention and not to limit its scope. It should also be understood that, after reading the teaching of the present invention, those skilled in the art may make various changes or modifications to it, and such equivalents likewise fall within the scope defined by the claims appended to the present application.
To verify the effectiveness of the method proposed by the present invention, this example uses 16,431 minute-level records of the RMB to US dollar exchange rate from June 1, 2017 to July 31, 2017, constructs features with the method of the present invention, and predicts whether the exchange rate will rise or fall 1 minute ahead. The training data set accounts for 80% of the total number of records, and the remaining 20% serves as the test data set. The hyperparameters of the GRU model are as follows:
Table 1. Model hyperparameters
In the experiment, models such as LSTM and GRU were first used to fit and predict the exchange rate without the processing disclosed in the present application, such as feature reconstruction. The results show that the fitting errors of both models are very low (LSTM: 0.003, GRU: 0.0028); the average accuracy of the LSTM model in predicting rises and falls is 53.63%. When the network is trained with random sampling, the prediction accuracy on randomly selected test-set sequence fragments (drawn at random from the exchange rate) is 56.9%; the fitting results are shown in Figures 4 and 5. As Figures 4 and 5 show, the price curve fitted by the model lags the actual curve in time, and the lag equals the number of forward prediction steps.
An ADF test was carried out on the first-order difference of the exchange rate sequence. The test statistic is -29.338, which is less than the critical value -3.43, so the null hypothesis is rejected and the process is considered a weakly stationary sequence. Its scatter plot, autocorrelation plot and partial autocorrelation plot are shown in Figures 6 to 8 respectively.
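The first-order differencing used for stationarization amounts to the following; the function name and the `order` parameter are illustrative. The ADF test itself is not reproduced here.

```python
def difference(series, order=1):
    """Apply first-order differencing `order` times."""
    result = list(series)
    for _ in range(order):
        result = [result[t] - result[t - 1] for t in range(1, len(result))]
    return result
```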
Table 2. Augmented Dickey-Fuller test results
According to the LB test results in Table 2, the LB value of the exchange rate sequence after first-order stationarization is 0.02382 < 0.05, so the sequence is considered not to be a white noise sequence and is worth learning. Since the autocorrelation and partial autocorrelation coefficients of the stationarized sequence both cut off after order 1, and there is obvious variation in variance (volatility clustering), conventional practice suggests modelling it with ARCH and GARCH models. Figures 9 to 11 show the predicted outputs of these two models, and their prediction accuracy is given in Table 3.
Table 3. Traditional time series (TS) models and their accuracy
The experimental results show that the prediction accuracy of the ARCH model (autoregressive conditional heteroskedasticity model) and the GARCH model (generalized autoregressive conditional heteroskedasticity model) is not high. With the method provided by the present invention, the sequence after feature reconstruction is converted into a labelled training data set; the training results are shown in Figures 12 and 13. As Figure 12 shows, for the test set (test), although the model's prediction (prd) still has some error in amplitude, the trend of the prediction curve is already synchronized with the test set; that is, using the reconstructed features eliminates the phase delay and makes it possible to predict changes in the exchange rate. After the prediction results are binarized (Figure 13), the measured prediction accuracy is 64.04%. To explore whether the model's learning result can be improved further, a Ljung-Box test was carried out on the residuals. As Figure 14 shows, for the residual sequences at every computable lag up to 40 periods, the LB values are all greater than 0.05, with an average of about 0.95; that is, for any of the sequences tested, the confidence that the sequence is white noise is 95%. By contrast, when a deep learning model is used directly to predict financial assets with temporal characteristics, the strong autocorrelation of financial asset sequences causes the predicted output to lag the actual asset in time.
Since no further useful knowledge or patterns can be extracted from white noise, it can be considered that, given the current features and data set, the accuracy obtained by the model approaches the theoretical upper limit, and continuing to refine the model cannot improve the accuracy further.
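The rise/fall accuracy reported in this example can be computed by binarizing predicted and actual changes and counting agreements, roughly as follows. The function name and the convention that a change of exactly zero counts as "not rising" are assumptions.

```python
def direction_accuracy(predicted_changes, actual_changes):
    """Share of time steps where predicted and actual changes have the same
    binarized direction (rise versus no rise)."""
    if len(predicted_changes) != len(actual_changes) or not actual_changes:
        raise ValueError("inputs must be non-empty and equally long")
    hits = sum(
        (p > 0) == (a > 0) for p, a in zip(predicted_changes, actual_changes)
    )
    return hits / len(actual_changes)
```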
An embodiment of the present invention further provides a time series data prediction apparatus, which may include:
a preprocessing module 11, configured to: obtain historical time series data, and perform data cleaning and data slicing on the historical time series data to obtain the corresponding time series data sequence;
a feature reconstruction module 12, configured to: perform a stationarization operation on the time series data sequence, and perform feature reconstruction on the time series data sequence after the stationarization operation with the immune genetic feature reconstruction algorithm to obtain the corresponding feature sequence;
a prediction module 13, configured to: obtain the deep learning model trained on the feature sequence, and perform time series data prediction with the deep learning model.
The time series data prediction apparatus provided by an embodiment of the present invention may further include:
an accuracy evaluation module, configured to: before time series data prediction is performed with the deep learning model, calculate the prediction accuracy of the deep learning model based on the feature sequence; if the prediction accuracy meets the preset requirement, determine that the deep learning model can be used for time series data prediction; otherwise, determine that the deep learning model cannot be used for time series data prediction.
In the time series data prediction apparatus provided by an embodiment of the present invention, the accuracy evaluation module may include:
an accuracy calculation unit, configured to: obtain a training set and a test set, the training set and the test set being obtained by dividing the feature sequence into multiple subsequences and then grouping those subsequences, wherein the deep learning model is trained on the training set; input each subsequence in the training set into the deep learning model, and form a training residual sequence from the residuals between each time series datum output by the deep learning model and the corresponding actual time series datum; input each subsequence in the test set into the deep learning model, and form a test residual sequence from the residuals between each time series datum output by the deep learning model and the corresponding actual time series datum; and calculate the LB value of the training residual sequence and the LB value of the test residual sequence respectively, using the LB test;
an accuracy evaluation unit, configured to: compare the LB values corresponding to the training set and the test set with the preset requirement respectively; if both LB values meet the preset requirement, determine that the deep learning model can be used for time series data prediction; otherwise, determine that the deep learning model cannot be used for time series data prediction.
In the time series data prediction apparatus provided by an embodiment of the present invention, the accuracy evaluation module may further include:
a model correction unit, configured to: after the LB values corresponding to the training set and the test set are compared with the preset requirement respectively, if the LB value corresponding to the test set does not meet the preset requirement while the LB value corresponding to the training set does, return to the step of performing feature reconstruction, with the immune genetic feature reconstruction algorithm, on the time series data sequence after the stationarization operation, until the LB values corresponding to both the training set and the test set meet the preset requirement.
In the time series data prediction apparatus provided by an embodiment of the present invention, the feature reconstruction module may include:
a feature reconstruction unit, configured to: obtain, according to the following formula, the vector X of derivatives of each order of the time series data sequence x after the stationarization operation:
$X = [x, x', x^{(2)}, \ldots, x^{(n)}]^T$;
obtain the following formula, which yields the corresponding feature sequence $\hat{X}$ from the derivative vector X:
obtain the following affinity evaluation function:
where A is an n × n square matrix, B is an n × 1 column vector, n is the order of the feature transformation, m is the length of the sequences input into the deep learning model trained on each training set, $\hat{Y}$ is the time series data output by the corresponding deep learning model after a subsequence of the training set is input into it, Y denotes the actual time series data corresponding to the time series data output by the deep learning model, and N is the number of subsequences in the training set; the deep learning model, the training set and the value of the affinity evaluation function are in one-to-one correspondence. The training set is obtained as follows: the feature sequence corresponding to each generated group of values of A and B is determined, each feature sequence is divided into multiple subsequences, and the subsequences are grouped to obtain the training set and test set corresponding to each feature sequence. The values of A and B that minimize the value of the affinity evaluation function are determined with the immune genetic feature reconstruction algorithm, and the feature sequence corresponding to those values of A and B is determined as the feature sequence finally obtained by feature reconstruction.
In the time series data prediction apparatus provided by an embodiment of the present invention, the feature reconstruction unit may include:
a feature reconstruction subunit, configured to: generate multiple antibodies, each antibody corresponding to one group of values of A and B; eliminate antibodies whose corresponding A is a singular matrix, and eliminate antibodies that match an antibody in the database, the antibodies in the database being illegal antibodies obtained by testing in advance; calculate the value of the affinity evaluation function for each antibody and judge whether any termination condition is met; if so, determine the antibody with the smallest affinity evaluation function value to be the best matching antibody, and determine the values of A and B corresponding to the best matching antibody to be the values of A and B that minimize the affinity evaluation function; if not, classify the antibodies according to the values of their first two positions, perform genetic variation operations on the antibodies within each class to obtain new antibodies, and return with the new antibodies to the step of eliminating antibodies whose corresponding A is a singular matrix; wherein the termination conditions include: there is an antibody whose affinity evaluation function value is less than a preset threshold; the number of times the process has returned to the step of eliminating antibodies whose corresponding A is a singular matrix reaches a preset number; and the antibody with the smallest affinity evaluation function value has remained unchanged for a preset number of consecutive iterations.
In the time series data prediction apparatus provided by an embodiment of the present invention, the feature reconstruction subunit may include:
A mutation subunit, configured to: treat each class of antibodies as one parent population, and obtain the proportions p, q, and r of replication, crossover, and mutation, respectively; select from each parent population the pK antibodies with the smallest affinity evaluation function values and retain them in the offspring; randomly select qK/2 pairs of antibodies from each parent population, exchange the i-th position and the subsequent sequence of each pair, and retain the antibodies obtained after the exchange in the offspring; randomly select rK antibodies from each parent population, replace the j-th position of each antibody with a random number within a preset value range, and retain the antibodies obtained after the replacement in the offspring; where K is the number of antibodies in each class, and i and j are random numbers; and determine the antibodies contained in all offspring to be the new antibodies obtained by the genetic mutation operation.
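The replication, crossover, and mutation steps can be sketched as follows. This is an illustration rather than the patented implementation: it assumes each antibody is a flat list of numbers encoding A and B, rounds pK, qK/2, and rK to integers, and uses the hypothetical names `next_generation`, `affinity`, `low`, and `high`.

```python
import random


def next_generation(parent, affinity, p, q, r, low, high, rng=random):
    """Build the offspring of one parent class.

    parent:   list of antibodies, each a list of numbers (assumed flat encoding)
    affinity: function mapping an antibody to its affinity value (smaller is better)
    p, q, r:  proportions of replication, crossover, and mutation
    low/high: bounds of the preset value range used for mutation
    """
    K = len(parent)
    offspring = []

    # Replication: keep the round(p*K) antibodies with the smallest affinity value.
    best = sorted(parent, key=affinity)[:round(p * K)]
    offspring.extend(list(a) for a in best)

    # Crossover: draw round(q*K/2) random pairs; swap position i and everything after it.
    for _ in range(round(q * K / 2)):
        a, b = (list(x) for x in rng.sample(parent, 2))
        i = rng.randrange(len(a))
        a[i:], b[i:] = b[i:], a[i:]
        offspring.extend([a, b])

    # Mutation: draw round(r*K) random antibodies; overwrite position j with a random number.
    for chosen in rng.sample(parent, round(r * K)):
        mutated = list(chosen)
        j = rng.randrange(len(mutated))
        mutated[j] = rng.uniform(low, high)
        offspring.append(mutated)

    return offspring
```

When p + q + r = 1 and the products are integers, the offspring has the same size K as the parent, so the population size stays constant across generations.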
In the time series data prediction apparatus provided by an embodiment of the present invention, the preprocessing module may include:
An acquiring unit, configured to: obtain a GRU model trained based on the feature sequence.
An embodiment of the present invention further provides a time series data prediction device, which may include:
A memory, configured to store a computer program;
A processor, configured to implement the steps of any of the time series data prediction methods described above when executing the computer program.
An embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of any of the time series data prediction methods described above can be implemented.
For explanations of the relevant parts of the time series data prediction apparatus, device, and computer-readable storage medium provided by embodiments of the present invention, refer to the detailed description of the corresponding parts of the time series data prediction method provided by embodiments of the present invention; they are not repeated here. In addition, the parts of the above technical solutions whose implementation principles are consistent with those of corresponding technical solutions in the prior art are not described in detail, to avoid excessive repetition.
The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (10)
1. A time series data prediction method, characterized by comprising:
Obtaining historical time series data, and performing data cleaning and data slicing on the historical time series data to obtain a corresponding time series data sequence;
Performing a stationarization operation on the time series data sequence, and performing feature reconstruction on the stationarized time series data sequence using an immune genetic feature reconstruction algorithm to obtain a corresponding feature sequence;
Obtaining a deep learning model trained based on the feature sequence, and performing time series data prediction using the deep learning model.
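The preprocessing steps of claim 1 can be illustrated with a minimal sketch. The claim does not specify the cleaning rule, the slicing rule, or the stationarization technique, so the choices below are assumptions: missing values are dropped, slicing selects a contiguous segment, and stationarization is approximated by differencing. The function names `clean`, `take_slice`, and `difference` are hypothetical.

```python
import math


def clean(series):
    """Drop missing entries (None or NaN); one simple cleaning rule among many."""
    return [v for v in series if v is not None and not math.isnan(v)]


def take_slice(series, start, stop):
    """Select a contiguous segment of the cleaned series (assumed meaning of slicing)."""
    return series[start:stop]


def difference(series, order=1):
    """Apply first-order differencing `order` times to remove trend."""
    for _ in range(order):
        series = [b - a for a, b in zip(series, series[1:])]
    return series


# Usage: clean, slice, then difference a short historical series.
raw = [1.0, None, 2.0, float("nan"), 4.0, 7.0, 11.0]
sequence = take_slice(clean(raw), 1, None)   # [2.0, 4.0, 7.0, 11.0]
stationary = difference(sequence)            # [2.0, 3.0, 4.0]
```

Each differencing pass shortens the series by one element, so the order should stay small relative to the sequence length.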
2. The method according to claim 1, characterized in that, before performing time series data prediction using the deep learning model, the method further comprises:
Calculating the prediction accuracy of the deep learning model based on the feature sequence; if the prediction accuracy meets a preset requirement, determining that the deep learning model can be used for time series data prediction; otherwise, determining that the deep learning model cannot be used for time series data prediction.
3. The method according to claim 2, characterized in that calculating the prediction accuracy of the deep learning model using the feature sequence comprises:
Obtaining a training set and a test set, the training set and the test set being obtained by dividing the feature sequence into multiple subsequences and then grouping the multiple subsequences, wherein the deep learning model is trained based on the training set;
Inputting each subsequence in the training set into the deep learning model, and forming a training residual sequence from the residuals between each time series datum output by the deep learning model and the corresponding actual time series datum; inputting each subsequence in the test set into the deep learning model, and forming a test residual sequence from the residuals between each time series datum output by the deep learning model and the corresponding actual time series datum;
Calculating the LB value of the training residual sequence and the LB value of the test residual sequence, respectively, using the LB test;
Correspondingly, judging whether the prediction accuracy meets the preset requirement comprises:
Comparing the LB values corresponding to the training set and the test set with the preset requirement, respectively; if the LB values corresponding to both the training set and the test set meet the preset requirement, determining that the deep learning model can be used for time series data prediction; otherwise, determining that the deep learning model cannot be used for time series data prediction.
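Assuming that "LB test" refers to the Ljung-Box test for residual autocorrelation, the LB value of a residual sequence can be computed as Q = n(n+2) Σ_{k=1..h} ρ_k² / (n−k), where ρ_k is the lag-k sample autocorrelation. The sketch below is a plain-Python illustration; the claim does not state the preset requirement, so comparing Q against a critical value is an assumption, and `ljung_box_q`, `meets_requirement`, `lags`, and `critical_value` are hypothetical names.

```python
def ljung_box_q(residuals, lags):
    """Ljung-Box Q statistic over lags 1..lags.

    Assumes the residuals are not all identical (otherwise the variance is zero).
    """
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [e - mean for e in residuals]
    denom = sum(c * c for c in centered)
    total = 0.0
    for k in range(1, lags + 1):
        # Lag-k sample autocorrelation of the residuals.
        rho_k = sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total


def meets_requirement(train_residuals, test_residuals, lags, critical_value):
    """Assumed rule: both Q statistics must stay below the chosen critical value."""
    return (ljung_box_q(train_residuals, lags) < critical_value
            and ljung_box_q(test_residuals, lags) < critical_value)
```

A small Q suggests the residuals are close to white noise, meaning the model has captured most of the structure in the sequence. The critical value is typically taken from a chi-square distribution with `lags` degrees of freedom.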
4. The method according to claim 3, characterized in that, after comparing the LB values corresponding to the training set and the test set with the preset requirement, respectively, the method further comprises:
If the LB value corresponding to the test set does not meet the preset requirement and the LB value corresponding to the training set meets the preset requirement, returning to the step of performing feature reconstruction on the stationarized time series data sequence using the immune genetic feature reconstruction algorithm, until the LB values corresponding to both the training set and the test set meet the preset requirement.
5. The method according to claim 1, characterized in that performing feature reconstruction on the stationarized time series data sequence using the immune genetic feature reconstruction algorithm to obtain the corresponding feature sequence comprises:
Obtaining the derivative vector X of the time series data residual of the stationarized time series data sequence x according to the following formula:
X = [x, x', x^(2), ..., x^(n)]^T;
Obtaining the following formula, which yields the corresponding feature sequence from the derivative vector X:
[formula not reproduced in this text]
Obtaining the following affinity evaluation function:
[formula not reproduced in this text]
Wherein A is an n × n square matrix, B is an n × 1 column vector, n is the order of the feature transformation, m is the length of the sequence input to the deep learning model obtained based on each training set, [symbol not reproduced in this text] is the time series data output by the deep learning model after a subsequence in the training set is input to the corresponding deep learning model, Y denotes the actual time series data corresponding to the time series data output by the deep learning model, and N is the number of subsequences in the training set; the deep learning model, the training set, and the value of the affinity evaluation function correspond one to one. The training set is obtained as follows: determining the feature sequence corresponding to each generated set of values of A and B, dividing each feature sequence into multiple subsequences, and grouping the multiple subsequences to obtain the training set and the test set corresponding to each feature sequence;
Determining, using the immune genetic feature reconstruction algorithm, the values of A and B that minimize the value of the affinity evaluation function, and determining the feature sequence corresponding to those values of A and B to be the feature sequence finally obtained by the feature reconstruction.
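The derivative vector X = [x, x', x^(2), ..., x^(n)]^T can be approximated numerically. The sketch below covers only that step, since the feature sequence formula and the affinity evaluation function are not available in this text. It assumes a unit time step and approximates each derivative by a forward difference; the name `derivative_vector` is hypothetical.

```python
def derivative_vector(x, n):
    """Return the rows [x, x', x^(2), ..., x^(n)] as lists.

    Each derivative is approximated by a forward difference with unit time step
    (an assumption). Every difference shortens the row by one sample, so all
    rows are truncated to the length of the highest-order row.
    """
    rows = [list(x)]
    for _ in range(n):
        prev = rows[-1]
        rows.append([b - a for a, b in zip(prev, prev[1:])])
    width = len(rows[-1])
    return [row[:width] for row in rows]


# Usage: for x(t) = t^2 sampled at t = 0..4, the second difference is constant.
X = derivative_vector([0, 1, 4, 9, 16], 2)
# X == [[0, 1, 4], [1, 3, 5], [2, 2, 2]]
```

Each column of the result holds the value and its first n approximate derivatives at one time step.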
6. The method according to claim 5, characterized in that determining, using the immune genetic feature reconstruction algorithm, the values of A and B that minimize the value of the affinity evaluation function comprises:
Generating multiple antibodies, each antibody corresponding to one set of values of A and B;
Eliminating the antibodies whose corresponding A is a singular matrix, and eliminating the antibodies that match an antibody in the memory bank, where the antibodies in the memory bank are illegal antibodies obtained by testing in advance;
Calculating the value of the affinity evaluation function for each antibody and judging whether any termination condition is met; if so, determining the antibody with the minimum affinity evaluation function value to be the best-matching antibody, and taking the values of A and B corresponding to the best-matching antibody as the values of A and B that minimize the affinity evaluation function; if not, classifying the multiple antibodies by the value of their first two digits, performing a genetic mutation operation on the antibodies in each class to obtain new antibodies, and, based on the new antibodies, returning to the step of eliminating the antibodies whose corresponding A is a singular matrix; wherein the termination conditions include: the affinity evaluation function value of some antibody is less than a predetermined threshold; the number of returns to the step of eliminating the antibodies whose corresponding A is a singular matrix reaches a preset number; and the antibody with the minimum affinity evaluation function value has not changed for a preset number of consecutive iterations.
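The elimination step of claim 6 can be sketched as a filter. The representation is an assumption: each antibody is a pair (A, B) with A as a list of rows and B as a list, and the memory bank is a set of hashable keys of known illegal antibodies. Singularity is tested by Gaussian elimination with a numerical tolerance. All names (`is_singular`, `as_key`, `eliminate`, `tol`) are hypothetical.

```python
def is_singular(matrix, tol=1e-12):
    """Gaussian elimination with partial pivoting; True if a pivot is numerically zero."""
    m = [list(row) for row in matrix]
    size = len(m)
    for col in range(size):
        # Choose the row with the largest entry in this column as the pivot row.
        pivot = max(range(col, size), key=lambda row: abs(m[row][col]))
        if abs(m[pivot][col]) < tol:
            return True
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, size):
            factor = m[row][col] / m[col][col]
            for c in range(col, size):
                m[row][c] -= factor * m[col][c]
    return False


def as_key(antibody):
    """Convert an (A, B) antibody into a hashable key for memory bank lookups."""
    A, B = antibody
    return (tuple(tuple(row) for row in A), tuple(B))


def eliminate(antibodies, memory_bank):
    """Keep antibodies whose A is non-singular and that are not in the memory bank."""
    return [ab for ab in antibodies
            if not is_singular(ab[0]) and as_key(ab) not in memory_bank]
```

Filtering before the affinity evaluation avoids spending model training time on candidates that are already known to be invalid.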
7. The method according to claim 6, characterized in that performing the genetic mutation operation on the antibodies in each class to obtain new antibodies comprises:
Treating each class of antibodies as one parent population, and obtaining the proportions p, q, and r of replication, crossover, and mutation, respectively;
Selecting from each parent population the pK antibodies with the smallest affinity evaluation function values and retaining them in the offspring; randomly selecting qK/2 pairs of antibodies from each parent population, exchanging the i-th position and the subsequent sequence of each pair, and retaining the antibodies obtained after the exchange in the offspring; randomly selecting rK antibodies from each parent population, replacing the j-th position of each antibody with a random number within a preset value range, and retaining the antibodies obtained after the replacement in the offspring; wherein K is the number of antibodies in each class, and i and j are random numbers;
Determining the antibodies contained in all offspring to be the new antibodies obtained by the genetic mutation operation.
8. The method according to claim 1, characterized in that obtaining the deep learning model trained based on the feature sequence comprises:
Obtaining a GRU model trained based on the feature sequence.
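For orientation, the recurrence inside a GRU (gated recurrent unit) can be sketched for a scalar input and a scalar hidden state. This is one common formulation of the GRU update and is not taken from the patent; it shows only the forward pass, not training, and the weight names in the dictionary `w` are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gru_step(x, h, w):
    """One GRU update for scalar input x and scalar hidden state h."""
    z = sigmoid(w["wz"] * x + w["uz"] * h + w["bz"])              # update gate
    r = sigmoid(w["wr"] * x + w["ur"] * h + w["br"])              # reset gate
    cand = math.tanh(w["wh"] * x + w["uh"] * (r * h) + w["bh"])   # candidate state
    # Blend the previous state and the candidate according to the update gate.
    return (1.0 - z) * h + z * cand


def gru_forward(sequence, w, h0=0.0):
    """Run the recurrence over a subsequence and return the final hidden state."""
    h = h0
    for x in sequence:
        h = gru_step(x, h, w)
    return h
```

In practice the hidden state is a vector, the weights are matrices learned from the training set, and a final output layer maps the hidden state to the predicted value.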
9. A time series data prediction apparatus, characterized by comprising:
A preprocessing module, configured to: obtain historical time series data, and perform data cleaning and data slicing on the historical time series data to obtain a corresponding time series data sequence;
A feature reconstruction module, configured to: perform a stationarization operation on the time series data sequence, and perform feature reconstruction on the stationarized time series data sequence using an immune genetic feature reconstruction algorithm to obtain a corresponding feature sequence;
A prediction module, configured to: obtain a deep learning model trained based on the feature sequence, and perform time series data prediction using the deep learning model.
10. A time series data prediction device, characterized by comprising:
A memory, configured to store a computer program;
A processor, configured to implement the steps of the time series data prediction method according to any one of claims 1 to 8 when executing the computer program.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810174986.2A CN108399248A (en) | 2018-03-02 | 2018-03-02 | A kind of time series data prediction technique, device and equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810174986.2A CN108399248A (en) | 2018-03-02 | 2018-03-02 | A kind of time series data prediction technique, device and equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108399248A true CN108399248A (en) | 2018-08-14 |
Family
ID=63091772
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810174986.2A Pending CN108399248A (en) | 2018-03-02 | 2018-03-02 | A kind of time series data prediction technique, device and equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108399248A (en) |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109461067A (en) * | 2018-09-12 | 2019-03-12 | 阿里巴巴集团控股有限公司 | A kind of detection method of foreign exchange quotation abnormal data, apparatus and system |
CN109583568A (en) * | 2018-11-28 | 2019-04-05 | 中科赛诺(北京)科技有限公司 | Data extension method, device and electronic equipment |
CN110414442A (en) * | 2019-07-31 | 2019-11-05 | 广东省智能机器人研究院 | A kind of pressure time series data segmentation feature value prediction technique |
CN110610458A (en) * | 2019-04-30 | 2019-12-24 | 北京联合大学 | Method and system for GAN image enhancement interactive processing based on ridge regression |
CN110737696A (en) * | 2019-10-12 | 2020-01-31 | 北京百度网讯科技有限公司 | Data sampling method, device, electronic equipment and storage medium |
CN110866672A (en) * | 2019-10-10 | 2020-03-06 | 重庆金融资产交易所有限责任公司 | Data processing method, device, terminal and medium |
CN111199114A (en) * | 2020-03-06 | 2020-05-26 | 上海复见网络科技有限公司 | Method for building system of urban industry evolution model based on recurrent neural network |
CN111754033A (en) * | 2020-06-15 | 2020-10-09 | 西安工业大学 | Non-stationary time sequence data prediction method based on recurrent neural network |
CN112923922A (en) * | 2021-03-04 | 2021-06-08 | 香港理工大学深圳研究院 | Method, system and storage medium for counting steps and determining position information of pedestrian |
CN113051811A (en) * | 2021-03-16 | 2021-06-29 | 重庆邮电大学 | Multi-mode short-term traffic jam prediction method based on GRU network |
CN113468151A (en) * | 2020-03-31 | 2021-10-01 | 横河电机株式会社 | Learning data processing device, learning data processing method, and medium |
CN113743971A (en) * | 2020-06-17 | 2021-12-03 | 北京沃东天骏信息技术有限公司 | Data processing method and device |
CN114528334A (en) * | 2022-02-18 | 2022-05-24 | 重庆伏特猫科技有限公司 | Rapid similarity searching method in time sequence database |
CN115099144A (en) * | 2022-06-24 | 2022-09-23 | 无锡物联网创新中心有限公司 | Yarn raw material characteristic parameter inversion method and related device |
CN115204533A (en) * | 2022-09-16 | 2022-10-18 | 中国地质大学(北京) | Oil-gas yield prediction method and system based on multivariable weighted combination model |
CN116311829A (en) * | 2023-05-22 | 2023-06-23 | 广州豪特节能环保科技股份有限公司 | Remote alarm method and device for data machine room |
Cited By (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109461067A (en) * | 2018-09-12 | 2019-03-12 | 阿里巴巴集团控股有限公司 | A kind of detection method of foreign exchange quotation abnormal data, apparatus and system |
CN109583568A (en) * | 2018-11-28 | 2019-04-05 | 中科赛诺(北京)科技有限公司 | Data extension method, device and electronic equipment |
CN110610458A (en) * | 2019-04-30 | 2019-12-24 | 北京联合大学 | Method and system for GAN image enhancement interactive processing based on ridge regression |
CN110610458B (en) * | 2019-04-30 | 2023-10-20 | 北京联合大学 | GAN image enhancement interaction processing method and system based on ridge regression |
CN110414442A (en) * | 2019-07-31 | 2019-11-05 | 广东省智能机器人研究院 | A kind of pressure time series data segmentation feature value prediction technique |
CN110866672A (en) * | 2019-10-10 | 2020-03-06 | 重庆金融资产交易所有限责任公司 | Data processing method, device, terminal and medium |
CN110737696A (en) * | 2019-10-12 | 2020-01-31 | 北京百度网讯科技有限公司 | Data sampling method, device, electronic equipment and storage medium |
CN111199114A (en) * | 2020-03-06 | 2020-05-26 | 上海复见网络科技有限公司 | Method for building system of urban industry evolution model based on recurrent neural network |
CN113468151A (en) * | 2020-03-31 | 2021-10-01 | 横河电机株式会社 | Learning data processing device, learning data processing method, and medium |
CN111754033A (en) * | 2020-06-15 | 2020-10-09 | 西安工业大学 | Non-stationary time sequence data prediction method based on recurrent neural network |
CN113743971A (en) * | 2020-06-17 | 2021-12-03 | 北京沃东天骏信息技术有限公司 | Data processing method and device |
CN112923922A (en) * | 2021-03-04 | 2021-06-08 | 香港理工大学深圳研究院 | Method, system and storage medium for counting steps and determining position information of pedestrian |
CN113051811B (en) * | 2021-03-16 | 2022-08-05 | 重庆邮电大学 | Multi-mode short-term traffic jam prediction method based on GRU network |
CN113051811A (en) * | 2021-03-16 | 2021-06-29 | 重庆邮电大学 | Multi-mode short-term traffic jam prediction method based on GRU network |
CN114528334A (en) * | 2022-02-18 | 2022-05-24 | 重庆伏特猫科技有限公司 | Rapid similarity searching method in time sequence database |
CN115099144A (en) * | 2022-06-24 | 2022-09-23 | 无锡物联网创新中心有限公司 | Yarn raw material characteristic parameter inversion method and related device |
CN115204533A (en) * | 2022-09-16 | 2022-10-18 | 中国地质大学(北京) | Oil-gas yield prediction method and system based on multivariable weighted combination model |
CN116311829A (en) * | 2023-05-22 | 2023-06-23 | 广州豪特节能环保科技股份有限公司 | Remote alarm method and device for data machine room |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108399248A (en) | A kind of time series data prediction technique, device and equipment | |
CN112446591B (en) | Zero sample evaluation method for student comprehensive ability evaluation | |
Widiputra et al. | Multivariate cnn-lstm model for multiple parallel financial time-series prediction | |
CN114509266B (en) | Bearing health monitoring method based on fault feature fusion | |
CN106570516A (en) | Obstacle recognition method using convolution neural network | |
Chatterjee et al. | Extraction of binary black hole gravitational wave signals from detector data using deep learning | |
CN108447057A (en) | SAR image change detection based on conspicuousness and depth convolutional network | |
CN103268519A (en) | Electric power system short-term load forecast method and device based on improved Lyapunov exponent | |
CN115601661A (en) | Building change detection method for urban dynamic monitoring | |
CN114297036A (en) | Data processing method and device, electronic equipment and readable storage medium | |
CN104809471A (en) | Hyperspectral image residual error fusion classification method based on space spectrum information | |
CN114565594A (en) | Image anomaly detection method based on soft mask contrast loss | |
Priyadarshini | A comparative analysis of prediction using Artificial Neural network and auto regressive integrated moving average | |
CN109920489A (en) | It is a kind of that model and method for building up are hydrocracked based on Lasso-CCF-CNN | |
CN105787265A (en) | Atomic spinning top random error modeling method based on comprehensive integration weighting method | |
CN107220346A (en) | A kind of higher-dimension deficiency of data feature selection approach | |
CN116364203A (en) | Water quality prediction method, system and device based on deep learning | |
CN116541771A (en) | Unbalanced sample bearing fault diagnosis method based on multi-scale feature fusion | |
Zhang et al. | Multivariate discrete grey model base on dummy drivers | |
CN115359197A (en) | Geological curved surface reconstruction method based on spatial autocorrelation neural network | |
CN112801955A (en) | Plankton detection method under unbalanced population distribution condition | |
CN111695989A (en) | Modeling method and platform of wind-control credit model | |
Wang et al. | Medium and long-term trend prediction of urban air quality based on deep learning | |
Chakravarthi et al. | Gross Domestic Product Prediction Model Using Gradient Boosting Algorithm in Machine Learning | |
CN115270638B (en) | Urban thermal environment downscaling space-time analysis and prediction method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180814 |