CN108399248A - Time series data prediction method, apparatus and device - Google Patents

Time series data prediction method, apparatus and device

Info

Publication number
CN108399248A
Authority
CN
China
Prior art keywords
time series
series data
antibody
sequence
deep learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810174986.2A
Other languages
Chinese (zh)
Inventor
李峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou Yunhai Information Technology Co Ltd filed Critical Zhengzhou Yunhai Information Technology Co Ltd
Priority to CN201810174986.2A priority Critical patent/CN108399248A/en
Publication of CN108399248A publication Critical patent/CN108399248A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474Sequence data queries, e.g. querying versioned data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21Design, administration or maintenance of databases
    • G06F16/215Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/18Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/12Computing arrangements based on biological models using genetic models
    • G06N3/126Evolutionary algorithms, e.g. genetic algorithms or genetic programming

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Optimization (AREA)
  • Operations Research (AREA)
  • Fuzzy Systems (AREA)
  • Algebra (AREA)
  • Quality & Reliability (AREA)
  • Physiology (AREA)
  • Genetics & Genomics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a time series data prediction method, apparatus and device. The method includes: obtaining historical time series data, and performing data cleaning and data slicing on the historical time series data to obtain a corresponding time series data sequence; performing a stationarization operation on the time series data sequence, and performing feature reconstruction on the stationarized sequence using an immune genetic feature reconstruction algorithm to obtain a corresponding feature sequence; and obtaining a deep learning model trained on the feature sequence, and performing time series data prediction with the deep learning model. Unlike the prior art, which obtains data set features by sampling, the application ensures the validity of the extracted time series features through the above data preprocessing, stationarization and feature reconstruction steps, so that the deep learning model can learn the temporal features of the time series data, which in turn ensures the prediction accuracy of the deep learning model.

Description

Time series data prediction method, apparatus and device
Technical field
The present invention relates to the field of deep learning technology, and more specifically to a time series data prediction method, apparatus and device.
Background
The "exchange rate", abbreviated ExRate and also known as the "foreign exchange quotation" or "foreign exchange rate", is the ratio at which one currency is converted into another, and reflects the relative price between currencies. With the legalization of the global floating exchange rate system and the strengthening trend of world economic integration, foreign exchange has become an important component of many capital products, so the prediction of this important capital element has attracted the attention of investors and securities markets.
In recent years, with the development of heuristic algorithms, various machine learning algorithms have been applied to exchange rate forecasting. Specifically, the prior art proposes using a generalized regression neural network (GRNN) model to simulate the nonlinear variation of the exchange rate. However, such a static network model has to learn the features of the input data set by random sampling, which prevents the model from learning the temporal features of exchange rate changes.
In summary, the prior-art technical solutions for exchange rate forecasting suffer from the problem that the model cannot learn the temporal features of exchange rate changes.
Summary of the invention
The object of the present invention is to provide a time series data prediction method, apparatus and device that, by determining the features of a time series data sequence, enable the model to effectively learn the temporal features of exchange rate changes.
To achieve the above object, the present invention provides the following technical solutions:
A time series data prediction method, including:
obtaining historical time series data, and performing data cleaning and data slicing on the historical time series data to obtain a corresponding time series data sequence;
performing a stationarization operation on the time series data sequence, and performing feature reconstruction on the stationarized time series data sequence using an immune genetic feature reconstruction algorithm to obtain a corresponding feature sequence;
obtaining a deep learning model trained on the feature sequence, and performing time series data prediction with the deep learning model.
Preferably, before performing time series data prediction with the deep learning model, the method further includes:
calculating the prediction precision of the deep learning model based on the feature sequence; if the prediction precision meets a preset requirement, determining that the deep learning model can be used for time series data prediction; otherwise, determining that the deep learning model cannot be used for time series data prediction.
Preferably, calculating the prediction precision of the deep learning model using the feature sequence includes:
obtaining a training set and a test set, the training set and the test set being obtained by dividing the feature sequence into multiple subsequences and grouping those subsequences, wherein the deep learning model is trained on the training set;
inputting each subsequence in the training set into the deep learning model, and forming a training residual sequence from the residuals between each time series value output by the deep learning model and the corresponding actual time series value; inputting each subsequence in the test set into the deep learning model, and forming a test residual sequence from the residuals between each time series value output by the deep learning model and the corresponding actual time series value;
calculating the LB value of the training residual sequence and the LB value of the test residual sequence using the LB test.
Correspondingly, judging whether the prediction precision meets the preset requirement includes:
comparing the LB values corresponding to the training set and the test set with the preset requirement respectively; if the LB values of both the training set and the test set meet the preset requirement, determining that the deep learning model can be used for time series data prediction; otherwise, determining that the deep learning model cannot be used for time series data prediction.
Preferably, after comparing the LB values corresponding to the training set and the test set with the preset requirement respectively, the method further includes:
if the LB value of the test set does not meet the preset requirement while the LB value of the training set does, returning to the step of performing feature reconstruction on the stationarized time series data sequence using the immune genetic feature reconstruction algorithm, until the LB values of both the training set and the test set meet the preset requirement.
Preferably, performing feature reconstruction on the stationarized time series data sequence using the immune genetic feature reconstruction algorithm to obtain the corresponding feature sequence includes:
obtaining the residual derivative vector X of the stationarized time series data sequence x according to the following formula:
$X = [x, x', x^{(2)}, \ldots, x^{(n)}]^{T}$
obtaining the formula by which the corresponding feature sequence is derived from the derivative vector X: [formula not reproduced in the source text]
obtaining the affinity evaluation function: [formula not reproduced in the source text]
where A is an n × n square matrix, B is an n × 1 column vector, n is the order of the feature transformation, m is the length of the sequence input to the deep learning model obtained from each training set, the predicted value in the affinity evaluation function is the time series data output by the deep learning model after a subsequence of the training set is input into the corresponding deep learning model, Y denotes the actual time series data corresponding to the time series data output by the deep learning model, and N is the number of subsequences in the training set; the deep learning model, the training set and the value of the affinity evaluation function correspond one to one. The training set is obtained as follows: the feature sequence corresponding to each generated pair of values of A and B is determined, each feature sequence is divided into multiple subsequences, and the subsequences are grouped to obtain the training set and test set corresponding to that feature sequence;
determining, using the immune genetic feature reconstruction algorithm, the values of A and B that minimize the value of the affinity evaluation function, and taking the feature sequence corresponding to those values of A and B as the feature sequence finally obtained by the feature reconstruction.
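The derivative vector X described above can be illustrated with a short sketch. It assumes that the derivatives of a discrete sequence are approximated by repeated first differences and that the rows are aligned by keeping their most recent entries; both choices, and the function name `derivative_vector`, are assumptions made for illustration. The mapping from X to the feature sequence through A and B is not sketched, because its formula is not reproduced in the text.

```python
def derivative_vector(x, n):
    """Stack x and its first n finite differences as rows of X.

    Assumption: the k-th derivative is approximated by the k-th order
    difference, and every row is truncated to the same length so that
    the rows line up on their most recent samples.
    """
    rows = [list(x)]
    for _ in range(n):
        prev = rows[-1]
        # One more order of differencing shortens the row by one element.
        rows.append([b - a for a, b in zip(prev, prev[1:])])
    length = len(x) - n
    return [row[-length:] for row in rows]


X = derivative_vector([1, 4, 9, 16, 25], n=2)
# Rows: the sequence itself, its first difference, its second difference.
```

The result has n + 1 rows, one per entry of the vector written in the formula above.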
Preferably, determining, using the immune genetic feature reconstruction algorithm, the values of A and B that minimize the value of the affinity evaluation function includes:
generating multiple antibodies, each antibody corresponding to one pair of values of A and B;
eliminating the antibodies whose A is a singular matrix, and eliminating the antibodies that match an antibody in the memory bank, the memory bank containing illegal antibodies obtained by tests in advance;
calculating the value of the affinity evaluation function for each antibody and judging whether any termination condition is met. If so, the antibody with the minimum affinity evaluation function value is determined to be the best-matching antibody, and the values of A and B corresponding to the best-matching antibody are taken as the values of A and B that minimize the affinity evaluation function. If not, the antibodies are classified according to the values of their first two positions, a genetic mutation operation is performed on the antibodies in each class to obtain new antibodies, and the method returns, with the new antibodies, to the step of eliminating the antibodies whose A is a singular matrix. The termination conditions are: the affinity evaluation function value of some antibody is below a predetermined threshold; the number of returns to the step of eliminating the antibodies whose A is a singular matrix reaches a preset count; the antibody with the minimum affinity evaluation function value has not changed for a preset number of consecutive iterations.
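The screening step above (dropping antibodies whose A is singular or that appear in the memory bank) can be sketched as follows. The sketch assumes an antibody is stored as a pair `(A, B)` of nested lists, that "matching" the memory bank means exact equality, and that singularity is tested with a numerical tolerance; these representations and the names `determinant` and `screen_antibodies` are illustrative assumptions.

```python
def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0  # no usable pivot: the matrix is singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def screen_antibodies(antibodies, memory_bank, tol=1e-9):
    """Keep antibodies whose A is non-singular and that are not in the memory bank."""
    kept = []
    for A, B in antibodies:
        if abs(determinant(A)) <= tol:
            continue  # A is (numerically) singular
        if (A, B) in memory_bank:
            continue  # known illegal antibody (assumed: exact match)
        kept.append((A, B))
    return kept
```

The tolerance `tol` is a hypothetical parameter; the text only states that antibodies with a singular A are eliminated.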
Preferably, performing the genetic mutation operation on the antibodies in each class to obtain new antibodies includes:
treating each class of antibodies as one parent population, and obtaining the proportions of replication, crossover and mutation as p, q and r respectively;
selecting from each parent population the pK antibodies with the smallest affinity evaluation function values and retaining them in the offspring; randomly selecting qK/2 pairs of antibodies from each parent population, exchanging the i-th position and the subsequent sequence of each pair, and retaining the antibodies obtained after the exchange in the offspring; randomly selecting rK antibodies from each parent population, replacing the j-th position of each selected antibody with a random number from a preset value range, and retaining the antibodies obtained after the replacement in the offspring; where K is the number of antibodies in each class, and i and j are random numbers;
determining that the antibodies contained in all offspring are the new antibodies obtained by the genetic mutation operation.
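The replication, crossover and mutation steps above can be sketched for one class of antibodies as follows. The sketch assumes each antibody is encoded as a flat list of numbers, that counts such as pK are rounded to integers, and that a crossover pair consists of two distinct antibodies; the function name `next_generation` and the parameters `low`, `high` and `rng` are hypothetical.

```python
import random


def next_generation(parent, affinity, p, q, r, low, high, rng=random):
    """Build the offspring of one antibody class (a list of flat lists)."""
    K = len(parent)
    length = len(parent[0])
    # Replication: keep the pK antibodies with the smallest affinity value.
    children = [ab[:] for ab in sorted(parent, key=affinity)[:round(p * K)]]
    # Crossover: for qK/2 random pairs, swap position i and everything after it.
    for _ in range(round(q * K / 2)):
        a, b = rng.sample(parent, 2)
        i = rng.randrange(length)
        children.append(a[:i] + b[i:])
        children.append(b[:i] + a[i:])
    # Mutation: for rK random antibodies, replace position j with a random
    # number drawn from the preset value range [low, high].
    for ab in rng.sample(parent, round(r * K)):
        j = rng.randrange(length)
        mutant = ab[:]
        mutant[j] = rng.uniform(low, high)
        children.append(mutant)
    return children
```

With p + q + r = 1 the offspring has the same size K as the parent population, which keeps the population size stable across iterations; the text does not state this constraint, so it is noted here only as a convenient choice.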
Preferably, obtaining the deep learning model trained on the feature sequence includes:
obtaining a GRU model trained on the feature sequence.
A time series data prediction apparatus, including:
a preprocessing module, configured to obtain historical time series data, and perform data cleaning and data slicing on the historical time series data to obtain a corresponding time series data sequence;
a feature reconstruction module, configured to perform a stationarization operation on the time series data sequence, and perform feature reconstruction on the stationarized time series data sequence using the immune genetic feature reconstruction algorithm to obtain a corresponding feature sequence;
a prediction module, configured to obtain a deep learning model trained on the feature sequence, and perform time series data prediction with the deep learning model.
A time series data prediction device, including:
a memory, for storing a computer program;
a processor, for implementing the steps of any of the above time series data prediction methods when executing the computer program.
The present invention provides a time series data prediction method, apparatus and device. The method includes: obtaining historical time series data, and performing data cleaning and data slicing on the historical time series data to obtain a corresponding time series data sequence; performing a stationarization operation on the time series data sequence, and performing feature reconstruction on the stationarized time series data sequence using the immune genetic feature reconstruction algorithm to obtain a corresponding feature sequence; obtaining a deep learning model trained on the feature sequence, and performing time series data prediction with the deep learning model. In the technical solution provided by the invention, data cleaning and data slicing ensure the completeness and accuracy of the historical time series data; the stationarization operation ensures the stationarity of the time series data sequence; and feature reconstruction with the immune genetic feature reconstruction algorithm ensures effective extraction of the features of the time series data sequence. Together these ensure the accuracy of time series data prediction performed by the deep learning model built on the feature sequence. Unlike the prior art, which obtains data set features by sampling, the application ensures the validity of the extracted time series features through the above steps, and thereby ensures that the deep learning model can learn the temporal features of the time series data. Moreover, the disclosed technical solution can be carried out automatically without manual participation, which avoids the adverse effects of human error and further ensures the accuracy of the extracted temporal features.
Description of the drawings
In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are only embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from the provided drawings without creative effort.
Fig. 1 is a flow chart of a time series data prediction method provided by an embodiment of the present invention;
Fig. 2 is a structure diagram of the GRU model in a time series data prediction method provided by an embodiment of the present invention;
Fig. 3 is a network structure diagram of a GRU model unit in a time series data prediction method provided by an embodiment of the present invention;
Fig. 4 is a model fitting curve in a time series data prediction method provided by an embodiment of the present invention;
Fig. 5 is a model precision curve in a time series data prediction method provided by an embodiment of the present invention;
Fig. 6 is a scatter plot of the sequence after first-order difference stationarization in a time series data prediction method provided by an embodiment of the present invention;
Fig. 7 is an autocorrelation plot of the sequence after first-order difference stationarization in a time series data prediction method provided by an embodiment of the present invention;
Fig. 8 is a partial autocorrelation plot of the sequence after first-order difference stationarization in a time series data prediction method provided by an embodiment of the present invention;
Fig. 9 is a schematic diagram of ARCH model prediction results in a time series data prediction method provided by an embodiment of the present invention;
Fig. 10 is a schematic diagram of GARCH model prediction results in a time series data prediction method provided by an embodiment of the present invention;
Fig. 11 is a schematic diagram of binarized GARCH model prediction results in a time series data prediction method provided by an embodiment of the present invention;
Fig. 12 is a schematic diagram of GRU-based prediction results on the reconstructed sequence in a time series data prediction method provided by an embodiment of the present invention;
Fig. 13 is a schematic diagram of GRU-based binarized prediction results in a time series data prediction method provided by an embodiment of the present invention;
Fig. 14 is a schematic diagram of the residual LB test at each lag order in a time series data prediction method provided by an embodiment of the present invention;
Fig. 15 is a structural schematic diagram of a time series data prediction apparatus provided by an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments of the present invention. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Referring to Fig. 1, which shows a flow chart of a time series data prediction method provided by an embodiment of the present invention, the method may include the following steps:
S11: Obtain historical time series data, and perform data cleaning and data slicing on the historical time series data to obtain a corresponding time series data sequence.
It should be noted that the time series data in the application may be asset sequences with time series characteristics, such as exchange rates, stocks, futures and precious metals, as well as other data with time series characteristics, all of which fall within the protection scope of the present invention. In addition, the application is suitable for both structured and unstructured time series data; that is, it is general across time series data of different structures.
Historical time series data is the time series data generated before the current time; the historical time series data obtained in the application may be the time series data generated within a period before the current time that is set according to actual needs. Taking the exchange rate as an example: foreign exchange trading is global and continuous, but the opening and closing of the foreign exchange market in each region are local, so the exchange rate data actually obtained is usually large in volume and incompletely recorded and cannot be used directly for exchange rate forecasting. To solve the forecasting problem for such complex time series, data preprocessing is therefore first performed on the obtained historical exchange rate data. The preprocessing may include data cleaning and data slicing. Data cleaning either rejects the incomplete records in the historical time series data or fills in the missing data by interpolation, and removes the systematic errors in the historical time series data by filtering while retaining the main patterns. In the present invention, data cleaning may be implemented by rejecting the incomplete records and removing the systematic errors with a mean filter of window size 3. Data slicing samples and splits the exchange rate sequence data according to the forecasting requirement, for example by taking from the historical time series data one value generated per preset time period (1 minute) and forming the time series data sequence from the values obtained, so that step S11 yields one time series data sequence. The above preprocessing thus ensures the completeness and accuracy of the historical time series data.
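The cleaning and slicing described for step S11 can be sketched as follows. The sketch assumes records are `(timestamp_in_seconds, value)` pairs with `None` marking an incomplete record, that the mean filter uses a shorter window at the two ends of the series, and that slicing keeps the first sample in each period; the record layout and the name `clean_and_slice` are assumptions for illustration.

```python
from statistics import mean


def clean_and_slice(records, step=60, window=3):
    """Reject incomplete records, mean-filter the values, then resample.

    records: list of (timestamp_seconds, value_or_None), sorted by time.
    step:    sampling period in seconds (1 minute in the text).
    window:  mean-filter window size (3 in the text).
    """
    # Data cleaning, part 1: reject incomplete records.
    complete = [(t, v) for t, v in records if v is not None]
    values = [v for _, v in complete]
    # Data cleaning, part 2: mean filter (assumed: shorter window at the edges).
    half = window // 2
    smoothed = [mean(values[max(0, i - half): i + half + 1])
                for i in range(len(values))]
    # Data slicing: keep one value per `step` seconds.
    series, next_t = [], None
    for (t, _), s in zip(complete, smoothed):
        if next_t is None or t >= next_t:
            series.append(s)
            next_t = t + step
    return series


records = [(0, 1.0), (30, None), (60, 2.0), (90, 3.0), (120, 6.0)]
series = clean_and_slice(records)
```

Interpolating the missing values, the other cleaning option mentioned in the text, would replace the first filtering line and is not shown here.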
S12: Perform a stationarization operation on the time series data sequence, and perform feature reconstruction on the stationarized time series data sequence using the immune genetic feature reconstruction algorithm to obtain a corresponding feature sequence.
It should be noted that, after the feature sequence is obtained, it still has to be converted into a labeled data set suitable for supervised learning; since this step follows the same principle as the corresponding prior-art solutions, it is not described further here. Because a time series data sequence is usually non-stationary, it has to be stationarized before it is converted into the data set (including the training set and the test set) used to train the deep learning model. To determine the sequence length and feature values of the data set finally passed to the deep learning model, the application performs feature reconstruction on the time series data sequence using the immune genetic feature reconstruction algorithm (IGFRA).
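The conversion of a feature sequence into a labeled data set is left to the prior art in the text. One common way to do it is a sliding window, sketched below under the assumption that each window of m consecutive values is an input and the value that follows it is the label; the function name `make_supervised` is hypothetical.

```python
def make_supervised(feature_seq, m):
    """Turn a sequence into (input window, next value) training pairs."""
    count = len(feature_seq) - m
    # Each input is m consecutive values; its label is the value right after.
    inputs = [feature_seq[i:i + m] for i in range(count)]
    targets = [feature_seq[i + m] for i in range(count)]
    return inputs, targets


inputs, targets = make_supervised([1, 2, 3, 4, 5], m=2)
# inputs  -> [[1, 2], [2, 3], [3, 4]]
# targets -> [3, 4, 5]
```

The window length m here plays the role of the input sequence length that the feature reconstruction is said to determine.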
Regarding the stationarization operation, it should be noted that the non-stationarity of a time series data sequence causes the deep learning model to focus on learning the trend of the sample sequence, while the fluctuations of the time series data are treated as noise and ignored; the sequence is therefore stationarized first during feature construction. A common method is first-order difference stationarization. According to time series analysis theory, the volatility of time series data can be approximated by the difference of the residuals, and complex sequence fluctuations can be approximately characterized by a linear combination of higher-order derivatives of the residuals. The above stationarization operation follows the same principle as the corresponding prior-art solutions and is not described further here.
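First-order difference stationarization, the common method named above, can be sketched in a few lines. The `order` parameter, which repeats the differencing, is an illustrative extension and not something the text specifies.

```python
def difference(series, order=1):
    """Apply first-order differencing `order` times."""
    out = list(series)
    for _ in range(order):
        # Replace each value by its change from the previous value.
        out = [b - a for a, b in zip(out, out[1:])]
    return out


trend = [1, 3, 6, 10]
print(difference(trend))     # [2, 3, 4]
print(difference(trend, 2))  # [1, 1]
```

Each pass removes one level of trend and shortens the sequence by one element.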
S13: Obtain a deep learning model trained on the feature sequence, and perform time series data prediction with the deep learning model.
After the deep learning model trained on the feature sequence is obtained, it can be used to predict time series data. Specifically, the time series data to be predicted can be converted into a sequence that the deep learning model can recognize, and this sequence is then input into the deep learning model, which outputs the corresponding prediction result. Since using a deep learning model to predict time series data follows the same principle as the corresponding prior-art solutions, it is not described further here.
In the technical solution provided by the embodiment of the present invention, data cleaning and data slicing ensure the completeness and accuracy of the historical time series data; the stationarization operation ensures the stationarity of the time series data sequence; and feature reconstruction with the immune genetic feature reconstruction algorithm ensures effective extraction of the features of the time series data sequence. Together these ensure the accuracy of time series data prediction performed by the deep learning model built on the feature sequence. Unlike the prior art, which obtains data set features by sampling, the application ensures the validity of the extracted time series features through the above steps, and thereby ensures that the corresponding model can learn the temporal features of the time series data. Moreover, the disclosed technical solution can be carried out automatically without manual participation, which avoids the adverse effects of human error and further ensures the accuracy of the extracted temporal features.
In a time series data prediction method provided by an embodiment of the present invention, before time series data prediction is performed with the deep learning model, the method may further include:
calculating the prediction precision of the deep learning model based on the feature sequence; if the prediction precision meets a preset requirement, determining that the deep learning model can be used for time series data prediction; otherwise, determining that the deep learning model cannot be used for time series data prediction.
It should be noted that calculating the prediction precision of the deep learning model based on the feature sequence may specifically be done by dividing the feature sequence into multiple subsequences, inputting any subsequence into the deep learning model, calculating the difference between the time series value output by the deep learning model and the actual time series value corresponding to the subsequence, and taking that difference as a percentage of the corresponding actual value as the prediction precision. Other ways of calculating the prediction precision of the deep learning model may of course be used according to actual needs, all of which fall within the protection scope of the present invention. If the prediction precision of the deep learning model meets the preset requirement set in advance according to actual needs (such as a precision range), the deep learning model is determined to be usable for time series data prediction, and the step of performing time series data prediction with the deep learning model is then executed. Otherwise, the prediction precision of the deep learning model is considered unsatisfactory, the model is determined to be unusable for time series data prediction, and the method returns to the step of reconstructing features with IGFRA. This further ensures the precision of the time series data prediction.
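The percentage-based precision check described above can be sketched as follows. The sketch assumes the model is any callable that maps a subsequence to one predicted value, that all supplied subsequences are checked rather than a single one, and that the preset requirement is an upper bound on the relative error; `relative_error`, `model_is_usable` and `max_error` are hypothetical names.

```python
def relative_error(predicted, actual):
    """Difference between prediction and actual value, as a fraction of the actual value."""
    return abs(predicted - actual) / abs(actual)


def model_is_usable(model, subsequences, actuals, max_error=0.01):
    """Accept the model only if every relative error is within max_error."""
    errors = [relative_error(model(s), y) for s, y in zip(subsequences, actuals)]
    return max(errors) <= max_error


# A stand-in "model" that simply repeats the last observed value.
def naive_model(subsequence):
    return subsequence[-1]


ok = model_is_usable(naive_model, [[1.0, 2.0], [2.0, 3.0]], [2.01, 3.0])
```

When the check fails, the text returns to the IGFRA feature reconstruction step rather than retraining on the same features.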
In the time series data prediction method provided by an embodiment of the present invention, calculating the prediction precision of the deep learning model using the feature sequence may include:
obtaining a training set and a test set, the training set and the test set being obtained by dividing the feature sequence into multiple subsequences and then grouping the subsequences, wherein the deep learning model is trained on the training set;
inputting each subsequence in the training set into the deep learning model, and forming a training residual sequence from the residuals between each time series datum output by the deep learning model and the corresponding actual time series datum; inputting each subsequence in the test set into the deep learning model, and forming a test residual sequence from the residuals between each time series datum output by the deep learning model and the corresponding actual time series datum;
calculating the LB value of the training residual sequence and the LB value of the test residual sequence respectively using the LB (Ljung-Box) test.
It should be noted that the feature sequence may be divided into multiple subsequences of equal length, and these subsequences are then divided into two groups, one serving as the training set and the other as the test set, where the training set is used to train the deep learning model. The residual sequence corresponding to the training set (the training residual sequence) and the residual sequence corresponding to the test set (the test residual sequence) are obtained as described above, and the LB values of the two residual sequences are then obtained using the LB test; the two LB values can be taken as the prediction precision of the deep learning model. Calculating the LB value of a residual sequence with the LB test is prior art and is not described further here.
Corresponding to the above technical solution, judging whether the prediction precision meets the preset requirement may include:
comparing the LB values corresponding to the training set and the test set with the preset requirement respectively; if both LB values meet the preset requirement, determining that the deep learning model can be used for time series data prediction; otherwise, determining that the deep learning model cannot be used for time series data prediction.
The preset requirement is the criterion for judging whether the residual sequence corresponding to an LB value is white noise: if the LB value is greater than 0.05, the corresponding residual sequence is considered white noise and the LB value meets the preset requirement; otherwise, the corresponding residual sequence is considered non-white noise and the LB value does not meet the preset requirement. Specifically, after the LB values corresponding to the training set and the test set are obtained, if both meet the preset requirement, the deep learning model is considered to have fully learned the sample features (the samples in the embodiments of the present application are the subsequences obtained by dividing the feature sequence), the sample features are considered to have been chosen appropriately, and the general regularities of the time series data sequence are considered to be expressed accurately; in that case it is determined that the deep learning model can be used for time series data prediction. Otherwise, the deep learning model is considered unusable for time series data prediction. The LB test thus provides an effective judgment of the prediction precision of the deep learning model.
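For illustration, the white-noise check can be sketched with a plain-Python Ljung-Box test. The sketch reads the "LB value" compared against 0.05 as the p-value of the test, which is an interpretation rather than something the text states; the number of lags and all function names are likewise assumptions.

```python
import math


def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom


def chi2_sf(q, df):
    """Upper-tail chi-square probability via the incomplete-gamma series."""
    if q <= 0:
        return 1.0
    a, x = df / 2.0, q / 2.0
    term = 1.0 / a
    total = term
    for k in range(1, 10000):
        term *= x / (a + k)
        total += term
        if term < total * 1e-16:
            break
    lower = total * math.exp(-x + a * math.log(x) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def ljung_box_pvalue(residuals, lags=10):
    """Q = n(n+2) * sum(rho_k^2 / (n-k)); p-value from chi-square(lags)."""
    n = len(residuals)
    q = n * (n + 2) * sum(
        autocorr(residuals, k) ** 2 / (n - k) for k in range(1, lags + 1)
    )
    return chi2_sf(q, lags)


def is_white_noise(residuals, lags=10, alpha=0.05):
    """Apply the 0.05 rule: above alpha means treated as white noise."""
    return ljung_box_pvalue(residuals, lags) > alpha
```

Under this reading, a model would be accepted only when `is_white_noise` holds for both the training residual sequence and the test residual sequence.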
In the time series data prediction method provided by an embodiment of the present invention, after the LB values corresponding to the training set and the test set are compared with the preset requirement respectively, the method may further include:
if the LB value corresponding to the test set does not meet the preset requirement and the LB value corresponding to the training set meets the preset requirement, returning to the step of performing feature reconstruction, using the immune genetic feature reconstruction algorithm, on the time series data sequence after the stationarization operation, until the LB values corresponding to both the training set and the test set meet the preset requirement.
If the LB value corresponding to the test set does not meet the preset requirement while the LB value corresponding to the training set does, that is, the residual sequence of the training set is white noise and the residual sequence of the test set is non-white noise, the deep learning model is considered to have learned the sample features sufficiently, but the sample features are considered inappropriately chosen and the general regularities of the time series data sequence inaccurately expressed; the sample features are therefore rebuilt, that is, the process returns to feature reconstruction. If the residual sequence of the training set is non-white noise, the training of the deep learning model is considered incomplete; the model parameters are then adjusted and the adjusted deep learning model is retrained on the training set until the LB values corresponding to both the training set and the test set meet the preset requirement, thereby ensuring effective training of the deep learning model.
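The branching described above can be summarized in a small decision function. This is a sketch only: the action labels are hypothetical names, and a training-set residual that fails the 0.05 rule is read as non-white noise, which triggers retraining.

```python
def next_action(train_lb, test_lb, alpha=0.05):
    """Map the two LB values to the follow-up step described in the text."""
    train_ok = train_lb > alpha   # training residuals look like white noise
    test_ok = test_lb > alpha     # test residuals look like white noise
    if train_ok and test_ok:
        return "use_model"
    if train_ok:
        # Model fits the samples, but the features generalize poorly.
        return "reconstruct_features"
    # Training residuals still carry structure: training is incomplete.
    return "retrain_model"
```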
In the time series data prediction method provided by an embodiment of the present invention, performing feature reconstruction on the time series data sequence after the stationarization operation using the immune genetic feature reconstruction algorithm to obtain the corresponding feature sequence may include:
obtaining, according to the following formula, the derivative vector X of the time series data sequence x after the stationarization operation:
$X = [x, x', x^{(2)}, \ldots, x^{(n)}]^{T}$
obtaining the following formula, which gives the corresponding feature sequence from the derivative vector X:
obtaining the following affinity evaluation function:
where A is an n × n square matrix, B is an n × 1 column vector, n is the order of the feature transformation, m is the length of the sequences input to the deep learning model trained on each training set, the predicted term denotes the time series data output by the corresponding deep learning model after a subsequence of the training set is input into it, Y denotes the actual time series data corresponding to the time series data output by the deep learning model, N is the number of subsequences in the training set, and the deep learning model, the training set and the value of the affinity evaluation function are in one-to-one correspondence. The training set is obtained as follows: determine the feature sequence corresponding to each generated group of values of A and B, divide each feature sequence into multiple subsequences, and group the subsequences to obtain the training set and the test set corresponding to each feature sequence;
determining, using the immune genetic feature reconstruction algorithm, the values of A and B that minimize the value of the affinity evaluation function, and taking the feature sequence corresponding to those values of A and B as the feature sequence finally obtained by feature reconstruction.
The reconstruction function is a mapping from the original feature set to a new feature set. With the new feature set (the feature sequence) denoted as in the formula, the reconstruction function can be expressed in the following form:
It should be noted that multiple groups of values of A and B may be generated at random. Each generated group is substituted into the reconstruction formula to obtain the corresponding feature sequence, so that each group of values of A and B corresponds to one feature sequence. The feature sequence is then divided into multiple subsequences of length m; for any group of values of A and B, all of its subsequences are divided into two groups, one being the training set and the other the test set. The corresponding deep learning model is trained on the training set, the value of the corresponding affinity evaluation function is calculated based on that training set, and the optimal values of A and B are finally determined. When the value of the affinity evaluation function is calculated based on a training set, the subsequences input into the deep learning model are the subsequences of the training set used to train that model. In other words, each group of values of A and B, its training set, its test set, its deep learning model and its affinity evaluation function value are in one-to-one correspondence, and the above calculation relies on this correspondence: the feature sequence corresponding to one group of values of A and B is divided into subsequences, which are grouped into a training set and a test set; one deep learning model is trained on that training set; and the value of the affinity evaluation function is calculated from that training set and the model trained on it.
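The steps above can be sketched as follows. Two assumptions are made that the text reproduced here does not confirm: derivatives are approximated by successive finite differences, and the reconstruction is taken to be the affine map AX + B. The sketch also uses exactly n rows (x and its first n - 1 differences) so that the n × n matrix A applies; all function names are hypothetical.

```python
def difference(seq):
    """First-order forward difference of a sequence."""
    return [b - a for a, b in zip(seq, seq[1:])]


def derivative_rows(x, n):
    """Rows x, x', ..., up to n rows, truncated to a common length."""
    rows, current = [], list(x)
    for _ in range(n):
        rows.append(current)
        current = difference(current)
    length = len(rows[-1])
    return [row[:length] for row in rows]


def reconstruct(rows, A, B):
    """Assumed affine reconstruction: A applied to the rows, plus B."""
    n, length = len(rows), len(rows[0])
    return [
        [sum(A[i][k] * rows[k][t] for k in range(n)) + B[i]
         for t in range(length)]
        for i in range(n)
    ]


def split_subsequences(series, m):
    """Cut a series into consecutive, non-overlapping windows of length m."""
    return [series[i:i + m] for i in range(0, len(series) - m + 1, m)]
```

Each candidate pair (A, B) would yield one reconstructed sequence, whose windows are then grouped into that candidate's own training set and test set.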
It should further be noted that, when carrying out the above steps in the present application, the subsequences obtained by dividing a feature sequence need not be grouped. That is, a deep learning model may be trained on all the subsequences obtained from one feature sequence, and the value of the affinity evaluation function then calculated from that model and all the subsequences used to train it. In this case, once the optimal values of A and B have been determined, the corresponding feature sequence must be divided into multiple subsequences and grouped into a training set and a test set, and a deep learning model trained on the training set; this model is the deep learning model in step S13. If, instead, the values of A and B are determined by the above steps in which the subsequences are grouped, the feature sequence corresponding to the optimal values of A and B need not be divided again, and the deep learning model in step S13 is the model corresponding to the optimal values of A and B obtained while carrying out those steps. The feature sequence corresponding to the optimal values of A and B may of course also be divided again and a deep learning model trained anew; all of these fall within the protection scope of the present invention.
In addition, the purpose of feature reconstruction is to improve the accuracy of time series data prediction, so the mean squared error (MSE) between the time series data output by the deep learning model and the corresponding actual time series data is used as the affinity evaluation function:
The higher the order and the longer the input sequence, the greater the complexity and computational cost of the deep learning model. To prevent growth in feature complexity from reducing the generalization ability of the deep learning model, a sparse penalty term is introduced to balance the precision of feature selection against structural complexity, giving the following affinity evaluation function:
The purpose of the immune genetic feature reconstruction algorithm is to determine suitable values of A and B such that the value of the above affinity evaluation function is minimized.
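A minimal sketch of an affinity evaluation function of this kind is given below. The exact penalty term is not reproduced in this text, so the L1 norm of A and B and the weight `penalty_weight` are assumptions chosen only to illustrate how a sparse penalty trades prediction error against structural complexity.

```python
def mse(predicted, actual):
    """Mean squared error between model outputs and actual values."""
    return sum((p - y) ** 2 for p, y in zip(predicted, actual)) / len(actual)


def affinity(predicted, actual, A, B, penalty_weight=0.01):
    """MSE plus an assumed L1 sparsity penalty on the entries of A and B."""
    l1 = sum(abs(v) for row in A for v in row) + sum(abs(v) for v in B)
    return mse(predicted, actual) + penalty_weight * l1
```

Lower values are better; the search over A and B keeps the candidate with the smallest value.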
In the time series data prediction method provided by an embodiment of the present invention, determining, using the immune genetic feature reconstruction algorithm, the values of A and B that minimize the value of the affinity evaluation function may include:
generating multiple antibodies, each antibody corresponding to one group of values of A and B;
eliminating the antibodies whose A is a singular matrix, and eliminating the antibodies that match an antibody in the memory bank, the antibodies contained in the memory bank being illegal antibodies obtained by testing in advance;
calculating the value of the affinity evaluation function for each antibody and judging whether any termination condition is met; if so, determining the antibody with the smallest affinity evaluation function value to be the best-matching antibody, and taking the values of A and B corresponding to the best-matching antibody as the values of A and B that minimize the affinity evaluation function; if not, classifying the antibodies by the values of their first two positions, performing the genetic variation operation on the antibodies in each class to obtain new antibodies, and returning, with the new antibodies, to the step of eliminating the antibodies whose A is a singular matrix. The termination conditions include: some antibody has an affinity evaluation function value below a predetermined threshold; the number of returns to the step of eliminating the antibodies whose A is a singular matrix reaches a preset number; and the antibody with the smallest affinity evaluation function value has not changed for a preset number of consecutive rounds.
It should be noted that the antibodies may be generated at random, and each antibody is obtained by merging A and B. Specifically, A and B are merged and real-coded as one antibody: the first position of the code is the value of n and the second position is the value of m, these first two positions being encoded as positive integers; the following m × (n + 1) positions are real-coded and correspond respectively to the m × n elements of A and the elements of B. Accordingly, once the best-matching antibody has been determined, it can be decomposed to obtain the corresponding A and B.
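The encoding and its inverse can be illustrated as below. The sketch stores n × (n + 1) real values after the two integer positions, following the earlier definition of A as an n × n matrix and B as an n × 1 vector; that sizing, like the function names, is an assumption.

```python
def encode(n, m, A, B):
    """Flatten (n, m, A, B) into one antibody: [n, m, A row by row, B]."""
    genes = [n, m]
    for row in A:
        genes.extend(row)
    genes.extend(B)
    return genes


def decode(antibody):
    """Recover n, m, A and B from an antibody built by encode()."""
    n, m = antibody[0], antibody[1]
    body = antibody[2:]
    A = [body[i * n:(i + 1) * n] for i in range(n)]
    B = body[n * n:n * n + n]
    return n, m, A, B
```

Keeping n and m in the first two positions is what later allows antibodies of the same shape to be grouped into one class.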
Specifically, the purpose of IGFRA is to find suitable A and B such that the value of the affinity evaluation function is minimized. So that the original sample features can be restored, the reconstruction function requires the inverse of A to exist; one rule of negative selection is therefore that an antibody whose A is a singular matrix (a matrix whose determinant is zero) is eliminated. In addition, to ensure that the algorithm evolves in the direction in which the resulting residuals are white noise, predetermined illegal antibodies should not be generated again; another rule of negative selection is therefore that an antibody matching any antibody contained in the memory bank is eliminated. Specifically, an illegal antibody may be the antibody corresponding to a deep learning model whose prediction precision was determined, in the above embodiments, not to meet the preset requirement. It should be noted that the deep learning model, the antibody and the values of A and B correspond to one another, and the antibodies that are eliminated may be referred to as autoimmune antibodies. After the autoimmune antibodies have been eliminated, all remaining antibodies are classified by the values of their first two positions, antibodies with identical first two positions being placed in the same class; the genetic variation operation is then performed within each class to obtain new antibodies, on which antibody elimination, antibody selection and the subsequent steps are carried out.
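The two negative-selection rules can be sketched as follows, with each antibody represented as a pair (A, B). The determinant routine, the matching tolerance and all names are illustrative assumptions; the text does not specify how a match against the stored illegal antibodies is decided.

```python
def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(row) for row in matrix]
    n, det = len(a), 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def matches(ab1, ab2, tol=1e-9):
    """Two antibodies match when all entries of A and B agree within tol."""
    f1 = [v for row in ab1[0] for v in row] + list(ab1[1])
    f2 = [v for row in ab2[0] for v in row] + list(ab2[1])
    return len(f1) == len(f2) and all(abs(p - q) <= tol for p, q in zip(f1, f2))


def negative_selection(antibodies, illegal, tol=1e-9):
    """Drop antibodies with singular A or matching a stored illegal one."""
    survivors = []
    for ab in antibodies:
        if abs(determinant(ab[0])) < 1e-12:
            continue
        if any(matches(ab, bad, tol) for bad in illegal):
            continue
        survivors.append(ab)
    return survivors
```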
To facilitate computer simulation, IGFRA terminates when any one of the above three termination conditions is met, and outputs the antibody with the smallest affinity evaluation function value.
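The three termination conditions can be combined in a single predicate, sketched below. The default threshold, iteration limit and stagnation limit are placeholder values, since the text only calls them predetermined or preset.

```python
def should_stop(best_affinity, iteration, unchanged_rounds,
                threshold=1e-3, max_iterations=100, max_unchanged=10):
    """True when any one of the three termination conditions holds."""
    return (
        best_affinity < threshold              # good enough antibody found
        or iteration >= max_iterations         # iteration budget used up
        or unchanged_rounds >= max_unchanged   # best antibody has stagnated
    )
```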
In the time series data prediction method provided by an embodiment of the present invention, performing the genetic variation operation on the antibodies in each class to obtain new antibodies may include:
taking the antibodies of each class as a parent generation, and obtaining the proportions p, q and r of copying, crossover and mutation respectively;
selecting from each parent generation the pK antibodies with the smallest affinity evaluation function values and retaining them in the offspring; randomly selecting qK/2 pairs of antibodies from each parent generation, exchanging the i-th position and the subsequent sequence of each pair, and retaining the antibodies obtained after the exchange in the offspring; randomly selecting rK antibodies from each parent generation, replacing the j-th position of each with a random number within a preset value range, and retaining the antibodies obtained after the replacement in the offspring; where K is the number of antibodies in each class, and i and j are random numbers;
determining that the antibodies contained in all the offspring are the new antibodies obtained by the genetic variation operation.
The number of antibodies in each class may be the same or different; that is, K here merely denotes the number of antibodies contained in a given class and does not imply that every class contains the same number. p, q and r are proportions set according to actual needs that sum to 1. In addition, i > 3 and j > 2, and the value range can be set according to actual needs; it is generally taken to be the preset range of possible values of A and B.
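One generation of the copy, crossover and mutation operators for a single class can be sketched as follows. Antibodies are flat lists whose first two positions hold n and m; the constraints i > 3 and j > 2 are read as 1-based positions, so neither operator alters those two positions. Rounding pK, qK/2 and rK to integers, and the function name, are assumptions.

```python
import random


def evolve_class(parents, affinities, p, q, r, low, high, rng):
    """Produce the offspring of one class of same-shape antibodies."""
    k = len(parents)
    ranked = [ab for _, ab in
              sorted(zip(affinities, parents), key=lambda pair: pair[0])]
    # Copy: keep the pK antibodies with the smallest affinity values.
    children = [list(ab) for ab in ranked[:round(p * k)]]
    # Crossover: swap the tails of qK/2 random pairs from position i on.
    for _ in range(round(q * k / 2)):
        a, b = rng.sample(parents, 2)
        i = rng.randrange(3, len(a))   # 0-based index 3 is 1-based position 4
        children.append(a[:i] + b[i:])
        children.append(b[:i] + a[i:])
    # Mutation: replace position j of rK random antibodies.
    for _ in range(round(r * k)):
        child = list(rng.choice(parents))
        j = rng.randrange(2, len(child))  # 0-based index 2 is 1-based position 3
        child[j] = rng.uniform(low, high)
        children.append(child)
    return children
```

Passing a seeded `random.Random` instance as `rng` keeps runs reproducible.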
In the time series data prediction method provided by an embodiment of the present invention, obtaining the deep learning model trained on the feature sequence may include:
obtaining a GRU (gated recurrent unit) model trained on the feature sequence.
Recurrent neural networks such as LSTM and GRU are inherently suited to processing time series data and have long-term memory characteristics; in the present invention, based on experimental comparison, a GRU model is therefore used for time series data prediction. Specifically, to let the GRU model predict how the time series data will change, a fully connected neural network layer may be provided in the GRU model; and to improve the generalization performance of the GRU model, the present invention also provides a Batch Normalization layer in the GRU model. The GRU model thus comprises, connected in sequence, a GRU model unit, a Batch Normalization layer and a fully connected neural network layer; the overall model structure may be as shown in Figure 2.
It should be noted that the network structure of the GRU model unit may be as shown in Figure 3, in which the memory information is generated according to the following formula:
where $r_t$ is the reset signal, which takes the value 1 when the hidden state $h_{t-1}$ of the previous moment is relevant to synthesizing the new memory information, and 0 otherwise. $r_t$ is calculated as follows:
$r_t = \sigma(W_r x_t + U_r h_{t-1})$;
where $\sigma(\cdot)$ is the sigmoid function. The hidden state $h_t$ is a linear combination of the state of the previous moment and the memory information, expressed as follows:
where $z_t$ is the update gate signal, which characterizes the degree to which the model passes the hidden state $h_{t-1}$ of the previous moment on to the next moment. It is calculated in the same way as the reset gate signal, by the following formula:
$z_t = \sigma(W_z x_t + U_z h_{t-1})$.
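A single GRU step consistent with the two gate formulas above can be sketched as follows. The formulas for the memory information and the hidden state are not reproduced in this text, so the sketch assumes a commonly used GRU formulation: a tanh candidate computed from the reset-gated previous state, and a hidden state that mixes the previous state and the candidate with weights z and 1 - z. The weight names W and U for the candidate are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(M, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def gru_step(x, h_prev, W, U, Wr, Ur, Wz, Uz):
    """One GRU update; the candidate and mixing rules are assumed forms."""
    # Reset and update gates, as given by the two gate formulas.
    r = [sigmoid(a + b) for a, b in zip(matvec(Wr, x), matvec(Ur, h_prev))]
    z = [sigmoid(a + b) for a, b in zip(matvec(Wz, x), matvec(Uz, h_prev))]
    # Assumed candidate memory: tanh(W x + U (r * h_prev)).
    gated = [ri * hi for ri, hi in zip(r, h_prev)]
    h_tilde = [math.tanh(a + b)
               for a, b in zip(matvec(W, x), matvec(U, gated))]
    # Assumed mixing: z keeps the old state, (1 - z) admits the candidate.
    return [zi * hp + (1.0 - zi) * ht
            for zi, hp, ht in zip(z, h_prev, h_tilde)]
```

In practice a deep learning framework's GRU layer would be used; the sketch only makes the role of the two gates concrete.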
In addition, after the reconstruction function has been obtained using IGFRA, because the nonlinear transformation functions in the GRU model are most sensitive to changes in their input on the interval [-1, 1], the feature sequence needs to be normalized so that the value range of each of its elements is scaled to [-1, 1].
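One simple way to perform this scaling is min-max normalization, sketched below. The text specifies only the target range [-1, 1]; the choice of min-max scaling and the handling of a constant sequence are assumptions.

```python
def scale_to_unit_range(values):
    """Linearly map values so the minimum becomes -1 and the maximum 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant sequence carries no spread; map it to the midpoint.
        return [0.0 for _ in values]
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]
```

The minimum and maximum would normally be taken from the training data and reused unchanged on the test data.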
The present invention is further illustrated below with reference to a specific embodiment. It should be understood that this embodiment is intended only to illustrate the present invention and not to limit its scope. It should also be understood that, after reading the teachings of the present invention, those skilled in the art may make various changes or modifications to it, and such equivalent forms likewise fall within the scope defined by the claims appended to the present application.
To verify the validity of the proposed method, this example uses 16,431 minute-level records of the RMB to US dollar exchange rate from June 1, 2017 to July 31, 2017, constructs features using the method of the present invention, and predicts whether the exchange rate will rise or fall over the following 1 minute. The training data set accounts for 80% of the total number of records, and the remaining 20% serves as the test data set. The hyperparameters of the GRU model are as follows:
Table 1. Model hyperparameters
In the experiment, LSTM and GRU models were each used to fit and predict the exchange rate without the processing disclosed in the present application, such as feature reconstruction. The results show that the fitting error of both models is very low (LSTM: 0.003, GRU: 0.0028), while the average accuracy of the LSTM model in predicting rises and falls is 53.63%. When the network is trained with random sampling, the prediction accuracy on randomly selected test set sequence fragments (drawn at random from the exchange rate data) is 56.9%; the fitting results are shown in Figures 4 and 5. As can be seen from Figures 4 and 5, the price curve fitted by the model lags the actual curve in time, and the lag equals the number of forward prediction steps.
An ADF test was performed on the exchange rate sequence after first-order differencing. The test statistic, -29.338, is below the critical value of -3.43, so the null hypothesis is rejected and the process is considered a weakly stationary sequence. Its scatter plot, autocorrelation plot and partial autocorrelation plot are shown in Figures 6 to 8 respectively.
Table 2. Augmented Dickey-Fuller test results
According to the LB test results in Table 2, the exchange rate sequence after first-order stationarization has an LB value of 0.02382 < 0.05, so the sequence is considered not to be a white noise sequence and to be worth learning from. Because the autocorrelation and partial autocorrelation coefficients of the stationarized sequence both cut off after order 1, and there is evident variation in variance (that is, volatility clustering), traditional experience suggests modelling the sequence with ARCH and GARCH models. Figures 9 to 11 show the prediction outputs of these two models, and their prediction accuracies are shown in Table 3.
Table 3. Traditional TS models and accuracy
The experimental results show that the prediction precision of the ARCH model (autoregressive conditional heteroskedasticity model) and the GARCH model (generalized autoregressive conditional heteroskedasticity model) is not high. With the method provided by the present invention, the sequence after feature reconstruction is converted into a labelled training data set; the training results are shown in Figures 12 and 13. As can be seen from Figure 12, for the test set (test), the prediction result (prd) of the model still shows some error in amplitude, but the trend of the prediction curve is already synchronized with the test set; that is, using the features obtained by feature reconstruction eliminates the phase lag, which makes prediction of exchange rate changes possible. After the prediction results are binarized (as in Figure 13), the prediction accuracy is 64.04%. To explore whether the learning result of the model can be improved further, a Ljung-Box test was performed on the residuals. As can be seen from Figure 14, for the residual sequences at every computable lag up to 40, the LB values are all greater than 0.05, with an average of about 0.95; that is, for any sequence examined, the confidence that it is white noise is 95%. By contrast, when a deep learning model is used to predict financial products with temporal characteristics, the strong autocorrelation of financial asset sequences causes the prediction output to lag the actual asset price in time.
Since no further useful knowledge or pattern can be extracted from white noise, it can be concluded that, given the current features and data set, the accuracy achieved by the model approaches the theoretical upper limit, and further improving the model cannot raise the precision.
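The binarized accuracy reported above can be illustrated with a short sketch. It assumes that binarization maps a positive change to 1 (rise) and any other change to 0 (fall or unchanged); the original does not state the rule, and the function names are hypothetical.

```python
def binarize(changes):
    """Map each change to 1 for a rise and 0 otherwise."""
    return [1 if c > 0 else 0 for c in changes]


def directional_accuracy(predicted_changes, actual_changes):
    """Share of steps where predicted and actual directions agree."""
    p, a = binarize(predicted_changes), binarize(actual_changes)
    hits = sum(1 for u, v in zip(p, a) if u == v)
    return hits / len(a)
```

For example, three correct directions out of four predictions give an accuracy of 0.75.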
An embodiment of the present invention further provides a time series data prediction device, which may include:
a preprocessing module 11, configured to: obtain historical time series data, and perform data cleaning and data slicing on the historical time series data to obtain the corresponding time series data sequence;
a feature reconstruction module 12, configured to: perform a stationarization operation on the time series data sequence, and perform feature reconstruction on the time series data sequence after the stationarization operation using the immune genetic feature reconstruction algorithm to obtain the corresponding feature sequence;
a prediction module 13, configured to: obtain the deep learning model trained on the feature sequence, and perform time series data prediction using the deep learning model.
The time series data prediction device provided by an embodiment of the present invention may further include:
a precision evaluation module, configured to: before time series data prediction is performed using the deep learning model, calculate the prediction precision of the deep learning model based on the feature sequence; if the prediction precision meets the preset requirement, determine that the deep learning model can be used for time series data prediction; otherwise, determine that the deep learning model cannot be used for time series data prediction.
In the time series data prediction device provided by an embodiment of the present invention, the precision evaluation module may include:
a precision calculation unit, configured to: obtain a training set and a test set, the training set and the test set being obtained by dividing the feature sequence into multiple subsequences and then grouping the subsequences, wherein the deep learning model is trained on the training set; input each subsequence in the training set into the deep learning model, and form a training residual sequence from the residuals between each time series datum output by the deep learning model and the corresponding actual time series datum; input each subsequence in the test set into the deep learning model, and form a test residual sequence from the residuals between each time series datum output by the deep learning model and the corresponding actual time series datum; and calculate the LB value of the training residual sequence and the LB value of the test residual sequence respectively using the LB test;
a precision evaluation unit, configured to: compare the LB values corresponding to the training set and the test set with the preset requirement respectively; if both LB values meet the preset requirement, determine that the deep learning model can be used for time series data prediction; otherwise, determine that the deep learning model cannot be used for time series data prediction.
In the time series data prediction device provided by an embodiment of the present invention, the precision evaluation module may further include:
a model correction unit, configured to: after the LB values corresponding to the training set and the test set are compared with the preset requirement respectively, if the LB value corresponding to the test set does not meet the preset requirement and the LB value corresponding to the training set meets the preset requirement, return to the step of performing feature reconstruction, using the immune genetic feature reconstruction algorithm, on the time series data sequence after the stationarization operation, until the LB values corresponding to both the training set and the test set meet the preset requirement.
In the time series data prediction device provided by an embodiment of the present invention, the feature reconstruction module may include:
a feature reconstruction unit, configured to: obtain, according to the following formula, the derivative vector X of the time series data sequence x after the stationarization operation:
$X = [x, x', x^{(2)}, \ldots, x^{(n)}]^{T}$
obtain the following formula, which gives the corresponding feature sequence from the derivative vector X:
obtain the following affinity evaluation function:
where A is an n × n square matrix, B is an n × 1 column vector, n is the order of the feature transformation, m is the length of the sequences input to the deep learning model trained on each training set, the predicted term denotes the time series data output by the corresponding deep learning model after a subsequence of the training set is input into it, Y denotes the actual time series data corresponding to the time series data output by the deep learning model, N is the number of subsequences in the training set, and the deep learning model, the training set and the value of the affinity evaluation function are in one-to-one correspondence. The training set is obtained as follows: determine the feature sequence corresponding to each generated group of values of A and B, divide each feature sequence into multiple subsequences, and group the subsequences to obtain the training set and the test set corresponding to each feature sequence. The unit is further configured to determine, using the immune genetic feature reconstruction algorithm, the values of A and B that minimize the value of the affinity evaluation function, and to take the feature sequence corresponding to those values of A and B as the feature sequence finally obtained by feature reconstruction.
In the time series data prediction device provided by an embodiment of the present invention, the feature reconstruction unit may include:
a feature reconstruction subunit, configured to: generate multiple antibodies, each antibody corresponding to one group of values of A and B; eliminate the antibodies whose A is a singular matrix, and eliminate the antibodies that match an antibody in the memory bank, the antibodies contained in the memory bank being illegal antibodies obtained by testing in advance; calculate the value of the affinity evaluation function for each antibody and judge whether any termination condition is met; if so, determine the antibody with the smallest affinity evaluation function value to be the best-matching antibody, and take the values of A and B corresponding to the best-matching antibody as the values of A and B that minimize the affinity evaluation function; if not, classify the antibodies by the values of their first two positions, perform the genetic variation operation on the antibodies in each class to obtain new antibodies, and return, with the new antibodies, to the step of eliminating the antibodies whose A is a singular matrix. The termination conditions include: some antibody has an affinity evaluation function value below a predetermined threshold; the number of returns to the step of eliminating the antibodies whose A is a singular matrix reaches a preset number; and the antibody with the smallest affinity evaluation function value has not changed for a preset number of consecutive rounds.
A kind of time series data prediction meanss provided in an embodiment of the present invention, feature reconstruction subelement may include:
Make a variation subelement, is used for:It determines that per a kind of antibody be a parent, and obtains the ratio of duplication, intersection and variation Respectively p, q, r;The pK antibody that the value minimum of affinity evaluation function is chosen from each parent, remains into filial generation;From QK/2 is randomly selected to antibody in each parent, and the i-th bit of each pair of antibody and subsequent sequence are exchanged, and will be obtained after exchange Antibody remain into filial generation;RK antibody is randomly selected from each parent, it is random with one in preset codomain Number replaces the jth position of each antibody, will replace obtained antibody and remains into filial generation;Wherein, K is per a kind of antibody levels, i It is random number with j;Determine that the antibody that whole filial generation includes is that heritable variation operation obtains new antibody.
A kind of time series data prediction meanss provided in an embodiment of the present invention, preprocessing module may include:
Acquiring unit is used for:Obtain the GRU models that feature based sequence is trained.
An embodiment of the present invention further provides a time series data prediction device, which may include:
A memory, configured to store a computer program;
A processor, configured to implement the steps of any of the time series data prediction methods described above when executing the computer program.
An embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of any of the time series data prediction methods described above can be implemented.
For explanations of the relevant parts of the time series data prediction apparatus, device and computer-readable storage medium provided in the embodiments of the present invention, refer to the detailed description of the corresponding parts of the time series data prediction method provided in the embodiments of the present invention; they are not repeated here. Parts of the above technical solutions whose implementation principles are consistent with the corresponding technical solutions in the prior art are not described in detail, to avoid excessive repetition.
The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A time series data prediction method, characterized by comprising:
obtaining historical time series data, and performing data cleaning and data slicing on the historical time series data to obtain a corresponding time series data sequence;
performing a stationarization operation on the time series data sequence, and using an immune genetic feature reconstruction algorithm to perform feature reconstruction on the time series data sequence after the stationarization operation, to obtain a corresponding characteristic sequence;
obtaining a deep learning model trained on the characteristic sequence, and performing time series data prediction using the deep learning model.
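As a rough illustration of the preprocessing steps named in claim 1, the following minimal sketch cleans a series, makes it more stationary, and slices it into subsequences. The claim does not say how cleaning, stationarization or slicing are carried out, so dropping missing values, first-order differencing and fixed-length sliding windows are all assumptions here, and the function names `clean`, `difference` and `slice_windows` are hypothetical.

```python
import math

def clean(series):
    """Drop missing readings (None or NaN); one simple form of data cleaning (assumed)."""
    return [v for v in series if v is not None and not math.isnan(v)]

def difference(series, order=1):
    """Apply first-order differencing `order` times; an assumed stationarization step."""
    out = list(series)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out

def slice_windows(series, length, step=1):
    """Cut a sequence into fixed-length subsequences (assumed slicing scheme)."""
    return [series[i:i + length] for i in range(0, len(series) - length + 1, step)]

history = [1.0, 2.0, None, 4.0, 7.0, float("nan"), 11.0]
cleaned = clean(history)
stationary = difference(cleaned)
windows = slice_windows(stationary, length=3)
```

In this sketch the order of the steps and the window length are free parameters; the claim fixes neither.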
2. The method according to claim 1, characterized in that, before performing time series data prediction using the deep learning model, the method further comprises:
calculating the prediction accuracy of the deep learning model based on the characteristic sequence; if the prediction accuracy meets a preset requirement, determining that the deep learning model can be used for time series data prediction; otherwise, determining that the deep learning model cannot be used for time series data prediction.
3. The method according to claim 2, characterized in that calculating the prediction accuracy of the deep learning model using the characteristic sequence comprises:
obtaining a training set and a test set, the training set and the test set being obtained by dividing the characteristic sequence into multiple subsequences and then grouping the multiple subsequences, wherein the deep learning model is trained on the training set;
inputting each subsequence in the training set into the deep learning model, and forming a training residual sequence from the residuals between each item of time series data output by the deep learning model and the corresponding actual time series data; inputting each subsequence in the test set into the deep learning model, and forming a test residual sequence from the residuals between each item of time series data output by the deep learning model and the corresponding actual time series data;
calculating the LB value of the training residual sequence and the LB value of the test residual sequence using the LB (Ljung-Box) test;
correspondingly, judging whether the prediction accuracy meets the preset requirement comprises:
comparing the LB values corresponding to the training set and the test set with the preset requirement respectively; if the LB values corresponding to both the training set and the test set meet the preset requirement, determining that the deep learning model can be used for time series data prediction; otherwise, determining that the deep learning model cannot be used for time series data prediction.
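The LB value of a residual sequence can be sketched with the standard Ljung-Box statistic, Q = n(n+2) Σ ρ_k² / (n − k) summed over lags k = 1..h, where ρ_k is the lag-k sample autocorrelation. The claim does not state the number of lags or what the preset requirement is; in the sketch below the lag count and the threshold (the 5% chi-square critical value for one degree of freedom, 3.841) are assumptions, and the helper names are hypothetical.

```python
def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic of a residual sequence over lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    denom = sum(c * c for c in centered)  # must be non-zero (non-constant residuals)
    total = 0.0
    for k in range(1, max_lag + 1):
        num = sum(centered[t] * centered[t - k] for t in range(k, n))
        rho_k = num / denom               # lag-k sample autocorrelation
        total += rho_k * rho_k / (n - k)
    return n * (n + 2) * total

def meets_requirement(q_value, threshold):
    """Assumed preset requirement: Q below a critical value, i.e. residuals look like white noise."""
    return q_value < threshold

CHI2_95_DF1 = 3.841  # assumed threshold for max_lag = 1

train_residuals = [1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0]
test_residuals = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]  # strongly autocorrelated
q_train = ljung_box_q(train_residuals, max_lag=1)
q_test = ljung_box_q(test_residuals, max_lag=1)
model_usable = (meets_requirement(q_train, CHI2_95_DF1)
                and meets_requirement(q_test, CHI2_95_DF1))
```

Here the test residuals alternate in sign, so their Q value exceeds the assumed threshold and the model would be judged unusable even though the training residuals pass.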
4. The method according to claim 3, characterized in that, after comparing the LB values corresponding to the training set and the test set with the preset requirement respectively, the method further comprises:
if the LB value corresponding to the test set does not meet the preset requirement and the LB value corresponding to the training set meets the preset requirement, returning to the step of using the immune genetic feature reconstruction algorithm to perform feature reconstruction on the time series data sequence after the stationarization operation, until the LB values corresponding to both the training set and the test set meet the preset requirement.
5. The method according to claim 1, characterized in that using the immune genetic feature reconstruction algorithm to perform feature reconstruction on the time series data sequence after the stationarization operation to obtain the corresponding characteristic sequence comprises:
obtaining, according to the following formula, the derivative vector X of the time series data residuals of the time series data sequence x after the stationarization operation:
$$X = [x, x', x^{(2)}, \ldots, x^{(n)}]^{T}$$
obtaining the corresponding characteristic sequence from the derivative vector X according to the following formula:
obtaining the following affinity evaluation function:
wherein A is an n × n square matrix, B is an n × 1 column vector, n is the order of the feature transformation, m is the length of the sequence input to the deep learning model obtained from each training set, the predicted value is the time series data output by the deep learning model after the corresponding subsequence in the training set is input to it, Y denotes the actual time series data corresponding to the time series data output by the deep learning model, and N is the number of subsequences in the training set; the deep learning models, the training sets and the values of the affinity evaluation function are in one-to-one correspondence; the training set is obtained as follows: determining the characteristic sequence corresponding to each generated pair of values of A and B, dividing each characteristic sequence into multiple subsequences, and grouping the multiple subsequences to obtain the training set and the test set corresponding to each characteristic sequence;
using the immune genetic feature reconstruction algorithm to determine the values of A and B that minimize the value of the affinity evaluation function, and determining that the characteristic sequence corresponding to those values of A and B is the characteristic sequence finally obtained by feature reconstruction.
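For a sampled sequence, the derivative vector X = [x, x', x^(2), ..., x^(n)]^T can be approximated numerically. The sketch below is only an illustration under stated assumptions: derivatives are replaced by forward differences with a unit time step, and every row is truncated to a common length so that the rows line up. The function name `derivative_vector` is hypothetical, and the transformation that maps X to the characteristic sequence is not sketched because its formula is not reproduced in the text.

```python
def derivative_vector(x, n):
    """Rows [x, x', x^(2), ..., x^(n)] using forward differences (assumed), cut to equal length."""
    rows = [list(x)]
    for _ in range(n):
        prev = rows[-1]
        rows.append([b - a for a, b in zip(prev, prev[1:])])
    common = len(x) - n  # each differencing pass shortens the row by one sample
    return [row[:common] for row in rows]

x = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0]
X = derivative_vector(x, 2)
```

With x sampled from t², the first-difference row grows linearly and the second-difference row is constant, as expected for a quadratic.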
6. The method according to claim 5, characterized in that using the immune genetic feature reconstruction algorithm to determine the values of A and B that minimize the value of the affinity evaluation function comprises:
generating multiple antibodies, each antibody corresponding to one pair of values of A and B;
eliminating the antibodies whose corresponding A is a singular matrix, and eliminating the antibodies that match an antibody in the database, the antibodies in the database being illegal antibodies identified by prior testing;
calculating the value of the affinity evaluation function for each antibody and judging whether any termination condition is met; if so, determining that the antibody with the minimum affinity evaluation function value is the best-matching antibody, and determining that the values of A and B corresponding to the best-matching antibody are the values of A and B that minimize the value of the affinity evaluation function; if not, classifying the multiple antibodies according to the values of their first two digits, performing the genetic variation operation on the antibodies in each class to obtain new antibodies, and, based on the new antibodies, returning to the step of eliminating the antibodies whose corresponding A is a singular matrix; wherein the termination conditions are: the affinity evaluation function value of some antibody is less than a predetermined threshold; the number of times execution has returned to the step of eliminating the antibodies whose corresponding A is a singular matrix has reached a preset number; the antibody with the minimum affinity evaluation function value has remained unchanged for a preset number of consecutive iterations.
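The elimination and termination checks of claim 6 can be sketched as follows. This is a minimal illustration, not the claimed implementation: antibodies are represented as hashable `(A, B)` tuples, singularity is tested with a small numerical tolerance on the determinant, and matching against the illegal antibodies is taken to mean exact equality. All of these are assumptions, and the names `filter_antibodies`, `should_stop` and `illegal_db` are hypothetical. The affinity function and the classification by the first two digits are left out.

```python
def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    m = [list(row) for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def filter_antibodies(antibodies, illegal_db, tol=1e-9):
    """Drop antibodies whose A is singular or that equal a known illegal antibody."""
    return [(a, b) for a, b in antibodies
            if abs(determinant(a)) >= tol and (a, b) not in illegal_db]

def should_stop(history, threshold, max_generations, patience):
    """history holds one (best_affinity, best_antibody) entry per generation."""
    best_value, _ = history[-1]
    if best_value < threshold:           # some antibody is good enough
        return True
    if len(history) >= max_generations:  # iteration budget used up
        return True
    recent = [ab for _, ab in history[-patience:]]
    return len(recent) == patience and all(ab == recent[0] for ab in recent)

ab_singular = (((1.0, 2.0), (2.0, 4.0)), (0.0, 0.0))
ab_ok = (((1.0, 0.0), (0.0, 1.0)), (0.5, 0.5))
ab_illegal = (((2.0, 1.0), (1.0, 3.0)), (0.0, 1.0))
illegal_db = {ab_illegal}
survivors = filter_antibodies([ab_singular, ab_ok, ab_illegal], illegal_db)
```

Only `ab_ok` survives in this example: the first antibody has a singular A and the third is listed among the illegal antibodies.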
7. The method according to claim 6, characterized in that performing the genetic variation operation on the antibodies in each class to obtain new antibodies comprises:
treating each class of antibodies as one parent, and obtaining the ratios of replication, crossover and mutation as p, q and r respectively;
selecting from each parent the pK antibodies with the smallest affinity evaluation function values and retaining them in the offspring; randomly selecting qK/2 pairs of antibodies from each parent, exchanging the i-th position and the subsequent sequence of each pair, and retaining the antibodies obtained after the exchange in the offspring; randomly selecting rK antibodies from each parent, replacing the j-th position of each antibody with a random number within a preset value range, and retaining the antibodies obtained after the replacement in the offspring; wherein K is the number of antibodies in each class, and i and j are random numbers;
determining that the antibodies contained in all the offspring are the new antibodies obtained by the genetic variation operation.
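The replication, crossover and mutation steps of claim 7 can be sketched for a single parent class as below. The sketch assumes that each antibody is encoded as a flat list of numbers (for example the entries of A followed by those of B), that pK, qK/2 and rK are rounded to integers, and that the affinity function is supplied by the caller; `toy_affinity` is a hypothetical stand-in, not the affinity evaluation function of the method.

```python
import random

def next_generation(parent, affinity, p, q, r, value_range, rng):
    """Build the offspring of one parent class by replication, crossover and mutation."""
    k = len(parent)
    children = []
    # Replication: keep the p*K antibodies with the smallest affinity value.
    children += [ab[:] for ab in sorted(parent, key=affinity)[:round(p * k)]]
    # Crossover: pick q*K/2 pairs and swap position i together with everything after it.
    for _ in range(round(q * k / 2)):
        a, b = rng.sample(parent, 2)
        i = rng.randrange(len(a))
        children.append(a[:i] + b[i:])
        children.append(b[:i] + a[i:])
    # Mutation: pick r*K antibodies and overwrite a random position j.
    for ab in rng.sample(parent, round(r * k)):
        j = rng.randrange(len(ab))
        mutated = ab[:]
        mutated[j] = rng.uniform(*value_range)
        children.append(mutated)
    return children

def toy_affinity(antibody):
    """Hypothetical affinity used only for this demonstration."""
    return sum(v * v for v in antibody)

rng = random.Random(0)
parent = [[rng.uniform(-1.0, 1.0) for _ in range(6)] for _ in range(10)]
children = next_generation(parent, toy_affinity, p=0.2, q=0.6, r=0.2,
                           value_range=(-1.0, 1.0), rng=rng)
```

With p + q + r = 1 the offspring has the same size K as the parent class, which keeps the population size constant from one generation to the next.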
8. The method according to claim 1, characterized in that obtaining the deep learning model trained on the characteristic sequence comprises:
obtaining a GRU model trained on the characteristic sequence.
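For readers unfamiliar with the GRU (gated recurrent unit), the following scalar sketch shows one recurrent update with an update gate, a reset gate and a candidate state. It is a generic illustration rather than the model of the claim: the weights are scalars instead of matrices, the weight names in `w` are hypothetical, and the blending convention (new state = (1 − z) · candidate + z · previous state) is one of two common conventions; some formulations swap z and 1 − z.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def gru_step(x, h, w):
    """One scalar GRU update; `w` maps hypothetical weight names to floats."""
    z = sigmoid(w["wz"] * x + w["uz"] * h + w["bz"])             # update gate
    r = sigmoid(w["wr"] * x + w["ur"] * h + w["br"])             # reset gate
    cand = math.tanh(w["wh"] * x + w["uh"] * (r * h) + w["bh"])  # candidate state
    return (1.0 - z) * cand + z * h                              # blend with previous state

def run_gru(sequence, w, h0=0.0):
    """Feed a sequence through the cell and return the final hidden state."""
    h = h0
    for x in sequence:
        h = gru_step(x, h, w)
    return h

zero_weights = {name: 0.0 for name in
                ("wz", "uz", "bz", "wr", "ur", "br", "wh", "uh", "bh")}
h_after_one = gru_step(3.0, 1.0, zero_weights)
h_after_three = run_gru([1.0, 2.0, 3.0], zero_weights, h0=1.0)
```

With all weights set to zero both gates equal 0.5 and the candidate state is 0, so each step simply halves the hidden state; trained weights would instead be learned from the training set.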
9. A time series data prediction apparatus, characterized by comprising:
a preprocessing module, configured to: obtain historical time series data, and perform data cleaning and data slicing on the historical time series data to obtain a corresponding time series data sequence;
a feature reconstruction module, configured to: perform a stationarization operation on the time series data sequence, and use an immune genetic feature reconstruction algorithm to perform feature reconstruction on the time series data sequence after the stationarization operation, to obtain a corresponding characteristic sequence;
a prediction module, configured to: obtain a deep learning model trained on the characteristic sequence, and perform time series data prediction using the deep learning model.
10. A time series data prediction device, characterized by comprising:
a memory, configured to store a computer program;
a processor, configured to implement the steps of the time series data prediction method according to any one of claims 1 to 8 when executing the computer program.
CN201810174986.2A 2018-03-02 2018-03-02 A kind of time series data prediction technique, device and equipment Pending CN108399248A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810174986.2A CN108399248A (en) 2018-03-02 2018-03-02 A kind of time series data prediction technique, device and equipment

Publications (1)

Publication Number Publication Date
CN108399248A true CN108399248A (en) 2018-08-14

Family

ID=63091772

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810174986.2A Pending CN108399248A (en) 2018-03-02 2018-03-02 A kind of time series data prediction technique, device and equipment

Country Status (1)

Country Link
CN (1) CN108399248A (en)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109461067A (en) * 2018-09-12 2019-03-12 阿里巴巴集团控股有限公司 A kind of detection method of foreign exchange quotation abnormal data, apparatus and system
CN109583568A (en) * 2018-11-28 2019-04-05 中科赛诺(北京)科技有限公司 Data extension method, device and electronic equipment
CN110414442A (en) * 2019-07-31 2019-11-05 广东省智能机器人研究院 A kind of pressure time series data segmentation feature value prediction technique
CN110610458A (en) * 2019-04-30 2019-12-24 北京联合大学 Method and system for GAN image enhancement interactive processing based on ridge regression
CN110737696A (en) * 2019-10-12 2020-01-31 北京百度网讯科技有限公司 Data sampling method, device, electronic equipment and storage medium
CN110866672A (en) * 2019-10-10 2020-03-06 重庆金融资产交易所有限责任公司 Data processing method, device, terminal and medium
CN111199114A (en) * 2020-03-06 2020-05-26 上海复见网络科技有限公司 Method for building system of urban industry evolution model based on recurrent neural network
CN111754033A (en) * 2020-06-15 2020-10-09 西安工业大学 Non-stationary time sequence data prediction method based on recurrent neural network
CN112923922A (en) * 2021-03-04 2021-06-08 香港理工大学深圳研究院 Method, system and storage medium for counting steps and determining position information of pedestrian
CN113051811A (en) * 2021-03-16 2021-06-29 重庆邮电大学 Multi-mode short-term traffic jam prediction method based on GRU network
CN113468151A (en) * 2020-03-31 2021-10-01 横河电机株式会社 Learning data processing device, learning data processing method, and medium
CN113743971A (en) * 2020-06-17 2021-12-03 北京沃东天骏信息技术有限公司 Data processing method and device
CN114528334A (en) * 2022-02-18 2022-05-24 重庆伏特猫科技有限公司 Rapid similarity searching method in time sequence database
CN115099144A (en) * 2022-06-24 2022-09-23 无锡物联网创新中心有限公司 Yarn raw material characteristic parameter inversion method and related device
CN115204533A (en) * 2022-09-16 2022-10-18 中国地质大学(北京) Oil-gas yield prediction method and system based on multivariable weighted combination model
CN116311829A (en) * 2023-05-22 2023-06-23 广州豪特节能环保科技股份有限公司 Remote alarm method and device for data machine room

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109461067A (en) * 2018-09-12 2019-03-12 阿里巴巴集团控股有限公司 A kind of detection method of foreign exchange quotation abnormal data, apparatus and system
CN109583568A (en) * 2018-11-28 2019-04-05 中科赛诺(北京)科技有限公司 Data extension method, device and electronic equipment
CN110610458A (en) * 2019-04-30 2019-12-24 北京联合大学 Method and system for GAN image enhancement interactive processing based on ridge regression
CN110610458B (en) * 2019-04-30 2023-10-20 北京联合大学 GAN image enhancement interaction processing method and system based on ridge regression
CN110414442A (en) * 2019-07-31 2019-11-05 广东省智能机器人研究院 A kind of pressure time series data segmentation feature value prediction technique
CN110866672A (en) * 2019-10-10 2020-03-06 重庆金融资产交易所有限责任公司 Data processing method, device, terminal and medium
CN110737696A (en) * 2019-10-12 2020-01-31 北京百度网讯科技有限公司 Data sampling method, device, electronic equipment and storage medium
CN111199114A (en) * 2020-03-06 2020-05-26 上海复见网络科技有限公司 Method for building system of urban industry evolution model based on recurrent neural network
CN113468151A (en) * 2020-03-31 2021-10-01 横河电机株式会社 Learning data processing device, learning data processing method, and medium
CN111754033A (en) * 2020-06-15 2020-10-09 西安工业大学 Non-stationary time sequence data prediction method based on recurrent neural network
CN113743971A (en) * 2020-06-17 2021-12-03 北京沃东天骏信息技术有限公司 Data processing method and device
CN112923922A (en) * 2021-03-04 2021-06-08 香港理工大学深圳研究院 Method, system and storage medium for counting steps and determining position information of pedestrian
CN113051811B (en) * 2021-03-16 2022-08-05 重庆邮电大学 Multi-mode short-term traffic jam prediction method based on GRU network
CN113051811A (en) * 2021-03-16 2021-06-29 重庆邮电大学 Multi-mode short-term traffic jam prediction method based on GRU network
CN114528334A (en) * 2022-02-18 2022-05-24 重庆伏特猫科技有限公司 Rapid similarity searching method in time sequence database
CN115099144A (en) * 2022-06-24 2022-09-23 无锡物联网创新中心有限公司 Yarn raw material characteristic parameter inversion method and related device
CN115204533A (en) * 2022-09-16 2022-10-18 中国地质大学(北京) Oil-gas yield prediction method and system based on multivariable weighted combination model
CN116311829A (en) * 2023-05-22 2023-06-23 广州豪特节能环保科技股份有限公司 Remote alarm method and device for data machine room

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180814
