CN109754115A - Method, apparatus, storage medium and electronic device for data prediction
- Publication number: CN109754115A
- Application number: CN201811475791.8
- Authority: CN (China)
- Keywords: vector, data, time sequence, vector set, target identification
- Legal status: Granted (as listed by Google Patents; the status is an assumption, not a legal conclusion)
Abstract
The present disclosure relates to a data prediction method, apparatus, storage medium and electronic device. The method obtains a plurality of historical time series data; converts them into a first time series data vector set according to their respective acquisition times; obtains an identification vector set from the first time series data vector set; determines a target identification vector set from the identification vector set and the first time series data vector set; determines an activation vector set from the first time series data vector set and the target identification vector set; obtains the density function of each data vector in the activation vector set under a preset condition, and determines a predicted density function of the data to be predicted from those density functions; and predicts, from the predicted density function, the probability that the data to be predicted satisfies the preset condition.
Description
Technical field
The present disclosure relates to the field of data prediction, and in particular to a data prediction method, apparatus, storage medium and electronic device.
Background
Time series forecasting infers the development trend of data from time-ordered observations in order to guide the solution of practical problems. Time series prediction now plays an important role in many industries: banks use it to predict changes in daily transaction volume; exchanges use it to predict how stock prices move; and it is used to monitor the future trend of key indicators of an application system, such as CPU, memory and HTTP response time.
However, with the rapid development of computer software technology, data volumes keep growing and time series data are becoming more complex, so the regularities in how the data change are increasingly hard to mine. Traditional time series prediction methods find the pattern of change by mathematically fitting the data, but for discrete time series data the prediction accuracy of these traditional algorithms is low. Moreover, conventional time series prediction algorithms present the result only as a single predicted value, which does not meet actual business needs.
Summary of the invention
An object of the present disclosure is to provide a data prediction method, apparatus, storage medium and electronic device.
In a first aspect, a data prediction method is provided, the method comprising: obtaining a plurality of historical time series data; converting the plurality of historical time series data into a first time series data vector set according to the acquisition times respectively corresponding to the plurality of historical time series data; obtaining an identification vector set according to the first time series data vector set; determining a target identification vector set according to the identification vector set and the first time series data vector set; determining an activation vector set according to the first time series data vector set and the target identification vector set; obtaining the density function of each data vector in the activation vector set under a preset condition, and determining a predicted density function of the data to be predicted according to the density functions; and predicting, according to the predicted density function, the probability that the data to be predicted satisfies the preset condition.
Optionally, converting the plurality of historical time series data into the first time series data vector set according to their respective acquisition times comprises converting the historical time series data by the following formula:

$$y_j = [x_{j-m}, x_{j-m+1}, \ldots, x_j]^T$$

where $x_j$ denotes the historical time series datum acquired at time $j$, $x_{j-m}$ denotes the historical time series datum acquired at time $j-m$, $y_j$ is any data vector in the first time series data vector set $[y_{m+1}, y_{m+2}, \ldots, y_t]$, and $j$ ranges from $m+1$ to $t$.
Optionally, determining the target identification vector set according to the identification vector set and the first time series data vector set comprises: cyclically executing an identification vector set update step until a loop termination condition is met, and determining the identification vector set at the time the loop termination condition is met as the target identification vector set. The identification vector set update step comprises: calculating a first distance corresponding to each data vector in the first time series data vector set, the first distance comprising the distance between a first data vector and each identification vector in the identification vector set, where the first data vector is any data vector in the first time series data vector set; determining, from the first distances, a target distance corresponding to the first data vector, and determining the identification vector corresponding to the target distance as the target identification vector corresponding to the first data vector, the target distance being the smallest of the first distances; calculating the mean vector of the data vectors corresponding to the target identification vector, and taking the mean vector as the updated target identification vector; and determining the target identification vector set according to the updated target identification vectors. The loop termination condition comprises the target distance remaining unchanged over a first preset number of consecutive iterations.
Optionally, determining the activation vector set according to the first time series data vector set and the target identification vector set comprises: determining, from the first time series data vector set, a target vector corresponding to the data to be predicted; calculating a second distance between the target vector and each target identification vector in the target identification vector set; and determining, in the target identification vector set and according to the second distances, a second preset number of data vectors closest to the target vector, to obtain the activation vector set.
Optionally, determining the predicted density function of the data to be predicted according to the density functions comprises: calculating, according to the density functions, the information entropy of each data vector in the activation vector set; and determining the predicted density function according to the information entropy.
In a second aspect, a data prediction apparatus is provided, the apparatus comprising: a first obtaining module, configured to obtain a plurality of historical time series data; a data conversion module, configured to convert the plurality of historical time series data into a first time series data vector set according to the acquisition times respectively corresponding to the plurality of historical time series data; a second obtaining module, configured to obtain an identification vector set according to the first time series data vector set; a third obtaining module, configured to determine a target identification vector set according to the identification vector set and the first time series data vector set; a first determining module, configured to determine an activation vector set according to the first time series data vector set and the target identification vector set; a second determining module, configured to obtain the density function of each data vector in the activation vector set under a preset condition, and to determine a predicted density function of the data to be predicted according to the density functions; and a prediction module, configured to predict, according to the predicted density function, the probability that the data to be predicted satisfies the preset condition.
Optionally, the data conversion module is configured to convert the historical time series data into the first time series data vector set, according to the acquisition times respectively corresponding to the plurality of historical time series data, by the following formula:

$$y_j = [x_{j-m}, x_{j-m+1}, \ldots, x_j]^T$$

where $x_j$ denotes the historical time series datum acquired at time $j$, $x_{j-m}$ denotes the historical time series datum acquired at time $j-m$, $y_j$ is any data vector in the first time series data vector set $[y_{m+1}, y_{m+2}, \ldots, y_t]$, and $j$ ranges from $m+1$ to $t$.
Optionally, the third obtaining module is configured to cyclically execute an identification vector set update step until a loop termination condition is met, and to determine the identification vector set at the time the loop termination condition is met as the target identification vector set. The identification vector set update step comprises: calculating a first distance corresponding to each data vector in the first time series data vector set, the first distance comprising the distance between a first data vector and each identification vector in the identification vector set, where the first data vector is any data vector in the first time series data vector set; determining, from the first distances, a target distance corresponding to the first data vector, and determining the identification vector corresponding to the target distance as the target identification vector corresponding to the first data vector, the target distance being the smallest of the first distances; calculating the mean vector of the data vectors corresponding to the target identification vector, and taking the mean vector as the updated target identification vector; and determining the target identification vector set according to the updated target identification vectors. The loop termination condition comprises the target distance remaining unchanged over a first preset number of consecutive iterations.
Optionally, the first determining module is configured to: determine, from the first time series data vector set, a target vector corresponding to the data to be predicted; calculate a second distance between the target vector and each target identification vector in the target identification vector set; and determine, in the target identification vector set and according to the second distances, a second preset number of data vectors closest to the target vector, to obtain the activation vector set.
Optionally, the second determining module is configured to calculate, according to the density functions, the information entropy of each data vector in the activation vector set, and to determine the predicted density function according to the information entropy.
In a third aspect, a computer-readable storage medium is provided, on which a computer program is stored; when the program is executed by a processor, the steps of the method of the first aspect of the present disclosure are implemented.
In a fourth aspect, an electronic device is provided, comprising: a memory on which a computer program is stored; and a processor configured to execute the computer program in the memory so as to implement the steps of the method of the first aspect of the present disclosure.
With the above technical solution, a plurality of historical time series data are obtained; the plurality of historical time series data are converted into a first time series data vector set according to their respective acquisition times; an identification vector set is obtained according to the first time series data vector set; a target identification vector set is determined according to the identification vector set and the first time series data vector set; an activation vector set is determined according to the first time series data vector set and the target identification vector set; the density function of each data vector in the activation vector set under a preset condition is obtained, and a predicted density function of the data to be predicted is determined according to the density functions; and the probability that the data to be predicted satisfies the preset condition is predicted according to the predicted density function. In this way, the prediction result is presented as a density function of the data to be predicted, so that the probability of the data to be predicted satisfying different preset conditions can be shown to the user, which provides greater reference value for actual business needs.
Other features and advantages of the present disclosure are described in detail in the detailed description below.
Brief description of the drawings
The accompanying drawings provide a further understanding of the present disclosure and constitute a part of the specification. Together with the following detailed description they serve to explain the present disclosure, but they do not limit it. In the drawings:
Fig. 1 is a flowchart of a data prediction method according to an exemplary embodiment;
Fig. 2 is a flowchart of another data prediction method according to an exemplary embodiment;
Fig. 3 is a block diagram of a data prediction apparatus according to an exemplary embodiment;
Fig. 4 is a block diagram of an electronic device according to an exemplary embodiment.
Detailed description
Specific embodiments of the present disclosure are described in detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described here only illustrate and explain the present disclosure and do not limit it.
The present disclosure provides a data prediction method, apparatus, storage medium and electronic device. A plurality of historical time series data are obtained; the plurality of historical time series data are converted into a first time series data vector set according to their respective acquisition times; an identification vector set is obtained according to the first time series data vector set; a target identification vector set is determined according to the identification vector set and the first time series data vector set; an activation vector set is determined according to the first time series data vector set and the target identification vector set; the density function of each data vector in the activation vector set under a preset condition is obtained, and a predicted density function of the data to be predicted is determined according to the density functions; and the probability that the data to be predicted satisfies the preset condition is predicted according to the predicted density function. In this way, the prediction result is presented as a density function of the data to be predicted, so that the probability of the data to be predicted satisfying different preset conditions can be shown to the user, which provides greater reference value for actual business needs.
Specific embodiments of the present disclosure are described in detail below with reference to the accompanying drawings.
Fig. 1 is a flowchart of a data prediction method according to an exemplary embodiment. As shown in Fig. 1, the method comprises the following steps:
S101: obtain a plurality of historical time series data.
Time series data are data collected at different times and arranged in chronological order, and can be used to describe how the data change over time. For example, time series data may include a bank's daily transaction volume, stock prices in the stock market, or the response time of an application system. The plurality of historical time series data may include a first preset number of time series data acquired within a preset historical time period.
S102: convert the plurality of historical time series data into a first time series data vector set according to the acquisition times respectively corresponding to the plurality of historical time series data.
In this step, the historical time series data may be converted into the first time series data vector set, according to their respective acquisition times, by the following formula:

$$y_j = [x_{j-m}, x_{j-m+1}, \ldots, x_j]^T$$

where the plurality of historical time series data can be expressed as $[x_1, x_2, \ldots, x_t]$, $x_j$ denotes the historical time series datum acquired at time $j$, $x_{j-m}$ denotes the historical time series datum acquired at time $j-m$, the first time series data vector set can be expressed as $[y_{m+1}, y_{m+2}, \ldots, y_t]$, $y_j$ is any data vector in that set, and $j$ ranges from $m+1$ to $t$. In addition, $m$ may be a preset value, and in a practical application its size can be set according to different business needs.
After S102 is executed, each historical time series datum acquired after the first $m$ acquisition times has been transformed into a column vector composed of that datum (the target time series datum) and the $m$ data preceding it, where the target time series datum is any of the historical time series data acquired after the first $m$ acquisition times.
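The conversion in S102 can be sketched as a sliding window over the series. The following is a minimal illustration only; the function name `to_vector_set` and the sample values are hypothetical and do not appear in the original.

```python
from typing import List, Sequence


def to_vector_set(x: Sequence[float], m: int) -> List[List[float]]:
    """Build [y_{m+1}, ..., y_t], where y_j = [x_{j-m}, ..., x_j]^T.

    `x` holds x_1..x_t in acquisition order, so x[0] is x_1.
    """
    t = len(x)
    if m < 0 or m >= t:
        raise ValueError("m must satisfy 0 <= m < len(x)")
    # With 0-based indexing, y_j covers x[j-m-1 : j] for j = m+1..t.
    return [list(x[j - m - 1 : j]) for j in range(m + 1, t + 1)]


# Ten values (t = 10) with m = 5 give five vectors, y_6 to y_10,
# each with m + 1 = 6 components.
volumes = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
vectors = to_vector_set(volumes, m=5)
```

Each vector has $m+1$ components, which matches the description of a target datum together with the $m$ data preceding it.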
S103: obtain an identification vector set according to the first time series data vector set.
In this step, a third preset number of data vectors may be selected at random from the first time series data vector set, and the randomly selected data vectors form the identification vector set. In addition, to avoid overfitting, the third preset number may be smaller than the number of data vectors in the first time series data vector set.
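The random selection in S103 can be sketched as follows. This is an illustrative sketch under assumptions: the function name, the `seed` parameter (added only to make the example reproducible) and the sample vectors are hypothetical.

```python
import random
from typing import List

Vector = List[float]


def init_identification_vectors(vectors: List[Vector], k: int, seed: int = 0) -> List[Vector]:
    """Pick k distinct data vectors at random as the initial identification vectors."""
    # k plays the role of the third preset number; it is kept below the number
    # of data vectors, as the text suggests for avoiding overfitting.
    if not 0 < k < len(vectors):
        raise ValueError("k must be positive and smaller than the number of data vectors")
    rng = random.Random(seed)
    return [list(v) for v in rng.sample(vectors, k)]


data_vectors = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0], [4.0, 5.0], [5.0, 6.0]]
initial_ids = init_identification_vectors(data_vectors, k=3)
```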
S104: determine a target identification vector set according to the identification vector set and the first time series data vector set.
In this step, an identification vector set update step may be executed cyclically until a loop termination condition is met, and the identification vector set at the time the loop termination condition is met is determined as the target identification vector set. The identification vector set update step comprises: calculating a first distance corresponding to each data vector in the first time series data vector set, the first distance comprising the distance between a first data vector and each identification vector in the identification vector set, where the first data vector is any data vector in the first time series data vector set; determining, from the first distances, a target distance corresponding to the first data vector, and determining the identification vector corresponding to the target distance as the target identification vector corresponding to the first data vector, the target distance being the smallest of the first distances; calculating the mean vector of the data vectors corresponding to the target identification vector, and taking the mean vector as the updated target identification vector; and determining the target identification vector set according to the updated target identification vectors. The loop termination condition comprises the target distance remaining unchanged over a first preset number of consecutive iterations.
In one possible implementation, when it is determined that the target distance has remained unchanged over the first preset number of consecutive iterations, the identification vectors can be considered to have converged. The converged identification vectors can then be determined as the target identification vectors, which in turn determine the target identification vector set, so that time series data can subsequently be predicted according to the target identification vector set.
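The update step of S104 can be written compactly as follows. The notation is not in the original and is introduced here only for illustration: $u_k$ denotes the $k$-th identification vector, $c(j)$ the index of the identification vector assigned to $y_j$, $d_j$ the target distance of $y_j$, $C_k$ the set of data vectors assigned to $u_k$, and $\lVert \cdot \rVert$ whichever distance is used (for example the Euclidean distance).

```latex
\begin{aligned}
c(j) &= \arg\min_{k} \lVert y_j - u_k \rVert, \qquad d_j = \min_{k} \lVert y_j - u_k \rVert,\\
C_k &= \{\, y_j : c(j) = k \,\}, \qquad u_k' = \frac{1}{|C_k|} \sum_{y_j \in C_k} y_j .
\end{aligned}
```

Under this notation, the loop stops once every $d_j$ has stayed the same over the first preset number of consecutive iterations.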
S105: determine an activation vector set according to the first time series data vector set and the target identification vector set.
In this step, a target vector corresponding to the data to be predicted may be determined from the first time series data vector set; a second distance between the target vector and each target identification vector in the target identification vector set is calculated; and, in the target identification vector set, a second preset number of data vectors closest to the target vector are determined according to the second distances, to obtain the activation vector set.
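The selection in S105 amounts to a nearest-neighbour query against the target identification vector set. The sketch below assumes the Euclidean distance as the second distance; the function name and sample values are hypothetical, and how the target vector itself is chosen is left outside the sketch.

```python
import math
from typing import List

Vector = List[float]


def activation_vectors(target: Vector, target_ids: List[Vector], n: int) -> List[Vector]:
    """Return the n target identification vectors closest to `target`."""
    # n plays the role of the second preset number.
    if not 0 < n <= len(target_ids):
        raise ValueError("n must be between 1 and the number of target identification vectors")
    # Rank by the second distance (Euclidean distance is assumed here).
    ranked = sorted(target_ids, key=lambda u: math.dist(target, u))
    return ranked[:n]


target = [0.0, 0.0]
ids = [[5.0, 5.0], [1.0, 0.0], [0.0, 2.0]]
active = activation_vectors(target, ids, n=2)
```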
S106: obtain the density function of each data vector in the activation vector set under a preset condition, and determine a predicted density function of the data to be predicted according to the density functions.
Information entropy can serve as a measure of the uncertainty of a density function. Therefore, if the information entropy of some data vector in the activation vector set is small, predictions based on that data vector are mostly invalid, and its weight needs to be reduced to improve prediction accuracy. Accordingly, in one possible implementation, the information entropy of each data vector in the activation vector set may be calculated according to the density functions, and the predicted density function may be determined according to the information entropy.
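The entropy calculation mentioned in S106 can be illustrated for a discrete density. The original does not specify the form of the density function, so the sketch assumes a histogram of probability masses; the function name and the sample densities are hypothetical, and the weighting rule itself is not sketched because the original gives no formula for it.

```python
import math
from typing import Sequence


def shannon_entropy(masses: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a discrete density given as probability masses."""
    # Zero-mass bins contribute nothing, so they are skipped to avoid log(0).
    return -sum(p * math.log(p) for p in masses if p > 0.0)


uniform = shannon_entropy([0.25, 0.25, 0.25, 0.25])  # mass spread evenly over four bins
peaked = shannon_entropy([0.97, 0.01, 0.01, 0.01])   # mass concentrated in one bin
```

The evenly spread density has the larger entropy of the two, which is the sense in which entropy measures the uncertainty of a density function.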
S107: predict, according to the predicted density function, the probability that the data to be predicted satisfies the preset condition.
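As an illustration of S107, once a predicted density is available, the probability of a condition is the mass of the region where the condition holds. The sketch assumes a discrete predicted density over a few candidate values and a condition expressed as a predicate; both representations, the function name and the numbers are hypothetical.

```python
from typing import Callable, Sequence


def probability_of(values: Sequence[float], masses: Sequence[float],
                   condition: Callable[[float], bool]) -> float:
    """Sum the probability mass of every value that satisfies the condition."""
    if abs(sum(masses) - 1.0) > 1e-9:
        raise ValueError("masses must sum to 1")
    return sum(p for v, p in zip(values, masses) if condition(v))


# Hypothetical predicted density over four possible daily transaction volumes.
volumes = [100.0, 200.0, 300.0, 400.0]
density = [0.1, 0.4, 0.3, 0.2]
# Probability that the volume exceeds a hypothetical threshold of 250.
p_high = probability_of(volumes, density, lambda v: v > 250.0)
```

Because the whole density is kept, the same call can be repeated with a different predicate to answer a different preset condition without predicting again.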
It should be noted that, once the true value of the data to be predicted is obtained, that value can be treated as new historical time series data, and the density function of each data vector in the activation vector set can be updated accordingly. The data prediction method of the present disclosure can thus adapt automatically to new patterns in the time series, which makes the prediction more accurate, removes the need to learn from a large amount of historical data in advance, and improves the applicability of the prediction algorithm.
With the above method, the prediction result is presented as a density function of the data to be predicted, so that the probability of the data to be predicted satisfying different preset conditions can be shown to the user, which provides greater reference value for actual business needs.
Fig. 2 is a flowchart of a data prediction method according to an exemplary embodiment. As shown in Fig. 2, the method comprises the following steps:
S201: obtain a plurality of historical time series data.
Time series data are data collected at different times and arranged in chronological order, and can be used to describe how the data change over time. For example, time series data may include a bank's daily transaction volume, stock prices in the stock market, or the response time of an application system. The plurality of historical time series data may include a first preset number of time series data acquired within a preset historical time period.
S202: convert the plurality of historical time series data into a first time series data vector set according to the acquisition times respectively corresponding to the plurality of historical time series data.
In this step, the historical time series data may be converted into the first time series data vector set, according to their respective acquisition times, by the following formula:

$$y_j = [x_{j-m}, x_{j-m+1}, \ldots, x_j]^T$$

where the plurality of historical time series data can be expressed as $[x_1, x_2, \ldots, x_t]$, $x_j$ denotes the historical time series datum acquired at time $j$, $x_{j-m}$ denotes the historical time series datum acquired at time $j-m$, the first time series data vector set can be expressed as $[y_{m+1}, y_{m+2}, \ldots, y_t]$, $y_j$ is any data vector in that set, and $j$ ranges from $m+1$ to $t$. In addition, $m$ may be a preset value, and in a practical application its size can be set according to different business needs.
Specifically, when the plurality of historical time series data is $[x_1, x_2, \ldots, x_t]$, the above formula converts it into the first time series data vector set $[y_{m+1}, y_{m+2}, \ldots, y_t]$, where

$$y_{m+1} = [x_1, x_2, \ldots, x_{m+1}]^T$$
$$y_{m+2} = [x_2, x_3, \ldots, x_{m+2}]^T$$
$$\vdots$$
$$y_t = [x_{t-m}, x_{t-m+1}, \ldots, x_t]^T$$
For example, take the time series data to be a bank's daily transaction volume. For ease of description, denote the bank's transaction volumes over the most recent 10 days as $[x_1, x_2, \ldots, x_{10}]$ (that is, $t = 10$), where $x_i$ is the transaction volume on day $i$ ($i$ runs from 1 to 10). With $m$ set to 5, the first time series data vector set is $[y_6, y_7, y_8, y_9, y_{10}]$, where

$$y_6 = [x_1, x_2, \ldots, x_6]^T$$
$$y_7 = [x_2, x_3, \ldots, x_7]^T$$
$$y_8 = [x_3, x_4, \ldots, x_8]^T$$
$$y_9 = [x_4, x_5, \ldots, x_9]^T$$
$$y_{10} = [x_5, x_6, \ldots, x_{10}]^T$$

The above example is merely illustrative, and the present disclosure is not limited thereto.
That is, after S202 is executed, each historical time series datum acquired after the first $m$ acquisition times has been transformed into a column vector composed of that datum (the target time series datum) and the $m$ data preceding it, where the target time series datum is any of the historical time series data acquired after the first $m$ acquisition times.
S203: obtain an identification vector set according to the first time series data vector set.
In this step, a third preset number of data vectors may be selected at random from the first time series data vector set, and the randomly selected data vectors form the identification vector set. In addition, to avoid overfitting, the third preset number may be smaller than the number of data vectors in the first time series data vector set.
For example, when the first time series data vector set is $[y_{m+1}, y_{m+2}, \ldots, y_t]$, $K$ vectors $y_j$ ($K$ being the third preset number) may be selected at random as identification vectors of different data patterns, and the $K$ selected $y_j$ then form a $K$-dimensional identification vector set (that is, a set of $K$ identification vectors), for example $[y_{m+1}, y_{m+2}, \ldots, y_{m+K}]$. For instance, when the first time series data vector set is $[y_6, y_7, y_8, y_9, y_{10}]$, 3 ($K = 3$) data vectors may be selected at random from it to form a 3-dimensional identification vector set, which may consist of any three data vectors from $[y_6, y_7, y_8, y_9, y_{10}]$, such as $[y_6, y_7, y_8]$, $[y_7, y_8, y_9]$ or $[y_7, y_9, y_{10}]$. The above example is merely illustrative, and the present disclosure is not limited thereto.
S204: calculate a first distance corresponding to each data vector in the first time series data vector set.
The first distance may comprise the distance between a first data vector and each identification vector in the identification vector set, and the first data vector may be any data vector in the first time series data vector set. In one possible implementation, the first distance is obtained by calculating the Euclidean distance between each data vector in the first time series data vector set and each identification vector in the identification vector set.
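The first distances of S204 can be sketched as a table with one row per data vector and one column per identification vector, using the Euclidean distance named above. The function name and the sample vectors are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def first_distances(data: List[Vector], ids: List[Vector]) -> List[List[float]]:
    """Row j holds the Euclidean distances from data[j] to every identification vector."""
    return [[math.dist(y, u) for u in ids] for y in data]


data = [[0.0, 0.0], [3.0, 4.0]]
ids = [[0.0, 0.0], [6.0, 8.0]]
distances = first_distances(data, ids)  # K = 2 first distances per data vector
```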
For example, take the first time series data vector set to be $[y_{m+1}, y_{m+2}, \ldots, y_t]$ and the identification vector set to be the $K$-dimensional $[y_{m+1}, y_{m+2}, \ldots, y_{m+K}]$. When the first data vector is $y_{m+1}$, the distance between $y_{m+1}$ and each identification vector in $[y_{m+1}, y_{m+2}, \ldots, y_{m+K}]$ is calculated, giving $K$ first distances corresponding to $y_{m+1}$. When the first data vector is $y_{m+K+1}$, the distance between $y_{m+K+1}$ and each identification vector in $[y_{m+1}, y_{m+2}, \ldots, y_{m+K}]$ is calculated, giving $K$ first distances corresponding to $y_{m+K+1}$. Proceeding in the same way for every data vector in $[y_{m+1}, y_{m+2}, \ldots, y_t]$ yields the first distances. The above example is merely illustrative, and the present disclosure is not limited thereto.
S205: determine, from the first distances, a target distance corresponding to the first data vector, and determine the identification vector corresponding to the target distance as the target identification vector corresponding to the first data vector, where the target distance may be the smallest of the first distances.
After S204 is executed, the $K$ first distances corresponding to each data vector in the first time series data vector set are known. The smallest of the $K$ first distances corresponding to the first data vector can then be determined as the target distance corresponding to the first data vector, and the identification vector corresponding to that target distance is determined as the target identification vector corresponding to the first data vector.
Continuing the example in which the first time series data vector set is $[y_6, y_7, y_8, y_9, y_{10}]$ and the identification vector set is the 3-dimensional $[y_6, y_7, y_8]$, the first data vector is any one of $y_6, y_7, y_8, y_9, y_{10}$. When the first data vector is $y_6$, the distance between $y_6$ and each identification vector in $[y_6, y_7, y_8]$ is calculated; because $y_6$ is itself an identification vector, its distance to itself is zero, which is the minimum, so the target identification vector corresponding to $y_6$ is $y_6$. Similarly, the target identification vector corresponding to $y_7$ is $y_7$, and the target identification vector corresponding to $y_8$ is $y_8$. When the first data vector is $y_9$, the distance between $y_9$ and each identification vector in $[y_6, y_7, y_8]$ is calculated. Suppose $y_9$ is closest to the identification vector $y_6$ and, likewise, $y_{10}$ is closest to the identification vector $y_7$; then the target identification vector corresponding to $y_9$ is $y_6$, and the target identification vector corresponding to $y_{10}$ is $y_7$. The above example is merely illustrative, and the present disclosure is not limited thereto.
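The assignment in S205 picks, for each row of first distances, the column with the smallest value. The sketch below mirrors the assumed outcome of the example above; the function name and the distance values are hypothetical.

```python
from typing import List, Tuple


def assign_targets(distances: List[List[float]]) -> List[Tuple[int, float]]:
    """For each row, return (index of the nearest identification vector, target distance)."""
    result = []
    for row in distances:
        # The target distance is the smallest of the first distances in the row.
        k = min(range(len(row)), key=row.__getitem__)
        result.append((k, row[k]))
    return result


# Rows stand for y_6..y_10, columns for the identification vectors y_6, y_7, y_8.
example = [
    [0.0, 2.0, 4.0],  # y_6 is its own identification vector
    [2.0, 0.0, 2.0],  # y_7
    [4.0, 2.0, 0.0],  # y_8
    [1.0, 3.0, 5.0],  # y_9 is assumed closest to y_6
    [3.0, 1.5, 2.5],  # y_10 is assumed closest to y_7
]
assignment = assign_targets(example)
```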
S206: calculate the mean vector of the data vectors corresponding to the target identification vector, and take the mean vector as the updated target identification vector; and determine the target identification vector set according to the updated target identification vectors.
Continuing the example in which the first time series data vector set is $[y_6, y_7, y_8, y_9, y_{10}]$ and the identification vector set is the 3-dimensional $[y_6, y_7, y_8]$: after S205 is executed, the data vectors in $[y_6, y_7, y_8, y_9, y_{10}]$ corresponding to the target identification vector $y_6$ are $y_6$ and $y_9$, those corresponding to the target identification vector $y_7$ are $y_7$ and $y_{10}$, and the one corresponding to the target identification vector $y_8$ is $y_8$. The mean vector of $y_6$ and $y_9$ can therefore be taken as the updated target identification vector $y_6'$, the mean vector of $y_7$ and $y_{10}$ as the updated target identification vector $y_7'$, and the data vector $y_8$ as the updated target identification vector $y_8'$ (here the target identification vector $y_8$ before the update is the data vector $y_8$ itself, so no mean needs to be calculated). The updated target identification vector set determined from the updated target identification vectors is then $[y_6', y_7', y_8']$. The above example is merely illustrative, and the present disclosure is not limited thereto.
S207, determine whether the target distance remains unchanged over a first preset number of consecutive loop iterations.
In one possible implementation, when the target distance is determined to remain unchanged over the first preset number of consecutive loop iterations, the mark vectors can be regarded as having converged. The converged mark vectors can then be taken as the target identification vectors, which determines the target identification vector set, so that time sequence data can subsequently be predicted according to the target identification vector set.
When the target distance is determined to remain unchanged over the first preset number of consecutive loop iterations, S208 is executed; when the number of loop iterations has not reached the first preset number and/or the target distance has changed, S204 to S207 are executed again.
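The loop over S204 to S207 can be sketched as below. It is only an illustration under stated assumptions: the distance is taken to be Euclidean, `patience` stands for the first preset number, `max_iters` is a safety cap that the disclosure does not mention, a mark vector with no assigned data vectors is left unchanged, and all function names are hypothetical.

```python
import math


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def refine_mark_vectors(data_vectors, mark_vectors, patience=3, max_iters=100):
    marks = [list(c) for c in mark_vectors]
    previous = None  # target distances from the previous iteration
    unchanged = 0    # consecutive iterations with unchanged target distances
    for _ in range(max_iters):
        # S204/S205: nearest mark vector and target distance for each data vector.
        labels, target_distances = [], []
        for y in data_vectors:
            d = [euclidean(y, c) for c in marks]
            k = min(range(len(d)), key=d.__getitem__)
            labels.append(k)
            target_distances.append(d[k])
        # S206: replace each mark vector by the mean of its assigned data vectors.
        for k in range(len(marks)):
            members = [y for y, label in zip(data_vectors, labels) if label == k]
            if members:  # assumption: an empty group keeps its old mark vector
                marks[k] = mean_vector(members)
        # S207: count consecutive iterations with unchanged target distances.
        same = previous is not None and all(
            math.isclose(p, q) for p, q in zip(previous, target_distances)
        )
        unchanged = unchanged + 1 if same else 0
        previous = target_distances
        if unchanged >= patience:
            break
    return marks


data = [[0.0, 0.0], [10.0, 10.0], [20.0, 20.0], [2.0, 0.0], [10.0, 12.0]]
converged = refine_mark_vectors(data, data[:3], patience=3)
print(converged)
```

Comparing target distances with `math.isclose` rather than `==` is a design choice of this sketch to tolerate floating-point noise; an exact comparison would follow the wording of S207 more literally.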
S208, determine the object vector corresponding to the data to be predicted from the first time sequence data vector set.
Illustratively, when the multiple historical time sequence data are [x_1, x_2, ..., x_t], the data to be predicted is x_{t+1} (it should be noted that, in the disclosure, what can be predicted is the probability that the data to be predicted x_{t+1} falls within a preset condition). That is, in one possible implementation, the multiple historical time sequence data [x_1, x_2, ..., x_t] can be used to predict the data x_{t+1} at moment t+1. In this case, the object vector corresponding to the data to be predicted x_{t+1} can be obtained from the first time sequence data vector set [y_{m+1}, y_{m+2}, ..., y_t] as y_t = [x_{t-m}, x_{t-m+1}, ..., x_t]^T. The above example is merely illustrative, and the disclosure is not limited thereto.
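As an illustration of how the object vector relates to the historical data, the sketch below builds the vectors y_j = [x_{j-m}, ..., x_j]^T for j = m+1, ..., t and picks y_t. The function name `build_vector_set`, the sample values and the choice m = 5 are hypothetical.

```python
def build_vector_set(x, m):
    """Return {j: y_j} for j = m+1..t, where y_j = [x_{j-m}, ..., x_j] (1-based j)."""
    t = len(x)
    if not 0 <= m < t:
        raise ValueError("m must satisfy 0 <= m < len(x)")
    # x[0] holds x_1, so x_{j-m}..x_j is the slice x[j-m-1 : j].
    return {j: x[j - m - 1 : j] for j in range(m + 1, t + 1)}


history = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]  # hypothetical x_1..x_10
vectors = build_vector_set(history, m=5)
object_vector = vectors[len(history)]      # y_t, used when predicting x_{t+1}
print(sorted(vectors), object_vector)
```

With t = 10 and m = 5 the indices run from 6 to 10, and each vector has m + 1 components.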
S209, calculate the second distance between the object vector and each target identification vector in the target identification vector set.
In one possible implementation, the second distance can be obtained by calculating the Euclidean distance between the object vector and each target identification vector.
S210, in the target identification vector set, determine according to the second distance a second preset number of data vectors nearest to the object vector, to obtain the activation vector set.
Illustratively, take the object vector y_predict = [2, 2, 3, 4]^T and the target identification vector set [y_1, y_2, y_3, y_4], where y_1 = [1, 2, 3, 4]^T, y_2 = [2, 2, 4, 4]^T, y_3 = [4, 4, 2, 2]^T and y_4 = [4, 3, 2, 1]^T. The second distances between the object vector y_predict and the four target identification vectors in the target identification vector set [y_1, y_2, y_3, y_4] are then:
dist(y_predict, y_1) = 1
dist(y_predict, y_2) = 1
dist(y_predict, y_3) ≈ 3.61
dist(y_predict, y_4) ≈ 3.87
When the second preset number is 2, the target identification vectors y_1 and y_2 in the target identification vector set are determined to form the activation vector set [y_1, y_2]. The above example is merely illustrative, and the disclosure is not limited thereto.
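The selection in S209 and S210 amounts to ranking the target identification vectors by their Euclidean distance to the object vector and keeping the nearest ones. A minimal sketch using the vectors of the example above follows; the function names are hypothetical and `k` stands for the second preset number.

```python
import math


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def select_activation_set(object_vector, identification_vectors, k):
    """Return (indices of the k nearest identification vectors, all second distances)."""
    second = [euclidean(object_vector, c) for c in identification_vectors]
    # sorted() is stable, so vectors at equal distance keep their original order.
    ranked = sorted(range(len(second)), key=second.__getitem__)
    return ranked[:k], second


y_predict = [2, 2, 3, 4]
ids = [[1, 2, 3, 4], [2, 2, 4, 4], [4, 4, 2, 2], [4, 3, 2, 1]]
chosen, second = select_activation_set(y_predict, ids, k=2)
print(chosen, [round(d, 2) for d in second])
```

The returned indices are zero-based, so index 0 corresponds to y_1 and index 1 to y_2.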
S211, obtain the density function of each data vector in the activation vector set under the preset conditions, and calculate the information entropy of each data vector in the activation vector set according to the density function.
Information entropy can serve as a measure of the uncertainty of a density function. Therefore, if the information entropy of a data vector in the activation vector set is small, the predicted values based on that data vector are mostly invalid predictions, and its weight needs to be reduced so as to improve the accuracy of the prediction.
In one possible implementation, for ease of description, the density function of the i-th data vector in the activation vector set can be expressed by the following formula:
where f_i(x) denotes the density function of the i-th data vector in the activation vector set; p_n is the normalized result of the statistical counts of the density function; a+Δ ≤ x ≤ a+2Δ, a+2Δ ≤ x ≤ a+3Δ and a+nΔ ≤ x ≤ a+(n+1)Δ denote the different preset conditions in which the data to be predicted may lie; a is the preset boundary threshold of the multiple preset conditions; Δ is the preset data variation; and n is the number of preset conditions. The information entropy of each data vector in the activation vector set can then be calculated from the density function by the following formula:
where I(f_i(x)) denotes the information entropy of the density function f_i(x) of the i-th data vector in the activation vector set, and p_j denotes the probability that the data to be predicted lies in the j-th preset condition.
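The entropy formula itself is not reproduced in this text. As a hedged illustration only, the sketch below assumes the standard Shannon entropy, I = -Σ_j p_j ln p_j, computed over the normalized counts p_1, ..., p_n of the n preset conditions. The natural logarithm, the function names and the sample counts are assumptions.

```python
import math


def normalize_counts(counts):
    """Turn per-condition statistical counts into probabilities p_1..p_n."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("at least one observation is required")
    return [c / total for c in counts]


def information_entropy(probabilities):
    """Shannon entropy in nats; terms with p = 0 contribute nothing."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


p = normalize_counts([1, 3])  # hypothetical counts for two preset conditions
print(p, information_entropy(p))
```

Under this assumption a density concentrated on a single preset condition has entropy 0, and a uniform density over n preset conditions has the maximum entropy ln n.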
S212, determine the predicted density function of the data to be predicted according to the information entropy.
In one possible implementation, the predicted density function of the data to be predicted can be calculated from the information entropy by the following formula:
where f(x) denotes the predicted density function of the data to be predicted, f_i(x) denotes the density function of the i-th data vector in the activation vector set, and I(f_i(x)) denotes the information entropy of the density function f_i(x) of the i-th data vector in the activation vector set.
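The combining formula is not reproduced in this text, so the sketch below shows only a general shape that such a combination could take: a normalized weighted sum of the per-vector densities over the preset conditions. How a weight is derived from the information entropy is deliberately left to a caller-supplied function, here called `weight_from_entropy`, which is a hypothetical name. The Shannon entropy with natural logarithm is likewise an assumption.

```python
import math


def information_entropy(probabilities):
    """Shannon entropy in nats (an assumption of this sketch)."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


def combine_densities(densities, weight_from_entropy):
    """Weighted average of binned densities; weights come from each density's entropy."""
    weights = [weight_from_entropy(information_entropy(d)) for d in densities]
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_bins = len(densities[0])
    return [
        sum(w * d[j] for w, d in zip(weights, densities)) / total
        for j in range(n_bins)
    ]


# Two hypothetical binned densities; equal weights are used purely for demonstration.
densities = [[0.2, 0.8], [0.6, 0.4]]
predicted = combine_densities(densities, weight_from_entropy=lambda entropy: 1.0)
print(predicted)
```

Because the weights are normalized, the combined values again sum to 1 whenever each input density does.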
S213, predict, according to the predicted density function, the probability that the data to be predicted satisfies the preset condition.
Illustratively, take the prediction of a bank's daily transaction volume as an example, in which the data to be predicted x_{t+1} is the transaction volume to be predicted. In one possible implementation, the following three preset conditions can be set: the transaction volume to be predicted is below 80,000 (i.e. x_{t+1} < 80000), between 80,000 and 100,000 (i.e. 80000 ≤ x_{t+1} ≤ 100000), or above 100,000 (i.e. x_{t+1} > 100000). The probabilities that the transaction volume to be predicted satisfies these three preset conditions can then be predicted from the predicted density function f(x), for example: 10% for below 80,000, 75% for between 80,000 and 100,000, and 15% for above 100,000. The above example is merely illustrative, and the disclosure is not limited thereto.
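Reading condition probabilities off a binned predicted density can be sketched as below. The bin edges and bin probabilities are hypothetical values chosen only so that the three conditions of the example come out as 10%, 75% and 15%; the function name `probability_between` is also hypothetical.

```python
import math


def probability_between(edges, probs, low=float("-inf"), high=float("inf")):
    """Sum the probability of every bin that lies entirely inside [low, high]."""
    return sum(
        p
        for left, right, p in zip(edges[:-1], edges[1:], probs)
        if left >= low and right <= high
    )


# Hypothetical predicted density over six bins of width 10,000.
edges = [60000, 70000, 80000, 90000, 100000, 110000, 120000]
probs = [0.04, 0.06, 0.35, 0.40, 0.10, 0.05]

below = probability_between(edges, probs, high=80000)
middle = probability_between(edges, probs, low=80000, high=100000)
above = probability_between(edges, probs, low=100000)
print(below, middle, above)
```

The sketch requires the condition boundaries to coincide with bin edges; a condition that cuts through a bin would need an interpolation rule that is not given here.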
It should be noted that after the true value of the data to be predicted is obtained, that data can be used as new historical time sequence data, and the density function of each data vector in the activation vector set can be updated according to the new historical time sequence data. The data prediction method in the disclosure can thereby adapt automatically to new patterns in the time sequence, so that the prediction of time sequence data becomes more accurate; it also does not need to learn a large amount of historical data in advance, which improves the applicability of the prediction algorithm.
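One way to picture this update is a count-based density whose counts grow as true values arrive. The sketch below is an assumption about how such bookkeeping could look, not the procedure of the disclosure: bin j (j = 1, ..., n) is taken to cover a + jΔ up to a + (j+1)Δ, values outside all bins are ignored, and the class name `BinnedDensity` is hypothetical.

```python
class BinnedDensity:
    """Statistical counts over n preset conditions of width delta starting at a + delta."""

    def __init__(self, a, delta, n):
        self.a, self.delta, self.n = a, delta, n
        self.counts = [0] * n

    def update(self, x):
        """Record an observed true value; return False if it falls outside all bins."""
        j = int((x - self.a) // self.delta)  # bin j covers [a + j*delta, a + (j+1)*delta)
        if 1 <= j <= self.n:
            self.counts[j - 1] += 1
            return True
        return False

    def probabilities(self):
        """Normalized counts p_1..p_n."""
        total = sum(self.counts)
        if total == 0:
            raise ValueError("no observations recorded yet")
        return [c / total for c in self.counts]


density = BinnedDensity(a=0, delta=10, n=3)
for value in (15, 25, 27, 5):
    density.update(value)
print(density.counts, density.probabilities())
```

Because only counts are stored, each new true value updates the density in constant time without revisiting earlier data.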
With the above method, the prediction result can be given in the form of the density function of the data to be predicted, so that the probabilities that the data to be predicted satisfies different preset conditions can be shown to the user according to the density function, which provides a more useful reference for actual business needs.
Fig. 3 is a block diagram of a data prediction device according to an exemplary embodiment. As shown in Fig. 3, the device includes:
a first obtaining module 301, configured to obtain multiple historical time sequence data;
a data conversion module 302, configured to convert the multiple historical time sequence data into a first time sequence data vector set according to the acquisition moments corresponding to the multiple historical time sequence data;
a second obtaining module 303, configured to obtain a mark vector collection according to the first time sequence data vector set;
a third obtaining module 304, configured to determine a target identification vector set according to the mark vector collection and the first time sequence data vector set;
a first determining module 305, configured to determine an activation vector set according to the first time sequence data vector set and the target identification vector set;
a second determining module 306, configured to obtain the density function of each data vector in the activation vector set under a preset condition, and to determine the predicted density function of the data to be predicted according to the density function;
a prediction module 307, configured to predict, according to the predicted density function, the probability that the data to be predicted satisfies the preset condition.
Optionally, the data conversion module 302 is configured to convert the historical time sequence data into the first time sequence data vector set, according to the acquisition moments corresponding to the multiple historical time sequence data, by the following formula:
y_j = [x_{j-m}, x_{j-m+1}, ..., x_j]^T
where x_j denotes the historical time sequence data acquired at moment j among the multiple historical time sequence data, x_{j-m} denotes the historical time sequence data acquired at moment j-m among the multiple historical time sequence data, y_j is any data vector in the first time sequence data vector set [y_{m+1}, y_{m+2}, ..., y_t], and j ranges from m+1 to t.
Optionally, the third obtaining module 304 is configured to execute a mark vector collection update step in a loop until a loop termination condition is met, and to determine the mark vector collection at the time the loop termination condition is met as the target identification vector set. The mark vector collection update step includes: calculating the first distance corresponding to each data vector in the first time sequence data vector set, where the first distance includes the distance between a first data vector and each mark vector in the mark vector collection, and the first data vector includes any data vector in the first time sequence data vector set; determining the target distance corresponding to the first data vector from the first distance, and determining the mark vector corresponding to the target distance as the target identification vector corresponding to the first data vector; calculating the mean vector of the data vectors corresponding to the target identification vector, and using the mean vector as the updated target identification vector; and determining the target identification vector set according to the updated target identification vector. The loop termination condition includes the target distance remaining unchanged over a first preset number of consecutive loop iterations.
Optionally, the first determining module 305 is configured to determine the object vector corresponding to the data to be predicted from the first time sequence data vector set; calculate the second distance between the object vector and each target identification vector in the target identification vector set; and, in the target identification vector set, determine according to the second distance a second preset number of data vectors nearest to the object vector, to obtain the activation vector set.
Optionally, the second determining module 306 is configured to calculate the information entropy of each data vector in the activation vector set according to the density function, and to determine the predicted density function according to the information entropy.
As for the device in the above embodiment, the specific manner in which each module performs its operations has been described in detail in the embodiments of the method and is not elaborated here.
With the above device, the prediction result can be given in the form of the density function of the data to be predicted, so that the probabilities that the data to be predicted satisfies different preset conditions can be shown to the user according to the density function, which provides a more useful reference for actual business needs.
Fig. 4 is a block diagram of an electronic device 400 according to an exemplary embodiment. As shown in Fig. 4, the electronic device 400 may include a processor 401 and a memory 402. The electronic device 400 may also include one or more of a multimedia component 403, an input/output (I/O) interface 404 and a communication component 405.
The processor 401 is configured to control the overall operation of the electronic device 400 so as to complete all or part of the steps of the data prediction method described above. The memory 402 is configured to store various types of data to support operation on the electronic device 400; these data may include, for example, instructions of any application program or method operating on the electronic device 400, as well as application-related data such as contact data, sent and received messages, pictures, audio and video. The memory 402 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk or an optical disc. The multimedia component 403 may include a screen and an audio component. The screen may be, for example, a touch screen, and the audio component is configured to output and/or input audio signals. For example, the audio component may include a microphone for receiving external audio signals. The received audio signals may be further stored in the memory 402 or sent through the communication component 405. The audio component also includes at least one loudspeaker for outputting audio signals. The I/O interface 404 provides an interface between the processor 401 and other interface modules, which may be a keyboard, a mouse, buttons and the like. The buttons may be virtual buttons or physical buttons. The communication component 405 is used for wired or wireless communication between the electronic device 400 and other devices. The wireless communication may be, for example, Wi-Fi, Bluetooth, near field communication (NFC), 2G, 3G or 4G, or a combination of one or more of them; accordingly, the communication component 405 may include a Wi-Fi module, a Bluetooth module and an NFC module.
In an exemplary embodiment, the electronic device 400 may be implemented by one or more application specific integrated circuits (ASIC), digital signal processors (DSP), digital signal processing devices (DSPD), programmable logic devices (PLD), field programmable gate arrays (FPGA), controllers, microcontrollers, microprocessors or other electronic components, for executing the data prediction method described above.
In another exemplary embodiment, a computer readable storage medium including program instructions is also provided; the program instructions, when executed by a processor, implement the steps of the data prediction method described above. For example, the computer readable storage medium may be the above memory 402 including program instructions, and the program instructions can be executed by the processor 401 of the electronic device 400 to complete the data prediction method described above.
The preferred embodiments of the disclosure have been described in detail above with reference to the accompanying drawings. However, the disclosure is not limited to the specific details of the above embodiments. Within the scope of the technical concept of the disclosure, various simple variations can be made to the technical solutions of the disclosure, and these simple variations all fall within the protection scope of the disclosure.
It should further be noted that the specific technical features described in the above embodiments can be combined in any suitable manner provided there is no contradiction. To avoid unnecessary repetition, the disclosure does not further describe the various possible combinations.
In addition, the different embodiments of the disclosure can also be combined arbitrarily; as long as such a combination does not depart from the idea of the disclosure, it should likewise be regarded as content disclosed by the disclosure.
Claims (10)
1. A data prediction method, characterized in that the method comprises:
obtaining multiple historical time sequence data;
converting the multiple historical time sequence data into a first time sequence data vector set according to the acquisition moments corresponding to the multiple historical time sequence data;
obtaining a mark vector collection according to the first time sequence data vector set;
determining a target identification vector set according to the mark vector collection and the first time sequence data vector set;
determining an activation vector set according to the first time sequence data vector set and the target identification vector set;
obtaining the density function of each data vector in the activation vector set under a preset condition, and determining the predicted density function of data to be predicted according to the density function;
predicting, according to the predicted density function, the probability that the data to be predicted satisfies the preset condition.
2. The method according to claim 1, characterized in that determining the target identification vector set according to the mark vector collection and the first time sequence data vector set comprises:
executing a mark vector collection update step in a loop until a loop termination condition is met, and determining the mark vector collection at the time the loop termination condition is met as the target identification vector set;
the mark vector collection update step comprising: calculating the first distance corresponding to each data vector in the first time sequence data vector set, the first distance including the distance between a first data vector and each mark vector in the mark vector collection, and the first data vector including any data vector in the first time sequence data vector set;
determining the target distance corresponding to the first data vector from the first distance, and determining the mark vector corresponding to the target distance as the target identification vector corresponding to the first data vector, the target distance being the smallest of the first distances;
calculating the mean vector of the data vectors corresponding to the target identification vector, and using the mean vector as the updated target identification vector; and determining the target identification vector set according to the updated target identification vector;
the loop termination condition including the target distance remaining unchanged over a first preset number of consecutive loop iterations.
3. The method according to claim 1, characterized in that determining the activation vector set according to the first time sequence data vector set and the target identification vector set comprises:
determining the object vector corresponding to the data to be predicted from the first time sequence data vector set;
calculating the second distance between the object vector and each target identification vector in the target identification vector set;
in the target identification vector set, determining according to the second distance a second preset number of data vectors nearest to the object vector, to obtain the activation vector set.
4. The method according to any one of claims 1 to 3, characterized in that determining the predicted density function of the data to be predicted according to the density function comprises:
calculating the information entropy of each data vector in the activation vector set according to the density function;
determining the predicted density function according to the information entropy.
5. A data prediction device, characterized in that the device comprises:
a first obtaining module, configured to obtain multiple historical time sequence data;
a data conversion module, configured to convert the multiple historical time sequence data into a first time sequence data vector set according to the acquisition moments corresponding to the multiple historical time sequence data;
a second obtaining module, configured to obtain a mark vector collection according to the first time sequence data vector set;
a third obtaining module, configured to determine a target identification vector set according to the mark vector collection and the first time sequence data vector set;
a first determining module, configured to determine an activation vector set according to the first time sequence data vector set and the target identification vector set;
a second determining module, configured to obtain the density function of each data vector in the activation vector set under a preset condition, and to determine the predicted density function of data to be predicted according to the density function;
a prediction module, configured to predict, according to the predicted density function, the probability that the data to be predicted satisfies the preset condition.
6. The device according to claim 5, characterized in that the third obtaining module is configured to execute a mark vector collection update step in a loop until a loop termination condition is met, and to determine the mark vector collection at the time the loop termination condition is met as the target identification vector set; the mark vector collection update step comprising: calculating the first distance corresponding to each data vector in the first time sequence data vector set, the first distance including the distance between a first data vector and each mark vector in the mark vector collection, and the first data vector including any data vector in the first time sequence data vector set; determining the target distance corresponding to the first data vector from the first distance, and determining the mark vector corresponding to the target distance as the target identification vector corresponding to the first data vector, the target distance being the smallest of the first distances; calculating the mean vector of the data vectors corresponding to the target identification vector, and using the mean vector as the updated target identification vector; and determining the target identification vector set according to the updated target identification vector; the loop termination condition including the target distance remaining unchanged over a first preset number of consecutive loop iterations.
7. The device according to claim 5, characterized in that the first determining module is configured to determine the object vector corresponding to the data to be predicted from the first time sequence data vector set; calculate the second distance between the object vector and each target identification vector in the target identification vector set; and, in the target identification vector set, determine according to the second distance a second preset number of data vectors nearest to the object vector, to obtain the activation vector set.
8. The device according to any one of claims 5 to 7, characterized in that the second determining module is configured to calculate the information entropy of each data vector in the activation vector set according to the density function, and to determine the predicted density function according to the information entropy.
9. A computer readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 4.
10. An electronic device, characterized by comprising:
a memory on which a computer program is stored;
a processor, configured to execute the computer program in the memory so as to implement the steps of the method according to any one of claims 1 to 4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811475791.8A CN109754115B (en) | 2018-12-04 | 2018-12-04 | Data prediction method and device, storage medium and electronic equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109754115A (en) | 2019-05-14
CN109754115B (en) | 2021-03-26
Family
ID=66403636
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811475791.8A Active CN109754115B (en) | 2018-12-04 | 2018-12-04 | Data prediction method and device, storage medium and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109754115B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160034615A1 (en) * | 2014-08-01 | 2016-02-04 | Tata Consultancy Services Limited | System and method for forecasting a time series data |
CN107092582A (en) * | 2017-03-31 | 2017-08-25 | 江苏方天电力技术有限公司 | One kind is based on the posterior exceptional value on-line checking of residual error and method for evaluating confidence |
CN107180278A (en) * | 2017-05-27 | 2017-09-19 | 重庆大学 | A kind of real-time passenger flow forecasting of track traffic |
CN108549647A (en) * | 2018-01-17 | 2018-09-18 | 中移在线服务有限公司 | The method without accident in mark language material active predicting movement customer service field is realized based on SinglePass algorithms |
Non-Patent Citations (1)
Title |
---|
王婷: ""Pair_Copula自回归模型及其在股票指数中的应用"", 《中国优秀硕士学位论文全文数据库 经济与管理科学辑》 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111291824A (en) * | 2020-02-24 | 2020-06-16 | 网易(杭州)网络有限公司 | Time sequence processing method and device, electronic equipment and computer readable medium |
CN111291824B (en) * | 2020-02-24 | 2024-03-22 | 网易(杭州)网络有限公司 | Time series processing method, device, electronic equipment and computer readable medium |
Also Published As
Publication number | Publication date |
---|---|
CN109754115B (en) | 2021-03-26 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||