Summary of the Invention
In view of the above problems, the embodiments of the present application are proposed to provide a time-series-based data prediction method, and a corresponding time-series-based data prediction device, that overcome or at least partially solve the above problems.
To solve the above problems, the present application discloses a time-series-based data prediction method, the method comprising:
obtaining historical time series data of a plurality of category objects, wherein each category object includes one or more data objects;
screening out feature category objects from the plurality of category objects, wherein a feature category object is a category object that contains feature data objects, and a feature data object is a data object whose life cycle is shorter than a preset time threshold;
based on the historical time series data corresponding to the feature category objects, predicting target data objects from the data objects contained in the feature category objects, wherein a target data object is a data object whose future time series data, to be generated in a future first preset time period, satisfies a preset growth trend.
Preferably, the method further comprises:
predicting the future time series data of the target data objects in the future first preset time period.
Preferably, the step of obtaining historical time series data of a plurality of category objects comprises:
for a plurality of preset time intervals, calculating the quantity of specified feature data that is stored in a preset database for each time interval and corresponds to the data object, as the historical feature data of the data object in that time interval;
organizing the historical feature data of the data object in all time intervals to obtain the historical time series data of the data object;
for each time interval, calculating the sum of the historical feature data, in that time interval, of the data objects contained in each category object;
organizing the sums of the historical feature data of all time intervals into the historical time series data of the category object.
Preferably, the step of screening out feature category objects from the plurality of category objects comprises:
screening out first feature category objects from the plurality of category objects based on the historical time series data of the category objects;
obtaining preset second feature category objects;
organizing the first feature category objects and the second feature category objects into the feature category objects.
Preferably, the step of screening out first feature category objects from the plurality of category objects based on the historical time series data of the category objects comprises:
calculating the median M of the historical time series data of each category object in a past first preset time period;
calculating the number of time intervals in which the sum of the historical feature data is greater than a preset multiple of M;
if the number of time intervals in which the sum of the historical feature data is greater than the preset multiple of M falls within a preset range, determining that the category object is a first feature category object.
Preferably, the step of predicting target data objects from the data objects contained in the feature category objects, based on the historical time series data corresponding to the feature category objects, comprises:
normalizing the feature category objects based on their corresponding historical time series data;
clustering the data objects contained in all normalized feature category objects to obtain cluster objects;
predicting target cluster objects from the cluster objects;
taking the data objects contained in the target cluster objects as the target data objects.
Preferably, the step of predicting target cluster objects from the cluster objects comprises:
calculating first average historical time series data of the cluster object based on the historical time series data of the data objects in the cluster object within the past month;
calculating second average historical time series data of the cluster object based on the historical time series data of the data objects in the cluster object in the thirteenth month in the past;
calculating third average historical time series data of the cluster object based on the historical time series data of the data objects in the cluster object in the twelfth month in the past;
estimating, according to the first, second and third average historical time series data, the future average time series data of the cluster object in the future first preset time period;
calculating the difference between the future average time series data and the first average historical time series data to obtain the indicator data of the cluster object;
taking the cluster objects whose indicator data is greater than a preset threshold as the target cluster objects.
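As an illustration only, the indicator calculation above can be sketched as follows. The text does not state how the three averages are combined into the future estimate; this sketch assumes a year-over-year ratio (last month's level scaled by the change from the thirteenth to the twelfth month in the past), collapses each "average time series" to a single mean level, and assumes 30-day months. The names `window_mean`, `cluster_indicator` and `DAYS_PER_MONTH` are hypothetical.

```python
from statistics import mean

DAYS_PER_MONTH = 30  # assumption: every month is treated as 30 days


def window_mean(series, months_back):
    """Mean of the 30-day window that lies `months_back` months in the past.

    `series` is a daily list ordered oldest to newest; months_back=1 is the
    most recent 30 days, months_back=13 is the window 13 months ago.
    """
    end = len(series) - (months_back - 1) * DAYS_PER_MONTH
    start = end - DAYS_PER_MONTH
    if start < 0:
        raise ValueError("series is too short for the requested window")
    return mean(series[start:end])


def cluster_indicator(cluster_series):
    """Indicator = estimated future average minus the past-month average."""
    first = mean([window_mean(s, 1) for s in cluster_series])    # past month
    second = mean([window_mean(s, 13) for s in cluster_series])  # 13th month back
    third = mean([window_mean(s, 12) for s in cluster_series])   # 12th month back
    if second == 0:
        raise ValueError("cannot form a year-over-year ratio from a zero average")
    future = first * third / second  # assumed estimator, not given in the text
    return future - first


# Hypothetical normalized data: a spike in the window 12 months ago.
seasonal = [1.0] * 30 + [3.0] * 30 + [1.0] * 330
flat = [1.0] * 390
threshold = 0.5  # hypothetical preset threshold
is_target = cluster_indicator([seasonal]) > threshold
```

Under these assumptions, a cluster that surged in the same month last year receives a positive indicator and is selected, while a flat cluster receives an indicator of zero.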
Preferably, the step of predicting the future time series data of the target data objects in the future first preset time period comprises:
denormalizing the future average time series data of the cluster object in the future first preset time period to obtain reference average time series data for each data object in the cluster object;
correcting the reference average time series data of each data object to obtain the future time series data of the corresponding data object in the future first preset time period.
Preferably, the data object is commodity data, the category object is a commodity category, the feature category object is a perishable commodity category, the life cycle is the timeliness of the commodity, and the time series data is the daily sales volume of the commodity.
The present application also discloses a time-series-based data prediction device, the device comprising:
a historical time series data acquisition module, configured to obtain historical time series data of a plurality of category objects, wherein each category object includes one or more data objects;
a feature category object screening module, configured to screen out feature category objects from the plurality of category objects, wherein a feature category object is a category object that contains feature data objects, and a feature data object is a data object whose life cycle is shorter than a preset time threshold;
a target data object prediction module, configured to predict, based on the historical time series data corresponding to the feature category objects, target data objects from the data objects contained in the feature category objects, wherein a target data object is a data object whose future time series data, to be generated in a future first preset time period, satisfies a preset growth trend.
Preferably, the device further comprises:
a future time series data prediction module, configured to predict the future time series data of the target data objects in the future first preset time period.
Preferably, the historical time series data acquisition module comprises:
a historical feature data calculation submodule, configured to calculate, for a plurality of preset time intervals, the quantity of specified feature data that is stored in a preset database for each time interval and corresponds to the data object, as the historical feature data of the data object in that time interval;
a historical feature data organization submodule, configured to organize the historical feature data of the data object in all time intervals to obtain the historical time series data of the data object;
a historical feature data statistics submodule, configured to calculate, for each time interval, the sum of the historical feature data, in that time interval, of the data objects contained in each category object;
a historical time series data organization submodule, configured to organize the sums of the historical feature data of all time intervals into the historical time series data of the category object.
Preferably, the feature category object screening module comprises:
a first feature category object screening submodule, configured to screen out first feature category objects from the plurality of category objects based on the historical time series data of the category objects;
a second feature category object acquisition submodule, configured to obtain preset second feature category objects;
an organization submodule, configured to organize the first feature category objects and the second feature category objects into the feature category objects.
Preferably, the first feature category object screening submodule is further configured to:
calculate the median M of the historical time series data of each category object in a past first preset time period;
calculate the number of time intervals in which the sum of the historical feature data is greater than a preset multiple of M;
if the number of time intervals in which the sum of the historical feature data is greater than the preset multiple of M falls within a preset range, determine that the category object is a first feature category object.
Preferably, the target data object prediction module comprises:
a normalization submodule, configured to normalize the feature category objects based on their corresponding historical time series data;
a clustering submodule, configured to cluster the data objects contained in all normalized feature category objects to obtain cluster objects;
a prediction submodule, configured to predict target cluster objects from the cluster objects;
a target data object acquisition submodule, configured to take the data objects contained in the target cluster objects as the target data objects.
Preferably, the prediction submodule is further configured to:
calculate first average historical time series data of the cluster object based on the historical time series data of the data objects in the cluster object within the past month;
calculate second average historical time series data of the cluster object based on the historical time series data of the data objects in the cluster object in the thirteenth month in the past;
calculate third average historical time series data of the cluster object based on the historical time series data of the data objects in the cluster object in the twelfth month in the past;
estimate, according to the first, second and third average historical time series data, the future average time series data of the cluster object in the future first preset time period;
calculate the difference between the future average time series data and the first average historical time series data to obtain the indicator data of the cluster object;
take the cluster objects whose indicator data is greater than a preset threshold as the target cluster objects.
Preferably, the future time series data prediction module comprises:
a reference data acquisition submodule, configured to denormalize the future average time series data of the cluster object in the future first preset time period to obtain reference average time series data for each data object in the cluster object;
a correction submodule, configured to correct the reference average time series data of each data object to obtain the future time series data of the corresponding data object in the future first preset time period.
Preferably, the data object is commodity data, the category object is a commodity category, the feature category object is a perishable commodity category, the life cycle is the timeliness of the commodity, and the time series data is the daily sales volume of the commodity.
The embodiments of the present application have the following advantages:
In the embodiments of the present application, feature category objects with time-sensitive and seasonal characteristics can be screened out from a plurality of category objects. Based on the historical time series data of these feature category objects, the data objects whose soon-to-be-generated future time series data satisfies a preset growth trend, that is, the target data objects about to surge in the near term, can then be predicted from the data objects contained in the feature category objects. The embodiments predict the target data objects with near-term surge potential according to the principles of time series data, so that the prediction results agree more closely with reality and the accuracy is higher.
Detailed Description of the Embodiments
To make the above objects, features and advantages of the present application clearer and easier to understand, the application is described in further detail below with reference to the accompanying drawings and specific embodiments.
Referring to Fig. 1, a flow chart of the steps of Embodiment 1 of a time-series-based data prediction method of the present application is shown. The embodiment can be applied to platforms that have a tree-shaped category system, such as e-commerce platforms. A tree-shaped category system is a way of obtaining categories by classifying data according to tree-shaped classification. Tree-shaped classification is a figurative way of classifying that divides items layer by layer according to level, much like a large tree with leaves, branches, a trunk and roots.
For example, on an e-commerce platform, to help today's consumers select purposefully among the various commodities in an online store, the commodities can be classified using tree-shaped classification to obtain commodity categories, for example clothing, accessories, beauty, digital products, household goods, mother and baby, food, style, services and insurance.
As shown in Fig. 1, the embodiment of the present application may include the following steps:
Step 101, obtaining historical time series data of a plurality of category objects;
In the embodiment of the present application, a category object may include one or more data objects. For example, on an e-commerce platform, as shown in the category tree diagram of Fig. 2, a commodity category such as "seafood" may include commodity data such as "steamed crab", "octopus" and "dried scallop".
Further, each data object has a plurality of corresponding pieces of specified feature data. The specified feature data is generated in advance: it is the record generated when a specified behavior toward the data object is detected. For example, on an e-commerce platform, the specified behavior may include a sales behavior, and the specified feature data is the sales record generated when a sales behavior occurs for a commodity.
In a specific implementation, the specified feature data of a data object can be obtained from a preset database, which may be a database generated in advance. For example, the preset database may be a commodity database that stores a plurality of sales records for one or more commodities.
In practice, the preset database may also store data attribute information of the data objects. As an example, the data attribute information may include time attribute information, identity attribute information, feature attribute information, and so on. For example, a commodity database may also store the attribute information of each commodity, which may include the commodity's basic attributes, time attributes, transaction attributes, credit attributes and marketing attributes. The basic attributes of a commodity may include its title, the ID of the merchant it belongs to, its price, how long it has been listed, its category, and so on. The time attributes may include the times at which behaviors such as purchasing, commenting and listing occurred. The transaction attributes may include favoriting, adding to cart, purchasing, and so on. The credit attributes may include the merchant's star rating, the number of negative reviews, the negative review rate, the logistics score, and so on. The marketing attributes may include whether the commodity takes part in promotional activities, whether it is a promotional commodity, and so on.
In a preferred embodiment of the present application, step 101 may include the following sub-steps:
Sub-step S11, for a plurality of preset time intervals, calculating the quantity of specified feature data that is stored in the preset database for each time interval and corresponds to the data object, as the historical feature data of the data object in that time interval;
In a specific implementation, a time interval may be an interval set according to a time span; for example, the span may be one day, half a day, one week, one month, and so on. If the span is one day, the time interval may be [00:00, 23:59] of each day. The time interval may of course also carry date information; for example, the time interval for November 18, 2015 is [2015-11-18-00:00, 2015-11-18-23:59]. The preset time intervals may be set in advance by the developer.
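A minimal sketch of generating such day-long time intervals is given below, using only the Python standard library. The function name `daily_intervals` is hypothetical, and a one-day span is assumed.

```python
from datetime import date, datetime, time, timedelta


def daily_intervals(start, days):
    """Return one [00:00, 23:59] interval per day, beginning at `start`."""
    intervals = []
    for offset in range(days):
        day = start + timedelta(days=offset)
        lower = datetime.combine(day, time(0, 0))
        upper = datetime.combine(day, time(23, 59))
        intervals.append((lower, upper))
    return intervals


# Three consecutive intervals, the first being the one for 2015-11-18.
intervals = daily_intervals(date(2015, 11, 18), 3)
```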
After the plurality of preset time intervals is obtained, the quantity of specified feature data of the data object in each time interval (for example, each day) can be calculated to obtain the historical feature data for that time interval. For example, the number of sales records of a commodity on each day is calculated to obtain its daily sales volume.
Sub-step S12, organizing the historical feature data of the data object in all time intervals to obtain the historical time series data of the data object;
After the historical feature data of the data object in each time interval is obtained, organizing the historical feature data of all time intervals yields the historical time series data of the data object. Time series data is data collected at different points in time; it reflects how the state or degree of some thing or phenomenon changes over time. Time series data is a special form of data in which the past values of a sequence affect its future values, and the magnitude and manner of this influence can be characterized by behaviors such as trend, periodicity and non-stationarity in the time series data. Time series mining is essentially the prediction of future values from the trend of data over time. The key consideration is the special nature of time, such as the definition of periodic time units (week, month, quarter, year), the effect that particular dates such as holidays may have, the calculation of dates themselves, and other matters that need special attention such as the correlation between earlier and later times (how strongly the past influences the future). Only by fully considering time factors and using the series of values that existing data takes over time can future values be predicted well.
For example, after the daily sales volume of a commodity is obtained, the daily sales volumes of all days are organized to obtain the historical sales of the commodity.
The historical time series data of a data object can reflect the trend of the data object over some past time period.
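Sub-steps S11 and S12 can be sketched as follows, purely for illustration. The sketch assumes each sales record is reduced to the date on which it occurred and that one time interval equals one day; `history_series` and the sample records are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta


def history_series(record_dates, start, days):
    """Count records per day (S11) and arrange the counts in date order (S12)."""
    counts = Counter(record_dates)  # number of sales records on each day
    return [counts[start + timedelta(days=i)] for i in range(days)]


# Hypothetical sales records of one commodity: two on the 18th, one on the 20th.
records = [date(2015, 11, 18), date(2015, 11, 18), date(2015, 11, 20)]
series = history_series(records, date(2015, 11, 18), 3)
```

Days without any record appear as zero in the resulting series, so every data object yields a series of the same length.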
Sub-step S13, for each time interval, calculating the sum of the historical feature data, in that time interval, of the data objects contained in each category object;
Because a category object can include one or more data objects, once the historical feature data of each data object under the category object is obtained, the sum of the historical feature data of all data objects under the category object can be calculated for each time interval.
For example, if on a given day in the "seafood" category the daily sales volume of "steamed crab" is 1000 jin, that of "octopus" is 500 jin, and that of "dried scallop" is 300 jin, then the total daily sales volume of the "seafood" category on that date is 1800 jin.
Sub-step S14, organizing the sums of the historical feature data of all time intervals into the historical time series data of the category object.
Organizing the sums of the historical feature data of all time intervals yields the historical time series data of the category object.
For example, after the total daily sales volume of the "seafood" category is calculated for each day of the past month, the daily totals for all days of that month are organized to obtain the historical time series data of the "seafood" category for that month.
The historical time series data of a category object can reflect the trend of the category object over some past time period.
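Sub-steps S13 and S14 amount to an element-wise sum over the series of the data objects in a category. The sketch below is illustrative only: the first-day figures follow the "seafood" example above, the second-day figures are invented, and the name `category_series` and the dictionary keys are hypothetical.

```python
def category_series(object_series):
    """Sum the per-interval values of all data objects in one category."""
    return [sum(values) for values in zip(*object_series.values())]


# Daily sales in jin; day 1 follows the example above, day 2 is made up.
seafood = {
    "crab": [1000, 900],
    "octopus": [500, 450],
    "scallop": [300, 350],
}
seafood_series = category_series(seafood)
```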
In a specific implementation, step 101 can be performed by a category data generator, which generates the historical time series data of each category object according to the tree-shaped category system of the current platform. After step 101, the originally massive historical time series data of the data objects is merged into the historical time series data of each category object, providing solid data support for subsequent operations.
Step 102, screening out feature category objects from the plurality of category objects;
In the embodiment of the present application, after the historical time series data of each category object is obtained, feature category objects can be screened out from the plurality of category objects. A feature category object may be a category object that contains feature data objects, and a feature data object may be a data object whose life cycle is shorter than a preset time threshold, that is, a time-sensitive data object. For example, when the category object is a commodity category, the feature category object may be a perishable commodity category, that is, a category of time-sensitive commodities. A perishable commodity is a commodity that must be consumed within a certain time and has a very short shelf life, for example moon cakes and steamed crab. Perishable commodity categories may include fresh-food categories such as vegetables, fruit, seafood, raw meat and cooked food.
In a preferred embodiment of the present application, step 102 may include the following sub-steps:
Sub-step S21, screening out first feature category objects from the plurality of category objects based on the historical time series data of the category objects;
After the historical time series data of all category objects of the current platform is obtained, the first feature category objects can be screened out automatically from the plurality of category objects based on that data.
In a preferred embodiment of the present application, sub-step S21 may further include the following sub-steps:
Sub-step S211, calculating the median M of the historical time series data of each category object in the past first preset time period;
Specifically, the median is the number in the middle position of a set of data (note that the set must first be sorted in ascending or descending order); that is, half of the data in the set is greater than it and half is less than it. If there are n data values and n is even, the median is the average of the two middle values, namely the (n/2)-th value and the ((n+2)/2)-th value. If n is odd, the median is the ((n+1)/2)-th value.
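Written as a formula, with the sorted values denoted $x_{(1)} \le x_{(2)} \le \dots \le x_{(n)}$ (this notation is introduced here for illustration only), the definition above reads:

```latex
M =
\begin{cases}
  x_{\left(\frac{n+1}{2}\right)}, & n \text{ odd}, \\[6pt]
  \dfrac{x_{\left(\frac{n}{2}\right)} + x_{\left(\frac{n+2}{2}\right)}}{2}, & n \text{ even}.
\end{cases}
```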
In a specific implementation, the time range of the historical time series data of each category object can be defined as the past first preset time period; for example, the past first preset time period can be set to the past year. For each category object, its historical time series data can be sorted in ascending or descending order. That is, the sums of the historical feature data of all time intervals of the category object in the past year are sorted to obtain the median M of the category object. For example, after the total daily sales volumes of a commodity category over the past year are sorted, the total daily sales volume in the middle of the ordering is taken as the median M of that commodity category for the past year.
It should be noted that the median rather than the mean is calculated here because, within a set of data, the mean is easily affected by extreme values while the median is largely unaffected by them, which allows predictions that agree more closely with the actual situation.
Sub-step S212, calculating the number of time intervals in which the sum of the historical feature data is greater than a preset multiple of M;
After the median M is obtained, M can be scaled by a factor of N, for example 1.5 (denoted 1.5M). The sum of the historical feature data of the category object in each time interval is then compared with 1.5M to obtain the number of time intervals in which the sum is greater than 1.5M. For example, the number of days on which the total daily sales volume of a commodity category is greater than 1.5M is calculated.
Sub-step S213, if the number of time intervals in which the sum of the historical feature data is greater than the preset multiple of M falls within a preset range, determining that the category object is a first feature category object.
If M is scaled by 1.5, then when the number of time intervals in which the sum of the historical feature data of the category object is greater than 1.5M falls within a preset range, the category object can be determined to be a first feature category object. For example, with the preset range set to 10-45, if the number of days on which the total daily sales volume of a commodity category is greater than 1.5M falls within this range, the commodity category can be determined to be a perishable commodity category.
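Sub-steps S211 to S213 can be sketched together as follows. This is an illustration under stated assumptions: the multiple 1.5 and the range 10-45 are the example values from the text, the range is treated as inclusive at both ends (the text does not say), and the function name and sample series are hypothetical.

```python
from statistics import median


def is_first_feature_category(series, multiple=1.5, lower=10, upper=45):
    """Apply S211-S213 to the per-interval sums of one category object."""
    m = median(series)                                       # S211: median M
    spike_days = sum(1 for v in series if v > multiple * m)  # S212: count intervals
    return lower <= spike_days <= upper                      # S213: inclusive range assumed


# Hypothetical year of daily totals: a 30-day seasonal surge versus a flat year.
seasonal_category = [100] * 335 + [400] * 30
steady_category = [100] * 365
```

Here the seasonal series has a median of 100 and 30 days above 150, so it qualifies, while the flat series has no such days and does not.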
Sub-step S22, obtaining preset second feature category objects;
In the embodiment of the present application, the preset second feature category objects may be the category objects in a whitelist, which may be selected in advance manually. For example, the perishable commodity categories may be commodity categories selected in advance by operations staff, and the selected commodity categories are added to the whitelist.
Sub-step S23, organizing the first feature category objects and the second feature category objects into the feature category objects.
After the first feature category objects and the second feature category objects are obtained, they can be organized into the feature category objects. The organization may include deduplication: the feature category objects that appear in both the first and the second feature category objects are removed as duplicates, and all remaining feature category objects are output.
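The deduplicating merge in sub-step S23 can be sketched as an order-preserving union. The function name and the category names below are hypothetical.

```python
def merge_feature_categories(first, second):
    """Combine automatically screened and whitelisted categories, dropping repeats."""
    # dict.fromkeys keeps the first occurrence of each name and preserves order.
    return list(dict.fromkeys([*first, *second]))


screened = ["seafood", "fruit"]       # hypothetical first feature category objects
whitelist = ["fruit", "vegetables"]   # hypothetical second feature category objects
feature_categories = merge_feature_categories(screened, whitelist)
```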
In the embodiment of the present application, feature category objects can be screened by both automatic and manual means, so that the screening result better meets user needs, is more complete, and involves a high degree of automation.
Step 103, based on the historical time series data corresponding to the feature category objects, predicting target data objects from the data objects contained in the feature category objects.
After the feature category objects are determined, target data objects can be screened out from the data objects they contain. A target data object is a data object whose future time series data, to be generated in the future first preset time period, satisfies a preset growth trend, that is, a data object whose quantity is about to surge in the near term.
In a specific implementation, to improve the reliability of the prediction results, the future first preset time period may be a near-term period, for example a medium-term or short-term period in the future. As an example, the medium-term period may be one month, in which case the future first preset time period is the month following the current time. The short-term period may be two weeks, one week, or a similarly short time, in which case the future first preset time period is the half month or week following the current time.
The target data object may be a data object whose future time series data, to be generated in the future first preset time period, satisfies a preset growth trend, that is, a data object whose generated quantity shows an anomaly point or a burst point. For example, before the Mid-Autumn Festival the sales of moon cakes grow explosively, so moon cakes can be a target data object.
In the embodiment of the present application, after the feature category objects are determined, target data objects can be further screened out from the data objects they contain. For example, after the perishable commodity categories are determined, the target perishable commodities that will sell fast in the near term (producing a burst point or an anomaly point) can be screened out from the perishable commodities contained in those categories.
In a preferred embodiment of the present application, step 103 may include the following sub-steps:
Sub-step S31, normalizing the feature category objects based on their corresponding historical time series data;
After the feature category objects are determined, they can be normalized in order to eliminate the differences in scale between the data objects in the feature category objects and obtain more accurate prediction results. Normalization is a way of simplifying calculation in which a dimensional expression is transformed into a dimensionless one, that is, into a pure number.
In one embodiment, a feature classification object can be normalized as
follows:
Take the median (intermediate value) M of the historical time series data of
the feature classification object in the past first preset time period, as
obtained in sub-step S211 above; calculate the ratio of each sum of historical
feature data in the historical time series data to the median M, obtaining the
normalized sums of historical feature data; and organize all normalized sums of
historical feature data into the normalized historical time series data of the
feature classification object.
Of course, the embodiment of the present application is not limited to the
above normalization method; those skilled in the art may use other
normalization methods.
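As a non-limiting illustration, the median-based normalization described above
might be sketched as follows. The function name normalize_by_median and the
sample figures are hypothetical and serve only to show the arithmetic; the
zero-median check is an added safeguard that the text does not mention.

```python
from statistics import median

def normalize_by_median(series):
    """Divide each per-interval total by the median M of the series.

    Returns the normalized series together with M so that the result can
    later be denormalized.
    """
    m = median(series)
    if m == 0:
        # Assumption: a zero median cannot be used as a divisor.
        raise ValueError("median is zero; series cannot be normalized this way")
    return [value / m for value in series], m

# Hypothetical daily totals for one feature classification object.
daily_totals = [80, 100, 120, 100, 400]
normalized, m = normalize_by_median(daily_totals)
print(m, normalized)
```

Because every value is expressed as a multiple of M, classification objects
with very different absolute volumes become directly comparable.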
Sub-step S32: cluster the data objects contained in all normalized feature
classification objects to obtain class cluster objects;
In the embodiment of the present application, after the historical time series
data of the feature classification objects has been normalized, all feature
classification objects can be further clustered. In practice, the clustering
may be performed over all data objects contained in all feature classification
objects, so that data objects whose historical time series data follow a
similar trend (for example, data objects with similar burst potential) are
grouped together, yielding one or more class cluster objects.
Specifically, the process of dividing a set of physical or abstract objects
into multiple classes composed of similar objects is called clustering. A
class cluster generated by clustering is a set of objects that are similar to
the other objects in the same cluster and different from the objects in other
clusters. In a specific implementation, various clustering methods may be
used, such as hierarchical clustering, partition clustering, density-based
clustering, grid-based clustering, model-based clustering, and so on; the
embodiment of the present application is not limited to a specific clustering
method.
For example, if the obtained feature classification objects are the fruit
classification, the seafood classification, the cooked food classification,
and so on, these classification objects can each be normalized, and the
commodities contained in the normalized classification objects can be
clustered so that commodities with similar burst potential are grouped
together, yielding one or more class clusters. For example, steamed crab is at
its richest and most flavorful around the Mid-Autumn Festival, so it reaches a
sales peak in the same period as moon cakes; the trends of their historical
time series data are similar, and steamed crab and moon cakes can therefore be
placed in the same class cluster.
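As a non-limiting illustration of grouping data objects with similar trends,
the sketch below uses a simple greedy grouping by Pearson correlation. This
particular technique, the threshold min_corr, the function names, and the
sample figures are all assumptions chosen for brevity; the text leaves the
clustering method open, and any of the methods listed above could be
substituted.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two equal-length series (0.0 if one is flat)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)

def cluster_by_trend(series_by_item, min_corr=0.9):
    """Greedy grouping: an item joins the first cluster whose first member
    correlates with it at or above min_corr, otherwise it starts a new one."""
    clusters = []
    for name, series in series_by_item.items():
        for cluster in clusters:
            representative = series_by_item[cluster[0]]
            if pearson(series, representative) >= min_corr:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters

# Hypothetical normalized series for three commodities.
normalized = {
    "moon cake":    [1.0, 1.1, 1.0, 3.0, 5.0],
    "steamed crab": [0.9, 1.0, 1.1, 2.8, 4.6],
    "octopus":      [1.0, 1.1, 0.9, 1.0, 1.0],
}
clusters = cluster_by_trend(normalized)
print(clusters)
```

With these sample figures, moon cake and steamed crab fall into one class
cluster and octopus into another, mirroring the example in the text.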
Sub-step S33: predict target class cluster objects from the class cluster
objects;
After the class cluster objects are obtained, the class cluster objects that
will burst in the near term (within the future first preset time period) can
be screened out as target class cluster objects. For example, the class
cluster objects that will sell explosively are screened out from multiple
class cluster objects as target class cluster objects.
In a preferred embodiment of the present application, sub-step S33 may further
include the following sub-steps:
Sub-step S331: based on the historical time series data of the data objects in
the class cluster object within the past one month, calculate the first
average historical time series data of the class cluster object;
In a specific implementation, the average of the historical time series data
of all data objects under the class cluster can be calculated from the
normalized historical time series data of each data object in the class
cluster object within the past one month (the most recent month). That is,
taking the time interval as the unit (for example, one day), the sum of the
normalized historical feature data of all data objects under the class cluster
in a time interval is divided by the number of data objects in that time
interval to obtain the average for that interval; the averages of all time
intervals constitute the first average historical time series data of the
class cluster.
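As a non-limiting illustration, the per-interval averaging described above
might be sketched as follows. The function name cluster_average_series is
hypothetical, and the sketch assumes every data object has a value for every
time interval.

```python
def cluster_average_series(member_series):
    """Per-interval mean over the normalized series of all data objects
    in one class cluster (all series must have the same length)."""
    n_members = len(member_series)
    # zip(*...) walks the series interval by interval.
    return [sum(values) / n_members for values in zip(*member_series)]

# Two hypothetical data objects observed over three time intervals.
avg = cluster_average_series([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]])
print(avg)
```

The same helper applies unchanged to the second and third average historical
time series data, which differ only in the period the input series cover.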
Sub-step S332: based on the historical time series data of the data objects in
the class cluster object for the thirteenth month in the past, calculate the
second average historical time series data of the class cluster object;
In a specific implementation, the average of the historical time series data
of all data objects under the class cluster can be calculated from the
normalized historical time series data of each data object in the class
cluster object for the thirteenth month in the past (the dates of last year
corresponding to the most recent month). That is, taking the time interval as
the unit (for example, one day), the sum of the normalized historical feature
data of all data objects under the class cluster in a time interval is divided
by the number of data objects in that time interval to obtain the average for
that interval; the averages of all time intervals constitute the second
average historical time series data of the class cluster.
Sub-step S333: based on the historical time series data of the data objects in
the class cluster object for the twelfth month in the past, calculate the
third average historical time series data of the class cluster object;
Using the same method as in sub-step S332 above, the third average historical
time series data of the class cluster object is calculated, that is, the
average normalized data for last year's dates corresponding to the current
date.
Sub-step S334: according to the first average historical time series data, the
second average historical time series data and the third average historical
time series data, estimate the future average time series data of the class
cluster object in the future first preset time period;
In a specific implementation, after the first average historical time series
data is obtained, its first average value can be further calculated (the sum
of the class cluster's averages over all time intervals divided by the number
of time intervals); likewise, after the second average historical time series
data is obtained, its second average value can be further calculated (the sum
of the class cluster's averages over all time intervals divided by the number
of time intervals).
Then the ratio of the first average value to the second average value is
calculated to obtain a ratio A.
Then each value of the third average historical time series data is multiplied
by the ratio A to obtain the future average time series data of the class
cluster object in the future first preset time period.
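As a non-limiting illustration, the estimation described above (scaling last
year's corresponding period by the ratio of the most recent month to the same
month last year) might be sketched as follows. The function names and sample
figures are hypothetical.

```python
def mean(values):
    """Sum of the per-interval averages divided by the number of intervals."""
    return sum(values) / len(values)

def estimate_future_series(first_avg, second_avg, third_avg):
    """Scale the third average series by ratio A = mean(first) / mean(second).

    first_avg:  most recent month
    second_avg: thirteenth month in the past
    third_avg:  twelfth month in the past
    """
    ratio_a = mean(first_avg) / mean(second_avg)
    return [value * ratio_a for value in third_avg], ratio_a

# Hypothetical normalized averages; the recent month runs at twice last
# year's level, so last year's next-period curve is doubled.
future_avg, ratio_a = estimate_future_series(
    first_avg=[2.0, 2.0], second_avg=[1.0, 1.0], third_avg=[1.5, 3.0]
)
print(ratio_a, future_avg)
```

The ratio A captures year-over-year growth, while the third series supplies
the seasonal shape of the upcoming period.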
It should be noted that the future first preset time period may be a period
defined on a lunar-calendar basis. If a major Gregorian-calendar holiday (such
as National Day or New Year's Day) falls within a time interval of the first
preset time period, a corresponding holiday calendar-day correction is made:
within that holiday, the lunar-calendar basis is replaced by the corresponding
Gregorian-calendar basis, while other, minor Gregorian-calendar holidays are
left unchanged.
Sub-step S335: calculate the difference between the future average time series
data and the first average historical time series data to obtain the index
data of the class cluster object;
After the future average time series data in the future first preset time
period is obtained, a first sum of the future average time series data can be
further calculated (the sum of the class cluster's averages over all time
intervals), as well as a second sum of the first average historical time
series data.
Then the difference between the first sum and the second sum is calculated to
obtain the index data of the class cluster object.
Sub-step S336: take the class cluster objects whose index data is greater than
a preset threshold as target class cluster objects.
After the index data of the class cluster objects is obtained, the class
cluster objects with larger index data can be screened out as target class
cluster objects; in one embodiment, the class cluster objects whose index data
is greater than a preset threshold are screened out as target class cluster
objects.
For example, the index data obtained for two class clusters is as follows (M
is the median of the historical series data before normalization):
Steamed crab + moon cake (first class cluster): 1.1M
Octopus (second class cluster): -0.01M
After sorting, it is easy to judge that within the next two weeks the sales
volume of the first class cluster, i.e., steamed crab and moon cakes, will
burst, while octopus will remain stable.
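As a non-limiting illustration, the index data and the threshold screening
described above might be sketched as follows. The function names and the
threshold of 0.5 are hypothetical; the index values are those of the example,
expressed in units of M.

```python
def burst_index(future_avg, first_avg):
    """Index data: sum of the future average series minus the sum of the
    first (most recent month) average series."""
    return sum(future_avg) - sum(first_avg)

def select_target_clusters(index_by_cluster, threshold):
    """Keep clusters whose index data exceeds the threshold, largest first."""
    ranked = sorted(index_by_cluster.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, index in ranked if index > threshold]

# Index data from the example above, in units of M.
index_by_cluster = {
    "octopus": -0.01,
    "steamed crab + moon cake": 1.1,
}
targets = select_target_clusters(index_by_cluster, threshold=0.5)
print(targets)
```

A positive index far above the threshold signals an expected burst, while an
index near zero indicates that the cluster is expected to remain stable.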
In the embodiment of the present application, the likelihood of a short-term
or mid-term burst of a class cluster can be judged from its burst index data.
Sub-step S34: take the data objects contained in the target class cluster
object as target data objects.
After the target class cluster object is determined, the data objects
contained in it can be taken as the target data objects.
In the embodiment of the present application, feature classification objects
with time-sensitive and seasonal characteristics can be screened out from
multiple classification objects, and, based on the historical time series data
of these feature classification objects, the target data objects that will
burst in the near term can be predicted from the data objects they contain.
Because the embodiment predicts the target data objects with near-term burst
potential according to the behavior of time series data, the prediction
results agree more closely with reality and the accuracy is higher.
Referring to FIG. 3, a flowchart of the steps of embodiment two of a data
prediction method based on time series of the present application is shown,
which may include the following steps:
Step 301: obtain the historical time series data of multiple classification
objects;
In the embodiment of the present application, a classification object may
include one or more data objects.
In a preferred embodiment of the present application, step 301 may include the
following sub-steps:
Sub-step S41: for multiple preset time intervals, calculate the quantity of
the specified feature data corresponding to the data object that is stored in
a preset database in each time interval, as the historical feature data of the
data object in that time interval;
Sub-step S42: organize the historical feature data of the data object over all
time intervals to obtain the historical time series data of the data object;
Sub-step S43: for each time interval, calculate the sum of the historical
feature data, in that time interval, of the data objects contained in each
classification object;
Sub-step S44: organize the sums of historical feature data of all time
intervals into the historical time series data of the classification object.
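As a non-limiting illustration, sub-steps S41 to S44 might be sketched as
follows. The record layout (one tuple of item, classification and time
interval per stored feature record, for example one order), the function name
build_series, and the sample data are all assumptions made for the sketch.

```python
from collections import defaultdict

def build_series(records, intervals):
    """Build per-item and per-classification historical time series.

    records:   iterable of (item, classification, interval) tuples, one per
               stored feature record (hypothetical layout)
    intervals: ordered list of the preset time intervals
    """
    # S41: count the feature records of each item in each interval.
    item_counts = defaultdict(lambda: defaultdict(int))
    item_classification = {}
    for item, classification, interval in records:
        item_counts[item][interval] += 1
        item_classification[item] = classification

    # S42: organize the counts into one ordered series per item.
    item_series = {
        item: [counts.get(t, 0) for t in intervals]
        for item, counts in item_counts.items()
    }

    # S43 and S44: sum the item series within each classification.
    classification_series = {}
    for item, series in item_series.items():
        totals = classification_series.setdefault(
            item_classification[item], [0] * len(intervals)
        )
        for i, value in enumerate(series):
            totals[i] += value
    return item_series, classification_series

records = [
    ("apple", "fruit", "d1"), ("apple", "fruit", "d1"),
    ("pear", "fruit", "d2"), ("crab", "seafood", "d2"),
]
item_series, classification_series = build_series(records, ["d1", "d2"])
print(item_series, classification_series)
```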
Step 302: screen out feature classification objects from the multiple
classification objects;
In the embodiment of the present application, after the historical time series
data of each classification object is obtained, feature classification objects
can be further screened out from the multiple classification objects. A
feature classification object may be a classification object that contains
feature data objects, and a feature data object may be a data object whose
life cycle is less than a preset time threshold, i.e., a time-sensitive data
object.
In a preferred embodiment of the present application, step 302 may include the
following sub-steps:
Sub-step S51: based on the historical time series data of the classification
objects, screen out first feature classification objects from the multiple
classification objects;
In a preferred embodiment of the present application, sub-step S51 may further
include the following sub-steps:
Sub-step S511: calculate the median M of the historical time series data of
each classification object in the past first preset time period;
Sub-step S512: count the time intervals in which the sum of historical feature
data exceeds a preset multiple of M;
Sub-step S513: if the number of time intervals in which the sum of historical
feature data exceeds the preset multiple of M is within a preset range,
determine that the classification object is a first feature classification
object.
Sub-step S52: obtain preset second feature classification objects;
Sub-step S53: organize the first feature classification objects and the second
feature classification objects into the feature classification objects.
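As a non-limiting illustration, the screening of sub-steps S511 to S513 might
be sketched as follows. The function name, the multiple of 3, the range of 1
to 3 spike intervals, and the sample series are hypothetical values chosen
only to show the logic.

```python
from statistics import median

def is_first_feature_classification(series, multiple, min_count, max_count):
    """S511-S513: count intervals whose total exceeds multiple * M and check
    that the count lies within the preset range [min_count, max_count]."""
    m = median(series)                                         # S511
    spikes = sum(1 for total in series if total > multiple * m)  # S512
    return min_count <= spikes <= max_count                    # S513

# Hypothetical per-interval totals for two classification objects.
seasonal = [10, 12, 11, 50, 9, 60, 10]   # two pronounced peaks
steady = [10, 11, 10, 12, 11, 10, 11]    # no peaks

seasonal_flag = is_first_feature_classification(seasonal, 3, 1, 3)
steady_flag = is_first_feature_classification(steady, 3, 1, 3)
print(seasonal_flag, steady_flag)
```

Bounding the count from both sides keeps classifications with a few distinct
peaks while excluding both flat series and series that are erratic throughout.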
Step 303: based on the historical time series data corresponding to the
feature classification objects, predict target data objects from the data
objects contained in the feature classification objects;
After the feature classification objects are determined, target data objects
can be screened out from the data objects they contain, wherein a target data
object may be a data object whose future time series data, to be produced in
the future first preset time period, meets a preset growth trend.
In a preferred embodiment of the present application, step 303 may include the
following sub-steps:
Sub-step S61: based on the historical time series data corresponding to the
feature classification object, normalize the feature classification object;
Sub-step S62: cluster the data objects contained in all normalized feature
classification objects to obtain class cluster objects;
Sub-step S63: predict target class cluster objects from the class cluster
objects;
In a preferred embodiment of the present application, sub-step S63 may further
include the following sub-steps:
Sub-step S631: based on the historical time series data of the data objects in
the class cluster object within the past one month, calculate the first
average historical time series data of the class cluster object;
Sub-step S632: based on the historical time series data of the data objects in
the class cluster object for the thirteenth month in the past, calculate the
second average historical time series data of the class cluster object;
Sub-step S633: based on the historical time series data of the data objects in
the class cluster object for the twelfth month in the past, calculate the
third average historical time series data of the class cluster object;
Sub-step S634: according to the first average historical time series data, the
second average historical time series data and the third average historical
time series data, estimate the future average time series data of the class
cluster object in the future first preset time period;
Sub-step S635: calculate the difference between the future average time series
data and the first average historical time series data to obtain the index
data of the class cluster object;
Sub-step S636: take the class cluster objects whose index data is greater than
a preset threshold as target class cluster objects.
Sub-step S64: take the data objects contained in the target class cluster
object as target data objects.
Step 304: predict the future time series data of the target data objects in
the future first preset time period.
In a preferred embodiment of the present application, step 304 may include the
following sub-steps:
Sub-step S71: denormalize the future average time series data of the class
cluster object in the future first preset time period to obtain the baseline
average time series data of each data object in the class cluster object;
Because the future average time series data of the class cluster object
estimated in sub-step S634 is a normalized value, it can first be
denormalized: the future average time series data is multiplied by the median
M, giving the baseline average time series data of each data object in the
class cluster object.
Sub-step S72: correct the baseline average time series data of each data
object to obtain the future time series data of the corresponding data object
in the future first preset time period.
After the baseline average time series data of each data object is obtained,
it can be corrected to obtain the future time series data of the data object
in the future first preset time period. In one embodiment, the correction may
include a compensating correction that amplifies or reduces the data according
to preset reference parameters.
The preset reference parameters may be compensation parameters taken from
other databases. For example, on an e-commerce platform, in order to offset
the effect of changes in the number of merchants on the platform, the preset
reference parameters may be data from a merchant database, which records each
merchant on the platform and its main features, including the merchant's basic
attributes, transaction attributes and credit attributes. The current number
of merchants can be compared with the number of merchants in the corresponding
period of last year, and the baseline average time series data can be
amplified (or reduced) accordingly to obtain the future time series data of
the commodity classification.
For example, if, compared with the same period last year, the number of
merchants stored in the merchant database has increased from 100 to 1000,
i.e., by a factor of 10, and the sales volume has increased by a factor of 20,
the baseline average time series data can be amplified by a factor of two to
obtain the future time series data.
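As a non-limiting illustration, sub-steps S71 and S72 might be sketched as
follows. The function names and sample figures are hypothetical. The
correction factor is passed in directly (2.0, as in the merchant example
above) because the text does not fix a single formula for deriving it from
the reference parameters.

```python
def denormalize(future_avg, m):
    """S71: multiply the normalized future average series by the median M."""
    return [value * m for value in future_avg]

def apply_correction(baseline, factor):
    """S72: amplify (factor > 1) or reduce (factor < 1) the baseline series
    according to a compensation factor derived from reference parameters."""
    return [value * factor for value in baseline]

# Hypothetical normalized forecast and median.
baseline = denormalize([1.0, 1.5, 2.0], m=100)
future = apply_correction(baseline, factor=2.0)
print(baseline, future)
```

Keeping the two steps separate lets the same baseline be reused with
different compensation parameters.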
As a preferred example of the embodiment of the present application, if the
embodiment is applied to an e-commerce platform, the data object may be
commodity data, the classification object may be a commodity classification,
the feature classification object may be a perishable commodity
classification, the life cycle may be the shelf life of the commodity, and the
time series data may be the daily sales volume of the commodity.
In the embodiment of the present application, feature classification objects
with time-sensitive and seasonal characteristics can be screened out from
multiple classification objects; based on the historical time series data of
these feature classification objects, the target data objects that will burst
in the near term can be predicted from the data objects they contain, and the
near-term future time series data of the target data objects can be predicted.
Because the embodiment predicts both the target data objects with near-term
burst potential and their future time series data according to the behavior of
time series data, the prediction results agree more closely with reality and
the accuracy is higher.
Since the method embodiment of FIG. 3 is substantially similar to the method
embodiment of FIG. 1, it is described relatively briefly; for related details,
refer to the corresponding parts of the description of the method embodiment
of FIG. 1.
It should be noted that, for simplicity of description, the method embodiments
are expressed as a series of action combinations, but those skilled in the art
should know that the embodiment of the present application is not limited by
the described order of actions, because according to the embodiment of the
present application some steps may be performed in other orders or
simultaneously. Furthermore, those skilled in the art should also know that
the embodiments described in this specification are all preferred embodiments,
and the actions involved are not necessarily required by the embodiment of the
present application.
Referring to FIG. 4, a structural block diagram of an embodiment of a data
prediction device based on time series of the present application is shown,
which may specifically include the following modules:
A historical time series data acquisition module 401, configured to obtain the
historical time series data of multiple classification objects, wherein each
classification object includes one or more data objects;
A feature classification object screening module 402, configured to screen out
feature classification objects from the multiple classification objects,
wherein a feature classification object is a classification object containing
feature data objects, and a feature data object is a data object whose life
cycle is less than a preset time threshold;
A target data object prediction module 403, configured to predict, based on
the historical time series data corresponding to the feature classification
objects, target data objects from the data objects contained in the feature
classification objects, wherein a target data object is a data object whose
future time series data, to be produced in the future first preset time
period, meets a preset growth trend.
In a preferred embodiment of the present application, the device may further
include:
A future time series data prediction module, configured to predict the future
time series data of the target data objects in the future first preset time
period.
In a preferred embodiment of the present application, the historical time
series data acquisition module 401 includes:
A historical feature data calculation submodule, configured to calculate, for
multiple preset time intervals, the quantity of the specified feature data
corresponding to the data object that is stored in a preset database in each
time interval, as the historical feature data of the data object in that time
interval;
A historical feature data organization submodule, configured to organize the
historical feature data of the data object over all time intervals to obtain
the historical time series data of the data object;
A historical feature data statistics submodule, configured to calculate, for
each time interval, the sum of the historical feature data, in that time
interval, of the data objects contained in each classification object;
A historical time series data organization submodule, configured to organize
the sums of historical feature data of all time intervals into the historical
time series data of the classification object.
In a preferred embodiment of the present application, the feature
classification object screening module 402 includes:
A first feature classification object screening submodule, configured to
screen out first feature classification objects from the multiple
classification objects based on the historical time series data of the
classification objects;
A second feature classification object acquisition submodule, configured to
obtain preset second feature classification objects;
An organization submodule, configured to organize the first feature
classification objects and the second feature classification objects into the
feature classification objects.
In a preferred embodiment of the present application, the first feature
classification object screening submodule is further configured to:
Calculate the median M of the historical time series data of each
classification object in the past first preset time period;
Count the time intervals in which the sum of historical feature data exceeds a
preset multiple of M;
If the number of time intervals in which the sum of historical feature data
exceeds the preset multiple of M is within a preset range, determine that the
classification object is a first feature classification object.
In a preferred embodiment of the present application, the target data object
prediction module 403 includes:
A normalization submodule, configured to normalize the feature classification
object based on the historical time series data corresponding to the feature
classification object;
A clustering submodule, configured to cluster the data objects contained in
all normalized feature classification objects to obtain class cluster objects;
A prediction submodule, configured to predict target class cluster objects
from the class cluster objects;
A target data object acquisition submodule, configured to take the data
objects contained in the target class cluster object as target data objects.
In a preferred embodiment of the present application, the prediction submodule
is further configured to:
Calculate the first average historical time series data of the class cluster
object based on the historical time series data of the data objects in the
class cluster object within the past one month;
Calculate the second average historical time series data of the class cluster
object based on the historical time series data of the data objects in the
class cluster object for the thirteenth month in the past;
Calculate the third average historical time series data of the class cluster
object based on the historical time series data of the data objects in the
class cluster object for the twelfth month in the past;
Estimate, according to the first average historical time series data, the
second average historical time series data and the third average historical
time series data, the future average time series data of the class cluster
object in the future first preset time period;
Calculate the difference between the future average time series data and the
first average historical time series data to obtain the index data of the
class cluster object;
Take the class cluster objects whose index data is greater than a preset
threshold as target class cluster objects.
In a preferred embodiment of the present application, the future time series
data prediction module includes:
A baseline data acquisition submodule, configured to denormalize the future
average time series data of the class cluster object in the future first
preset time period to obtain the baseline average time series data of each
data object in the class cluster object;
A correction submodule, configured to correct the baseline average time series
data of each data object to obtain the future time series data of the
corresponding data object in the future first preset time period.
In a preferred embodiment of the present application, the data object is
commodity data, the classification object is a commodity classification, the
feature classification object is a perishable commodity classification, the
life cycle is the shelf life of the commodity, and the time series data is the
daily sales volume of the commodity.
Since the device embodiment is substantially similar to the method embodiment,
it is described relatively briefly; for related details, refer to the
corresponding parts of the description of the method embodiment.
The embodiments in this specification are described in a progressive manner;
each embodiment focuses on its differences from the other embodiments, and for
the same or similar parts the embodiments may be referred to one another.
Those skilled in the art should understand that the embodiments of the present
application may be provided as a method, a device, or a computer program
product. Therefore, the embodiments of the present application may take the
form of an entirely hardware embodiment, an entirely software embodiment, or
an embodiment combining software and hardware aspects. Moreover, the
embodiments of the present application may take the form of a computer program
product implemented on one or more computer-usable storage media (including
but not limited to magnetic disk storage, CD-ROM, optical storage, etc.)
containing computer-usable program code.
In a typical configuration, the computer device includes one or more
processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in computer-readable media, in forms
such as random access memory (RAM), and/or non-volatile memory, such as
read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a
computer-readable medium. Computer-readable media include permanent and
non-permanent, removable and non-removable media, which can store information
by any method or technology. The information may be computer-readable
instructions, data structures, program modules, or other data. Examples of
computer storage media include, but are not limited to, phase-change memory
(PRAM), static random access memory (SRAM), dynamic random access memory
(DRAM), other types of random access memory (RAM), read-only memory (ROM),
electrically erasable programmable read-only memory (EEPROM), flash memory or
other memory technologies, compact disc read-only memory (CD-ROM), digital
versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic
tape or disk storage or other magnetic storage devices, or any other
non-transmission medium that can be used to store information accessible by a
computing device. As defined herein, computer-readable media do not include
transitory computer-readable media (transitory media), such as modulated data
signals and carrier waves.
The embodiments of the present application are described with reference to
flowcharts and/or block diagrams of the method, the terminal device (system),
and the computer program product according to the embodiments of the present
application. It should be understood that each flow and/or block in the
flowcharts and/or block diagrams, and combinations of flows and/or blocks in
the flowcharts and/or block diagrams, can be implemented by computer program
instructions. These computer program instructions may be provided to the
processor of a general-purpose computer, a special-purpose computer, an
embedded processor, or other programmable data processing terminal device to
produce a machine, such that the instructions executed by the processor of the
computer or other programmable data processing terminal device produce means
for implementing the functions specified in one or more flows of the
flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable
memory capable of directing a computer or other programmable data processing
terminal device to operate in a specific manner, such that the instructions
stored in the computer-readable memory produce an article of manufacture
including instruction means, the instruction means implementing the functions
specified in one or more flows of the flowcharts and/or one or more blocks of
the block diagrams.
These computer program instructions may also be loaded onto a computer or other
programmable data processing terminal device, such that a series of operational steps is
performed on the computer or other programmable terminal device to produce
computer-implemented processing, whereby the instructions executed on the computer or
other programmable terminal device provide steps for implementing the functions specified
in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present application have been described, those
skilled in the art can make additional changes and modifications to these embodiments once
they learn of the basic inventive concept. Therefore, the appended claims are intended to
be construed as covering the preferred embodiments and all changes and modifications that
fall within the scope of the embodiments of the present application.
Finally, it should also be noted that, herein, relational terms such as "first" and
"second" are used merely to distinguish one entity or operation from another entity or
operation, and do not necessarily require or imply any actual relationship or order
between these entities or operations. Moreover, the terms "comprise", "include", or any
other variant thereof are intended to cover a non-exclusive inclusion, such that a
process, method, article, or terminal device that includes a series of elements includes
not only those elements but also other elements that are not expressly listed, or also
includes elements inherent to such a process, method, article, or terminal device. Absent
further limitation, an element defined by the phrase "including a ..." does not exclude
the presence of additional identical elements in the process, method, article, or terminal
device that includes the element.
The data prediction method based on time series and the data prediction device based on
time series provided by the present application have been described in detail above.
Specific examples are used herein to explain the principles and implementations of the
present application, and the description of the above embodiments is intended only to
help in understanding the method of the present application and its core idea. At the
same time, those of ordinary skill in the art may, in accordance with the idea of the
present application, make changes to the specific implementations and the scope of
application. In summary, the content of this specification should not be construed as
limiting the present application.