CN107491830B - Method and device for processing time series curve - Google Patents


Info

Publication number
CN107491830B
Authority
CN
China
Prior art keywords
time
amplitude
value
amplitude difference
point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710534624.5A
Other languages
Chinese (zh)
Other versions
CN107491830A (en
Inventor
刘少华
吴健君
刘国辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd
Priority to CN201710534624.5A
Publication of CN107491830A
Application granted
Publication of CN107491830B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0202 Market predictions or forecasting for commercial activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0277 Online advertisement

Abstract

The invention provides a method and a device for processing a time series curve, wherein the method comprises the following steps: determining an amplitude difference value sequence of a time series curve, the amplitude difference value sequence comprising a plurality of amplitude difference values; filtering the amplitude difference values in the amplitude difference value sequence according to a preset condition; determining an amplitude peak value and/or an amplitude valley value of the time series curve according to the plurality of filtered amplitude difference values; determining the time point of the amplitude peak value and/or the amplitude valley value, and recording it as a target time point; and truncating the time series curve in different truncation modes according to the total number of target time points and the predicted time point closest to the present day within a future preset time interval, wherein the truncated time series curve is used for predicting data at each time point within the future preset time interval. The invention can improve the data prediction effect.

Description

Method and device for processing time series curve
Technical Field
The invention relates to the technical field of data prediction, in particular to a method and a device for processing a time series curve.
Background
Curve fitting is a common method in time series analysis. Conventional curve fitting methods suit relatively stable curves and ignore steeply changing intervals that may exist in a curve. For example, data acquisition problems or sudden events (such as a popular video going online or a TV drama ending) can leave a large oscillation interval in the historical trend of a time series curve.
Therefore, when faced with a curve containing a large oscillation interval, the conventional curve fitting methods in the prior art simply remove a few abnormal values from the curve and then predict data for future time points from the remaining curve. This produces a large prediction deviation, a poor fitting effect, and reduced prediction accuracy.
Disclosure of Invention
The invention provides a method and a device for processing a time series curve, which aim to solve the problem of poor data prediction accuracy of the time series curve in the prior art.
In order to solve the above problem, according to an aspect of the present invention, the present invention discloses a method for processing a time series curve, including:
determining a sequence of amplitude difference values for a time series curve, the sequence of amplitude difference values comprising a plurality of amplitude difference values;
filtering the amplitude difference values in the amplitude difference value sequence according to a preset condition;
determining an amplitude peak value and/or an amplitude valley value of the time series curve according to the plurality of filtered amplitude difference values;
determining the time point of the amplitude peak value and/or the amplitude valley value, and recording as a target time point;
and truncating the time series curve in different truncation modes according to the total number of target time points and the predicted time point closest to the present day within a future preset time interval, wherein the truncated time series curve is used for predicting data at each time point within the future preset time interval.
Optionally, the determining the amplitude difference value sequence of the time series curve includes:
sequentially traversing each point in the time series curve, and determining at least two extreme points in the time series curve;
and, for every two extreme points adjacent on the time coordinate, sequentially subtracting the amplitude of the earlier extreme point from the amplitude of the later extreme point to obtain an amplitude difference value sequence consisting of a plurality of amplitude difference values.
Optionally, the determining the peak amplitude value and/or the valley amplitude value of the time series curve according to the filtered plurality of amplitude difference values includes:
and in the plurality of amplitude difference values after filtering, performing superposition calculation on at least two amplitude difference values which have the same sign and are adjacent in time to obtain an amplitude peak value and/or an amplitude valley value.
Optionally, the determining a time point of the amplitude peak and/or the amplitude valley, which is denoted as a target time point, includes:
for each of the filtered amplitude difference values, determining the time point of the later of its two corresponding extreme points as the time point of that amplitude difference value;
recording, as the time point of the amplitude peak value and/or the amplitude valley value, the time point of the target amplitude difference value that is latest in time among the at least two superposed amplitude difference values.
Optionally, the step of truncating the time series curve in different truncation modes according to the total number of target time points and the predicted time point closest to the present day within the future preset time interval includes:
if the total number of target time points is an even number, calculating the time interval between the predicted time point and the latest of the recorded target time points;
if the time interval is greater than or equal to a preset time threshold, truncating the portion of the time series curve located at and before the latest of the recorded target time points;
and if the time interval is smaller than the preset time threshold, performing abnormal value processing on the time series curve.
Optionally, the step of truncating the time series curve in different truncation modes according to the total number of target time points and the predicted time point closest to the present day within the future preset time interval further includes:
if the total number of target time points is an odd number, truncating the portion of the time series curve located at and before the second-latest of the recorded target time points;
and performing abnormal value processing on the truncated time series curve.
According to another aspect of the present invention, the present invention also discloses a processing apparatus for time series curves, comprising:
a first determining module for determining an amplitude difference value sequence of a time series curve, the amplitude difference value sequence comprising a plurality of amplitude difference values;
the filtering module is used for filtering the amplitude difference value in the amplitude difference value sequence according to a preset condition;
a second determining module, configured to determine, according to the filtered multiple amplitude difference values, an amplitude peak value and/or an amplitude valley value of the time series curve;
the third determining module is used for determining the time point of the amplitude peak value and/or the amplitude valley value and recording as the target time point;
and the truncation module is used for truncating the time series curve in different truncation modes according to the total number of target time points and the predicted time point closest to the present day within a future preset time interval, wherein the truncated time series curve is used for predicting data at each time point within the future preset time interval.
Optionally, the first determining module includes:
the first determining submodule is used for sequentially traversing each point in the time series curve and determining at least two extreme points in the time series curve;
and the operation submodule is used for sequentially subtracting, for every two extreme points adjacent on the time coordinate, the amplitude of the earlier extreme point from the amplitude of the later extreme point to obtain an amplitude difference value sequence consisting of a plurality of amplitude difference values.
Optionally, the second determining module includes:
and the superposition calculation submodule is used for carrying out superposition calculation on at least two amplitude difference values which have the same sign and are adjacent in time in the plurality of amplitude difference values after filtering to obtain an amplitude peak value and/or an amplitude valley value.
Optionally, the third determining module includes:
the second determining submodule is used for determining, for each of the filtered amplitude difference values, the time point of the later of its two corresponding extreme points as the time point of that amplitude difference value;
a recording submodule, configured to record, as the time point of the amplitude peak value and/or the amplitude valley value, the time point of the target amplitude difference value that is latest in time among the at least two superposed amplitude difference values.
Optionally, the truncation module comprises:
the calculation submodule is used for calculating the time interval between the predicted time point and the latest of the recorded target time points if the total number of target time points is an even number;
the first truncation submodule is used for truncating the portion of the time series curve located at and before the latest of the recorded target time points if the time interval is greater than or equal to a preset time threshold;
and the first exception handling submodule is used for performing abnormal value processing on the time series curve if the time interval is smaller than the preset time threshold.
Optionally, the truncation module further comprises:
a second truncation submodule, configured to truncate the portion of the time series curve located at and before the second-latest of the recorded target time points if the total number of target time points is an odd number;
and the second exception handling submodule is used for performing abnormal value processing on the truncated time series curve.
Compared with the prior art, the invention has the following advantages:
according to the invention, the target time points at which a severe amplitude ends in the time series curve are recorded, and the abnormal interval of the time series curve is then truncated in different truncation modes according to the total number of these target time points and the predicted time point closest to the present day within the future preset time interval. When the truncated time series curve is used to predict each time point within the future preset time interval, this helps ensure the accuracy of the predicted data and improves the data prediction effect.
Drawings
FIG. 1 is a flow chart of the steps of an embodiment of a method for processing a time series curve according to the present invention;
FIG. 2 is a flow chart of the steps of another embodiment of a method for processing a time series curve according to the present invention;
FIG. 3 is a block diagram of an embodiment of an apparatus for processing a time series curve according to the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
Referring to FIG. 1, a flow chart of the steps of an embodiment of a method for processing a time series curve according to the present invention is shown. Before the steps of the embodiment are explained, the following description is given:
in a time series curve, the horizontal axis is time, and the vertical axis is defined by the type of data to be predicted in the actual application scenario (for example, advertisement inventory). Plotting the historical values of that data (for example, the advertisement inventory for the 300 days before today) forms the time series curve of that type of data over the historical time, where each point on the curve represents the data value, such as the advertisement inventory value, at the corresponding time point. Here, the "advertisement inventory" is the advertisement playing amount of a certain video website at a certain time point (for example, one day).
In order to use the time series curve of the historical time to predict the ordinate data for today or for a certain period in the future, while ensuring the accuracy of the prediction, the embodiment of the invention identifies the end point of the abnormal interval in the time series curve and truncates the curve adaptively, so that the time series curve with the abnormal interval truncated is used for data prediction. The following describes the identification of the end point of the abnormal interval through steps 101 to 104:
Step 101, determining an amplitude difference value sequence of a time series curve, wherein the amplitude difference value sequence comprises a plurality of amplitude difference values;
wherein the amplitude difference is the difference between two amplitudes in the time series curve.
Step 102, filtering the amplitude difference value in the amplitude difference value sequence according to a preset condition;
for example, the preset condition may specify that amplitude difference values smaller than a threshold are filtered out; of course, the preset condition may be any condition capable of filtering out the amplitude difference values that affect data prediction.
Step 103, determining an amplitude peak value and/or an amplitude valley value of the time series curve according to the plurality of filtered amplitude difference values;
wherein, the amplitude peak value and the amplitude valley value are selected from the superposition result of the amplitude difference values.
Step 104, determining time points of amplitude peak values and/or amplitude valley values, and recording the time points as target time points;
the target time point is a time point corresponding to the amplitude peak value and the amplitude valley value in the time series curve.
The time points of the amplitude peak value and the amplitude valley value reflect the time points at which a severe amplitude ends, and these time points (i.e., the target time points) can be recorded.
Step 105, truncating the time series curve in different truncation modes according to the total number of target time points and the predicted time point closest to the present day within a future preset time interval, wherein the truncated time series curve is used for predicting data at each time point within the future preset time interval.
After the time points at which a severe amplitude ends in the time series curve are found, the time series curve can be truncated in different truncation modes according to the total number of such time points in the curve and the time point closest to today within the time to be predicted (for example, today or a certain period in the future), i.e., the predicted time point. The data (i.e., the ordinate value) at the predicted time point is then predicted using the truncated time series curve.
By means of the technical scheme of the embodiment of the invention, the target time points at which a severe amplitude ends in the time series curve are determined, and the abnormal interval of the time series curve is then truncated in different truncation modes according to the total number of these time points and the predicted time point closest to the present day within the future preset time interval. When the truncated time series curve is used to predict data for each time point within the future preset time interval, this helps ensure the accuracy of the predicted data and improves the data prediction effect.
Referring to fig. 2, a flowchart of steps of another embodiment of a method for processing a time series curve according to the present invention is shown, and specifically includes the following steps:
Step 201, sequentially traversing each point in a time series curve, and determining at least two extreme points in the time series curve;
wherein each point in the time series curve is traversed to determine the extreme points in the curve, the curve having at least two extreme points. In this way, all extreme points in the curve can be found.
Step 202, for every two extreme points adjacent on the time coordinate, sequentially subtracting the amplitude of the earlier extreme point from the amplitude of the later extreme point to obtain an amplitude difference value sequence consisting of a plurality of amplitude difference values;
then, after all the extreme points in the curve are found, the difference between the ordinate values (i.e., amplitudes) of every two adjacent extreme points can be calculated. In order to capture the variation trend between extreme points, the amplitude of the later extreme point must be used as the minuend and the amplitude of the earlier extreme point as the subtrahend, which gives the amplitude difference between every two adjacent extreme points. The amplitude difference describes the intensity of the change between two adjacent extreme points; the plurality of amplitude differences obtained in this way forms the amplitude difference sequence.
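Steps 201 and 202 can be sketched in Python as follows. This is only an illustrative sketch, not the patented implementation: the function names `find_extreme_points` and `amplitude_differences` are hypothetical, the curve is assumed to be a plain list of ordinate values indexed by time, and the sketch assumes that an extreme point is an interior point where the slope changes sign (plateaus and end points are ignored).

```python
from typing import List, Tuple


def find_extreme_points(values: List[float]) -> List[int]:
    """Return the indices of local maxima and minima (assumption: strict slope sign change)."""
    extremes = []
    for i in range(1, len(values) - 1):
        left = values[i] - values[i - 1]
        right = values[i + 1] - values[i]
        # A change of sign in the slope marks a local maximum or minimum.
        if left * right < 0:
            extremes.append(i)
    return extremes


def amplitude_differences(values: List[float], extremes: List[int]) -> List[Tuple[int, float]]:
    """For each pair of adjacent extreme points, compute later amplitude minus earlier amplitude.

    Each difference is paired with the index of the later extreme point.
    """
    return [
        (later, values[later] - values[earlier])
        for earlier, later in zip(extremes, extremes[1:])
    ]


curve = [1.0, 3.0, 2.0, 6.0, 4.0, 5.0]
ext = find_extreme_points(curve)
diffs = amplitude_differences(curve, ext)
print(ext)    # [1, 2, 3, 4]
print(diffs)  # [(2, -1.0), (3, 4.0), (4, -2.0)]
```

A positive difference indicates a rise between two adjacent extreme points and a negative difference indicates a fall, which is why the later amplitude is the minuend.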
Step 203, filtering out the amplitude difference value of which the absolute value is smaller than a preset amplitude threshold value in the amplitude difference value sequence;
in order to find the intervals of relatively severe historical change in the time series curve, and to prevent intervals of small change from affecting the end points found, the amplitude differences that reflect only weak change between extreme points must be removed; that is, the amplitude differences whose absolute value is smaller than the preset amplitude threshold are filtered out of the amplitude difference sequence obtained above.
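The filtering in step 203 might look like the following minimal sketch. The function name `filter_small_differences` and the `(time index, difference)` tuple layout are assumptions made for illustration only.

```python
from typing import List, Tuple


def filter_small_differences(
    diffs: List[Tuple[int, float]], amplitude_threshold: float
) -> List[Tuple[int, float]]:
    """Drop differences whose absolute value is below the threshold; keep time order."""
    return [(t, d) for t, d in diffs if abs(d) >= amplitude_threshold]


diffs = [(2, -1.0), (3, 4.0), (4, -2.0)]
kept = filter_small_differences(diffs, amplitude_threshold=1.5)
print(kept)  # [(3, 4.0), (4, -2.0)]
```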
Step 204, performing superposition calculation on at least two amplitude difference values which have the same sign and are adjacent in time among the filtered amplitude difference values to obtain an amplitude peak value and/or an amplitude valley value;
wherein, after the amplitude difference values are filtered, the remaining amplitude difference values can be superposed to find the points of severe amplitude. During superposition, at least two time-adjacent amplitude difference values of the same sign are added together. For example, suppose the 5 amplitude difference values remaining after filtering, sorted from earliest to latest, are B1, B2, B3, -B4 and -B5. The amplitude differences B1, B2 and B3 are superposed; superposition stops after B3 because the next amplitude difference, -B4, has the opposite sign. The same-sign values -B4 and -B5 are superposed in the same way. The sum (B1 + B2 + B3) is an amplitude peak value, and the sum -(B4 + B5) is an amplitude valley value.
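The superposition in step 204 amounts to summing runs of consecutive same-sign values. A minimal sketch using `itertools.groupby` is shown below; the function name `superpose` is hypothetical, and the sketch assumes that a run containing a single value is also reported as its own sum, a case the text above does not address.

```python
from itertools import groupby
from typing import List


def superpose(diffs: List[float]) -> List[float]:
    """Sum each run of time-adjacent differences that share the same sign."""
    sums = []
    # groupby splits the sequence wherever the sign flips.
    for _, run in groupby(diffs, key=lambda d: d > 0):
        sums.append(sum(run))
    return sums


# B1, B2, B3, -B4, -B5 with B1=2, B2=3, B3=1, B4=4, B5=5
result = superpose([2.0, 3.0, 1.0, -4.0, -5.0])
print(result)  # [6.0, -9.0]
```

Here 6.0 corresponds to the amplitude peak value (B1 + B2 + B3) and -9.0 to the amplitude valley value -(B4 + B5).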
Step 205, for each of the filtered amplitude difference values, determining the time point of the later of its two corresponding extreme points as the time point of that amplitude difference value;
specifically, for example, suppose the filtered amplitude difference values are formed as follows: the extreme point at time t1 has amplitude A1, the extreme point at time t2 has amplitude A2, the extreme point at time t3 has amplitude A3, and the extreme point at time t4 has amplitude A4, with t1 < t2 < t3 < t4; the amplitude difference B1 = A2 - A1, the amplitude difference B2 = A3 - A2, and the amplitude difference B3 = A4 - A3. Taking the amplitude difference B1 as an example, it corresponds to the extreme points at t1 and t2, so the time point t2 of the later extreme point is determined as the time point of B1; similarly, the time point of B2 is t3 and the time point of B3 is t4.
The time point of the amplitude difference is understood to be the time point at which each amplitude segment ends.
Step 206, recording, as the time point of the amplitude peak value and/or the amplitude valley value, the time point of the target amplitude difference value that is latest in time among the at least two superposed amplitude difference values;
here, the embodiment of the present invention has found the severe amplitudes in the time series curve, i.e., the amplitude peak values and amplitude valley values (for example, the sums (B1 + B2 + B3) and -(B4 + B5) obtained above). When determining their time points, the time point t4 of the target amplitude difference B3, which is the latest in time among B1, B2 and B3, may be recorded as the time point of the amplitude peak value (B1 + B2 + B3); likewise, the time point of the target amplitude difference -B5, which is the later of -B4 and -B5, may be recorded as the time point of the amplitude valley value -(B4 + B5). The time points of the amplitude peak values and amplitude valley values reflect the time points at which a severe amplitude ends, and these time points (i.e., the target time points) may be recorded.
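Steps 205 and 206 can be illustrated by carrying the time point of the later extreme point along with each amplitude difference and keeping the last time point of each same-sign run. The sketch below is an assumption-laden illustration: `target_time_points` is a hypothetical name, time points are represented as integers, and the sample numbers are invented.

```python
from itertools import groupby
from typing import List, Tuple


def target_time_points(timed_diffs: List[Tuple[int, float]]) -> List[Tuple[int, float]]:
    """timed_diffs holds (time point of the later extreme point, amplitude difference).

    Returns (target time point, summed amplitude) for each same-sign run, where the
    target time point is that of the latest difference in the run.
    """
    targets = []
    for _, run in groupby(timed_diffs, key=lambda td: td[1] > 0):
        run = list(run)
        last_time = run[-1][0]             # time point of the latest difference in the run
        total = sum(d for _, d in run)     # amplitude peak (positive) or valley (negative)
        targets.append((last_time, total))
    return targets


# B1, B2, B3 end at t2, t3, t4; -B4, -B5 end at t5, t6
timed = [(2, 2.0), (3, 3.0), (4, 1.0), (5, -4.0), (6, -5.0)]
targets = target_time_points(timed)
print(targets)  # [(4, 6.0), (6, -9.0)]
```

In this example the peak (B1 + B2 + B3) is recorded at t4 and the valley -(B4 + B5) at t6, so there are two target time points.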
Step 207, judging whether the total number of target time points is an odd number or an even number;
Step 208, if the total number of target time points is an even number, calculating the time interval between the predicted time point and the latest of the recorded target time points;
wherein an even total number of target time points indicates that the number of severe amplitudes on the curve before the predicted time point is even. The latest time point is the target time point closest to the predicted time point, for example, day 210.
Step 209, determining whether the time interval is greater than or equal to a preset time threshold;
Step 210, if the time interval is greater than or equal to the preset time threshold, truncating the portion of the time series curve located at and before the latest of the recorded target time points;
if the time interval is greater than or equal to the preset time threshold (for example, 70 days), the time point of the severe amplitude is far from the current predicted time point. The curve is considered to have entered a steady state after the severe-amplitude time point nearest to the predicted time point, so the data to be predicted at the current predicted time point is not affected by the severe amplitude. Therefore, the portion of the time series curve at and before the latest of the recorded target time points may be truncated (i.e., the portion of the curve whose time coordinate is day 210 or earlier is cut off). When data prediction is performed, the stationary data set in the curve remaining after truncation can be used to predict the data for the future period that currently needs to be predicted.
Step 211, if the time interval is smaller than the preset time threshold, performing abnormal value processing on the time series curve.
If the time interval is smaller than the preset time threshold, the time point of the severe amplitude is close to the predicted time point, and the severe amplitude will affect the data predicted at the predicted time point. The portion of the curve containing the severe amplitude therefore cannot be truncated; instead, abnormal value processing is performed on the time series curve (the abnormal value processing flow is the prior-art flow for processing abnormal values in a curve and is not described here again).
Step 212, if the total number of target time points is an odd number, truncating the portion of the time series curve located at and before the second-latest of the recorded target time points;
if the total number of target time points is odd, the number of severe amplitudes on the curve before the predicted time point is odd, that is, the curve includes an unpaired peak or valley; in this case, the portion of the time series curve at and before the second-latest of the recorded target time points needs to be truncated.
Step 213, performing abnormal value processing on the truncated time series curve.
Because any curve may contain abnormal values, the prior-art abnormal value processing flow is performed to ensure that the points in the curve are normal. If no abnormal value is found in the curve, whether by manual or machine detection, the abnormal value processing flow in the above embodiment may be omitted; this variant is also within the scope of the present application.
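The branching of steps 207 to 213 can be summarized in the following sketch. It is not the patented implementation: `truncate_curve` is a hypothetical name, time points are assumed to be integer day numbers, and the function only returns a flag indicating that abnormal value processing should follow rather than performing it. Two cases the text does not cover are handled by assumption: with zero target time points, or with a single target time point (no second-latest point exists), the curve is left uncut.

```python
from typing import List, Sequence, Tuple


def truncate_curve(
    times: Sequence[int],
    values: Sequence[float],
    target_times: Sequence[int],
    predicted_time: int,
    time_threshold: int,
) -> Tuple[List[int], List[float], bool]:
    """Return (kept times, kept values, whether abnormal value processing should follow)."""

    def keep_after(cut: int) -> Tuple[List[int], List[float]]:
        # Discard every point at or before the cut time point.
        kept = [(t, v) for t, v in zip(times, values) if t > cut]
        return [t for t, _ in kept], [v for _, v in kept]

    targets = sorted(target_times)
    if len(targets) % 2 == 0:
        # Steps 208-210: even count, latest target far enough from the prediction.
        if targets and predicted_time - targets[-1] >= time_threshold:
            kept_t, kept_v = keep_after(targets[-1])
            return kept_t, kept_v, False
        # Step 211: too close (or, by assumption, no targets): keep the whole curve.
        return list(times), list(values), True
    # Steps 212-213: odd count, cut at the second-latest target time point.
    if len(targets) >= 2:
        kept_t, kept_v = keep_after(targets[-2])
    else:
        kept_t, kept_v = list(times), list(values)  # assumption: nothing to cut
    return kept_t, kept_v, True


days = list(range(1, 301))              # 300 days of history
inventory = [float(d) for d in days]    # invented sample values

far = truncate_curve(days, inventory, [100, 210], predicted_time=301, time_threshold=70)
near = truncate_curve(days, inventory, [100, 210], predicted_time=250, time_threshold=70)
odd = truncate_curve(days, inventory, [50, 100, 210], predicted_time=301, time_threshold=70)
print(far[0][0], len(far[0]), far[2])   # 211 90 False
print(len(near[0]), near[2])            # 300 True
print(odd[0][0], len(odd[0]), odd[2])   # 101 200 True
```

The first call mirrors the example in the text: the latest target time point is day 210, the interval to the predicted time point exceeds 70 days, and everything at or before day 210 is cut off.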
By means of the technical scheme of the embodiment of the invention, the target time points at which a severe amplitude ends in the time series curve are recorded, and the abnormal interval of the time series curve is then truncated in different truncation modes according to the total number of these target time points and the predicted time point closest to the present day within the future predetermined time interval. When the truncated time series curve is used to predict each time point within the future predetermined time interval, this helps ensure the accuracy of the predicted data and improves the data prediction effect.
Optionally, before performing step 201, the method according to the embodiment of the present invention may further include: smoothing the time series curve by adopting a smoothing algorithm.
Because the time series curve is not smooth but jagged, a smoothing algorithm such as moving average or exponential smoothing is first applied to the time series curve in order to find the maximum and minimum values in the curve. That is, the small sawteeth in the curve are corrected to form a smooth curve, which facilitates the trend detection in the subsequent steps and the finding of maximum and/or minimum values.
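As an illustration of the smoothing step, a centered moving average might be written as below. The text does not specify the window or edge handling, so the window size of 3 and the shrinking window at the two ends of the curve are assumptions, and `moving_average` is a hypothetical name.

```python
from typing import List


def moving_average(values: List[float], window: int = 3) -> List[float]:
    """Centered moving average; the window shrinks at the two ends of the curve."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


jagged = [1.0, 5.0, 3.0, 7.0, 5.0]
smoothed = moving_average(jagged)
print(smoothed)  # [3.0, 3.0, 5.0, 5.0, 6.0]
```

The zigzag in the input becomes a steadily rising sequence, so small sawteeth no longer register as extreme points.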
Therefore, smoothing the curve in advance corrects the small sawteeth and yields a smooth curve, which facilitates the trend detection in the subsequent steps and the finding of maximum and/or minimum values; and setting a preset amplitude threshold filters out minor fluctuations in the curve so that they do not affect the search for the severe amplitude interval.
In one embodiment, the preset amplitude threshold may be 50% of the average amplitude of all points in the time series curve.
Thus, by setting such a preset amplitude threshold, it is possible to filter out small fluctuations in the curve, avoiding them from affecting the search for a severe amplitude interval.
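The threshold of that embodiment can be computed as in the sketch below, which assumes that "amplitude" means the ordinate value of each point; the function name `preset_amplitude_threshold` and the `ratio` parameter are hypothetical.

```python
from typing import List


def preset_amplitude_threshold(values: List[float], ratio: float = 0.5) -> float:
    """Return ratio times the mean ordinate value of all points in the curve."""
    return ratio * sum(values) / len(values)


threshold = preset_amplitude_threshold([100.0, 200.0, 300.0])
print(threshold)  # 100.0
```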
It should be noted that, for simplicity of description, the method embodiments are described as a series of combined acts. Those skilled in the art will recognize, however, that the embodiments of the present invention are not limited by the described order of acts, because some steps may be performed in other orders or concurrently. Those skilled in the art will also appreciate that the embodiments described in the specification are preferred embodiments, and that the acts involved are not necessarily all required by the embodiments of the present invention.
Corresponding to the method provided by the embodiment of the present invention, fig. 3 shows a structural block diagram of an embodiment of a device for processing a time series curve according to the present invention. The device may specifically include the following modules:
a first determining module 31, configured to determine an amplitude difference value sequence of the time series curve, where the amplitude difference value sequence includes a plurality of amplitude difference values;
a filtering module 32, configured to filter the amplitude difference values in the amplitude difference value sequence according to a preset condition;
a second determining module 33, configured to determine an amplitude peak value and/or an amplitude valley value of the time series curve according to the plurality of filtered amplitude difference values;
a third determining module 34, configured to determine the time point of the amplitude peak value and/or the amplitude valley value and record it as a target time point;
and a truncation module 35, configured to truncate the time series curve in different truncation manners according to the total number of the target time points and the prediction time point closest to today within a future predetermined time interval, where the truncated time series curve is used to predict data at each time point within the future predetermined time interval.
Optionally, the first determining module 31 includes:
a first determining submodule, configured to traverse each point in the time series curve in sequence and determine at least two extreme points in the time series curve;
and an operation submodule, configured to subtract, for every two extreme points that are adjacent on the time coordinate and in sequence, the amplitude of the earlier extreme point from the amplitude of the later extreme point, to obtain an amplitude difference value sequence consisting of a plurality of amplitude difference values.
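The two submodules above can be sketched as follows. This is a minimal illustration under stated assumptions: an extreme point is taken to be a strict local maximum or minimum among interior points (endpoints and plateaus are ignored), points are identified by their index in the list, and the function names are hypothetical.

```python
from typing import List, Tuple


def find_extreme_points(values: List[float]) -> List[int]:
    """Return the indices of strict local maxima and minima (interior points only)."""
    extremes = []
    for i in range(1, len(values) - 1):
        left, mid, right = values[i - 1], values[i], values[i + 1]
        if (mid > left and mid > right) or (mid < left and mid < right):
            extremes.append(i)
    return extremes


def amplitude_differences(values: List[float],
                          extremes: List[int]) -> List[Tuple[int, float]]:
    """For each pair of adjacent extreme points, compute the later amplitude
    minus the earlier amplitude, tagged with the index of the later point."""
    return [(b, values[b] - values[a]) for a, b in zip(extremes, extremes[1:])]
```

A positive difference means the curve rose between the two extreme points and a negative difference means it fell, which is what the later same-sign superposition relies on.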
Optionally, the second determining module 33 includes:
a superposition calculation submodule, configured to superimpose, among the plurality of filtered amplitude difference values, at least two amplitude difference values that have the same sign and are adjacent in time, to obtain an amplitude peak value and/or an amplitude valley value.
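The superposition calculation can be sketched as summing each run of adjacent same-sign differences. Several points here are assumptions rather than statements of the embodiment: adjacency is taken as adjacency in the filtered list, a positive sum is labelled a peak and a negative sum a valley, and runs shorter than `min_run` (default 2, following the "at least two" wording) are skipped. The function name is hypothetical.

```python
from itertools import groupby
from typing import List, Tuple


def superimpose(diffs: List[float], min_run: int = 2) -> List[Tuple[str, float]]:
    """Sum each run of adjacent same-sign differences.

    Positive sums are labelled 'peak', negative sums 'valley' (assumed labels).
    Zero differences are ignored; runs shorter than `min_run` are skipped.
    """
    result = []
    nonzero = (d for d in diffs if d != 0)
    for positive, run in groupby(nonzero, key=lambda d: d > 0):
        members = list(run)
        if len(members) >= min_run:
            result.append(("peak" if positive else "valley", sum(members)))
    return result
```

Passing `min_run=1` would also keep isolated differences, which may be closer to the intended behaviour if a single large jump should count as a peak or valley on its own.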
Optionally, the third determining module 34 includes:
a second determining submodule, configured to determine, for each of the filtered amplitude difference values, the time point of the later of its two extreme points as the time point of that amplitude difference value;
and a recording submodule, configured to record, as the time point of the amplitude peak value and/or the amplitude valley value, the time point of the target amplitude difference value, that is, the one with the later time point among the at least two superimposed amplitude difference values.
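The recording rule above can be sketched as follows: each difference carries the time point of its later extreme point, and each superimposed run contributes the time point of its last member. The input format, the function name, and the `min_run` parameter are assumptions made for illustration; adjacency is again taken as adjacency in the filtered list.

```python
from typing import List, Tuple


def target_time_points(diffs: List[Tuple[int, float]], min_run: int = 2) -> List[int]:
    """diffs: (time, difference) pairs in time order, where `time` is the
    time point of the later extreme point of each difference.
    Returns the time point of the last difference in every same-sign run
    that contains at least `min_run` differences."""
    targets: List[int] = []
    run_len = 0
    last_sign = 0
    last_time = None
    for t, d in diffs:
        sign = (d > 0) - (d < 0)
        if sign != 0 and sign == last_sign:
            run_len += 1
        else:
            # The previous run has ended; record it if it was long enough.
            if run_len >= min_run:
                targets.append(last_time)
            run_len = 1 if sign != 0 else 0
        last_sign, last_time = sign, t
    if run_len >= min_run:
        targets.append(last_time)
    return targets
```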
Optionally, the truncation module 35 includes:
a calculation submodule, configured to calculate, if the total number of the target time points is even, the time interval between the prediction time point and the latest of the recorded target time points;
a first truncation submodule, configured to truncate, if the time interval is greater than or equal to a preset time threshold, the portion of the time series curve located at and before the latest of the recorded target time points;
and a first abnormal value processing submodule, configured to perform abnormal value processing on the time series curve if the time interval is smaller than the preset time threshold.
Optionally, the truncation module 35 further includes:
a second truncation submodule, configured to truncate, if the total number of the target time points is odd, the portion of the time series curve located at and before the second-latest of the recorded target time points;
and a second abnormal value processing submodule, configured to perform abnormal value processing on the truncated time series curve.
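The even and odd branches of the truncation module can be summarized in one sketch. It is illustrative only and rests on several assumptions: the curve is a list of (time, value) pairs, truncation discards every point at or before the chosen target time point, the abnormal value processing step is a caller-supplied placeholder because it is not specified in this passage, and the cases with no target time point or only one are handled by leaving the curve untruncated. All names are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[int, float]  # (time point, amplitude)


def truncate_curve(
    curve: Sequence[Point],
    targets: Sequence[int],
    prediction_time: int,
    min_gap: int,
    handle_outliers: Callable[[Sequence[Point]], List[Point]] = list,
) -> List[Point]:
    """Choose a truncation mode from the parity of the number of target time points."""
    ordered = sorted(targets)
    if not ordered:
        return list(curve)  # assumption: nothing to truncate
    if len(ordered) % 2 == 0:
        latest = ordered[-1]
        if prediction_time - latest >= min_gap:
            # Even count, far enough from the prediction point: cut at the latest target.
            return [p for p in curve if p[0] > latest]
        # Even count, too close to the prediction point: only treat abnormal values.
        return handle_outliers(curve)
    if len(ordered) < 2:
        return handle_outliers(curve)  # assumption: no second-latest point exists
    second_latest = ordered[-2]
    # Odd count: cut at the second-latest target, then treat abnormal values.
    return handle_outliers([p for p in curve if p[0] > second_latest])
```

With a ten-point curve, target time points 3 and 6, a prediction time point of 10, and a threshold of 3, the interval is 4, so only the points after time 6 are kept.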
Optionally, the device further includes:
a smoothing processing module, configured to smooth the time series curve with a smoothing algorithm.
Preferably, the preset amplitude threshold is 50% of the average of all amplitudes in the time series curve.
Because the device embodiment is substantially similar to the method embodiment, it is described only briefly; for the relevant details, refer to the corresponding description of the method embodiment.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the parts that are the same or similar among the embodiments may be cross-referenced.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it should also be noted that relational terms such as first and second are used herein solely to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or terminal that comprises the element.
The method and device for processing a time series curve provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present invention, and the description of the above embodiments is intended only to help in understanding the method and core idea of the present invention. Meanwhile, a person skilled in the art may, following the idea of the present invention, make changes to the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (10)

1. A method for processing a time series curve, characterized by comprising the following steps:
determining a time series curve of data to be predicted in a preset historical time period before today;
determining an amplitude difference value sequence of the time series curve, the amplitude difference value sequence comprising a plurality of amplitude difference values;
filtering the amplitude difference values in the amplitude difference value sequence according to a preset condition;
determining an amplitude peak value and/or an amplitude valley value of the time series curve according to the plurality of filtered amplitude difference values;
determining the time point of the amplitude peak value and/or the amplitude valley value, and recording it as a target time point;
truncating the time series curve in different truncation manners according to the total number of the target time points and the prediction time point closest to today within a future preset time interval;
predicting the data to be predicted at each time point in the future preset time interval by using the truncated time series curve;
wherein the coordinate points of the time series curve represent the values of the data to be predicted at each time point in the preset time period before today;
the data to be predicted is advertisement playing amount;
the amplitude difference value refers to the difference between two adjacent extreme points in the time series curve;
and determining an amplitude peak value and/or an amplitude valley value of the time series curve according to the plurality of filtered amplitude difference values comprises: among the plurality of filtered amplitude difference values, superimposing at least two amplitude difference values that have the same sign and are adjacent in time, to obtain the amplitude peak value and/or the amplitude valley value.
2. The method of claim 1, wherein determining the amplitude difference value sequence of the time series curve comprises:
traversing each point in the time series curve in sequence, and determining at least two extreme points in the time series curve;
and, for every two extreme points that are adjacent on the time coordinate and in sequence, subtracting the amplitude of the earlier extreme point from the amplitude of the later extreme point, to obtain an amplitude difference value sequence consisting of a plurality of amplitude difference values.
3. The method according to claim 1, wherein determining the time point of the amplitude peak value and/or the amplitude valley value and recording it as a target time point comprises:
for each of the filtered amplitude difference values, determining the time point of the later of its two extreme points as the time point of that amplitude difference value;
and recording, as the time point of the amplitude peak value and/or the amplitude valley value, the time point of the target amplitude difference value, that is, the one with the later time point among the at least two superimposed amplitude difference values.
4. The method according to claim 1, wherein truncating the time series curve in different truncation manners according to the total number of the target time points and the prediction time point closest to today within the future preset time interval comprises:
if the total number of the target time points is even, calculating the time interval between the prediction time point and the latest of the recorded target time points;
if the time interval is greater than or equal to a preset time threshold, truncating the portion of the time series curve located at and before the latest of the recorded target time points;
and if the time interval is smaller than the preset time threshold, performing abnormal value processing on the time series curve.
5. The method according to claim 1, wherein truncating the time series curve in different truncation manners according to the total number of the target time points and the prediction time point closest to today within the future preset time interval further comprises:
if the total number of the target time points is odd, truncating the portion of the time series curve located at and before the second-latest of the recorded target time points;
and performing abnormal value processing on the truncated time series curve.
6. A device for processing a time series curve, comprising:
a first determining module, configured to determine a time series curve of data to be predicted in a preset historical time period before today,
and to determine an amplitude difference value sequence of the time series curve, the amplitude difference value sequence comprising a plurality of amplitude difference values;
a filtering module, configured to filter the amplitude difference values in the amplitude difference value sequence according to a preset condition;
a second determining module, configured to determine an amplitude peak value and/or an amplitude valley value of the time series curve according to the plurality of filtered amplitude difference values;
a third determining module, configured to determine the time point of the amplitude peak value and/or the amplitude valley value and record it as a target time point;
a truncation module, configured to truncate the time series curve in different truncation manners according to the total number of the target time points and the prediction time point closest to today within a future preset time interval,
the truncated time series curve being used to predict the data to be predicted at each time point in the future preset time interval;
wherein the coordinate points of the time series curve represent the values of the data to be predicted at each time point in the preset time period before today;
the data to be predicted is advertisement playing amount;
the amplitude difference value refers to the difference between two adjacent extreme points in the time series curve;
and the second determining module comprises: a superposition calculation submodule, configured to superimpose, among the plurality of filtered amplitude difference values, at least two amplitude difference values that have the same sign and are adjacent in time, to obtain the amplitude peak value and/or the amplitude valley value.
7. The device of claim 6, wherein the first determining module comprises:
a first determining submodule, configured to traverse each point in the time series curve in sequence and determine at least two extreme points in the time series curve;
and an operation submodule, configured to subtract, for every two extreme points that are adjacent on the time coordinate and in sequence, the amplitude of the earlier extreme point from the amplitude of the later extreme point, to obtain an amplitude difference value sequence consisting of a plurality of amplitude difference values.
8. The device of claim 6, wherein the third determining module comprises:
a second determining submodule, configured to determine, for each of the filtered amplitude difference values, the time point of the later of its two extreme points as the time point of that amplitude difference value;
and a recording submodule, configured to record, as the time point of the amplitude peak value and/or the amplitude valley value, the time point of the target amplitude difference value, that is, the one with the later time point among the at least two superimposed amplitude difference values.
9. The device of claim 6, wherein the truncation module comprises:
a calculation submodule, configured to calculate, if the total number of the target time points is even, the time interval between the prediction time point and the latest of the recorded target time points;
a first truncation submodule, configured to truncate, if the time interval is greater than or equal to a preset time threshold, the portion of the time series curve located at and before the latest of the recorded target time points;
and a first abnormal value processing submodule, configured to perform abnormal value processing on the time series curve if the time interval is smaller than the preset time threshold.
10. The device of claim 6, wherein the truncation module further comprises:
a second truncation submodule, configured to truncate, if the total number of the target time points is odd, the portion of the time series curve located at and before the second-latest of the recorded target time points;
and a second abnormal value processing submodule, configured to perform abnormal value processing on the truncated time series curve.
CN201710534624.5A 2017-07-03 2017-07-03 Method and device for processing time series curve Active CN107491830B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710534624.5A CN107491830B (en) 2017-07-03 2017-07-03 Method and device for processing time series curve


Publications (2)

Publication Number Publication Date
CN107491830A CN107491830A (en) 2017-12-19
CN107491830B true CN107491830B (en) 2021-03-26

Family

ID=60644516

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710534624.5A Active CN107491830B (en) 2017-07-03 2017-07-03 Method and device for processing time series curve

Country Status (1)

Country Link
CN (1) CN107491830B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111937012A (en) * 2018-03-30 2020-11-13 日本电气方案创新株式会社 Index calculation device, prediction system, progress prediction evaluation method, and program

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010001966A1 (en) * 2008-07-03 2010-01-07 日本電気株式会社 Time-series data processing device and method and program thereof
CN102346745A (en) * 2010-08-02 2012-02-08 阿里巴巴集团控股有限公司 Method and device for predicting user behavior number for words
CN103888315A (en) * 2014-03-24 2014-06-25 北京邮电大学 Self-adaptation burst flow detection device and detection method thereof
CN104462215A (en) * 2014-11-05 2015-03-25 大连理工大学 Scientific and technical literature quoting number predicting method based on time sequence
CN105303167A (en) * 2008-01-23 2016-02-03 加州大学评议会 Systems and methods for behavioral monitoring and calibration
CN105320845A (en) * 2015-11-26 2016-02-10 电子科技大学 Time sequence forecast method based on quantum gravity algorithm

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103930912A (en) * 2011-11-08 2014-07-16 国际商业机器公司 Time-series data analysis method, system and computer program
US9558666B2 (en) * 2014-12-02 2017-01-31 Robert Bosch Gmbh Collision avoidance in traffic crossings using radar sensors


Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
A Novel Time Series Approach for Predicting the Long-Term Popularity of Online Videos;Tan, Zhiyi;《IEEE TRANSACTIONS ON BROADCASTING》;20160731;第62卷(第2期);全文 *
Collaborated Online Change-point Detection in Sparse Time Series for Online Advertising;Zhang, Jie;《IEEE International Conference on Data Mining》;20151130;全文 *
时间序列数据挖掘在股市预测分析中的应用研究;何永沛;《中国优秀硕士学位论文全文数据库(信息科技辑)》;20090615;全文 *
视频分享网站热门视频快速挖掘预测模型;朱开成;《中国优秀硕士学位论文全文数据库(信息科技辑)》;20170415;全文 *
面向在线视频服务的播放量预测算法研究;张俊池;《中国优秀硕士学位论文全文数据库(信息科技辑)》;20170515;全文 *

Also Published As

Publication number Publication date
CN107491830A (en) 2017-12-19


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant