CN109933607A - Periodical time series data processing method - Google Patents

Info

Publication number
CN109933607A
Authority
CN
China
Prior art keywords
data
point
turning point
time series
date
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910075079.7A
Other languages
Chinese (zh)
Other versions
CN109933607B (en)
Inventor
文曙东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Weinuo Times Beijing Technology Co ltd
Original Assignee
Sichuan Cheng Cheng Tian You Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Cheng Cheng Tian You Technology Co Ltd
Priority to CN201910075079.7A
Publication of CN109933607A
Application granted
Publication of CN109933607B
Active
Anticipated expiration

Abstract

The invention relates to the technical field of data processing and discloses a periodic time series data processing method that addresses the inaccurate identification of trend turning points in data sequences in the prior art. The method comprises the steps of: a. grouping the data in a period K by natural week and extracting, from each group, the data for the corresponding day of the week, forming 7 data sequences S(1), S(2), ..., S(7); b. summing the data at corresponding positions in the 7 data sequences to obtain an 8th data sequence S(8); c. finding the trend turning points of the data sequences S(1), S(2), ..., S(8); where K ≥ 7n and n is a positive integer. The method removes the influence of periodic variation on turning point identification, improves the accuracy of that identification, reflects the actual trend of the data, and provides a sounder basis for decision-making. It also simplifies data processing and improves processing efficiency.

Description

Periodic time series data processing method
Technical field
The invention relates to the technical field of data processing, in particular to methods for processing time series data with periodic features, and more particularly to a method for identifying trend turning points in periodic data sequences.
Background art
When making predictions, we need to analyze the shape of the time series: is it rising, falling, or steady? This requires selecting the points that stand out most visually, namely the turning points of the time series data. What turning points have in common is that the trend differs markedly on either side of the point.
Time series data carries trend information. Trend turning points can be extracted from this information to compress the data and reduce the influence of noise. Analyzing time series data makes it possible to predict and judge how events will develop, providing a basis for decisions.
Conventional turning point identification methods do not take weekly periodic fluctuation into account. In a typical application scenario, passenger travel data shows a marked periodicity with the natural week as its period: travel data cycles every 7 days. Because of this 7-day fluctuation, looking for turning points directly in the raw data cannot locate a turning point accurately to a particular day.
In practice, a large amount of traffic and travel data is periodic. Railway traffic data, for example, shows a clear periodicity from Monday through Sunday. On the Beijing-Shanghai section of the Beijing-Shanghai high-speed railway, passenger flow on Fridays and Sundays is clearly higher than on other days. Over a whole year, July and August are the summer transport peak, and June shows a clear upward trend. Such a data sequence varies periodically within each week and also rises or falls over the year. Because of the weekly fluctuation, common turning point identification methods cannot conveniently find the dates of passenger traffic trend turning points over the year. How to find, conveniently and accurately, the dates on which the data turns upward or downward over the year, so as to support decisions on the formation of passenger trains, is a problem that needs to be solved.
Summary of the invention
The main purpose of the invention is to provide a periodic time series data processing method that solves the prior-art problem of inaccurate identification of trend turning points in data sequences.
To achieve this purpose, according to one aspect of the specific embodiments of the invention, a periodic time series data processing method is provided, characterized by comprising the steps of:
a. grouping the data in a period K by natural week and extracting, from each group, the data for the corresponding day of the week, forming 7 data sequences S(1), S(2), ..., S(7);
b. summing the data at corresponding positions in the 7 data sequences to obtain an 8th data sequence S(8);
c. finding the trend turning points of the data sequences S(1), S(2), ..., S(8);
where K ≥ 7n and n is a positive integer.
Further, the method comprises the step of:
d. taking the number of trend turning points of data sequence S(8) as the number of trend turning points of period K.
Further, step c is specifically:
finding the trend turning points of the data sequences S(1), S(2), ..., S(8) using a mathematical method.
Further, the mathematical method is specifically:
arranging the data sequence in order and connecting the data points with straight lines; for each data point, comparing the slopes of the lines joining it to its left and right neighbors, and classifying the point as a turning point when the slope difference exceeds a set threshold;
or
arranging the data sequence in order and connecting the first and last data points with a straight line; calculating the vertical distance between each intermediate data point and the line, and selecting the point with the largest distance as a turning point. This turning point then serves as a new endpoint which, together with the original first and last points, splits the data into two sequences, and new turning points are found in each by the same method. The process repeats until the distance from every point to its line meets a set value, or until the number of turning points reaches a set value.
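By way of non-limiting illustration, the first (slope difference) alternative above can be sketched in Python as follows. The function name `slope_turning_points`, the unit spacing assumed between adjacent data points, and the threshold used in the example are assumptions introduced for this sketch and do not appear in the original text.

```python
from typing import List


def slope_turning_points(values: List[float], threshold: float) -> List[int]:
    """Return 0-based indices of points whose left and right slopes differ
    by more than `threshold` (adjacent points assumed one unit apart)."""
    points = []
    for i in range(1, len(values) - 1):
        left_slope = values[i] - values[i - 1]    # slope of the line to the left neighbor
        right_slope = values[i + 1] - values[i]   # slope of the line to the right neighbor
        if abs(right_slope - left_slope) > threshold:
            points.append(i)
    return points


# A series that rises and then falls turns at its peak (index 3).
print(slope_turning_points([0, 1, 2, 3, 2, 1, 0], threshold=1.0))
```

The first and last points have only one neighbor, so the sketch never classifies them as turning points; the set threshold controls how sharp a change must be before a point qualifies.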
Further, the unit of K is the year, and n = 52.
Further, the method comprises the step of:
e. locating the trend turning points of the year at the turning point weeks of S(8); forming a set Date(10) of 10 consecutive dates from the 7 days of the turning point week and the last 3 days of the preceding week; then checking the turning point dates of the preceding sequences S(1) to S(7), extracting those that fall within Date(10) to form a new set, and choosing the smallest date in the new set, thereby locating the turning point date of the turning point week to that day, which also becomes a trend turning point of the year.
Further, the turning point date of the last turning point week is located at the last day.
The beneficial effects of the invention are that the influence of periodic variation on turning point identification is removed, the accuracy of turning point identification is improved, the actual trend of the data is reflected, and a sounder basis is provided for decision-making. The invention also simplifies data processing and improves processing efficiency.
The invention is further described below with reference to the drawings and specific embodiments. Additional aspects and advantages of the invention will be set forth in part in the description that follows, and in part will become apparent from that description or be learned through practice of the invention.
Description of the drawings
The accompanying drawings, which form a part of this application, are provided for further understanding of the invention. The specific embodiments, illustrative examples, and their descriptions serve to explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is a schematic diagram of passenger traffic volume at a certain station;
Fig. 2 is a schematic diagram of the data fitting of the embodiment.
Detailed description of the embodiments
It should be noted that, where no conflict arises, the specific embodiments and examples in this application and the features in them may be combined with one another. The invention is described in detail below with reference to the drawings and the following content.
To help those skilled in the art better understand the solution of the invention, the technical solutions in the specific embodiments and examples are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some of the embodiments of the invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of the specific embodiments and examples of the invention without creative effort shall fall within the scope of protection of the invention.
In the invention, a trend turning point of a data sequence is a feature reflecting the trend of the data: the trend before and after that data point differs markedly.
The time series data handled by the invention has a short 7-day weekly cycle and a complex trend over the year. Dimensionality reduction of the time series data is therefore especially important, that is, reducing the number of points as far as possible while preserving the general shape of the time series.
Dimensionality reduction: the time series has a periodic feature with the natural week as its unit. To eliminate the interference of this periodicity with turning point analysis, the weekly data is reduced in dimension. To this end, the invention introduces 8 time series.
First, the data for Monday through Sunday is extracted by group, forming 7 time series. Note that a 365-day year is 52 weeks plus one day; the extra day is dropped in this discussion. The series are as follows:
Time series S(1) of all Mondays, 52 data points;
Time series S(2) of all Tuesdays, 52 data points;
Time series S(3) of all Wednesdays, 52 data points;
Time series S(4) of all Thursdays, 52 data points;
Time series S(5) of all Fridays, 52 data points;
Time series S(6) of all Saturdays, 52 data points;
Time series S(7) of all Sundays, 52 data points.
These 7 sequences remove the short weekly cycle and broadly reflect the trend of the data over the year, but the data on certain dates may contain outliers, which can make the trend shapes of the 7 sequences inconsistent.
Second, the invention sums the data of each week to form a weekly data sequence, the 8th time series S(8). A year of data thus goes from 365 days to 52 weeks. This removes the influence of the 7-day cycle and lets the positive and negative errors of the individual days within a week offset each other, reducing the influence of outliers. The time series formed by these 52 weekly values shows the trend of the data over the year, and the turning point weeks of the annual data can be selected from it.
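As a minimal sketch of the dimensionality reduction described above, the 8 sequences could be built in Python as follows. The function name `decompose_by_weekday` and the convention that the first element of the input is the first day of the first week are assumptions made for illustration only.

```python
from typing import List


def decompose_by_weekday(daily: List[float]) -> List[List[float]]:
    """Return [S(1), ..., S(7), S(8)] for a daily series whose first element
    is assumed to be the first day of the first week."""
    n_weeks = len(daily) // 7
    trimmed = daily[: n_weeks * 7]              # drop leftover days, e.g. day 365
    # S(1)..S(7): every 7th value, starting at each position within the week
    sequences = [trimmed[d::7] for d in range(7)]
    # S(8): the sum of the 7 values of each week
    weekly = [sum(trimmed[w * 7:(w + 1) * 7]) for w in range(n_weeks)]
    sequences.append(weekly)
    return sequences


# One year of placeholder data: 365 days become 8 sequences of 52 points each.
seqs = decompose_by_weekday([float(d) for d in range(1, 366)])
print(len(seqs), [len(s) for s in seqs])
```

Element k of S(8) equals the sum of element k across S(1) to S(7), which is the "corresponding position" summation of step b.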
Turning point analysis and extraction is performed on each of the 8 time series above; that is, the 52 data points are segmented so that the data within each segment can be approximated by a straight line segment. The time series is thus represented as a chain of adjacent line segments: several line segments joined end to end approximate the original series, and the segments need not be of equal length. Conventional methods include the maximum distance method (vertical distance or orthogonal distance) and piecewise linear representation of time series based on extracting edge points by slope.
We take the trend turning points of S(8) as the trend turning point weeks of the year, and then need to locate each annual trend turning point to an exact date. The whole process is as follows:
1. For railway passenger demand data with a weekly cycle, the data for Monday through Sunday over one year is extracted by group and decomposed into seven subsequences S(1), S(2), ..., S(7), each containing 52 data points.
2. The data for the 7 days of each week is summed to form the 52-point weekly data sequence S(8). This sequence removes the short weekly cycle and also removes the effect of unknown disturbances on the data of individual dates. It is the 8th sequence.
3. Turning points are extracted from each of the eight newly generated time series. Because many disturbances exist in reality, the seven sequences from step 1 contain some outliers, and the turning points selected from them are disturbed and inaccurate. The 8th sequence S(8) from step 2 sums each week's data, which smooths it and reveals the annual trend; the turning points selected from it are the annual trend turning points, but they are located to a week rather than an exact date.
4. The turning points of the 8th sequence S(8) tell us which weeks contain a turning point, but the date on which the annual trend actually turns may lie not in that week but in the last 3 days of the preceding week; that is, the trend may already have started to change in the second half of the preceding week. We form the set Date(10) of 10 consecutive dates from the 7 days of the turning point week and the last three days of the preceding week, check the turning points of the seven preceding sequences, collect all turning points of the 7 sequences that fall within Date(10), and choose the earliest date. The turning point is thereby located to that exact date.
Table 1: the turning point date falls within the turning point week
...... | ...... | ...... | ...... | ...... | ...... | ...... | ......
Week n | Trend 1 | Trend 1 | Trend 1 | Trend 1 | Trend 1 | Trend 1 | Trend 1
Week n+1 (turning point week) | Trend 1 | Turning point | Turning point | Turning point | Turning point | Turning point | Turning point
Week n+2 | Turning point | Trend 2 | Trend 2 | Trend 2 | Trend 2 | Trend 2 | Trend 2
Week n+3 | Trend 2 | Trend 2 | Trend 2 | Trend 2 | Trend 2 | Trend 2 | Trend 2
...... | ...... | ...... | ...... | ...... | ...... | ...... | ......
Table 2: the turning point date falls within the last three days of the week before the turning point week
5. If step 4 fails to select a turning point date, the earliest date of that week is taken as the turning point date.
6. As a special case, the last week is the end-point week, and the last turning point is set to the last day.
Embodiment:
Passenger dispatch data for the first 22 weeks of 2015 at a certain station is shown in Fig. 1. Dimensionality reduction is applied to the data, yielding the seven subsequences S(1), S(2), ..., S(7) for Monday through Sunday and the weekly passenger volume series S(8), obtained by summing each run of 7 consecutive days. Turning point detection is performed on each of the eight sequences S(1), S(2), ..., S(8).
Turning point detection method:
Add the first and last points to the turning point set and connect them. From their coordinates (Xi, Yi), where Xi is the week number and Yi is the corresponding passenger volume, obtain the line equation Y = aX + b. Using the point-to-line distance formula, compute the distance from each remaining point in the interval to the line, select the point farthest from the line, and add it to the turning point set. Then connect each pair of adjacent points in the set, compute the distance from each point in each interval to the corresponding line, and again select the farthest point. Continue until 5 turning points (including the two endpoints) have been selected.
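The detection procedure above can be sketched in Python as follows, for illustration only. The text available here does not reproduce the distance formula, so the sketch assumes the perpendicular distance |aX - Y + b| / sqrt(a^2 + 1) from a point (X, Y) to the line Y = aX + b; it also assumes that each round adds the single farthest point over all intervals. The function name `farthest_point_turning_points` and the 0-based indexing are likewise assumptions.

```python
import math
from typing import List


def farthest_point_turning_points(values: List[float], count: int) -> List[int]:
    """Greedily select up to `count` turning points (0-based indices,
    both endpoints included) by repeatedly adding the farthest point."""
    if len(values) < 2:
        return list(range(len(values)))
    selected = [0, len(values) - 1]             # start from the first and last points
    while len(selected) < count:
        best_idx, best_dist = None, 0.0
        for left, right in zip(selected, selected[1:]):
            # line Y = a*X + b through two adjacent selected points
            a = (values[right] - values[left]) / (right - left)
            b = values[left] - a * left
            for x in range(left + 1, right):
                dist = abs(a * x - values[x] + b) / math.sqrt(a * a + 1)
                if dist > best_dist:
                    best_idx, best_dist = x, dist
        if best_idx is None:                    # all remaining points lie on the lines
            break
        selected.append(best_idx)
        selected.sort()
    return selected


print(farthest_point_turning_points([0, 0, 5, 0, 0, 0, -4, 0, 0], count=4))
```

Calling the function with `count=5` on a 22-point sequence corresponds to the stopping rule of the embodiment; a distance threshold could be used as the stopping rule instead, as the summary permits.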
The station's passenger volume data is analyzed by applying the turning point detection above to the weekly sequence and to each subsequence. The turning points of the 8 sequences are as follows:
Monday: (1,11,13,16,22)
Tuesday: (1,2,4,11,22)
Wednesday: (1,2,4,11,22)
Thursday: (1,3,5,12,22)
Friday: (1,3,5,16,22)
Saturday: (1,3,5,12,22)
Sunday: (1,5,10,15,22)
Weekly data: (1,3,5,11,22)
The eight sequences are analyzed and the turning points of each are found. By intersecting the turning points, the point that takes effect earliest within each turning point week of the weekly sequence is found, as in Table 3 below, where the turning points selected from the weekly data sequence are marked in green and the framed data are the dates corresponding to the subsequence turning points. The specific steps are as follows:
The first week in the weekly data is a turning point week, and the earliest turning point selected is January 1. The third week in the weekly sequence is a turning point week whose earliest point is January 15; to be safe, we look back three days and find that January 13 and January 14 are also turning points, so the earliest turning point selected is January 13.
The fifth week in the weekly data is a turning point week whose earliest turning point is January 29; the two days before it are also turning points, so the earliest turning point moves back to January 27. Continuing in this way, the intersection of the weekly sequence and subsequence turning points is taken repeatedly, and the turning point of the last week is located at the last day. The turning point dates finally selected are shown in Table 4.
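For illustration only, steps 4 to 6 of the process can be sketched in Python and applied to the turning point weeks listed above. The function name `locate_turning_day`, the 1-based day numbering, and the mapping of weekdays to positions within a week are assumptions not stated in the text. The mapping assumes that weeks are counted as 7-day blocks starting on January 1, 2015 (a Thursday), which is how the dates in the embodiment appear to line up.

```python
from datetime import date, timedelta
from typing import Dict, List


def locate_turning_day(week: int, sub_weeks: Dict[int, List[int]], n_weeks: int) -> int:
    """Map a 1-based turning point week of S(8) to a 1-based day number."""
    if week == n_weeks:                          # step 6: end-point week -> last day
        return n_weeks * 7
    first_day = (week - 1) * 7 + 1
    # Date(10): last 3 days of the preceding week plus the 7 days of this week
    window = range(max(1, first_day - 3), first_day + 7)
    candidates = [
        (w - 1) * 7 + position
        for position, weeks in sub_weeks.items()
        for w in weeks
        if (w - 1) * 7 + position in window
    ]
    return min(candidates) if candidates else first_day   # step 5 fallback


# Turning point weeks from the embodiment, keyed by assumed position in the week
# (position 1 = Thursday, because January 1, 2015 was a Thursday).
SUB_WEEKS = {
    5: [1, 11, 13, 16, 22],   # Monday
    6: [1, 2, 4, 11, 22],     # Tuesday
    7: [1, 2, 4, 11, 22],     # Wednesday
    1: [1, 3, 5, 12, 22],     # Thursday
    2: [1, 3, 5, 16, 22],     # Friday
    3: [1, 3, 5, 12, 22],     # Saturday
    4: [1, 5, 10, 15, 22],    # Sunday
}
WEEKLY = [1, 3, 5, 11, 22]    # turning point weeks of the weekly data S(8)

days = [locate_turning_day(w, SUB_WEEKS, n_weeks=22) for w in WEEKLY]
dates = [date(2015, 1, 1) + timedelta(days=d - 1) for d in days]
print(days)
print([d.isoformat() for d in dates])
```

Under these assumptions the sketch yields day numbers 1, 13, 27, 75 and 154, that is, January 1, January 13, January 27, March 16 and June 3, the dates the embodiment selects.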
Table 3
Table 4
Turning point | Date | Corresponding day
1 | January 1 | 1
2 | January 13 | 13
3 | January 27 | 27
4 | March 16 | 75
5 | June 3 | 154
The resulting fit is shown in Fig. 2, where the square points are the passenger volume data for the 154 days and the round points are the selected turning points (including the start and end points).

Claims (7)

1. A periodic time series data processing method, characterized by comprising the steps of:
a. grouping the data in a period K by natural week and extracting, from each group, the data for the corresponding day of the week, forming 7 data sequences S(1), S(2), ..., S(7);
b. summing the data at corresponding positions in the 7 data sequences to obtain an 8th data sequence S(8);
c. finding the trend turning points of the data sequences S(1), S(2), ..., S(8);
where K ≥ 7n and n is a positive integer.
2. The periodic time series data processing method according to claim 1, characterized by further comprising the step of:
d. taking the number of trend turning points of data sequence S(8) as the number of trend turning points of period K.
3. The periodic time series data processing method according to claim 1, characterized in that step c is specifically:
finding the trend turning points of the data sequences S(1), S(2), ..., S(8) using a mathematical method.
4. The periodic time series data processing method according to claim 3, characterized in that the mathematical method is specifically:
arranging the data sequence in order and connecting the data points with straight lines; for each data point, comparing the slopes of the lines joining it to its left and right neighbors, and classifying the point as a turning point when the slope difference exceeds a set threshold;
or
arranging the data sequence in order and connecting the first and last data points with a straight line; calculating the vertical distance between each intermediate data point and the line, and selecting the point with the largest distance as a turning point; this turning point then serves as a new endpoint which, together with the original first and last points, splits the data into two sequences, and new turning points are found in each by the same method; the process repeats until the distance from every point to its line meets a set value, or until the number of turning points reaches a set value.
5. The periodic time series data processing method according to any one of claims 1 to 4, characterized in that the unit of K is the year and n = 52.
6. The periodic time series data processing method according to claim 5, characterized by further comprising the step of:
e. locating the trend turning points of the year at the turning point weeks of S(8); forming a set Date(10) of 10 consecutive dates from the 7 days of the turning point week and the last 3 days of the preceding week; then checking the turning point dates of the data sequences S(1) to S(7), extracting those that fall within Date(10) to form a new set, and choosing the smallest date in the new set, thereby locating the turning point date of the turning point week to that day, which also becomes a trend turning point of the year.
7. The periodic time series data processing method according to claim 6, characterized in that the turning point date of the last turning point week is located at the last day.
CN201910075079.7A 2019-01-25 2019-01-25 Periodic time series data processing method Active CN109933607B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910075079.7A CN109933607B (en) 2019-01-25 2019-01-25 Periodic time series data processing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910075079.7A CN109933607B (en) 2019-01-25 2019-01-25 Periodic time series data processing method

Publications (2)

Publication Number Publication Date
CN109933607A true CN109933607A (en) 2019-06-25
CN109933607B CN109933607B (en) 2023-10-03

Family

ID=66985239

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910075079.7A Active CN109933607B (en) 2019-01-25 2019-01-25 Periodic time series data processing method

Country Status (1)

Country Link
CN (1) CN109933607B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111125184A (en) * 2019-11-23 2020-05-08 同济大学 Bus passenger flow dynamic monitoring method based on time sequence structural variable point identification

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011081491A (en) * 2009-10-05 2011-04-21 Nec Biglobe Ltd Time series analysis device, time series analysis method and program
CN104268660A (en) * 2014-10-13 2015-01-07 国家电网公司 Trend recognition method for electric power system predication-like data
JP2016045917A (en) * 2014-08-27 2016-04-04 株式会社日立ソリューションズ西日本 Device for tendency extraction and evaluation of time series data
CN107764458A (en) * 2017-09-25 2018-03-06 中国航空工业集团公司西安飞机设计研究所 A kind of aircraft handing characteristics curve generation method
CN108804731A (en) * 2017-09-12 2018-11-13 中南大学 Based on the dual evaluation points time series trend feature extracting method of vital point

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
桑夏夏;李旭伟;: "一种金融时间序列区域分割方法的研究" *
王炜炜;单杏花;: "基于时间序列聚类方法的小长假铁路客流规律研究" *

Also Published As

Publication number Publication date
CN109933607B (en) 2023-10-03

Similar Documents

Publication Publication Date Title
CN107103754B (en) Road traffic condition prediction method and system
CN105809292B (en) Bus IC card passenger getting off car website projectional technique
CN102102992B (en) Multistage network division-based preliminary screening method for matched roads and map matching system
CN109886997A (en) Method, apparatus and terminal device are determined based on the identification frame of target detection
JP4515332B2 (en) Image processing apparatus and target area tracking program
CN109917430B (en) Satellite positioning track drift correction method based on track smoothing algorithm
CN109034187B (en) User family work address mining process
CN112733904B (en) Water quality abnormity detection method and electronic equipment
CN109712393B (en) Intelligent traffic time interval division method based on Gaussian process regression algorithm
CN108241819A (en) The recognition methods of pavement markers and device
CN108647261A (en) Global isoplethes drawing method based on meteorological data discrete point gridding processing
CN109933607A (en) Periodical time series data processing method
CN113536127A (en) Data processing method based on big data and artificial intelligence and cloud server
CN108830403B (en) Visual analysis method for tobacco retail customer visiting path based on commercial value calculation
CN112652164B (en) Traffic time interval dividing method, device and equipment
CN108681741A (en) Based on the subway of IC card and resident's survey data commuting crowd's information fusion method
CN115984559B (en) Intelligent sample selection method and related device
CN107818415A (en) A kind of recognition methods of attending a school by taking daily trips based on subway brushing card data
CN115830514B (en) Whole river reach surface flow velocity calculation method and system suitable for curved river channel
CN113361885B (en) Dual-target urban public transportation benefit evaluation method based on multi-source data
CN116452845A (en) Bird fine granularity image classification method based on data enhancement
CN108446923A (en) A kind of task pricing method based on self-service labor service crowdsourcing platform
CN110475198B (en) Urban road user track deviation correction processing method and device
CN112862767A (en) Measurement learning-based surface defect detection method for solving difficult-to-differentiate unbalanced samples
CN113723981A (en) Rapid evaluation method and system for mass advertisement positions

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20230823

Address after: Room 203, 2nd Floor, Building 12, East District, No. 31 Jiaoda East Road, Haidian District, Beijing, 100044

Applicant after: WEINUO TIMES (BEIJING) TECHNOLOGY CO.,LTD.

Address before: 1602-16, 16th floor, innovation building, Southwest Jiaotong University, No. 111, North Section 1, 2nd Ring Road, Jinniu District, Chengdu, Sichuan 610000

Applicant before: SICHUAN QUANCHENG TIANYOU TECHNOLOGY Co.,Ltd.

GR01 Patent grant