CN110489451A - Stream computing method based on iterative statistics - Google Patents

Publication number
CN110489451A
Authority
CN
China
Prior art keywords
statistical
data
value
formula
flow
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910745061.3A
Other languages
Chinese (zh)
Inventor
谢刚
王灿
郭国彬
郑兴
赵轩
舒建国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Aircraft Industrial Group Co Ltd
Original Assignee
Chengdu Aircraft Industrial Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-08-13
Filing date
2019-08-13
Publication date
2019-11-22
Application filed by Chengdu Aircraft Industrial Group Co Ltd
Priority to CN201910745061.3A
Publication of CN110489451A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24568 Data stream processing; Continuous queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 Approximate or statistical queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Complex Calculations (AREA)

Abstract

The present invention relates to the field of data processing, and specifically to a stream computing method based on iterative statistics that avoids blocking of the computing process and improves the efficiency of computing statistical values. A data stream X of real-time data is obtained and split according to time; the statistical value of the first period's data X_1 is computed with a statistical formula G, giving an initial value f_1; starting from f_1 and using G, the statistic is computed iteratively over each successive period of X, giving the current statistical value f_m of the current data X_m; if m < n, step S3 continues, and if m = n, the final statistical value f_n = G_n is output, where n is the number of data segments into which X is split and m is the index of the current segment. The method does not need to compute all required statistics in one pass. By iterating the statistical formula in a loop, a large real-time computational load is decomposed across the individual periods, which avoids long blocking waits in the computing process and achieves efficient real-time computation.

Description

Stream computing method based on iterative statistics
Technical field
The present invention relates to the field of data processing, and specifically to a stream computing method based on iterative statistics that avoids blocking of the computing process and improves the efficiency of computing statistical values.
Background art
In a traditional data processing flow, data is first collected and then stored in a database (DB). When needed, people query the data through the DB to obtain answers or carry out related processing. Although this seems reasonable, the result is quite constrained, especially for certain problems in real-time search environments, where offline processing in the style of MapReduce cannot solve the problem well. Stream computing arose in response.
Stream computing loads the data stream into computer memory in chronological order and uses memory as the carrier for efficient real-time computation. Because it does not interact with external resources such as hard disks or the network during computation, it has high computational efficiency. In industrial manufacturing, data is generated in real time by large numbers of automated devices, and industrial big-data processing demands high real-time performance, with response times at the millisecond level. How to compute real-time statistics quickly over a given range and derive data characteristics is therefore a problem that must be solved. Traditional stream computing has to compute over all the data to be counted in one pass, which is very inefficient. To avoid long blocking waits in the computing process and achieve efficient computation, the large real-time computational load is decomposed across the individual periods, the idea of iterative statistics is introduced, and a stream computing method based on iterative statistics is designed.
Summary of the invention
The object of the invention is to provide a stream computing method based on iterative statistics that avoids blocking of the computing process and improves the efficiency of computing statistical values.
The invention is achieved through the following technical solution: a stream computing method based on iterative statistics, comprising the following steps:
Step S1: obtain the data stream X of real-time data, and split the data stream X according to time;
Step S2: compute the statistical value of the first period's data X_1 using the statistical formula G, obtaining the initial value f_1;
Step S3: starting from the initial value f_1 and using the statistical formula G, iterate through each period of the data stream X in turn by means of a loop structure, so as to obtain in real time the current statistical value f_m of the current data X_m;
Step S4: if m < n, continue executing step S3; if m = n, jump to step S5;
Step S5: output the final statistical value f_n = G_n, where f is the statistical value, n is the total number of data segments into which the data stream X is split, and m is the index of the data currently used in the computation.
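Steps S1 to S5 can be sketched as a small Python loop. This is a minimal illustration, not the patented implementation: the names `iterate_statistic` and `mean_update`, the callable signature `(f_prev, x, m)`, and the starting value `f0 = 0.0` are all assumptions introduced here, and the mean update is only one possible choice of G.

```python
from typing import Callable, Iterable, Iterator

# Hypothetical signature for the statistical formula G: (f_prev, x, m) -> f_m
Update = Callable[[float, float, int], float]


def iterate_statistic(stream: Iterable[float], g: Update, f0: float = 0.0) -> Iterator[float]:
    """Yield the running statistic f_m after each period's value X_m."""
    f = f0
    for m, x in enumerate(stream, start=1):
        f = g(f, x, m)  # step S3: f_m = G(f_(m-1), X_m, m)
        yield f         # the current statistic is available immediately


def mean_update(f_prev: float, x: float, m: int) -> float:
    """Assumed running-mean form of G; for m = 1 the previous value is ignored."""
    return ((m - 1) * f_prev + x) / m


# The loop ends when the stream is exhausted (m = n); the last value is f_n.
values = list(iterate_statistic([2.0, 4.0, 6.0], mean_update))
print(values)
```

Because the function is a generator, a consumer can read f_m as soon as X_m arrives instead of waiting for the whole stream.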
Further, to better realize the invention, the following arrangement is adopted in particular: in step S1, the data stream X is read in real time from a memory pipeline and is decomposed periodically in time into n segments, namely the data X_1, X_2, ... X_n.
Further, to better realize the invention, the following arrangement is adopted in particular: in step S2, the statistical formula G is an extreme-value (maximum or minimum) formula.
Further, to better realize the invention, the following arrangement is adopted in particular: the statistical formula G is a variance formula.
Further, to better realize the invention, the following arrangement is adopted in particular: the statistical formula G is a standard deviation formula.
Further, to better realize the invention, the following arrangement is adopted in particular: the statistical formula G is an expectation formula.
Further, to better realize the invention, the following arrangement is adopted in particular: when the statistical formula G is the expectation formula, the final output statistical value is f_n = G_n = E(X_n), where
\[ f_n = E(X_n) = \frac{(n-1)\,f_{n-1} + X_n}{n} \]
In the formula, n is the number of data items in the data stream X, f_n is the statistical value of all data in the data stream X, f_(n-1) is the statistical value excluding the latest data value, that is, the statistical value up to the previous data item X_(n-1), and X_n is the latest data value.
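One recurrence of the form f_n = G(f_(n-1), X_n, n) for the expectation can be checked against the ordinary arithmetic mean. The short derivation below assumes that E denotes the arithmetic mean of the n values X_1, ..., X_n; that reading is an assumption, chosen because it matches the count of 1 multiplication, 1 addition and 1 division per update given in the advantages.

\[
\begin{aligned}
f_n &= \frac{1}{n}\sum_{i=1}^{n} X_i
     = \frac{1}{n}\left(\sum_{i=1}^{n-1} X_i + X_n\right) \\
    &= \frac{(n-1)\,f_{n-1} + X_n}{n},
\qquad \text{since } \sum_{i=1}^{n-1} X_i = (n-1)\,f_{n-1}.
\end{aligned}
\]

The previous statistic f_(n-1) therefore carries everything needed from the first n-1 values, so none of them has to be kept in memory.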
Further, to better realize the invention, the following arrangement is adopted in particular: in step S3, the current statistical value is f_m = G_m = E(X_m), where
\[ f_m = E(X_m) = \frac{(m-1)\,f_{m-1} + X_m}{m} \]
In the formula, m is the current number of data items in the data stream X, f_m is the statistical value of the current data in the data stream X, f_(m-1) is the statistical value excluding the current data value, that is, the statistical value up to the previous data item X_(m-1), and X_m is the current data value.
Compared with the prior art, the invention has the following advantages and beneficial effects: the stream computing method based on iterative statistics does not need to compute all required statistics in one pass. By iterating the statistical formula in a loop structure, a large real-time computational load is decomposed across the individual periods, which avoids long blocking waits in the computing process and achieves efficient real-time computation.
In addition, the method improves algorithm efficiency and reduces space complexity. With the iterative statistical formula f_n = G(f_(n-1), X_n, n), only three variables stay resident in memory: n, X_n and f_n. When the data stream changes, these three variables are updated by iterative computation, and the space complexity remains O(1).
Furthermore, the method reduces time complexity. With a traditional statistical computation, each new data item in the stream requires n-1 additions and 1 division, n operations in total, so the time complexity is O(n); when n is very large the cost grows accordingly and the computing process may block. With the iterative method, following f_n = G(f_(n-1), X_n, n), each update needs only 1 multiplication, 1 addition and 1 division, 3 operations in total, so the time complexity remains O(1).
Description of the drawings
Fig. 1 is a flow diagram of the stream computing method based on iterative statistics of the invention;
Fig. 2 is a schematic diagram of the loop structure of the stream computing method based on iterative statistics of the invention.
Detailed description of the embodiments
Embodiments of the invention are described in detail below, with examples shown in the drawings, where the same or similar reference numerals throughout denote the same or similar elements, or elements with the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the invention and are not to be construed as limiting it.
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of this application. The components of the embodiments generally described and illustrated in the drawings can be arranged and designed in a variety of different configurations.
Therefore, the following detailed description of the embodiments provided in the drawings is not intended to limit the claimed scope of this application, but merely represents selected embodiments. All other embodiments obtained by a person skilled in the art on the basis of these embodiments without creative effort fall within the scope of protection of this application.
Embodiment 1:
The invention is achieved through the following technical solution. As shown in Fig. 1 and Fig. 2, the stream computing method based on iterative statistics comprises the following steps:
Step S1: obtain the data stream X of real-time data, and split the data stream X according to time;
Step S2: compute the statistical value of the first period's data X_1 using the statistical formula G, obtaining the initial value f_1;
Step S3: starting from the initial value f_1 and using the statistical formula G, iterate through each period of the data stream X in turn by means of a loop structure, so as to obtain in real time the current statistical value f_m of the current data X_m;
Step S4: if m < n, continue executing step S3; if m = n, jump to step S5;
Step S5: output the final statistical value f_n = G_n, where f is the statistical value, n is the total number of data segments into which the data stream X is split, and m is the index of the data currently used in the computation.
The stream computing method based on iterative statistics does not need to compute all required statistics in one pass. By iterating the statistical formula in a loop structure, a large real-time computational load is decomposed across the individual periods, which avoids long blocking waits in the computing process and achieves efficient real-time computation.
In addition, the method improves algorithm efficiency and reduces space complexity. With the iterative statistical formula f_n = G(f_(n-1), X_n, n), only three variables stay resident in memory: n, X_n and f_n. When the data stream changes, these three variables are updated by iterative computation, and the space complexity remains O(1).
Furthermore, the method reduces time complexity. With a traditional statistical computation, each new data item in the stream requires n-1 additions and 1 division, n operations in total, so the time complexity is O(n); when n is very large the cost grows accordingly and the computing process may block. With the iterative method, following f_n = G(f_(n-1), X_n, n), each update needs only 1 multiplication, 1 addition and 1 division, 3 operations in total, so the time complexity remains O(1).
When performing stream computing, the data stream X of real-time data is first obtained and split along the time series at period intervals, so that X is split into as many data segments X_1, X_2, ... X_n as there are periods. The statistical formula G is then applied to the first period's data X_1 to obtain the first period's statistical value f_1. With f_1 as the initial value, the next period's data X_2 is used to compute the second period's statistical value f_2 iteratively: the initial iteration is f_1 = G(f_0, X_1, 1) for n = 1, and f_2 = G(f_1, X_2, 2) for n = 2. The statistical value f_m of the current data X_m is computed in turn in the same way. The method then checks whether m equals the final number of periods n: if m < n, iteration continues; if m = n, the final statistical value f_n is output. There is no need to compute all required statistics in one pass. By iterating the statistical formula in a loop structure, a large real-time computational load is decomposed across the individual periods, which avoids long blocking waits in the computing process and achieves efficient real-time computation.
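The walk-through above, including the three resident variables and the m = n check of step S4, can be sketched as a small stateful object. This is an assumption-laden illustration: the class name `RunningMean`, the function `run`, and the use of the arithmetic mean as G are all introduced here and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class RunningMean:
    m: int = 0      # number of values consumed so far
    x: float = 0.0  # latest value X_m
    f: float = 0.0  # current statistic f_m (f_0 = 0.0 before any data)

    def push(self, x: float) -> float:
        """Consume one period's value and update the three resident variables."""
        self.m += 1
        self.x = x
        self.f = ((self.m - 1) * self.f + x) / self.m
        return self.f


def run(periods: Iterable[float], n: int) -> float:
    """Iterate until n periods have been consumed, then return f_n."""
    state = RunningMean()
    for x in periods:     # step S3: one iteration per period
        state.push(x)
        if state.m == n:  # step S4: m = n, so move on to step S5
            break
    return state.f        # step S5: output the final statistic


result = run([10.0, 20.0, 60.0, 999.0], n=3)
print(result)
```

Only `m`, `x` and `f` are kept between iterations, so memory use does not grow with the length of the stream; the fourth value in the usage example is never read because the loop stops at m = n.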
Embodiment 2:
This embodiment is a further optimization of the above embodiment. In step S1, the data stream X is read in real time from a memory pipeline and is decomposed periodically in time into n segments, namely the data X_1, X_2, ... X_n, where n is both the number of periods and the number of data segments into which the data stream X is split.
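The periodic split in step S1 might look like the following sketch. The record layout `(timestamp, value)`, the function name `split_by_period`, and the fixed period length are assumptions made for illustration; the source does not specify how timestamps are represented or how the memory pipeline is read.

```python
from itertools import groupby
from typing import Iterable, Iterator, List, Tuple


def split_by_period(records: Iterable[Tuple[float, float]], period: float) -> Iterator[List[float]]:
    """Group time-ordered (timestamp, value) records into consecutive periods."""
    # Records must arrive in time order; groupby only merges adjacent items.
    for _, group in groupby(records, key=lambda r: int(r[0] // period)):
        yield [value for _, value in group]


records = [(0.0, 1.0), (0.4, 2.0), (1.1, 3.0), (1.9, 4.0), (2.5, 5.0)]
segments = list(split_by_period(records, period=1.0))
print(segments)
```

The generator produces each segment as soon as the first record of the next period arrives, so it can sit directly in front of the iterative update. Periods that contain no records are simply skipped in this sketch.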
Embodiment 3:
This embodiment is a further optimization of the above embodiment. In step S2, the statistical formula G is an extreme-value formula, a variance formula, a standard deviation formula or an expectation formula. The statistical formula G can be chosen according to actual needs: it may be a variance formula, a standard deviation formula, an expectation formula, an extreme-value formula, or another formula.
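The source names the extreme-value, variance and standard deviation formulas but does not give their iterative forms. As a hedged sketch, the running maximum below fits the G(f_(m-1), X_m, m) shape directly, while the variance uses Welford's online algorithm, a well-known technique substituted here as an assumption; it keeps one extra accumulator beyond the three variables the text mentions. The names `max_update` and `RunningVariance` are hypothetical.

```python
import math


def max_update(f_prev: float, x: float, m: int) -> float:
    """Running maximum; the first value initializes the statistic."""
    return x if m == 1 else max(f_prev, x)


class RunningVariance:
    """Welford's online algorithm for mean, population variance and std."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        return self.m2 / self.n if self.n else 0.0  # population variance

    @property
    def std(self) -> float:
        return math.sqrt(self.variance)


rv = RunningVariance()
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    rv.update(value)
print(rv.mean, rv.variance, rv.std)
```

Each update is a constant number of arithmetic operations, so the O(1) time and space argument made for the expectation carries over to these statistics.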
Embodiment 4:
This embodiment is a further optimization of the above embodiment. When the statistical formula G is the expectation formula, the final output statistical value is f_n = G_n = E(X_n), where
\[ f_n = E(X_n) = \frac{(n-1)\,f_{n-1} + X_n}{n} \]
In the formula, n is the number of data items in the data stream X, f_n is the statistical value of all data in the data stream X, f_(n-1) is the statistical value excluding the latest data value, that is, the statistical value up to the previous data item X_(n-1), and X_n is the latest data value.
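A small worked example may help. It uses made-up values X_1 = 4, X_2 = 8, X_3 = 3 and assumes that the expectation is the arithmetic mean, updated as f_n = ((n-1) f_(n-1) + X_n) / n with f_0 = 0:

\[
\begin{aligned}
f_1 &= \frac{0 \cdot f_0 + 4}{1} = 4, \\
f_2 &= \frac{1 \cdot 4 + 8}{2} = 6, \\
f_3 &= \frac{2 \cdot 6 + 3}{3} = 5 = \frac{4 + 8 + 3}{3}.
\end{aligned}
\]

The last line agrees with the mean computed directly from all three values, even though only f_2 and X_3 were needed to obtain it.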
Embodiment 5:
This embodiment is a further optimization of the above embodiment. In step S3, taking the expectation formula as the statistical formula G, the current statistical value is f_m = G_m = E(X_m), where
\[ f_m = E(X_m) = \frac{(m-1)\,f_{m-1} + X_m}{m} \]
In the formula, m is the current number of data items in the data stream X, f_m is the statistical value of the current data in the data stream X, f_(m-1) is the statistical value excluding the current data value, that is, the statistical value up to the previous data item X_(m-1), and X_m is the current data value. When performing stream computing, the data stream X of real-time data is first obtained and split along the time series at period intervals, so that X is split into as many data segments X_1, X_2, ... X_n as there are periods. The statistical formula G is then applied to the first period's data X_1 to obtain the first period's statistical value f_1. With f_1 as the initial value, the next period's data X_2 is used to compute the second period's statistical value f_2 iteratively, and the statistical value f_m of the current data X_m is computed in turn in the same way. The method then checks whether m equals the final number of periods n: if m < n, iteration continues; if m = n, the final statistical value f_n is output. There is no need to compute all required statistics in one pass. By iterating the statistical formula in a loop structure, a large real-time computational load is decomposed across the individual periods, which avoids long blocking waits in the computing process and achieves efficient real-time computation.
The above are only preferred embodiments of the invention and do not limit it in any form. Any simple modification or equivalent variation made to the above embodiments in accordance with the technical essence of the invention falls within the scope of protection of the invention.

Claims (8)

1. A stream computing method based on iterative statistics, characterized by comprising the following steps:
Step S1: obtain the data stream X of real-time data, and split the data stream X according to time;
Step S2: compute the statistical value of the first period's data X_1 using the statistical formula G, obtaining the initial value f_1;
Step S3: starting from the initial value f_1 and using the statistical formula G, iterate through each period of the data stream X in turn by means of a loop structure, so as to obtain in real time the current statistical value f_m of the current data X_m;
Step S4: if m < n, continue executing step S3; if m = n, jump to step S5;
Step S5: output the final statistical value f_n = G_n, where f is the statistical value, n is the total number of data segments into which the data stream X is split, and m is the index of the data currently used in the computation.
2. The stream computing method based on iterative statistics according to claim 1, characterized in that: in step S1, the data stream X is read in real time from a memory pipeline and is decomposed periodically in time into n segments, namely the data X_1, X_2, ... X_n.
3. The stream computing method based on iterative statistics according to claim 2, characterized in that: in step S2, the statistical formula G is an extreme-value (maximum or minimum) formula.
4. The stream computing method based on iterative statistics according to claim 2, characterized in that: the statistical formula G is a variance formula.
5. The stream computing method based on iterative statistics according to claim 2, characterized in that: the statistical formula G is a standard deviation formula.
6. The stream computing method based on iterative statistics according to claim 2, characterized in that: the statistical formula G is an expectation formula.
7. The stream computing method based on iterative statistics according to claim 6, characterized in that: when the statistical formula G is the expectation formula, the final output statistical value is f_n = G_n = E(X_n), where
\[ f_n = E(X_n) = \frac{(n-1)\,f_{n-1} + X_n}{n} \]
In the formula, n is the number of data items in the data stream X, f_n is the statistical value of all data in the data stream X, f_(n-1) is the statistical value excluding the latest data value, that is, the statistical value up to the previous data item X_(n-1), and X_n is the latest data value.
8. The stream computing method based on iterative statistics according to claim 7, characterized in that: in step S3, the current statistical value is f_m = G_m = E(X_m), where
\[ f_m = E(X_m) = \frac{(m-1)\,f_{m-1} + X_m}{m} \]
In the formula, m is the current number of data items in the data stream X, f_m is the statistical value of the current data in the data stream X, f_(m-1) is the statistical value excluding the current data value, that is, the statistical value up to the previous data item X_(m-1), and X_m is the current data value.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910745061.3A CN110489451A (en) 2019-08-13 2019-08-13 Stream computing method based on iterative statistics

Publications (1)

Publication Number Publication Date
CN110489451A 2019-11-22

Family

ID=68549721

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910745061.3A Pending CN110489451A (en) 2019-08-13 2019-08-13 Stream computing method based on iterative statistics

Country Status (1)

Country Link
CN (1) CN110489451A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8055970B1 (en) * 2005-11-14 2011-11-08 Raytheon Company System and method for parallel processing of data integrity algorithms
CN104267939A (en) * 2014-09-17 2015-01-07 华为技术有限公司 Business processing method, device and system
CN104915247A (en) * 2015-04-29 2015-09-16 上海瀚银信息技术有限公司 Real time data calculation method and system
CN108256045A (en) * 2018-01-12 2018-07-06 福建星瑞格软件有限公司 The structuring parsing of real-time streaming data, the method and computer equipment of stream calculation
CN108804781A (en) * 2018-05-25 2018-11-13 武汉大学 The geographical process near real-time analogy method that stream calculation is integrated with Sensor Network
CN109542946A (en) * 2018-10-26 2019-03-29 贵州斯曼特信息技术开发有限责任公司 It is a kind of to calculate big data system and method in real time
CN109800129A (en) * 2019-01-17 2019-05-24 青岛特锐德电气股份有限公司 A kind of real-time stream calculation monitoring system and method for processing monitoring big data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 2019-11-22