CN107193782A - Outlier rejection and correction method based on polynomial fitting - Google Patents

Outlier rejection and correction method based on polynomial fitting

Info

Publication number
CN107193782A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710253952.8A
Other languages
Chinese (zh)
Inventor
郭嵩
李斌
万涛
张伟
何晋秋
李霖
潘慧
佘莹莹
徐侃
王磊
李金�
余良甫
管阳
赵寅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
719th Research Institute of CSIC
Original Assignee
719th Research Institute of CSIC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 719th Research Institute of CSIC
Priority to CN201710253952.8A
Publication of CN107193782A
Legal status: Pending

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/17 Function evaluation by approximation methods, e.g. inter- or extrapolation, smoothing, least mean square method
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis

Abstract

The invention discloses an outlier rejection and correction method based on polynomial fitting, comprising the step of fitting a polynomial of degree n to the original measurement data to obtain the coefficient matrix and the fitting polynomial. A rough scatter plot is drawn from the known observation data, and a suitable degree n is chosen for least-squares polynomial fitting. For the given measurement data $(x_i, y_i)$, a function $p(x)$ is constructed as an approximate expression of the data such that the sum of squares of the errors $r_i = p(x_i) - y_i$ is minimized, i.e. $\sum_{i=0}^{m} r_i^2 = \sum_{i=0}^{m} [p(x_i) - y_i]^2 = \min$, where i is an integer from 0 to m. Based on polynomial fitting, the method lets a computer automatically remove outliers from measurement data: outliers in the observation data sequence are identified and rejected using the residual sequence between the fitted estimates and the observations, which is of significant value for practical engineering applications.

Description

Outlier rejection and correction method based on polynomial fitting
Technical field
The present invention relates to an outlier rejection and correction method based on polynomial fitting, applicable to measurement and control (TT&C) systems in fields such as communication and navigation.
Background art
Measurement data in communication, navigation and similar applications usually contain a number of data points that deviate substantially from the true measured value; these abnormal data are called outliers. Although outliers are few in number, they have a large effect on data processing and analysis and reduce the reliability of the data. Some filtering methods can reject outliers to a certain extent, but if the parameters are poorly chosen, the processed result may be so distorted that it is unconvincing, or the desired smoothing effect may not be achieved. Therefore, before the data are smoothed, outliers in the measurement data should first be effectively identified and rejected. Outliers in test data can be identified and rejected either manually or automatically by computer. The manual approach is relatively successful at judging abnormal values with obvious errors, but it is very inefficient and its criteria are hard to apply consistently, a shortcoming that is especially evident when the data volume is large.
Summary of the invention
To overcome the above shortcomings of the prior art, the present invention provides an outlier rejection method based on polynomial fitting, by which a computer automatically removes outliers from measurement data. Outliers in the observation data sequence are identified and rejected using the residual sequence between the fitted estimates and the observations, which is of significant value for practical engineering applications.
To achieve the above object, the present invention adopts the following technical solution: an outlier rejection method based on polynomial fitting, comprising the steps of:
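1. Fit a polynomial of degree n to the original measurement data to obtain the coefficient matrix and the fitting polynomial. Draw a rough scatter plot from the known observation data and choose a suitable degree n for least-squares polynomial fitting. For the given measurement data $(x_i, y_i)$, construct a function $p(x)$ as an approximate expression of the data such that the sum of squares of the errors $r_i = p(x_i) - y_i$ is minimized, i.e.
$$\sum_{i=0}^{m} r_i^2 = \sum_{i=0}^{m} \left[ p(x_i) - y_i \right]^2 = \min,$$
where i is an integer from 0 to m.
Geometrically, this means finding the curve $y = p(x)$ for which the sum of squared distances to the given points $(x_i, y_i)$ is minimal. The function $p(x)$ is the fitting function or least-squares solution, and the method of finding $p(x)$ is the least-squares method of curve fitting. When the fitting function is a polynomial, i.e. when
$$p_n(x) = \sum_{k=0}^{n} a_k x^k \quad (n \le m),$$
it is the least-squares fitting polynomial.
$$I = \sum_{i=0}^{m} \left[ \sum_{k=0}^{n} a_k x_i^k - y_i \right]^2$$
is a multivariate function of $a_0, a_1, \ldots, a_n$, so the extremum of $I = I(a_0, a_1, \ldots, a_n)$ is sought. From the necessary condition for an extremum of a multivariate function,
$$\frac{\partial I}{\partial a_j} = 2 \sum_{i=0}^{m} \left( \sum_{k=0}^{n} a_k x_i^k - y_i \right) x_i^j = 0, \quad j = 0, 1, \ldots, n,$$
i.e.
$$\sum_{k=0}^{n} \left( \sum_{i=0}^{m} x_i^{j+k} \right) a_k = \sum_{i=0}^{m} x_i^j y_i, \quad j = 0, 1, \ldots, n,$$
which is a system of linear equations in $a_0, a_1, \ldots, a_n$, expressed in matrix form as
$$\begin{bmatrix} m+1 & \sum_{i=0}^{m} x_i & \cdots & \sum_{i=0}^{m} x_i^n \\ \sum_{i=0}^{m} x_i & \sum_{i=0}^{m} x_i^2 & \cdots & \sum_{i=0}^{m} x_i^{n+1} \\ \vdots & \vdots & & \vdots \\ \sum_{i=0}^{m} x_i^n & \sum_{i=0}^{m} x_i^{n+1} & \cdots & \sum_{i=0}^{m} x_i^{2n} \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{bmatrix} = \begin{bmatrix} \sum_{i=0}^{m} y_i \\ \sum_{i=0}^{m} x_i y_i \\ \vdots \\ \sum_{i=0}^{m} x_i^n y_i \end{bmatrix}$$
This formula fits a polynomial of degree n to the original measurement data and yields the coefficients $a_0, a_1, \ldots, a_n$, from which the fitting polynomial $p_n(x) = \sum_{k=0}^{n} a_k x^k$ is obtained, together with the corresponding fitted value sequence and residual sequence;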
2. Calculate the corresponding fitted value sequence $\{p_i : i = 1, 2, \ldots, m\}$ and generate the fitting residual sequence $\{\Delta y_i = p_i - y_i,\ i = 1, 2, \ldots, m\}$;
3. Calculate the mean square error σ of the fitting residual sequence as follows:
4. Use the 3σ criterion commonly applied in engineering to judge and reject outliers; the data after outlier rejection are $\{y_i' : i = 1, 2, \ldots, m\}$:
$$y_i' = \begin{cases} y_i, & |\Delta y_i| < 3\sigma \\ \text{mean of the six points preceding } y_i, & |\Delta y_i| \ge 3\sigma \end{cases}$$
Judge according to this formula: if the residual is less than the threshold, the point is a normal value and is left unchanged; if the residual is greater than or equal to the threshold, the point is judged to be an outlier and its value is replaced with the mean of the six preceding points;
5. Check whether all data have been processed; if not, perform the outlier judgment on the data that have not yet been processed;
6. After the outlier judgment has been performed on all data, output the data with outliers rejected.
In the above technical solution, the 3σ threshold can be modified as appropriate for the specific experimental conditions.
The beneficial effects of the invention are as follows. Based on polynomial fitting, the invention identifies and rejects outliers in the observation data sequence using the residual sequence between the fitted estimates and the observations, which is of significant value for practical engineering applications. Compared with existing methods, it does not depend on design experience and is therefore more convenient. The 3σ threshold can be changed as appropriate for the specific experimental conditions, so the designed parameter is more accurate and the range of application is wider. Filling in a replacement value after an outlier is rejected preserves the continuity of the data, meets the point-selection requirements of data processing, and provides a preliminary filtering of the data source.
Brief description of the drawings
Fig. 1 is a flow chart of the steps of the invention.
Fig. 2 is a plot of the original measurement data.
Fig. 3 is the fitted curve obtained by degree-n polynomial fitting of the data in Fig. 2.
Fig. 4 is a plot of the residual sequence.
Fig. 5 is a plot of the data after outlier rejection.
Detailed description
The invention is further described below with reference to the accompanying drawings and a specific embodiment.
As shown in Fig. 1, an outlier rejection and correction method based on polynomial fitting comprises the following steps:
Step 1: Fit a polynomial of degree n to the original measurement data to obtain the coefficient matrix and the fitting polynomial;
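From the known observation data (i = 0, 1, ..., m), draw a rough plot (a scatter plot) and choose a suitable degree n for least-squares polynomial fitting;
For the given measurement data $(x_i, y_i)$ (i = 0, 1, ..., m), construct a function $p(x)$ as an approximate expression of the data such that the sum of squares of the errors $r_i = p(x_i) - y_i$ (i = 0, 1, ..., m) is minimized, i.e.
$$\sum_{i=0}^{m} r_i^2 = \sum_{i=0}^{m} \left[ p(x_i) - y_i \right]^2 = \min \tag{1}$$
Geometrically, this means finding the curve $y = p(x)$ for which the sum of squared distances to the given points $(x_i, y_i)$ (i = 0, 1, ..., m) is minimal. The function $p(x)$ is called the fitting function or least-squares solution, and the method of finding $p(x)$ is called the least-squares method of curve fitting. When the fitting function is a polynomial, i.e. when $p_n(x) = \sum_{k=0}^{n} a_k x^k$ (n ≤ m), it is called the least-squares fitting polynomial. Clearly,
$$I = \sum_{i=0}^{m} \left[ \sum_{k=0}^{n} a_k x_i^k - y_i \right]^2 \tag{2}$$
is a multivariate function of $a_0, a_1, \ldots, a_n$, so the above problem is that of finding the extremum of $I = I(a_0, a_1, \ldots, a_n)$. The necessary condition for an extremum of a multivariate function gives
$$\frac{\partial I}{\partial a_j} = 2 \sum_{i=0}^{m} \left( \sum_{k=0}^{n} a_k x_i^k - y_i \right) x_i^j = 0, \quad j = 0, 1, \ldots, n \tag{3}$$
i.e.
$$\sum_{k=0}^{n} \left( \sum_{i=0}^{m} x_i^{j+k} \right) a_k = \sum_{i=0}^{m} x_i^j y_i, \quad j = 0, 1, \ldots, n \tag{4}$$
Formula (4) is a system of linear equations in $a_0, a_1, \ldots, a_n$, expressed in matrix form as
$$\begin{bmatrix} m+1 & \sum_{i=0}^{m} x_i & \cdots & \sum_{i=0}^{m} x_i^n \\ \sum_{i=0}^{m} x_i & \sum_{i=0}^{m} x_i^2 & \cdots & \sum_{i=0}^{m} x_i^{n+1} \\ \vdots & \vdots & & \vdots \\ \sum_{i=0}^{m} x_i^n & \sum_{i=0}^{m} x_i^{n+1} & \cdots & \sum_{i=0}^{m} x_i^{2n} \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{bmatrix} = \begin{bmatrix} \sum_{i=0}^{m} y_i \\ \sum_{i=0}^{m} x_i y_i \\ \vdots \\ \sum_{i=0}^{m} x_i^n y_i \end{bmatrix} \tag{5}$$
Formula (5) fits a polynomial of degree n to the original measurement data and yields the coefficient matrix. With the coefficients $a_0, a_1, \ldots, a_n$ obtained, the fitting polynomial is
$$p_n(x) = \sum_{k=0}^{n} a_k x^k \tag{6}$$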
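The normal equations in matrix form can be sketched in a few lines of Python. This is an illustrative sketch rather than the patent's own implementation: the function name `fit_polynomial_normal_equations` and the use of NumPy are assumptions made for the example.

```python
import numpy as np

def fit_polynomial_normal_equations(x, y, n):
    """Return coefficients a_0..a_n of the degree-n least-squares polynomial."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # One row per data point; columns are x^0, x^1, ..., x^n.
    V = np.vander(x, n + 1, increasing=True)
    # (V^T V) holds the sums of x_i^(j+k); (V^T y) holds the sums of x_i^j * y_i.
    G = V.T @ V
    b = V.T @ y
    # Solve the (n+1) x (n+1) linear system for the coefficients.
    return np.linalg.solve(G, b)

# Example: data lying exactly on 1 + 2x + 3x^2.
x = np.arange(6.0)
y = 1.0 + 2.0 * x + 3.0 * x**2
coeffs = fit_polynomial_normal_equations(x, y, 2)
```

The normal-equation matrix tends to become ill-conditioned as n grows, so in practice a solver such as `np.linalg.lstsq` applied directly to `V` is a common alternative.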
Step 2: Obtain the corresponding fitted value sequence and residual sequence;
Calculate the corresponding fitted value sequence $\{p_i : i = 1, 2, \ldots, m\}$ and generate the fitting residual sequence $\{\Delta y_i = p_i - y_i,\ i = 1, 2, \ldots, m\}$;
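As a small illustration of this step, the sketch below evaluates a fitted polynomial at the sample points and forms the residuals with the sign convention Δy_i = p_i − y_i used in the text. The helper name `fitting_residuals` and the sample numbers are assumptions made for the example.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def fitting_residuals(x, y, coeffs):
    """Return (fitted values p_i, residuals p_i - y_i) for coefficients a_0..a_n."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # polyval expects coefficients in increasing order of power.
    fitted = P.polyval(x, coeffs)
    return fitted, fitted - y

# Illustrative data: the last point sits 3 above the line 1 + 2x.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 10.0])
fitted, residuals = fitting_residuals(x, y, [1.0, 2.0])
```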
Step 3: Calculate the mean square error
The mean square error σ of the fitting residual sequence is calculated as follows:
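One common way to compute such a σ is the root mean square of the residuals. The sketch below uses that form purely as an assumption: the exact expression intended here (for example the divisor, or whether the residual mean is subtracted first) is not recoverable from the text, and the function name `residual_sigma` is hypothetical.

```python
import numpy as np

def residual_sigma(residuals):
    """Root mean square of the residuals (assumed form of sigma)."""
    r = np.asarray(residuals, dtype=float)
    # Assumption: divide by the number of residuals and do not subtract the mean.
    return float(np.sqrt(np.mean(r**2)))

sigma = residual_sigma([3.0, -4.0, 0.0, 0.0])
```

A different divisor changes σ only slightly for long sequences, but it does shift the 3σ threshold, so the choice is worth checking against the data at hand.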
Step 4: Judge outliers
Use the 3σ criterion commonly applied in engineering to judge and reject outliers; the data after outlier rejection are $\{y_i' : i = 1, 2, \ldots, m\}$:
$$y_i' = \begin{cases} y_i, & |\Delta y_i| < 3\sigma \\ \text{mean of the six points preceding } y_i, & |\Delta y_i| \ge 3\sigma \end{cases} \tag{8}$$
Judge according to formula (8): if the residual is less than the threshold, the point is a normal value and is left unchanged; if it is greater than or equal to the threshold, the point is judged to be an outlier and is replaced with the mean of the six preceding points.
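The judgment and replacement rule can be sketched as follows. Several details are assumptions, because the text does not specify them: the replacement mean is taken over already-corrected values, fewer than six preceding points are averaged when an outlier occurs near the start, and an outlier at the very first point is left unchanged. The name `replace_outliers` and the parameters `k` and `window` are hypothetical.

```python
import numpy as np

def replace_outliers(y, residuals, sigma, k=3.0, window=6):
    """Apply the k-sigma test and replace outliers with the mean of preceding points."""
    y = np.asarray(y, dtype=float)
    residuals = np.asarray(residuals, dtype=float)
    corrected = y.copy()
    # A point is an outlier when |residual| >= k * sigma (k = 3 in the text).
    is_outlier = np.abs(residuals) >= k * sigma
    for i in np.flatnonzero(is_outlier):
        start = max(0, i - window)
        # Assumption: use whatever preceding points exist, up to `window` of them.
        if start < i:
            corrected[i] = corrected[start:i].mean()
    return corrected, is_outlier

# Illustrative data: a single spike at index 6, with sigma supplied directly.
y = np.array([1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 50.0, 1.0])
residuals = 1.0 - y
corrected, is_outlier = replace_outliers(y, residuals, sigma=10.0)
```

Exposing `k` as a parameter reflects the statement that the 3σ threshold may be adjusted to the experimental conditions.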
Step 5: Check whether all data have been processed; if not, perform the outlier judgment on the data that have not yet been processed;
Step 6: Once the outlier judgment has been performed on all data, output the data with outliers rejected. The whole flow then ends.
In the above technical solution, the 3σ threshold can be modified as appropriate for the specific experimental conditions.
In step 1, a polynomial of degree n is fitted to the original measurement data to obtain the coefficient matrix and the fitting polynomial; in step 2, the corresponding fitted value sequence and residual sequence are obtained; in step 3, the mean square error is calculated; in step 4, outliers are judged and rejected using the 3σ criterion; in step 5, it is checked whether all data have been processed and, if not, the outlier judgment is performed on the data not yet processed; in step 6, the data with outliers rejected are output and the flow ends. This flow is suitable for automatic discrimination by computer.
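The six steps can be combined into one short routine, sketched here under stated assumptions. The fit uses NumPy's `Polynomial.fit` (a scaled least-squares fit) in place of the explicit normal equations, σ is taken as the root mean square of the residuals, and the data are synthetic, generated only for this illustration. The name `reject_and_correct` is hypothetical.

```python
import numpy as np
from numpy.polynomial import Polynomial

def reject_and_correct(x, y, degree, k=3.0, window=6):
    """Fit, compute residuals and sigma, then replace k-sigma outliers."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    poly = Polynomial.fit(x, y, degree)        # step 1: least-squares polynomial
    residuals = poly(x) - y                    # step 2: p_i - y_i
    sigma = np.sqrt(np.mean(residuals**2))     # step 3: assumed RMS form of sigma
    corrected = y.copy()
    flagged = []
    for i in range(len(y)):                    # steps 4 and 5: test every point
        if abs(residuals[i]) >= k * sigma:
            flagged.append(i)
            start = max(0, i - window)
            if start < i:                      # assumption: use available points
                corrected[i] = corrected[start:i].mean()
    return corrected, flagged                  # step 6: output corrected data

# Synthetic example (not measured data): a smooth curve with two injected spikes.
x = np.linspace(0.0, 10.0, 200)
clean = 0.5 + 0.2 * x - 0.01 * x**2
y = clean.copy()
y[50] += 5.0
y[120] -= 5.0
corrected, flagged = reject_and_correct(x, y, degree=2)
```

Because outliers also pull on the fit and inflate σ, a single pass can miss smaller outliers when large ones are present; repeating the routine on the corrected data is one possible refinement, though the text does not describe it.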
Taking actually measured current speed magnitude data as an example, 800 consecutive data points are selected for outlier rejection. The original measurement data are shown in Fig. 2, the curve obtained by degree-n polynomial fitting is shown in Fig. 3, the residual sequence is shown in Fig. 4, and the output data after outlier rejection are shown in Fig. 5.
Based on polynomial fitting, the invention identifies and rejects outliers in the observation data sequence using the residual sequence between the fitted estimates and the observations, which is of significant value for practical engineering applications. Compared with existing methods, it does not depend on design experience and is therefore more convenient. The 3σ threshold can be changed as appropriate for the specific experimental conditions, so the designed parameter is more accurate and the range of application is wider. Filling in a replacement value after an outlier is rejected preserves the continuity of the data, meets the point-selection requirements of data processing, and provides a preliminary filtering of the data source.
The foregoing is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the invention shall fall within the scope of protection of the present invention.

Claims (2)

1. An outlier rejection and correction method based on polynomial fitting, characterized by comprising the steps of:
(1) Fitting a polynomial of degree n to the original measurement data to obtain the coefficient matrix and the fitting polynomial; drawing a rough scatter plot from the known observation data and choosing a suitable degree n for least-squares polynomial fitting; for the given measurement data $(x_i, y_i)$, constructing a function $p(x)$ as an approximate expression of the data such that the sum of squares of the errors $r_i = p(x_i) - y_i$ is minimized, i.e.
$$\sum_{i=0}^{m} r_i^2 = \sum_{i=0}^{m} \left[ p(x_i) - y_i \right]^2 = \min,$$
where i is an integer from 0 to m;
geometrically, this means finding the curve $y = p(x)$ for which the sum of squared distances to the given points $(x_i, y_i)$ is minimal; the function $p(x)$ is the fitting function or least-squares solution, and the method of finding $p(x)$ is the least-squares method of curve fitting; when the fitting function is a polynomial, i.e. when
$$p_n(x) = \sum_{k=0}^{n} a_k x^k \quad (n \le m),$$
it is the least-squares fitting polynomial;
$$I = \sum_{i=0}^{m} \left[ \sum_{k=0}^{n} a_k x_i^k - y_i \right]^2$$
is a multivariate function of $a_0, a_1, \ldots, a_n$, and the extremum of $I = I(a_0, a_1, \ldots, a_n)$ is sought; from the necessary condition for an extremum of a multivariate function,
$$\frac{\partial I}{\partial a_j} = 2 \sum_{i=0}^{m} \left( \sum_{k=0}^{n} a_k x_i^k - y_i \right) x_i^j = 0, \quad j = 0, 1, \ldots, n,$$
i.e.
$$\sum_{k=0}^{n} \left( \sum_{i=0}^{m} x_i^{j+k} \right) a_k = \sum_{i=0}^{m} x_i^j y_i, \quad j = 0, 1, \ldots, n,$$
which is a system of linear equations in $a_0, a_1, \ldots, a_n$, expressed in matrix form as
$$\begin{bmatrix} m+1 & \sum_{i=0}^{m} x_i & \cdots & \sum_{i=0}^{m} x_i^n \\ \sum_{i=0}^{m} x_i & \sum_{i=0}^{m} x_i^2 & \cdots & \sum_{i=0}^{m} x_i^{n+1} \\ \vdots & \vdots & & \vdots \\ \sum_{i=0}^{m} x_i^n & \sum_{i=0}^{m} x_i^{n+1} & \cdots & \sum_{i=0}^{m} x_i^{2n} \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{bmatrix} = \begin{bmatrix} \sum_{i=0}^{m} y_i \\ \sum_{i=0}^{m} x_i y_i \\ \vdots \\ \sum_{i=0}^{m} x_i^n y_i \end{bmatrix}$$
this formula fits a polynomial of degree n to the original measurement data and yields the coefficients $a_0, a_1, \ldots, a_n$, from which the fitting polynomial $p_n(x) = \sum_{k=0}^{n} a_k x^k$ is obtained, together with the corresponding fitted value sequence and residual sequence;
(2) Calculating the corresponding fitted value sequence $\{p_i : i = 1, 2, \ldots, m\}$ and generating the fitting residual sequence $\{\Delta y_i = p_i - y_i,\ i = 1, 2, \ldots, m\}$;
(3) Calculating the mean square error σ of the fitting residual sequence as follows:
(4) Using the 3σ criterion commonly applied in engineering to judge and reject outliers, the data after outlier rejection being $\{y_i' : i = 1, 2, \ldots, m\}$:
$$y_i' = \begin{cases} y_i, & |\Delta y_i| < 3\sigma \\ \text{mean of the six points preceding } y_i, & |\Delta y_i| \ge 3\sigma \end{cases}$$
judging according to this formula: if the residual is less than the threshold, the point is a normal value and is left unchanged; if the residual is greater than or equal to the threshold, the point is judged to be an outlier and its value is replaced with the mean of the six preceding points;
(5) Checking whether all data have been processed and, if not, performing the outlier judgment on the data that have not yet been processed;
(6) After the outlier judgment has been performed on all data, outputting the data with outliers rejected.
2. The outlier rejection and correction method based on polynomial fitting according to claim 1, characterized in that the 3σ threshold can be modified as appropriate for the specific experimental conditions.
CN201710253952.8A 2017-04-18 2017-04-18 Outlier rejection and correction method based on polynomial fitting Pending CN107193782A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710253952.8A CN107193782A (en) 2017-04-18 2017-04-18 Outlier rejection and correction method based on polynomial fitting

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710253952.8A CN107193782A (en) 2017-04-18 2017-04-18 Outlier rejection and correction method based on polynomial fitting

Publications (1)

Publication Number Publication Date
CN107193782A (en) 2017-09-22

Family

ID=59871423

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710253952.8A Pending CN107193782A (en) 2017-04-18 2017-04-18 Outlier rejection and correction method based on polynomial fitting

Country Status (1)

Country Link
CN (1) CN107193782A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110967661A (en) * 2019-12-20 2020-04-07 宁夏凯晨电气集团有限公司 Electrical data calibration method based on curve fitting
CN111143777A (en) * 2019-12-27 2020-05-12 新奥数能科技有限公司 Data processing method and device, intelligent terminal and storage medium
CN111736626A (en) * 2020-06-22 2020-10-02 中国人民解放军国防科技大学 Stable missile path data processing method
CN113114161A (en) * 2021-03-26 2021-07-13 哈尔滨工业大学 Electromechanical system signal filtering method for eliminating outliers by using minimum median method
CN113111573A (en) * 2021-03-24 2021-07-13 桂林电子科技大学 Landslide displacement prediction method based on GRU

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110967661A (en) * 2019-12-20 2020-04-07 宁夏凯晨电气集团有限公司 Electrical data calibration method based on curve fitting
CN111143777A (en) * 2019-12-27 2020-05-12 新奥数能科技有限公司 Data processing method and device, intelligent terminal and storage medium
CN111736626A (en) * 2020-06-22 2020-10-02 中国人民解放军国防科技大学 Stable missile path data processing method
CN113111573A (en) * 2021-03-24 2021-07-13 桂林电子科技大学 Landslide displacement prediction method based on GRU
CN113111573B (en) * 2021-03-24 2022-09-23 桂林电子科技大学 Landslide displacement prediction method based on GRU
CN113114161A (en) * 2021-03-26 2021-07-13 哈尔滨工业大学 Electromechanical system signal filtering method for eliminating outliers by using minimum median method
CN113114161B (en) * 2021-03-26 2023-03-24 哈尔滨工业大学 Electromechanical system signal filtering method for eliminating outliers by using minimum median method

Similar Documents

Publication Publication Date Title
CN107193782A (en) Outlier rejection and correction method based on polynomial fitting
CN109784383B (en) Rail crack identification method based on graph domain feature and DS evidence theory fusion
CN109284779A (en) Object detecting method based on the full convolutional network of depth
CN109727446A (en) A kind of identification and processing method of electricity consumption data exceptional value
CN111709465B (en) Intelligent identification method for rough difference of dam safety monitoring data
CN108829878B (en) Method and device for detecting abnormal points of industrial experimental data
CN107688554A (en) Frequency domain identification method based on adaptive Fourier decomposition
CN113516228B (en) Network anomaly detection method based on deep neural network
CN110288624A (en) Detection method, device and the relevant device of straightway in a kind of image
CN111639882B (en) Deep learning-based electricity risk judging method
CN110308658A (en) A kind of pid parameter setting method, device, system and readable storage medium storing program for executing
CN114970396B (en) CFD model correction method considering random and cognitive uncertainty
CN102945222A (en) Poor information measurement data gross error discrimination method based on Grey System Theory
CN114861120A (en) Flotation froth grade calculation method, device, electronic equipment and medium
CN109753634B (en) Historical data steady-state value-based dynamic system gain estimation method
CN104484747A (en) Method for determining qualified rate of products by utilizing truncation samples
CN104715160A (en) Soft measurement modeling data outlier detecting method based on KMDB
CN115455833B (en) Pneumatic uncertainty characterization method considering classification
CN116992773A (en) Belt conveyor coal flow prediction method based on integral LSTM and priori information
CN107562778A (en) A kind of outlier excavation method based on deviation feature
CN110298767A (en) A kind of thermal power plant time series variable method for monitoring abnormality and system
CN110196797A (en) Automatic optimization method and system suitable for credit scoring card system
CN116050249A (en) Reflow soldering spot morphology prediction method
CN109766905A (en) Target cluster dividing method based on Self-Organizing Feature Maps
CN114218782A (en) Truncated sequential test data evaluation method based on binomial distribution

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170922