CN107193782A - Outlier removal and correction method based on polynomial fitting - Google Patents
- Publication number
- CN107193782A (application CN201710253952.8A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/17—Function evaluation by approximation methods, e.g. inter- or extrapolation, smoothing, least mean square method
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
Abstract
The invention discloses an outlier removal and correction method based on polynomial fitting, comprising the following steps: an n-th degree polynomial is fitted to the raw measurement data to obtain the coefficients and the fitted polynomial. A rough scatter plot is drawn from the known observation data, and a suitable degree n is chosen for the least-squares polynomial fit. For the given measurement data $(x_i, y_i)$, a function $p(x)$ is constructed as an approximate expression of the data such that the sum of squares of the errors $r_i = p(x_i) - y_i$ is minimal, i.e. $\sum_{i=0}^{m} r_i^2 = \sum_{i=0}^{m} [p(x_i) - y_i]^2 = \min$, where i is an integer from 0 to m. Based on polynomial fitting, the method lets a computer automatically remove outliers from measurement data: outliers in the observation data sequence are identified and rejected using the residual sequence between the fitted estimates and the observations. The method is of significant value for practical engineering applications.
Description
Technical field
The present invention relates to an outlier removal and correction method based on polynomial fitting, applicable to telemetry, tracking and command (TT&C) systems in fields such as communication and navigation.
Background art
Measurement data in fields such as communication and navigation often contain many data points that deviate seriously from the true value of the measured quantity; these abnormal data are called outliers. Although outliers are relatively few, they strongly affect data processing and analysis and reduce the reliability of the data. Some filtering methods can reject outliers to a certain extent, but if their parameters are poorly chosen, the processed result may be too distorted to be convincing, or may fail to achieve the intended smoothing. Therefore, before the data are smoothed, the outliers in the measurement data should first be effectively identified and rejected.
Outliers in test data can be identified and rejected either manually or automatically by computer. The manual approach is fairly successful at judging abnormal values with obvious errors, but it is very inefficient and its criteria are hard to apply consistently, a drawback that is especially pronounced when the data volume is large.
Summary of the invention
To overcome the above shortcomings of the prior art, the present invention provides an outlier rejection method based on polynomial fitting. The method lets a computer automatically remove outliers from measurement data: outliers in the observation data sequence are identified and rejected using the residual sequence between the fitted estimates and the observations, which is of significant value for practical engineering applications.
To achieve the above object, the present invention adopts the following technical solution: an outlier rejection method based on polynomial fitting, comprising the steps of:
1. Fit an n-th degree polynomial to the raw measurement data to obtain the coefficients and the fitted polynomial. Draw a rough scatter plot from the known observation data and choose a suitable degree n for the least-squares polynomial fit. For the given measurement data $(x_i, y_i)$, construct a function $p(x)$ as an approximate expression of the data such that the sum of squares of the errors $r_i = p(x_i) - y_i$ is minimal, i.e.

$$\sum_{i=0}^{m} r_i^2 = \sum_{i=0}^{m} \left[ p(x_i) - y_i \right]^2 = \min,$$

where i is an integer from 0 to m.
Geometrically, this means finding the curve $y = p(x)$ for which the sum of squared distances to the given points $(x_i, y_i)$ is minimal. The function $p(x)$ is the fitting function or least-squares solution, and the method of finding $p(x)$ is the least-squares method of curve fitting. When the fitting function is a polynomial, i.e. when

$$p_n(x) = \sum_{k=0}^{n} a_k x^k \quad (n \le m),$$

it is the least-squares fitting polynomial. The quantity

$$I = \sum_{i=0}^{m} \left[ p_n(x_i) - y_i \right]^2 = \sum_{i=0}^{m} \left( \sum_{k=0}^{n} a_k x_i^k - y_i \right)^2$$

is a multivariate function of $a_0, a_1, \ldots, a_n$, so the task is to find the extremum of $I = I(a_0, a_1, \ldots, a_n)$. By the necessary condition for an extremum of a multivariate function,

$$\frac{\partial I}{\partial a_j} = 2 \sum_{i=0}^{m} \left( \sum_{k=0}^{n} a_k x_i^k - y_i \right) x_i^j = 0, \quad j = 0, 1, \ldots, n,$$

i.e.

$$\sum_{k=0}^{n} \left( \sum_{i=0}^{m} x_i^{j+k} \right) a_k = \sum_{i=0}^{m} x_i^j y_i, \quad j = 0, 1, \ldots, n,$$

which is a system of linear equations in $a_0, a_1, \ldots, a_n$, written in matrix form as

$$
\begin{bmatrix}
m+1 & \sum_{i=0}^{m} x_i & \cdots & \sum_{i=0}^{m} x_i^{n} \\
\sum_{i=0}^{m} x_i & \sum_{i=0}^{m} x_i^{2} & \cdots & \sum_{i=0}^{m} x_i^{n+1} \\
\vdots & \vdots & & \vdots \\
\sum_{i=0}^{m} x_i^{n} & \sum_{i=0}^{m} x_i^{n+1} & \cdots & \sum_{i=0}^{m} x_i^{2n}
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{bmatrix}
=
\begin{bmatrix}
\sum_{i=0}^{m} y_i \\ \sum_{i=0}^{m} x_i y_i \\ \vdots \\ \sum_{i=0}^{m} x_i^{n} y_i
\end{bmatrix}
$$

This system results from fitting an n-th degree polynomial to the raw measurement data. Solving it gives the coefficients $a_0, a_1, \ldots, a_n$ and hence the fitted polynomial $p_n(x) = \sum_{k=0}^{n} a_k x^k$, from which the corresponding fitted value sequence and residual sequence are obtained;
2. Compute the corresponding fitted value sequence $\{p_i : i = 1, 2, \ldots, m\}$ and generate the fitting residual sequence $\{\Delta y_i = p_i - y_i,\ i = 1, 2, \ldots, m\}$;
3. Compute the mean square error σ of the fitting residual sequence. [Formula for σ not preserved in the source text.]
4. Apply the 3σ criterion commonly used in engineering to judge and reject outliers; the data after outlier rejection are $\{y_i' : i = 1, 2, \ldots, m\}$. If the residual of a point is less than the threshold, the point is a normal value and is left unchanged; if the residual is greater than or equal to the threshold, the point is judged to be an outlier and its value is replaced with the mean of the six points preceding it;
5. Check whether all data have been processed; if not, carry out outlier judgement on the data not yet processed;
6. After outlier judgement has been carried out on all data, output the data with outliers rejected.
In the above technical solution, the threshold 3σ may be modified as appropriate to the specific experimental conditions.
The beneficial effects of the invention are as follows. The invention is based on polynomial fitting and identifies and rejects outliers in the observation data sequence using the residual sequence between the fitted estimates and the observations, which is of significant value for practical engineering applications. Compared with existing methods, it does not depend on design experience and is therefore more convenient. The threshold 3σ can be adjusted to the specific experimental conditions, so the designed parameter is more accurate and the range of application is wider. Filling in a replacement value after an outlier is rejected preserves the continuity of the data, meets the point-selection requirements of data processing, and provides a preliminary filtering of the data source.
Brief description of the drawings
Fig. 1 is a flow chart of the steps of the invention.
Fig. 2 shows the distribution of the raw measurement data.
Fig. 3 shows the curve obtained by fitting an n-th degree polynomial to the data of Fig. 2.
Fig. 4 shows the residual sequence.
Fig. 5 shows the data after outlier rejection.
Embodiment
The invention is further described below with reference to the accompanying drawings and a specific embodiment.
As shown in Fig. 1, an outlier removal and correction method based on polynomial fitting comprises the following steps:
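Step 1: Fit an n-th degree polynomial to the raw measurement data to obtain the coefficients and the fitted polynomial.
Draw a rough plot (a scatter plot) from the known observation data (i = 0, 1, ..., m) and choose a suitable degree n for the least-squares polynomial fit.
For the given measurement data $(x_i, y_i)$ $(i = 0, 1, \ldots, m)$, construct a function $p(x)$ as an approximate expression of the data such that the sum of squares of the errors $r_i = p(x_i) - y_i$ $(i = 0, 1, \ldots, m)$ is minimal, i.e.

$$\sum_{i=0}^{m} r_i^2 = \sum_{i=0}^{m} \left[ p(x_i) - y_i \right]^2 = \min$$

Geometrically, this means finding the curve $y = p(x)$ for which the sum of squared distances to the given points $(x_i, y_i)$ $(i = 0, 1, \ldots, m)$ is minimal. The function $p(x)$ is called the fitting function or least-squares solution, and the method of finding $p(x)$ is called the least-squares method of curve fitting. When the fitting function is a polynomial, i.e. when

$$p_n(x) = \sum_{k=0}^{n} a_k x^k \quad (n \le m),$$

it is called the least-squares fitting polynomial. Clearly,

$$I = \sum_{i=0}^{m} \left[ p_n(x_i) - y_i \right]^2 = \sum_{i=0}^{m} \left( \sum_{k=0}^{n} a_k x_i^k - y_i \right)^2$$

is a multivariate function of $a_0, a_1, \ldots, a_n$, so the above problem is that of finding the extremum of $I = I(a_0, a_1, \ldots, a_n)$. By the necessary condition for an extremum of a multivariate function, we obtain

$$\frac{\partial I}{\partial a_j} = 2 \sum_{i=0}^{m} \left( \sum_{k=0}^{n} a_k x_i^k - y_i \right) x_i^j = 0, \quad j = 0, 1, \ldots, n,$$

i.e.

$$\sum_{k=0}^{n} \left( \sum_{i=0}^{m} x_i^{j+k} \right) a_k = \sum_{i=0}^{m} x_i^j y_i, \quad j = 0, 1, \ldots, n. \tag{4}$$

Formula (4) is a system of linear equations in $a_0, a_1, \ldots, a_n$, written in matrix form as

$$
\begin{bmatrix}
m+1 & \sum_{i=0}^{m} x_i & \cdots & \sum_{i=0}^{m} x_i^{n} \\
\sum_{i=0}^{m} x_i & \sum_{i=0}^{m} x_i^{2} & \cdots & \sum_{i=0}^{m} x_i^{n+1} \\
\vdots & \vdots & & \vdots \\
\sum_{i=0}^{m} x_i^{n} & \sum_{i=0}^{m} x_i^{n+1} & \cdots & \sum_{i=0}^{m} x_i^{2n}
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{bmatrix}
=
\begin{bmatrix}
\sum_{i=0}^{m} y_i \\ \sum_{i=0}^{m} x_i y_i \\ \vdots \\ \sum_{i=0}^{m} x_i^{n} y_i
\end{bmatrix}
\tag{5}
$$

Formula (5) is the linear system that results from fitting an n-th degree polynomial to the raw measurement data. Solving it gives the coefficients $a_0, a_1, \ldots, a_n$ and hence the fitted polynomial:

$$p_n(x) = \sum_{k=0}^{n} a_k x^k$$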
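The normal equations of Step 1 can be sketched in Python with NumPy as follows. This is a minimal illustration, not the patent's implementation: the function names `fit_normal_equations` and `evaluate` are hypothetical, and the coefficients are returned in ascending order $a_0, \ldots, a_n$ to match the notation above.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def fit_normal_equations(x, y, n):
    """Solve the normal equations for a degree-n least-squares polynomial.

    Returns the coefficients [a_0, ..., a_n] in ascending order of power.
    (Hypothetical helper name, used for illustration only.)
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Power sums S[p] = sum_i x_i**p for p = 0..2n; S[0] is the number of points, m + 1.
    S = np.array([np.sum(x ** p) for p in range(2 * n + 1)])
    # Entry (j, k) of the coefficient matrix is sum_i x_i**(j + k).
    A = np.array([[S[j + k] for k in range(n + 1)] for j in range(n + 1)])
    # Entry j of the right-hand side is sum_i x_i**j * y_i.
    b = np.array([np.sum((x ** j) * y) for j in range(n + 1)])
    return np.linalg.solve(A, b)


def evaluate(coeffs, x):
    """Evaluate p_n(x) = sum_k a_k * x**k for ascending coefficients."""
    return P.polyval(np.asarray(x, dtype=float), coeffs)
```

The matrix of power sums becomes ill-conditioned as n grows, so for higher degrees a library routine such as `numpy.polyfit` may be numerically preferable to solving the system directly.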
Step 2: Obtain the corresponding fitted value sequence and residual sequence.
Compute the corresponding fitted value sequence $\{p_i : i = 1, 2, \ldots, m\}$ and generate the fitting residual sequence $\{\Delta y_i = p_i - y_i,\ i = 1, 2, \ldots, m\}$.
Step 3: Calculate the mean square error.
Compute the mean square error σ of the fitting residual sequence. [Formula for σ not preserved in the source text.]
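Steps 2 and 3 can be sketched as below. The formula for σ does not survive in this text, so the sketch assumes σ is the root mean square of the residuals; the patent's own definition may differ (for example, by subtracting the residual mean or dividing by m − 1). The function name `residual_sigma` is hypothetical.

```python
import numpy as np


def residual_sigma(fitted, observed):
    """Return (residuals, sigma) for fitted values p_i and observations y_i."""
    fitted = np.asarray(fitted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    # Fitting residual sequence: delta_y_i = p_i - y_i.
    residuals = fitted - observed
    # ASSUMPTION: sigma is taken as the root mean square of the residuals,
    # because the source formula is missing.
    sigma = np.sqrt(np.mean(residuals ** 2))
    return residuals, sigma
```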
Step 4: Judge outliers.
Apply the 3σ criterion commonly used in engineering to judge and reject outliers; the data after outlier rejection are $\{y_i' : i = 1, 2, \ldots, m\}$.
The judgement is made according to formula (8): if the residual of a point is less than the threshold, the point is a normal value and its value is left unchanged; if the residual is greater than or equal to the threshold, the point is judged to be an outlier and its value is replaced with the mean of the preceding 6 points.
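The judgement and replacement rule of Step 4 might look like the following sketch. Several details are assumptions not stated in the text: the residual is compared by absolute value, the mean is taken over the already-corrected preceding values, fewer than six predecessors are averaged near the start of the sequence, and the very first sample is left unchanged. The names `replace_outliers`, `k` and `window` are hypothetical.

```python
import numpy as np


def replace_outliers(y, residuals, sigma, k=3.0, window=6):
    """Flag points whose |residual| >= k * sigma and replace each flagged
    value with the mean of up to `window` preceding values."""
    y = np.asarray(y, dtype=float)
    residuals = np.asarray(residuals, dtype=float)
    corrected = y.copy()
    # ASSUMPTION: the criterion is applied to the magnitude of the residual.
    is_outlier = np.abs(residuals) >= k * sigma
    for i in np.flatnonzero(is_outlier):
        start = max(0, i - window)
        if i > start:
            # ASSUMPTION: average the preceding values after their own
            # correction, so an earlier outlier does not contaminate the mean.
            corrected[i] = corrected[start:i].mean()
        # A flagged first sample has no predecessors and is left unchanged.
    return corrected, is_outlier
```

Passing a different `k` corresponds to adjusting the 3σ threshold to the experimental conditions.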
Step 5: Check whether all data have been processed; if not, carry out outlier judgement on the data not yet processed.
Step 6: Once outlier judgement has been carried out on all data, output the data with outliers rejected. The whole procedure then ends.
In the above technical solution, the threshold 3σ may be modified as appropriate to the specific experimental conditions.
To summarize: in Step 1, an n-th degree polynomial is fitted to the raw measurement data to obtain the coefficients and the fitted polynomial; in Step 2, the corresponding fitted value sequence and residual sequence are obtained; in Step 3, the mean square error is calculated; in Step 4, the 3σ criterion is used to judge and reject outliers; in Step 5, a check is made as to whether all data have been processed and, if not, outlier judgement continues on the unprocessed data; in Step 6, the data with outliers rejected are output and the procedure ends. This procedure can be used for automatic discrimination by computer.
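The six steps can be strung together as in the sketch below. It uses `numpy.polyfit` for the least-squares fit instead of solving the normal equations explicitly, assumes σ is the root mean square of the residuals, and runs on synthetic data invented purely for illustration (not the measured data of the embodiment). The function name `reject_outliers` and its parameters are hypothetical.

```python
import numpy as np


def reject_outliers(x, y, degree, k=3.0, window=6):
    """Polynomial-fit outlier rejection; returns (corrected data, flagged indices)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    coeffs = np.polyfit(x, y, degree)         # Step 1: least-squares fit (highest power first)
    fitted = np.polyval(coeffs, x)            # Step 2: fitted values ...
    residuals = fitted - y                    # ... and residuals p_i - y_i
    sigma = np.sqrt(np.mean(residuals ** 2))  # Step 3: ASSUMED root-mean-square definition
    corrected = y.copy()
    flagged = []
    for i in range(len(y)):                   # Steps 4-5: judge every point in turn
        if abs(residuals[i]) >= k * sigma:
            flagged.append(i)
            if i > 0:
                # Replace with the mean of up to `window` preceding (corrected) values.
                corrected[i] = corrected[max(0, i - window):i].mean()
    return corrected, flagged                 # Step 6: output


# Synthetic demonstration: a quadratic trend with small noise and three injected spikes.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 800)
truth = 2.0 + 3.0 * x - 4.0 * x ** 2
y = truth + rng.normal(0.0, 0.01, x.size)
y[[100, 400, 650]] += 1.0
corrected, flagged = reject_outliers(x, y, degree=2)
```

Because the outliers themselves inflate σ, a single pass can miss smaller outliers that sit close to large ones; lowering `k` is one way to apply the adjustable threshold mentioned in the text.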
Taking actually measured current-speed magnitude data as an example, 800 consecutive data points are selected for outlier rejection. The raw measurement data are shown in Fig. 2, the curve obtained after n-th degree polynomial fitting is shown in Fig. 3, the residual sequence is shown in Fig. 4, and the output data after outlier rejection are shown in Fig. 5.
The invention is based on polynomial fitting and identifies and rejects outliers in the observation data sequence using the residual sequence between the fitted estimates and the observations, which is of significant value for practical engineering applications. Compared with existing methods, it does not depend on design experience and is therefore more convenient. The threshold 3σ can be adjusted to the specific experimental conditions, so the designed parameter is more accurate and the range of application is wider. Filling in a replacement value after an outlier is rejected preserves the continuity of the data, meets the point-selection requirements of data processing, and provides a preliminary filtering of the data source.
The foregoing is only a preferred embodiment of the present invention, but the scope of protection of the invention is not limited to it. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the invention shall fall within the scope of protection of the invention.
Claims (2)
1. An outlier removal and correction method based on polynomial fitting, characterized by comprising the steps of:
(1) Fit an n-th degree polynomial to the raw measurement data to obtain the coefficients and the fitted polynomial. Draw a rough scatter plot from the known observation data and choose a suitable degree n for the least-squares polynomial fit. For the given measurement data $(x_i, y_i)$, construct a function $p(x)$ as an approximate expression of the data such that the sum of squares of the errors $r_i = p(x_i) - y_i$ is minimal, i.e.

$$\sum_{i=0}^{m} r_i^2 = \sum_{i=0}^{m} \left[ p(x_i) - y_i \right]^2 = \min,$$

where i is an integer from 0 to m.
Geometrically, this means finding the curve $y = p(x)$ for which the sum of squared distances to the given points $(x_i, y_i)$ is minimal. The function $p(x)$ is the fitting function or least-squares solution, and the method of finding $p(x)$ is the least-squares method of curve fitting. When the fitting function is a polynomial, i.e. when

$$p_n(x) = \sum_{k=0}^{n} a_k x^k \quad (n \le m),$$

it is the least-squares fitting polynomial. The quantity

$$I = \sum_{i=0}^{m} \left[ p_n(x_i) - y_i \right]^2 = \sum_{i=0}^{m} \left( \sum_{k=0}^{n} a_k x_i^k - y_i \right)^2$$

is a multivariate function of $a_0, a_1, \ldots, a_n$, and the extremum of $I = I(a_0, a_1, \ldots, a_n)$ is sought. By the necessary condition for an extremum of a multivariate function,

$$\frac{\partial I}{\partial a_j} = 2 \sum_{i=0}^{m} \left( \sum_{k=0}^{n} a_k x_i^k - y_i \right) x_i^j = 0, \quad j = 0, 1, \ldots, n,$$

i.e.

$$\sum_{k=0}^{n} \left( \sum_{i=0}^{m} x_i^{j+k} \right) a_k = \sum_{i=0}^{m} x_i^j y_i, \quad j = 0, 1, \ldots, n,$$

which is a system of linear equations in $a_0, a_1, \ldots, a_n$, written in matrix form as
$$
\begin{bmatrix}
m+1 & \sum_{i=0}^{m} x_i & \cdots & \sum_{i=0}^{m} x_i^{n} \\
\sum_{i=0}^{m} x_i & \sum_{i=0}^{m} x_i^{2} & \cdots & \sum_{i=0}^{m} x_i^{n+1} \\
\vdots & \vdots & & \vdots \\
\sum_{i=0}^{m} x_i^{n} & \sum_{i=0}^{m} x_i^{n+1} & \cdots & \sum_{i=0}^{m} x_i^{2n}
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{bmatrix}
=
\begin{bmatrix}
\sum_{i=0}^{m} y_i \\ \sum_{i=0}^{m} x_i y_i \\ \vdots \\ \sum_{i=0}^{m} x_i^{n} y_i
\end{bmatrix}
$$
This system results from fitting an n-th degree polynomial to the raw measurement data. Solving it gives the coefficients $a_0, a_1, \ldots, a_n$ and hence the fitted polynomial $p_n(x) = \sum_{k=0}^{n} a_k x^k$, from which the corresponding fitted value sequence and residual sequence are obtained;
(2) Compute the corresponding fitted value sequence $\{p_i : i = 1, 2, \ldots, m\}$ and generate the fitting residual sequence $\{\Delta y_i = p_i - y_i,\ i = 1, 2, \ldots, m\}$;
(3) Compute the mean square error σ of the fitting residual sequence. [Formula for σ not preserved in the source text.]
(4) Apply the 3σ criterion commonly used in engineering to judge and reject outliers; the data after outlier rejection are $\{y_i' : i = 1, 2, \ldots, m\}$. If the residual of a point is less than the threshold, the point is a normal value and is left unchanged; if the residual is greater than or equal to the threshold, the point is judged to be an outlier and its value is replaced with the mean of the six points preceding it;
(5) Check whether all data have been processed; if not, carry out outlier judgement on the data not yet processed;
(6) After outlier judgement has been carried out on all data, output the data with outliers rejected.
2. The outlier removal and correction method based on polynomial fitting according to claim 1, characterized in that the threshold 3σ may be modified as appropriate to the specific experimental conditions.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710253952.8A CN107193782A (en) | 2017-04-18 | 2017-04-18 | Outlier removal and correction method based on polynomial fitting |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107193782A true CN107193782A (en) | 2017-09-22 |
Family
ID=59871423
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710253952.8A Pending CN107193782A (en) | Outlier removal and correction method based on polynomial fitting | 2017-04-18 | 2017-04-18 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107193782A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170922 |