CN112116014A - Test data outlier detection method for distribution automation equipment - Google Patents


Info

Publication number
CN112116014A
Authority
CN
China
Prior art keywords
test data
data set
median
absolute deviation
niqr
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011017753.5A
Other languages
Chinese (zh)
Inventor
郑友卓
张锐锋
付宇
肖小兵
何洪流
窦陈
文蕾
李前敏
吴鹏
刘安茳
郝树青
王卓月
蔡永祥
苗宇
李跃
张恒荣
黄伟
郭素
柏毅辉
李忠
安波
龙秋风
樊磊
熊锦航
吴应双
王明伟
刘亮
王竹
刘兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guizhou Power Grid Co Ltd
Original Assignee
Guizhou Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guizhou Power Grid Co Ltd filed Critical Guizhou Power Grid Co Ltd
Priority to CN202011017753.5A priority Critical patent/CN112116014A/en
Publication of CN112116014A publication Critical patent/CN112116014A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2433 Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01R MEASURING ELECTRIC VARIABLES; MEASURING MAGNETIC VARIABLES
    • G01R31/00 Arrangements for testing electric properties; Arrangements for locating electric faults; Arrangements for electrical testing characterised by what is being tested not provided for elsewhere

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Tests Of Electronic Circuits (AREA)

Abstract

The invention discloses a method for detecting outliers in test data of distribution automation equipment, comprising the following steps: collecting a test data set; sorting the data of the test data set in ascending order to obtain the median M and the normalized interquartile range (NIQR) of the test data set; calculating the absolute deviation of each datum in the test data set from the median to obtain an absolute deviation data set; sorting the data in the absolute deviation data set in ascending order and calculating the normalized median absolute deviation (NMAD); comparing NIQR with NMAD; if NIQR is greater than NMAD, taking the interval [M - n·NIQR, M + n·NIQR] as the criterion for outlier detection in the test data set, every datum outside the interval being an outlier; otherwise, taking the interval [M - n·NMAD, M + n·NMAD] as the criterion, every datum outside the interval being an outlier. The method addresses problems such as the "excessive" detection of outliers.

Description

Test data outlier detection method for distribution automation equipment
Technical Field
The invention belongs to the field of testing technology for distribution automation equipment, and particularly relates to a method for detecting outliers in test data of distribution automation equipment.
Background
As distribution automation equipment is applied ever more widely in power systems, the test and evaluation of this equipment becomes increasingly important. During test and evaluation, the data in the test data set under evaluation may be affected by various kinds of interference during acquisition, so the data set may contain outliers (abnormal observations mixed into the data set whose values differ greatly from the other data). Such outliers can convey erroneous information or patterns and can substantially degrade the accuracy of the evaluation result, so detecting them is an important step in ensuring an objective and accurate evaluation. At present, the most common outlier detection method follows the 3σ criterion and uses the mean and standard deviation as the outlier criterion. However, the outliers themselves can strongly distort the mean and standard deviation, which makes the detection result unreliable. Outlier detection based on quartiles does not use the mean and standard deviation, so it avoids the influence of outliers on these statistical parameters and is more sensitive to outliers; but it may also lead to "excessive" detection of outliers, which again makes the detection result unreliable and affects the final test and evaluation result of the distribution automation equipment.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: to provide a method for detecting outliers in test data of distribution automation equipment, so as to solve the technical problems in the prior art that the handling of outliers in a test data set may lead to "excessive" detection of outliers, making the detection result unreliable and affecting the final test and evaluation result of the distribution automation equipment.
The technical scheme of the invention is as follows:
A method for detecting outliers in test data of distribution automation equipment comprises the following steps:
Step A, collecting test data of the distribution automation terminal to obtain a test data set D, and setting the normalization factors $b_1$ and $b_2$ and the outlier decision parameter n;
Step B, sorting the data in the test data set D in ascending order to obtain the median M and the normalized interquartile range NIQR of the test data set D;
Step C, calculating the absolute deviation of each datum in the test data set D from the median M to obtain an absolute deviation data set E;
Step D, sorting the data in the absolute deviation data set E in ascending order and calculating the normalized median absolute deviation NMAD;
Step E, comparing NIQR with NMAD; if NIQR is greater than NMAD, proceeding to step F, otherwise proceeding to step G;
Step F, taking the interval [M - n·NIQR, M + n·NIQR] as the criterion for outlier detection in the test data set D, and judging every datum in D that lies outside the interval [M - n·NIQR, M + n·NIQR] to be an outlier;
Step G, taking the interval [M - n·NMAD, M + n·NMAD] as the criterion for outlier detection in the test data set D, and judging every datum in D that lies outside the interval [M - n·NMAD, M + n·NMAD] to be an outlier;
Step H, outputting the outliers in the test data set D to obtain an outlier set O.
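Steps A to H can be sketched in Python as follows. This is a minimal sketch, not the applicant's implementation. The quartile position formula appears in the source only as an image, so the positions (N+2)/4 and (3N+2)/4 with linear interpolation are an assumption, chosen because they reproduce the quartiles reported in the embodiments. The names `value_at` and `detect_outliers` are hypothetical, and n = 3 follows the embodiments.

```python
from statistics import median

Q75 = 0.67449  # 0.75 quantile of the standard normal distribution, as stated in the text


def value_at(sorted_data, position):
    """Value at a 1-based, possibly fractional position, by linear interpolation."""
    lower = int(position)
    frac = position - lower
    if frac == 0:
        return sorted_data[lower - 1]
    return sorted_data[lower - 1] + frac * (sorted_data[lower] - sorted_data[lower - 1])


def detect_outliers(data, n=3.0):
    """Return the outlier set O of `data`, in the original order of the data."""
    if len(data) < 2:
        raise ValueError("at least two values are required")
    d = sorted(data)                                  # step B: ascending order
    total = len(d)
    m = median(d)
    # ASSUMPTION: quartile positions (N+2)/4 and (3N+2)/4; the source formula is an image.
    q1 = value_at(d, (total + 2) / 4)
    q3 = value_at(d, (3 * total + 2) / 4)
    niqr = (q3 - q1) / (2 * Q75)                      # b1 * (Q3 - Q1)
    e = [abs(x - m) for x in data]                    # step C: absolute deviations
    nmad = median(e) / Q75                            # step D: b2 * M1
    scale = niqr if niqr > nmad else nmad             # step E: pick the larger scale
    low, high = m - n * scale, m + n * scale          # steps F and G
    return [x for x in data if x < low or x > high]   # step H


example_1 = [0.2419, 0.2958, 0.1992, 0.1307, 0.0437, 0.0433, 0.0679,
             0.0167, 0.0513, 0.0004, 0.0251, 0.0041, 0.0849, 0.0769]
example_3 = [0.0449, 0.2173, 0.1707, 0.1116, 0.0634, 0.0102, 0.0237,
             0.0284, 0.0165, 0.0022, 0.0116, 0.0398, 0.0548, 0.0739]
print(detect_outliers(example_1))  # [0.2958]
print(detect_outliers(example_3))  # [0.2173, 0.1707]
```

The two data sets are those of Example 1 and Example 3 of the detailed description, and the sketch returns the same outlier sets that the description reports for them.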
The normalization factor $b_1$ in step A is calculated as:
$b_1 = 1 / [2 \cdot Q(0.75)]$
where Q(0.75) is the 0.75 quantile of the standard normal distribution, with a value of 0.67449.
The normalization factor $b_2$ in step A is calculated as:
$b_2 = 1 / Q(0.75)$
where Q(0.75) is the 0.75 quantile of the standard normal distribution, with a value of 0.67449.
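The two factors can be checked numerically. The short sketch below uses Python's standard `statistics.NormalDist` to obtain Q(0.75); the variable names are illustrative only. For normally distributed data the interquartile range equals 2·Q(0.75)·σ and the median absolute deviation equals Q(0.75)·σ, which is why dividing by these constants puts both quantities on the scale of the standard deviation.

```python
from statistics import NormalDist

# 0.75 quantile of the standard normal distribution
q75 = NormalDist().inv_cdf(0.75)

b1 = 1 / (2 * q75)  # scales the interquartile range to the standard deviation
b2 = 1 / q75        # scales the median absolute deviation to the standard deviation

print(round(q75, 5), round(b1, 4), round(b2, 4))  # 0.67449 0.7413 1.4826
```

These are the values 0.7413 and 1.4826 that the embodiments set for the two factors.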
In step A, the outlier decision parameter n is given by the following formula:
[Formula image BDA0002699643600000031 is not reproduced in this text.]
In step B, the positions of the lower quartile $Q_1$, the median M and the upper quartile $Q_3$ are given by the following formula:
[Formula image BDA0002699643600000032 is not reproduced in this text.]
where N is the total number of data in the test data set D.
The normalized interquartile range NIQR in step B is calculated as:
$\mathrm{NIQR} = b_1 \cdot (Q_3 - Q_1)$
where NIQR is the normalized interquartile range; $b_1$ is the normalization factor; $Q_3$ is the upper quartile of the test data set D; and $Q_1$ is the lower quartile of the test data set D.
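Because the quartile position formula survives only as an image reference, the sketch below rests on an assumption: positions (N+2)/4, (N+1)/2 and (3N+2)/4 in the sorted data, with linear interpolation between neighbours. This choice is not taken from the source; it is adopted here only because it yields the quartiles that the embodiments report. The helper names are hypothetical, and the data are those of Examples 1 and 2 of the detailed description.

```python
B1 = 0.7413  # normalization factor b1 as set in the embodiments


def value_at(sorted_data, position):
    """Value at a 1-based, possibly fractional position, by linear interpolation."""
    lower = int(position)
    frac = position - lower
    if frac == 0:
        return sorted_data[lower - 1]
    return sorted_data[lower - 1] + frac * (sorted_data[lower] - sorted_data[lower - 1])


def quartiles(data):
    """Return (Q1, M, Q3) under the ASSUMED positions (N+2)/4, (N+1)/2, (3N+2)/4."""
    d = sorted(data)
    total = len(d)
    return (value_at(d, (total + 2) / 4),
            value_at(d, (total + 1) / 2),
            value_at(d, (3 * total + 2) / 4))


example_1 = [0.2419, 0.2958, 0.1992, 0.1307, 0.0437, 0.0433, 0.0679,
             0.0167, 0.0513, 0.0004, 0.0251, 0.0041, 0.0849, 0.0769]
example_2 = [0.0176, 0.0195, 0.0201, 0.0202, 0.0203, 0.0207, 0.0209, 0.0212, 0.0241]

q1, m, q3 = quartiles(example_1)
niqr = B1 * (q3 - q1)
print(q1, q3, round(niqr, 4))

q1_2, m_2, q3_2 = quartiles(example_2)
print(q1_2, m_2, q3_2)
```

For Example 1 (N = 14) the positions are whole numbers, so Q1 and Q3 are the 4th and 11th sorted values, 0.0251 and 0.1307, and NIQR rounds to 0.0783. For Example 2 (N = 9) the interpolated quartiles are about 0.01995 and 0.020975, which the description reports to four decimals as 0.0200 and 0.0210.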
In step C, the absolute deviation of each datum in the test data set D from the median M is calculated to obtain the absolute deviation data set E, using the formula:
$e_i = |d_i - M|, \quad i = 1, 2, \ldots, N$
where $d_i$ is an element of the test data set D; M is the median of the test data set D; $e_i$ is an element of the absolute deviation data set E; and N is the total number of data in D.
In step D, the data in the absolute deviation data set E are sorted in ascending order, and the normalized median absolute deviation NMAD is calculated as:
$\mathrm{NMAD} = b_2 \cdot M_1$
where NMAD is the normalized median absolute deviation; $b_2$ is the normalization factor; and $M_1$ is the median of the absolute deviation data set E.
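Steps C and D amount to a few lines of code. The sketch below is illustrative only; the function name `nmad` is hypothetical, $b_2$ = 1.4826 is the value set in the embodiments, and the data are those of Example 1 and Example 3 of the detailed description.

```python
from statistics import median

B2 = 1.4826  # normalization factor b2 as set in the embodiments


def nmad(data, b2=B2):
    """Normalized median absolute deviation: b2 times the median of |d_i - M|."""
    m = median(data)                          # median M of the test data set D
    deviations = [abs(x - m) for x in data]   # absolute deviation data set E
    return b2 * median(deviations)            # b2 * M1


example_1 = [0.2419, 0.2958, 0.1992, 0.1307, 0.0437, 0.0433, 0.0679,
             0.0167, 0.0513, 0.0004, 0.0251, 0.0041, 0.0849, 0.0769]
example_3 = [0.0449, 0.2173, 0.1707, 0.1116, 0.0634, 0.0102, 0.0237,
             0.0284, 0.0165, 0.0022, 0.0116, 0.0398, 0.0548, 0.0739]
print(round(nmad(example_1), 4))  # 0.0574
print(round(nmad(example_3), 4))  # 0.042
```

`statistics.median` sorts internally, so the explicit ascending sort of E described in step D is not needed in this sketch.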
The invention has the beneficial effects that:
The method for detecting outliers in test data of distribution automation equipment uses the normalized median absolute deviation or the normalized interquartile range as the outlier criterion. Compared with the classical detection method based on the mean and standard deviation, it is more sensitive to outliers and avoids the influence of outliers on the statistical parameters (mean and standard deviation), which ensures the reliability of the detection result. By selecting the larger of the normalized median absolute deviation and the normalized interquartile range as the outlier criterion, the invention detects outlying data more flexibly and effectively, avoids "excessive" detection to a certain extent, and ensures the reliability of the detection result.
The method solves the problems in the prior art that outliers in the test data of distribution automation equipment distort the statistical parameters and thereby the outlier detection result, and that outliers are detected "excessively".
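The effect of outliers on the mean and standard deviation can be illustrated with the data set of Example 1 from the detailed description. The sketch below applies the classical 3σ criterion to that data set, using the sample standard deviation from Python's `statistics` module; it is an illustration, not part of the disclosed method.

```python
from statistics import mean, stdev

example_1 = [0.2419, 0.2958, 0.1992, 0.1307, 0.0437, 0.0433, 0.0679,
             0.0167, 0.0513, 0.0004, 0.0251, 0.0041, 0.0849, 0.0769]

mu = mean(example_1)
sigma = stdev(example_1)                 # sample standard deviation
low, high = mu - 3 * sigma, mu + 3 * sigma

# Values outside [mu - 3*sigma, mu + 3*sigma] would be flagged by the 3-sigma rule
flagged = [x for x in example_1 if x < low or x > high]
print(flagged)  # []
```

The large values in the set inflate both the mean and the standard deviation, so the 3σ upper limit lies above 0.2958 and nothing is flagged, whereas Example 1 identifies 0.2958 as an outlier using the median-based interval.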
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
The invention discloses a method for detecting outliers in test data of distribution automation equipment, which comprises the following steps:
Step A, collecting test data of the distribution automation terminal to obtain a test data set D, and setting the normalization factors $b_1$ and $b_2$ and the outlier decision parameter n;
Step B, sorting the data in the test data set D in ascending order to obtain the median M and the normalized interquartile range NIQR of the test data set D;
Step C, calculating the absolute deviation of each datum in the test data set D from the median M to obtain an absolute deviation data set E;
Step D, sorting the data in the absolute deviation data set E in ascending order and calculating the normalized median absolute deviation NMAD;
Step E, comparing NIQR with NMAD; if NIQR is greater than NMAD, proceeding to step F, otherwise proceeding to step G;
Step F, taking the interval [M - n·NIQR, M + n·NIQR] as the criterion for outlier detection in the test data set D, judging every datum in D that lies outside the interval [M - n·NIQR, M + n·NIQR] to be an outlier, and proceeding to step H;
Step G, taking the interval [M - n·NMAD, M + n·NMAD] as the criterion for outlier detection in the test data set D, judging every datum in D that lies outside the interval [M - n·NMAD, M + n·NMAD] to be an outlier, and proceeding to step H;
Step H, outputting the outlier set O of the test data set D.
In step A, test data of the distribution automation terminal are collected to obtain a test data set D, and the normalization factors $b_1$ and $b_2$ and the outlier decision parameter n are set, where the normalization factor $b_1$ is calculated as:
$b_1 = 1 / [2 \cdot Q(0.75)]$ (1)
where Q(0.75) is the 0.75 quantile of the standard normal distribution, with a value of 0.67449;
the normalization factor $b_2$ is calculated as:
$b_2 = 1 / Q(0.75)$ (2)
where Q(0.75) is the 0.75 quantile of the standard normal distribution, with a value of 0.67449;
the outlier decision parameter n is given by the following formula:
[Formula image BDA0002699643600000061 is not reproduced in this text.]
In step B, the data in the test data set D are sorted in ascending order, and the lower quartile $Q_1$, the median M and the upper quartile $Q_3$ are found; the positions of $Q_1$, M and $Q_3$ are given by the following formula:
[Formula image BDA0002699643600000062 is not reproduced in this text.]
where N is the total number of data in the test data set D;
the normalized interquartile range NIQR is calculated as:
$\mathrm{NIQR} = b_1 \cdot (Q_3 - Q_1)$ (5)
where NIQR is the normalized interquartile range; $b_1$ is the normalization factor; $Q_3$ is the upper quartile of the test data set D; and $Q_1$ is the lower quartile of the test data set D;
in step C, the absolute deviation of each datum in the test data set D from the median M is calculated to obtain the absolute deviation data set E, using the formula:
$e_i = |d_i - M|, \quad i = 1, 2, \ldots, N$ (6)
where $d_i$ is an element of the test data set D; M is the median of the test data set D; and $e_i$ is an element of the absolute deviation data set E;
in step D, the data in the absolute deviation data set E are sorted in ascending order, and the normalized median absolute deviation NMAD is calculated as:
$\mathrm{NMAD} = b_2 \cdot M_1$ (7)
where NMAD is the normalized median absolute deviation; $b_2$ is the normalization factor; and $M_1$ is the median of the absolute deviation data set E.
The method for detecting outlier of test data of distribution automation equipment provided by the present invention is further described with reference to the following embodiments.
Example 1
As shown in FIG. 1, the flowchart of the method for detecting outliers in test data of distribution automation equipment provided by the invention, the method comprises the following steps:
In step A, test data of the distribution automation terminal are collected to obtain a test data set D, and the normalization factors $b_1$ and $b_2$ and the outlier decision parameter n are set, where the normalization factor $b_1$ is calculated as:
$b_1 = 1 / [2 \cdot Q(0.75)]$ (8)
where Q(0.75) is the 0.75 quantile of the standard normal distribution, with a value of 0.67449;
the normalization factor $b_2$ is calculated as:
$b_2 = 1 / Q(0.75)$ (9)
where Q(0.75) is the 0.75 quantile of the standard normal distribution, with a value of 0.67449;
the outlier decision parameter n is given by the following formula:
[Formula image BDA0002699643600000071 is not reproduced in this text.]
In Example 1, the test data set is D = (0.2419, 0.2958, 0.1992, 0.1307, 0.0437, 0.0433, 0.0679, 0.0167, 0.0513, 0.0004, 0.0251, 0.0041, 0.0849, 0.0769), the normalization factors are set to $b_1$ = 0.7413 and $b_2$ = 1.4826, and the outlier decision parameter is n = 3;
in step B, the data in the test data set D are sorted in ascending order, and the lower quartile $Q_1$, the median M and the upper quartile $Q_3$ are found; the positions of $Q_1$, M and $Q_3$ are given by the following formula:
[Formula image BDA0002699643600000081 is not reproduced in this text.]
where N is the total number of data in the test data set D;
the normalized interquartile range NIQR is calculated as:
$\mathrm{NIQR} = b_1 \cdot (Q_3 - Q_1)$ (12)
where NIQR is the normalized interquartile range; $b_1$ is the normalization factor; $Q_3$ is the upper quartile of the test data set D; and $Q_1$ is the lower quartile of the test data set D;
in Example 1, for the test data set D = (0.2419, 0.2958, 0.1992, 0.1307, 0.0437, 0.0433, 0.0679, 0.0167, 0.0513, 0.0004, 0.0251, 0.0041, 0.0849, 0.0769), the lower quartile $Q_1$ is 0.0251, the median M is 0.0596, the upper quartile $Q_3$ is 0.1307, and the normalized interquartile range NIQR is 0.0783;
in step C, the absolute deviation of each datum in the test data set D from the median M is calculated to obtain a new absolute deviation data set E, using the formula:
$e_i = |d_i - M|, \quad i = 1, 2, \ldots, N$ (13)
where $d_i$ is an element of the test data set D; M is the median of the test data set D; and $e_i$ is an element of the absolute deviation data set E;
in Example 1, the calculated absolute deviation data set is E = (0.1823, 0.2362, 0.1396, 0.0711, 0.0159, 0.0163, 0.0083, 0.0429, 0.0083, 0.0592, 0.0345, 0.0555, 0.0253, 0.0173);
in step D, the data in the absolute deviation data set E are sorted in ascending order, and the normalized median absolute deviation NMAD is calculated as:
$\mathrm{NMAD} = b_2 \cdot M_1$ (14)
where NMAD is the normalized median absolute deviation; $b_2$ is the normalization factor; and $M_1$ is the median of the absolute deviation data set E;
in Example 1, the median $M_1$ of the absolute deviation data set E = (0.1823, 0.2362, 0.1396, 0.0711, 0.0159, 0.0163, 0.0083, 0.0429, 0.0083, 0.0592, 0.0345, 0.0555, 0.0253, 0.0173) is 0.0387, and the normalized median absolute deviation NMAD is 0.0574;
in step E, NIQR is compared with NMAD; if NIQR is greater than NMAD, the method proceeds to step F, otherwise it proceeds to step G; in Example 1, NIQR is 0.0783 and NMAD is 0.0574, so NIQR is greater than NMAD and the method proceeds to step F;
in step F, the interval [M - n·NIQR, M + n·NIQR] is taken as the criterion for outlier detection in the test data set D, every datum in D that lies outside the interval [M - n·NIQR, M + n·NIQR] is judged to be an outlier, and the method proceeds to step H;
in Example 1, the criterion interval [M - n·NIQR, M + n·NIQR] for outlier detection in the test data set D is [-0.1753, 0.2945], so 0.2958 is judged to be an outlier, and the method proceeds to step H.
In step H, the outlier set O of the test data set D is output;
in Example 1, the outlier set is O = (0.2958).
Example 2
As shown in FIG. 1, the flowchart of the method for detecting outliers in test data of distribution automation equipment provided by the invention, the method comprises the following steps:
In step A, test data of the distribution automation terminal are collected to obtain a test data set D, and the normalization factors $b_1$ and $b_2$ and the outlier decision parameter n are set, where the normalization factor $b_1$ is calculated as:
$b_1 = 1 / [2 \cdot Q(0.75)]$ (15)
where Q(0.75) is the 0.75 quantile of the standard normal distribution, with a value of 0.67449;
the normalization factor $b_2$ is calculated as:
$b_2 = 1 / Q(0.75)$ (16)
where Q(0.75) is the 0.75 quantile of the standard normal distribution, with a value of 0.67449;
the outlier decision parameter n is given by the following formula:
[Formula image BDA0002699643600000101 is not reproduced in this text.]
In Example 2, the test data set is D = (0.0176, 0.0195, 0.0201, 0.0202, 0.0203, 0.0207, 0.0209, 0.0212, 0.0241), the normalization factors are set to $b_1$ = 0.7413 and $b_2$ = 1.4826, and the outlier decision parameter is n = 3;
in step B, the data in the test data set D are sorted in ascending order, and the lower quartile $Q_1$, the median M and the upper quartile $Q_3$ are found; the positions of $Q_1$, M and $Q_3$ are given by the following formula:
[Formula image BDA0002699643600000102 is not reproduced in this text.]
where N is the total number of data in the test data set D;
the normalized interquartile range NIQR is calculated as:
$\mathrm{NIQR} = b_1 \cdot (Q_3 - Q_1)$ (19)
where NIQR is the normalized interquartile range; $b_1$ is the normalization factor; $Q_3$ is the upper quartile of the test data set D; and $Q_1$ is the lower quartile of the test data set D;
in Example 2, for the test data set D = (0.0176, 0.0195, 0.0201, 0.0202, 0.0203, 0.0207, 0.0209, 0.0212, 0.0241), the lower quartile $Q_1$ is 0.0200, the median M is 0.0203, the upper quartile $Q_3$ is 0.0210, and the normalized interquartile range NIQR is 0.0007;
in step C, the absolute deviation of each datum in the test data set D from the median M is calculated to obtain a new absolute deviation data set E, using the formula:
$e_i = |d_i - M|, \quad i = 1, 2, \ldots, N$ (20)
where $d_i$ is an element of the test data set D; M is the median of the test data set D; and $e_i$ is an element of the absolute deviation data set E;
in Example 2, the calculated absolute deviation data set is E = (0.0027, 0.0008, 0.0002, 0.0001, 0.0000, 0.0004, 0.0006, 0.0009, 0.0038);
in step D, the data in the absolute deviation data set E are sorted in ascending order, and the normalized median absolute deviation NMAD is calculated as:
$\mathrm{NMAD} = b_2 \cdot M_1$ (21)
where NMAD is the normalized median absolute deviation; $b_2$ is the normalization factor; and $M_1$ is the median of the absolute deviation data set E;
in Example 2, the median $M_1$ of the absolute deviation data set E = (0.0027, 0.0008, 0.0002, 0.0001, 0.0000, 0.0004, 0.0006, 0.0009, 0.0038) is 0.0006, and the normalized median absolute deviation NMAD is 0.0009;
in step E, NIQR is compared with NMAD; if NIQR is greater than NMAD, the method proceeds to step F, otherwise it proceeds to step G; in Example 2, NIQR is 0.0007 and NMAD is 0.0009, so NIQR is less than NMAD and the method proceeds to step G;
in step G, the interval [M - n·NMAD, M + n·NMAD] is taken as the criterion for outlier detection in the test data set D, every datum in D that lies outside the interval [M - n·NMAD, M + n·NMAD] is judged to be an outlier, and the method proceeds to step H;
in Example 2, the criterion interval [M - n·NMAD, M + n·NMAD] for outlier detection in the test data set D is [0.0176, 0.0230], so 0.0241 is judged to be an outlier, and the method proceeds to step H.
In step H, the outlier set O of the test data set D is output;
in Example 2, the outlier set is O = (0.0241).
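The result of Example 2 depends on NMAD being rounded to four decimals before the interval is formed, because the smallest datum, 0.0176, sits exactly on the reported lower limit. The sketch below illustrates this with exact decimal arithmetic from Python's `decimal` module; it is an observation about the arithmetic, not a statement from the source, and the helper names `round4` and `outside` are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP
from statistics import median

FOUR_PLACES = Decimal("0.0001")


def round4(x):
    """Round a Decimal to four decimal places, half up."""
    return x.quantize(FOUR_PLACES, rounding=ROUND_HALF_UP)


data = [Decimal(s) for s in ("0.0176", "0.0195", "0.0201", "0.0202", "0.0203",
                             "0.0207", "0.0209", "0.0212", "0.0241")]
b2 = Decimal("1.4826")
n = 3

m = median(data)                           # 0.0203
m1 = median([abs(x - m) for x in data])    # 0.0006
nmad_exact = b2 * m1                       # 0.00088956
nmad_rounded = round4(nmad_exact)          # 0.0009, the value reported in Example 2


def outside(scale):
    """Data lying outside [M - n*scale, M + n*scale]."""
    low, high = m - n * scale, m + n * scale
    return [x for x in data if x < low or x > high]


print(outside(nmad_rounded))  # [Decimal('0.0241')]
print(outside(nmad_exact))    # [Decimal('0.0176'), Decimal('0.0241')]
```

With the rounded NMAD the lower limit is exactly 0.0176, so 0.0176 lies on the boundary and is kept, which matches the reported outlier set. With the unrounded NMAD the lower limit is about 0.01763, and 0.0176 would fall just outside the interval. An implementation therefore needs to fix its rounding convention to reproduce this example.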
Example 3
As shown in FIG. 1, the flowchart of the method for detecting outliers in test data of distribution automation equipment provided by the invention, the method comprises the following steps:
In step A, test data of the distribution automation terminal are collected to obtain a test data set D, and the normalization factors $b_1$ and $b_2$ and the outlier decision parameter n are set, where the normalization factor $b_1$ is calculated as:
$b_1 = 1 / [2 \cdot Q(0.75)]$ (22)
where Q(0.75) is the 0.75 quantile of the standard normal distribution, with a value of 0.67449;
the normalization factor $b_2$ is calculated as:
$b_2 = 1 / Q(0.75)$ (23)
where Q(0.75) is the 0.75 quantile of the standard normal distribution, with a value of 0.67449;
the outlier decision parameter n is given by the following formula:
[Formula image BDA0002699643600000121 is not reproduced in this text.]
In Example 3, the test data set is D = (0.0449, 0.2173, 0.1707, 0.1116, 0.0634, 0.0102, 0.0237, 0.0284, 0.0165, 0.0022, 0.0116, 0.0398, 0.0548, 0.0739), the normalization factors are set to $b_1$ = 0.7413 and $b_2$ = 1.4826, and the outlier decision parameter is n = 3;
in step B, the data in the test data set D are sorted in ascending order, and the lower quartile $Q_1$, the median M and the upper quartile $Q_3$ are found; the positions of $Q_1$, M and $Q_3$ are given by the following formula:
[Formula image BDA0002699643600000131 is not reproduced in this text.]
where N is the total number of data in the test data set D;
the normalized interquartile range NIQR is calculated as:
$\mathrm{NIQR} = b_1 \cdot (Q_3 - Q_1)$ (26)
where NIQR is the normalized interquartile range; $b_1$ is the normalization factor; $Q_3$ is the upper quartile of the test data set D; and $Q_1$ is the lower quartile of the test data set D;
in Example 3, for the test data set D = (0.0449, 0.2173, 0.1707, 0.1116, 0.0634, 0.0102, 0.0237, 0.0284, 0.0165, 0.0022, 0.0116, 0.0398, 0.0548, 0.0739), the lower quartile $Q_1$ is 0.0165, the median M is 0.0424, the upper quartile $Q_3$ is 0.0739, and the normalized interquartile range NIQR is 0.0426;
in step C, the absolute deviation of each datum in the test data set D from the median M is calculated to obtain a new absolute deviation data set E, using the formula:
$e_i = |d_i - M|, \quad i = 1, 2, \ldots, N$ (27)
where $d_i$ is an element of the test data set D; M is the median of the test data set D; and $e_i$ is an element of the absolute deviation data set E;
in Example 3, the calculated absolute deviation data set is E = (0.0026, 0.1750, 0.1284, 0.0693, 0.0211, 0.0322, 0.0187, 0.0140, 0.0259, 0.0402, 0.0308, 0.0026, 0.0125, 0.0316);
in step D, the data in the absolute deviation data set E are sorted in ascending order, and the normalized median absolute deviation NMAD is calculated as:
$\mathrm{NMAD} = b_2 \cdot M_1$ (28)
where NMAD is the normalized median absolute deviation; $b_2$ is the normalization factor; and $M_1$ is the median of the absolute deviation data set E;
in Example 3, the median $M_1$ of the absolute deviation data set E = (0.0026, 0.1750, 0.1284, 0.0693, 0.0211, 0.0322, 0.0187, 0.0140, 0.0259, 0.0402, 0.0308, 0.0026, 0.0125, 0.0316) is 0.0283, and the normalized median absolute deviation NMAD is 0.0420;
in step E, NIQR is compared with NMAD; if NIQR is greater than NMAD, the method proceeds to step F, otherwise it proceeds to step G; in Example 3, NIQR is 0.0426 and NMAD is 0.0420, so NIQR is greater than NMAD and the method proceeds to step F;
in step F, the interval [M - n·NIQR, M + n·NIQR] is taken as the criterion for outlier detection in the test data set D, every datum in D that lies outside the interval [M - n·NIQR, M + n·NIQR] is judged to be an outlier, and the method proceeds to step H;
in Example 3, the criterion interval [M - n·NIQR, M + n·NIQR] for outlier detection in the test data set D is [-0.0854, 0.1702], so 0.2173 and 0.1707 are judged to be outliers, and the method proceeds to step H.
In step H, the outlier set O of the test data set D is output;
in Example 3, the outlier set is O = (0.2173, 0.1707).

Claims (8)

1. A method for detecting outliers in test data of distribution automation equipment, comprising the following steps:
Step A, collecting test data of the distribution automation terminal to obtain a test data set D, and setting the normalization factors $b_1$ and $b_2$ and the outlier decision parameter n;
Step B, sorting the data in the test data set D in ascending order to obtain the median M and the normalized interquartile range NIQR of the test data set D;
Step C, calculating the absolute deviation of each datum in the test data set D from the median M to obtain an absolute deviation data set E;
Step D, sorting the data in the absolute deviation data set E in ascending order and calculating the normalized median absolute deviation NMAD;
Step E, comparing NIQR with NMAD; if NIQR is greater than NMAD, proceeding to step F, otherwise proceeding to step G;
Step F, taking the interval [M - n·NIQR, M + n·NIQR] as the criterion for outlier detection in the test data set D, and judging every datum in D that lies outside the interval [M - n·NIQR, M + n·NIQR] to be an outlier;
Step G, taking the interval [M - n·NMAD, M + n·NMAD] as the criterion for outlier detection in the test data set D, and judging every datum in D that lies outside the interval [M - n·NMAD, M + n·NMAD] to be an outlier;
Step H, outputting the outliers in the test data set D to obtain an outlier set O.
2. The method for detecting outliers in test data of distribution automation equipment according to claim 1, wherein the normalization factor $b_1$ in step A is calculated as:
$b_1 = 1 / [2 \cdot Q(0.75)]$
where Q(0.75) is the 0.75 quantile of the standard normal distribution, with a value of 0.67449.
3. The method for detecting outliers in test data of distribution automation equipment according to claim 1, wherein the normalization factor $b_2$ in step A is calculated as:
$b_2 = 1 / Q(0.75)$
where Q(0.75) is the 0.75 quantile of the standard normal distribution, with a value of 0.67449.
4. The method for detecting outliers in test data of distribution automation equipment according to claim 1, wherein in step A the outlier decision parameter n is given by the following formula:
[Formula image FDA0002699643590000021 is not reproduced in this text.]
5. The method for detecting outliers in test data of distribution automation equipment according to claim 1, wherein in step B the positions of the lower quartile $Q_1$, the median M and the upper quartile $Q_3$ are given by the following formula:
[Formula image FDA0002699643590000022 is not reproduced in this text.]
where N is the total number of data in the test data set D.
6. The method for detecting outliers in test data of distribution automation equipment according to claim 1, wherein the normalized interquartile range NIQR in step B is calculated as:
$\mathrm{NIQR} = b_1 \cdot (Q_3 - Q_1)$
where NIQR is the normalized interquartile range; $b_1$ is the normalization factor; $Q_3$ is the upper quartile of the test data set D; and $Q_1$ is the lower quartile of the test data set D.
7. The method for detecting outliers in test data of distribution automation equipment according to claim 1, wherein in step C the absolute deviation of each datum in the test data set D from the median M is calculated to obtain the absolute deviation data set E, using the formula:
$e_i = |d_i - M|, \quad i = 1, 2, \ldots, N$
where $d_i$ is an element of the test data set D; M is the median of the test data set D; $e_i$ is an element of the absolute deviation data set E; and N is the total number of data in D.
8. The method for detecting outliers in test data of distribution automation equipment according to claim 1, wherein in step D the data in the absolute deviation data set E are sorted in ascending order and the normalized median absolute deviation NMAD is calculated as:
$\mathrm{NMAD} = b_2 \cdot M_1$
where NMAD is the normalized median absolute deviation; $b_2$ is the normalization factor; and $M_1$ is the median of the absolute deviation data set E.
CN202011017753.5A 2020-09-24 2020-09-24 Test data outlier detection method for distribution automation equipment Pending CN112116014A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011017753.5A CN112116014A (en) 2020-09-24 2020-09-24 Test data outlier detection method for distribution automation equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011017753.5A CN112116014A (en) 2020-09-24 2020-09-24 Test data outlier detection method for distribution automation equipment

Publications (1)

Publication Number Publication Date
CN112116014A true CN112116014A (en) 2020-12-22

Family

ID=73801654

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011017753.5A Pending CN112116014A (en) 2020-09-24 2020-09-24 Test data outlier detection method for distribution automation equipment

Country Status (1)

Country Link
CN (1) CN112116014A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116016303A (en) * 2022-12-05 2023-04-25 浪潮通信信息系统有限公司 Method for identifying service quality problem of core network based on artificial intelligence

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040267477A1 (en) * 2001-05-24 2004-12-30 Scott Michael J. Methods and apparatus for data analysis
CN102591267A (en) * 2011-09-23 2012-07-18 天昌国际烟草有限公司 Method for monitoring quality of production process by using target
CN104915846A (en) * 2015-06-18 2015-09-16 北京京东尚科信息技术有限公司 Electronic commerce time sequence data anomaly detection method and system
AU2016201652A1 (en) * 2008-01-03 2016-04-07 University Of Maryland Monitoring a mobile device
CN107729294A (en) * 2017-09-28 2018-02-23 天津同阳科技发展有限公司 The acquisition methods and device of outlier in Detection of Air Quality data
CN109426809A (en) * 2017-08-25 2019-03-05 是德科技股份有限公司 The method and apparatus that detecting event starts in the presence of noise
CN109460776A (en) * 2018-10-11 2019-03-12 浙江工业大学 A kind of driver's differentiating method based on channel status detection
CN110336322A (en) * 2019-07-11 2019-10-15 贵州大学 Method is determined based on the photovoltaic power generation allowed capacity of day minimum load confidence interval
CN110751371A (en) * 2019-09-20 2020-02-04 苏宁云计算有限公司 Commodity inventory risk early warning method and system based on statistical four-bit distance and computer readable storage medium
CN111262722A (en) * 2019-12-31 2020-06-09 中国广核电力股份有限公司 Safety monitoring method for industrial control system network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
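CHRISTOPHE LEYS et al.: "Detecting outliers: Do not use standard deviation around the mean, use absolute deviation around the median", Journal of Experimental Social Psychology *
王青青 (Wang Qingqing): "Research on outlier detection algorithms for intelligent attendance applications", China Master's Theses Full-text Database, Information Science and Technology *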

Similar Documents

Publication Publication Date Title
CN108802535B (en) Screening method, main interference source identification method and device, server and storage medium
CN108921424B (en) Power data anomaly detection method, device, equipment and readable storage medium
US8756028B2 (en) Fault detection method of semiconductor manufacturing processes and system architecture thereof
CN110930057A (en) Quantitative evaluation method for reliability of distribution transformer test result based on LOF algorithm
CN109285791A (en) Design layout-based rapid online defect diagnosis, classification and sampling method and system
CN112949735A (en) Liquid hazardous chemical substance volatile concentration abnormity discovery method based on outlier data mining
CN110751213B (en) Method for identifying and supplementing abnormal wind speed data of wind measuring tower
CN112417763A (en) Defect diagnosis method, device and equipment for power transmission line and storage medium
CN111400911A (en) GNSS deformation information identification and early warning method based on EWMA control chart
CN112116014A (en) Test data outlier detection method for distribution automation equipment
CN111521883A (en) Method and system for obtaining electric field measurement value of high-voltage direct-current transmission line
CN106855990B (en) Nuclear power unit instrument channel measurement error demonstration method
CN108760268B (en) Step fault diagnosis method for vertical mill operation data based on information entropy
CN110907984A (en) Method for detecting earthquake front infrared long-wave radiation abnormal information based on autoregressive moving average model
CN109308395B (en) Wafer-level space measurement parameter anomaly identification method based on LOF-KNN algorithm
CN116644368A (en) Outlier identification method based on improved Grabbs test method
CN111507374A (en) Power grid mass data anomaly detection method based on random matrix theory
CN111797545B (en) Wind turbine generator yaw reduction coefficient calculation method based on measured data
CN113672658B (en) Power equipment online monitoring error data identification method based on complex correlation coefficient
WO2019041732A1 (en) Evaluation method and apparatus for manufacturing process capability
CN115015810A (en) Electric connection defect diagnosis system applied to induction capacitor
CN110705924B (en) Wind measuring data processing method and device of wind measuring tower based on wind direction sector
CN110259435B (en) Well condition change identification method based on oil pumping unit electrical parameters
CN112763678A (en) PCA-based sewage treatment process monitoring method and system
CN105930870A (en) Engineering safety monitoring data outlier detection method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201222