AU2021335237A1 - Method for detecting abnormality of automatic verification system of smart watt-hour meter based on transductive support vector machine (TSVM) model - Google Patents
- Publication number
- AU2021335237A1
- Authority
- AU
- Australia
- Prior art keywords
- tsvm
- model
- meter
- data
- sample
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01R—MEASURING ELECTRIC VARIABLES; MEASURING MAGNETIC VARIABLES
- G01R35/00—Testing or calibrating of apparatus covered by the other groups of this subclass
- G01R35/04—Testing or calibrating of apparatus covered by the other groups of this subclass of instruments for measuring time integral of power or current
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01R—MEASURING ELECTRIC VARIABLES; MEASURING MAGNETIC VARIABLES
- G01R1/00—Details of instruments or arrangements of the types included in groups G01R5/00 - G01R13/00 and G01R31/00
- G01R1/02—General constructional details
- G01R1/04—Housings; Supporting members; Arrangements of terminals
Landscapes
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Testing And Monitoring For Control Systems (AREA)
- Complex Calculations (AREA)
Abstract
WO 2023/273249 A1
International publication date: 05.01.2023
(51) International Patent Classification: G01R 35/04 (2006.01); G01R 1/04 (2006.01)
(21) International application number: PCT/CN2021/141547
(22) International filing date: 27.12.2021
(30) Priority data: 202110732174.7, 30.06.2021, CN
(71) Applicants: STATE GRID SHANGHAI MUNICIPAL ELECTRIC POWER COMPANY [CN/CN], Shanghai 200122 (CN); SHANGHAI SHINEENERGY INFORMATION TECHNOLOGY DEVELOPMENT CO., LTD. [CN/CN], Shanghai 202156 (CN)
(72) Inventors: ZHUANG, Gewei; GU, Zhen; HE, Qing; ZHOU, Lei; ZHANG, Jingyue; FENG, Xiuqing (Shanghai 200122 (CN)); SU, Pengtao; PAN, Ye (Shanghai 200025 (CN))
(74) Agent: SCIHEAD IP LAW FIRM, Guangdong 510070 (CN)
(81) Designated states: AE, AG, AL, AM, AO, AT, AU, AZ, BA, BB, BG, BH, BN, BR, BW, BY, BZ, CA, CH, CL, CN, CO, CR, CU, CZ, DE, DJ, DK, DM, DO, DZ, EC, EE, EG, ES, FI, GB, GD, GE, GH, GM, GT, HN, HR, HU, ID, IL, IN, IR, IS, IT, JO, JP, KE, KG, KH, KN, KP, KR, KW, KZ, LA, LC, LK, LR, LS, LU, LY, MA, MD, ME, MG, MK, MN, MW, MX, MY, MZ, NA, NG, NI, NO, NZ, OM, PA, PE, PG, PH, PL, PT, QA, RO, RS, RU, RW, SA, SC, SD, SE, SG, SK, SL, ST, SV, SY, TH, TJ, TM, TN, TR, TT, TZ, UA, UG, US, UZ, VC, VN, WS, ZA, ZM, ZW
(84) Designated states (regional): ARIPO (BW, GH, GM, KE, LR, LS, MW, MZ, NA, RW, SD, SL, ST, SZ, TZ, UG, ZM, ZW), Eurasian (AM, AZ, BY, KG, KZ, RU, TJ, TM), European (AL, AT, BE, BG, CH, CY, CZ, DE, DK, EE, ES, FI, FR, GB, GR, HR, HU, IE, IS, IT, LT, LU, LV, MC, MK, MT, NL, NO, PL, PT, RO, RS, SE, SI, SK, SM, TR), OAPI (BF, BJ, CF, CG, CI, CM, GA, GN, GQ, GW, KM, ML, MR, NE, SN, TD, TG)
(54) Title: TSVM-MODEL-BASED ABNORMALITY DETECTION METHOD FOR AUTOMATIC VERIFICATION SYSTEM OF SMART ELECTRICITY METER
[Figure: flowchart of steps S1 to S4]
S1 Perform feature extraction on error experimental data of a verification meter position to be tested that includes a small amount of abnormal data, construct feature vectors, and perform pre-processing
S2 Manually label some samples
S3 Use labeled samples and unlabeled samples to obtain a TSVM-based abnormality detection model by means of training in a semi-supervised manner
S4 Use the TSVM-based abnormality detection model to dynamically predict an abnormal state of the verification meter position
(57) Abstract: A TSVM-model-based abnormality detection method for an automatic verification system of a smart electricity meter, the method comprising the following steps: S1, performing feature extraction on error experimental data of a verification meter position to be tested that includes a small amount of abnormal data, constructing feature vectors, and performing pre-processing to form data samples; S2, manually labeling some of the samples; S3, using labeled samples and unlabeled samples to obtain a TSVM-based abnormality detection model by means of training in a semi-supervised manner; and S4, using the TSVM-based abnormality detection model to dynamically predict an abnormal state of the verification meter position. The method has the advantages of being high in accuracy, achieving on-line detection and saving on detection costs.
Description
[0001] The present disclosure relates to an abnormality detection method for an automatic verification system of a smart watt-hour meter, and in particular, to a method for detecting an abnormality of an automatic verification system of a smart watt-hour meter based on a transductive support vector machine (TSVM) model.
[0002] Watt-hour meters provide the basis for trade settlement in power transactions, so their verification is becoming increasingly important. As smart grid construction progresses, demand for smart watt-hour meters grows day by day. Automatic verification systems with high verification efficiency have emerged to cope with the resulting surge in verification demand. However, during long-term uninterrupted operation of such a system, the connection links may suffer mechanical fatigue or even aging, resulting in abnormal verification results.
[0003] At present, the metrology center regularly shuts down the assembly line of the automatic verification system and carries out manual inspection to ensure that each verification unit is in a healthy running state. However, this method cannot obtain risk information about the assembly line in time, so a faulty verification system continues to serve test projects until the next manual inspection, which creates a risk of large-scale deviation in test results. Shortening the interval between manual inspections reduces this risk to some extent, but it greatly reduces the verification efficiency of the assembly line and increases manpower, operation, and maintenance costs. Therefore, on-line evaluation of the mechanical performance of the connection link of each verification meter position in the automatic verification system is of great significance for improving the reliability of the system.
[0004] The purpose of the present disclosure is to provide an anomaly detection method of an intelligent watt-hour meter automatic verification system based on TSVM model in order to overcome the defects existing in the prior art.
[0005] The object of the present disclosure can be realized by the following technical solution:
[0006] The present disclosure relates to an anomaly detection method for an automatic verification system of a smart watt-hour meter based on a TSVM model. The method includes the following steps:
[0007] S1: performing feature extraction and eigenvector construction on error test data that is of a to-be-detected verification meter position and contains a small amount of abnormal data, and performing preprocessing to form a data sample;
[0008] S2: labeling some samples manually;
[0009] S3: using labeled samples and unlabeled samples to train in a semi-supervised way to obtain an anomaly detection model based on TSVM; and
[0010] S4: dynamically predicting an abnormal state of the verification meter position by using the TSVM-based abnormality detection model.
[0011] Preferably, the eigenvector construction in step S1 includes: obtaining historical error test data of each verification meter position under different verification test items, extracting an eigenvalue of historical error test data under each verification test item, and combining eigenvalues under all verification test items into an eigenvector of the corresponding verification meter position.
[0012] Preferably, the eigenvalue includes a maximum value, a minimum value, an expectation, a variance, a skewness, and a kurtosis of the historical error test data.
[0013] Preferably, the preprocessing in step S1 includes normalization and dimension reduction of an eigenvector of each meter position.
[0014] Preferably, the normalization is performed in the following manner:
$$z = \frac{x - u}{S}$$
[0015] where x represents an eigenvalue of a to-be-processed eigenvector, u represents an expectation of the eigenvalue of the to-be-processed eigenvector, S represents a standard deviation of the eigenvalue of the to-be-processed eigenvector, and z represents a normalized eigenvalue.
[0016] Preferably, the dimension reduction includes principal component analysis (PCA).
[0017] Preferably, step S2 specifically includes:
[0018] obtaining an "abnormal meter position" through preliminary screening based on the data sample and an unsupervised abnormality detection algorithm; and
[0019] manually checking and labeling the obtained "abnormal meter position", determining the "abnormal meter position" as a normal meter position or an abnormal meter position based on a manual check result, and labeling a data sample corresponding to a manually checked verification meter position to form a labeled sample.
[0020] Preferably, the unsupervised abnormality detection algorithm includes an isolation forest (Iforest) algorithm, a local outlier factor (LOF) algorithm, and a one-class support vector machine (OCSVM) algorithm.
[0021] Preferably, a quantity of labeled samples during model training in step S3 is less than a quantity of unlabeled samples.
[0022] Preferably, the method further includes: optimizing the TSVM-based abnormality detection model, specifically: using the model to predict abnormal data in a to-be-detected sample, manually checking and labeling the abnormal data, constructing a labeled sample library by using all obtained manually-labeled samples, selecting data points close to a classification boundary from the labeled sample library to form new labeled samples, and performing semi-supervised model training based on the new labeled samples and the unlabeled samples to complete the optimization; and predicting the data points in the labeled sample library by using the optimized model, and calculating the proportion of labeled samples whose predicted state differs from the real state, where when the proportion is less than a specified threshold, it is determined that performance of the model meets a prediction accuracy condition, and the model can directly predict a to-be-detected dataset.
[0023] Compared with the prior art, the present disclosure has the following advantages:
[0024] (1) The present disclosure uses a small quantity of labeled samples and a large quantity of unlabeled samples to construct the TSVM-based abnormality detection model in a semi-supervised manner. This can effectively reduce a cost of manual inspection compared with other methods.
[0025] (2) Based on historical error test data generated for one verification meter position, the present disclosure takes statistics on maximum and minimum values in data of each verification test item, calculates an expectation, a variance, a skewness, and a kurtosis of the data to describe an average level, a dispersion degree, asymmetry, and a proportion of an extreme outlier for data distribution of the verification meter position, and converts an abnormal state of the meter position into an abnormality of data distribution, which makes it possible to analyze a state of the meter position based on the data and realizes on-line evaluation of the abnormal state of the meter position, thereby reducing impact on an assembly line and improving verification efficiency.
[0026] (3) The PCA method adopted in the present disclosure effectively reduces dimensions of sample data of the verification meter position, thereby effectively resolving problems of sparse data samples and difficult distance calculation in a high-dimensional situation, and reducing a difficulty of abnormality detection.
[0027] (4) The present disclosure can continuously obtain a new labeled sample in a working process, and expand and optimize the TSVM-based abnormality detection model still through semi-supervised training based on the new labeled sample and the unlabeled sample to continuously improve accuracy of the model.
[0028] FIG. 1 is a flowchart of a method for detecting an abnormality of an automatic verification system of a smart watt-hour meter based on a TSVM model according to the present disclosure;
[0029] FIG. 2 shows proportions of reserved sample feature information under different dimensions according to an embodiment of the present disclosure; and
[0030] FIG. 3 is a schematic flowchart of detecting an abnormality of an automatic verification system of a smart watt-hour meter in actual application according to the present disclosure.
[0031] The present disclosure will be described in detail in conjunction with the accompanying drawings and specific embodiments. It should be noted that the following description of implementations is merely exemplary in nature, and the present disclosure is neither intended to limit its application or use nor limited to the following implementations.
[0032] Embodiment
[0033] As shown in FIG. 1, this embodiment provides a method for detecting an abnormality of an automatic verification system of a smart watt-hour meter based on a TSVM model. The method includes the following steps.
[0034] S1: Perform feature extraction and eigenvector construction on error test data that is of a to-be-detected verification meter position and contains a small amount of abnormal data, and perform preprocessing to form a data sample.
[0035] Specifically, it is assumed that one assembly line of an automatic verification system of a smart watt-hour meter contains 30 verification units, and a test dataset of each verification unit contains 60 verification meter position samples. In each verification task, smart watt-hour meters from a same batch are randomly assigned to different meter positions for a plurality of different error tests. Obtained error test data can not only reflect a quality problem of the smart watt-hour meter, but also indirectly reflect a problem of a verification apparatus.
[0036] Assuming that metering performance of the smart watt-hour meters of the same batch has a same distribution feature, when all verification meter positions are in a normal state and their states are consistent, it is considered that error test data corresponding to 60 verification meter positions in a same verification unit should also have a same distribution feature. When a verification meter position has a fault such as corrosion or deformation, its distribution feature is different from distribution features of other meter positions, which is expressed as an "abnormal" data point. In order to extract a data distribution eigenvalue from massive error test data, related statistics of massive error test data generated for one verification meter position are calculated. Based on data generated for one verification meter position, statistics are taken on maximum and minimum values in data of each test item, an expectation, a variance, a skewness, and a kurtosis of the data are calculated to describe an average level, a dispersion degree, asymmetry, and a proportion of an extreme outlier for data distribution of the verification meter position, and an abnormal state of the meter position is converted into an abnormality of data distribution.
[0037] Therefore, the eigenvector construction in step S1 includes: obtaining historical error test data of each verification meter position under different verification test items, extracting an eigenvalue of historical error test data under each verification test item, and combining eigenvalues under all verification test items into an eigenvector of the corresponding verification meter position. The eigenvalue includes a maximum value, a minimum value, an expectation, a variance, a skewness, and a kurtosis of the historical error test data.
[0038] An assembly line of the verification system contains 30 verification units. A test dataset of each verification unit contains 60 verification meter position samples, namely, $\{x_1, x_2, \ldots, x_{60}\}$. A maximum value, a minimum value, an expectation, a variance, a skewness, and a kurtosis of each piece of error test data corresponding to each meter position are calculated to construct an eigenvector of each meter position sample. For example, if m error tests are conducted, each sample contains 6m eigenvalues, namely, 6m dimensions.
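The eigenvector construction described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function names `distribution_features` and `meter_position_eigenvector` are hypothetical, and the use of 1/n (population) moments and excess kurtosis is an assumption, since the source does not state which estimator produced its values.

```python
def distribution_features(errors):
    """Return (max, min, mean, variance, skewness, excess kurtosis) of one test item.

    Assumption: central moments use a 1/n divisor and kurtosis is reported
    as excess kurtosis (normal distribution -> 0). Constant data would give
    a zero variance and a division by zero, so it is not handled here.
    """
    n = len(errors)
    mean = sum(errors) / n
    m2 = sum((e - mean) ** 2 for e in errors) / n
    m3 = sum((e - mean) ** 3 for e in errors) / n
    m4 = sum((e - mean) ** 4 for e in errors) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2 - 3.0
    return (max(errors), min(errors), mean, m2, skewness, kurtosis)


def meter_position_eigenvector(errors_by_item):
    """Concatenate the six statistics of every test item into one 6m-dimensional vector."""
    vector = []
    for errors in errors_by_item:
        vector.extend(distribution_features(errors))
    return vector
```

With ten test items per meter position, `meter_position_eigenvector` returns the 60 eigenvalues mentioned in the embodiment.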
[0039] To prevent features with large numeric ranges from drowning out the other features and degrading the prediction performance of the outlier factor algorithm, every eigenvalue of the sample is scaled to the same scale. Normalized feature scaling is performed according to the following formula:
$$z = \frac{x - u}{S}$$
[0040] In the above formula, x represents an eigenvalue of a to-be-processed eigenvector, u represents an expectation of the eigenvalue of the to-be-processed eigenvector, S represents a standard deviation of the eigenvalue of the to-be-processed eigenvector, and z represents a normalized eigenvalue. Normalization can realize an average value of 0 and a variance of 1 for all features of the sample.
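As a small illustration of this normalization, the sketch below standardizes each feature column across the meter position samples of one verification unit. The function name `standardize_columns` is hypothetical, and the 1/n standard deviation is an assumption; a column with zero spread is not handled.

```python
def standardize_columns(samples):
    """Apply z = (x - u) / S to every feature column of a list of sample rows."""
    n = len(samples)
    columns = list(zip(*samples))
    means = [sum(col) / n for col in columns]
    # Population standard deviation (1/n divisor); an assumption.
    stds = [(sum((v - m) ** 2 for v in col) / n) ** 0.5
            for col, m in zip(columns, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)]
            for row in samples]
```

After this step every column has mean 0 and variance 1, so no single test item dominates the distance calculations used later.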
[0041] Each verification meter position sample has up to 60 data dimensions. In this case, data samples are sparse and distances are difficult to calculate, which makes abnormality detection more difficult. Therefore, it is necessary to perform dimension reduction on the eigenvector. PCA is one of the most commonly used dimension reduction methods. The specific dimension reduction process is as follows:
[0042] The following sample set is input: $D = \{X_1, X_2, \ldots, X_{59}, X_{60}\}$, where $X_i$ represents a sample and $i$ is an integer ranging from 1 to 60.
[0043] All the samples are centralized as follows:
$$X_i \leftarrow X_i - \frac{1}{60}\sum_{i=1}^{60} X_i, \quad i = 1, 2, 3, \ldots, 60$$
[0044] A covariance matrix $XX^{\mathrm{T}}$ of the samples is calculated.
[0045] Eigenvalue decomposition is performed on the covariance matrix $XX^{\mathrm{T}}$.
[0046] Eigenvectors $W_1, W_2, \ldots, W_{d'}$ corresponding to the largest $d'$ eigenvalues are taken.
[0047] A dimension $d'$ obtained after dimension reduction is specified by a user. Proportions of data feature information under different dimensions are different. The user may determine a value of $d'$ by setting a proportion of feature information to be reserved. For a data sample of the automatic verification system of the smart watt-hour meter, proportions, corresponding to different $d'$ values, of reserved feature information are shown in FIG. 2. If nearly 99.9% of feature information needs to be reserved in normalized sample data, more than 40 data dimensions are required, in other words, a quantity of effective data dimensions to be analyzed by the abnormality detection algorithm is 40.
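The PCA steps above, including the choice of d' from a retained-information proportion, can be sketched with NumPy. This is an illustrative sketch under stated assumptions: the function name `pca_reduce` and the parameter `keep_ratio` are hypothetical, samples are stored as rows (so the covariance is formed as the transpose product), and the retained proportion is measured by cumulative eigenvalue share.

```python
import numpy as np


def pca_reduce(X, keep_ratio=0.999):
    """Project row-wise samples X onto the top d' principal components.

    d' is the smallest dimension whose cumulative eigenvalue share
    reaches keep_ratio. Returns (projected data, d', cumulative shares).
    """
    Xc = X - X.mean(axis=0)                      # centralize all samples
    cov = Xc.T @ Xc / (X.shape[0] - 1)           # covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalue decomposition
    order = np.argsort(eigvals)[::-1]            # largest eigenvalues first
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    d = min(int(np.searchsorted(cumulative, keep_ratio)) + 1, len(eigvals))
    return Xc @ eigvecs[:, :d], d, cumulative
```

Plotting `cumulative` against the dimension index gives a curve of the kind the source describes for FIG. 2.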
[0048] S2: Label some samples manually.
[0049] Specifically, an "abnormal meter position" is obtained through preliminary screening based on the data sample and an unsupervised abnormality detection algorithm.
[0050] The obtained "abnormal meter position" is manually checked and labeled, the "abnormal meter position" is determined as a normal meter position or an abnormal meter position based on the manual check result, and the data sample corresponding to the manually checked verification meter position is labeled to form a labeled sample. The unsupervised abnormality detection algorithm includes an Iforest algorithm, a LOF algorithm, and an OCSVM algorithm. The Iforest algorithm has a good effect on global abnormality detection and is suitable for abnormality detection of continuous, high-dimensional data. It builds a forest through repeated binary-tree division: each time, a feature of the dataset is randomly selected and a randomly chosen value of that feature is used as the split point to divide the dataset, and the division is repeated until the isolation trees of the forest are formed. A sample data point at a lower height in the tree, that is, one isolated after fewer splits, is more likely to be determined as an abnormal data point. The LOF algorithm has a poorer effect than the Iforest algorithm in detecting a global outlier, but has a better effect in detecting a local abnormality in a dataset with a concentrated data distribution and a small proportion of abnormalities. As a density-based outlier detection method, the LOF algorithm determines a local reachability density from the k-th nearest neighborhood (non-global) of a sample point, and compares the local reachability density of the sample point with those of its neighbors to determine whether the sample is an outlier. The lower the density of the sample point relative to its neighbors, the more likely the sample point is an outlier. The OCSVM algorithm is a modified SVM algorithm suitable for novelty detection and sample imbalance scenarios. It has a good effect on abnormality detection of high-dimensional and large sample data. The training samples of an OCSVM model contain data of only one class. The model learns the distribution shape of this class, so that in the detection process it can determine whether a to-be-predicted data sample belongs to the same class as the training samples.
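To make the LOF description concrete, the following is a minimal pure-Python sketch of the outlier factor computation. It is not the implementation used in the source: the function name `lof_scores` and the default `k=3` are hypothetical, Euclidean distance is assumed, and all points are assumed to be distinct (duplicate points would cause a division by zero).

```python
import math


def lof_scores(points, k=3):
    """Return the local outlier factor of every point (values near 1 are inliers)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    neighbors, k_dist = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        kd = dist[i][order[k - 1]]               # distance to the k-th nearest neighbor
        neighbors.append([j for j in order if dist[i][j] <= kd])  # ties included
        k_dist.append(kd)
    lrd = []                                     # local reachability density
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in neighbors[i]]
        lrd.append(len(reach) / sum(reach))
    # LOF: average neighbor density divided by the point's own density.
    return [sum(lrd[j] for j in neighbors[i]) / (len(neighbors[i]) * lrd[i])
            for i in range(n)]
```

A point whose density is much lower than that of its neighbors receives a factor well above 1, which matches the outlier factor values of 2 to 3.5 reported for the flagged meter positions later in the embodiment.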
[0051] The labeled sample is selected based on the principle of minimizing the labeling cost. The samples most likely to be abnormal data points are selected for labeling, which not only eliminates meter position faults but also helps to quickly find new abnormality types. To select an unsupervised abnormality detection algorithm suitable for the data of the automatic verification system of the smart watt-hour meter, the high-dimensional "letter" abnormality dataset from a machine learning library is used to evaluate the accuracy of the three unsupervised abnormality detection algorithms. The data dimension and abnormality degree of the letter dataset are similar to those of the automatic verification system data obtained after PCA-based dimension reduction. The letter dataset has 32 dimensions and 1600 samples, including 100 abnormal samples. A cross validation method is used to optimize the parameters of each algorithm. The test results are shown in Table 1.
Table 1 Average accuracy of unsupervised abnormality detection
Abnormality detection algorithm    Iforest    LOF    OCSVM
Average accuracy                   89%        91%    67%
[0052] S3: Perform semi-supervised training based on the labeled sample and an unlabeled sample, to obtain a TSVM-based abnormality detection model, where a quantity of labeled samples during model training is less than a quantity of unlabeled samples.
[0053] As a representative semi-supervised SVM, the TSVM, like a standard binary SVM classifier, is an algorithm for binary classification. The algorithm tries the possible assignments of the unlabeled samples as normal or abnormal data points, and seeks the hyperplane that maximizes the margin over all samples, including both the labeled samples and the unlabeled samples under their assigned labels.
[0054] Given a labeled sample set $D_l = \{(x_1, y_1), (x_2, y_2), \ldots, (x_l, y_l)\}$ whose sample types are known and an unlabeled sample set $D_u = \{x_{l+1}, x_{l+2}, \ldots, x_m\}$, where $y_i \in \{-1, +1\}$, $-1$ indicates that the sample is abnormal, $+1$ indicates that the sample is normal, and the quantity of samples in $D_l$ is less than that of samples in $D_u$, the goal of the TSVM algorithm is to find the most appropriate labels $\hat{y} = (\hat{y}_{l+1}, \hat{y}_{l+2}, \ldots, \hat{y}_m)$, where $\hat{y}_i \in \{-1, +1\}$, for the to-be-labeled samples, namely:
$$\min_{w, b, \hat{y}, \xi} \ \frac{1}{2}\|w\|_2^2 + C_l \sum_{i=1}^{l} \xi_i + C_u \sum_{i=l+1}^{m} \xi_i$$
$$\text{s.t.} \quad y_i(w^{\mathrm{T}} x_i + b) \ge 1 - \xi_i, \quad i = 1, 2, \ldots, l$$
$$\hat{y}_i(w^{\mathrm{T}} x_i + b) \ge 1 - \xi_i, \quad i = l+1, l+2, \ldots, m$$
$$\xi_i \ge 0, \quad i = 1, 2, \ldots, m$$
[0055] In the above formula, $(w, b)$ represents a hyperplane, $\xi_i$ represents the slack variables in one-to-one correspondence with all samples, and $C_l$ and $C_u$ represent a compromise parameter of a weight of the labeled sample and a compromise parameter of a weight of the unlabeled sample respectively. The TSVM finds an approximate solution of the above formula through a plurality of iterations.
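The source does not spell out the iterations. One common way to approximate the TSVM objective is the label-swapping heuristic: train on the labeled samples, pseudo-label the unlabeled ones, then retrain repeatedly while raising the unlabeled weight and swapping pairs of pseudo-labels that are likely wrong. The sketch below illustrates that heuristic under several assumptions: a linear kernel, a simple subgradient-descent solver, and hypothetical names (`fit_linear_svm`, `train_tsvm`, `max_swaps`) and hyperparameter defaults that do not come from the source.

```python
import numpy as np


def fit_linear_svm(X, y, weights, epochs=500, lr=0.01):
    """Weighted soft-margin linear SVM via full-batch subgradient descent."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        violated = y * (X @ w + b) < 1.0         # samples with non-zero slack
        coef = weights * violated * y
        w -= lr * (w - coef @ X)                 # gradient of 0.5*||w||^2 + weighted hinge
        b -= lr * (-coef.sum())
    return w, b


def train_tsvm(X_l, y_l, X_u, C_l=1.0, C_u=1.0, max_swaps=100):
    # 1. Initial classifier from the labeled samples only.
    w, b = fit_linear_svm(X_l, y_l, np.full(len(y_l), C_l))
    # 2. Pseudo-label the unlabeled samples (+1 normal, -1 abnormal).
    y_u = np.where(X_u @ w + b >= 0, 1.0, -1.0)
    X = np.vstack([X_l, X_u])
    c_u = 1e-3 * C_u                             # start with a small unlabeled weight
    while True:
        weights = np.concatenate([np.full(len(y_l), C_l), np.full(len(y_u), c_u)])
        for _ in range(max_swaps):
            w, b = fit_linear_svm(X, np.concatenate([y_l, y_u]), weights)
            slack = np.maximum(0.0, 1.0 - y_u * (X_u @ w + b))
            pos = np.flatnonzero((y_u > 0) & (slack > 0))
            neg = np.flatnonzero((y_u < 0) & (slack > 0))
            if len(pos) == 0 or len(neg) == 0:
                break
            i, j = pos[np.argmax(slack[pos])], neg[np.argmax(slack[neg])]
            if slack[i] + slack[j] <= 2.0:
                break
            y_u[i], y_u[j] = -1.0, 1.0           # swap a likely mislabeled pair
        if c_u >= C_u:
            return w, b, y_u
        c_u = min(2.0 * c_u, C_u)                # gradually trust the unlabeled data more
```

The sign of `x @ w + b` then gives the predicted state of a new meter position sample, with a negative value indicating an abnormal position under the labeling convention of the source.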
[0056] S4: Dynamically predict an abnormal state of the verification meter position by using the TSVM-based abnormality detection model.
[0057] The method further includes: optimizing the TSVM-based abnormality detection model, specifically: using the model to predict abnormal data in a to-be-detected sample, manually checking and labeling the abnormal data, constructing a labeled sample library by using all obtained manually-labeled samples, selecting data points close to a classification boundary from the labeled sample library to form new labeled samples, and performing semi-supervised model training based on the new labeled samples and the unlabeled samples to complete the optimization; and predicting the data points in the labeled sample library by using the optimized model, and calculating the proportion of labeled samples whose predicted state differs from the real state, where when the proportion is less than a specified threshold, it is determined that performance of the model meets a prediction accuracy condition, and the model can directly predict a to-be-detected dataset.
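The two bookkeeping steps of this optimization loop, picking labeled samples near the classification boundary and checking the mismatch proportion against a threshold, can be sketched as below. All names (`mismatch_ratio`, `closest_to_boundary`, `model_is_accepted`) are hypothetical, the default threshold of 0.05 is an illustrative assumption since the source does not give a value, and "close to the boundary" is interpreted here as a small absolute decision value.

```python
def mismatch_ratio(predicted, actual):
    """Proportion of labeled samples whose predicted state differs from the real state."""
    differing = sum(1 for p, a in zip(predicted, actual) if p != a)
    return differing / len(actual)


def closest_to_boundary(decision_values, count):
    """Indices of the `count` samples with the smallest |w.x + b|."""
    order = sorted(range(len(decision_values)), key=lambda i: abs(decision_values[i]))
    return order[:count]


def model_is_accepted(predicted, actual, threshold=0.05):
    """True when the mismatch proportion is below the specified threshold."""
    return mismatch_ratio(predicted, actual) < threshold
```

Samples returned by `closest_to_boundary` are the ones whose labels constrain the hyperplane most, which is why retraining on them is a plausible way to refine the model with few manual checks.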
[0058] In this embodiment, verification data of an automatic verification system of a smart watt-hour meter whose batch number is JYL20002 from November 10, 2020 to November 13, 2020 is used. Specific steps are as follows:
[0059] Step 1: Perform feature extraction and dimension reduction on the data.
[0060] There are 30 verification units in total in an assembly line of the verification system, and a dataset of each verification unit contains 60 verification meter position samples. Based on the error test data of ten test items generated for each verification meter position, an eigenvector of the verification meter position is constructed. An eigenvector of each sample contains 60 eigenvalues. Verification meter position 1 of verification unit 1 is used as an example, and eigenvalues of verification meter position 1 are shown in Table 2.
Table 2 Eigenvalues of a meter position sample (for example, sample 1)
Test item    Max (X)    Min (X)    E (X)        D (X)       Skew (X)     Kurt (X)
Item 1        0.0070    -0.0509    -0.023917    0.000406     0.106826    -1.541011
Item 2       -0.0135    -0.0740    -0.048492    0.000418     0.457221    -0.870495
Item 3        0.0022    -0.1117    -0.032800    0.000985    -1.535648     2.775146
Item 4        0.0433    -0.0851    -0.001425    0.001208    -1.134629     2.165499
Item 5        0.0676    -0.1605    -0.004658    0.003671    -1.547250     3.437145
Item 6        0.1093    -0.1216     0.026133    0.003655    -1.176047     2.458465
Item 7        0.0117    -0.0500    -0.020842    0.000300     0.221488    -0.173168
Item 8       -0.0227    -0.0852    -0.058567    0.000412     0.554031    -0.820532
Item 9        0.0102    -0.0535    -0.021258    0.000407    -0.017970    -1.203731
Item 10      -0.0238    -0.0876    -0.062267    0.000447     0.601206    -0.686866
[0061] Normalization and PCA-based dimension reduction are performed on eigenvectors of samples of verification unit 1, such that original 60 dimensions are reduced to 40 dimensions. A data feature obtained after dimension reduction is shown in Table 3.
[0062] Table 3 Feature data obtained after PCA-based dimension reduction
Meter position    Dim. 1       Dim. 2       Dim. 3       ...    Dim. 38      Dim. 39      Dim. 40
1                 -2.234622     1.585350     2.526461    ...    -0.001673     0.035707    -0.030962
2                 -1.229589     2.076074    -2.800221    ...    -0.046646    -0.046169     0.070169
3                 -1.296936    -0.182465     1.629265    ...     0.099782    -0.086831     0.027047
...
58                -3.462262    -0.090737     1.551019    ...     0.139304    -0.034429    -0.087262
59                -1.508056    -1.500644    -0.810303    ...     0.055930     0.147360    -0.142620
60                -1.336344     3.994048    -0.853882    ...     0.040263    -0.094883     0.169225
[00631 Step 2: Obtain an "abnormal meter position" through screening based on the unsupervised abnormality detection algorithm, manually check the "abnormal meter position", and obtain a labeled sample while removing a fault.
[0064] Standard meter errors may differ between verification units, and an electrical circuit may be faulty. Therefore, when labeled samples are obtained, the meter position samples of a single verification unit are used as the to-be-detected dataset, and the LOF algorithm calculates an outlier factor value (representing the abnormality degree of a sample) for each meter position in the verification unit from the feature data of that meter position. A box plot method is then used to screen the outlier factor values of the 60 meter position samples of the verification unit for the samples most likely to be abnormal data points, and these "abnormal meter positions" are manually checked. Applying the unsupervised abnormality detection algorithm to the 30 verification units of this batch (JYL20002) yields outlier factor values for 1800 verification meter positions. The outlier factor values of the 60 verification meter positions of verification unit 1 are shown in Table 4.
Table 4 Result of the unsupervised abnormality detection algorithm
Meter position number | Outlier factor value | Meter position number | Outlier factor value | Meter position number | Outlier factor value | Meter position number | Outlier factor value |
---|---|---|---|---|---|---|---|
1 | 1.130353 | 16 | 1.031647 | 31 | 1.058620 | 46 | 1.046426 |
2 | 1.014464 | 17 | 1.010069 | 32 | 2.000560 | 47 | 1.320810 |
3 | 1.004347 | 18 | 0.998419 | 33 | 1.197083 | 48 | 1.098991 |
4 | 1.054612 | 19 | 1.044121 | 34 | 2.334983 | 49 | 1.207244 |
5 | 1.000622 | 20 | 1.015149 | 35 | 1.857913 | 50 | 1.167582 |
6 | 0.995484 | 21 | 1.000194 | 36 | 1.007053 | 51 | 2.695171 |
7 | 1.023143 | 22 | 1.140063 | 37 | 1.077457 | 52 | 3.494880 |
8 | 1.012834 | 23 | 1.268981 | 38 | 1.010256 | 53 | 2.956908 |
9 | 1.008601 | 24 | 1.022946 | 39 | 1.033592 | 54 | 1.007359 |
10 | 1.026758 | 25 | 1.106047 | 40 | 1.008499 | 55 | 1.027882 |
11 | 1.883877 | 26 | 1.065297 | 41 | 1.010359 | 56 | 1.188867 |
12 | 1.054852 | 27 | 1.217648 | 42 | 0.996309 | 57 | 1.050760 |
13 | 1.121655 | 28 | 1.267209 | 43 | 1.160934 | 58 | 0.998736 |
14 | 1.010204 | 29 | 1.003913 | 44 | 1.026550 | 59 | 1.015013 |
15 | 0.997774 | 30 | 1.097910 | 45 | 0.996525 | 60 | 0.999981 |
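For readers unfamiliar with the LOF algorithm, the sketch below shows how an outlier factor of this kind can be computed. It is an illustrative implementation of the standard LOF definition, not the code used in the disclosure: the function name `local_outlier_factor`, the neighbourhood size `k`, and the toy data are assumptions, and ties in neighbour distance are broken arbitrarily.

```python
import numpy as np

def local_outlier_factor(X, k=5):
    """Return one LOF value per row of X; values well above 1 indicate outliers."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))           # pairwise Euclidean distances
    np.fill_diagonal(dist, np.inf)                    # a point is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]      # indices of the k nearest neighbours
    rows = np.arange(n)
    k_dist = dist[rows, neighbours[:, -1]]            # distance to the k-th neighbour
    # Reachability distance: reach(p, o) = max(k_dist(o), d(p, o)).
    reach = np.maximum(k_dist[neighbours], dist[rows[:, None], neighbours])
    lrd = 1.0 / reach.mean(axis=1)                    # local reachability density
    return lrd[neighbours].mean(axis=1) / lrd         # ratio of neighbour density to own density

# Toy data: a 5 x 5 grid of "normal" points plus one distant point at index 25.
grid = [(i, j) for i in range(5) for j in range(5)]
points = np.array(grid + [(20, 20)], dtype=float)
scores = local_outlier_factor(points, k=5)
print(int(scores.argmax()))  # 25
```

Points inside a dense cluster obtain values close to 1, which matches the pattern in Table 4, where most meter positions lie near 1 and a few stand out with values around 2 or above.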
[0065] The box plot method is applied to the above outlier factor values, with the upper-limit threshold of the box plot, 1.39758, used as the determining value. Meter positions 11, 32, 34, 35, 51, 52, and 53 in verification unit 1 are determined as abnormal. Manual inspection finds that meter positions 11, 51, and 53 are faulty, whereas meter positions 32, 34, 35, and 52 have no fault. When the same unsupervised abnormality detection algorithm is applied to the data of the whole assembly line, 322 meter positions are determined as abnormal, and manual inspection finds that 230 of them (about 71%) have no fault. Unsupervised abnormality detection alone therefore has a high false-alarm rate in abnormality detection for the smart watt-hour meter verification system.
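The box plot screening can be reproduced approximately on the Table 4 values with the sketch below. The upper fence is taken as Q3 + 1.5 × IQR, which is the usual box plot convention and an assumption here; the quartile interpolation method is also an assumption, so the computed fence is close to, but not necessarily identical to, the 1.39758 reported in the text. The helper name `upper_fence` is hypothetical.

```python
from statistics import quantiles

# Outlier factor values of meter positions 1..60 of verification unit 1 (Table 4).
lof = [
    1.130353, 1.014464, 1.004347, 1.054612, 1.000622, 0.995484, 1.023143, 1.012834, 1.008601, 1.026758,
    1.883877, 1.054852, 1.121655, 1.010204, 0.997774, 1.031647, 1.010069, 0.998419, 1.044121, 1.015149,
    1.000194, 1.140063, 1.268981, 1.022946, 1.106047, 1.065297, 1.217648, 1.267209, 1.003913, 1.097910,
    1.058620, 2.000560, 1.197083, 2.334983, 1.857913, 1.007053, 1.077457, 1.010256, 1.033592, 1.008499,
    1.010359, 0.996309, 1.160934, 1.026550, 0.996525, 1.046426, 1.320810, 1.098991, 1.207244, 1.167582,
    2.695171, 3.494880, 2.956908, 1.007359, 1.027882, 1.188867, 1.050760, 0.998736, 1.015013, 0.999981,
]

def upper_fence(values, whisker=1.5):
    """Upper box plot fence Q3 + whisker * IQR (quartile method is an assumption)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 + whisker * (q3 - q1)

threshold = upper_fence(lof)
flagged = [pos for pos, value in enumerate(lof, start=1) if value > threshold]
print(flagged)  # [11, 32, 34, 35, 51, 52, 53]
```

The flagged set does not depend on the exact quartile convention, because no outlier factor value in Table 4 lies between 1.320810 (meter position 47) and 1.857913 (meter position 35).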
[0066] Step 3: Predict a result by using the TSVM model.
[0067] The TSVM first trains an initial SVM on the small labeled sample set obtained through unsupervised abnormality screening and manual inspection, and then uses this learner to assign labels to the unlabeled samples, so that all samples are labeled. The SVM is then retrained on all of these labeled samples, and samples whose assigned labels are likely to be wrong are identified and adjusted iteratively.
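The training loop outlined above can be sketched as follows. This is a simplified version of the commonly described TSVM label-switching procedure, offered as an assumption about the details the disclosure leaves open: the linear kernel, the costs `C_l` and `C_u`, the doubling schedule, the swap rule (two oppositely labeled unlabeled samples whose slacks sum to more than 2), and the function name `train_tsvm` are all illustrative choices. The sketch relies on scikit-learn's `SVC` and its `sample_weight` argument, and the data is synthetic.

```python
import numpy as np
from sklearn.svm import SVC

def train_tsvm(X_l, y_l, X_u, C_l=1.0, C_u=1.0, max_swaps=1000):
    """Simplified TSVM for labels in {-1, +1}; y_l must contain both classes."""
    X_l, X_u = np.asarray(X_l, float), np.asarray(X_u, float)
    y_l = np.asarray(y_l, int)
    svm = SVC(kernel="linear", C=C_l).fit(X_l, y_l)   # initial SVM from labeled samples only
    y_u = svm.predict(X_u)                            # pseudo-labels for unlabeled samples
    X_all = np.vstack([X_l, X_u])
    svm = SVC(kernel="linear", C=1.0)                 # per-sample costs are passed as weights
    c_u = min(1e-3, C_u)                              # start with a small unlabeled cost
    swaps = 0
    while True:
        weights = np.r_[np.full(len(X_l), C_l), np.full(len(X_u), c_u)]
        svm.fit(X_all, np.r_[y_l, y_u], sample_weight=weights)
        slack = np.maximum(0.0, 1.0 - y_u * svm.decision_function(X_u))
        pos = np.where((y_u == 1) & (slack > 0))[0]
        neg = np.where((y_u == -1) & (slack > 0))[0]
        if len(pos) and len(neg) and swaps < max_swaps:
            i = pos[np.argmax(slack[pos])]
            j = neg[np.argmax(slack[neg])]
            if slack[i] + slack[j] > 2.0:             # both pseudo-labels are likely wrong
                y_u[i], y_u[j] = -1, 1                # swap them and retrain
                swaps += 1
                continue
        if c_u >= C_u:
            return svm, y_u
        c_u = min(2.0 * c_u, C_u)                     # gradually trust pseudo-labels more

# Synthetic example: two well-separated clusters, two labeled samples per class.
rng = np.random.default_rng(0)
class_pos = rng.normal(loc=[2.0, 2.0], scale=0.5, size=(30, 2))
class_neg = rng.normal(loc=[-2.0, -2.0], scale=0.5, size=(30, 2))
X_l = np.vstack([class_pos[:2], class_neg[:2]])
y_l = np.array([1, 1, -1, -1])
X_u = np.vstack([class_pos[2:], class_neg[2:]])
true_u = np.r_[np.ones(28, int), -np.ones(28, int)]
model, pseudo = train_tsvm(X_l, y_l, X_u)
accuracy = float(np.mean(model.predict(X_u) == true_u))
```

Raising the unlabeled cost gradually lets the labeled samples dominate early on, so that wrong pseudo-labels can still be swapped before they shape the decision boundary.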
[0068] To evaluate the performance of the model, the present disclosure adopts the machine-learning practice of randomly dividing samples into a training set and a test set. However, instead of randomly dividing the samples directly, the present disclosure randomly divides the error test data of the verification meter positions in the assembly line into a "training set" and a "test set" to simulate the verification datasets obtained by the assembly line in two different working processes, and then obtains training samples and test samples through feature extraction, normalization, and dimension reduction.
[0069] The training samples include labeled and unlabeled samples. Taking verification unit 1 as an example, the sample data of meter positions 11, 32, 34, 35, 51, 52, and 53, which have been manually checked, is used as the labeled sample set, where -1 and +1 represent the faulty and normal states of a verification meter position respectively (consistent with the manual inspection result that meter positions 11, 51, and 53 are faulty):
$D_l = \{(X_{11}, -1), (X_{32}, +1), (X_{34}, +1), (X_{35}, +1), (X_{51}, -1), (X_{52}, +1), (X_{53}, -1)\}$
[0070] The sample data of the other meter positions, which have not been manually inspected, is used as the unlabeled sample set:
$D_u = \{X_1, X_2, \ldots, X_{10}, X_{12}, \ldots, X_{31}, X_{33}, X_{36}, \ldots, X_{50}, X_{54}, \ldots, X_{60}\}$
[0071] The TSVM model is obtained through semi-supervised training on the labeled and unlabeled samples, and is then used to predict the "test set". Table 5 compares the prediction results of the model with the results of the unsupervised abnormality detection algorithm.
Table 5 Comparison between abnormality detection results obtained by the TSVM model and the LOF algorithm
Detection method | Meter position 11 | Meter position 32 | Meter position 34 | Meter position 35 | Meter position 51 | Meter position 52 | Meter position 53 |
---|---|---|---|---|---|---|---|
Detection based on the LOF algorithm | -1 | -1 | -1 | -1 | -1 | -1 | -1 |
Manual detection | -1 | +1 | +1 | +1 | -1 | +1 | -1 |
Detection based on the TSVM model | -1 | +1 | +1 | +1 | +1 | +1 | -1 |
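The comparison in Table 5 can be quantified with a few lines of arithmetic. The sketch below copies the labels exactly as listed in the table and treats manual detection as the reference; the helper names `agreement` and `mismatched_positions` are hypothetical.

```python
positions = [11, 32, 34, 35, 51, 52, 53]
manual = [-1, +1, +1, +1, -1, +1, -1]   # manual detection (reference)
lof    = [-1, -1, -1, -1, -1, -1, -1]   # detection based on the LOF algorithm
tsvm   = [-1, +1, +1, +1, +1, +1, -1]   # detection based on the TSVM model

def agreement(predicted, reference):
    """Fraction of meter positions where the prediction matches the reference."""
    return sum(p == r for p, r in zip(predicted, reference)) / len(reference)

def mismatched_positions(predicted, reference):
    """Meter positions where the prediction differs from the reference."""
    return [pos for pos, p, r in zip(positions, predicted, reference) if p != r]

lof_agreement = agreement(lof, manual)     # 3 of 7
tsvm_agreement = agreement(tsvm, manual)   # 6 of 7
print(mismatched_positions(lof, manual))   # [32, 34, 35, 52]
print(mismatched_positions(tsvm, manual))  # [51]
```

On these seven meter positions the LOF result matches manual detection in three cases and the TSVM result in six, with meter position 51 being the only disagreement for the TSVM model.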
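[0072] The prediction results show that the TSVM model constructed in the present disclosure has higher accuracy than the unsupervised anomaly detection model: for the seven meter positions in Table 5, the TSVM result agrees with manual detection at six positions, whereas the LOF result agrees at three.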
[0073] As shown in FIG. 3, after obtaining a prediction result of the abnormal meter position, the method in the present disclosure can finally be used to assist professionals in carrying out a fixed-point check on the verification meter positions to find out a real abnormal verification meter position, so as to reduce an operation and maintenance cost of the automatic verification system, ensure verification accuracy of an automatic verification pipeline, and accurately locate an abnormal point and eliminate a defect.
[0074] The present disclosure proposes a method for constructing the abnormality detection model based on the TSVM model. For impure verification meter position samples, a most suspicious meter position sample is obtained through screening in an unsupervised manner, and then is labeled manually. Some labeled sample data is obtained while a meter position fault is removed. Then, the TSVM model is constructed by using the labeled sample and the unlabeled sample. A test result shows that the abnormality detection model constructed in the present disclosure can realize on-line detection of the meter position abnormality in the assembly line, reduce a workload caused by shutdown overhaul, and improve work efficiency of the assembly line. Compared with the unsupervised anomaly detection method, the TSVM model based on semi-supervised learning has higher accuracy, and the model can select a favorable labeled sample through active learning, to perform model training to improve the performance of the model. This provides an idea for the automatic verification system of the smart watt-hour meter to continuously optimize and improve the performance of TSVM model in the future.
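The model optimization mentioned above, which claim 10 describes as selecting data points close to the classification boundary and checking the ratio of mismatched predictions against a threshold, can be sketched as follows. The sketch is illustrative only: measuring closeness to the boundary by the absolute decision value is an assumption, and the function names, decision values, labels, and the threshold of 0.2 are hypothetical.

```python
def select_boundary_samples(decision_values, n_select):
    """Indices of the n_select labeled samples closest to the classification boundary."""
    order = sorted(range(len(decision_values)), key=lambda i: abs(decision_values[i]))
    return order[:n_select]

def mismatch_ratio(predicted, actual):
    """Fraction of labeled samples whose predicted state differs from the real state."""
    return sum(p != a for p, a in zip(predicted, actual)) / len(actual)

def meets_accuracy_condition(predicted, actual, threshold):
    """True when the mismatch ratio is below the specified threshold."""
    return mismatch_ratio(predicted, actual) < threshold

# Hypothetical decision values of an SVM for six samples in the labeled sample library.
decision_values = [2.3, -0.1, 0.4, -1.8, 0.05, -0.9]
actual = [1, -1, 1, -1, -1, -1]
predicted = [1 if d > 0 else -1 for d in decision_values]

chosen = select_boundary_samples(decision_values, 3)
ratio = mismatch_ratio(predicted, actual)
print(chosen)                                             # [4, 1, 2]
print(meets_accuracy_condition(predicted, actual, 0.2))   # True
```

Samples near the boundary are the ones the current model is least certain about, which is why retraining on them is expected to improve the model more than retraining on samples far from the boundary.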
[0075] The above implementations are merely described as examples, and are not intended to limit the scope of the present disclosure. These implementations can also be implemented in various other ways, and various omissions, substitutions, and changes can be made without departing from the technical thought of the present disclosure.
Claims (10)
- CLAIMS: 1. A method for detecting an abnormality of an automatic verification system of a smart watt-hour meter based on a transductive support vector machine (TSVM) model, wherein the method comprises the following steps: S1: performing feature extraction and eigenvector construction on error test data that is of a to-be-detected verification meter position and contains a small amount of abnormal data, and performing preprocessing to form a data sample; S2: labeling some samples manually; S3: using labeled samples and unlabeled samples to train in a semi-supervised way to obtain an abnormality detection model based on TSVM; and S4: dynamically predicting an abnormal state of the verification meter position by using the TSVM-based abnormality detection model.
- 2. The method for detecting an abnormality of an automatic verification system of a smart watt-hour meter based on a TSVM model according to claim 1, wherein the eigenvector construction in step S1 comprises: obtaining historical error test data of each verification meter position under different verification test items, extracting an eigenvalue of historical error test data under each verification test item, and combining eigenvalues under all verification test items into an eigenvector of the corresponding verification meter position.
- 3. The method for detecting an abnormality of an automatic verification system of a smart watt-hour meter based on a TSVM model according to claim 2, wherein the eigenvalue comprises a maximum value, a minimum value, an expectation, a variance, a skewness, and a kurtosis of the historical error test data.
- 4. The method for detecting an abnormality of an automatic verification system of a smart watt-hour meter based on a TSVM model according to claim 1, wherein the preprocessing in step S1 comprises normalization and dimension reduction of an eigenvector of each meter position.
- 5. The method for detecting an abnormality of an automatic verification system of a smart watt-hour meter based on a TSVM model according to claim 4, wherein the normalization is performed in the following manner: $z = \frac{x - u}{S}$, wherein x represents an eigenvalue of a to-be-processed eigenvector, u represents an expectation of the eigenvalue of the to-be-processed eigenvector, S represents a standard deviation of the eigenvalue of the to-be-processed eigenvector, and z represents a normalized eigenvalue.
- 6. The method for detecting an abnormality of an automatic verification system of a smart watt-hour meter based on a TSVM model according to claim 4, wherein the dimension reduction comprises principal component analysis (PCA).
- 7. The method for detecting an abnormality of an automatic verification system of a smart watt-hour meter based on a TSVM model according to claim 1, wherein step S2 specifically comprises: obtaining an "abnormal meter position" through preliminary screening based on the data sample and an unsupervised abnormality detection algorithm; and manually checking and labeling the obtained "abnormal meter position", determining the "abnormal meter position" as a normal meter position or an abnormal meter position based on a manual check result, and labeling a data sample corresponding to a manually checked verification meter position to form a labeled sample.
- 8. The method for detecting an abnormality of an automatic verification system of a smart watt-hour meter based on a TSVM model according to claim 7, wherein the unsupervised abnormality detection algorithm comprises an isolation forest (Iforest) algorithm, a local outlier factor (LOF) algorithm, and a one-class support vector machine (OCSVM) algorithm.
- 9. The method for detecting an abnormality of an automatic verification system of a smart watt-hour meter based on a TSVM model according to claim 1, wherein a quantity of labeled samples during model training in step S3 is less than a quantity of unlabeled samples.
- 10. The method for detecting an abnormality of an automatic verification system of a smart watt-hour meter based on a TSVM model according to claim 1, wherein the method further comprises: optimizing the TSVM-based abnormality detection model, specifically: using the model to predict abnormal data in a to-be-detected sample, manually checking and labeling the abnormal data, constructing a labeled sample library by using all obtained manually-labeled samples, selecting data points close to a classification boundary from the labeled sample library to form new labeled samples, and performing semi-supervised model training based on the new labeled samples and the unlabeled samples to complete the optimization; and predicting the data points in the labeled sample library by using the optimized model, and calculating the ratio of labeled samples whose predicted state differs from their real state, wherein when the ratio is less than a specified threshold, it is determined that performance of the model meets a prediction accuracy condition, and the model can directly predict a to-be-detected dataset.
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110732174.7A CN113484817A (en) | 2021-06-30 | 2021-06-30 | Intelligent electric energy meter automatic verification system abnormity detection method based on TSVM model |
CN202110732174.7 | 2021-06-30 | ||
PCT/CN2021/141547 WO2023273249A1 (en) | 2021-06-30 | 2021-12-27 | Tsvm-model-based abnormality detection method for automatic verification system of smart electricity meter |
Publications (1)
Publication Number | Publication Date |
---|---|
AU2021335237A1 true AU2021335237A1 (en) | 2023-02-02 |
Family
ID=77936778
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
AU2021335237A Abandoned AU2021335237A1 (en) | 2021-06-30 | 2021-12-27 | Method for detecting abnormality of automatic verification system of smart watt-hour meter based on transductive support vector machine (TSVM) model |
Country Status (3)
Country | Link |
---|---|
CN (1) | CN113484817A (en) |
AU (1) | AU2021335237A1 (en) |
WO (1) | WO2023273249A1 (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113484817A (en) * | 2021-06-30 | 2021-10-08 | 国网上海市电力公司 | Intelligent electric energy meter automatic verification system abnormity detection method based on TSVM model |
CN116702078B (en) * | 2023-06-02 | 2024-03-26 | 中国电信股份有限公司浙江分公司 | State detection method based on modular expandable cabinet power distribution unit |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107590262A (en) * | 2017-09-21 | 2018-01-16 | 黄国华 | The semi-supervised learning method of big data analysis |
CN107976992B (en) * | 2017-11-29 | 2020-01-21 | 东北大学 | Industrial process big data fault monitoring method based on graph semi-supervised support vector machine |
WO2019152050A1 (en) * | 2018-02-02 | 2019-08-08 | Visa International Service Association | Efficient method for semi-supervised machine learning |
CN108985632A (en) * | 2018-07-16 | 2018-12-11 | 国网上海市电力公司 | A kind of electricity consumption data abnormality detection model based on isolated forest algorithm |
CN109828230B (en) * | 2019-04-02 | 2021-03-09 | 国网新疆电力有限公司营销服务中心(资金集约中心、计量中心) | Positioning method for automatically detecting meter position fault of assembly line of electric energy meter |
CN114039794A (en) * | 2019-12-11 | 2022-02-11 | 支付宝(杭州)信息技术有限公司 | Abnormal flow detection model training method and device based on semi-supervised learning |
CN111259937B (en) * | 2020-01-09 | 2022-04-05 | 中国人民解放军国防科技大学 | Semi-supervised communication radiation source individual identification method based on improved TSVM |
CN111398886B (en) * | 2020-04-09 | 2022-12-16 | 国网山东省电力公司营销服务中心(计量中心) | Detection method and system for automatically detecting online abnormity of epitope of assembly line |
CN111740991B (en) * | 2020-06-19 | 2022-08-09 | 上海仪电(集团)有限公司中央研究院 | Anomaly detection method and system |
CN112115467A (en) * | 2020-09-04 | 2020-12-22 | 长沙理工大学 | Intrusion detection method based on semi-supervised classification of ensemble learning |
CN113484817A (en) * | 2021-06-30 | 2021-10-08 | 国网上海市电力公司 | Intelligent electric energy meter automatic verification system abnormity detection method based on TSVM model |
-
2021
- 2021-06-30 CN CN202110732174.7A patent/CN113484817A/en active Pending
- 2021-12-27 WO PCT/CN2021/141547 patent/WO2023273249A1/en unknown
- 2021-12-27 AU AU2021335237A patent/AU2021335237A1/en not_active Abandoned
Also Published As
Publication number | Publication date |
---|---|
CN113484817A (en) | 2021-10-08 |
WO2023273249A1 (en) | 2023-01-05 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
MK5 | Application lapsed section 142(2)(e) - patent request and compl. specification not accepted |