CN116776209A - Method, system, equipment and medium for identifying operation state of gateway metering device - Google Patents

Publication number
CN116776209A
Authority
CN
China
Prior art keywords
samples
electricity utilization
sample
class
characteristic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311089567.6A
Other languages
Chinese (zh)
Inventor
赖国书
黄春竹
黄天富
吴志武
涂彦昭
张颖
王春光
姚文翰
林彤尧
黄汉斌
张增荣
伍翔
童承鑫
林雨欣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Fujian Electric Power Co Ltd
Marketing Service Center of State Grid Fujian Electric Power Co Ltd
Original Assignee
State Grid Fujian Electric Power Co Ltd
Marketing Service Center of State Grid Fujian Electric Power Co Ltd

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a method, a system, equipment and a medium for identifying the running state of a gateway metering device. The method comprises the following steps: collecting historical electricity utilization data of a target gateway metering device, acquiring electricity utilization characteristic indexes, and adding fault type labels to the electricity utilization characteristic indexes to form real samples; performing sample proliferation based on the fault type distribution of the real samples to form a training sample set; performing feature screening on the samples in the training sample set; constructing a first neural network, adding to each sample in the training sample set a second label indicating whether a fault has occurred, and training with the electricity utilization characteristic indexes as input and the second label as output to obtain a fault diagnosis model; constructing a second neural network and training it with the electricity utilization characteristic indexes as input and the fault type label as output to obtain a fault type identification model; and acquiring current electricity utilization data, judging with the fault diagnosis model whether a fault has occurred, and identifying the fault type with the fault type identification model.

Description

Method, system, equipment and medium for identifying operation state of gateway metering device
Technical Field
The invention relates to a method, a system, equipment and a medium for identifying the running state of a gateway metering device, and belongs to the technical field of power equipment state monitoring.
Background
The gateway metering device is an important device for trade settlement between the buyer and the seller of electricity. Because it monitors a large number of indexes and uploads a huge volume of data, the reliability and accuracy of the uploaded data directly affect the fairness of transactions. At present, most domestic research on the inaccuracy of gateway metering devices focuses on operation state identification. Analysis of how gateway inaccuracy arises divides the fault causes of the gateway metering device into four categories. The first is failure of the metering device itself, including a frozen electric energy meter (metering stops and the display no longer updates), abnormal electric energy, abnormal automatic meter checking, etc. The second is a secondary loop fault, including current loss in the current secondary loop, loose cable connections in the voltage loop, etc. The third is damage to the communication module of the metering device, so that the background system cannot read the energy data of the electric energy meter. The fourth is human error, including wiring errors, equipment installation errors, etc. In addition to these common faults, abnormal data often appear during daily monitoring of the data uploaded by the gateway metering device; an abnormal point does not necessarily mean that the metering device is in a fault state, since it may be caused by abnormal electricity consumption behavior of a user. This analysis shows that the faults of gateway metering devices are varied and somewhat concealed. Conventional detection means often consume large amounts of manpower and material resources and cannot give timely warning, so that substantial power grid assets are lost in the process.
At present, the operation state identification method of the gateway metering device mainly comprises the following steps:
(1) Indexes related to the running state of the gateway metering device are selected and weighted, and the weighted result is used to identify the state of the metering device. This method depends too heavily on the selected state indexes, the index weights are strongly affected by manual intervention, and its reliability is difficult to verify effectively.
(2) The monitoring data returned by the electricity consumption information acquisition system contain a large amount of information related to the running state of the metering device. Characteristic indexes reflecting the running state are mined from this information, a machine learning or deep learning model learns the characteristic parameters adaptively, and the resulting classification model identifies the running state of the gateway metering device. This data mining method requires a large amount of historical data to train the network, and its identification accuracy is difficult to guarantee when the data set is incomplete or the samples are unbalanced.
Although the existing method for identifying the running state of the gateway metering device can judge the running state of a part of the gateway metering device, certain limitations still exist in practical engineering application, such as:
(1) The abundant electricity consumption data uploaded by the information system can determine the running state of the gateway metering device, but not all of the data are useful for identifying it; redundant information only increases the computation required for data mining and makes model training harder.
(2) Transformer loss and bus loss are important characteristic quantities for judging the state of the gateway metering device. Under light load, limited by the precision of the metering device, transformer loss and bus loss fluctuate severely, so the model may misjudge the state of the metering device and the state identification accuracy decreases.
(3) Common machine learning or deep learning methods require a certain number of samples to train the model in order to ensure model stability and algorithm accuracy. However, owing to improved technology and better protection measures, faults at gateway metering points are infrequent, and the available fault samples are far fewer than the normal-operation samples. The resulting imbalance of data samples among categories causes the trained model to overfit and reduces the identification accuracy.
(4) A large number of interference sources exist in nature. The signals they generate enter electrical equipment through electromagnetic coupling or through paths such as power wiring, forming various kinds of noise. In traditional methods the influence of this noise on data acquisition cannot be ignored, and it is likely to reduce the accuracy of operation state identification for the metering device.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a method, a system, equipment and a medium for identifying the running state of a gateway metering device.
The technical scheme of the invention is as follows:
in one aspect, the invention provides a method for identifying the running state of a gateway metering device, which comprises the following steps:
collecting historical electricity utilization data of a target gateway metering device, acquiring a plurality of electricity utilization characteristic indexes of a plurality of historical time periods based on the historical electricity utilization data, preprocessing the historical electricity utilization characteristic indexes, and adding a fault type label to the electricity utilization characteristic indexes of each historical time period to form a real sample;
based on the fault type distribution of the real samples, performing sample proliferation on the real samples of the minority fault types to form a training sample set;
screening the characteristics of the samples in the training sample set, screening out high-correlation electricity utilization characteristic indexes, and removing the rest electricity utilization characteristic indexes from the samples;
constructing a first neural network, re-adding a second label to the samples in the training sample set according to the fault type label, wherein the second label is used for indicating whether the samples have faults or not, taking the electricity utilization characteristic index in the samples as input, and taking the second label as output to train the first neural network to obtain a trained fault diagnosis model;
constructing a second neural network, and training the second neural network with the electricity utilization characteristic indexes in the samples as input and the fault type label as output to obtain a trained fault type identification model;
and acquiring current electricity utilization data of the target gateway metering device, obtaining the corresponding electricity utilization characteristic indexes, inputting them into the fault diagnosis model to judge whether a fault has occurred, and, when a fault has occurred, inputting the electricity utilization characteristic indexes into the fault type identification model to identify the fault type.
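The two-stage identification described in the final step can be sketched as a simple cascade. This is a minimal illustration only: `identify_state`, `toy_diagnose` and `toy_classify` are hypothetical names, and the two toy functions merely stand in for the trained fault diagnosis model and fault type identification model, whose internals the text does not specify at this point.

```python
from typing import Callable, Optional, Sequence


def identify_state(
    features: Sequence[float],
    diagnose: Callable[[Sequence[float]], bool],
    classify: Callable[[Sequence[float]], str],
) -> Optional[str]:
    """Return None when no fault is detected, otherwise the fault type."""
    if not diagnose(features):
        return None  # stage 1: the diagnosis model reports normal operation
    return classify(features)  # stage 2: only reached for faulty samples


# Hypothetical stand-ins for the two trained models (assumptions, not from the source).
def toy_diagnose(features: Sequence[float]) -> bool:
    return any(value != 0 for value in features)


def toy_classify(features: Sequence[float]) -> str:
    return "metering device fault" if features[0] != 0 else "secondary loop fault"


print(identify_state([0, 0, 0], toy_diagnose, toy_classify))
print(identify_state([1, 0, 0], toy_diagnose, toy_classify))
```

Keeping the two models behind plain callables means the fault type model is never consulted for samples that the diagnosis model considers normal.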
As a preferred embodiment, the method for performing sample proliferation on the minority real samples specifically comprises the following steps:
generating samples with a variational autoencoder;
inputting the real samples into the encoder network of the variational autoencoder, which compresses them into low-dimensional vectors;
and then inputting the low-dimensional vectors into the decoder network of the variational autoencoder to reconstruct samples, thereby obtaining the generated samples.
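Before any samples are generated, the fault type distribution has to be turned into a count of samples to generate per class. The sketch below shows one way to do that, assuming every class is topped up to the size of the largest class; the text does not state the target size, so that choice, the function name `generation_plan` and the example labels are all assumptions. The generator itself (the variational autoencoder) is not implemented here.

```python
from collections import Counter


def generation_plan(labels):
    """Map each fault type label to the number of extra samples needed
    to reach the size of the largest class (an assumed balancing target)."""
    counts = Counter(labels)
    target = max(counts.values())
    return {label: target - count for label, count in counts.items()}


# Hypothetical label distribution: many normal samples, few fault samples.
labels = ["normal"] * 6 + ["secondary loop fault"] * 2 + ["communication fault"] * 1
plan = generation_plan(labels)
print(plan)
```

Classes with a plan value of zero need no generated samples; the others would be passed to the generator with their respective counts.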
As a preferred embodiment, the method for performing feature screening on the samples in the training sample set and screening out the high-correlation electricity utilization characteristic indexes specifically comprises the following steps:
for the samples of any two fault types, denoting the feature set of the first class of samples as $X^{(1)}$ and the feature set of the second class of samples as $X^{(2)}$:
$$X^{(1)} = \{x_1^{(1)}, \ldots, x_N^{(1)}\}, \qquad X^{(2)} = \{x_1^{(2)}, \ldots, x_N^{(2)}\}$$
$$x_i^{(1)} = \left[x_{i,1}^{(1)}, \ldots, x_{i,N_1}^{(1)}\right], \qquad x_i^{(2)} = \left[x_{i,1}^{(2)}, \ldots, x_{i,N_2}^{(2)}\right]$$
where $x_i^{(1)}$ is the feature set of the $i$-th electricity utilization characteristic index in the first class of samples, $x_i^{(2)}$ is the feature set of the $i$-th electricity utilization characteristic index in the second class of samples, $x_{i,N_1}^{(1)}$ is the $i$-th electricity utilization characteristic index of the $N_1$-th sample in the first class, $x_{i,N_2}^{(2)}$ is the $i$-th electricity utilization characteristic index of the $N_2$-th sample in the second class, $N_1$ is the number of samples of the first class, and $N_2$ is the number of samples of the second class;
applying a nonlinear transformation to the samples with a Gaussian kernel function to map them into a high-dimensional space, the mapping equation being:
$$k(x, x_c) = \exp\left(-\frac{\lVert x - x_c \rVert^2}{2\sigma^2}\right)$$
where $k(x, x_c)$ is the kernel function of $x$ and $x_c$, $x_c$ is the center point of the kernel function, $x$ is any point in space, and $\sigma$ is the bandwidth, which controls the range of action of the Gaussian kernel function;
obtaining the feature sets $\Phi^{(1)}$ and $\Phi^{(2)}$ of the two classes of samples mapped to the high-dimensional space:
$$\Phi^{(1)} = \{\phi_1^{(1)}, \ldots, \phi_N^{(1)}\}, \qquad \Phi^{(2)} = \{\phi_1^{(2)}, \ldots, \phi_N^{(2)}\}$$
$$\phi_i^{(1)} = \left[\phi_{i,1}^{(1)}, \ldots, \phi_{i,N_1}^{(1)}\right], \qquad \phi_i^{(2)} = \left[\phi_{i,1}^{(2)}, \ldots, \phi_{i,N_2}^{(2)}\right]$$
where $\phi_i^{(1)}$ is the feature set of the $i$-th electricity utilization characteristic index after the first class of samples is mapped to the high-dimensional space, $\phi_i^{(2)}$ is the corresponding feature set for the second class of samples, $\phi_{i,N_1}^{(1)}$ is the $i$-th electricity utilization characteristic index of the $N_1$-th sample of the first class after mapping, and $\phi_{i,N_2}^{(2)}$ is the $i$-th electricity utilization characteristic index of the $N_2$-th sample of the second class after mapping;
then using Fisher's linear discriminant method to obtain the intra-class dispersion matrix of each class's sample feature set, and from these the total intra-class dispersion matrix and the total inter-class dispersion matrix of the samples;
constructing a feature score for the $i$-th electricity utilization characteristic index from the total intra-class dispersion matrix and the total inter-class dispersion matrix of the samples; the larger the score, the greater the inter-class dispersion between samples of different types, that is, the greater the discriminating power of the feature;
calculating the redundancy among the electricity utilization characteristic indexes by the maximal information coefficient (MIC) method, specifically:
given $i$ and $j$, partitioning the scatter diagram formed by any two variables $U$, $V$ into a grid of $i$ columns and $j$ rows and finding the maximum mutual information value; then normalizing the maximum mutual information value; and finally taking the maximum normalized mutual information over the different grid scales as the MIC value:
$$\mathrm{MIC}(U, V) = \max_{i \times j < B} \frac{I(U; V)}{\log_2 \min(i, j)}$$
where $U$, $V$ are any two variables, $B$ is a preset reference factor, and $I(U; V)$ is the mutual information of $U$ and $V$:
$$I(U; V) = \sum_{u} \sum_{v} p(u, v) \log_2 \frac{p(u, v)}{p(u)\,p(v)}$$
where $p(u, v)$ is the joint probability of $U$ and $V$, and $p(u)$, $p(v)$ are their marginal probabilities;
calculating a final score for each electricity utilization characteristic index from its feature score and the MIC values, where $N$ is the number of features; and selecting, based on the calculated final scores, the optimal feature group that best represents the running state of the gateway metering device.
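To make the idea of a dispersion-based feature score concrete, the sketch below computes the textbook two-class Fisher criterion for a single feature: squared distance between class means divided by the summed within-class scatter. It is an assumption-laden illustration rather than the scoring formula of this method: it works on raw feature values without the Gaussian kernel mapping, and the name `fisher_score` is hypothetical.

```python
from statistics import fmean


def fisher_score(class_a, class_b):
    """Textbook two-class Fisher criterion for one feature:
    between-class scatter divided by total within-class scatter."""
    mean_a, mean_b = fmean(class_a), fmean(class_b)
    between = (mean_a - mean_b) ** 2
    within = sum((x - mean_a) ** 2 for x in class_a) + sum(
        (x - mean_b) ** 2 for x in class_b
    )
    if within == 0:
        # Degenerate case: no spread inside either class.
        return float("inf") if between > 0 else 0.0
    return between / within


# A feature whose classes are well separated scores higher than one whose classes overlap.
separated = fisher_score([0.0, 2.0], [4.0, 6.0])
overlapping = fisher_score([0.0, 2.0], [1.0, 3.0])
print(separated, overlapping)
```

A larger value means the two fault types are easier to tell apart on that feature, which matches the interpretation of the feature score given above.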
As a preferred embodiment, the first neural network is a multi-scale convolutional neural network, optimized by adding batch normalization and Dropout after its pooling layer;
the second neural network is a deep residual shrinkage network.
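Deep residual shrinkage networks, as commonly described in the literature, insert a soft-thresholding step into residual blocks so that small, noise-like activations are set to zero. The text does not detail the network internals, so the sketch below only shows the soft-thresholding operation itself, with a fixed threshold `tau` chosen for illustration; in such networks the threshold is normally learned per channel, which is not modeled here.

```python
def soft_threshold(x: float, tau: float) -> float:
    """Shrink x toward zero by tau; values within [-tau, tau] become 0."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


# Hypothetical activations and threshold, for illustration only.
activations = [2.0, 0.3, -0.1, -2.0]
shrunk = [soft_threshold(a, 0.5) for a in activations]
print(shrunk)
```

Small activations vanish while large ones keep their sign and lose `tau` in magnitude, which is why this operation is used for noise suppression.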
In another aspect, the present invention further provides a system for identifying an operation state of a gateway metering device, including:
the data acquisition module is used for acquiring historical electricity utilization data of the target gateway metering device, acquiring a plurality of electricity utilization characteristic indexes of a plurality of historical time periods based on the historical electricity utilization data for preprocessing, and adding fault type labels to the electricity utilization characteristic indexes of each historical time period to form a real sample;
the sample balancing module is used for performing sample proliferation on the real samples of the minority fault types based on the fault type distribution of the real samples to form a training sample set;
the feature screening module is used for carrying out feature screening on the samples in the training sample set, screening out high-correlation power utilization feature indexes, and eliminating the rest power utilization feature indexes from the samples;
the first network training module is used for constructing a first neural network, re-adding a second label to the samples in the training sample set according to the fault type label, wherein the second label is used for indicating whether the samples have faults or not, and training the first neural network by taking the electricity utilization characteristic index in the samples as input and taking the second label as output to obtain a trained fault diagnosis model;
The second network training module is used for constructing a second neural network, taking the electricity utilization characteristic index in the sample as input, and taking the fault type label as output to train the second neural network so as to obtain a trained fault type identification model;
and the identification module is used for acquiring current electricity utilization data of the target gateway metering device, obtaining the corresponding electricity utilization characteristic indexes, inputting them into the fault diagnosis model to judge whether a fault has occurred, and, when a fault has occurred, inputting the electricity utilization characteristic indexes into the fault type identification model to identify the fault type.
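The first network training module derives its binary second label from the existing fault type label. A minimal sketch of that relabeling follows, assuming that normal operation is marked with the label string `"normal"`; that label name, the 0/1 encoding and the function name `to_second_label` are assumptions made for illustration.

```python
NORMAL = "normal"  # hypothetical name of the no-fault class


def to_second_label(fault_type_label: str) -> int:
    """Return 0 for normal operation and 1 for any fault type."""
    return 0 if fault_type_label == NORMAL else 1


# Hypothetical samples: (electricity utilization characteristic indexes, fault type label).
samples = [
    ([0.1, 0.0], "normal"),
    ([0.9, 1.0], "secondary loop fault"),
    ([0.7, 1.0], "communication fault"),
]
binary_samples = [(features, to_second_label(label)) for features, label in samples]
print(binary_samples)
```

The same feature vectors are reused for both networks; only the target changes between the binary label and the fault type label.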
As a preferred embodiment, the method by which the sample balancing module performs sample proliferation on the minority real samples specifically comprises:
generating samples with a variational autoencoder;
inputting the real samples into the encoder network of the variational autoencoder, which compresses them into low-dimensional vectors;
and then inputting the low-dimensional vectors into the decoder network of the variational autoencoder to reconstruct samples, thereby obtaining the generated samples.
As a preferred embodiment, the method by which the feature screening module performs feature screening on the samples in the training sample set and screens out the high-correlation electricity utilization characteristic indexes specifically comprises:
for the samples of any two fault types, denoting the feature set of the first class of samples as $X^{(1)}$ and the feature set of the second class of samples as $X^{(2)}$:
$$X^{(1)} = \{x_1^{(1)}, \ldots, x_N^{(1)}\}, \qquad X^{(2)} = \{x_1^{(2)}, \ldots, x_N^{(2)}\}$$
$$x_i^{(1)} = \left[x_{i,1}^{(1)}, \ldots, x_{i,N_1}^{(1)}\right], \qquad x_i^{(2)} = \left[x_{i,1}^{(2)}, \ldots, x_{i,N_2}^{(2)}\right]$$
where $x_i^{(1)}$ is the feature set of the $i$-th electricity utilization characteristic index in the first class of samples, $x_i^{(2)}$ is the feature set of the $i$-th electricity utilization characteristic index in the second class of samples, $x_{i,N_1}^{(1)}$ is the $i$-th electricity utilization characteristic index of the $N_1$-th sample in the first class, $x_{i,N_2}^{(2)}$ is the $i$-th electricity utilization characteristic index of the $N_2$-th sample in the second class, $N_1$ is the number of samples of the first class, and $N_2$ is the number of samples of the second class;
applying a nonlinear transformation to the samples with a Gaussian kernel function to map them into a high-dimensional space, the mapping equation being:
$$k(x, x_c) = \exp\left(-\frac{\lVert x - x_c \rVert^2}{2\sigma^2}\right)$$
where $k(x, x_c)$ is the kernel function of $x$ and $x_c$, $x_c$ is the center point of the kernel function, $x$ is any point in space, and $\sigma$ is the bandwidth, which controls the range of action of the Gaussian kernel function;
obtaining the feature sets $\Phi^{(1)}$ and $\Phi^{(2)}$ of the two classes of samples mapped to the high-dimensional space:
$$\Phi^{(1)} = \{\phi_1^{(1)}, \ldots, \phi_N^{(1)}\}, \qquad \Phi^{(2)} = \{\phi_1^{(2)}, \ldots, \phi_N^{(2)}\}$$
$$\phi_i^{(1)} = \left[\phi_{i,1}^{(1)}, \ldots, \phi_{i,N_1}^{(1)}\right], \qquad \phi_i^{(2)} = \left[\phi_{i,1}^{(2)}, \ldots, \phi_{i,N_2}^{(2)}\right]$$
where $\phi_i^{(1)}$ is the feature set of the $i$-th electricity utilization characteristic index after the first class of samples is mapped to the high-dimensional space, $\phi_i^{(2)}$ is the corresponding feature set for the second class of samples, $\phi_{i,N_1}^{(1)}$ is the $i$-th electricity utilization characteristic index of the $N_1$-th sample of the first class after mapping, and $\phi_{i,N_2}^{(2)}$ is the $i$-th electricity utilization characteristic index of the $N_2$-th sample of the second class after mapping;
then using Fisher's linear discriminant method to obtain the intra-class dispersion matrix of each class's sample feature set, and from these the total intra-class dispersion matrix and the total inter-class dispersion matrix of the samples;
constructing a feature score for the $i$-th electricity utilization characteristic index from the total intra-class dispersion matrix and the total inter-class dispersion matrix of the samples; the larger the score, the greater the inter-class dispersion between samples of different types, that is, the greater the discriminating power of the feature;
calculating the redundancy among the electricity utilization characteristic indexes by the maximal information coefficient (MIC) method, specifically:
given $i$ and $j$, partitioning the scatter diagram formed by any two variables $U$, $V$ into a grid of $i$ columns and $j$ rows and finding the maximum mutual information value; then normalizing the maximum mutual information value; and finally taking the maximum normalized mutual information over the different grid scales as the MIC value:
$$\mathrm{MIC}(U, V) = \max_{i \times j < B} \frac{I(U; V)}{\log_2 \min(i, j)}$$
where $U$, $V$ are any two variables, $B$ is a preset reference factor, and $I(U; V)$ is the mutual information of $U$ and $V$:
$$I(U; V) = \sum_{u} \sum_{v} p(u, v) \log_2 \frac{p(u, v)}{p(u)\,p(v)}$$
where $p(u, v)$ is the joint probability of $U$ and $V$, and $p(u)$, $p(v)$ are their marginal probabilities;
calculating a final score for each electricity utilization characteristic index from its feature score and the MIC values, where $N$ is the number of features; and selecting, based on the calculated final scores, the optimal feature group that best represents the running state of the gateway metering device.
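The grid step of the maximal information coefficient method can be illustrated by computing the mutual information of two variables on one fixed grid and normalizing it by the logarithm of the smaller grid dimension. The sketch below uses equal-width bins for a single grid size, which is a simplification: the full method searches over grid placements and over all grid sizes below the reference factor. The function names are hypothetical.

```python
from collections import Counter
from math import log2


def bin_index(value, low, high, bins):
    """Equal-width bin of value within [low, high]; the top edge joins the last bin."""
    if high == low:
        return 0
    return min(int((value - low) / (high - low) * bins), bins - 1)


def grid_mutual_information(u, v, columns, rows):
    """Mutual information (in bits) of u and v on a columns x rows equal-width grid."""
    n = len(u)
    u_low, u_high, v_low, v_high = min(u), max(u), min(v), max(v)
    cells = [
        (bin_index(a, u_low, u_high, columns), bin_index(b, v_low, v_high, rows))
        for a, b in zip(u, v)
    ]
    joint = Counter(cells)
    col_counts = Counter(c for c, _ in cells)
    row_counts = Counter(r for _, r in cells)
    # p(u,v) * log2(p(u,v) / (p(u) p(v))), with probabilities written as counts / n.
    return sum(
        (k / n) * log2(k * n / (col_counts[c] * row_counts[r]))
        for (c, r), k in joint.items()
    )


def normalized_grid_mi(u, v, columns, rows):
    """Grid mutual information divided by log2(min(columns, rows)); needs both >= 2."""
    return grid_mutual_information(u, v, columns, rows) / log2(min(columns, rows))


dependent = normalized_grid_mi([0, 1, 2, 3], [0, 1, 2, 3], 2, 2)
independent = normalized_grid_mi([0, 0, 1, 1], [0, 1, 0, 1], 2, 2)
print(dependent, independent)
```

Two indexes with a high normalized value carry largely the same information, so one of them is a candidate for removal as redundant.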
As a preferred embodiment, the first neural network is a multi-scale convolutional neural network, optimized by adding batch normalization and Dropout after its pooling layer;
the second neural network is a deep residual shrinkage network.
In still another aspect, the present invention further provides an electronic device, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the processor implements the method for identifying the running state of the gateway metering apparatus according to any embodiment of the present invention when the processor executes the program.
In still another aspect, the present invention further provides a computer readable storage medium having a computer program stored thereon, where the program when executed by a processor implements a method for identifying an operation state of a gateway metering apparatus according to any of the embodiments of the present invention.
The invention has the following beneficial effects:
1. The method for identifying the running state of the gateway metering device identifies the running state by combining sample proliferation, autonomous feature set screening and neural networks. It effectively addresses sample imbalance and quickly classifies the running state of the gateway metering device while reducing data redundancy and noise interference.
2. The method multiplies the fault samples of the gateway metering device with a variational autoencoder, which effectively addresses the overfitting of the training model and the drop in overall identification accuracy caused by unbalanced fault samples and small sample size, and ensures the stability and overall identification accuracy of the trained model.
3. The method automatically screens the running state features of the gateway metering device with Gaussian-kernel Fisher discriminant analysis and the maximal information coefficient method, selects the optimal feature group that best expresses the characteristics of the samples, and improves model training efficiency.
4. The method realizes multi-level running state identification of the gateway metering device through two neural network models: it first judges whether the metering device has failed and then judges the specific fault type, which effectively improves the accuracy of running state identification.
Drawings
FIG. 1 is a schematic flow chart of a method according to a first embodiment of the invention;
FIG. 2 is a diagram illustrating an example of a multi-scale convolutional neural network model employed in an embodiment of the present invention;
FIG. 3 is a diagram illustrating an exemplary structure of the deep residual shrinkage network model used in an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be understood that the step numbers used herein are for convenience of description only and are not limiting as to the order in which the steps are performed.
It is to be understood that the terminology used in the description of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
The terms "comprises" and "comprising" indicate the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The term "and/or" refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
Embodiment one:
Referring to FIG. 1, this embodiment provides a method for identifying the running state of a gateway metering device, which identifies the running state by combining sample proliferation, autonomous feature set screening and neural networks. The method effectively addresses sample imbalance and quickly classifies the running state of the gateway metering device while reducing data redundancy and noise interference. The method specifically comprises the following steps:
S100: Based on platforms such as the existing power industry electricity consumption information acquisition system, online monitoring devices and the electric energy metering system, the electricity utilization data of the gateway metering device are collected remotely to obtain the historical electricity utilization data of the target gateway metering device; data are collected once a day. Taking full account of the basic data of the gateway metering device, the modeling information of the plants and stations, the maintenance plan of the metering device and similar conditions, a plurality of electricity utilization characteristic indexes for a plurality of historical time periods are obtained from the historical electricity utilization data and preprocessed, an input matrix for identifying the operation state of the gateway metering device is established, and a fault type label is added to the electricity utilization characteristic indexes (namely the input matrix) of each historical time period to form a real sample.
In this embodiment, a real sample consists of an input matrix and a fault type label. The input matrix $X$ is a one-dimensional matrix containing all the electricity utilization characteristic indexes $X_n$, where $n$ denotes the $n$-th data index; there are 15 indexes in total.
Index $X_1$ is the difference between the daily forward active energy and the total energy of the daily sharp, peak, flat and valley rate periods. It characterizes the uneven-reading state of the electric energy meter and is calculated as:
$$X_1 = P_1 - (A_1 + A_2 + A_3 + A_4)$$
where $P_1$ is the total daily forward active energy, and $A_1$, $A_2$, $A_3$ and $A_4$ are the energies of the daily sharp, peak, flat and valley rate periods respectively.
Index $X_2$ is the ratio of the daily forward active energy to its reference value. It characterizes the flying state of the electric energy meter and is calculated as:
$$X_2 = \frac{P_1}{P_{1\mathrm{ref}}}$$
where $P_{1\mathrm{ref}}$ is the reference value of the daily forward active energy.
Index $X_3$ is the logical AND of the difference in reverse active energy between two consecutive days and the difference in line loss rate between two consecutive days. It characterizes the stopped state of the electric energy meter and is calculated as:
$$X_3 = (P_2 - P_{2q}) \,\&\, (\eta - \eta_q)$$
where $P_2$ and $P_{2q}$ are the reverse active energy of the current day and of the previous day, and $\eta$ and $\eta_q$ are the daily line loss rate of the current day and of the previous day. The operation rule of $\&$ is: if the results of both expressions on the right-hand side of the equation are greater than zero, the index on the left-hand side is set to 1; otherwise it is set to 0. In the formula above, if $P_2 - P_{2q}$ and $\eta - \eta_q$ are both greater than 0, index $X_3$ equals 1; if either result is less than zero, index $X_3$ equals 0.
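The & operation rule described for index X3 differs from an ordinary bitwise AND: it tests whether both differences are strictly greater than zero. A minimal sketch of that rule follows; the function and parameter names are hypothetical, and treating a difference of exactly zero as "not greater than zero" follows the stated rule that the index is 1 only when both results exceed zero.

```python
def and_positive(left: float, right: float) -> int:
    """Return 1 only when both operands are strictly greater than zero, else 0."""
    return 1 if left > 0 and right > 0 else 0


def index_x3(p2: float, p2_prev: float, loss_rate: float, loss_rate_prev: float) -> int:
    """X3: & of the day-over-day reverse active energy difference
    and the day-over-day line loss rate difference."""
    return and_positive(p2 - p2_prev, loss_rate - loss_rate_prev)


print(index_x3(5.0, 3.0, 0.04, 0.02))  # both differences positive
print(index_x3(5.0, 3.0, 0.02, 0.04))  # loss rate difference negative
```

The same `and_positive` helper would apply to the other indexes that are defined with the & operator.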
Index $X_4$ is the difference in forward active energy between two consecutive days. It characterizes the backward-running state of the electric energy meter and is calculated as:
$$X_4 = P_1 - P_{1q}$$
where $P_{1q}$ is the forward active energy of the previous day.
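The difference and ratio indexes can be computed directly from daily readings. The sketch below follows the verbal descriptions of X1, X2 and X4; the function names, argument names and sign conventions (current day minus previous day, total minus the sum of rate periods) are assumptions, as are the example values.

```python
def index_x1(p1: float, rate_period_energies: list) -> float:
    """X1: daily forward active energy minus the summed energy of the
    sharp, peak, flat and valley rate periods."""
    return p1 - sum(rate_period_energies)


def index_x2(p1: float, p1_ref: float) -> float:
    """X2: ratio of the daily forward active energy to its reference value."""
    return p1 / p1_ref


def index_x4(p1: float, p1_prev: float) -> float:
    """X4: forward active energy of the current day minus that of the previous day."""
    return p1 - p1_prev


# Hypothetical daily readings in kWh.
print(index_x1(100.0, [10.0, 30.0, 40.0, 20.0]))
print(index_x2(150.0, 100.0))
print(index_x4(90.0, 100.0))
```

In this toy data, X1 is zero because the rate-period energies add up to the daily total, X2 above 1 means the reading exceeds its reference, and a negative X4 means the cumulative forward energy went down between days.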
Index X_5 characterizes the voltage loss state and is calculated as follows:
where the quantities involved are the voltage of any phase, the rated voltage under normal operating conditions, the current of any phase, and the starting current of the electric energy meter. The & operator follows this rule: if the results of both expressions on the right-hand side of the equation are greater than zero, the index on the left-hand side is set to 1; otherwise it is set to 0.
Index X_6 characterizes the voltage phase-failure state and is calculated as follows:
where the additional quantity is the lower voltage threshold under normal operating conditions, and the & operator follows the same rule: if the results of both expressions on the right-hand side of the equation are greater than zero, the index on the left-hand side is set to 1; otherwise it is set to 0.
Index X_7 characterizes the voltage out-of-limit state and is calculated as follows:
where the additional quantity is the upper voltage threshold under normal operating conditions.
Index X_8 characterizes the voltage unbalance state and is calculated as follows:
where the quantities involved are the maximum and the minimum of the three-phase voltages.
Index X_9 characterizes the current loss state and is calculated as follows:
Index X_10 characterizes the current unbalance state and is calculated as follows:
where the quantities involved are the maximum and the minimum of the three-phase currents.
Index X_11 characterizes incorrect wiring of the metering circuit. It is expressed as the logical AND of the daily reverse active energy and the difference in line loss rate over two consecutive days, and is calculated as follows:
The & operator follows this rule: if the results of both expressions on the right-hand side of the equation are greater than zero, the index on the left-hand side is set to 1; otherwise it is set to 0.
Index X_12 characterizes the reverse power flow state. It is expressed using the three-phase currents and is calculated as follows:
Indexes X_13 and X_14 represent the main transformer loss rate and the bus loss rate, respectively. Because the no-load loss of the main transformer or bus is fixed, a light load means a small input energy and therefore a high loss rate. Light-load conditions generally occur after maintenance, and extending the period over which the loss rate is calculated brings the rate back toward normal. Under light load, whether the metering device is in a fault state must therefore be judged comprehensively. When the daily input energy of the main transformer or bus is below 10 kWh, the main transformer and bus loss rates are calculated from the most recent seven days of data; when it is 10-100 kWh, from the most recent five days; when it is 100-300 kWh, from the most recent three days; and when it is above 300 kWh, from the data of the current day.
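The window rule above can be sketched as follows. Two points are assumptions not stated in the source: the ranges are treated as half-open intervals (so exactly 10, 100, and 300 kWh fall into the shorter window), and the loss rate is taken as (input - output) / input over the window. The function names are hypothetical.

```python
def loss_rate_window_days(daily_input_kwh: float) -> int:
    """Number of recent days used to compute the loss rate.
    Boundary handling (half-open intervals) is an assumption."""
    if daily_input_kwh < 10:
        return 7
    if daily_input_kwh < 100:
        return 5
    if daily_input_kwh < 300:
        return 3
    return 1


def windowed_loss_rate(daily_in: list, daily_out: list) -> float:
    """Loss rate over the window chosen from the most recent day's input.
    Assumes loss rate = (input - output) / input, summed over the window."""
    days = loss_rate_window_days(daily_in[-1])
    total_in = sum(daily_in[-days:])
    total_out = sum(daily_out[-days:])
    if total_in == 0:
        raise ValueError("no input energy in the selected window")
    return (total_in - total_out) / total_in
```

For example, a transformer that received 8 kWh on the current day would have its loss rate computed over the last seven days of data.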
To improve the accuracy of identifying abnormal gateway states, whether the current day is in a power-outage maintenance state is also used as a data index, denoted X_15: it is set to 1 if the current day is in the power-outage maintenance state and to 0 otherwise. When the model identifies a sample, if only the loss indexes are abnormal, the sample is marked as abnormal electricity utilization behavior; all other cases are classified as a specific fault type or as the normal running state.
Because the value ranges of the model's input dimensions differ greatly, the input data must be mapped uniformly to the interval [0,1] by normalization so that no single dimension dominates the model and so that indexes with different units or orders of magnitude can be compared and weighted. The normalization formula is:
$x' = \frac{x - x_{min}}{x_{max} - x_{min}}$
where $x'$ is the normalized input data, $x$ is the original input data, and $x_{max}$ and $x_{min}$ are the maximum and minimum values of the original input data, respectively.
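A minimal sketch of this min-max normalization for one index (one column of the input data) is shown below. Returning 0.0 for a constant column is an assumption made here to avoid division by zero; the source does not address that case.

```python
def min_max_normalize(values: list) -> list:
    """Map the values of one index to [0, 1] using its minimum and maximum."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column carries no information, map it to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

In practice the minimum and maximum would be taken from the training data and reused when normalizing new measurements.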
S200: Based on the fault type distribution of the real samples, perform sample proliferation on the fault types that have few real samples. For example, if there are five fault types with 5000, 4500, 4600, 3800, and 600 samples respectively, sample proliferation is performed on the type with 600 samples so that its size is roughly balanced with the other types, forming a sample-balanced training sample set. Sample proliferation may use an oversampling method, such as the SMOTE algorithm, or a generative-model method.
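As a rough illustration of oversampling, the sketch below generates synthetic minority samples in the style of SMOTE: each new sample is interpolated between a minority sample and one of its nearest minority neighbours. It is a simplified sketch, not a full SMOTE implementation; the function name, the neighbour count `k`, and the seed are assumptions.

```python
import math
import random


def smote_like(minority: list, n_new: int, k: int = 3, seed: int = 0) -> list:
    """Generate n_new synthetic samples by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample, excluding itself
        neighbours = sorted(
            (s for s in minority if s is not base),
            key=lambda s: math.dist(base, s),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic
```

With the example counts above, about 3200 to 4400 synthetic samples would be needed to bring the 600-sample class in line with the others.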
S300: Perform feature screening on the samples in the training sample set, retain the highly relevant electricity utilization characteristic indexes, and remove the remaining indexes from the samples. Each sample contains 15 electricity utilization characteristic indexes, but not all 15 help the final classification. In this embodiment, several indexes that best separate the samples are selected from the 15. For example, if screening shows that 4 indexes make the difference between the classes of samples largest and therefore help the model classify best, those 4 indexes are selected.
S400: Construct a first neural network and add a second label to each sample in the training sample set according to its fault type label; the second label indicates whether the sample is faulty. Train the first neural network with the screened electricity utilization characteristic indexes of the samples as input and the second label as output to obtain a trained fault diagnosis model.
S500: Construct a second neural network and train it with the screened electricity utilization characteristic indexes of the samples as input and the fault type label as output to obtain a trained fault type identification model.
S600: Acquire the current electricity utilization data of the target gateway metering device, derive the corresponding electricity utilization characteristic indexes, and input them into the fault diagnosis model to judge whether a fault has occurred. If a fault has occurred, input the electricity utilization characteristic indexes into the fault type identification model to identify the fault type.
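The two-stage flow of steps S400 to S600 can be sketched as below. The models are represented by plain callables, and the names `NORMAL`, `to_second_label`, and `identify_state` are hypothetical; the source does not prescribe label values or interfaces.

```python
from typing import Callable, Sequence

NORMAL = "normal"  # hypothetical label for the normal running state


def to_second_label(fault_type_label: str) -> int:
    """Derive the binary second label (S400): 1 = faulty, 0 = not faulty."""
    return 0 if fault_type_label == NORMAL else 1


def identify_state(
    features: Sequence[float],
    fault_diagnosis_model: Callable[[Sequence[float]], int],
    fault_type_model: Callable[[Sequence[float]], str],
) -> str:
    """S600: run the fault diagnosis model first and call the fault type
    identification model only when a fault is detected."""
    if fault_diagnosis_model(features) == 0:
        return NORMAL
    return fault_type_model(features)
```

Splitting the task this way lets the second model be trained and evaluated only on the harder question of which fault occurred.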
As a preferred implementation manner of this example, in step S200, the method for performing sample proliferation on the real samples with a small number of samples specifically includes:
sample generation is carried out by adopting a variation self-encoder; variable Auto-Encoders (VAEs) are a common generation model that can learn a model such that the distribution of output data approximates as closely as possible to the original data distribution. The basic idea is to transform a stack of real samples through the encoder network into an ideal data distribution, which is then transferred to a decoder network to obtain a stack of generated samples. If the generated samples are sufficiently close to the real samples, a VAE model is trained.
Inputting the real samples into an encoder network of a variational self-encoder, and compressing the real samples into low-dimensional vectors;
and then inputting the low-dimensional vector into a decoder network of a variable self-encoder to reconstruct samples, thereby obtaining generated samples.
A VAE consists of two parts: an encoder, which compresses the original data into a low-dimensional vector, and a decoder, which restores the low-dimensional vector to the original data. First, the real sample X is input into the encoder to determine the posterior distribution:
$p(Z \mid X) = \frac{p(X \mid Z)\,p(Z)}{p(X)}$
However, this posterior is extremely expensive to compute: $p(X) = \int p(X \mid Z)\,p(Z)\,dZ$ is a mixture distribution whose integral is very difficult to evaluate, and the computational complexity grows exponentially as X grows. Variational inference is therefore used, with $q(Z \mid X)$ approximating $p(Z \mid X)$. It is generally assumed that $q(Z \mid X)$ follows a Gaussian distribution, i.e. $q(Z \mid X) = \mathcal{N}(\mu, \sigma^2)$. The hidden variable Z is then introduced by generating an auxiliary variable $\varepsilon \sim \mathcal{N}(0, I)$:
$Z = \mu + \sigma \odot \varepsilon$
The data are then expressed through the hidden variable Z as $p(X \mid Z)$, which is the decoder process.
For any input data, the output obtained by converting back through the hidden variable should be as close as possible to the input, which leads to the maximum likelihood estimate:
$\max \log p(X)$
Transforming $\log p(X)$ gives:
$\log p(X) = \mathcal{L} + KL\big(q(Z \mid X)\,\|\,p(Z \mid X)\big)$
where:
$\mathcal{L} = \mathbb{E}_{q(Z \mid X)}\big[\log p(X \mid Z)\big] - KL\big(q(Z \mid X)\,\|\,p(Z)\big)$
Finally, the parameters are adjusted through the neural network so that $KL\big(q(Z \mid X)\,\|\,p(Z \mid X)\big)$ is as small as possible and $\mathcal{L}$ is as large as possible, so that the generated samples are as similar as possible to the real samples. The VAE model can generate gateway metering device fault samples in large quantities, thereby achieving sample balance.
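Two building blocks of a VAE with a Gaussian encoder can be sketched without any deep learning library: drawing the hidden variable from the encoder's mean and variance through auxiliary Gaussian noise, and the closed-form KL divergence to a standard normal prior. The standard normal prior and the log-variance parameterization are common choices assumed here, not details given in the source.

```python
import math
import random


def reparameterize(mu: list, log_var: list, rng: random.Random) -> list:
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), per dimension."""
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mu, log_var)
    ]


def kl_to_standard_normal(mu: list, log_var: list) -> float:
    """KL( N(mu, sigma^2) || N(0, I) ) for a diagonal Gaussian."""
    return 0.5 * sum(
        m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var)
    )
```

A training loss would typically add this KL term to a reconstruction error between the decoder output and the real sample.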
As a preferred implementation manner of this embodiment, in step S300, the method for screening the samples in the training sample set to obtain the highly relevant power consumption characteristic index specifically includes:
For samples of any two fault types, the feature set of the first class of samples and the feature set of the second class of samples are denoted as follows:
where, for the i-th electricity utilization characteristic index, the feature set of the first class collects the values of that index from the 1st to the N1-th sample of the first class, and the feature set of the second class collects the values of that index from the 1st to the N2-th sample of the second class; N1 is the number of samples of the first class and N2 is the number of samples of the second class;
Because the input electricity utilization characteristic indexes differ greatly, Gaussian kernel Fisher discriminant analysis (GKFDA) is used to select the most expressive features and thereby reduce the information dimension. GKFDA first applies a nonlinear transformation to the original data samples through a Gaussian kernel function, mapping them into a high-dimensional space. The mapping equation is:
$k(x, x_c) = \exp\left(-\frac{\lVert x - x_c \rVert^2}{2\sigma^2}\right)$
where $\lVert x - x_c \rVert$ is the Euclidean distance between $x$ and $x_c$, $x_c$ is the center point of the kernel function, $x$ is any point in space, and $\sigma$ is the bandwidth, which controls the range of action of the Gaussian kernel function;
Using the Gaussian kernel mapping equation, the feature sets of the two classes of samples mapped to the high-dimensional space are obtained as follows:
where, for the i-th electricity utilization characteristic index, the mapped feature set of the first class collects the values of that index after mapping to the high-dimensional space, from the 1st to the N1-th sample of the first class, and the mapped feature set of the second class collects the corresponding mapped values from the 1st to the N2-th sample of the second class;
The Fisher linear discriminant method is then used to obtain the intra-class dispersion matrix of each class's feature set:
The total intra-class dispersion matrix and the total inter-class dispersion matrix of the samples are then obtained:
A feature scoring formula is constructed from the total intra-class dispersion matrix and the total inter-class dispersion matrix of the samples:
The result is the feature score of the i-th electricity utilization characteristic index. The larger its value, the larger the inter-class dispersion of samples of different types, that is, the greater the discriminating power of the feature;
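The idea of scoring a feature by the ratio of inter-class to intra-class dispersion can be illustrated with a simplified, non-kernel sketch for a single index and two classes. This is not the patent's exact scoring formula, which is not recoverable from the text; the function name and the handling of zero intra-class dispersion are assumptions.

```python
def fisher_score(class1: list, class2: list) -> float:
    """Ratio of inter-class dispersion to intra-class dispersion for one
    feature and two classes (plain linear Fisher criterion)."""
    m1 = sum(class1) / len(class1)
    m2 = sum(class2) / len(class2)
    between = (m1 - m2) ** 2
    within = sum((x - m1) ** 2 for x in class1) + sum(
        (x - m2) ** 2 for x in class2
    )
    if within == 0:
        # Assumption: perfectly tight classes get the best possible score.
        return float("inf")
    return between / within
```

A kernelized variant would apply the same ratio to the feature values after the Gaussian kernel mapping.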
After the discriminating power of the features has been assessed, the redundancy among the electricity utilization characteristic indexes is calculated using the maximal information coefficient (MIC), as follows:
Given i and j, the scatter plot formed by any two variables U and V is partitioned into a grid of i columns and j rows, and the maximum mutual information value over such grids is found; this maximum is then normalized; finally, the largest normalized mutual information value across the different grid scales is taken as the MIC value:
$MIC(U, V) = \max_{i \times j < B} \frac{I(U; V)}{\log_2 \min(i, j)}$
where U and V are any two variables and B is a preset reference factor; $I(U; V)$ is the mutual information of U and V, calculated as:
$I(U; V) = \sum_{u}\sum_{v} p(u, v) \log_2 \frac{p(u, v)}{p(u)\,p(v)}$
where $p(u, v)$ is the joint probability of U and V falling in a grid cell and $p(u)$ and $p(v)$ are the corresponding marginal probabilities.
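The mutual information term can be sketched for data that has already been assigned to grid cells (bin indexes). This covers only the mutual information of one fixed grid; the search over grid partitions and the normalization that a full MIC computation requires are not shown. The function name is an assumption.

```python
import math
from collections import Counter


def mutual_information(u_bins: list, v_bins: list) -> float:
    """Mutual information (in bits) of two variables given as bin indexes."""
    n = len(u_bins)
    joint = Counter(zip(u_bins, v_bins))
    count_u = Counter(u_bins)
    count_v = Counter(v_bins)
    mi = 0.0
    for (u, v), c in joint.items():
        p_uv = c / n
        mi += p_uv * math.log2(p_uv / ((count_u[u] / n) * (count_v[v] / n)))
    return mi
```

Two indexes with high mutual information carry largely the same information, so keeping both adds redundancy.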
The final score is calculated from the feature score and the MIC value:
where the result is the final score of the i-th electricity utilization characteristic index and N is the number of features. Based on the calculated final scores, an optimal feature group that reflects the running state of the gateway metering device is selected. In this embodiment, the electricity utilization characteristic indexes are sorted by final score, and the several indexes with the highest final scores form the optimal feature group.
As a preferred implementation manner of this embodiment, the first neural network adopts a multi-scale convolutional neural network, and a batch normalization and Dropout algorithm is added to optimize after a pooling layer of the multi-scale convolutional neural network;
The second neural network employs a deep residual shrinkage network.
The structure of the multi-scale convolutional neural network model constructed in this embodiment is shown in fig. 2, where F is the number of convolution kernels, K is the size of the convolution kernels, P is the size of the pooling layer, and S is the pooling stride. Batch normalization (BN) and the Dropout algorithm are used to prevent the network from overfitting.
Let $S_0$ be the input sequence matrix and $S_i$ the output sequence matrix of the i-th layer.
The convolution layer extracts local features from the input sequence data. In a CNN, convolution layers and pooling layers usually alternate. Assuming that $S_i$ (i odd) is the output matrix of a convolution layer, it can be described as:
$S_i = f(W_i \otimes S_{i-1} + b_i)$
where $W_i$ is the weight of the i-th layer, $b_i$ is the bias of the i-th layer, $\otimes$ denotes the convolution operation, and $f$ is the activation function.
The pooling layer compresses the features extracted by the convolution layer and reduces the information dimension. This study uses the max pooling function: within each (c × 1) pooling kernel, the maximum value is kept as the output feature. The output matrix $S_m$ of a pooling layer (m an even number starting from 2) can be expressed as:
$S_m = \mathrm{maxpool}(S_{m-1})$
where $\mathrm{maxpool}$ is the max pooling function, $S_m$ has size $j/c \times k$, $j$ and $k$ are the dimensions of the features of layer $S_{m-1}$, and $c$ is the size of the current pooling layer.
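A (c × 1) max pooling step that reduces a j × k feature matrix to j/c × k can be sketched as below. The stride is assumed to equal c, which is what the stated output size implies; rows that do not fill a complete kernel are dropped. The function name is hypothetical.

```python
def max_pool_c_by_1(matrix: list, c: int) -> list:
    """Max pooling with a (c x 1) kernel and stride c along the first axis."""
    rows = len(matrix) // c
    cols = len(matrix[0])
    return [
        [max(matrix[r * c + t][col] for t in range(c)) for col in range(cols)]
        for r in range(rows)
    ]
```

For a 4 × 2 input and c = 2, the output is a 2 × 2 matrix holding the larger value of each vertical pair.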
The output layer is essentially a fully connected layer whose role is classification, with softmax selected as its activation function. At this layer, the model calculates the probability that each sample belongs to each type, giving a new expression:
where f is the softmax activation function, the output is the probability that the input sample belongs to the i-th type, and the remaining terms are the weight and the bias b of the output layer.
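A fully connected output layer with softmax activation can be sketched as follows. The layer form (weights times features plus bias, then softmax) is the usual construction and is assumed here; the function names are hypothetical.

```python
import math


def softmax(logits: list) -> list:
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def output_layer(features: list, weights: list, bias: list) -> list:
    """Class probabilities from a fully connected layer followed by softmax.
    `weights` holds one row of coefficients per class."""
    logits = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)
```

The predicted type is the index of the largest probability in the returned list.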
The deep residual shrinkage network (DRSN) is an improved network based on the deep residual network; it combines the deep residual network, an attention mechanism, and a soft threshold function.
The deep residual shrinkage network model constructed in this embodiment is shown in fig. 3. The network stacks a number of residual shrinkage units with adaptive thresholds (RSBU-CW) together with convolution layers (Conv), batch normalization (BN), activation functions (ReLU), global average pooling (GAP), a fully connected layer (FC), and so on. The network learns sample features adaptively and reduces the influence of noise on the current task through soft thresholding.
The residual shrinkage network consists of 4 residual shrinkage units; the downsampling stride of each unit is set to 2, and the number of channels of the output feature map is set to 4. L2 regularization is selected for the output layer, with the regularization coefficient set to 0.0015. The cross-entropy loss function is selected as the loss function of the network, and the initial learning rate of Adam is set to 1e-4.
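The soft thresholding used by the residual shrinkage units can be sketched as a scalar function: values whose magnitude is below the threshold are set to zero, and larger values are shrunk toward zero by the threshold. In the network the threshold is learned per channel; here `tau` is simply passed in, which is an illustrative simplification.

```python
def soft_threshold(x: float, tau: float) -> float:
    """Soft thresholding: sign(x) * max(|x| - tau, 0)."""
    magnitude = max(abs(x) - tau, 0.0)
    return magnitude if x >= 0 else -magnitude
```

Small, noise-like activations are removed while larger activations keep their sign and most of their magnitude.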
In practical applications, the acquired data contain a certain amount of noise due to environmental influences and electromagnetic interference. A network trained on real samples without any optimization measures is prone to overfitting and divergence, which significantly reduces the identification accuracy of the model. To address this and to improve the generalization and anti-interference capability of the model, this embodiment optimizes the neural network with the following methods.
1. Batch normalization
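Batch normalization (BN) means that, when training with stochastic gradient descent, the data are divided into batches and each batch is normalized before being propagated further. Batch normalization is calculated as follows:
$\mu_B = \frac{1}{m}\sum_{i=1}^{m} x_i, \quad \sigma_B^2 = \frac{1}{m}\sum_{i=1}^{m} (x_i - \mu_B)^2, \quad \hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \varepsilon}}, \quad y_i = \gamma \hat{x}_i + \beta$
where $\gamma$ and $\beta$ are two learnable variables of the algorithm, $\varepsilon$ is a small positive number added to prevent division by zero, and $m$ is the number of input data contained in each batch.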
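The batch normalization computation for one feature over one batch can be sketched as below. Fixed values are used for the scale and shift parameters, which would be learned during training; the default `eps` is an assumed typical value.

```python
import math


def batch_norm(batch: list, gamma: float = 1.0, beta: float = 0.0,
               eps: float = 1e-5) -> list:
    """Normalize one feature over a batch, then scale by gamma and shift by beta."""
    m = len(batch)
    mean = sum(batch) / m
    var = sum((x - mean) ** 2 for x in batch) / m
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]
```

At inference time, frameworks usually replace the batch mean and variance with running averages collected during training.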
2. Regularization coefficient, loss function and optimizer selection
L2 regularization is selected in the output layer to accelerate network convergence and prevent overfitting. Categorical cross entropy is selected as the loss function of the model; the smaller the cross entropy, the closer the actual output is to the expected output. The loss function can be expressed as:
$L = -\frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{C} y_{ij} \log p_{ij} + \lambda \sum_{w} w^2$
where $y_{ij}$ is a variable with value 0 or 1: it is 1 if the category of the i-th sample is the j-th category and 0 otherwise; $p_{ij}$ is the probability that the i-th sample belongs to the j-th category; N is the number of samples; C is the number of output categories; $\lambda$ is the L2 regularization factor; and $w$ ranges over the regularized network weights.
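A categorical cross-entropy loss with an L2 penalty can be sketched as follows. The exact form of the penalty (here the regularization factor times the sum of squared weights) is an assumption, as are the function and parameter names.

```python
import math


def cross_entropy_l2(y_true: list, y_prob: list, weights: list,
                     lam: float) -> float:
    """Mean categorical cross entropy over N samples plus lam * sum(w^2).
    y_true holds one-hot rows; y_prob holds predicted class probabilities."""
    n = len(y_true)
    ce = -sum(
        y * math.log(p)
        for y_row, p_row in zip(y_true, y_prob)
        for y, p in zip(y_row, p_row)
        if y  # only the true class contributes
    ) / n
    return ce + lam * sum(w * w for w in weights)
```

With the regularization coefficient of 0.0015 mentioned for the output layer, the penalty stays small relative to the cross entropy unless the weights grow large.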
The Adam optimizer is used as the optimizer of the model. It iteratively updates the weights of the neural network according to the result of each training step so as to minimize the loss function of the model.
The training process of the neural network in this embodiment is summarized as follows. First, the input electricity utilization characteristic indexes are normalized. Next, the parameters of the model are set, including the number of convolution kernels, the convolution kernel size, the convolution stride, the pooling window size, the pooling stride, the number of neurons in the fully connected layer, the batch size, the learning rate, the number of training iterations, and so on. The samples are then randomly divided into training, validation, and test sets, and the weight matrices and bias variables of the network are initialized. Training consists of a forward propagation pass and a backward propagation pass: the network first computes its output from the input data by forward propagation and calculates the difference between the actual output and the ideal output; gradients are then computed layer by layer from the last layer by the backpropagation algorithm, and the Adam optimizer adjusts the weights and biases of the network according to these gradients. Training ends when the network error meets the fault diagnosis requirement or the number of training iterations reaches the preset number.
After the neural network training is completed, step S600 is performed. To identify the running state of the gateway metering device, the current electricity utilization data are first acquired and preprocessed, and the screened feature set is extracted to obtain the current electricity utilization characteristic indexes. These are sent to the fault diagnosis model based on the multi-scale convolutional neural network, which judges whether the gateway metering device is in a fault state or in the normal electricity utilization state. If the state is normal, the identification process ends. If the fault diagnosis model judges that the gateway metering device is in a fault state, the preprocessed electricity utilization characteristic indexes are sent to the fault type identification model based on the deep residual shrinkage network to identify the specific fault type of the gateway metering device.
Embodiment two:
the embodiment provides a gateway metering device running state identification system, which comprises:
the data acquisition module is used for acquiring historical electricity utilization data of the target gateway metering device, acquiring a plurality of electricity utilization characteristic indexes of a plurality of historical time periods based on the historical electricity utilization data for preprocessing, and adding fault type labels to the electricity utilization characteristic indexes of each historical time period to form a real sample; the module is used for implementing the function of step S100 in the first embodiment, and will not be described here again;
The sample balancing module is used for carrying out sample proliferation on the few real samples based on fault type distribution of the real samples to form a training sample set; the module is used for implementing the function of step S200 in the first embodiment, and will not be described in detail herein;
the feature screening module is used for carrying out feature screening on the samples in the training sample set, screening out high-correlation power utilization feature indexes, and eliminating the rest power utilization feature indexes from the samples; the module is used for implementing the function of step S300 in the first embodiment, and will not be described in detail herein;
the first network training module is used for constructing a first neural network, re-adding a second label to the samples in the training sample set according to the fault type label, wherein the second label is used for indicating whether the samples have faults or not, and training the first neural network by taking the electricity utilization characteristic index in the samples as input and taking the second label as output to obtain a trained fault diagnosis model; the module is used for realizing the function of step S400 in the first embodiment, and will not be described in detail herein;
the second network training module is used for constructing a second neural network, taking the electricity utilization characteristic index in the sample as input, and taking the fault type label as output to train the second neural network so as to obtain a trained fault type identification model; the module is used for realizing the function of step S500 in the first embodiment, and will not be described in detail herein;
The identification module is used for acquiring current electricity utilization data of the target gateway metering device, acquiring corresponding electricity utilization characteristic indexes, inputting the corresponding electricity utilization characteristic indexes into the fault diagnosis model, judging whether faults occur or not, and inputting the electricity utilization characteristic indexes into the fault type identification model to identify the fault type occurring after the faults occur; the module is used to implement the function of step S600 in the first embodiment, which is not described herein.
As a preferred implementation manner of this embodiment, the method for performing sample proliferation on a small number of real samples by the sample balancing module specifically includes:
samples are generated using a variational autoencoder;
the real samples are input into the encoder network of the variational autoencoder and compressed into low-dimensional vectors;
the low-dimensional vectors are then input into the decoder network of the variational autoencoder to reconstruct samples, yielding the generated samples.
As a preferred implementation manner of this embodiment, the feature screening module performs feature screening on samples in the training sample set, and the method for screening out the high-correlation power consumption feature index specifically includes:
for samples of any two fault types, the feature set of the first class of samples and the feature set of the second class of samples are denoted as follows:
where, for the i-th electricity utilization characteristic index, the feature set of the first class collects the values of that index from the 1st to the N1-th sample of the first class, and the feature set of the second class collects the values of that index from the 1st to the N2-th sample of the second class; N1 is the number of samples of the first class and N2 is the number of samples of the second class;
a Gaussian kernel function is used to apply a nonlinear transformation to the samples and map them into a high-dimensional space, with the mapping equation:
$k(x, x_c) = \exp\left(-\frac{\lVert x - x_c \rVert^2}{2\sigma^2}\right)$
where $\lVert x - x_c \rVert$ is the Euclidean distance between $x$ and $x_c$, $x_c$ is the center point of the kernel function, $x$ is any point in space, and $\sigma$ is the bandwidth, which controls the range of action of the Gaussian kernel function;
the feature sets of the two classes of samples mapped to the high-dimensional space are obtained as follows:
where, for the i-th electricity utilization characteristic index, the mapped feature set of the first class collects the values of that index after mapping to the high-dimensional space, from the 1st to the N1-th sample of the first class, and the mapped feature set of the second class collects the corresponding mapped values from the 1st to the N2-th sample of the second class;
the Fisher linear discriminant method is then used to obtain the intra-class dispersion matrix of each class's feature set:
the total intra-class dispersion matrix and the total inter-class dispersion matrix of the samples are then obtained:
a feature scoring formula is constructed from the total intra-class dispersion matrix and the total inter-class dispersion matrix of the samples:
the result is the feature score of the i-th electricity utilization characteristic index; the larger its value, the larger the inter-class dispersion of samples of different types, that is, the greater the discriminating power of the feature;
the redundancy among the electricity utilization characteristic indexes is calculated using the maximal information coefficient (MIC), as follows:
given i and j, the scatter plot formed by any two variables U and V is partitioned into a grid of i columns and j rows, and the maximum mutual information value over such grids is found; this maximum is then normalized; finally, the largest normalized mutual information value across the different grid scales is taken as the MIC value:
$MIC(U, V) = \max_{i \times j < B} \frac{I(U; V)}{\log_2 \min(i, j)}$
where U and V are any two variables and B is a preset reference factor; $I(U; V)$ is the mutual information of U and V, calculated as:
$I(U; V) = \sum_{u}\sum_{v} p(u, v) \log_2 \frac{p(u, v)}{p(u)\,p(v)}$
the final score is calculated from the feature score and the MIC value:
where the result is the final score of the i-th electricity utilization characteristic index and N is the number of features; based on the calculated final scores, an optimal feature group that reflects the running state of the gateway metering device is selected.
As a preferred implementation manner of this embodiment, the first neural network adopts a multi-scale convolutional neural network, and a batch normalization and Dropout algorithm is added to optimize after a pooling layer of the multi-scale convolutional neural network;
the second neural network employs a deep residual shrinkage network.
Embodiment III:
the embodiment provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the operation state identification method of the gateway metering device according to any embodiment of the application when executing the program.
Embodiment four:
the present embodiment proposes a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method for identifying an operation state of a gateway metering apparatus according to any of the embodiments of the present application.
In the embodiments of the present application, "at least one" means one or more, and "a plurality" means two or more. "and/or", describes an association relation of association objects, and indicates that there may be three kinds of relations, for example, a and/or B, and may indicate that a alone exists, a and B together, and B alone exists. Wherein A, B may be singular or plural. The character "/" generally indicates that the context-dependent object is an "or" relationship. "at least one of the following" and the like means any combination of these items, including any combination of single or plural items. For example, at least one of a, b and c may represent: a, b, c, a and b, a and c, b and c or a and b and c, wherein a, b and c can be single or multiple.
Those of ordinary skill in the art will appreciate that the units and algorithm steps described in the embodiments disclosed herein can be implemented as electronic hardware, or as a combination of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In the several embodiments provided by the present application, any of the functions, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product that is stored in a storage medium and comprises several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The foregoing description is only illustrative of the present invention and is not intended to limit its scope; all equivalent structures or equivalent processes derived from it, and any direct or indirect application in other related technical fields, are likewise included in the scope of protection of the present invention.

Claims (8)

1. A method for identifying the operation state of a gateway metering device, characterized by comprising the following steps:
collecting historical electricity utilization data of a target gateway metering device, acquiring a plurality of electricity utilization characteristic indexes of a plurality of historical time periods based on the historical electricity utilization data, preprocessing the historical electricity utilization characteristic indexes, and adding a fault type label to the electricity utilization characteristic indexes of each historical time period to form a real sample;
based on the fault type distribution of the real samples, performing sample proliferation on the real samples of fault types with few samples, so as to form a training sample set;
screening the characteristics of the samples in the training sample set, screening out high-correlation electricity utilization characteristic indexes, and removing the rest electricity utilization characteristic indexes from the samples;
constructing a first neural network, re-adding a second label to the samples in the training sample set according to the fault type label, wherein the second label is used for indicating whether the samples have faults or not, taking the electricity utilization characteristic index in the samples as input, and taking the second label as output to train the first neural network to obtain a trained fault diagnosis model;
constructing a second neural network, taking the electricity utilization characteristic index in the sample as input, and taking the fault type label as output to train the second neural network to obtain a trained fault type identification model;
acquiring current electricity utilization data of the target gateway metering device, acquiring corresponding electricity utilization characteristic indexes, inputting the corresponding electricity utilization characteristic indexes into a fault diagnosis model, judging whether faults occur or not, and inputting the electricity utilization characteristic indexes into a fault type identification model to identify the type of the faults after the faults occur;
wherein performing feature screening on the samples in the training sample set to screen out the high-correlation electricity utilization characteristic indexes specifically comprises the following steps:
for samples of any two fault types, denoting the feature set of the first class of samples as $X^{(1)}$ and the feature set of the second class of samples as $X^{(2)}$;
wherein $X_i^{(1)}$ represents the feature set of the i-th electricity utilization characteristic index in the first class of samples, $X_i^{(2)}$ represents the feature set of the i-th electricity utilization characteristic index in the second class of samples, $x_{i,N_1}^{(1)}$ represents the i-th electricity utilization characteristic index of the $N_1$-th sample in the first class of samples, $x_{i,N_2}^{(2)}$ represents the i-th electricity utilization characteristic index of the $N_2$-th sample in the second class of samples, $N_1$ is the number of samples of the first class, and $N_2$ is the number of samples of the second class;
performing a nonlinear transformation on the samples using a Gaussian kernel function to map them into a high-dimensional space, the mapping equation being as follows:
wherein $K(x, x_c)$ is the Gaussian kernel function of $x$ and $x_c$, where $x_c$ represents the center point of the kernel function and $x$ is any point in the space; $\sigma$ is the bandwidth, which controls the range of action of the Gaussian kernel function;
obtaining the feature sets $\Phi^{(1)}$ and $\Phi^{(2)}$ of the two classes of samples after mapping to the high-dimensional space, specifically:
wherein $\Phi_j^{(1)}$ represents the feature set of the j-th electricity utilization characteristic index after the first class of samples is mapped to the high-dimensional space, and $\Phi_j^{(2)}$ represents the feature set of the j-th electricity utilization characteristic index after the second class of samples is mapped to the high-dimensional space; $\phi_{i,N_1}^{(1)}$ represents the i-th electricity utilization characteristic index of the $N_1$-th sample in the first class of samples after mapping to the high-dimensional space, and $\phi_{i,N_2}^{(2)}$ represents the i-th electricity utilization characteristic index of the $N_2$-th sample in the second class of samples after mapping to the high-dimensional space;
and then obtaining an intra-class dispersion matrix of each class of sample feature set by using a Fisher linear discriminant method, wherein the intra-class dispersion matrix is as follows:
wherein :
acquiring a total intra-class dispersion matrix and a total inter-class dispersion matrix of a sample:
constructing a characteristic scoring formula according to a total intra-class dispersion matrix and a total inter-class dispersion matrix of the sample:
wherein $F_i$ is the feature score of the i-th electricity utilization characteristic index; the larger its value, the greater the inter-class dispersion of samples of different types, that is, the greater the discriminative power of the feature;
calculating the redundancy among the electricity utilization characteristic indexes by the maximum information coefficient (MIC) method, specifically:
given i and j, dividing the scatter plot formed by any two variables U and V into a grid of i columns and j rows and finding the maximum mutual information value; then normalizing the maximum mutual information value; and finally selecting the maximum normalized mutual information value over the different grid scales as the MIC value, the calculation formula being as follows:
wherein U and V are any two variables, and B is a preset reference factor; $I(U;V)$ is the mutual information value of U and V, the calculation formula being as follows:
calculating a final score based on the feature score $F_i$ and the MIC value:
wherein $J_i$ is the final score of the i-th electricity utilization characteristic index, N is the number of features, and $f_i$ is the i-th electricity utilization characteristic index; and selecting, based on the calculated final scores, an optimal feature group capable of reflecting the operation state of the gateway metering device.
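For orientation, the quantities named in the feature screening steps of claim 1 can be written in their standard textbook forms. This is a hedged sketch, not the patent's own expressions: the equation images are not preserved in this text, so the notation ($K$, $m_i^{(c)}$, $S_i^{(c)}$, $S_{w,i}$, $S_{b,i}$, $F_i$) and the per-feature scalar form of the scatter terms are assumptions. The formula that combines the feature score with the MIC values into the final score cannot be recovered from the surrounding wording and is deliberately left out.

```latex
% Gaussian kernel with centre x_c and bandwidth \sigma (standard form, assumed)
K(x, x_c) = \exp\!\left(-\frac{\lVert x - x_c \rVert^{2}}{2\sigma^{2}}\right)

% Per-feature class mean and within-class scatter for class c = 1, 2,
% computed on the mapped values \phi_{i,n}^{(c)}
m_i^{(c)} = \frac{1}{N_c}\sum_{n=1}^{N_c} \phi_{i,n}^{(c)}, \qquad
S_i^{(c)} = \sum_{n=1}^{N_c}\bigl(\phi_{i,n}^{(c)} - m_i^{(c)}\bigr)^{2}

% Total within-class scatter, between-class scatter, and Fisher-style score
S_{w,i} = S_i^{(1)} + S_i^{(2)}, \qquad
S_{b,i} = \bigl(m_i^{(1)} - m_i^{(2)}\bigr)^{2}, \qquad
F_i = \frac{S_{b,i}}{S_{w,i}}

% Maximal information coefficient over i-column, j-row grids G bounded by B
\mathrm{MIC}(U, V) = \max_{i \cdot j < B}\;
  \frac{\max_{G} I_G(U; V)}{\log_2 \min(i, j)}

% Mutual information of the binned variables
I(U; V) = \sum_{u}\sum_{v} p(u, v)\,\log_2 \frac{p(u, v)}{p(u)\,p(v)}
```

Under these forms a large $F_i$ corresponds to well-separated class means relative to the spread within each class, which matches the claim's statement that a larger score means greater discriminative power.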
2. The method for identifying the operation state of a gateway metering device according to claim 1, wherein the sample proliferation of the real samples of fault types with few samples is specifically performed as follows:
generating samples with a variational autoencoder (VAE);
inputting the real samples into the encoder network of the variational autoencoder, which compresses the real samples into low-dimensional vectors;
and then inputting the low-dimensional vectors into the decoder network of the variational autoencoder to reconstruct samples, thereby obtaining generated samples.
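The encode-then-decode path of claim 2 can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the class name `TinyVAE`, the single linear layers, the feature and latent sizes, and the random untrained weights. The reparameterization step and the KL term are standard parts of a variational autoencoder that the claim does not spell out. With untrained weights the output is not a realistic sample; the sketch only shows the data flow.

```python
import math
import random


def linear(weights, bias, x):
    """Affine map: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def init_layer(rng, n_out, n_in):
    """Small random weights and zero biases (untrained, illustration only)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def kl_divergence(mu, logvar):
    """KL(N(mu, exp(logvar)) || N(0, 1)), the regulariser in the VAE loss."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))


class TinyVAE:
    """Hypothetical minimal VAE: linear encoder and decoder, no training loop."""

    def __init__(self, n_features, n_latent, seed=0):
        self.rng = random.Random(seed)
        self.enc_mu = init_layer(self.rng, n_latent, n_features)
        self.enc_logvar = init_layer(self.rng, n_latent, n_features)
        self.dec = init_layer(self.rng, n_features, n_latent)

    def encode(self, x):
        """Compress a sample into the mean and log-variance of a low-dimensional code."""
        return linear(*self.enc_mu, x), linear(*self.enc_logvar, x)

    def reparameterize(self, mu, logvar):
        """Draw z = mu + sigma * eps with eps ~ N(0, 1)."""
        return [m + math.exp(0.5 * lv) * self.rng.gauss(0.0, 1.0)
                for m, lv in zip(mu, logvar)]

    def decode(self, z):
        """Reconstruct a full-length sample from the low-dimensional code."""
        return linear(*self.dec, z)

    def generate_from(self, x):
        mu, logvar = self.encode(x)
        return self.decode(self.reparameterize(mu, logvar))


vae = TinyVAE(n_features=6, n_latent=2)
real_sample = [0.2, 0.5, 0.1, 0.9, 0.4, 0.7]  # made-up feature values
mu, logvar = vae.encode(real_sample)
generated = vae.generate_from(real_sample)
```

In a real setting the weights would first be trained on the minority-class samples by minimising reconstruction error plus `kl_divergence`, and only then used to generate additional samples.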
3. The method for identifying the operation state of a gateway metering device according to claim 1, wherein:
the first neural network adopts a multi-scale convolutional neural network, in which batch normalization and Dropout are added after the pooling layer for optimization;
the second neural network adopts a deep residual shrinkage network.
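The second network of claim 3 is, in the usual English terminology, a deep residual shrinkage network, whose characteristic operation is soft thresholding of feature values. The sketch below shows only that operation. The function names and the fixed scaling factor `alpha` are assumptions for illustration; in such networks the scaling factor is normally produced by a small learned sub-network rather than set by hand, and the claim itself does not specify these details.

```python
def soft_threshold(x, tau):
    """Shrink x toward zero by tau; values inside [-tau, tau] become 0."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


def shrink_channel(values, alpha):
    """Soft-threshold one channel with tau = alpha * mean(|values|).

    alpha is assumed to lie in (0, 1); here it is a fixed, hypothetical
    number standing in for a learned coefficient.
    """
    tau = alpha * sum(abs(v) for v in values) / len(values)
    return [soft_threshold(v, tau) for v in values]


channel = [4.0, -2.0, 0.0, 2.0]
shrunk = shrink_channel(channel, alpha=0.5)  # mean |x| = 2.0, so tau = 1.0
```

Small-magnitude values, which are more likely to be noise, are set to zero, while larger values are kept with reduced magnitude.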
4. A gateway metering device operational status identification system, comprising:
the data acquisition module is used for acquiring historical electricity utilization data of the target gateway metering device, acquiring a plurality of electricity utilization characteristic indexes of a plurality of historical time periods based on the historical electricity utilization data for preprocessing, and adding fault type labels to the electricity utilization characteristic indexes of each historical time period to form a real sample;
the sample balancing module is used for performing sample proliferation on the real samples of fault types with few samples, based on the fault type distribution of the real samples, to form a training sample set;
the feature screening module is used for carrying out feature screening on the samples in the training sample set, screening out high-correlation power utilization feature indexes, and eliminating the rest power utilization feature indexes from the samples;
the first network training module is used for constructing a first neural network, re-adding a second label to the samples in the training sample set according to the fault type label, wherein the second label is used for indicating whether the samples have faults or not, and training the first neural network by taking the electricity utilization characteristic index in the samples as input and taking the second label as output to obtain a trained fault diagnosis model;
the second network training module is used for constructing a second neural network, taking the electricity utilization characteristic index in the sample as input, and taking the fault type label as output to train the second neural network so as to obtain a trained fault type identification model;
the identification module is used for acquiring current electricity utilization data of the target gateway metering device, acquiring corresponding electricity utilization characteristic indexes, inputting the corresponding electricity utilization characteristic indexes into the fault diagnosis model, judging whether faults occur or not, and inputting the electricity utilization characteristic indexes into the fault type identification model to identify the fault type occurring after the faults occur;
the feature screening module performs feature screening on samples in the training sample set, and the method for screening out the high-correlation electricity utilization feature indexes specifically comprises the following steps:
for samples of any two fault types, denoting the feature set of the first class of samples as $X^{(1)}$ and the feature set of the second class of samples as $X^{(2)}$;
wherein $X_i^{(1)}$ represents the feature set of the i-th electricity utilization characteristic index in the first class of samples, $X_i^{(2)}$ represents the feature set of the i-th electricity utilization characteristic index in the second class of samples, $x_{i,N_1}^{(1)}$ represents the i-th electricity utilization characteristic index of the $N_1$-th sample in the first class of samples, $x_{i,N_2}^{(2)}$ represents the i-th electricity utilization characteristic index of the $N_2$-th sample in the second class of samples, $N_1$ is the number of samples of the first class, and $N_2$ is the number of samples of the second class;
performing a nonlinear transformation on the samples using a Gaussian kernel function to map them into a high-dimensional space, the mapping equation being as follows:
wherein $K(x, x_c)$ is the Gaussian kernel function of $x$ and $x_c$, where $x_c$ represents the center point of the kernel function and $x$ is any point in the space; $\sigma$ is the bandwidth, which controls the range of action of the Gaussian kernel function;
obtaining the feature sets $\Phi^{(1)}$ and $\Phi^{(2)}$ of the two classes of samples after mapping to the high-dimensional space, specifically:
wherein $\Phi_j^{(1)}$ represents the feature set of the j-th electricity utilization characteristic index after the first class of samples is mapped to the high-dimensional space, and $\Phi_j^{(2)}$ represents the feature set of the j-th electricity utilization characteristic index after the second class of samples is mapped to the high-dimensional space; $\phi_{i,N_1}^{(1)}$ represents the i-th electricity utilization characteristic index of the $N_1$-th sample in the first class of samples after mapping to the high-dimensional space, and $\phi_{i,N_2}^{(2)}$ represents the i-th electricity utilization characteristic index of the $N_2$-th sample in the second class of samples after mapping to the high-dimensional space;
and then obtaining an intra-class dispersion matrix of each class of sample feature set by using a Fisher linear discriminant method, wherein the intra-class dispersion matrix is as follows:
wherein :
acquiring a total intra-class dispersion matrix and a total inter-class dispersion matrix of a sample:
constructing a characteristic scoring formula according to a total intra-class dispersion matrix and a total inter-class dispersion matrix of the sample:
wherein $F_i$ is the feature score of the i-th electricity utilization characteristic index; the larger its value, the greater the inter-class dispersion of samples of different types, that is, the greater the discriminative power of the feature;
calculating the redundancy among the electricity utilization characteristic indexes by the maximum information coefficient (MIC) method, specifically:
given i and j, dividing the scatter plot formed by any two variables U and V into a grid of i columns and j rows and finding the maximum mutual information value; then normalizing the maximum mutual information value; and finally selecting the maximum normalized mutual information value over the different grid scales as the MIC value, the calculation formula being as follows:
wherein U and V are any two variables, and B is a preset reference factor; $I(U;V)$ is the mutual information value of U and V, the calculation formula being as follows:
calculating a final score based on the feature score $F_i$ and the MIC value:
wherein $J_i$ is the final score of the i-th electricity utilization characteristic index, N is the number of features, and $f_i$ is the i-th electricity utilization characteristic index; and selecting, based on the calculated final scores, an optimal feature group capable of reflecting the operation state of the gateway metering device.
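The feature screening performed by the feature screening module can be sketched in code under stated simplifications. The sketch works on raw feature values and omits the Gaussian-kernel mapping; `approx_mic` uses equal-width grids, whereas a full MIC computation also optimises the grid boundaries; and the combination in `final_scores` (feature score minus mean MIC with the other features) is an assumed relevance-minus-redundancy form, because the claim's own final-score formula is not preserved in this text. All function names and the sample values are hypothetical.

```python
import math


def fisher_score(class1, class2):
    """Between-class scatter divided by within-class scatter for one feature."""
    m1 = sum(class1) / len(class1)
    m2 = sum(class2) / len(class2)
    s_w = sum((v - m1) ** 2 for v in class1) + sum((v - m2) ** 2 for v in class2)
    s_b = (m1 - m2) ** 2
    return s_b / s_w if s_w > 0 else float("inf")


def grid_mutual_information(u, v, cols, rows):
    """Mutual information (bits) of u and v binned on an equal-width grid."""
    def bin_index(x, lo, hi, k):
        if hi == lo:
            return 0
        return min(int((x - lo) / (hi - lo) * k), k - 1)

    n = len(u)
    lo_u, hi_u, lo_v, hi_v = min(u), max(u), min(v), max(v)
    joint = {}
    for a, b in zip(u, v):
        key = (bin_index(a, lo_u, hi_u, cols), bin_index(b, lo_v, hi_v, rows))
        joint[key] = joint.get(key, 0) + 1
    count_u, count_v = {}, {}
    for (i, j), c in joint.items():
        count_u[i] = count_u.get(i, 0) + c
        count_v[j] = count_v.get(j, 0) + c
    # p(u,v) * log2(p(u,v) / (p(u) p(v))) expressed with raw counts
    return sum((c / n) * math.log2(c * n / (count_u[i] * count_v[j]))
               for (i, j), c in joint.items())


def approx_mic(u, v, b=None):
    """Largest normalised grid MI over grids with cols * rows < b."""
    if b is None:
        b = max(len(u) ** 0.6, 5)  # assumed default; always allows a 2 x 2 grid
    best = 0.0
    cols = 2
    while cols * 2 < b:
        rows = 2
        while cols * rows < b:
            mi = grid_mutual_information(u, v, cols, rows)
            best = max(best, mi / math.log2(min(cols, rows)))
            rows += 1
        cols += 1
    return best


def final_scores(class1_features, class2_features, b=None):
    """Assumed combination: Fisher score minus mean MIC with the other features."""
    pooled = [c1 + c2 for c1, c2 in zip(class1_features, class2_features)]
    n = len(pooled)
    scores = []
    for i in range(n):
        relevance = fisher_score(class1_features[i], class2_features[i])
        redundancy = sum(approx_mic(pooled[i], pooled[j], b)
                         for j in range(n) if j != i)
        scores.append(relevance - redundancy / max(n - 1, 1))
    return scores


# Two made-up features for two fault classes: the first separates the
# classes well, the second has identical class means.
class1 = [[1.0, 2.0, 3.0, 4.0], [5.0, 1.0, 4.0, 2.0]]
class2 = [[11.0, 12.0, 13.0, 14.0], [2.0, 4.0, 1.0, 5.0]]
scores = final_scores(class1, class2)
```

Features would then be ranked by `scores` and the top-ranked ones kept as the feature group passed to the two neural networks.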
5. The system for identifying the operation state of a gateway metering device according to claim 4, wherein the sample balancing module performs sample proliferation on the real samples of fault types with few samples specifically as follows:
generating samples with a variational autoencoder (VAE);
inputting the real samples into the encoder network of the variational autoencoder, which compresses the real samples into low-dimensional vectors;
and then inputting the low-dimensional vectors into the decoder network of the variational autoencoder to reconstruct samples, thereby obtaining generated samples.
6. The gateway metering apparatus operating state identification system of claim 4, wherein:
the first neural network adopts a multi-scale convolutional neural network, in which batch normalization and Dropout are added after the pooling layer for optimization;
the second neural network adopts a deep residual shrinkage network.
7. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of identifying the operational status of the gateway metering apparatus of any of claims 1 to 3 when the program is executed by the processor.
8. A computer-readable storage medium having stored thereon a computer program, which when executed by a processor implements a method of identifying the operational status of a gateway metering apparatus as claimed in any one of claims 1 to 3.
CN202311089567.6A 2023-08-28 2023-08-28 Method, system, equipment and medium for identifying operation state of gateway metering device Pending CN116776209A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311089567.6A CN116776209A (en) 2023-08-28 2023-08-28 Method, system, equipment and medium for identifying operation state of gateway metering device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311089567.6A CN116776209A (en) 2023-08-28 2023-08-28 Method, system, equipment and medium for identifying operation state of gateway metering device

Publications (1)

Publication Number Publication Date
CN116776209A (en) 2023-09-19

Family

ID=87991696

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311089567.6A Pending CN116776209A (en) 2023-08-28 2023-08-28 Method, system, equipment and medium for identifying operation state of gateway metering device

Country Status (1)

Country Link
CN (1) CN116776209A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103824093A (en) * 2014-03-19 2014-05-28 北京航空航天大学 SAR (Synthetic Aperture Radar) image target characteristic extraction and identification method based on KFDA (Kernel Fisher Discriminant Analysis) and SVM (Support Vector Machine)
CN105162413A (en) * 2015-09-08 2015-12-16 河海大学常州校区 Method for evaluating performances of photovoltaic system in real time based on working condition identification
CN113657556A (en) * 2021-09-23 2021-11-16 华北电力大学 Gas turbine inlet guide vane system fault diagnosis method based on multivariate statistical analysis
CN114064900A (en) * 2021-11-24 2022-02-18 广东电网有限责任公司 Power distribution automation terminal fault diagnosis method, device, equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
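Gao Wei et al.: "Method for identifying electric shock faults of living organisms based on optimized multi-feature selection under imbalanced small samples" (不均衡小样本下多特征优化选择的生命体触电故障识别方法), Transactions of China Electrotechnical Society (《电工技术学报》), pages 1 - 13 *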

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117176550A (en) * 2023-09-25 2023-12-05 云念软件(广东)有限公司 Integrated operation maintenance method and system based on fault identification
CN117176550B (en) * 2023-09-25 2024-03-19 云念软件(广东)有限公司 Integrated operation maintenance method and system based on fault identification

Similar Documents

Publication Publication Date Title
WO2023123941A1 (en) Data anomaly detection method and apparatus
CN108053148B (en) Efficient fault diagnosis method for power information system
CN116150897A (en) Machine tool spindle performance evaluation method and system based on digital twin
CN111695731A (en) Load prediction method, system and equipment based on multi-source data and hybrid neural network
CN116776209A (en) Method, system, equipment and medium for identifying operation state of gateway metering device
CN112418476A (en) Ultra-short-term power load prediction method
CN115841278B (en) Method, system, equipment and medium for evaluating running error state of electric energy metering device
CN117473048B (en) Financial abnormal data monitoring and analyzing system and method based on data mining
CN115587543A (en) Federal learning and LSTM-based tool residual life prediction method and system
CN113409166A (en) XGboost model-based method and device for detecting abnormal electricity consumption behavior of user
Dong Combining unsupervised and supervised learning for asset class failure prediction in power systems
CN111460001A (en) Theoretical line loss rate evaluation method and system for power distribution network
CN114266289A (en) Complex equipment health state assessment method
CN115640969A (en) Power grid operation and maintenance cost distribution method based on equipment state and operation age
Zhang et al. Load Prediction Based on Hybrid Model of VMD‐mRMR‐BPNN‐LSSVM
CN113935413A (en) Distribution network wave recording file waveform identification method based on convolutional neural network
CN117131022B (en) Heterogeneous data migration method of electric power information system
CN114021758A (en) Operation and maintenance personnel intelligent recommendation method and device based on fusion of gradient lifting decision tree and logistic regression
CN117674119A (en) Power grid operation risk assessment method, device, computer equipment and storage medium
CN113033898A (en) Electrical load prediction method and system based on K-means clustering and BI-LSTM neural network
CN112232570A (en) Forward active total electric quantity prediction method and device and readable storage medium
CN111738483A (en) Power grid loss reduction optimization method and system based on clustering and deep belief network
CN112348220A (en) Credit risk assessment prediction method and system based on enterprise behavior pattern
Pisica et al. Feature selection filter for classification of power system operating states
CN115864644A (en) Relay protection device state evaluation method, system, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20230919