CN116702078B - State detection method based on modular expandable cabinet power distribution unit

Info

Publication number
CN116702078B
Authority
CN
China
Prior art keywords
model
data
training
data set
supervision
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310649869.8A
Other languages
Chinese (zh)
Other versions
CN116702078A (en
Inventor
贾继伟
邵国栋
何炬亮
姚伟军
陈善民
郑汉杰
李兴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Post & Telecommunication Engineering Construction Co ltd
China Telecom Corp Ltd Zhejiang Branch
Original Assignee
Zhejiang Post & Telecommunication Engineering Construction Co ltd
China Telecom Corp Ltd Zhejiang Branch
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Post & Telecommunication Engineering Construction Co ltd, China Telecom Corp Ltd Zhejiang Branch filed Critical Zhejiang Post & Telecommunication Engineering Construction Co ltd
Priority to CN202310649869.8A priority Critical patent/CN116702078B/en
Publication of CN116702078A publication Critical patent/CN116702078A/en
Application granted granted Critical
Publication of CN116702078B publication Critical patent/CN116702078B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H02 GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J13/00 Circuit arrangements for providing remote indication of network conditions, e.g. an instantaneous record of the open or closed condition of each circuit breaker in the network; Circuit arrangements for providing remote control of switching means in a power distribution network, e.g. switching in and out of current consumers by using a pulse code signal carried by the network
    • H02J13/00002 Circuit arrangements for providing remote indication of network conditions, e.g. an instantaneous record of the open or closed condition of each circuit breaker in the network; Circuit arrangements for providing remote control of switching means in a power distribution network, e.g. switching in and out of current consumers by using a pulse code signal carried by the network characterised by monitoring
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01R MEASURING ELECTRIC VARIABLES; MEASURING MAGNETIC VARIABLES
    • G01R31/00 Arrangements for testing electric properties; Arrangements for locating electric faults; Arrangements for electrical testing characterised by what is being tested not provided for elsewhere
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2155 Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217 Validation; Performance evaluation; Active pattern learning techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2433 Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Power Engineering (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A state detection method based on a modular expandable cabinet power distribution unit, belonging to the technical field of power supply for electrical communication technology. An original unsupervised model U_model first makes predictions on the unlabeled second training data set B_t to generate virtual labels in advance; the second training data set B_t with these virtual labels then serves as the pre-training data set of the subsequent active-learning supervised model S_model; finally, the supervised model S_model is responsible for the subsequent active-learning training and prediction. The scheme adopts active-learning state detection and collects sample labels through expert feedback, thereby improving the accuracy of state detection.

Description

State detection method based on modular expandable cabinet power distribution unit
Technical Field
The invention belongs to the technical field of power supply for electrical communication technology, and in particular relates to a state detection method based on a modular expandable cabinet power distribution unit.
Background
As shown in FIG. 1, the modular expandable cabinet power distribution unit consists of a base, a bus, a main incoming line unit, movable modules, blank panels and the like. Each movable module can be independently plugged into and unplugged from the bottom box, modules can be added or removed as required, and the absence of any module does not affect the circuit connectivity of the other extension modules. For example, Chinese patent publication No. CN112333970A discloses an expandable cabinet power distribution unit.
As another example, in the "power supply network abnormality detection processing system and detection processing method thereof" disclosed in Chinese patent publication No. CN111585846A, the processor module obtains the difference at each sampling point and calculates an abnormal component; when the abnormal component reaches an alarm trigger point, the processor module sends out alarm information.
Therefore, arranging a sensor module and a communication module in the movable module of the power distribution unit and uploading the power data collected by the sensor to the cloud is prior art in the field.
In this scheme, a sensor module and a communication module are arranged in the modular expandable cabinet power distribution unit, and the power information is uploaded to the cloud so that the power utilization state of the communication equipment can be detected.
Aging or improper use of communication equipment can easily cause fire. It is therefore necessary to collect the power information of the communication equipment, detect its power utilization state, give early warning of abnormal states, and remind the user to maintain or replace the related equipment, so as to avoid the damage caused by communication equipment faults.
However, the "power supply network abnormality detection processing system and detection processing method thereof" disclosed in Chinese patent publication No. CN111585846A uses a statistical method with a manually chosen threshold to decide whether the deviation of the abnormal component is abnormal. On the one hand, such a threshold carries no real physical meaning about the cause of the abnormality; on the other hand, there is no scientific basis for setting it: it is difficult to find an exact standard for how large a deviation counts as abnormal.
For detecting the power utilization state of communication equipment, unsupervised learning can also be used to classify the data and then find the abnormal records. For example, Chinese patent publication No. CN109740694A discloses "a method for detecting non-technical loss of smart grid based on unsupervised learning", which collects the loads of power users and classifies them using k-means clustering. It regards abnormal electricity consumption pattern detection as essentially a binary classification problem, i.e. all users are classified into two categories: normal users and abnormal users. However, the original data contains both normal and abnormal data, and that method ignores the abnormal data, which accounts for a relatively small share: when training the model, all data are treated as normal data, so the trained result is not correlated with the initial abnormal data.
Disclosure of Invention
In view of the above state of the art and its shortcomings, an object of the present invention is to provide a state detection method based on a modular expandable cabinet power distribution unit.
The state detection method based on the modular expandable cabinet power distribution unit comprises the following steps: step S1, data selection and cleaning: the sensor module collects power data of the communication equipment at a time acquisition interval of 3 minutes, and a time-series data set is recorded and stored; each power record includes the time point of occurrence and the actual power W;
step S2, converting data: converting the time-series data set into a space vector sequence data set; the space vector v_t is the set of actual powers W recorded over N consecutive time acquisition intervals; N is the width of the space vector;
step S3, pre-training flow:
step S301, training the unsupervised model U_model:
half of the data in the data set is used as the unlabeled first training data set A_t for training the unsupervised model U_model, and the remaining data is used as the second training data set B_t for training the supervised model S_model; in the second training data set B_t, the normal sample set is N_maj and the abnormal sample set is O_min; H sample sets S_maj are randomly selected from the normal sample set N_maj, and each sample set S_maj is mixed with the abnormal sample set O_min to form H mixed sample sets, in each of which the ratio of normal samples to abnormal samples is 1:1;
for the unlabeled first training data set A_t, the unsupervised model U_model is trained using a one-class classification method;
step S302, selecting a mixed sample set and feeding it into the unsupervised model U_model to predict virtual labels, recording the virtual labels in the virtual label set PL, and adding the mixed sample set with its assigned virtual labels to the labeled data set X_t; then pre-training the supervised model S_model with the labeled data set X_t, which completes the initialization of the supervised model S_model;
step S4, active learning feedback flow: after the supervised model S_model is initialized, selecting the next mixed sample set and feeding it into the supervised model S_model for classification prediction;
step S5, returning to step S4, selecting the next mixed sample set and feeding it into the supervised model S_model for prediction, until all mixed sample sets have been traversed.
Further, step S2 further includes the following step: feature conversion: performing feature conversion on each space vector v_t in the space vector sequence, the converted features comprising:
average power V_tm = (v_t1 + v_t2 + ... + v_tN) / N;
maximum difference S_maxdiff = max(v_ti) - min(v_ti);
first quartile Q1: the value at the 25% position when the v_ti are sorted in ascending order;
second quartile Q2: the median of the v_ti sorted in ascending order;
third quartile Q3: the value at the 75% position when the v_ti are sorted in ascending order;
interquartile range IQR = Q3 - Q1;
standard deviation SD;
maximum data change ratio S_maxratio = max(v_ti) / V_tm;
minimum data change ratio S_minratio = min(v_ti) / V_tm;
data dispersion value S_dev = S_maxdiff / V_tm;
the space vector v_t is replaced with these 10-dimensional features, thereby forming a new space vector sequence.
Further, step S2 further includes the following step:
feature extraction: performing feature extraction on the 10-dimensional features in the space vector sequence using principal component analysis, replacing the 10-dimensional features with the extracted new features to obtain a new space vector sequence, and taking this space vector sequence as the new data set.
Further, step S302 further includes the following step:
calculating the uncertainty of each record under the unsupervised model U_model using an uncertainty sampling method, finding the m_1 most uncertain records, transmitting these m_1 records to an expert for label updating, and updating the virtual labels in the virtual label set PL according to the updated labels.
Further, step S4 further includes: calculating the uncertainty of each record under the supervised model S_model using an uncertainty sampling method and finding the m_2 most uncertain records; transmitting these m_2 records to an expert for labeling to obtain newly added labels; adding the m_2 newly labeled records to the labeled data set X_t; and then retraining the supervised model S_model with the labeled data set X_t.
In this scheme, the original unsupervised model U_model first makes predictions on the unlabeled second training data set B_t to generate virtual labels in advance; the second training data set B_t with these virtual labels then serves as the pre-training data set of the subsequent active-learning supervised model S_model; finally, the supervised model S_model is responsible for the subsequent active-learning training and prediction. It has the following advantages:
The scheme adopts active-learning state detection and collects sample labels through expert feedback, thereby improving the accuracy of state detection. When an abnormal state of the communication equipment is detected, the abnormal state information is sent to the user for reference so that the relevant maintenance or replacement measures can be taken, avoiding the danger caused by communication equipment faults.
The scheme uses the virtual labels of semi-supervised learning to solve the problem of assigning initial labels to the data, so that the semi-supervised learning scheme can be organically combined with the supervised model of active learning, improving the accuracy of semi-supervised learning.
If only the supervised model of active learning were used, the scheme would not work: active learning requires that the two classes of data, namely normal data and outlying abnormal data, be distinguished from the start, whereas this scheme starts with only a mixed data set of a large amount of normal data and a small amount of abnormal data. Active learning alone is therefore not suitable for this scheme.
If the accuracy benefits of active learning are to be exploited, the supervised model must be trained using real labels. In this scheme, the unsupervised model U_model first makes predictions on the unlabeled second training data set B_t to obtain initial virtual labels for training the supervised model S_model; the scheme can then move smoothly into active learning, querying the uncertain records and retraining, which further improves the accuracy of the supervised model.
Drawings
FIG. 1 is a block diagram of a distribution unit of the present invention;
FIG. 2 is a flow chart of the present invention;
FIG. 3 is a schematic diagram of a data set converted into a space vector sequence;
FIG. 4 is a schematic diagram of random forest sampling;
FIG. 5 is a diagram of anomaly data;
FIG. 6 is a graph of predicted results versus query times.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
The power consumption data of the communication equipment is unlabeled; it contains a large amount of normal data and almost no corresponding abnormal data, so traditional state detection methods cannot work properly. The scheme solves the data labeling problem by using the concept of virtual labels from semi-supervised learning.
Because the number of labeled samples is too small, the model initially obtains a less accurate classification boundary with low prediction accuracy. The scheme selects the records in the test data that have the most influence on accuracy, feeds them back to a user or an expert to be given correct labels, and adds them to the original training data set to retrain the model. In this way, the classification boundary of the model becomes more and more accurate.
As shown in FIG. 2, the state detection method based on the modular expandable cabinet power distribution unit comprises the following steps:
Step S1, data selection and cleaning.
The sensor module collects power data of the communication equipment, which is recorded and stored in a database at intervals of 3 minutes.
In this scheme, a sensor module and a communication module are arranged in the modular expandable cabinet power distribution unit, and the power information is uploaded to the cloud so that the power utilization state of the communication equipment can be detected. The structure is the same as that of the power supply network abnormality detection processing system disclosed in Chinese patent publication No. CN111585846A.
The database records the power information as multiple rows, one per time point, each including the time point of occurrence, the voltage V, the current A, the power factor PF, the actual power W and the apparent power VA.
The apparent power is the product of current and voltage. The power factor is the ratio of the actual power to the apparent power and lies between 0 and 1; the higher the power factor, the higher the power consumption efficiency of the communication equipment. Because the voltage and the power factor vary very little and can almost be regarded as constants, the scheme pairs the actual power W with the time point of occurrence to form the time-series data set.
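The relationship between the recorded quantities can be illustrated with a small sketch. The function names and the sample readings below are hypothetical and chosen only for illustration; treating the power factor as 0 when the apparent power is zero is also an assumption, not something the patent specifies.

```python
def apparent_power(voltage_v: float, current_a: float) -> float:
    """Apparent power VA as the product of voltage and current."""
    return voltage_v * current_a


def power_factor(actual_power_w: float, apparent_power_va: float) -> float:
    """Power factor PF = W / VA.

    Returning 0.0 for a zero apparent power is an assumption made here
    to avoid a division by zero.
    """
    if apparent_power_va == 0:
        return 0.0
    return actual_power_w / apparent_power_va


# Hypothetical reading: 220 V, 0.25 A, 52 W of actual power.
va = apparent_power(220.0, 0.25)
pf = power_factor(52.0, va)
```

With these illustrative values the power factor stays within the 0 to 1 range described above.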
In this scheme, 36 sensor modules are used over the period from January to November 2022, producing more than 500,000 records that form the time-series data set.
Step S2, data conversion, feature conversion and feature extraction.
Data conversion: abnormal power consumption of communication equipment cannot be determined from a single data point. In this scheme, the object of study for the power utilization state is therefore a set of data points over a period of time.
As shown in FIG. 3, the time-series data set is converted into a space vector sequence, formulated as follows:
S_n = {v_1, v_2, v_3, ..., v_t, ..., v_n};
where S_n denotes a space vector sequence covering n consecutive time acquisition intervals; t denotes the ordinal number of the time acquisition interval, with 1 ≤ t ≤ n; and v_t denotes the space vector recording N actual powers W starting from the t-th time acquisition interval, i.e. v_t = {v_t1, v_t2, ..., v_ti, ..., v_tN}, where v_t1 is the first actual power W, recorded at the t-th time acquisition interval, and v_ti is the i-th actual power W in v_t. Preferably, 3 ≤ N ≤ 10.
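A minimal sketch of this conversion follows. It assumes that consecutive space vectors overlap and advance by one time acquisition interval, which is one reading of "starting from the t-th time acquisition interval"; the function name `to_space_vectors` and the sample readings are hypothetical.

```python
from typing import List, Sequence


def to_space_vectors(power_w: Sequence[float], width: int) -> List[List[float]]:
    """Convert a power time series into overlapping windows of length `width`.

    Window t holds the `width` readings that start at position t, so a
    series of length L yields L - width + 1 windows (none if L < width).
    """
    if width < 1:
        raise ValueError("width must be positive")
    return [list(power_w[t:t + width]) for t in range(len(power_w) - width + 1)]


# Hypothetical actual-power readings (W), one per 3-minute interval.
series = [52.0, 51.5, 52.3, 65.0, 64.2, 60.1]
vectors = to_space_vectors(series, width=3)
```

Each element of `vectors` plays the role of one v_t with N = 3.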
The scheme predicts state abnormalities through state detection. An abnormality is a situation in which the behavior pattern of a small number of individuals in a population does not conform to the defined normal condition, or to the normal behavior expected of the large majority of individuals. Abnormal power consumption of communication equipment cannot be determined from a single data point. The scheme therefore adopts the notion of a collective anomaly: an anomaly is formed by a set consisting of several record points, and the record points in the same set depend on one another. For example, Chinese patent publication No. CN109740694A, "a method for detecting non-technical loss of smart grid based on unsupervised learning", uses a time series and analyzes it continuously with a moving average method to obtain a trend index.
Feature conversion: if principal component analysis (PCA) were applied directly to the space vector sequence for feature extraction, the fluctuations of the actual power W would be too fine-grained, and these fine abnormal power fluctuations could not be captured. It is therefore necessary to first perform feature conversion on each space vector v_t in the space vector sequence to improve recognition accuracy.
The converted features include:
average power V_tm = (v_t1 + v_t2 + ... + v_tN) / N;
maximum difference S_maxdiff = max(v_ti) - min(v_ti);
first quartile Q1: the value at the 25% position when the v_ti are sorted in ascending order;
second quartile Q2: the median of the v_ti sorted in ascending order;
third quartile Q3: the value at the 75% position when the v_ti are sorted in ascending order;
interquartile range IQR = Q3 - Q1;
standard deviation SD;
maximum data change ratio S_maxratio = max(v_ti) / V_tm;
minimum data change ratio S_minratio = min(v_ti) / V_tm;
data dispersion value S_dev = S_maxdiff / V_tm.
The space vector v_t is replaced with these 10-dimensional features, thereby forming a new space vector sequence. The 10-dimensional features cover most of the power characteristics of the communication equipment. Because quantile-based features are less susceptible to outliers than the average power, the first quartile Q1, the second quartile Q2, the third quartile Q3 and the interquartile range IQR are used to enhance the robustness of the system's predictions.
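The ten features can be computed per space vector roughly as in the sketch below. Several details are assumptions, since the text does not fix them: the quartiles use the "inclusive" interpolation of Python's `statistics.quantiles`, the standard deviation is the population form, and the average power V_tm is assumed to be non-zero. The function name `window_features` is hypothetical.

```python
import statistics
from typing import Dict, Sequence


def window_features(v_t: Sequence[float]) -> Dict[str, float]:
    """Compute the 10 converted features for one space vector v_t."""
    v_tm = statistics.fmean(v_t)                      # average power
    # Quartile interpolation method is an assumption (not given in the text).
    q1, q2, q3 = statistics.quantiles(v_t, n=4, method="inclusive")
    s_maxdiff = max(v_t) - min(v_t)                   # maximum difference
    return {
        "V_tm": v_tm,
        "S_maxdiff": s_maxdiff,
        "Q1": q1,
        "Q2": q2,
        "Q3": q3,
        "IQR": q3 - q1,
        "SD": statistics.pstdev(v_t),                 # population SD (assumption)
        "S_maxratio": max(v_t) / v_tm,                # assumes v_tm != 0
        "S_minratio": min(v_t) / v_tm,
        "S_dev": s_maxdiff / v_tm,
    }


# Hypothetical window of N = 5 actual-power readings (W).
features = window_features([50.0, 52.0, 54.0, 56.0, 58.0])
```

The resulting dictionary has one entry per feature, so each space vector of width N becomes a 10-dimensional record regardless of N.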
Feature extraction: principal component analysis (PCA) is applied to the 10-dimensional features in the space vector sequence; the extracted new features replace the 10-dimensional features to give a new space vector sequence, which is used as the new data set.
Principal component analysis (PCA) is a statistical method that explores the degree of correlation among multiple possibly correlated variables and finds the directions of maximum or minimum correlation, thereby achieving data compression or denoising. The steps are as follows, with the number of variables denoted n'' and the number of samples of each variable denoted m'':
1) Calculate the mean of each variable, then subtract the mean from each sample value (mean removal);
2) Compute the covariance matrix, noting that the divisor is m'' - 1 (to obtain an unbiased estimate);
3) Compute the eigenvalues and eigenvectors of the covariance matrix.
PCA is a conventional technique and is not described in further detail.
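The three steps can be sketched in plain Python as follows. Steps 1 and 2 follow the text directly; for step 3 this sketch substitutes power iteration, which only recovers the leading eigenvalue and eigenvector rather than the full decomposition a numerical library would return. The function names and the toy data are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def center(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Step 1: subtract each variable's mean from its samples."""
    n_vars = len(rows[0])
    means = [sum(r[j] for r in rows) / len(rows) for j in range(n_vars)]
    return [[r[j] - means[j] for j in range(n_vars)] for r in rows]


def covariance(centered: Sequence[Sequence[float]]) -> List[List[float]]:
    """Step 2: covariance matrix with divisor m - 1 (unbiased estimate)."""
    m, n_vars = len(centered), len(centered[0])
    return [[sum(r[a] * r[b] for r in centered) / (m - 1)
             for b in range(n_vars)] for a in range(n_vars)]


def leading_component(cov: Sequence[Sequence[float]],
                      iterations: int = 200) -> Tuple[float, List[float]]:
    """Step 3 (simplified): leading eigenpair by power iteration.

    Assumes cov is not the zero matrix and that the all-ones start vector
    is not orthogonal to the leading eigenvector.
    """
    vec = [1.0] * len(cov)
    for _ in range(iterations):
        nxt = [sum(row[j] * vec[j] for j in range(len(vec))) for row in cov]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    # Rayleigh quotient of the unit vector gives the eigenvalue.
    eigenvalue = sum(vec[i] * sum(cov[i][j] * vec[j] for j in range(len(vec)))
                     for i in range(len(vec)))
    return eigenvalue, vec


# Toy data: two perfectly correlated variables (second = 2 * first).
rows = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
centered_rows = center(rows)
eigenvalue, direction = leading_component(covariance(centered_rows))
# Projection of each centered sample onto the leading component.
scores = [sum(x * d for x, d in zip(r, direction)) for r in centered_rows]
```

Projecting the centered samples onto the leading eigenvector, as in `scores`, gives the compressed feature that would replace the original variables.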
Step S3, pre-training flow.
Step S301, training the unsupervised model U_model.
Half of the data in the data set is used as the unlabeled first training data set A_t for training the unsupervised model U_model, and the remaining data is used as the second training data set B_t for training the supervised model S_model.
Because samples with abnormal states are few and hard to obtain, the imbalance between data classes would bias the trained predictions heavily toward the majority class and reduce the prediction accuracy for the minority class. The second training data set B_t therefore needs to address the data class imbalance problem.
In the second training data set B_t, let the normal sample set be N_maj with size |N_maj| and the abnormal sample set be O_min with size |O_min|, where |N_maj| >> |O_min|. H sample sets S_maj, each of size |S_maj| = |O_min|, are randomly selected from the normal sample set N_maj, and each sample set S_maj is mixed with the abnormal sample set O_min to form H mixed sample sets, in each of which the ratio of normal samples to abnormal samples is 1:1. The mixed sample sets are then used for prediction with the supervised model S_model.
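The construction of the H balanced mixed sample sets can be sketched as below. Whether the H normal subsets may overlap is not stated in the text; this sketch draws each subset independently, so overlap is possible. The function name `build_mixed_sets`, the fixed seed and the integer sample identifiers are hypothetical.

```python
import random
from typing import List, Sequence


def build_mixed_sets(normal: Sequence[int], abnormal: Sequence[int],
                     h: int, seed: int = 0) -> List[List[int]]:
    """Build h mixed sets, each with |O_min| normal and |O_min| abnormal samples."""
    rng = random.Random(seed)
    mixed_sets = []
    for _ in range(h):
        # Randomly undersample the majority class to the minority size.
        s_maj = rng.sample(list(normal), k=len(abnormal))
        mixed = s_maj + list(abnormal)
        rng.shuffle(mixed)          # avoid ordering by class
        mixed_sets.append(mixed)
    return mixed_sets


# Hypothetical sample identifiers: 100 normal, 5 abnormal (ids >= 1000).
normal_ids = list(range(100))
abnormal_ids = [1000, 1001, 1002, 1003, 1004]
mixed_sets = build_mixed_sets(normal_ids, abnormal_ids, h=4)
```

Every mixed set reuses the full abnormal set, while the normal samples differ from set to set.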
For the unlabeled first training data set A_t, the unsupervised model U_model is trained using a one-class classification method.
The power consumption data of the communication equipment is initially unlabeled, but under the normal distribution assumption it contains a large amount of normal data and a small amount of abnormal data. It is therefore well suited to a semi-supervised state detection method, i.e. it can be treated as a one-class classification problem, and the unsupervised model is accordingly trained with a one-class classification method. One-class classification does not take the actual labels of the training data into account; instead, it treats all training data as the same positive class when training the classification model.
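The text does not name a specific one-class algorithm. As a stand-in only, the sketch below uses a very simple centroid-and-distance model: all training records are treated as one class, and a record is flagged as abnormal when it lies farther from the training centroid than a chosen quantile of the training distances. The class name `CentroidOneClass`, the 0.95 quantile and the data are all hypothetical.

```python
import math
from typing import List, Sequence


class CentroidOneClass:
    """Toy one-class model: distance to the training centroid vs. a threshold."""

    def __init__(self, quantile: float = 0.95) -> None:
        self.quantile = quantile
        self.center: List[float] = []
        self.threshold = 0.0

    def fit(self, rows: Sequence[Sequence[float]]) -> "CentroidOneClass":
        n_vars = len(rows[0])
        # All rows are treated as the same (normal) class; labels are ignored.
        self.center = [sum(r[j] for r in rows) / len(rows) for j in range(n_vars)]
        dists = sorted(math.dist(r, self.center) for r in rows)
        idx = min(len(dists) - 1, int(self.quantile * len(dists)))
        self.threshold = dists[idx]
        return self

    def score(self, row: Sequence[float]) -> float:
        """Signed margin: positive means outside the learned boundary."""
        return math.dist(row, self.center) - self.threshold

    def predict(self, row: Sequence[float]) -> int:
        """Return 1 for abnormal, 0 for normal."""
        return 1 if self.score(row) > 0 else 0


# Hypothetical 2-dimensional feature records from the unlabeled set A_t.
train = [[52.0, 1.0], [51.0, 1.2], [53.0, 0.8], [52.5, 1.1], [51.5, 0.9]]
u_model = CentroidOneClass().fit(train)
```

The signed `score` is kept separate from `predict` because its magnitude can later serve as a rough confidence measure for the virtual labels.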
Step S302, a mixed sample set is selected and fed into the unsupervised model U_model to predict virtual labels, which are recorded in the virtual label set PL. The uncertainty of each record under the unsupervised model U_model is calculated using an uncertainty sampling method, the m_1 most uncertain records are found and transmitted to an expert for label updating, and the virtual labels in the virtual label set PL are updated according to the updated labels. Finally, the mixed sample set with its assigned virtual labels is added to the labeled data set X_t, and the labeled data set X_t is used to pre-train the supervised model S_model, which completes the initialization of the supervised model S_model.
A label is the annotation that defines whether a data point is normal or abnormal.
Because the accuracy of the virtual labels predicted by the unsupervised model U_model is limited, the scheme uses the mixed sample set with virtual labels as the initial training data for pre-training the supervised model S_model; the labels predicted by the supervised model S_model are then used to update the virtual labels, thereby improving their accuracy.
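The virtual-label step with expert correction can be sketched as follows. The sketch assumes that the unsupervised model exposes a signed score per record (positive meaning abnormal) and that a small absolute score means high uncertainty; both are assumptions, as is the `ask_expert` callback that stands in for the human expert.

```python
from typing import Callable, List, Sequence


def assign_virtual_labels(scores: Sequence[float]) -> List[int]:
    """Virtual label set PL: 1 (abnormal) if the score is positive, else 0."""
    return [1 if s > 0 else 0 for s in scores]


def most_uncertain(scores: Sequence[float], m: int) -> List[int]:
    """Indices of the m records whose scores lie closest to the boundary."""
    order = sorted(range(len(scores)), key=lambda i: abs(scores[i]))
    return order[:m]


def correct_with_expert(labels: Sequence[int], scores: Sequence[float],
                        m_1: int, ask_expert: Callable[[int], int]) -> List[int]:
    """Replace the virtual labels of the m_1 most uncertain records."""
    updated = list(labels)
    for i in most_uncertain(scores, m_1):
        updated[i] = ask_expert(i)
    return updated


# Hypothetical scores from U_model for one mixed sample set.
scores = [-2.0, 0.1, 3.5, -0.05, 1.2]
pl = assign_virtual_labels(scores)
# Hypothetical expert answers for the queried records.
expert_answers = {1: 0, 3: 1}
pl_updated = correct_with_expert(pl, scores, m_1=2,
                                 ask_expert=lambda i: expert_answers[i])
```

Only the two records nearest the boundary are sent to the expert; the remaining virtual labels are kept as predicted.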
Step S4, active learning feedback flow.
Step S401, retraining and prediction with the supervised model S_model.
After the supervised model S_model is initialized, the next mixed sample set is selected and fed into the supervised model S_model for classification prediction to obtain labels.
As shown in FIG. 4, the scheme applies the random forest supervised learning algorithm to the mixed samples: different decision tree models are trained, and the classification result is then predicted jointly by these decision tree models through voting.
Random forest is a supervised learning algorithm in which the mixed samples are used to train several different decision tree models, and all the decision tree models jointly predict the classification of a new sample by voting. The construction process is as follows:
1. obtain m' mixed samples as m' training sets;
2. train m' decision tree models, one on each of the m' training sets;
3. for a single decision tree model, assuming the number of training sample features is n', select the best feature for each split according to the information gain, the information gain ratio or the Gini index;
4. the generated decision trees form the random forest; for a classification problem, the final classification result is determined by the votes of the tree classifiers.
Because there are several different decision tree models, nearly all of the normal sample set can be distributed among the training samples of the different models, which avoids the problem of information loss.
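The voting idea can be sketched with a deliberately simplified ensemble. Instead of full decision trees split by information gain or the Gini index, each model here is a one-feature threshold stump placed midway between the class means of its own mixed set; the sketch also assumes abnormal readings are higher than normal ones. All names and values are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[float, int]  # (feature value, label); label 1 = abnormal


def train_stump(samples: Sequence[Sample]) -> Callable[[float], int]:
    """Threshold stump: midpoint between the normal and abnormal class means."""
    normal = [x for x, y in samples if y == 0]
    abnormal = [x for x, y in samples if y == 1]
    threshold = (sum(normal) / len(normal) + sum(abnormal) / len(abnormal)) / 2
    return lambda x: 1 if x > threshold else 0


def vote(models: Sequence[Callable[[float], int]], x: float) -> Tuple[int, float]:
    """Majority vote; also return the fraction of models voting abnormal."""
    votes = [m(x) for m in models]
    frac_abnormal = sum(votes) / len(votes)
    return (1 if frac_abnormal > 0.5 else 0), frac_abnormal


# Three hypothetical mixed sample sets sharing the same abnormal samples.
mixed_sets: List[List[Sample]] = [
    [(52.0, 0), (51.0, 0), (65.0, 1), (64.0, 1)],
    [(53.0, 0), (52.0, 0), (65.0, 1), (64.0, 1)],
    [(50.0, 0), (54.0, 0), (65.0, 1), (64.0, 1)],
]
models = [train_stump(s) for s in mixed_sets]
label, frac = vote(models, 58.2)
```

The vote fraction returned alongside the label can be read as a rough class probability, which is what an uncertainty-based query step needs.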
Step S402, querying uncertain data and expert interaction flow.
The uncertainty of each record under the supervised model S_model is calculated using an uncertainty sampling method, and the m_2 most uncertain records are found. These m_2 records are transmitted to an expert for labeling to obtain newly added labels, and the m_2 newly labeled records are added to the labeled data set X_t. The labeled data set X_t is then used to retrain the supervised model S_model.
Uncertainty sampling is used to identify unlabeled samples near the decision boundary of the current learning model. The samples the model is most uncertain about are likely to be data near the classification boundary; querying these hardest-to-classify samples yields more information about the class boundary, and they can be selected as the samples whose predicted labels have the least confidence. Uncertainty sampling is common knowledge; see: Uncertainty-based active learning algorithm study [D], Wang Zhen, Hebei University, 2011.
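A least-confidence selection for the two-class case can be sketched as follows. It assumes the supervised model provides a probability of the abnormal class for each record (for example, a vote fraction); the function names and probabilities are hypothetical.

```python
from typing import List, Sequence


def least_confidence(prob_abnormal: float) -> float:
    """Uncertainty = 1 - confidence of the predicted (more likely) class."""
    return 1.0 - max(prob_abnormal, 1.0 - prob_abnormal)


def select_queries(probs: Sequence[float], m_2: int) -> List[int]:
    """Indices of the m_2 records with the highest uncertainty."""
    ranked = sorted(range(len(probs)),
                    key=lambda i: least_confidence(probs[i]),
                    reverse=True)
    return ranked[:m_2]


# Hypothetical abnormal-class probabilities predicted by S_model.
probs = [0.02, 0.48, 0.91, 0.60, 0.35]
queries = select_queries(probs, m_2=2)
```

Records with probabilities near 0.5 are ranked first, which matches the idea of querying samples close to the classification boundary.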
Step S5, return to step S4: the next mixed sample set is selected and fed into the supervised model S_model for prediction, until all mixed sample sets have been traversed.
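The loop formed by steps S4 and S5 can be sketched end to end with a toy model. Here the supervised model is reduced to a single threshold on one feature, uncertainty is the distance to that threshold, and the expert is a hypothetical callback; none of these simplifications come from the text, which uses a random forest.

```python
from typing import Callable, List, Sequence, Tuple

Labeled = Tuple[float, int]  # (feature value, label); label 1 = abnormal


def fit_threshold(labeled: Sequence[Labeled]) -> float:
    """Toy S_model: midpoint between the normal and abnormal class means."""
    normal = [x for x, y in labeled if y == 0]
    abnormal = [x for x, y in labeled if y == 1]
    return (sum(normal) / len(normal) + sum(abnormal) / len(abnormal)) / 2


def active_learning(x_t: Sequence[Labeled], mixed_sets: Sequence[Sequence[float]],
                    m_2: int, expert: Callable[[float], int]) -> float:
    """Traverse the mixed sets, query the m_2 most uncertain records, retrain."""
    labeled: List[Labeled] = list(x_t)
    threshold = fit_threshold(labeled)            # initialized S_model
    for mixed in mixed_sets:                      # steps S4 and S5
        ranked = sorted(mixed, key=lambda x: abs(x - threshold))
        for x in ranked[:m_2]:                    # most uncertain records
            labeled.append((x, expert(x)))        # expert label added to X_t
        threshold = fit_threshold(labeled)        # retrain S_model
    return threshold


# Hypothetical setup: the expert's true boundary is at 56 W.
initial_x_t = [(50.0, 0), (70.0, 1)]
unlabeled_sets = [[52.0, 58.0, 64.0, 66.0], [51.0, 57.0, 63.0, 65.0]]
final_threshold = active_learning(initial_x_t, unlabeled_sets, m_2=1,
                                  expert=lambda x: 1 if x > 56.0 else 0)
```

In this toy run the threshold moves from 60 toward the expert's boundary after only two queries, illustrating how querying near the boundary refines the classifier.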
Experimental verification of the scheme:
The true positive rate (TPR) is the proportion of all abnormal data that is correctly identified as abnormal; the higher the TPR, the better. The false positive rate (FPR) is the proportion of all normal data that is erroneously judged to be abnormal; a high FPR means a high probability of a false alarm, so the lower the FPR, the better.
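The two indices can be computed from true and predicted labels as in the following sketch, where 1 denotes abnormal and 0 denotes normal. The function name `tpr_fpr` and the toy labels are illustrative only.

```python
def tpr_fpr(y_true, y_pred):
    """Return (TPR, FPR) for binary labels, with 1 = abnormal, 0 = normal."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    tpr = tp / (tp + fn) if tp + fn else 0.0  # abnormal correctly flagged
    fpr = fp / (fp + tn) if fp + tn else 0.0  # normal wrongly flagged
    return tpr, fpr

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
tpr, fpr = tpr_fpr(y_true, y_pred)
```

In this toy case three of four abnormal items are caught (TPR 0.75) and one of six normal items is falsely flagged (FPR about 0.167).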
First, the data. The experiment uses the electric power data of the cooling fan of one piece of communication equipment, more than 110,000 records in total, with k-fold cross-validation. According to the experiment plan, half of the data serves as the unlabeled first training data set A_t of the unsupervised model U_model; the other half serves as the second training data set B_t and is divided into k equal parts, each mixed with a certain proportion of abnormal data to obtain a mixed sample set. One part is used for predicting the virtual labels, and the remaining parts are used for the successive active learning query rounds. By cycling through the k parts, the k-fold cross-validation improves the accuracy of the model. The mixed sample sets must contain data covering the actual anomalies so that the classification model can correctly find the anomalies.
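A rough sketch of this data partition is given below. It assumes, for illustration only, that the first half of the records forms A_t, that any remainder after dividing B_t by k is dropped, and that a fixed number of abnormal records is mixed into each part; none of these details is stated in the text, and `split_dataset` is a hypothetical name.

```python
import random

def split_dataset(normal, abnormal, k, abnormal_per_part, seed=0):
    """Split normal data into A_t (first half) and k mixed parts built
    from B_t (second half), each mixed with labeled abnormal records."""
    rng = random.Random(seed)
    half = len(normal) // 2
    a_t = normal[:half]          # unlabeled set for the unsupervised model
    b_t = normal[half:]          # source of the k mixed sample sets
    size = len(b_t) // k         # remainder records are dropped (assumption)
    parts = []
    for i in range(k):
        part = [(x, 0) for x in b_t[i * size:(i + 1) * size]]
        part += [(x, 1) for x in rng.sample(abnormal, abnormal_per_part)]
        rng.shuffle(part)
        parts.append(part)
    return a_t, parts

normal = list(range(100))                  # stand-in for normal records
abnormal = [1000 + i for i in range(10)]   # stand-in for abnormal records
a_t, parts = split_dataset(normal, abnormal, k=5, abnormal_per_part=4)
```

The first part would then be used to predict virtual labels and the remaining parts would feed the active learning query rounds.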
FIG. 5 shows the abnormal data. In this experiment, an anomaly was produced by inserting a foreign object to increase friction and thereby reduce the rotation speed of the electric fan. The foreign object was inserted twice, at different points in time. As FIG. 5 shows, the power of the fan rises from its original average of 52 W to 65 W and then slowly falls over time after each insertion, yielding abnormal data that first rises and then falls.
Number of queried data in active learning (m_1 + m_2): 50 samples are queried. The m_1 queried items come from the unsupervised model U_model and are used to correct wrongly predicted virtual labels and so increase accuracy; the other half come from each new batch of test data.
FIG. 6 shows the relationship between the prediction result and the number of queries. At the beginning of the experiment, an unsupervised model based on a one-class classification algorithm is used; its predictions have a TPR of 0.71 and an FPR of 0.049. Because the active learning model is affected by the accuracy of the virtual labels, its TPR and FPR at the start lie close to these values. It can be seen that the higher the accuracy of the virtual labels predicted by the unsupervised model, the fewer queries the subsequent active learning model needs to reach convergence.
It will be understood that those skilled in the art may make equivalent substitutions or modifications in light of the technical solution and inventive concept of the present invention, and all such modifications and substitutions are intended to fall within the scope of the appended claims.

Claims (3)

1. The state detection method based on the modular expandable cabinet power distribution unit is characterized by comprising the following steps of:
step S1, data selection and cleaning: a sensor module collects the power data of the communication equipment at a time acquisition interval of 3 minutes, and a time-series data set is recorded and stored; each power record includes its time of occurrence and the actual power W;
step S2, data conversion: the time-series data set is converted into a space-vector-sequence data set; a space vector v_t is the set of actual powers W recorded over N consecutive time acquisition intervals; N is the width of the space vector;
step S3, a pre-training process:
step S301, training the unsupervised model U_model: half of the data in the data set is used as the unlabeled first training data set A_t for training the unsupervised model U_model, and the remaining data is used as the second training data set B_t for training the supervision model S_model; in the second training data set B_t, the normal sample set is N_maj and the abnormal sample set is O_min; H sample sets S_maj are randomly selected from the normal sample set N_maj, and each sample set S_maj is mixed with the abnormal sample set O_min to form H mixed samples, the ratio of normal samples to abnormal samples in each mixed sample being 1:1;
the unsupervised model U_model is trained on the unlabeled first training data set A_t using a one-class classification method;
step S302, a mixed sample set is selected and fed into the unsupervised model U_model to predict virtual labels, which are recorded in a virtual label set PL; the uncertainty sampling method is used to calculate the uncertainty that the unsupervised model U_model assigns to each data item, and the m_1 most uncertain items are found; these m_1 items are sent to an expert for label updating, and the virtual labels in the virtual label set PL are updated according to the updated labels; finally, the mixed sample set with its virtual labels is added to the labeled data set X_t; the supervision model S_model is then pre-trained with the labeled data set X_t, completing the initialization of the supervision model S_model;
step S4, active learning feedback flow:
step S401, retraining and prediction of the supervision model S_model: after the supervision model S_model is initialized, the next mixed sample set is selected and fed into the supervision model S_model for classification prediction to obtain labels;
step S402, query of uncertain data and expert interaction: the uncertainty sampling method is used to calculate the uncertainty that the supervision model S_model assigns to each data item, and the m_2 most uncertain items are found; these m_2 items are sent to an expert for labeling to obtain newly added labels; the m_2 newly labeled items are added to the labeled data set X_t; the supervision model S_model is then retrained with the labeled data set X_t;
step S5, returning to step S4, selecting the next mixed sample set and feeding it into the supervision model S_model for prediction, until all mixed sample sets have been traversed.
2. The method for detecting a status of a power distribution unit of a modular expandable rack according to claim 1, wherein step S2 further comprises the following step: feature conversion: each space vector v_t in the space vector sequence is converted into features, the converted features comprising: average power V_tm, the mean of the elements v_ti of v_t; maximum difference S_maxdiff = max(v_ti) - min(v_ti);
first quartile Q1, the value at the 25% position when the v_ti are sorted in ascending order;
second quartile Q2, the median of the v_ti sorted in ascending order;
third quartile Q3, the value at the 75% position when the v_ti are sorted in ascending order;
interquartile range IQR = Q3 - Q1;
standard deviation SD;
maximum data change ratio S_maxratio = max(v_ti) / V_tm;
minimum data change ratio S_minratio = min(v_ti) / V_tm;
data dispersion value S_dev = S_maxdiff / V_tm;
the above 10-dimensional features replace the space vector v_t, thereby forming a new space vector sequence.
3. The method for detecting a status of a power distribution unit of a modular expandable rack according to claim 2, wherein step S2 further comprises the following step:
feature extraction: principal component analysis is applied to the 10-dimensional features in the space vector sequence; the extracted new features replace the 10-dimensional features to give a new space vector sequence, which is taken as the new data set.
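The data conversion of step S2 and the ten features listed in claim 2 could be computed roughly as in the sketch below. Several details are assumptions rather than statements from the claims: the window slides with a step of one interval, the quartiles use the inclusive interpolation of Python's `statistics.quantiles`, and SD is the population standard deviation. The function names are hypothetical.

```python
import statistics

def to_space_vectors(powers, n):
    """Slide a window of width n over the power series.
    A step of 1 is an assumption; the claims do not state the step."""
    return [powers[i:i + n] for i in range(len(powers) - n + 1)]

def convert_features(v):
    """Return the 10 features named in claim 2 for one space vector v.
    Assumes the mean power is non-zero."""
    v_tm = statistics.mean(v)                  # average power V_tm
    s_maxdiff = max(v) - min(v)                # maximum difference
    q1, q2, q3 = statistics.quantiles(v, n=4, method="inclusive")
    iqr = q3 - q1                              # interquartile range
    sd = statistics.pstdev(v)                  # population SD (assumption)
    s_maxratio = max(v) / v_tm                 # maximum data change ratio
    s_minratio = min(v) / v_tm                 # minimum data change ratio
    s_dev = s_maxdiff / v_tm                   # data dispersion value
    return [v_tm, s_maxdiff, q1, q2, q3, iqr, sd,
            s_maxratio, s_minratio, s_dev]

# Toy power readings in watts, one per 3-minute interval.
series = [50.0, 52.0, 54.0, 56.0, 58.0]
vectors = to_space_vectors(series, n=4)
features = [convert_features(v) for v in vectors]
```

Each window of N raw readings is thus replaced by a fixed 10-dimensional feature vector, which is the input that claim 3 then reduces with principal component analysis.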
CN202310649869.8A 2023-06-02 2023-06-02 State detection method based on modular expandable cabinet power distribution unit Active CN116702078B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310649869.8A CN116702078B (en) 2023-06-02 2023-06-02 State detection method based on modular expandable cabinet power distribution unit

Publications (2)

Publication Number Publication Date
CN116702078A CN116702078A (en) 2023-09-05
CN116702078B CN116702078B (en) 2024-03-26

Family

ID=87835129

Country Status (1)

Country Link
CN (1) CN116702078B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109376796A (en) * 2018-11-19 2019-02-22 中山大学 Image classification method based on active semi-supervised learning
CN111740991A (en) * 2020-06-19 2020-10-02 上海仪电(集团)有限公司中央研究院 Anomaly detection method and system
CN113484817A (en) * 2021-06-30 2021-10-08 国网上海市电力公司 Intelligent electric energy meter automatic verification system abnormity detection method based on TSVM model
CN115410026A (en) * 2022-07-14 2022-11-29 扬州大学 Image classification method and system based on label propagation contrast semi-supervised learning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210209412A1 (en) * 2020-01-02 2021-07-08 International Business Machines Corporation Labeling data using automated weak supervision

Also Published As

Publication number Publication date
CN116702078A (en) 2023-09-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant