CN115963350A - Fault positioning method and device for direct-current power distribution network - Google Patents

Fault positioning method and device for direct-current power distribution network

Info

Publication number
CN115963350A
Authority
CN
China
Prior art keywords
current data
mode current
fault
features
random forest
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211389396.4A
Other languages
Chinese (zh)
Inventor
马天祥
段昕
李丹
罗蓬
胡紫琪
徐岩
王若琳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
Electric Power Research Institute of State Grid Hebei Electric Power Co Ltd
North China Electric Power University
State Grid Hebei Energy Technology Service Co Ltd
Original Assignee
State Grid Corp of China SGCC
Electric Power Research Institute of State Grid Hebei Electric Power Co Ltd
North China Electric Power University
State Grid Hebei Energy Technology Service Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, Electric Power Research Institute of State Grid Hebei Electric Power Co Ltd, North China Electric Power University, State Grid Hebei Energy Technology Service Co Ltd filed Critical State Grid Corp of China SGCC
Priority to CN202211389396.4A priority Critical patent/CN115963350A/en
Publication of CN115963350A publication Critical patent/CN115963350A/en
Pending legal-status Critical Current

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04: INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S: SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00: Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50: Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
    • Y04S10/52: Outage or fault management, e.g. fault detection or location

Landscapes

  • Supply And Distribution Of Alternating Current (AREA)

Abstract

Embodiments of the disclosure provide a fault location method and device for a direct-current (DC) power distribution network. The method comprises: acquiring line-mode current data; selecting an optimal feature subset from the line-mode current data based on an improved ReliefF algorithm, and generating weights corresponding to the features; processing according to the optimal feature subset and the weights to obtain a fault location model based on a weighted random forest algorithm; and collecting present line-mode current data, inputting it into the fault location model of the weighted random forest algorithm, and outputting the present fault position. Because the improved ReliefF algorithm selects an optimal feature subset of the line-mode current data, the feature dimension input to the fault location model is reduced and redundant features are eliminated, which reduces the computation time of the algorithm and improves model accuracy.

Description

Fault positioning method and device for direct-current power distribution network
Technical Field
The disclosure relates to the field of fault location, in particular to circuit fault location, and more specifically to a method and device for locating faults in a DC power distribution network.
Background
The topologies of DC power distribution systems include radial, hand-in-hand, and ring structures. A ring DC system has multiple terminals, its topology and fault protection strategy are relatively complex, and the integration of new energy sources such as photovoltaic and wind power adds further diversity.
A single-pole ground fault eventually enters a voltage recovery stage in which the DC-side voltage gradually returns to normal and the healthy pole carries the DC system in short-term single-pole operation. The fault characteristics are therefore not pronounced, which makes accurate fault location more difficult.
In circuit fault location, the general approach is to inspect and analyze the circuit with instruments such as a multimeter and an oscilloscope to determine the fault, which suffers from inaccurate fault location.
Disclosure of Invention
The disclosure provides a fault positioning method and device for a direct-current power distribution network.
According to a first aspect of the disclosure, a fault location method for a DC power distribution network is provided. The method comprises:
acquiring line-mode current data;
processing the line-mode current data based on an improved ReliefF algorithm to obtain an optimal feature subset, and generating weights corresponding to the features;
training according to the optimal feature subset and the weights to obtain a fault location model based on a weighted random forest algorithm;
and collecting present line-mode current data, inputting the present line-mode current data into the fault location model of the weighted random forest algorithm, and outputting the present fault position.
In the above aspect and any possible implementation, an implementation is further provided in which the line-mode current data is labeled to generate a training data set.
In the above aspect and any possible implementation, an implementation is further provided in which acquiring the line-mode current data further includes preprocessing the line-mode current data, wherein the preprocessing comprises:
performing a first preset number of time-domain feature extractions and a second preset number of frequency-domain feature extractions on the line-mode current data to generate a multiple feature parameter table;
and eliminating redundant features in the multiple feature parameter table by using the Pearson correlation coefficient.
In the above aspect and any possible implementation, an implementation is further provided in which eliminating redundant features in the multiple feature parameter table by using the Pearson correlation coefficient includes:
calculating the Pearson correlation coefficient between every two features in the multiple feature parameter table, and generating the corresponding p-value;
if the p-value is greater than or equal to a preset threshold, no significant correlation exists between the two features corresponding to the p-value;
if the p-value is smaller than the preset threshold, a significant correlation exists between the two features corresponding to the p-value;
redundant features having a significant correlation in the multiple feature parameter table are eliminated.
In the above aspect and any possible implementation, an implementation is further provided in which processing the line-mode current data based on the improved ReliefF algorithm to obtain an optimal feature subset, and generating weights corresponding to the features, includes:
determining a limiting coefficient according to the number of feature elements in the multiple feature parameter table after the redundant features are eliminated, and taking the product of the limiting coefficient and the number of feature elements as the number of sampling times;
based on the improved ReliefF algorithm, iterating the weights according to the feature elements in the multiple feature parameter table after the redundant features are eliminated and the number of sampling times;
averaging the weights corresponding to the features generated by the iterations to obtain the final weights;
and forming the optimal feature subset from the features whose final weights rank highest.
In the above aspect and any possible implementation, an implementation is further provided in which training according to the optimal feature subset and the weights to obtain a fault location model of a weighted random forest algorithm includes:
multiplying the feature values in the optimal feature subset by the weights corresponding to the features to obtain a multiplication result, wherein the multiplication result is a weight coefficient of the weighted random forest algorithm;
and training according to the optimal feature subset and the weight coefficient of the weighted random forest algorithm to obtain the fault location model of the weighted random forest algorithm.
In the above aspect and any possible implementation, an implementation is further provided in which training according to the optimal feature subset and the weights to obtain a fault location model of a weighted random forest algorithm further includes:
using a loop to search for the number of attributes and the total number of decision trees that give the fault location model of the weighted random forest algorithm its highest goodness of fit.
According to a second aspect of the present disclosure, a DC distribution network fault location device is provided. The device includes:
an acquisition module, configured to acquire line-mode current data;
a processing module, configured to process the line-mode current data based on an improved ReliefF algorithm to obtain an optimal feature subset and generate weights corresponding to the features;
a training module, configured to train according to the optimal feature subset and the weights to obtain a fault location model of a weighted random forest algorithm;
and an output module, configured to collect present line-mode current data, input the present line-mode current data into the fault location model of the weighted random forest algorithm, and output the present fault position.
According to a third aspect of the present disclosure, an electronic device is provided. The electronic device includes: a memory having a computer program stored thereon and a processor implementing the method as described above when executing the program.
According to a fourth aspect of the present disclosure, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method as according to the first and/or second aspects of the present disclosure.
It should be understood that the statements herein reciting aspects are not intended to limit the critical or essential features of the embodiments of the present disclosure, nor are they intended to limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The above and other features, advantages, and aspects of the embodiments of the present disclosure will become more apparent from the following detailed description taken in conjunction with the accompanying drawings. The drawings are provided for a better understanding of the present disclosure and do not limit it. Like reference numerals denote like or similar elements throughout the drawings, in which:
Fig. 1 shows a flow chart of a DC distribution network fault location method according to an embodiment of the present disclosure;
Fig. 2 shows a block diagram of a DC distribution network fault location device according to an embodiment of the present disclosure;
Fig. 3 shows a flow chart of another DC distribution network fault location method according to an embodiment of the present disclosure;
Fig. 4 shows the topology of a six-terminal DC power distribution system according to an embodiment of the present disclosure;
Fig. 5 shows the equivalent circuit when a single-pole ground fault occurs on the DC side of a VSC, according to an embodiment of the present disclosure;
Fig. 6 shows a flow chart of a method of selecting an optimal feature subset with the improved ReliefF algorithm, according to an embodiment of the present disclosure;
Fig. 7 shows a block diagram of an exemplary electronic device capable of implementing embodiments of the present disclosure.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present disclosure more clear, the technical solutions of the embodiments of the present disclosure will be described clearly and completely with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are some, but not all embodiments of the present disclosure. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.
In addition, the term "and/or" herein is only one kind of association relationship describing an associated object, and means that there may be three kinds of relationships, for example, a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the character "/" herein generally indicates that the former and latter associated objects are in an "or" relationship.
In the disclosure, to address the shortcomings of the ReliefF algorithm in application, a limiting coefficient is introduced and combined with the Pearson correlation coefficient method so that the optimal feature subset is screened automatically.
Fig. 1 shows a flow chart of a DC distribution network fault location method 100 according to an embodiment of the present disclosure.
As shown in Fig. 1, the method for locating a fault in a DC power distribution network includes:
S101: acquiring line-mode current data;
S102: processing the line-mode current data based on an improved ReliefF algorithm to obtain an optimal feature subset, and generating weights corresponding to the features;
S103: training according to the optimal feature subset and the weights to obtain a fault location model of a weighted random forest algorithm;
S104: collecting present line-mode current data, inputting the present line-mode current data into the fault location model of the weighted random forest algorithm, and outputting the present fault position.
In some embodiments, the line mode current data is labeled, generating a training data set.
In some embodiments, acquiring the line-mode current data further comprises preprocessing the line-mode current data, wherein the preprocessing comprises:
performing a first preset number of time-domain feature extractions and a second preset number of frequency-domain feature extractions on the line-mode current data to generate a multiple feature parameter table;
and eliminating redundant features in the multiple feature parameter table by using the Pearson correlation coefficient.
In some embodiments, eliminating redundant features in the multiple feature parameter table by using the Pearson correlation coefficient comprises:
calculating a Pearson correlation coefficient between every two characteristics in the multiple characteristic parameter table and generating a corresponding p-value;
if the p-value is larger than or equal to a preset threshold value, no significant correlation exists between the two characteristics corresponding to the p-value;
if the p-value is smaller than the preset threshold value, significant correlation exists between the two features corresponding to the p-value;
redundant features having significant correlation in the multiple feature parameter table are eliminated.
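The p-value screening described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patented implementation: the p-value is approximated with the Fisher z-transform (an exact test would use Student's t distribution), the threshold `alpha=0.05` is a hypothetical choice, and keeping the earlier feature of each correlated pair is likewise an assumption, since the text does not say which feature of a pair is dropped.

```python
import math
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant feature is treated as uncorrelated
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Two-sided p-value via the Fisher z approximation (assumption)."""
    if n < 4:
        return 1.0  # too few samples to test
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def drop_redundant(table, alpha=0.05):
    """table maps feature name -> list of values; returns the kept names.

    A feature is dropped when it is significantly correlated
    (p < alpha) with a feature that has already been kept.
    """
    kept = []
    for name, values in table.items():
        redundant = any(
            approx_p_value(pearson_r(table[k], values), len(values)) < alpha
            for k in kept
        )
        if not redundant:
            kept.append(name)
    return kept


table = {
    "T1": [1, 2, 3, 4, 5, 6, 7, 8],
    "T2": [2, 4, 6, 8, 10, 12, 14, 16],  # exact multiple of T1
    "T3": [3, -1, 4, -1, 5, -9, 2, -6],
}
print(drop_redundant(table))
```

In this toy table, T2 is a scaled copy of T1 and is removed, while T3 is only weakly correlated with T1 and is retained.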
In some embodiments, processing the line-mode current data based on the improved ReliefF algorithm to obtain an optimal feature subset, and generating weights corresponding to the features, includes:
determining a limiting coefficient according to the number of feature elements in the multiple feature parameter table after the redundant features are eliminated, and taking the product of the limiting coefficient and the number of feature elements as the number of sampling times;
based on the improved ReliefF algorithm, iterating the weights according to the feature elements in the multiple feature parameter table after the redundant features are eliminated and the number of sampling times;
averaging the weights corresponding to the features generated by the iterations to obtain the final weights;
and forming the optimal feature subset from the features whose final weights rank highest.
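The bookkeeping around the weight iteration can be sketched as follows. The function names, the representation of the iteration output as a list of weight vectors, and the `keep` count are all assumptions made for illustration; the text specifies only that the sampling count is the product of the limiting coefficient and the number of features, that the weights are averaged, and that the top-ranked features are kept.

```python
def sampling_count(q, n_features):
    """Number of sampling times: limiting coefficient times feature count."""
    return q * n_features


def average_weights(weight_vectors):
    """Average several weight vectors (one per run) feature by feature."""
    n_runs = len(weight_vectors)
    n_features = len(weight_vectors[0])
    return [
        sum(vec[p] for vec in weight_vectors) / n_runs
        for p in range(n_features)
    ]


def top_ranked(names, weights, keep):
    """Return the `keep` highest-weighted (name, weight) pairs."""
    order = sorted(range(len(names)), key=lambda p: weights[p], reverse=True)
    return [(names[p], weights[p]) for p in order[:keep]]


# With q = 3 and 24 features (no feature removed), 72 samples are drawn.
print(sampling_count(3, 24))

final = average_weights([[0.25, 0.5, 0.0], [0.75, 0.0, 0.5]])
print(top_ranked(["T1", "T2", "T3"], final, keep=2))
```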
In some embodiments, the training according to the optimal feature subset and the weight to obtain a fault location model of a weighted random forest algorithm includes:
multiplying the feature values in the optimal feature subset by the weight corresponding to the features to obtain a multiplication result, wherein the multiplication result is a weight coefficient of the weighted random forest algorithm;
and training according to the optimal feature subset and the weight coefficient of the weighted random forest algorithm to obtain a fault positioning model of the weighted random forest algorithm.
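The multiplication step can be illustrated with a short sketch. The function name and the row-per-sample layout are assumptions; how the forest then consumes the resulting coefficients is not specified in the text, so only the element-wise product is shown.

```python
def weight_features(samples, weights):
    """Multiply each feature value by the weight of its feature.

    samples: list of rows, one row of feature values per sample.
    weights: one final ReliefF weight per feature (same order as the rows).
    """
    for row in samples:
        if len(row) != len(weights):
            raise ValueError("each sample needs one value per weight")
    return [[value * w for value, w in zip(row, weights)] for row in samples]


weighted = weight_features([[1.0, 2.0], [3.0, 4.0]], [0.5, 0.25])
print(weighted)
```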
In some embodiments, training according to the optimal feature subset and the weights to obtain a fault location model of a weighted random forest algorithm further includes:
using a loop to search for the number of attributes and the total number of decision trees that give the fault location model of the weighted random forest algorithm its highest goodness of fit.
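The loop-based search can be sketched as a plain nested loop over candidate values. The names `mtry` (attributes tried per split) and `ntree` (number of trees) follow the terms used later in the description; `score_fn` is a hypothetical callable that trains a model and returns its goodness of fit, and `toy_score` is a stand-in so the sketch runs without a training library. In practice `score_fn` could, for example, wrap a random forest estimator such as scikit-learn's `RandomForestRegressor` with `max_features=mtry` and `n_estimators=ntree`.

```python
def search_mtry_ntree(score_fn, mtry_values, ntree_values):
    """Exhaustively try every (mtry, ntree) pair and keep the best score."""
    best = None
    for mtry in mtry_values:
        for ntree in ntree_values:
            score = score_fn(mtry, ntree)
            if best is None or score > best[0]:
                best = (score, mtry, ntree)
    return best  # (best score, best mtry, best ntree)


def toy_score(mtry, ntree):
    """Stand-in for a real fit score; peaks at mtry=4, ntree=300."""
    return -((mtry - 4) ** 2) - ((ntree - 300) / 100) ** 2


score, mtry, ntree = search_mtry_ntree(toy_score, range(1, 9), range(100, 600, 100))
print(mtry, ntree)
```

An exhaustive loop is simple and deterministic, at the cost of one model fit per parameter pair.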
According to the embodiment of the disclosure, the following technical effects are achieved:
When a single-pole ground short circuit occurs on a DC distribution line, multiple time-frequency composite fault features are constructed from the line-mode current component measured at a single end of the line. To address the shortcomings of the ReliefF algorithm in application, a limiting coefficient q is introduced and combined with the Pearson correlation coefficient method, so that the optimal feature subset is screened automatically. The mean of the calculated feature weights is taken as the final weight assignment, yielding an optimal feature subset with weight coefficients; these coefficients are then reused to improve the performance and fault location accuracy of the subsequent random forest algorithm. The random forest is improved in two respects. First, the improved ReliefF algorithm yields specific numerical feature weights, which are combined with the random forest algorithm to form a weighted random forest algorithm, improving the prediction accuracy of the model in a targeted manner. Second, the values of mtry and ntree that give the highest goodness of fit are found with a loop, improving the speed and accuracy of fault location. The principle is simple: only local measurements are needed, and information from the two ends does not have to be transmitted or synchronized, which greatly reduces the management and investment cost of sampling equipment and gives the method practical engineering value.
It should be noted that for simplicity of description, the above-mentioned method embodiments are described as a series of acts, but those skilled in the art should understand that the present disclosure is not limited by the described order of acts, as some steps may be performed in other orders or simultaneously according to the present disclosure. Further, those skilled in the art will appreciate that the embodiments described in the specification are exemplary embodiments and that acts and modules are not necessarily required for the disclosure.
The above is a description of embodiments of the method, and the embodiments of the apparatus are described below to further illustrate the aspects of the disclosure.
Fig. 2 shows a block diagram of a DC distribution network fault location device 200 according to an embodiment of the present disclosure.
As shown in Fig. 2, the device 200 includes:
an acquisition module 201, configured to acquire line-mode current data;
a processing module 202, configured to process the line-mode current data based on an improved ReliefF algorithm to obtain an optimal feature subset and generate weights corresponding to the features;
a training module 203, configured to train according to the optimal feature subset and the weights to obtain a fault location model of a weighted random forest algorithm;
an output module 204, configured to collect present line-mode current data, input the present line-mode current data into the fault location model of the weighted random forest algorithm, and output the present fault position.
Fig. 3 shows a flow chart of another DC distribution network fault location method 300 according to an embodiment of the present disclosure.
As shown in Fig. 3, the method for locating a fault in a DC power distribution network includes:
S301: after detecting a single-pole ground short circuit on a DC distribution line, the protection device records 100 ms of current and voltage data at the observation point and applies a phase-mode transformation to the recorded data to obtain line-mode current data;
S302: calculating 11 time-domain features and 13 frequency-domain features from the line-mode current data to form a multiple feature parameter table;
S303: eliminating redundant features in the multiple feature parameter table by using the Pearson correlation coefficient;
S304: based on the improved ReliefF algorithm, iterating the weights according to the feature elements in the multiple feature parameter table after the redundant features are eliminated and the number of sampling times, taking the average of the weights corresponding to the features generated by the iterations as the final weights, and automatically selecting the features whose final weights rank highest to form the optimal feature subset;
S305: multiplying the feature values in the optimal feature subset by the weights corresponding to the features to obtain a multiplication result, wherein the multiplication result is a weight coefficient of the weighted random forest algorithm; training according to the optimal feature subset and the weight coefficient of the weighted random forest algorithm to obtain the fault location model of the weighted random forest algorithm, wherein the number of attributes and the total number of decision trees that give the model its highest goodness of fit are determined with a loop;
S306: collecting present line-mode current data, inputting the present line-mode current data into the fault location model of the weighted random forest algorithm, and outputting the present fault position.
In some embodiments, the topologies of DC power distribution systems include radial, hand-in-hand, and ring structures; a ring DC system has multiple terminals and offers short power restoration time and high supply reliability.
Fig. 4 shows the topology of a six-terminal DC distribution system according to an embodiment of the present disclosure; the six-terminal ring DC distribution system shown in Fig. 4 is taken as the study object for the fault location method of the DC distribution network.
The equivalent circuit when a single-pole ground fault occurs on the DC side of a VSC is shown in Fig. 5.
The fault characteristics of the positive and negative poles are symmetrical under a single-pole ground fault, so the embodiments of the disclosure take a positive-pole ground fault as the example for studying single-pole ground fault location in a DC distribution network.
Unlike a pole-to-pole short circuit in a DC distribution network, where the fault must be cleared before the capacitor voltage crosses zero, a single-pole ground fault eventually enters a voltage recovery stage: the DC-side voltage gradually returns to normal and the healthy pole carries the DC system in short-term single-pole operation. This buys some time for fault clearance, but the fault characteristics are not pronounced, which makes accurate fault location more difficult.
In a bipolar DC distribution system, because the parameters of the positive and negative lines are coupled, the positive- and negative-pole currents are converted into line-mode and zero-mode current components by the Karenbauer pole-mode transformation:
Figure BDA0003931365660000101
where i_p is the positive-pole current of the DC line and i_n is the negative-pole current; i_1 is the corresponding line-mode component and i_0 is the corresponding zero-mode component. The positive direction of the line-mode component is the same as that of the positive- and negative-pole currents, and the positive direction of the zero-mode component is defined as flowing from the bus to the line. S is the Karenbauer transformation matrix, namely:
Figure BDA0003931365660000111
The line-mode current is not affected by coupling with the opposite-pole line and attenuates little, which facilitates fault analysis; the fault location method for the DC distribution network therefore uses the line-mode component of the single-pole ground fault current.
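A small numerical sketch of the pole-mode transformation follows. The transformation matrix appears only as an image in this text, so the sketch assumes the normalized form commonly used for two-conductor DC lines, S = (1/sqrt(2)) * [[1, 1], [1, -1]], with the first row giving i_0 and the second row giving i_1. Both the scaling and the row order are assumptions, not values taken from the patent.

```python
import math

# Assumed Karenbauer-type matrix for a bipolar DC line (see lead-in).
_C = 1.0 / math.sqrt(2.0)
S = [[_C, _C],
     [_C, -_C]]


def pole_to_mode(i_p, i_n):
    """Convert pole currents (i_p, i_n) to (zero-mode i_0, line-mode i_1)."""
    i_0 = S[0][0] * i_p + S[0][1] * i_n
    i_1 = S[1][0] * i_p + S[1][1] * i_n
    return i_0, i_1


# Balanced operation: the negative-pole current mirrors the positive one,
# so the zero-mode component vanishes and only the line mode remains.
i_0, i_1 = pole_to_mode(1.0, -1.0)
print(i_0, i_1)
```

Applying `pole_to_mode` sample by sample to the recorded pole currents would give the line-mode current sequence used for feature extraction.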
For Line1 of the six-terminal ring DC distribution system shown in Fig. 4, positive-pole ground faults are applied at different distances from the outlet of the left-end converter, and the line-mode current within 0.1 s is measured at bus A.
When a DC cable line fails, the system passes through a transient process before gradually returning to a steady state. During this transient, the fault point generates transient signals that propagate toward both ends of the cable. Whereas the system contains only DC components in normal operation, the fault transient contains signal energy at many frequencies, so the transient signals carry rich fault information from which effective fault features can be extracted.
The oscillation behavior and frequency content of the line-mode current change with the fault position, and a correspondence exists between them. To avoid complex threshold-setting calculations and reduce investment in sampling equipment, the embodiments of the disclosure extract 11 feature quantities (T_1 to T_11) from the time domain of the short-circuit line-mode current signal and 13 feature quantities (T_12 to T_24) from the frequency domain, 24 feature parameters in total, which form the multiple feature parameters of the line-mode current. The parameter numbers and corresponding formulas are listed in Table 1, where x(n) is the time-domain signal sequence, n = 1, 2, …, N; N is the number of sample points; s(k) is the spectrum obtained by applying the DFT to the time-domain signal x(n), k = 1, 2, …, K; K is the number of spectral lines; and f_k is the frequency of the k-th spectral line.
Table 1 Multiple feature parameter table
Figure BDA0003931365660000112
Figure BDA0003931365660000121
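Because Table 1 survives only as images in this text, the 24 formulas cannot be reproduced here. As a rough illustration of what time-domain and frequency-domain features of x(n) look like, the sketch below computes a few generic statistics (mean, RMS, peak, mean spectral amplitude, spectral centroid). These are assumed stand-ins chosen for illustration and are not claimed to match T_1 to T_24; the function names and the sampling rate `fs` are likewise hypothetical.

```python
import cmath
import math


def time_features(x):
    """A few generic time-domain statistics of the sequence x(n)."""
    n = len(x)
    return {
        "mean": sum(x) / n,
        "rms": math.sqrt(sum(v * v for v in x) / n),
        "peak": max(abs(v) for v in x),
    }


def amplitude_spectrum(x):
    """Amplitude s(k) of DFT lines k = 1..N//2 (naive O(N^2) DFT)."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))) / n
        for k in range(1, n // 2 + 1)
    ]


def frequency_features(x, fs):
    """Generic spectral statistics; fs is the sampling rate in Hz."""
    n = len(x)
    s = amplitude_spectrum(x)
    freqs = [k * fs / n for k in range(1, len(s) + 1)]  # f_k of each line
    total = sum(s)
    centroid = sum(f * a for f, a in zip(freqs, s)) / total if total else 0.0
    return {"mean_amplitude": total / len(s), "centroid": centroid}


# A 50 Hz test tone sampled at 1 kHz for 100 ms (100 points).
x = [math.sin(2 * math.pi * 50 * t / 1000) for t in range(100)]
print(time_features(x)["rms"], frequency_features(x, 1000.0)["centroid"])
```

For the test tone, the spectral centroid sits at the 50 Hz line, which is the kind of position-dependent frequency content the description relies on.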
Relief is a classic filter-type feature selection algorithm, but it is limited to binary classification problems and cannot handle noise or missing values in the data.
Building on the idea of Relief, Kononenko proposed the ReliefF algorithm, which applies to multi-class problems. It expresses the importance of features through a "relevance statistic", whose essence is to characterize a feature's ability to produce "intra-class aggregation and inter-class dispersion" and which serves as an important measure of feature importance. Each feature is assigned a weight according to its correlation with each class, and the features are ranked by weight: the larger the weight, the stronger the feature's relevance. During feature selection, features with smaller weights are removed according to a set threshold, forming a feature subset.
The specific method for selecting the multiple time-frequency-domain short-circuit fault features of the line-mode current based on the ReliefF algorithm is as follows:
(1) Randomly select a sample R_i from the training set containing all features.
(2) Find the K nearest-neighbor samples H_j and M_j (j = 1, 2, …, K) of R_i in the sample set of the same class and in the sample sets of the different classes, respectively.
(3) Using the KNN (K-Nearest Neighbors) idea, iterate over each feature dimension and update the weight W(T_p) of each feature T_p (p = 1, 2, …, 24) according to formula (3):
Figure BDA0003931365660000131
where m is the number of iterations of the algorithm; P(c) is the probability of class-c samples in the training set; d(T_p, X, H_j) is the distance between sample X and sample H_j on feature T_p; and d(T_p, X, M_j) is the distance between sample X and sample M_j on feature T_p. The distance is calculated as:
Figure BDA0003931365660000132
Figure BDA0003931365660000133
where V(T_p, A) is the value of feature T_p for sample A, and A stands for X, H_j, or M_j.
(4) Repeat the above process m times and take the average weight as the final assigned weight W(T_p) of each feature.
(5) After the iterations are complete, sort the weights W(T_p) of the features in descending order and extract the top-ranked features according to a set weight threshold α to form the feature subset.
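Steps (1) to (4) can be sketched in plain Python. The update formula and distance formulas appear only as images in this text, so the sketch follows the widely published form of the ReliefF update: near-hit distances are subtracted, near-miss distances are added with the class-prior ratio P(c) / (1 - P(class of the sample)), and each feature difference is normalized by the feature's range. Treat these details, the function name, and the parameter names as assumptions rather than the patent's exact formula (3).

```python
import random


def relieff(X, y, m, n_neighbors, seed=0):
    """Return one ReliefF weight per feature.

    X: list of samples (rows of numeric feature values); y: class labels.
    m: number of randomly drawn samples; n_neighbors: K nearest hits/misses.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    # Range of each feature, used to normalize differences to [0, 1].
    span = [max(r[p] for r in X) - min(r[p] for r in X) or 1.0 for p in range(d)]
    classes = sorted(set(y))
    prior = {c: y.count(c) / n for c in classes}

    def diff(p, a, b):
        return abs(X[a][p] - X[b][p]) / span[p]

    def nearest(i, c):
        # The K samples of class c closest to sample i (sum of feature diffs).
        cand = [j for j in range(n) if j != i and y[j] == c]
        cand.sort(key=lambda j: sum(diff(p, i, j) for p in range(d)))
        return cand[:n_neighbors]

    w = [0.0] * d
    for _ in range(m):
        i = rng.randrange(n)                      # step (1): random sample
        hits = nearest(i, y[i])                   # step (2): near hits
        misses = {c: nearest(i, c) for c in classes if c != y[i]}
        for p in range(d):                        # step (3): weight update
            hit = sum(diff(p, i, h) for h in hits) / len(hits) if hits else 0.0
            miss = sum(
                prior[c] / (1.0 - prior[y[i]])
                * sum(diff(p, i, q) for q in near) / len(near)
                for c, near in misses.items() if near
            )
            w[p] += (miss - hit) / m              # step (4): average over m
    return w


# Feature 0 separates the two classes; feature 1 is unrelated noise.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.9, 0.8], [1.0, 0.2], [0.8, 0.5]]
y = [0, 0, 0, 1, 1, 1]
print(relieff(X, y, m=18, n_neighbors=2))
```

In the toy data the class-separating feature receives a clearly larger weight than the noise feature, which is the ranking behavior step (5) relies on.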
The ReliefF algorithm offers high computational efficiency, places no restriction on data types, is robust to noise, and is suitable for multi-feature classification. In practical applications, however, it still has the following shortcomings:
(1) Random sample selection may pick samples at class boundaries or noisy outlier samples, which introduces errors when the feature weights are updated. In addition, random selection cannot guarantee that samples of every subclass are selected, and samples are selected an uneven number of times, which affects the stability and accuracy of feature selection.
(2) The algorithm is sensitive to the number of iterations m and the number of nearest-neighbor samples K. Different parameter combinations may lead to different final weights, so the values of m and K must be set with the actual classification task in mind.
(3) Only the contribution of individual features to the classification is calculated, so the resulting feature subset may still contain redundant features.
To address the first two shortcomings, the embodiment of the present disclosure introduces a limiting coefficient q and proposes an improved ReliefF method for selecting the optimal feature subset. If q is too large, a single sample may be drawn repeatedly and the probability of selecting subclass samples is low; if q is too small, the iteration is insufficient, the optimal solution is not reached, and the result is not credible. For the 24 time-frequency-domain features constructed in the embodiment of the present disclosure, q = 3 is used, which keeps the samples equally represented under a limited number of draws and improves the reliability of the result. The total number of sampling iterations m is the product of the limiting coefficient q and the number of features.
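As a worked example of this rule, and assuming that all 24 features are still present when the count is fixed (the count is smaller if redundant features are removed first), the total number of sampling iterations is given below; the symbol $N_{\text{features}}$ is introduced here only for the illustration.

```latex
m = q \times N_{\text{features}} = 3 \times 24 = 72
```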
For redundant features in the feature subset, the existing approach is to compute the cosine similarity between features, remove in advance the features with higher similarity, and then compute the weights to select the optimal subset. The cosine similarity of vectors $X = (X_1, X_2, \ldots, X_n)$ and $Y = (Y_1, Y_2, \ldots, Y_n)$ is calculated as:
$$\cos(X, Y) = \frac{\sum_{i=1}^{n} X_i Y_i}{\sqrt{\sum_{i=1}^{n} X_i^2}\,\sqrt{\sum_{i=1}^{n} Y_i^2}} \tag{6}$$
As equation (6) shows, cosine similarity measures the angle between two vectors in space: it evaluates the correlation of features only in the spatial dimension and ignores their correlation in the numerical dimension. To make the correlation assessment more reliable, the embodiment of the present disclosure uses the Pearson Correlation Coefficient (PCC) to calculate the correlation between features. The PCC first centers the features to reduce numerical differences and then determines whether the two are collinear in space, which overcomes the insensitivity of cosine similarity to numerical differences. The closer the PCC is to 0, the lower the similarity. The PCC between vectors $X = (X_1, X_2, \ldots, X_n)$ and $Y = (Y_1, Y_2, \ldots, Y_n)$ is calculated as:
$$\mathrm{PCC}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}\,\sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}} \tag{7}$$
In the formula,
$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$$
$$\bar{Y} = \frac{1}{n} \sum_{i=1}^{n} Y_i$$
are the mean values of features X and Y, respectively. After PCC(X, Y) is calculated, a hypothesis test is used to judge whether the correlation between the features is significant, and the result is reported as a p-value.
The p-value is the numerical criterion for deciding whether the result of formula (7) is significant. If the p-value between two features is greater than a preset threshold, the PCC between them is not significant, and even if the magnitude of the PCC is close to 1 the two features cannot be regarded as redundant with each other. Conversely, if the p-value is smaller than the preset threshold, the calculated correlation coefficient is significant, and the PCC is used to characterize the degree of correlation between the two features.
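The difference between cosine similarity and the PCC can be illustrated numerically. The following sketch, which uses only the Python standard library, compares the two measures on two made-up feature vectors that share a large common offset; the function names and the data are assumptions for illustration.

```python
import math
from statistics import fmean


def cosine_similarity(x, y):
    """Cosine of the angle between vectors x and y, as in equation (6)."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))


def pearson(x, y):
    """PCC of equation (7): cosine similarity of the mean-centered vectors."""
    mx, my = fmean(x), fmean(y)
    return cosine_similarity([a - mx for a in x], [b - my for b in y])


# Two made-up features that both fluctuate slightly around 10.
x = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8]
y = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0]

print(cosine_similarity(x, y))  # dominated by the shared offset
print(pearson(x, y))            # reflects only the fluctuations
```

Because both vectors sit near the same offset, their cosine similarity is almost 1 even though their fluctuations are unrelated. After centering, the PCC is practically 0, so the two features would not be treated as similar.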
Fig. 6 shows a flow diagram of a method 600 for selecting an optimal feature subset for an improved ReliefF algorithm according to an embodiment of the disclosure.
As shown in fig. 6, the method for selecting an optimal feature subset by using the improved ReliefF algorithm includes:
S601: calculate the Pearson correlation coefficient between every two features in the multiple feature parameter table and generate the corresponding p-value. If the p-value is greater than or equal to a preset threshold, there is no significant correlation between the two features corresponding to the p-value; if the p-value is smaller than the preset threshold, there is a significant correlation between them. Eliminate the redundant features with significant correlation from the multiple feature parameter table.
S602: determine the limiting coefficient according to the number of features in the multiple feature parameter table after the redundant features are eliminated, and take the product of the limiting coefficient and the number of features as the number of sampling iterations. Based on the improved ReliefF algorithm, iterate the weights using the features remaining in the multiple feature parameter table and the number of sampling iterations.
S603: average the weights of each feature generated during the iteration to obtain the final weights, and form the optimal feature subset from the features whose final weights rank highest.
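Step S601 and the sampling count of step S602 can be sketched as follows. This is a minimal sketch under stated assumptions. The disclosure does not name the hypothesis test, so the p-value here comes from a Fisher z-transform with a normal approximation (it needs more than three samples). The minimum correlation magnitude `r_min`, the rule of keeping the earlier feature of a redundant pair, and the feature names `rms`, `peak`, and `kurtosis` are likewise hypothetical.

```python
import math
from statistics import NormalDist, fmean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den


def p_value(r, n):
    """Two-sided p-value for r via the Fisher z-transform (assumed test, n > 3)."""
    if abs(r) >= 1.0:
        return 0.0
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def drop_redundant(table, p_threshold=0.05, r_min=0.9):
    """Return the feature names kept after removing significantly correlated ones."""
    kept = []
    for name, column in table.items():
        redundant = False
        for other in kept:
            r = pearson(column, table[other])
            # Redundant only if the correlation is both significant and strong.
            if p_value(r, len(column)) < p_threshold and abs(r) >= r_min:
                redundant = True
                break
        if not redundant:
            kept.append(name)
    return kept


# Hypothetical feature table: "peak" is roughly twice "rms".
table = {
    "rms": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "peak": [2.1, 3.9, 6.2, 7.8, 10.1, 12.0, 13.9, 16.2],
    "kurtosis": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0],
}
kept = drop_redundant(table)
q = 3
m = q * len(kept)  # S602: sampling count = limiting coefficient x remaining features
print(kept, m)
```

The remaining features and the sampling count m would then be passed to the weight iteration of S602 and the ranking of S603.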
It can be clearly understood by those skilled in the art that, for convenience and simplicity of description, the specific working process of the described module may refer to the corresponding process in the foregoing method embodiment, and details are not described herein again.
In the technical solution of the present disclosure, the acquisition, storage, and use of the personal information of the users involved all comply with the relevant laws and regulations and do not violate public order and good morals.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 7 illustrates a schematic block diagram of an electronic device 700 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital processors, cellular telephones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
The electronic device 700 comprises a computing unit 701, which may perform various suitable actions and processes according to a computer program stored in a Read Only Memory (ROM) 702 or a computer program loaded from a storage unit 708 into a Random Access Memory (RAM) 703. The RAM 703 can also store various programs and data required for the operation of the electronic device 700. The computing unit 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
A plurality of components in the electronic device 700 are connected to the I/O interface 705, including: an input unit 706 such as a keyboard, a mouse, or the like; an output unit 707 such as various types of displays, speakers, and the like; a storage unit 708 such as a magnetic disk, optical disk, or the like; and a communication unit 709 such as a network card, a modem, a wireless communication transceiver, etc. The communication unit 709 allows the device 700 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The computing unit 701 may be any of a variety of general and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 701 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 701 performs the various methods and processes described above, such as the method 100. For example, in some embodiments, the method 100 may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 708. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 700 via the ROM 702 and/or the communication unit 709. When the computer program is loaded into the RAM 703 and executed by the computing unit 701, one or more steps of the method 100 described above may be performed. Alternatively, in other embodiments, the computing unit 701 may be configured to perform the method 100 in any other suitable manner (e.g., by way of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on a Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, and which receives data and instructions from, and transmits data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user may provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server combining a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (10)

1. A fault positioning method for a direct current power distribution network comprises the following steps:
acquiring line mode current data;
processing the line mode current data based on an improved ReliefF algorithm to obtain an optimal feature subset, and generating weights corresponding to features;
training according to the optimal feature subset and the weights to obtain a fault positioning model of a weighted random forest algorithm;
and acquiring current line mode current data, inputting the current line mode current data into the fault positioning model of the weighted random forest algorithm, and outputting the current fault position.
2. The method of claim 1, wherein the line-mode current data is labeled, generating a training data set.
3. The method of claim 1, wherein the acquiring line mode current data further comprises pre-processing the line mode current data; wherein:
performing a first preset number of time domain feature extractions and a second preset number of frequency domain feature extractions on the line mode current data to generate a multiple feature parameter table;
and eliminating redundant features in the multiple feature parameter table by using the Pearson correlation coefficient.
4. The method of claim 3, wherein said utilizing Pearson's correlation coefficient to eliminate redundant features in the multiple feature parameter table comprises:
calculating a Pearson correlation coefficient between every two characteristics in the multiple characteristic parameter table and generating a corresponding p-value;
if the p-value is larger than or equal to a preset threshold value, no significant correlation exists between the two characteristics corresponding to the p-value;
if the p-value is smaller than the preset threshold value, the two characteristics corresponding to the p-value have significant correlation;
redundant features having significant correlation in the multiple feature parameter table are eliminated.
5. The method according to claim 1, wherein the processing the line mode current data based on the improved ReliefF algorithm to obtain an optimal feature subset and generating weights corresponding to features comprises:
determining a limiting coefficient according to the number of the characteristic elements in the multiple characteristic parameter table after the redundant characteristic is eliminated, and taking the product of the limiting coefficient and the number of the characteristic elements as the sampling times;
based on the improved ReliefF algorithm, carrying out weight iteration according to the characteristic elements in the multiple characteristic parameter table after the redundant characteristic is eliminated and the sampling times;
averaging the weights corresponding to the features generated after iteration to serve as final weights;
and forming an optimal feature subset by the feature vector with the top ranking of the final weight.
6. The method of claim 1, wherein the training according to the optimal subset of features and the weights to obtain a fault localization model of a weighted random forest algorithm comprises:
multiplying the characteristic value in the optimal characteristic subset by the weight corresponding to the characteristic to obtain a multiplication result, wherein the multiplication result is a weight coefficient of the weighted random forest algorithm;
and training according to the optimal feature subset and the weight coefficient of the weighted random forest algorithm to obtain a fault positioning model of the weighted random forest algorithm.
7. The method of claim 1, wherein the training according to the optimal subset of features and the weights to obtain a fault localization model of a weighted random forest algorithm, further comprises:
and using a loop statement to search for the number of attributes and the total number of decision trees that give the fault positioning model of the weighted random forest algorithm the highest goodness of fit.
8. A DC distribution network fault locating device comprises:
the acquisition module is used for acquiring line mode current data;
the processing module is used for processing the line mode current data based on an improved ReliefF algorithm to obtain an optimal feature subset and generate weights corresponding to the features;
the training module is used for training according to the optimal feature subset and the weights to obtain a fault positioning model of a weighted random forest algorithm;
and the output module is used for acquiring current line mode current data, inputting the current line mode current data into the fault positioning model of the weighted random forest algorithm and outputting the current fault position.
9. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-7.
10. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-7.
CN202211389396.4A 2022-11-08 2022-11-08 Fault positioning method and device for direct-current power distribution network Pending CN115963350A (en)

Publications (1)

Publication Number Publication Date
CN115963350A (en) 2023-04-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination