CN115344567A - Low-voltage transformer area data cleaning and treatment method and device suitable for edge calculation
- Publication number
- CN115344567A (application number CN202211269752.9A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Quality & Reliability (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Remote Monitoring And Control Of Power-Distribution Networks (AREA)
Abstract
The invention discloses a data cleaning and governance method and device for low-voltage transformer areas, suitable for edge computing. The method comprises: completing the missing data in the measurement data of the low-voltage transformer area to obtain complete measurement data for the information acquisition devices of the low-voltage transformer area; clustering the complete measurement data and identifying abnormal data; correcting the identified abnormal data and outputting corrected complete measurement data; performing correlation analysis on the voltage data of the low-voltage transformer area in the corrected complete measurement data, and correcting the user-transformer relationship in the transformer area archive; calculating the corrected statistical line loss $L$ of the transformer area on the basis of the corrected user-transformer relationship in the transformer area archive; and outputting the corrected complete measurement data and the corrected statistical line loss $L$ of the transformer area. The invention can effectively handle missing and abnormal data in low-voltage transformer area measurement data, and governs and corrects the user-transformer relationship and the transformer area line loss data on the basis of the cleaned measurement data, significantly improving the quality of low-voltage transformer area big data.
Description
Technical Field
The invention relates to the technical field of data cleaning and governance, and in particular to a data cleaning and governance method and device for low-voltage transformer areas suitable for edge computing.
Background
In recent years, with the development and construction of smart distribution networks, electricity consumption information acquisition systems have been deployed throughout low-voltage transformer areas, so that these areas generate massive amounts of information data. At present, terminal information in a low-voltage transformer area is generally transmitted by power line carrier communication. The acquisition terminals work in a complex environment, and data loss is easily caused by channel interference, faulty acquisition equipment, and the like. At the same time, because archive information is not updated promptly when the lines, users, and so on of a transformer area change, the user-transformer relationship (the archive record of which transformer supplies each user) and the line loss data of the transformer area become abnormal, which seriously hinders the power company's goal of lean management.
For missing and abnormal data, traditional processing methods either discard the missing data or fill it in with the mean, median, mode, or a similar statistic of adjacent data points. Such methods only improve the completeness of the data; they cannot restore the correct values, so data quality is not effectively improved. In particular, they cannot correct errors in the user-transformer relationship of the low-voltage transformer area or in the transformer area line loss data.
Edge computing moves computing power to the side close to the physical equipment and data sources of the power terminals. Using a partitioned management approach, edge computing nodes (ECNs), optimally deployed and configured for each region, perform situation awareness, data processing and analysis, and fast autonomous decision-making on the control and execution unit side within the region. This effectively improves the real-time analysis and computing capability of the cyber-physical system (CPS) in the region and meets the power information system's need to process massive data quickly.
Disclosure of Invention
The invention aims to provide a data cleaning and governance method and device for low-voltage transformer areas suitable for edge computing, so as to meet users' needs.
To achieve this purpose, the technical solution provided by the invention is as follows:
First aspect
The invention provides a data cleaning and governance method for low-voltage transformer areas suitable for edge computing, comprising the following steps:
Step 1: complete the missing data in the measurement data of the low-voltage transformer area using a Markov-process missing-data completion method, obtaining complete measurement data for each information acquisition device in the low-voltage transformer area;
Step 2: cluster the complete measurement data output in Step 1 using a mean shift clustering algorithm, and identify abnormal data;
Step 3: correct the abnormal data identified in Step 2 using the Markov-process missing-data completion method, and output corrected complete measurement data;
Step 4: perform correlation analysis on the voltage data of the low-voltage transformer area in the corrected complete measurement data using the Pearson correlation coefficient method, and correct the user-transformer relationship in the transformer area archive;
Step 5: on the basis of the corrected user-transformer relationship in the transformer area archive, calculate the corrected statistical line loss $L$ of the transformer area from the corrected complete measurement data output in Step 3:
where $S$ is the power supplied to the transformer area, $p_i$ is the electricity consumption of the $i$-th user, $n$ is the total number of users in the transformer area, and $i$ indexes the users;
Step 6: output the complete measurement data corrected in Step 3 and the corrected statistical line loss $L$ of the transformer area from Step 5.
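The line loss formula of Step 5 is not reproduced in this text. As a minimal sketch, the following Python assumes the common loss-rate form L = (S - sum of p_i) / S x 100%; that form, the function name `statistical_line_loss`, and the sample values are illustrative assumptions rather than the patent's own definition.

```python
def statistical_line_loss(supply_kwh, user_kwh):
    """Return the assumed loss rate L = (S - sum(p_i)) / S * 100 in percent.

    supply_kwh: S, energy supplied to the transformer area.
    user_kwh:   iterable of p_i, one value per user attached to the area
                after the user-transformer relationship has been corrected.
    """
    if supply_kwh <= 0:
        raise ValueError("supplied energy S must be positive")
    consumed = sum(user_kwh)
    return (supply_kwh - consumed) / supply_kwh * 100.0


# Illustrative values only: 1000 kWh supplied, four users.
loss_pct = statistical_line_loss(1000.0, [240.0, 250.0, 230.0, 230.0])
print(f"statistical line loss: {loss_pct:.2f}%")
```

Because the user list comes from the corrected archive, a meter wrongly assigned to the area would distort the sum of p_i, which is why Step 4 precedes Step 5.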
In Step 1, the measurement data of the low-voltage transformer area comprise the electric energy, voltage, and current data of the single-phase and three-phase smart meters in the low-voltage transformer area.
The Markov-process data completion method used in Step 1 and Step 3 is as follows:
State space division: construct a training set from a segment of historical electric energy, voltage, and current data continuously acquired by the smart meter, and divide the training set into $k$ state spaces according to the maximum sampled value $a_{max}$ and the minimum sampled value $a_{min}$ in the training set and the specified accuracy of the completed data;
Markov state transition matrix: calculate the transition probability between each pair of states using the Markov state transition probability formula to obtain the forward and reverse Markov transition matrices $P$ of the samples. The Markov state transition probability $P_{mn}$ is given by:
where $s(m)$ is the probability of state $m$, i.e., the probability that the measured electric energy, voltage, or current value is $m$; $s(n|m)$ is the probability that the next state is $n$ given state $m$, i.e., the probability that the next measured value is $n$ when the measured value is $m$;
for the electric energy, voltage, and current measurement data to be completed in a subsequent segment of the same time scale, two interpolation values $I_1$ and $I_2$ are obtained from the forward and reverse initial states $m$ and $n$ of the data to be completed and the forward and reverse Markov transition matrices $P$;
Completion value calculation: the forward and reverse interpolation values $I_1$ and $I_2$ obtained from the Markov transition matrices are weighted and summed to obtain the final interpolation value $I$. The weighted sum is calculated as:
where $z$ is the difference between the frequencies with which the forward and reverse initial states $m$ and $n$ of the interpolated value occur in the training set, and $A(z)$ is a ridge-type distribution function of $z$.
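The completion procedure above can be sketched in Python as follows. Since the formulas for $P_{mn}$, $I$, and $A(z)$ are not reproduced in this text, the sketch makes several assumptions that should be read as illustrative only: transition probabilities are estimated as row-normalised transition counts, each interpolation value is the expected value of the next state, the weighted sum is taken as I = A(z) I_1 + (1 - A(z)) I_2, and A(z) is an increasing sine-shaped ridge function on [-1, 1]. All function names are hypothetical.

```python
import math


def state_of(value, a_min, a_max, k):
    """Map a value to one of k equal-width states spanning [a_min, a_max]."""
    if a_max == a_min:
        return 0
    idx = int((value - a_min) / (a_max - a_min) * k)
    return min(max(idx, 0), k - 1)


def transition_matrix(states, k):
    """Estimate P[m][n] as the row-normalised count of m -> n transitions."""
    counts = [[0.0] * k for _ in range(k)]
    for m, n in zip(states, states[1:]):
        counts[m][n] += 1.0
    matrix = []
    for row in counts:
        total = sum(row)
        # States with no observed successor get a uniform row (assumption).
        matrix.append([c / total for c in row] if total else [1.0 / k] * k)
    return matrix


def ridge(z, z_low=-1.0, z_high=1.0):
    """Increasing ridge-type function A(z); the exact form is an assumption."""
    if z <= z_low:
        return 0.0
    if z >= z_high:
        return 1.0
    mid = (z_low + z_high) / 2.0
    return 0.5 + 0.5 * math.sin(math.pi / (z_high - z_low) * (z - mid))


def complete(series, history, k=10):
    """Return a copy of `series` with None entries filled from both sides."""
    a_min, a_max = min(history), max(history)
    width = (a_max - a_min) / k
    centers = [a_min + (i + 0.5) * width for i in range(k)]
    states = [state_of(v, a_min, a_max, k) for v in history]
    forward = transition_matrix(states, k)
    backward = transition_matrix(states[::-1], k)
    freq = [states.count(i) / len(states) for i in range(k)]

    def walk(matrix, start_value, steps):
        """Chain `steps` expected-value transitions from a known value."""
        values, state = [], state_of(start_value, a_min, a_max, k)
        for _ in range(steps):
            expected = sum(p * c for p, c in zip(matrix[state], centers))
            values.append(expected)
            state = state_of(expected, a_min, a_max, k)
        return values

    filled = list(series)
    i = 0
    while i < len(filled):
        if filled[i] is not None:
            i += 1
            continue
        j = i
        while j < len(filled) and filled[j] is None:
            j += 1  # the gap is filled[i:j]
        left = filled[i - 1] if i > 0 else None
        right = filled[j] if j < len(filled) else None
        if left is None and right is None:
            raise ValueError("series contains no known value")
        i1 = walk(forward, left, j - i) if left is not None else None
        i2 = walk(backward, right, j - i)[::-1] if right is not None else None
        if i1 is None:
            filled[i:j] = i2
        elif i2 is None:
            filled[i:j] = i1
        else:
            m = state_of(left, a_min, a_max, k)
            n = state_of(right, a_min, a_max, k)
            w = ridge(freq[m] - freq[n])  # weight of the forward estimate
            filled[i:j] = [w * a + (1.0 - w) * b for a, b in zip(i1, i2)]
        i = j
    return filled


# Illustrative voltage history and a series with one missing sample.
history = [220.0, 221.0, 222.0, 221.0] * 25
filled = complete([220.0, 221.0, None, 221.0, 220.0], history)
print(filled)
```

The reverse matrix is estimated here from the time-reversed training sequence, so that a gap can be approached from the first known value after it as well as from the last known value before it.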
In Step 2, the complete measurement data output in Step 1 are clustered with a mean shift clustering algorithm and abnormal data are identified, as follows:
mean shift clustering uses the mean shift vector to update each candidate center point to the mean of the points within the sliding window, gradually locating the dense regions of the voltage data and thereby fixing the center point of each cluster; if the distance from a voltage data point under test to every cluster center exceeds a set threshold, the data point belongs to no cluster and is marked as abnormal data;
the mean shift vector represents the magnitude and direction of the offset from the center point, and is used to decide whether the center point iteration has finished and to calculate the new center point. The mean shift vector $M_h$ is given by:
$$M_h(y) = \frac{\sum_{q=1}^{N} G\left(\left\|\frac{x_q - y}{h}\right\|^2\right)(x_q - y)}{\sum_{q=1}^{N} G\left(\left\|\frac{x_q - y}{h}\right\|^2\right)}$$
where $y$ is a sample point randomly selected or specified as the initial cluster center; $x_q$ is a sample point of the smart meter measurement time series; $G(\cdot)$ is a kernel function, commonly a Gaussian kernel; $h$ is the kernel width; $N$ is the total number of smart meter measurement data points; and $q$ denotes the $q$-th time point in the smart meter measurement time series.
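A minimal one-dimensional sketch of this identification step is given below, using a Gaussian kernel. The `min_support` rule, which keeps a density mode as a cluster center only if enough points lie within one kernel width of it, is an assumption added so that an isolated outlier does not become its own cluster; the function names, kernel width, threshold, and voltage values are likewise illustrative.

```python
import math


def mean_shift_vector(y, points, h):
    """M_h(y): Gaussian-weighted mean offset of the points from center y."""
    weights = [math.exp(-0.5 * ((x - y) / h) ** 2) for x in points]
    return sum(w * (x - y) for w, x in zip(weights, points)) / sum(weights)


def cluster_centers(points, h, min_support=3, tol=1e-6, max_iter=200):
    """Shift every point to its density mode and keep well-supported modes."""
    centers = []
    for y in points:
        for _ in range(max_iter):
            shift = mean_shift_vector(y, points, h)
            y += shift
            if abs(shift) < tol:  # iteration finished: the offset vanished
                break
        support = sum(1 for x in points if abs(x - y) <= h)
        is_new = all(abs(y - c) > h for c in centers)
        # A mode backed by too few points is not a cluster (assumption).
        if is_new and support >= min_support:
            centers.append(y)
    return centers


def abnormal_points(points, centers, threshold):
    """Flag points farther than `threshold` from every cluster center."""
    return [x for x in points if all(abs(x - c) > threshold for c in centers)]


# Illustrative voltages: three dense groups plus one outlying reading.
voltages = [219.8, 220.1, 220.3, 219.9,
            225.0, 225.2, 224.9, 225.1,
            230.1, 229.9, 230.0, 230.2,
            242.0]
centers = cluster_centers(voltages, h=1.0)
outliers = abnormal_points(voltages, centers, threshold=3.0)
print(centers, outliers)
```

The kernel width `h` controls how many clusters emerge, and the distance threshold controls how far a reading may sit from its cluster before it is treated as abnormal; both would need tuning on real transformer area data.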
In Step 4, correlation analysis is performed on the voltage data of the low-voltage transformer area in the corrected complete measurement data using the Pearson correlation coefficient method, and the user-transformer relationship in the transformer area archive is corrected, as follows:
the correlation between the historical voltage data of the smart meters in the transformer area is calculated as:
$$C_{bd} = \frac{\sum (u_b - \bar{u}_b)(u_d - \bar{u}_d)}{\sqrt{\sum (u_b - \bar{u}_b)^2}\,\sqrt{\sum (u_d - \bar{u}_d)^2}}$$
where $C_{bd}$ is the correlation coefficient between smart meter $b$ and smart meter $d$; $u_b$ and $u_d$ are the historical voltage data of smart meter $b$ and smart meter $d$ respectively, with the sums running over the historical voltage samples; $\bar{u}_b$ and $\bar{u}_d$ are their mean values; and $n$ is the total number of users in the transformer area;
based on the correlation formula (6), the correlation feature vector $Q_b$ of each smart meter is calculated:
$$Q_b = [C_{b1}, C_{b2}, \ldots, C_{bn}]$$
where $n$ is the total number of users in the transformer area and $b$ denotes the $b$-th smart meter. The feature matrix $Q$ is built from the correlation feature vectors $Q_b$ of all smart meters; smart meters with an abnormal user-transformer relationship are found with the isolation forest algorithm, the correct user-transformer relationship is checked by means of the correlation feature vectors, and the user-transformer relationship in the archive is corrected.
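The construction of the feature matrix can be sketched as follows. The Pearson coefficient and the rows $Q_b$ follow the description above; the final detection step, however, is a deliberately simple stand-in and not the isolation forest named in the text: it flags a meter whose median correlation with the other meters falls below an assumed threshold. Function names, the threshold, and the voltage profiles are illustrative.

```python
import math
from statistics import median


def pearson(u_b, u_d):
    """Pearson correlation coefficient C_bd of two equally long series."""
    mean_b = sum(u_b) / len(u_b)
    mean_d = sum(u_d) / len(u_d)
    cov = sum((x - mean_b) * (y - mean_d) for x, y in zip(u_b, u_d))
    var_b = sum((x - mean_b) ** 2 for x in u_b)
    var_d = sum((y - mean_d) ** 2 for y in u_d)
    if var_b == 0.0 or var_d == 0.0:
        return 0.0  # constant series: coefficient undefined, treated as 0
    return cov / math.sqrt(var_b * var_d)


def feature_matrix(voltages):
    """Row b is the feature vector Q_b = [C_b1, ..., C_bn]."""
    return [[pearson(u_b, u_d) for u_d in voltages] for u_b in voltages]


def suspect_meters(q, min_median=0.5):
    """Stand-in for the isolation forest: flag low median correlation."""
    flagged = []
    for b, row in enumerate(q):
        others = [c for d, c in enumerate(row) if d != b]
        if median(others) < min_median:
            flagged.append(b)
    return flagged


# Illustrative profiles: meters 0 to 2 follow one transformer, meter 3 does not.
base = [230.0, 229.5, 228.7, 229.9, 231.2, 230.4, 228.9, 229.3]
ripple = [0.05, -0.05, 0.05, -0.05, 0.05, -0.05, 0.05, -0.05]
meters = [
    base,
    [v - 0.3 for v in base],
    [v + r for v, r in zip(base, ripple)],
    [229.0, 231.0, 230.5, 228.5, 229.2, 228.8, 231.1, 230.6],
]
Q = feature_matrix(meters)
suspects = suspect_meters(Q)
print(suspects)
```

Meters supplied by the same transformer see the same upstream voltage fluctuations, so their rows in $Q$ look alike; a meter recorded under the wrong transformer stands out as a row that resembles none of the others, which is the property an isolation forest would exploit.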
Second aspect
Corresponding to the method, the invention provides a data cleaning and governance device for low-voltage transformer areas suitable for edge computing, comprising the following units: a data completion unit, an abnormal data identification unit, an abnormal data correction unit, a user-transformer relationship correction unit, a line loss data calculation unit, and a data output unit;
the data completion unit is configured to complete the missing data in the measurement data of the low-voltage transformer area using a Markov-process missing-data completion method, obtaining complete measurement data for each information acquisition device in the low-voltage transformer area;
the abnormal data identification unit is configured to cluster the complete measurement data output by the data completion unit using a mean shift clustering algorithm and identify abnormal data;
the abnormal data correction unit is configured to correct the abnormal data identified by the abnormal data identification unit using the Markov-process missing-data completion method and output corrected complete measurement data;
the user-transformer relationship correction unit is configured to perform correlation analysis on the voltage data of the low-voltage transformer area in the corrected complete measurement data using the Pearson correlation coefficient method and correct the user-transformer relationship in the transformer area archive;
the line loss data calculation unit is configured to calculate, on the basis of the corrected user-transformer relationship in the transformer area archive, the corrected statistical line loss $L$ of the transformer area from the corrected complete measurement data output by the abnormal data correction unit:
where $S$ is the power supplied to the transformer area, $p_i$ is the electricity consumption of the $i$-th user, $n$ is the total number of users in the transformer area, and $i$ indexes the users;
the data output unit is configured to output the corrected complete measurement data output by the abnormal data correction unit and the corrected statistical line loss $L$ of the transformer area output by the line loss data calculation unit.
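One way to picture how the six units fit together is the following Python sketch, in which each unit is a pluggable callable and the `run` method plays the role of the data output unit. The class name `CleaningDevice`, its field names, and the trivial placeholder units used in the example are all hypothetical; the placeholder line loss again assumes the loss-rate form (S - sum of p_i) / S x 100%, which the text does not confirm.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Raw = List[Optional[float]]  # measurements with gaps marked as None
Clean = List[float]


@dataclass
class CleaningDevice:
    complete: Callable[[Raw], Clean]              # data completion unit
    identify: Callable[[Clean], List[int]]        # abnormal data identification unit
    correct: Callable[[Clean, List[int]], Clean]  # abnormal data correction unit
    correct_archive: Callable[[Dict[str, Clean]], List[str]]  # relationship correction unit
    line_loss: Callable[[float, List[float]], float]          # line loss calculation unit

    def _clean(self, raw: Dict[str, Raw]) -> Dict[str, Clean]:
        cleaned = {}
        for meter, series in raw.items():
            full = self.complete(series)
            cleaned[meter] = self.correct(full, self.identify(full))
        return cleaned

    def run(self, energy, voltage, supply):
        """Data output unit: return the corrected data and the line loss L."""
        clean_energy = self._clean(energy)
        clean_voltage = self._clean(voltage)
        members = self.correct_archive(clean_voltage)
        usage = [sum(clean_energy[m]) for m in members]
        return clean_energy, clean_voltage, self.line_loss(supply, usage)


def fill_with_previous(series):
    """Placeholder for the Markov completion: repeat the last known value."""
    out, last = [], next(v for v in series if v is not None)
    for v in series:
        last = last if v is None else v
        out.append(last)
    return out


device = CleaningDevice(
    complete=fill_with_previous,
    identify=lambda series: [],                    # placeholder: nothing flagged
    correct=lambda series, bad: series,            # placeholder: no change
    correct_archive=lambda volts: sorted(volts),   # placeholder: keep all meters
    line_loss=lambda s, p: (s - sum(p)) / s * 100.0,
)
energy = {"m1": [1.0, None, 1.0], "m2": [2.0, 2.0, 2.0]}
voltage = {"m1": [220.0, 220.5, None], "m2": [221.0, None, 221.2]}
clean_energy, clean_voltage, loss = device.run(energy, voltage, supply=10.0)
print(clean_energy, clean_voltage, loss)
```

Keeping the units as separate callables mirrors the device description: each placeholder can be swapped for a real implementation without touching the order in which the units are chained.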
In the data completion unit, the measurement data of the low-voltage transformer area comprise the electric energy, voltage, and current data of the single-phase and three-phase smart meters in the low-voltage transformer area.
The Markov-process data completion method used in the data completion unit and the abnormal data correction unit is as follows:
State space division: construct a training set from a segment of historical electric energy, voltage, and current data continuously acquired by the smart meter, and divide the training set into $k$ state spaces according to the maximum sampled value $a_{max}$ and the minimum sampled value $a_{min}$ in the training set and the specified accuracy of the completed data;
Markov state transition matrix: calculate the transition probability between each pair of states using the Markov state transition probability formula to obtain the forward and reverse Markov transition matrices $P$ of the samples. The Markov state transition probability $P_{mn}$ is given by:
where $s(m)$ is the probability of state $m$, i.e., the probability that the measured electric energy, voltage, or current value is $m$; $s(n|m)$ is the probability that the next state is $n$ given state $m$, i.e., the probability that the next measured value is $n$ when the measured value is $m$;
for the electric energy, voltage, and current measurement data to be completed in a subsequent segment of the same time scale, two interpolation values $I_1$ and $I_2$ are obtained from the forward and reverse initial states $m$ and $n$ of the data to be completed and the forward and reverse Markov transition matrices $P$;
Completion value calculation: the forward and reverse interpolation values $I_1$ and $I_2$ obtained from the Markov transition matrices are weighted and summed to obtain the final interpolation value $I$. The weighted sum is calculated as:
where $z$ is the difference between the frequencies with which the forward and reverse initial states $m$ and $n$ of the interpolated value occur in the training set, and $A(z)$ is a ridge-type distribution function of $z$.
The abnormal data identification unit clusters the complete measurement data output by the data completion unit with a mean shift clustering algorithm and identifies abnormal data, as follows:
mean shift clustering uses the mean shift vector to update each candidate center point to the mean of the points within the sliding window, gradually locating the dense regions of the voltage data and thereby fixing the center point of each cluster; if the distance from a voltage data point under test to every cluster center exceeds a set threshold, the data point belongs to no cluster and is marked as abnormal data;
the mean shift vector represents the magnitude and direction of the offset from the center point, and is used to decide whether the center point iteration has finished and to calculate the new center point. The mean shift vector $M_h$ is given by:
$$M_h(y) = \frac{\sum_{q=1}^{N} G\left(\left\|\frac{x_q - y}{h}\right\|^2\right)(x_q - y)}{\sum_{q=1}^{N} G\left(\left\|\frac{x_q - y}{h}\right\|^2\right)}$$
where $y$ is a sample point randomly selected or specified as the initial cluster center; $x_q$ is a sample point of the smart meter measurement time series; $G(\cdot)$ is a kernel function, commonly a Gaussian kernel; $h$ is the kernel width; $N$ is the total number of smart meter measurement data points; and $q$ denotes the $q$-th time point in the smart meter measurement time series.
The user-transformer relationship correction unit performs correlation analysis on the voltage data of the low-voltage transformer area in the corrected complete measurement data using the Pearson correlation coefficient method and corrects the user-transformer relationship in the transformer area archive, as follows:
the correlation between the historical voltage data of the smart meters in the transformer area is calculated as:
$$C_{bd} = \frac{\sum (u_b - \bar{u}_b)(u_d - \bar{u}_d)}{\sqrt{\sum (u_b - \bar{u}_b)^2}\,\sqrt{\sum (u_d - \bar{u}_d)^2}}$$
where $C_{bd}$ is the correlation coefficient between smart meter $b$ and smart meter $d$; $u_b$ and $u_d$ are the historical voltage data of smart meter $b$ and smart meter $d$ respectively, with the sums running over the historical voltage samples; $\bar{u}_b$ and $\bar{u}_d$ are their mean values; and $n$ is the total number of users in the transformer area;
based on the correlation formula (6), the correlation feature vector $Q_b$ of each smart meter is calculated:
$$Q_b = [C_{b1}, C_{b2}, \ldots, C_{bn}]$$
where $n$ is the total number of users in the transformer area and $b$ denotes the $b$-th smart meter. The feature matrix $Q$ is built from the correlation feature vectors $Q_b$ of all smart meters; smart meters with an abnormal user-transformer relationship are found with the isolation forest algorithm, the correct user-transformer relationship is checked by means of the correlation feature vectors, and the user-transformer relationship in the archive is corrected.
Compared with the prior art, the invention has the following beneficial effects:
the method and device provided by the invention have simple steps and a small computational load and are suitable for fast, effective cleaning and governance of low-voltage transformer area data at the edge side. They can effectively handle missing and abnormal data in the measurement data of the low-voltage transformer area, and can govern and correct the user-transformer relationship and the transformer area line loss data on the basis of the cleaned measurement data, thereby significantly improving the quality of low-voltage transformer area big data.
Drawings
FIG. 1 is a schematic flow chart of a method provided by an embodiment of the present invention;
FIG. 2 is a diagram of the mean shift clustering effect according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating the effect of correcting the line loss data of a transformer area according to an embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the figures and specific examples. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in FIG. 1, the data cleaning and governance method for low-voltage transformer areas suitable for edge computing provided by the invention comprises the following steps:
Step 1: complete the missing data in the measurement data of the low-voltage transformer area using a Markov-process missing-data completion method, obtaining complete measurement data for each information acquisition device in the low-voltage transformer area;
Step 2: cluster the complete measurement data output in Step 1 using a mean shift clustering algorithm, and identify abnormal data;
Step 3: correct the abnormal data identified in Step 2 using the Markov-process missing-data completion method, and output corrected complete measurement data;
Step 4: perform correlation analysis on the voltage data of the low-voltage transformer area in the corrected complete measurement data using the Pearson correlation coefficient method, and correct the user-transformer relationship in the transformer area archive;
Step 5: on the basis of the corrected user-transformer relationship in the transformer area archive, calculate the corrected statistical line loss $L$ of the transformer area from the corrected complete measurement data output in Step 3:
where $S$ is the power supplied to the transformer area, $p_i$ is the electricity consumption of the $i$-th user, $n$ is the total number of users in the transformer area, and $i$ indexes the users;
Step 6: output the complete measurement data corrected in Step 3 and the corrected statistical line loss $L$ of the transformer area from Step 5.
In a preferred embodiment, in Step 1, the measurement data of the low-voltage transformer area comprise the electric energy, voltage, and current data of the single-phase and three-phase smart meters in the low-voltage transformer area. Data such as electric energy, voltage, and current are unique.
In addition, the Markov-process data completion method used in Step 1 and Step 3 is as follows:
State space division: construct a training set from a segment of historical electric energy, voltage, and current data continuously acquired by the smart meter, and divide the training set into $k$ state spaces according to the maximum sampled value $a_{max}$ and the minimum sampled value $a_{min}$ in the training set and the specified accuracy of the completed data;
Markov state transition matrix: calculate the transition probability between each pair of states using the Markov state transition probability formula to obtain the forward and reverse Markov transition matrices $P$ of the samples. The Markov state transition probability $P_{mn}$ is given by:
where $s(m)$ is the probability of state $m$, i.e., the probability that the measured electric energy, voltage, or current value is $m$; $s(n|m)$ is the probability that the next state is $n$ given state $m$, i.e., the probability that the next measured value is $n$ when the measured value is $m$;
for the electric energy, voltage, and current measurement data to be completed in a subsequent segment of the same time scale, two interpolation values $I_1$ and $I_2$ are obtained from the forward and reverse initial states $m$ and $n$ of the data to be completed and the forward and reverse Markov transition matrices $P$;
Completion value calculation: the forward and reverse interpolation values $I_1$ and $I_2$ obtained from the Markov transition matrices are weighted and summed to obtain the final interpolation value $I$. The weighted sum is calculated as:
where $z$ is the difference between the frequencies with which the forward and reverse initial states $m$ and $n$ of the interpolated value occur in the training set, and $A(z)$ is a ridge-type distribution function of $z$.
Taking the electricity consumption measurement data of users in an actual transformer area as an example, the state of the actual value closest to the missing value is taken as the initial state; the missing value is preliminarily completed from both the forward and reverse directions through one or more state transition steps; the candidate completion values are weighted and summed; and the missing values are completed in sequence to obtain complete time series data.
In a preferred embodiment, in Step 2, the complete measurement data output in Step 1 are clustered with a mean shift clustering algorithm and abnormal data are identified, as follows:
mean shift clustering uses the mean shift vector to update each candidate center point to the mean of the points within the sliding window, gradually locating the dense regions of the voltage data and thereby fixing the center point of each cluster; if the distance from a voltage data point under test to every cluster center exceeds a set threshold, the data point belongs to no cluster and is marked as abnormal data;
taking the voltage measurement data of a low-voltage transformer area as an example, abnormal measured voltages are detected as shown in FIG. 2. The low-voltage transformer area supplies its users through a three-phase supply, and each user belongs to one of the phases; clustering reveals that the voltage measurement of one user is an abnormal outlier.
In a preferred embodiment, the mean shift vector represents the magnitude and direction of the offset from the center point, and is used to decide whether the center point iteration has finished and to calculate the new center point. The mean shift vector $M_h$ is given by:
$$M_h(y) = \frac{\sum_{q=1}^{N} G\left(\left\|\frac{x_q - y}{h}\right\|^2\right)(x_q - y)}{\sum_{q=1}^{N} G\left(\left\|\frac{x_q - y}{h}\right\|^2\right)}$$
where $y$ is a sample point randomly selected or specified as the initial cluster center; $x_q$ is a sample point of the smart meter measurement time series; $G(\cdot)$ is a kernel function, commonly a Gaussian kernel; $h$ is the kernel width; $N$ is the total number of smart meter measurement data points; and $q$ denotes the $q$-th time point in the smart meter measurement time series.
In a preferred embodiment, in step four, the Pearson correlation coefficient method is used to perform correlation analysis on the low-voltage transformer area voltage data in the corrected complete measurement data and to correct the household-transformer relationship in the transformer area records, specifically as follows:
the correlation between the historical voltage data of the smart meters in the transformer area is calculated as:
where C_bd is the correlation coefficient between smart meter b and smart meter d; u_b and u_d are the historical voltage data of smart meter b and smart meter d, respectively; ū_b and ū_d are the corresponding mean values; and n is the total number of users in the transformer area;
the association feature vector Q_b of each smart meter is calculated based on the correlation calculation formula (6):
where n is the total number of users in the transformer area and b denotes the b-th smart meter. A feature matrix Q is built from the association feature vectors Q_b of the smart meters; the smart meters with an abnormal household-transformer relationship are found by the isolation forest algorithm, the correct household-transformer relationship is determined from the association feature vectors, and the household-transformer relationship in the records is corrected.
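The correlation step can be sketched as follows. The sketch assumes the textbook Pearson coefficient computed over the time samples of two meters and takes Q_b to be the row of coefficients [C_b1, ..., C_bn]. The voltage histories are hypothetical, and the final scoring step is a simple stand-in (mean correlation with the other meters), not the isolation forest algorithm named in the text.

```python
import math

def pearson(u_b, u_d):
    """Pearson correlation coefficient between two equally long voltage series."""
    mean_b = sum(u_b) / len(u_b)
    mean_d = sum(u_d) / len(u_d)
    cov = sum((x - mean_b) * (y - mean_d) for x, y in zip(u_b, u_d))
    var_b = sum((x - mean_b) ** 2 for x in u_b)
    var_d = sum((y - mean_d) ** 2 for y in u_d)
    return cov / math.sqrt(var_b * var_d)

def feature_matrix(series):
    """Row b is the association feature vector Q_b = [C_b1, ..., C_bn] (assumed layout)."""
    return [[pearson(u_b, u_d) for u_d in series] for u_b in series]

# Hypothetical voltage histories (volts); the last meter follows a different profile.
meters = [
    [230.0, 231.0, 229.0, 232.0, 228.0, 230.5],
    [229.5, 230.6, 228.4, 231.7, 227.6, 230.1],
    [230.2, 231.1, 229.3, 232.2, 228.1, 230.4],
    [228.0, 227.0, 231.0, 226.5, 232.0, 229.0],
]
Q = feature_matrix(meters)

# Stand-in score (NOT an isolation forest): mean correlation of each meter
# with all other meters; the lowest score marks the suspect meter.
scores = [(sum(row) - row[b]) / (len(row) - 1) for b, row in enumerate(Q)]
suspect = scores.index(min(scores))
```

A meter wired to a different transformer tends to track a different voltage profile, so its row of Q stands apart from the rows of its neighbours; any outlier detector applied to the rows of Q exploits that separation.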
Taking the actual statistical line loss data of a low-voltage transformer area as an example, as shown in Fig. 3, the method of the invention performs missing-data completion, abnormal-data correction and household-transformer relationship correction, so that the abnormal and erroneous data in the actually calculated line loss are well corrected.
The method has simple steps and a small computational load, and is suitable for quickly and effectively cleaning and treating low-voltage transformer area data at the edge side. It provides a complete and effective method for that purpose: it handles missing and abnormal data in the measurement data of the low-voltage transformer area, corrects the household-transformer relationship and the line loss data of the transformer area based on the cleaned basic measurement data, and significantly improves the quality of low-voltage transformer area big data.
Second aspect of the invention
Corresponding to the method, the invention provides a low-voltage transformer area data cleaning and treatment device suitable for edge calculation, which comprises the following units: a data completion unit, an abnormal data identification unit, an abnormal data correction unit, a household-transformer relationship correction unit, a line loss data calculation unit and a data output unit;
the data completion unit is used for performing missing data completion on the measured data of the low-voltage distribution area by adopting a Markov process missing data completion method to obtain complete measured data of each information acquisition device of the low-voltage distribution area;
the abnormal data identification unit is used for clustering the complete measurement data output by the data completion unit by adopting a mean shift clustering algorithm to identify abnormal data;
the abnormal data correction unit is used for correcting the abnormal data identified by the abnormal data identification unit by adopting a Markov process missing data completion method and outputting corrected complete measurement data;
the household-transformer relationship correction unit is used for performing correlation analysis on the low-voltage transformer area voltage data in the corrected complete measurement data by the Pearson correlation coefficient method and correcting the household-transformer relationship in the transformer area records;
the line loss data calculation unit is used for calculating, on the basis of the corrected household-transformer relationship in the transformer area records and the corrected complete measurement data output by the abnormal data correction unit, the corrected transformer area statistical line loss data L:
where S is the energy supplied to the transformer area, p_i is the electricity consumption of user i, n is the total number of users in the transformer area, and i denotes the i-th user;
the data output unit is used for outputting the corrected complete measurement data output by the abnormal data correction unit and the corrected transformer area statistical line loss data L output by the line loss data calculation unit.
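The line loss calculation performed by the line loss data calculation unit can be illustrated with a short sketch. It assumes the usual definitions, loss = S − Σ p_i and loss rate = loss / S; this text does not show which of the two the symbol L denotes, so the function returns both. The function name and the energy values are hypothetical.

```python
def statistical_line_loss(supply, user_consumption):
    """Return (loss, loss_rate) for one transformer area.

    Assumed definitions: loss = S - sum(p_i), loss_rate = loss / S.
    """
    if supply <= 0:
        raise ValueError("supply must be positive")
    loss = supply - sum(user_consumption)
    return loss, loss / supply

# Hypothetical daily energy values in kWh
S = 1000.0
p = [310.0, 275.5, 204.5, 170.0]
loss, rate = statistical_line_loss(S, p)
```

Because the sum runs over the users assigned to the area, a user recorded under the wrong transformer distorts the result directly, which is why the household-transformer relationship is corrected before this step.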
In the data completion unit, the measurement data of the low-voltage transformer area comprise the electric energy, voltage and current data of the single-phase and three-phase smart meters in the low-voltage transformer area.
The Markov process data completion method used in the data completion unit and the abnormal data correction unit is specifically as follows:
dividing the state space: a training set is constructed from a segment of historical electric energy, voltage and current data continuously acquired by the smart meter, and, according to the maximum sampled value a_max and the minimum sampled value a_min in the training set and the specified accuracy of the completed data, the training set is divided into k state spaces;
Markov state transition matrix: the transition probabilities between the states are calculated with the Markov state transition probability formula to obtain the sampled Markov forward and reverse transition matrices P; the Markov state transition probability P_mn is expressed as:
where s(m) is the probability of being in state m, i.e., the probability that the measured value of electric energy, voltage or current is m; s(n|m) is the probability that the next state is n given the current state m, i.e., the probability that the measured value following m is n;
for a subsequent segment of electric energy, voltage and current measurement data of the same time scale that is to be completed, two interpolated values I_1 and I_2 are obtained from its forward and reverse initial states m and n and the sampled Markov forward and reverse transition matrices P, respectively;
calculating the completion value: the forward and reverse interpolated values I_1 and I_2 obtained from the Markov transition matrices are weighted and summed to obtain the final interpolated value I; the weighted sum is calculated as:
where z is the difference between the frequencies with which the forward and reverse initial states m and n of the interpolation occur in the training set; and A(z) is a ridge-type distribution function of z.
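The three steps above can be sketched for a single missing sample. The following rests on several assumptions not confirmed by this text: equal-width states, transition probabilities estimated as row-normalised transition counts, interpolated values taken as the midpoint of the most probable successor state, the weighting I = A(z)·I_1 + (1 − A(z))·I_2, and a rising ridge-type function with breakpoints ±0.5. All function names and data are hypothetical.

```python
import math

def to_state(value, a_min, a_max, k):
    """Map a measurement to one of k equal-width states (0 .. k-1)."""
    if a_max == a_min:
        return 0
    idx = int((value - a_min) / (a_max - a_min) * k)
    return min(max(idx, 0), k - 1)

def transition_matrix(states, k):
    """Row-normalised counts of state m being followed by state n."""
    counts = [[0] * k for _ in range(k)]
    for m, n in zip(states, states[1:]):
        counts[m][n] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

def most_likely(row, fallback):
    """Most probable successor state, or `fallback` if the state was never left."""
    return row.index(max(row)) if any(row) else fallback

def ridge(z, a1=-0.5, a2=0.5):
    """Rising ridge-type membership function (assumed form and breakpoints)."""
    if z <= a1:
        return 0.0
    if z > a2:
        return 1.0
    return 0.5 + 0.5 * math.sin(math.pi / (a2 - a1) * (z - (a1 + a2) / 2))

def complete(prev_value, next_value, training, k):
    """Fill one missing sample lying between prev_value and next_value."""
    a_min, a_max = min(training), max(training)
    width = (a_max - a_min) / k
    states = [to_state(v, a_min, a_max, k) for v in training]
    p_fwd = transition_matrix(states, k)         # forward matrix
    p_rev = transition_matrix(states[::-1], k)   # reverse matrix
    m = to_state(prev_value, a_min, a_max, k)    # forward initial state
    n = to_state(next_value, a_min, a_max, k)    # reverse initial state
    i1 = a_min + (most_likely(p_fwd[m], m) + 0.5) * width  # forward value I_1
    i2 = a_min + (most_likely(p_rev[n], n) + 0.5) * width  # reverse value I_2
    z = (states.count(m) - states.count(n)) / len(states)  # frequency difference
    w = ridge(z)                                 # assumed weight on I_1
    return w * i1 + (1 - w) * i2

# Hypothetical voltage history (volts) and one gap between 229 V and 231 V
history = [228, 229, 230, 231, 232, 231, 230, 229, 228, 229, 230, 231, 232]
filled = complete(229, 231, history, k=4)
```

Weighting by the frequency difference z favours the direction whose initial state was seen more often in training, on the reasoning that its transition row is estimated from more samples.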
The abnormal data identification unit clusters the complete measurement data output by the data completion unit with a mean shift clustering algorithm to identify abnormal data, specifically as follows:
mean shift clustering uses the mean shift vector to update the candidate center point to the mean of the points within the sliding window, gradually locating the dense regions of the voltage data and thereby positioning the center point of each cluster; if the distance from a voltage data point under test to every cluster center point is larger than a set threshold, the data point does not belong to any cluster and is marked as abnormal data;
wherein the mean shift vector represents the magnitude and direction of the offset from the center point and is used to determine whether the center-point iteration has finished and to calculate the new center point; the mean shift vector M_h is expressed as:
where y is the initial cluster center, a sample point selected at random or specified; x_q is a sample point, i.e., a value in the time series of smart meter measurement data; G(·) is the kernel function, commonly the Gaussian kernel; h is the kernel width; N is the total number of smart meter measurement data points; and q denotes the q-th time instant in the time series of smart meter measurement data.
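The sliding-window update and the threshold test described for the abnormal data identification unit can be sketched as below. For brevity the sketch uses a flat window of radius h (every point inside the window weighs the same) instead of a Gaussian kernel, and it discards windows holding fewer than `min_points` samples so that an isolated reading does not form its own cluster. Both choices, the function names and the readings are assumptions made for illustration.

```python
def mean_shift_centers(points, h, min_points=3, tol=1e-6, max_iter=100):
    """Cluster centers found by sliding a window of radius h to the local mean."""
    centers = []
    for y in points:                      # every sample seeds one candidate center
        for _ in range(max_iter):
            window = [x for x in points if abs(x - y) <= h]
            new_y = sum(window) / len(window)
            if abs(new_y - y) < tol:      # shift is negligible: converged
                break
            y = new_y
        is_new = all(abs(y - c) > h / 2 for c in centers)
        if len(window) >= min_points and is_new:
            centers.append(y)             # keep dense, not-yet-seen centers only
    return centers

def is_abnormal(value, centers, threshold):
    """Abnormal if the value is farther than `threshold` from every cluster center."""
    return all(abs(value - c) > threshold for c in centers)

# Hypothetical user voltages on one phase (volts); 205.0 is the outlier
readings = [229.6, 230.1, 230.4, 229.9, 230.2, 230.0, 205.0]
centers = mean_shift_centers(readings, h=1.0)
flags = [is_abnormal(v, centers, threshold=3.0) for v in readings]
```

In this arrangement `h` controls how wide a dense region may be, while `threshold` controls how far a reading may sit from a center before it is flagged, so the two can be tuned separately.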
The household variable relationship correction unit adopts a Pearson correlation coefficient method to perform correlation analysis on the voltage data of the low-voltage distribution room in the corrected complete measurement data, and corrects the household variable relationship of the distribution room file, which specifically comprises the following steps:
the calculation formula of the correlation degree between the historical voltage data of the intelligent ammeter in the transformer area is as follows:
in the formula (I), the compound is shown in the specification,C bd intelligent ammeterbAnd intelligent ammeterdA correlation coefficient between;u b andu d respectively representing intelligent electric meterbAnd intelligent electric meterdHistorical voltage data of (a);andis the average value of the values,nthe total number of users in the platform area;
calculating the association feature vector of each intelligent electric meter based on the association degree calculation formula (6)Q b :
nIs the total number of users in the cell,bis shown asbThe intelligent electric meters are used for providing the associated characteristic vectors of the intelligent electric metersQ b Building feature matricesQDisclosure of the inventionAnd finding out the intelligent electric meter with abnormal household variation relation through an isolated forest algorithm, checking the correct household variation relation through the associated characteristic vector, and correcting the household variation relation of the file.
It should be noted that the apparatus provided in the embodiment of the present invention has the same or similar details and effects as those of the method in the embodiment described above, and is not repeated herein.
Although the embodiments of the present invention have been described in detail with reference to the accompanying drawings, the embodiments of the present invention are not limited to the specific details of the above embodiments, and various simple modifications may be made to the technical solutions of the embodiments of the present invention within the technical idea of the embodiments of the present invention, and these simple modifications all belong to the protection scope of the embodiments of the present invention.
Claims (10)
1. A low-voltage transformer area data cleaning and treatment method suitable for edge calculation, characterized by comprising the following steps:
step one: performing missing data completion on the measurement data of the low-voltage transformer area by a Markov process missing data completion method to obtain complete measurement data of each information acquisition device of the low-voltage transformer area;
step two: clustering the complete measurement data output in the step one by adopting a mean shift clustering algorithm, and identifying abnormal data;
step three: correcting the abnormal data identified in the step two by adopting a Markov process missing data completion method, and outputting corrected complete measurement data;
step four: performing correlation analysis on the low-voltage transformer area voltage data in the corrected complete measurement data by the Pearson correlation coefficient method, and correcting the household-transformer relationship in the transformer area records;
step five: on the basis of the corrected household-transformer relationship in the transformer area records, calculating the corrected transformer area statistical line loss data L from the corrected complete measurement data output in step three:
where S is the energy supplied to the transformer area, p_i is the electricity consumption of user i, n is the total number of users in the transformer area, and i denotes the i-th user;
step six: outputting the complete measurement data corrected in step three and the transformer area statistical line loss data L corrected in step five.
2. The low-voltage transformer area data cleaning and treatment method suitable for edge calculation according to claim 1, wherein in step one, the measurement data of the low-voltage transformer area comprise the electric energy, voltage and current data of the single-phase and three-phase smart meters in the low-voltage transformer area.
3. The low-voltage transformer area data cleaning and treatment method suitable for edge calculation according to claim 1 or 2, characterized in that the Markov process data completion method in step one and step three is as follows:
dividing the state space: a training set is constructed from a segment of historical electric energy, voltage and current data continuously acquired by the smart meter, and, according to the maximum sampled value a_max and the minimum sampled value a_min in the training set and the specified accuracy of the completed data, the training set is divided into k state spaces;
Markov state transition matrix: the transition probabilities between the states are calculated with the Markov state transition probability formula to obtain the sampled Markov forward and reverse transition matrices P; the Markov state transition probability P_mn is expressed as:
where s(m) is the probability of being in state m, i.e., the probability that the measured value of electric energy, voltage or current is m; s(n|m) is the probability that the next state is n given the current state m, i.e., the probability that the measured value following m is n;
for a subsequent segment of electric energy, voltage and current measurement data of the same time scale that is to be completed, two interpolated values I_1 and I_2 are obtained from its forward and reverse initial states m and n and the sampled Markov forward and reverse transition matrices P, respectively;
calculating the completion value: the forward and reverse interpolated values I_1 and I_2 obtained from the Markov transition matrices are weighted and summed to obtain the final interpolated value I; the weighted sum is calculated as:
where z is the difference between the frequencies with which the forward and reverse initial states m and n of the interpolation occur in the training set; and A(z) is a ridge-type distribution function of z.
4. The low-voltage transformer area data cleaning and treatment method suitable for edge calculation according to claim 1 or 2, wherein in the second step, a mean shift clustering algorithm is adopted to cluster the complete measurement data output in the first step, and abnormal data is identified, specifically as follows:
mean shift clustering uses the mean shift vector to update the candidate center point to the mean of the points within the sliding window, gradually locating the dense regions of the voltage data and thereby positioning the center point of each cluster; if the distance from a voltage data point under test to every cluster center point is larger than a set threshold, the data point does not belong to any cluster and is marked as abnormal data;
wherein the mean shift vector represents the magnitude and direction of the offset from the center point and is used to determine whether the center-point iteration has finished and to calculate the new center point; the mean shift vector M_h is expressed as:
where y is the initial cluster center, a sample point selected at random or specified; x_q is a sample point, i.e., a value in the time series of smart meter measurement data; G(·) is the kernel function, commonly the Gaussian kernel; h is the kernel width; N is the total number of smart meter measurement data points; and q denotes the q-th time instant in the time series of smart meter measurement data.
5. The low-voltage transformer area data cleaning and treatment method suitable for edge calculation according to claim 1 or 2, wherein in step four, the Pearson correlation coefficient method is used to perform correlation analysis on the low-voltage transformer area voltage data in the corrected complete measurement data and to correct the household-transformer relationship in the transformer area records, specifically as follows:
the correlation between the historical voltage data of the smart meters in the transformer area is calculated as:
where C_bd is the correlation coefficient between smart meter b and smart meter d; u_b and u_d are the historical voltage data of smart meter b and smart meter d, respectively; ū_b and ū_d are the corresponding mean values; and n is the total number of users in the transformer area;
the association feature vector Q_b of each smart meter is calculated based on the correlation calculation formula (6):
where n is the total number of users in the transformer area and b denotes the b-th smart meter. A feature matrix Q is built from the association feature vectors Q_b of the smart meters; the smart meters with an abnormal household-transformer relationship are found by the isolation forest algorithm, the correct household-transformer relationship is determined from the association feature vectors, and the household-transformer relationship in the records is corrected.
6. A low-voltage transformer area data cleaning and treatment device suitable for edge calculation, characterized by comprising the following units: a data completion unit, an abnormal data identification unit, an abnormal data correction unit, a household-transformer relationship correction unit, a line loss data calculation unit and a data output unit;
the data completion unit is used for performing missing data completion on the measured data of the low-voltage distribution area by adopting a Markov process missing data completion method to obtain complete measured data of each information acquisition device of the low-voltage distribution area;
the abnormal data identification unit is used for clustering the complete measurement data output by the data completion unit by adopting a mean shift clustering algorithm to identify abnormal data;
the abnormal data correction unit is used for correcting the abnormal data identified by the abnormal data identification unit by adopting a Markov process missing data completion method and outputting corrected complete measurement data;
the household-transformer relationship correction unit is used for performing correlation analysis on the low-voltage transformer area voltage data in the corrected complete measurement data by the Pearson correlation coefficient method and correcting the household-transformer relationship in the transformer area records;
the line loss data calculation unit is used for calculating, on the basis of the corrected household-transformer relationship in the transformer area records and the corrected complete measurement data output by the abnormal data correction unit, the corrected transformer area statistical line loss data L:
where S is the energy supplied to the transformer area, p_i is the electricity consumption of user i, n is the total number of users in the transformer area, and i denotes the i-th user;
the data output unit is used for outputting the corrected complete measurement data output by the abnormal data correction unit and the corrected transformer area statistical line loss data L output by the line loss data calculation unit.
7. The low-voltage transformer area data cleaning and treatment device suitable for edge calculation according to claim 6, wherein in the data completion unit, the measurement data of the low-voltage transformer area comprise the electric energy, voltage and current data of the single-phase and three-phase smart meters in the low-voltage transformer area.
8. The low-voltage transformer area data cleaning and treatment device suitable for edge calculation according to claim 6 or 7, wherein the Markov process data completion method in the data completion unit and the abnormal data correction unit is as follows:
dividing the state space: a training set is constructed from a segment of historical electric energy, voltage and current data continuously acquired by the smart meter, and, according to the maximum sampled value a_max and the minimum sampled value a_min in the training set and the specified accuracy of the completed data, the training set is divided into k state spaces;
Markov state transition matrix: the transition probabilities between the states are calculated with the Markov state transition probability formula to obtain the sampled Markov forward and reverse transition matrices P; the Markov state transition probability P_mn is expressed as:
where s(m) is the probability of being in state m, i.e., the probability that the measured value of electric energy, voltage or current is m; s(n|m) is the probability that the next state is n given the current state m, i.e., the probability that the measured value following m is n;
for a subsequent segment of electric energy, voltage and current measurement data of the same time scale that is to be completed, two interpolated values I_1 and I_2 are obtained from its forward and reverse initial states m and n and the sampled Markov forward and reverse transition matrices P, respectively;
calculating the completion value: the forward and reverse interpolated values I_1 and I_2 obtained from the Markov transition matrices are weighted and summed to obtain the final interpolated value I; the weighted sum is calculated as:
where z is the difference between the frequencies with which the forward and reverse initial states m and n of the interpolation occur in the training set; and A(z) is a ridge-type distribution function of z.
9. The low-voltage transformer area data cleaning and treatment device suitable for edge calculation according to claim 6 or 7, wherein the abnormal data identification unit clusters the complete measurement data output by the data completion unit with a mean shift clustering algorithm to identify abnormal data, specifically as follows:
mean shift clustering uses the mean shift vector to update the candidate center point to the mean of the points within the sliding window, gradually locating the dense regions of the voltage data and thereby positioning the center point of each cluster; if the distance from a voltage data point under test to every cluster center point is larger than a set threshold, the data point does not belong to any cluster and is marked as abnormal data;
wherein the mean shift vector represents the magnitude and direction of the offset from the center point and is used to determine whether the center-point iteration has finished and to calculate the new center point; the mean shift vector M_h is expressed as:
where y is the initial cluster center, a sample point selected at random or specified; x_q is a sample point, i.e., a value in the time series of smart meter measurement data; G(·) is the kernel function, commonly the Gaussian kernel; h is the kernel width; N is the total number of smart meter measurement data points; and q denotes the q-th time instant in the time series of smart meter measurement data.
10. The low-voltage transformer area data cleaning and treatment device suitable for edge calculation according to claim 6 or 7, wherein the household-transformer relationship correction unit performs correlation analysis on the low-voltage transformer area voltage data in the corrected complete measurement data by the Pearson correlation coefficient method to correct the household-transformer relationship in the transformer area records, specifically as follows:
the correlation between the historical voltage data of the smart meters in the transformer area is calculated as:
where C_bd is the correlation coefficient between smart meter b and smart meter d; u_b and u_d are the historical voltage data of smart meter b and smart meter d, respectively; ū_b and ū_d are the corresponding mean values; and n is the total number of users in the transformer area;
the association feature vector Q_b of each smart meter is calculated based on the correlation calculation formula (6):
where n is the total number of users in the transformer area and b denotes the b-th smart meter. A feature matrix Q is built from the association feature vectors Q_b of the smart meters; the smart meters with an abnormal household-transformer relationship are found by the isolation forest algorithm, the correct household-transformer relationship is determined from the association feature vectors, and the household-transformer relationship in the records is corrected.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211269752.9A CN115344567A (en) | 2022-10-18 | 2022-10-18 | Low-voltage transformer area data cleaning and treatment method and device suitable for edge calculation |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115344567A true CN115344567A (en) | 2022-11-15 |
Family
ID=83957210
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20221115 |