CN114189313B - Ammeter data reconstruction method and device - Google Patents
- Publication number
- CN114189313B (application CN202111311460.2A)
- Authority
- CN
- China
- Prior art keywords
- reconstruction
- data set
- reconstructed
- missing
- missing value
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L1/00—Arrangements for detecting or preventing errors in the information received
- H04L1/0078—Avoidance of errors by organising the transmitted data in a format specifically designed to deal with errors, e.g. location
- H04L1/0079—Formats for control data
- H04L1/0082—Formats for control data fields explicitly indicating existence of error in data being transmitted, e.g. so that downstream stations can avoid decoding erroneous packet; relays
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention provides an ammeter data reconstruction method and device. A station area ammeter data set containing missing values is obtained, and the data set to be reconstructed is screened out according to the missing mechanism and missing mode of the samples in the data set. A preset lower-layer cyclic regression reconstruction algorithm and a preset upper-layer cyclic regression reconstruction algorithm are used to perform double-layer cyclic regression reconstruction on the missing values, giving a double-layer reconstruction result. The upper and lower bounds of the missing values are then determined from the energy conservation relation between the total table and the sub-tables, and the upper and lower bound reconstruction results are obtained, which further improves the reconstruction precision. Finally, a mutation processing mechanism is applied to the mutation values among the upper and lower bound reconstruction results to obtain the final reconstruction result. Compared with existing methods that reconstruct missing data through a neural network, the method can reconstruct missing ammeter data without training on massive, complete data, and has high reconstruction accuracy.
Description
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a method and an apparatus for reconstructing ammeter data.
Background
Complete meter data is an important basis for high-precision assessment of the meter operating state. In an actual operating environment, however, ammeter data may be affected by factors such as equipment faults and human interference during measurement, communication and storage, so that data goes missing. Depending on the positions and the number of the missing values, the ammeter data may no longer reflect the actual state of the power system. Reconstructing the missing ammeter data is therefore of great significance for realizing a digital power grid.
Disclosure of Invention
The embodiment of the invention provides an ammeter data reconstruction method and device, which can reconstruct missing data of an ammeter and have high reconstruction accuracy.
The embodiment of the invention provides an ammeter data reconstruction method, which comprises the following steps:
acquiring a station area ammeter data set containing a missing value, and screening out a data set to be reconstructed according to a missing mechanism and a missing mode of each sample of the station area ammeter data set; the data set to be reconstructed comprises a plurality of samples to be reconstructed, and the station area ammeter comprises a total table and at least one sub table;
performing lower-layer reconstruction on the missing values based on the data set to be reconstructed by adopting a preset lower-layer cyclic regression reconstruction algorithm to obtain a lower-layer reconstruction result, and replacing the missing values of the data set to be reconstructed with the lower-layer reconstruction result to obtain a lower-layer reconstruction data set;
performing upper-layer reconstruction on the missing values based on the lower-layer reconstruction data set by adopting a preset upper-layer cyclic regression reconstruction algorithm to obtain a double-layer reconstruction result of the missing values;
determining upper and lower bounds of the missing value according to an ammeter error solving equation based on the law of energy conservation, and obtaining upper and lower bound reconstruction results of the missing value according to a comparison result of the double-layer reconstruction result of the missing value and the upper and lower bounds;
when the upper and lower bound reconstruction results of the missing value meet a preset mutation condition, processing the upper and lower bound reconstruction results meeting the preset mutation condition according to a preset mutation processing mechanism to obtain the final reconstruction result of the missing value.
As an improvement of the above scheme, after determining the upper and lower bounds of the missing value according to the ammeter error solving equation based on the law of energy conservation, and obtaining the upper and lower bound reconstruction results of the missing value according to the comparison result of the double-layer reconstruction result of the missing value and the upper and lower bounds, the method further includes:
when the upper and lower bound reconstruction results of the missing value do not meet the preset mutation condition, taking the upper and lower bound reconstruction results of the missing value as the final reconstruction result of the missing value.
As an improvement of the above scheme, the lower layer cyclic regression reconstruction algorithm specifically includes:
screening out the sub-tables containing missing values from the data set to be reconstructed to obtain at least one lower-layer sub-table to be reconstructed, and determining the lower-layer reconstruction priority by comparing the numbers of missing values; wherein the lower-layer sub-table to be reconstructed with the fewest missing values has the highest priority;
reconstructing the missing values of each lower-layer sub-table to be reconstructed in turn based on the data set to be reconstructed, in descending order of lower-layer reconstruction priority, to obtain an initial lower-layer reconstruction result of the missing values, and updating the missing values of the data set to be reconstructed to the initial lower-layer reconstruction result to obtain an initial lower-layer reconstruction data set;
when the deviation between the initial lower-layer reconstruction result obtained in a given round of lower-layer reconstruction and the initial lower-layer reconstruction result obtained in the previous round is within a preset range, taking the initial lower-layer reconstruction result as the lower-layer reconstruction result of the missing values and taking the initial lower-layer reconstruction data set as the lower-layer reconstruction data set; otherwise, updating the data set to be reconstructed to the initial lower-layer reconstruction data set, and returning to the step of reconstructing the missing values of each lower-layer sub-table to be reconstructed in turn based on the data set to be reconstructed, in descending order of lower-layer reconstruction priority.
As an improvement of the above scheme, reconstructing the missing values of each lower-layer sub-table to be reconstructed in turn based on the data set to be reconstructed, in descending order of lower-layer reconstruction priority, to obtain an initial lower-layer reconstruction result of the missing values, and updating the missing values of the data set to be reconstructed to the initial lower-layer reconstruction result to obtain an initial lower-layer reconstruction data set, specifically includes:
selecting the lower-layer sub-table to be reconstructed with the highest lower-layer reconstruction priority as the current lower-layer sub-table to be reconstructed;
replacing the data of the current lower-layer sub-table to be reconstructed in the data set to be reconstructed with the total table data to obtain a first state data set;
reconstructing the missing values of the current lower-layer sub-table to be reconstructed based on the first state data set to obtain an initial lower-layer reconstruction result, and updating the missing values of the current lower-layer sub-table to be reconstructed in the data set to be reconstructed to the initial lower-layer reconstruction result to obtain a second state data set;
when the missing values of all lower-layer sub-tables to be reconstructed have been updated, taking the second state data set as the initial lower-layer reconstruction data set; otherwise, designating the next lower-layer sub-table to be reconstructed as the current lower-layer sub-table to be reconstructed according to the lower-layer reconstruction priority, updating the data set to be reconstructed to the second state data set, and returning to the step of replacing the data of the current lower-layer sub-table to be reconstructed in the data set to be reconstructed with the total table data to obtain the first state data set.
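One pass of the steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the column layout (a dict of lists with `None` marking a missing value), the function name `lower_layer_pass`, and the share-of-total estimator that stands in for the preset filling algorithm are all hypothetical choices made for the example.

```python
def lower_layer_pass(data, total, order):
    """Run one pass over the sub-tables in priority order.

    data  : dict mapping sub-table name -> list of readings (None = missing)
    total : list of total table readings, one per sample
    order : sub-table names, highest reconstruction priority first
    """
    state = {name: list(col) for name, col in data.items()}
    for current in order:
        target = state[current]
        # First state data set: the current sub-table column is replaced
        # by the total table data.
        first_state = {**state, current: list(total)}
        feature = first_state[current]
        observed = [i for i, v in enumerate(target) if v is not None]
        # Stand-in estimator (assumption): the sub-table's average share
        # of the total table reading, learned from the observed rows.
        share = sum(target[i] for i in observed) / sum(feature[i] for i in observed)
        filled = [v if v is not None else share * feature[i]
                  for i, v in enumerate(target)]
        # Second state data set: the missing values of the current
        # sub-table are updated with the reconstruction result.
        state = {**state, current: filled}
    return state


total = [10.0, 20.0, 30.0]
data = {"A": [2.0, None, 6.0], "B": [8.0, 16.0, None]}
result = lower_layer_pass(data, total, order=["A", "B"])
print(result)
```

The input `data` is left unchanged, so the same pass can be repeated on its own output when the outer convergence check asks for another round.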
As an improvement of the above scheme, the upper layer cyclic regression reconstruction algorithm is:
taking the sub-table containing the missing values as an upper-layer sub-table to be reconstructed to obtain at least one upper-layer sub-table to be reconstructed;
determining the upper-layer reconstruction priority by comparing the numbers of missing values; wherein the upper-layer sub-table to be reconstructed with the fewest missing values has the highest priority;
reconstructing the missing values of each upper-layer sub-table to be reconstructed in turn based on the lower-layer reconstruction data set, in descending order of upper-layer reconstruction priority, to obtain an upper-layer reconstruction result of the missing values, and updating the missing values of the lower-layer reconstruction data set to the upper-layer reconstruction result to obtain an upper-layer reconstruction data set;
when the deviation between the upper-layer reconstruction result obtained in a given round of upper-layer reconstruction and the upper-layer reconstruction result obtained in the previous round is within a preset range, taking the upper-layer reconstruction result as the double-layer reconstruction result of the missing values; otherwise, updating the lower-layer reconstruction data set to the upper-layer reconstruction data set, and returning to the step of reconstructing the missing values of each upper-layer sub-table to be reconstructed in turn based on the lower-layer reconstruction data set, in descending order of upper-layer reconstruction priority.
As an improvement of the above scheme, reconstructing the missing values of each upper-layer sub-table to be reconstructed in turn based on the lower-layer reconstruction data set, in descending order of upper-layer reconstruction priority, to obtain an upper-layer reconstruction result of the missing values, and updating the missing values of the lower-layer reconstruction data set to the upper-layer reconstruction result to obtain an upper-layer reconstruction data set, specifically includes:
selecting the sub-table with the highest upper-layer reconstruction priority as the current upper-layer sub-table to be reconstructed;
replacing the data of the current upper-layer sub-table to be reconstructed in the lower-layer reconstruction result with the total table data to obtain a third state data set;
dividing the data of the current upper layer to-be-reconstructed sub-table into complete data and missing data, taking the complete data as a tag of a training set, taking a data part of a third state data set corresponding to the complete data as the training set, and taking a data part of the third state data set corresponding to the missing data as a test set;
inputting the training set into a pre-established machine learning model to obtain a trained reconstruction network;
inputting the test set into the trained reconstruction network to obtain an upper layer reconstruction result of the missing values, and updating the missing values in the lower layer reconstruction result into the upper layer reconstruction result to obtain a fourth state data set;
when all the missing values of the upper layer sub-tables to be reconstructed are updated, the fourth state data set is used as an upper layer reconstruction result; otherwise, designating the next upper layer reconstruction sub-table as the current upper layer reconstruction sub-table according to the upper layer reconstruction priority, updating the lower layer reconstruction result into the fourth state data set, and returning to the step of replacing the current upper layer reconstruction sub-table data in the lower layer reconstruction result with total table data to obtain a third state data set.
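The split into training set, labels and test set described above can be sketched as follows. This is an illustrative sketch only: the dict-of-lists layout, the function names, and the 1-nearest-neighbour regressor standing in for the "pre-established machine learning model" are assumptions made for the example, since the text does not fix a particular model at this point.

```python
import math


def split_for_upper_layer(lower_result, original_target, total, current):
    """Build the training set, labels and test set for one sub-table."""
    # Third state data set: the current sub-table column is replaced
    # by the total table data.
    third_state = {**lower_result, current: list(total)}
    names = sorted(third_state)
    rows = [[third_state[n][i] for n in names] for i in range(len(total))]
    complete = [i for i, v in enumerate(original_target) if v is not None]
    missing = [i for i, v in enumerate(original_target) if v is None]
    train_x = [rows[i] for i in complete]             # training set
    train_y = [original_target[i] for i in complete]  # labels
    test_x = [rows[i] for i in missing]               # test set
    return train_x, train_y, test_x, missing


def nearest_neighbor_predict(train_x, train_y, test_x):
    """Stand-in model (assumption): copy the label of the closest training row."""
    predictions = []
    for x in test_x:
        best = min(range(len(train_x)), key=lambda j: math.dist(train_x[j], x))
        predictions.append(train_y[best])
    return predictions


total = [10.0, 18.0, 30.0, 12.0]
original_a = [2.0, None, 6.0, 2.5]                 # sub-table A before any filling
lower_result = {"A": [2.0, 3.5, 6.0, 2.5],         # after lower-layer reconstruction
                "B": [8.0, 14.5, 24.0, 9.5]}

train_x, train_y, test_x, missing = split_for_upper_layer(
    lower_result, original_a, total, current="A")
predictions = nearest_neighbor_predict(train_x, train_y, test_x)

# Fourth state data set: missing values updated with the upper-layer result.
fourth_state = {name: list(col) for name, col in lower_result.items()}
for i, value in zip(missing, predictions):
    fourth_state["A"][i] = value
print(fourth_state)
```

Any regressor with a fit and predict step could take the place of `nearest_neighbor_predict`; the point of the sketch is which rows become training data and which become test data.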
As an improvement of the above scheme, determining the upper and lower bounds of the missing value according to the ammeter error solving equation based on the law of energy conservation, and obtaining the upper and lower bound reconstruction results of the missing value according to the comparison result of the double-layer reconstruction result of the missing value and the upper and lower bounds, specifically includes:
determining the missing value upper bound of each sample to be reconstructed according to the ammeter error solving equation based on the law of energy conservation;
for the missing value of each sample to be reconstructed, according to a double-layer reconstruction result of the missing value and a first comparison result of the corresponding upper bound of the missing value, adopting a preset upper bound constraint adjustment strategy corresponding to the first comparison result to adjust the double-layer reconstruction result of the missing value, and obtaining an upper bound constraint reconstruction result of the missing value;
and obtaining the upper and lower bound reconstruction results of the missing value corresponding to the second comparison result according to the second comparison result of the upper bound constraint reconstruction result of the missing value and a preset lower bound constraint threshold.
As an improvement of the above scheme, the upper bound constraint adjustment strategy is:
for the missing value of each sample to be reconstructed, when the double-layer reconstruction result of the missing value is greater than the corresponding missing value upper bound, making the double-layer reconstruction result of the missing value equal to the missing value upper bound;
for the missing value of each sample to be reconstructed, when the sum of the double-layer reconstruction results of all the missing values in the sample to be reconstructed is greater than or equal to the corresponding missing value upper bound, adjusting the double-layer reconstruction result of the missing value according to the following formula to obtain the upper bound constraint reconstruction result of the missing value;
wherein $x'_j$ is the upper bound constraint reconstruction result of the missing value of the j-th missing sub-table in the sample to be reconstructed, $x_j$ is the upper-layer reconstruction result of the missing value of the j-th missing sub-table in the sample to be reconstructed, $B_{up}$ is the missing value upper bound, $\eta$ is a preset loss coefficient, and $k$ is the number of ammeters containing missing values in the sample to be reconstructed.
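The two conditions of the upper bound constraint adjustment strategy can be checked with a few lines of Python. This hedged sketch only implements what the text states explicitly: the per-value cap at the missing value upper bound, and the test of whether the sum of the double-layer results reaches that bound. The adjustment formula itself is not implemented here, and the function name `check_upper_bound` is a hypothetical choice for the example.

```python
def check_upper_bound(double_layer_results, upper_bound):
    """Apply the per-value cap and report whether the sum rule is triggered.

    double_layer_results : double-layer results of all missing values in one sample
    upper_bound          : missing value upper bound of that sample
    """
    # Rule 1: a single result above the upper bound is set to the upper bound.
    capped = [min(x, upper_bound) for x in double_layer_results]
    # Rule 2: when the sum of the double-layer results reaches the upper bound,
    # the formula-based adjustment applies (not implemented in this sketch).
    sum_rule_triggered = sum(double_layer_results) >= upper_bound
    return capped, sum_rule_triggered


capped_a, triggered_a = check_upper_bound([3.0, 7.5], upper_bound=6.0)
capped_b, triggered_b = check_upper_bound([1.0, 2.0], upper_bound=6.0)
print(capped_a, triggered_a)
print(capped_b, triggered_b)
```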
As an improvement of the above scheme, the preset mutation condition is: the difference between the maximum value and the minimum value among the 2r pieces of same-table data nearest to the position of the missing value is greater than a preset mutation threshold, wherein the same-table data are data of the table in which the missing value is located, and r ≥ 1;
when the upper and lower bound reconstruction results of the missing value meet the preset mutation condition, processing the upper and lower bound reconstruction results meeting the preset mutation condition according to a preset mutation processing mechanism to obtain the final reconstruction result of the missing value specifically includes:
when the difference between the maximum value and the minimum value among the 2r pieces of same-table data nearest to the position of the missing value is greater than the preset mutation threshold, calculating the average value of the 2r pieces of data, and taking the average value as the final reconstruction result of the missing value.
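The mutation condition and its processing can be illustrated with a short Python sketch. The function name `resolve_mutation`, the list-with-`None` layout, and the tie-breaking rule used to pick the 2r nearest readings (closest index first, lower index on ties, other missing positions skipped) are assumptions for the example rather than details given in the text.

```python
def resolve_mutation(series, pos, bounded_result, r, threshold):
    """Return the final reconstruction result for the missing value at `pos`.

    series         : readings of the table containing the missing value (None = missing)
    bounded_result : upper and lower bound reconstruction result for that position
    r              : 2r nearest same-table readings are inspected (r >= 1)
    threshold      : preset mutation threshold
    """
    candidates = [i for i, v in enumerate(series) if i != pos and v is not None]
    nearest = sorted(candidates, key=lambda i: (abs(i - pos), i))[:2 * r]
    window = [series[i] for i in nearest]
    if max(window) - min(window) > threshold:
        # Mutation condition met: use the mean of the 2r nearest readings.
        return sum(window) / len(window)
    # Otherwise keep the upper and lower bound reconstruction result.
    return bounded_result


jump = [5.0, 5.2, None, 9.0, 9.1]
flat = [5.0, 5.2, None, 5.4, 5.3]
final_jump = resolve_mutation(jump, pos=2, bounded_result=5.9, r=1, threshold=2.0)
final_flat = resolve_mutation(flat, pos=2, bounded_result=5.9, r=1, threshold=2.0)
print(final_jump, final_flat)
```

In the first series the neighbours 5.2 and 9.0 differ by more than the threshold, so their mean replaces the bounded result; in the second series the bounded result is kept.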
The embodiment of the invention also provides an ammeter data reconstruction device, which comprises:
a to-be-reconstructed data set screening module, used for acquiring a station area ammeter data set containing a missing value and screening out the data set to be reconstructed according to a missing mechanism and a missing mode of each sample of the station area ammeter data set; the data set to be reconstructed comprises a plurality of samples to be reconstructed, and the station area ammeter comprises a total table and at least one sub table;
a lower-layer reconstruction data set acquisition module, used for performing lower-layer reconstruction on the missing values based on the data set to be reconstructed by adopting a preset lower-layer cyclic regression reconstruction algorithm to obtain a lower-layer reconstruction result, and replacing the missing values of the data set to be reconstructed with the lower-layer reconstruction result to obtain a lower-layer reconstruction data set;
a double-layer reconstruction result acquisition module, used for performing upper-layer reconstruction on the missing values based on the lower-layer reconstruction data set by adopting a preset upper-layer cyclic regression reconstruction algorithm to obtain a double-layer reconstruction result of the missing values;
an upper and lower bound reconstruction result acquisition module, used for determining the upper and lower bounds of the missing value according to an ammeter error solving equation based on the law of energy conservation, and obtaining the upper and lower bound reconstruction results of the missing value according to a comparison result of the double-layer reconstruction result of the missing value and the upper and lower bounds;
and a final reconstruction result acquisition module, used for processing the upper and lower bound reconstruction results meeting a preset mutation condition according to a preset mutation processing mechanism when the upper and lower bound reconstruction results of the missing value meet the preset mutation condition, so as to obtain the final reconstruction result of the missing value.
Compared with the prior art, the ammeter data reconstruction method provided by the embodiment of the invention obtains a station area ammeter data set containing missing values and screens out the data set to be reconstructed according to the missing mechanism and missing mode of the samples in the data set. It then performs double-layer cyclic regression reconstruction on the missing values by adopting a preset lower-layer cyclic regression reconstruction algorithm and a preset upper-layer cyclic regression reconstruction algorithm to obtain a double-layer reconstruction result. Next, it determines the upper and lower bounds of the missing values according to an ammeter error solving equation based on the law of energy conservation, and obtains the upper and lower bound reconstruction results of the missing values according to a comparison result of the double-layer reconstruction result and the upper and lower bounds. Finally, when the upper and lower bound reconstruction results of a missing value are detected to meet a preset mutation condition, they are processed according to a preset mutation processing mechanism to obtain the final reconstruction result of the missing value. Compared with existing methods that reconstruct missing data through a neural network, the method can reconstruct missing ammeter data without training on massive, complete data, and has high reconstruction accuracy. Correspondingly, the embodiment of the invention also provides an ammeter data reconstruction device.
Drawings
Fig. 1 is a flow chart of an ammeter data reconstruction method according to an embodiment of the present invention;
Fig. 2 is a block diagram of an ammeter data reconstruction device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, fig. 1 is a flow chart of an ammeter data reconstruction method according to an embodiment of the present invention.
The ammeter data reconstruction method provided by the embodiment of the invention comprises the steps of S11 to S15:
Step S11, a station area ammeter data set containing a missing value is obtained, and a data set to be reconstructed is screened out according to a missing mechanism and a missing mode of each sample of the station area ammeter data set; the data set to be reconstructed comprises a plurality of samples to be reconstructed, and the station area ammeter comprises a total table and at least one sub table.
It can be understood that ammeter data may be affected by factors such as equipment faults and human interference during measurement, communication and storage, which produces missing values in the ammeter data.
Specifically, when collecting the station area ammeter data set containing missing values, the power supply total table of the station area and the power consumption sub-tables of all users can be taken as attributes, and the data recorded in one period of the set statistics frequency can be taken as one sample. For example, if the collection frequency is once per day, the total table data and the data of each sub-table within one day constitute one sample.
Illustratively, the district meter dataset may be in the form of the following table 1:
TABLE 1
Here null represents a missing value, and [a1, null, a3, A] is one sample.
The missing mechanism can be classified, according to the integrity of the data distribution, into unbiased missingness and biased missingness: unbiased missingness means that the data distribution of the station area ammeter data set containing missing values is the same as or similar to that of the complete station area ammeter data set, and biased missingness means that the two distributions differ greatly. The missing mode can be classified, according to the user change situation, into user-change missingness and non-user-change missingness: user-change missingness means that the missing value is caused by the user of the sub-table, and non-user-change missingness means that the missing value is caused by reasons other than the user of the sub-table.
The data set to be reconstructed consists of the sub-table data of the samples whose missing mechanism is unbiased missingness and whose missing mode is non-user-change missingness. Taking Table 1 as an example, and assuming that its missing values are unbiased and non-user-change, the screened data set to be reconstructed is shown in Table 2:
TABLE 2
Sub-table 1 | Sub-table 2 | Sub-table 3 |
a1 | null | a3 |
b1 | null | b1 |
null | c2 | c3 |
Taking the data set to be reconstructed in Table 2 as an example, the samples to be reconstructed in step S11 are [a1, null, a3], [b1, null, b1] and [null, c2, c3].
Step S12, performing lower-layer reconstruction on the missing values based on the data set to be reconstructed by adopting a preset lower-layer cyclic regression reconstruction algorithm to obtain a lower-layer reconstruction result, and replacing the missing values of the data set to be reconstructed with the lower-layer reconstruction result to obtain a lower-layer reconstruction data set;
Step S13, performing upper-layer reconstruction on the missing values based on the lower-layer reconstruction data set by adopting a preset upper-layer cyclic regression reconstruction algorithm to obtain a double-layer reconstruction result of the missing values;
Step S14, determining upper and lower bounds of the missing value according to an ammeter error solving equation based on the law of energy conservation, and obtaining upper and lower bound reconstruction results of the missing value according to a comparison result of the double-layer reconstruction result of the missing value and the upper and lower bounds;
Step S15, when the upper and lower bound reconstruction results of the missing value meet a preset mutation condition, processing the upper and lower bound reconstruction results meeting the preset mutation condition according to a preset mutation processing mechanism to obtain the final reconstruction result of the missing value.
In an alternative embodiment, the lower-layer cyclic regression reconstruction algorithm in step S12 includes:
S121, screening out the sub-tables containing missing values from the data set to be reconstructed to obtain at least one lower-layer sub-table to be reconstructed, and determining the lower-layer reconstruction priority by comparing the numbers of missing values; wherein the lower-layer sub-table to be reconstructed with the fewest missing values has the highest priority.
Specifically, assume that sub-tables 1, 2, 3 and 4 contain 0, 4, 1 and 2 missing values respectively. Sub-table 1 has no missing values and needs no reconstruction, so the lower-layer reconstruction priority is: sub-table 3 > sub-table 4 > sub-table 2.
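The priority rule in this example can be reproduced with a short Python sketch. The function name `lower_layer_priority` and the dict input are hypothetical; the ordering rule (skip sub-tables with no missing values, fewest missing values first) is the one stated in S121.

```python
def lower_layer_priority(missing_counts):
    """Return sub-table names ordered by reconstruction priority.

    Sub-tables without missing values are skipped; among the rest, the one
    with the fewest missing values comes first (highest priority).
    """
    candidates = {name: n for name, n in missing_counts.items() if n > 0}
    return sorted(candidates, key=candidates.get)


counts = {"sub-table 1": 0, "sub-table 2": 4, "sub-table 3": 1, "sub-table 4": 2}
order = lower_layer_priority(counts)
print(order)  # ['sub-table 3', 'sub-table 4', 'sub-table 2']
```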
S122, reconstructing the missing values of each lower-layer sub-table to be reconstructed in turn based on the data set to be reconstructed, in descending order of lower-layer reconstruction priority, to obtain an initial lower-layer reconstruction result of the missing values, and updating the missing values of the data set to be reconstructed to the initial lower-layer reconstruction result to obtain an initial lower-layer reconstruction data set.
In the embodiment of the invention, methods that are good at mining linear relationships can be used to perform lower-layer cyclic regression reconstruction on the missing values of each lower-layer sub-table to be reconstructed, giving the lower-layer reconstruction result of the missing values. Such methods reconstruct missing data mainly by adjusting the weights among sub-tables, or by exploiting sample similarity within a sub-table or among sub-tables. They include stratified mean imputation, expectation-maximization imputation, K-nearest-neighbor imputation, hot-deck imputation, cold-deck imputation, regression imputation and Bayesian ridge regression. These methods are generally simple to operate, fast, stable in their results and highly interpretable.
In the embodiment of the invention, starting from the data set to be reconstructed, lower-layer cyclic regression reconstruction is performed on sub-tables 3, 4 and 2 in turn according to the lower-layer reconstruction priority. The missing values are reconstructed with a preset filling algorithm, giving the initial lower-layer reconstruction results of the missing values of sub-table 3, sub-table 4 and sub-table 2, and the data set is updated with these results to obtain the initial lower-layer reconstruction data set.
S123, when the deviation between the initial lower layer reconstruction result obtained by any lower layer reconstruction and the initial lower layer reconstruction result obtained by the last lower layer reconstruction is in a preset range, taking the initial lower layer reconstruction result as a lower layer reconstruction result of a missing value and taking the initial lower layer reconstruction data set as a lower layer reconstruction data set; otherwise, updating the data set to be reconstructed into the initial lower layer reconstruction data set, and returning to the step of reconstructing the missing value of each lower layer to be reconstructed sub-table in turn based on the data set to be reconstructed according to the principle of the maximum first reconstruction of the lower layer reconstruction priority.
For example, the condition that the deviation between the initial lower-layer reconstruction result obtained by any one lower-layer reconstruction and the initial lower-layer reconstruction result obtained by the previous lower-layer reconstruction is within a preset range means the following. Assume that the initial lower-layer reconstruction results of the missing values obtained by the first lower-layer reconstruction of sub-tables 3, 4 and 2 are a3, a4 and a2, respectively, and that those obtained by the second lower-layer reconstruction are b3, b4 and b2, respectively. If |a3-b3|, |a4-b4| and |a2-b2| are all within the preset range, the missing values of sub-tables 3, 4 and 2 are considered to have converged to a stable lower-layer reconstruction result; the lower-layer regression reconstruction of sub-tables 3, 4 and 2 then ends, and the lower-layer reconstruction data set is obtained.
When, for example, |a4-b4| is not within the preset range, the data set to be reconstructed is updated to the initial lower-layer reconstruction data set, that is, from [ sub-table 1 data, sub-table 2 data, sub-table 3 data, sub-table 4 data ] to [ sub-table 1 data, sub-table 2 data (once reconstructed), sub-table 3 data (once reconstructed), sub-table 4 data (once reconstructed) ]. The process then returns to step S122 and performs lower-layer cyclic regression reconstruction on sub-tables 3, 4 and 2 in turn based on the updated data set, until the deviation between the initial lower-layer reconstruction result obtained by a lower-layer reconstruction and that obtained by the previous lower-layer reconstruction is within the preset range.
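The stopping rule of step S123 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes one scalar estimate per sub-table (as in the a3/b3 example), and the names `within_tolerance`, `iterate_until_stable`, `toy_pass` and the `max_passes` safeguard are hypothetical.

```python
from typing import Callable, Dict

Estimates = Dict[str, float]  # hypothetical: one reconstructed value per sub-table


def within_tolerance(previous: Estimates, current: Estimates, tol: float) -> bool:
    """True when every sub-table's estimate moved by at most `tol` between passes."""
    return all(abs(previous[name] - current[name]) <= tol for name in current)


def iterate_until_stable(one_pass: Callable[[Estimates], Estimates],
                         initial: Estimates, tol: float,
                         max_passes: int = 100) -> Estimates:
    """Repeat `one_pass` until two consecutive passes agree within `tol`."""
    previous = one_pass(initial)
    for _ in range(max_passes - 1):  # assumption: cap the number of passes
        current = one_pass(previous)
        if within_tolerance(previous, current, tol):
            return current
        previous = current
    return previous


# Toy stand-in for one lower-layer pass: each estimate moves halfway to a target.
targets = {"sub3": 10.0, "sub4": 4.0, "sub2": 6.0}


def toy_pass(estimates: Estimates) -> Estimates:
    return {name: (estimates[name] + targets[name]) / 2 for name in estimates}


result = iterate_until_stable(toy_pass, {"sub3": 0.0, "sub4": 0.0, "sub2": 0.0}, tol=0.01)
```

In a real run, `one_pass` would be the sequential reconstruction of all lower-layer sub-tables, and the comparison would cover every missing value rather than one value per sub-table.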
Preferably, the step S123 specifically includes:
S1231, selecting a lower-layer to-be-reconstructed sub-table with the highest lower-layer reconstruction priority as a current lower-layer to-be-reconstructed sub-table;
S1232, replacing the current lower layer data of the sub-table to be reconstructed in the data set to be reconstructed with total table data to obtain a first state data set;
S1233, reconstructing the missing value of the current lower layer to-be-reconstructed sub-table based on the first state data set to obtain an initial lower layer reconstruction result, and updating the missing value of the current lower layer to-be-reconstructed sub-table in the first state data set to the initial lower layer reconstruction result to obtain a second state data set;
S1234, when all the missing values of the sub-tables to be reconstructed at the lower layer are updated, the second state data set is used as an initial lower layer reconstruction data set; otherwise, designating the next sub-table to be reconstructed as the current sub-table to be reconstructed according to the lower reconstruction priority, updating the data set to be reconstructed into the second state data set, and returning to the step of replacing the current sub-table to be reconstructed in the data set to be reconstructed with the total table data to obtain the first state data set.
For example, suppose the data set to be reconstructed is [ sub-table 1 data, sub-table 2 data, sub-table 3 data, sub-table 4 data ], the lower-layer sub-tables to be reconstructed are sub-tables 3, 4 and 2, and the lower-layer reconstruction priority is: sub-table 3 > sub-table 4 > sub-table 2. The detailed process is as follows.
S1231', taking sub-table 3 as the current lower-layer sub-table to be reconstructed;
S1232', replacing the data of the current lower-layer sub-table to be reconstructed in the data set to be reconstructed with total table data to obtain a first state data set [ sub-table 1 data, sub-table 2 data, total table data, sub-table 4 data ];
S1233', reconstructing each missing value of sub-table 3 by using the first state data set [ sub-table 1 data, sub-table 2 data, total table data, sub-table 4 data ] to obtain an initial lower-layer reconstruction result of each missing value in sub-table 3, and updating to obtain a second state data set [ sub-table 1 data, sub-table 2 data, sub-table 3 data (reconstructed), sub-table 4 data ];
S1234', designating sub-table 4 as the current lower-layer sub-table to be reconstructed, updating the data set to be reconstructed into [ sub-table 1 data, sub-table 2 data, sub-table 3 data (reconstructed), sub-table 4 data ], obtaining an initial lower-layer reconstruction result of the missing values of sub-table 4 based on this data set through steps S1232-S1233, and updating to obtain a second state data set [ sub-table 1 data, sub-table 2 data, sub-table 3 data (reconstructed), sub-table 4 data (reconstructed) ];
S1235', designating sub-table 2 as the current lower-layer sub-table to be reconstructed, obtaining an initial lower-layer reconstruction result of the missing values of sub-table 2 through steps S1232-S1233, and updating to obtain a second state data set [ sub-table 1 data, sub-table 2 data (reconstructed), sub-table 3 data (reconstructed), sub-table 4 data (reconstructed) ].
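One pass over the lower-layer sub-tables (steps S1231 to S1234) might be sketched as below. The sketch picks K-nearest-neighbor filling from the methods listed earlier purely for illustration. The function names, the `missing` index map, and the rule of ignoring features that are absent in either row are assumptions rather than details given in the text. Each sub-table is assumed to have at least one observed value.

```python
import math
from typing import Dict, List, Optional

Column = List[Optional[float]]


def knn_fill(features: List[List[Optional[float]]], target: Column, k: int = 2) -> List[float]:
    """Fill None entries of `target` with the mean target of the k nearest known rows."""
    def distance(a, b):
        # Assumption: compare only the features present in both rows.
        pairs = [(u, v) for u, v in zip(a, b) if u is not None and v is not None]
        return math.sqrt(sum((u - v) ** 2 for u, v in pairs)) if pairs else math.inf

    known = [i for i, v in enumerate(target) if v is not None]
    filled = list(target)
    for i, v in enumerate(target):
        if v is None:
            nearest = sorted(known, key=lambda j: distance(features[i], features[j]))[:k]
            filled[i] = sum(target[j] for j in nearest) / len(nearest)
    return filled


def lower_layer_pass(subs: Dict[str, Column], total: List[float], priority: List[str],
                     missing: Dict[str, List[int]]) -> Dict[str, Column]:
    """Reconstruct each sub-table in priority order; the input is left unchanged."""
    state = {name: list(col) for name, col in subs.items()}
    for current in priority:
        # S1232: the current sub-table's column is replaced by the total-table column.
        columns = [total if name == current else state[name] for name in state]
        rows = [list(r) for r in zip(*columns)]
        # S1233: re-estimate the originally missing positions of the current sub-table.
        target = [None if i in missing[current] else v
                  for i, v in enumerate(state[current])]
        state[current] = knn_fill(rows, target)
    return state


subs = {
    "sub1": [1.0, 2.0, 3.0, 4.0],
    "sub2": [2.0, None, 6.0, 8.0],
    "sub3": [3.0, 6.0, None, 12.0],
}
total = [6.0, 12.0, 18.0, 24.0]
state = lower_layer_pass(subs, total, ["sub3", "sub2"], {"sub3": [2], "sub2": [1]})
```

Because `missing` records the original gap positions, the same function can be called again on its own output, which is what the outer convergence loop of step S123 requires.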
In an alternative embodiment, the upper-layer loop regression reconstruction algorithm in the step S13 includes S131 to S134:
S131, taking the sub-table containing the missing values as an upper-layer sub-table to be reconstructed to obtain at least one upper-layer sub-table to be reconstructed;
S132, determining the upper layer reconstruction priority according to the comparison result of the number of the missing values; wherein, the priority of the upper layer to-be-reconstructed sub-table with the least missing value number is the largest;
S133, reconstructing the missing value of each upper layer to-be-reconstructed sub-table in turn based on the lower layer reconstruction data set according to the principle of the maximum first reconstruction of the upper layer reconstruction priority to obtain an upper layer reconstruction result of the missing value, and updating the missing value of the lower layer reconstruction data set into the upper layer reconstruction result to obtain an upper layer reconstruction data set;
In implementation, when the missing values of the upper-layer sub-tables to be reconstructed are reconstructed, a method that is good at mining nonlinear relations can be used. Such methods reconstruct missing data mainly by generating several reconstruction values and processing them comprehensively, and include: cluster filling methods (and variants or modifications thereof), ensemble learning methods (including random forest, extremely randomized trees, GBRT, XGBoost, lightBoost, CatBoost, etc., and other variants or modifications thereof), generative adversarial networks (and variants or modifications thereof), deep autoencoders (and variants or modifications thereof), neural network methods (and variants or modifications thereof), and the like. These methods generally have small reconstruction errors, but they require extensive parameter tuning, are time-consuming and computationally expensive, give variable results, and are poorly interpretable.
It is worth noting that, compared with a result obtained by filling the data set to be reconstructed with the mean value filling method, the mode filling method or the 0-value filling method, the lower-layer reconstruction result of step S12 has a smaller error. It therefore provides a better initial value for the method that is good at mining nonlinear relations in step S13, which reduces training time and reconstruction error and brings the double-layer reconstruction result closer to the actual value.
S134, when the deviation between the upper layer reconstruction result obtained by any upper layer reconstruction and the upper layer reconstruction result obtained by the last upper layer reconstruction is within a preset range, taking the upper layer reconstruction result as a double-layer reconstruction result of the missing value; otherwise, updating the lower layer reconstruction data set into an upper layer reconstruction data set, and returning to the step of reconstructing the missing value of each upper layer to-be-reconstructed sub-table in turn based on the lower layer reconstruction data set according to the principle of the maximum first reconstruction of the upper layer reconstruction priority.
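Steps S131 and S132 amount to selecting the sub-tables that contain missing values and ordering them by missing-value count, fewest first. A minimal sketch, assuming missing values are stored as `None` and using the hypothetical name `reconstruction_priority`:

```python
from typing import Dict, List, Optional


def reconstruction_priority(subs: Dict[str, List[Optional[float]]]) -> List[str]:
    """Names of sub-tables with missing values, fewest missing values first."""
    counts = {name: sum(v is None for v in col) for name, col in subs.items()}
    incomplete = [name for name, count in counts.items() if count > 0]
    return sorted(incomplete, key=lambda name: counts[name])


example = {
    "sub1": [1.0, 2.0, 3.0, 4.0],        # complete, so not reconstructed
    "sub2": [None, None, None, 8.0],     # 3 missing values
    "sub3": [3.0, None, 9.0, 12.0],      # 1 missing value
    "sub4": [None, 2.0, None, 4.0],      # 2 missing values
}
order = reconstruction_priority(example)
```

With these counts the order is sub-table 3, then 4, then 2, matching the priority used in the lower-layer example. How ties between equal counts are broken is not stated in the text; here the original column order is kept.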
Further, the step S133 includes S1331 to S1336:
S1331, selecting a sub-table with the highest upper layer reconstruction priority as current upper layer sub-table data to be reconstructed;
S1332, replacing the current upper layer sub-table data to be reconstructed in the lower layer reconstruction result with total table data to obtain a third state data set;
S1333, dividing the data of the current upper layer to-be-reconstructed sub-table into complete data and missing data, taking the complete data as a tag of a training set, taking a data part of a third state data set corresponding to the complete data as the training set, and taking a data part of the third state data set corresponding to the missing data as a test set;
S1334, inputting the training set into a pre-established machine learning model to obtain a trained reconstruction network;
S1335, inputting the test set into the trained reconstruction network to obtain an upper reconstruction result of the missing values, and updating the missing values in the lower reconstruction result into the upper reconstruction result to obtain a fourth state data set;
S1336, when all the missing values of the upper layer sub-tables to be reconstructed are updated, taking the fourth state data set as an upper layer reconstruction result; otherwise, designating the next upper layer reconstruction sub-table as the current upper layer reconstruction sub-table according to the upper layer reconstruction priority, and returning to the step of replacing the current upper layer reconstruction sub-table data in the lower layer reconstruction result with the total table data to obtain a third state data set.
Illustratively, assume that following steps S1331-S1332, the third state data set is obtained in the form of Table 3 below:
Table 3
Sub-table 1 | Sub-table 2 | Total table | Sub-table 4 |
Data 1.1 | Data 2.1 | Data 0.1 | Data 4.1 |
Data 1.2 | Data 2.2 | Data 0.2 (null) | Data 4.2 |
Data 1.3 | Data 2.3 | Data 0.3 | Data 4.3 |
Data 0.2 (null) marks the position of the missing value (missing data) in the original sub-table 3, which has been replaced with total table data 0.2.
In step S1333, the complete data is [ data 0.1, data 0.3], the third state data set data corresponding to the missing data is [ data 1.2, data 2.2, data 0.2, data 4.2], and the third state data set data corresponding to the complete data is [ data 1.1, data 2.1, data 0.1, data 4.1] and [ data 1.3, data 2.3, data 0.3, data 4.3].
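The split of step S1333 can be illustrated on the Table 3 example. This is a sketch under assumptions: the labels `data 3.1` and `data 3.3` are hypothetical names for the observed values of sub-table 3, which the example does not name, and `split_for_upper_layer` is an illustrative helper.

```python
from typing import List, Optional, Tuple

Row = List[str]


def split_for_upper_layer(third_state: List[Row], current: List[Optional[str]]
                          ) -> Tuple[List[Row], List[str], List[Row]]:
    """Return (training rows, training labels, test rows) for the current sub-table."""
    train_x, train_y, test_x = [], [], []
    for features, value in zip(third_state, current):
        if value is None:
            test_x.append(features)      # row whose sub-table value is missing
        else:
            train_x.append(features)     # row whose sub-table value is observed
            train_y.append(value)        # the observed value is the label
    return train_x, train_y, test_x


third_state = [
    ["data 1.1", "data 2.1", "data 0.1", "data 4.1"],
    ["data 1.2", "data 2.2", "data 0.2", "data 4.2"],
    ["data 1.3", "data 2.3", "data 0.3", "data 4.3"],
]
sub_table_3 = ["data 3.1", None, "data 3.3"]  # hypothetical labels, second one missing

train_x, train_y, test_x = split_for_upper_layer(third_state, sub_table_3)
```

The first and third rows form the training set and the second row forms the test set, as in the example. A model fitted on `train_x` and `train_y` would then predict the missing value from `test_x` in steps S1334 and S1335.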
In an alternative embodiment, the step S14 "solving an equation according to the ammeter error of the law of conservation of energy, determining the upper and lower bounds of the missing value, and obtaining the upper and lower bound reconstruction result of the missing value according to the comparison result between the double-layer reconstruction result of the missing value and the upper and lower bounds" specifically includes S141 to S143:
S141, determining the upper bound of the missing value of each sample to be reconstructed based on an ammeter error solving equation of the energy conservation law.
In the embodiment of the invention, each sample to be reconstructed has its own ammeter error solving equation based on the energy conservation law.
The ammeter error solving equation of the energy conservation law is specifically as follows:
where all variables of the equation refer to a single sample to be reconstructed; $y$ is the total electric quantity of the station area (the total-table electric quantity of the sample to be reconstructed); $n$ is the number of sub-tables in the station area; $k$ is the number of sub-tables containing missing values in the station area; $x_i$ is the electric quantity of a sub-table without a missing value in the station area; $x_j$ is the electric quantity of a sub-table containing a missing value in the station area; and $e_{other}$ is the other loss of the station area.
Then, from the ammeter error solving equation of the energy conservation law, the following can be obtained:
Furthermore, for the sample to be reconstructed, the upper bound of the missing values of the missing sub-tables is:
where $b_{up}$ is the upper bound of the missing values for the sample to be reconstructed.
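The formulas of step S141 are not reproduced in this text. One form consistent with the variable definitions above is given here as an assumption, not as the patent's own equations. It treats the balance as a plain sum without per-meter error coefficients, and it assumes the other loss is non-negative, so that dropping it yields an upper bound:

```latex
\begin{aligned}
y &= \sum_{i=1}^{n-k} x_i + \sum_{j=1}^{k} x_j + e_{other} \\
\sum_{j=1}^{k} x_j &= y - \sum_{i=1}^{n-k} x_i - e_{other} \\
b_{up} &= y - \sum_{i=1}^{n-k} x_i \qquad (\text{assuming } e_{other} \ge 0)
\end{aligned}
```

Under this reading, no missing sub-table value in a sample can exceed the total-table quantity minus the quantities of the sub-tables that were observed.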
S142, for the missing value of each sample to be reconstructed, according to the double-layer reconstruction result of the missing value and a first comparison result of the corresponding upper bound of the missing value, adopting a preset upper bound constraint adjustment strategy corresponding to the first comparison result to adjust the double-layer reconstruction result of the missing value, so as to obtain an upper bound constraint reconstruction result of the missing value;
In some embodiments, the upper bound constraint adjustment strategy comprises:
for the missing value of each sample to be reconstructed, when the double-layer reconstruction result of the missing value is greater than the corresponding missing value upper bound, making the double-layer reconstruction result of the missing value equal to the missing value upper bound; i.e. when $x_j > b_{up}$, the double-layer reconstruction result of the missing value is adjusted to $x_j = b_{up}$;
for the missing value of each sample to be reconstructed, when the sum of the double-layer reconstruction results of all the missing values in the sample to be reconstructed is greater than or equal to the corresponding upper bound of the missing value, the double-layer reconstruction result of the missing value is adjusted according to the following formula, and the upper bound constraint reconstruction result of the missing value is obtained;
wherein $x'_j$ is the upper bound constraint reconstruction result of the missing value of the j-th missing sub-table in the sample to be reconstructed, $x_j$ is the upper-layer reconstruction result of the missing value of the j-th missing sub-table in the sample to be reconstructed, $b_{up}$ is the upper bound of the missing values, $\eta$ is a preset loss coefficient, and $k$ is the number of electric meters containing missing values in the sample to be reconstructed.
S143, obtaining the upper and lower bound reconstruction results of the missing value corresponding to the second comparison result according to the second comparison result of the upper bound constraint reconstruction result of the missing value and the preset lower bound constraint threshold.
In the embodiment of the invention, based on the upper-bound reconstruction result, in order to prevent the reconstruction value from being too small, when $x_j < b_{up}/k \times \beta$, where $\beta$ is a lower bound adjustment coefficient, the double-layer reconstruction result of the missing value is adjusted to $x_j = b_{up}/k \times \beta$, thereby obtaining the upper and lower bound reconstruction result; and when $x_j \ge b_{up}/k \times \beta$, no adjustment of the double-layer reconstruction result of the missing value is needed, so that the upper and lower bound reconstruction result is equal to the double-layer reconstruction result.
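The two clamping rules can be sketched as follows. The sketch covers only the per-value upper bound of step S142 and the lower bound of step S143. The sum-based adjustment with the loss coefficient is omitted because its formula is not reproduced in this text. The name `apply_bounds` is hypothetical, and a lower bound adjustment coefficient between 0 and 1 is assumed so that the lower threshold stays below the upper bound.

```python
from typing import List


def apply_bounds(values: List[float], b_up: float, beta: float) -> List[float]:
    """Clamp the reconstructed missing values of one sample to [b_up / k * beta, b_up]."""
    k = len(values)                      # number of sub-tables with missing values
    lower = b_up / k * beta              # lower threshold used in S143
    capped = [min(x, b_up) for x in values]      # S142: x_j > b_up  ->  x_j = b_up
    return [max(x, lower) for x in capped]       # S143: x_j < lower ->  x_j = lower


bounded = apply_bounds([12.0, 0.1, 3.0], b_up=9.0, beta=0.5)
```

Here the threshold is 9.0 / 3 * 0.5 = 1.5, so the first value is capped at the upper bound, the second is raised to the threshold, and the third is left unchanged.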
In an alternative embodiment, the mutation condition in step S15 is: the difference between the maximum value and the minimum value in the 2r pieces of same-table data nearest to the position of the missing value is larger than a preset mutation threshold, wherein the same-table data are data of the table in which the missing value is located, and r ≥ 1;
for example, if the data of sub-table 3 is [ data 1, data 2, missing value, data 4, data 5 ] and r = 2, the same-table data are data 1, data 2, data 4 and data 5; if the maximum value and the minimum value in the same-table data are found to be data 2 and data 4 respectively, the difference is the difference between data 2 and data 4; if the data of sub-table 3 is [ missing value, data 2, data 3, data 4, data 5 ], the same-table data are data 2, data 3, data 4 and data 5.
Step S15 "when the upper and lower bound reconstruction results of the missing value meet the preset mutation condition, processing the upper and lower bound reconstruction results meeting the preset mutation condition according to a preset mutation processing mechanism to obtain a final reconstruction result of the missing value" specifically includes:
when the difference between the maximum value and the minimum value in the 2r pieces of same-table data nearest to the position of the missing value is larger than the preset mutation threshold, calculating the average value of the 2r nearest-neighbor data, and taking the average value as the final reconstruction result of the missing value.
In other embodiments, the method further comprises: and when the upper and lower boundary reconstruction results of the missing value do not meet the preset mutation condition, taking the upper and lower boundary reconstruction results of the missing value as the final reconstruction result of the missing value.
In the embodiment of the invention, a mutation value processing mechanism is introduced. When the upper and lower bound reconstruction result of a missing value is detected to meet the preset mutation condition, for example when the same-table data of the missing value in sub-table 3 are data 1, data 2, data 4 and data 5, the average value of data 1, 2, 4 and 5 is taken as the final reconstruction result of the missing value. It can be understood that the mutation value processing mechanism provided by the embodiment of the invention can further improve the reconstruction accuracy.
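The mutation check can be sketched as below. This is an illustration under assumptions: "nearest" is read as nearest by position in the same sub-table, other missing entries are skipped, at least one neighbour is assumed to exist, and the names `nearest_same_table` and `final_value` are hypothetical.

```python
from typing import List, Optional


def nearest_same_table(column: List[Optional[float]], pos: int, r: int) -> List[float]:
    """The 2r observed values of the same sub-table closest in position to `pos`."""
    candidates = [i for i, v in enumerate(column) if i != pos and v is not None]
    candidates.sort(key=lambda i: abs(i - pos))
    return [column[i] for i in candidates[: 2 * r]]


def final_value(column: List[Optional[float]], pos: int, bounded: float,
                r: int, threshold: float) -> float:
    """Return the neighbour mean if the mutation condition holds, else `bounded`."""
    neighbours = nearest_same_table(column, pos, r)
    if max(neighbours) - min(neighbours) > threshold:
        return sum(neighbours) / len(neighbours)
    return bounded


sub_table_3 = [1.0, 9.0, None, 2.0, 4.0]
mutated = final_value(sub_table_3, pos=2, bounded=7.5, r=2, threshold=5.0)
kept = final_value(sub_table_3, pos=2, bounded=7.5, r=2, threshold=10.0)
```

In the first call the neighbour spread is 9.0 - 1.0 = 8.0, which exceeds the threshold, so the mean of the four neighbours replaces the bounded result. In the second call the threshold is not exceeded and the bounded result is kept.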
Referring to fig. 2, fig. 2 is a block diagram of an electric meter data reconstruction device according to an embodiment of the present invention. The ammeter data reconstruction device 10 provided in the embodiment of the present invention is configured to execute all the steps and procedures of the ammeter data reconstruction method provided in the above embodiment, and includes:
the data set to be reconstructed screening module 11 is used for acquiring a data set of the electric meter of the area with the missing value, and screening out the data set to be reconstructed according to the missing mechanism and the missing mode of each sample of the data set of the electric meter of the area; the data set to be reconstructed comprises a plurality of samples to be reconstructed, and the station area ammeter comprises a total table and at least one sub table;
the lower-layer reconstruction data set obtaining module 12 is configured to perform lower-layer reconstruction on the missing values based on the data set to be reconstructed by adopting a preset lower-layer cyclic regression reconstruction algorithm to obtain a lower-layer reconstruction result, and replace the missing values of the data set to be reconstructed with the lower-layer reconstruction result to obtain a lower-layer reconstruction data set;
the double-layer reconstruction result obtaining module 13 is configured to perform upper-layer reconstruction on the missing values based on the lower-layer reconstruction data set by adopting a preset upper-layer cyclic regression reconstruction algorithm to obtain a double-layer reconstruction result of the missing values;
the upper and lower boundary reconstruction result obtaining module 14 is configured to solve an equation according to an ammeter error of an energy conservation law, determine an upper and lower boundary of the missing value, and obtain an upper and lower boundary reconstruction result of the missing value according to a comparison result between a double-layer reconstruction result of the missing value and the upper and lower boundary;
and the final reconstruction result obtaining module 15 is configured to process, according to a preset mutation processing mechanism, the upper and lower boundary reconstruction results that meet the preset mutation condition when the upper and lower boundary reconstruction results of the missing value meet the preset mutation condition, so as to obtain the final reconstruction result of the missing value.
It should be noted that, the ammeter data reconstruction device provided in the embodiment of the present invention is configured to execute all the flow steps of the ammeter data reconstruction method in the foregoing embodiment, and the working principles and beneficial effects of the two correspond to each other one by one, which is not described in detail herein.
The invention provides an ammeter data reconstruction method and device. A station area ammeter data set containing missing values is acquired, and the data set to be reconstructed is screened out according to the missing mechanism and missing mode of the samples in the data set. A preset lower-layer cyclic regression reconstruction algorithm and a preset upper-layer cyclic regression reconstruction algorithm are used to perform double-layer cyclic regression reconstruction on the missing values, giving a double-layer reconstruction result. The upper and lower bounds of the missing ammeter values and the upper and lower bound reconstruction result are then determined from the energy conservation relation between the total table and the sub-tables, which further improves the reconstruction accuracy. Finally, a mutation processing mechanism handles the mutation values in the upper and lower bound reconstruction result to obtain the final reconstruction result. Compared with existing methods that reconstruct missing data through a neural network, the method can reconstruct missing ammeter data without training on massive complete data, and has high reconstruction accuracy.
While the foregoing describes the preferred embodiments of the present invention, those skilled in the art will appreciate that changes and modifications may be made without departing from the principles of the invention, and such changes and modifications are also intended to fall within the scope of the invention.
Claims (8)
1. A method for reconstructing data of an electric meter, comprising:
acquiring a station area ammeter data set containing a missing value, and screening out a data set to be reconstructed according to a missing mechanism and a missing mode of each sample of the station area ammeter data set; the data set to be reconstructed comprises a plurality of samples to be reconstructed, and the station area ammeter comprises a total table and at least one sub table; the data set to be reconstructed refers to sample sub-table data whose missing mechanism is unbiased missing and whose missing mode is non-user-variable missing;
performing lower-layer reconstruction on the missing values based on the data set to be reconstructed by adopting a preset lower-layer cyclic regression reconstruction algorithm to obtain a lower-layer reconstruction result, and replacing the missing values of the data set to be reconstructed with the lower-layer reconstruction result to obtain a lower-layer reconstruction data set;
performing upper-layer reconstruction on the missing values based on the lower-layer reconstruction data set by adopting a preset upper-layer cyclic regression reconstruction algorithm to obtain a double-layer reconstruction result of the missing values;
solving an equation according to an ammeter error of an energy conservation law, determining upper and lower bounds of the missing value, and obtaining an upper and lower bound reconstruction result of the missing value according to a comparison result of the double-layer reconstruction result of the missing value and the upper and lower bounds;
When the upper and lower boundary reconstruction results of the missing value meet the preset mutation conditions, processing the upper and lower boundary reconstruction results meeting the preset mutation conditions according to a preset mutation processing mechanism to obtain the final reconstruction result of the missing value;
the lower layer cyclic regression reconstruction algorithm specifically comprises the following steps:
screening out sub-tables containing missing values from the data set to be reconstructed to obtain at least one sub-table to be reconstructed at the lower layer, and determining the reconstruction priority of the lower layer according to the comparison result of the number of the missing values; wherein, the priority of the lower layer to-be-reconstructed sub-table with the least missing value number is the largest;
reconstructing the missing value of each lower-layer to-be-reconstructed sub-table in turn based on the to-be-reconstructed data set according to the principle of the largest first reconstruction of the lower-layer reconstruction priority, obtaining an initial lower-layer reconstruction result of the missing value, and updating the missing value of the to-be-reconstructed data set into the initial lower-layer reconstruction result to obtain an initial lower-layer reconstruction data set;
when the deviation between the initial lower layer reconstruction result obtained by any lower layer reconstruction and the initial lower layer reconstruction result obtained by the last lower layer reconstruction is in a preset range, taking the initial lower layer reconstruction result as a lower layer reconstruction result of a missing value and taking the initial lower layer reconstruction data set as a lower layer reconstruction data set; otherwise, updating the data set to be reconstructed into the initial lower layer reconstruction data set, and returning to the step of reconstructing the missing value of each lower-layer to-be-reconstructed sub-table in turn based on the data set to be reconstructed according to the principle of the largest first reconstruction of the lower-layer reconstruction priority;
Wherein, the upper layer cyclic regression reconstruction algorithm is:
taking the sub-table containing the missing values as an upper-layer sub-table to be reconstructed to obtain at least one upper-layer sub-table to be reconstructed;
determining the upper layer reconstruction priority according to the comparison result of the number of the missing values; wherein, the priority of the upper layer to-be-reconstructed sub-table with the least missing value number is the largest;
according to the principle of the maximum first reconstruction of the upper reconstruction priority, reconstructing the missing value of each upper to-be-reconstructed sub-table based on the lower reconstruction data set to obtain an upper reconstruction result of the missing value, and updating the missing value of the lower reconstruction data set to the upper reconstruction result to obtain an upper reconstruction data set;
when the deviation between the upper layer reconstruction result obtained by any upper layer reconstruction and the upper layer reconstruction result obtained by the last upper layer reconstruction is in a preset range, taking the upper layer reconstruction result as a double-layer reconstruction result of the missing value; otherwise, updating the lower layer reconstruction data set into an upper layer reconstruction data set, and returning to the step of reconstructing the missing value of each upper layer to-be-reconstructed sub-table in turn based on the lower layer reconstruction data set according to the principle of the maximum first reconstruction of the upper layer reconstruction priority.
2. The method for reconstructing ammeter data according to claim 1, wherein after solving an equation according to the ammeter error of the law of conservation of energy, determining upper and lower bounds of the missing value, and obtaining an upper and lower bound reconstruction result of the missing value according to a comparison result of the double-layer reconstruction result of the missing value and the upper and lower bounds, the method further comprises:
and when the upper and lower boundary reconstruction results of the missing value do not meet the preset mutation condition, taking the upper and lower boundary reconstruction results of the missing value as the final reconstruction result of the missing value.
3. The method for reconstructing data of an electric meter according to claim 1, wherein a deviation between an initial lower layer reconstruction result obtained by any one lower layer reconstruction and an initial lower layer reconstruction result obtained by a last lower layer reconstruction is within a preset range, the initial lower layer reconstruction result is taken as a lower layer reconstruction result of a missing value, and the initial lower layer reconstruction data set is taken as a lower layer reconstruction data set; otherwise, updating the data set to be reconstructed into the initial lower layer reconstruction data set, and returning to the principle of the maximum first reconstruction according to the lower layer reconstruction priority, and sequentially reconstructing the missing value of each lower layer reconstruction sub-table based on the data set to be reconstructed, wherein the method specifically comprises the following steps:
Selecting a lower-layer to-be-reconstructed sub-table with the largest lower-layer reconstruction priority as a current lower-layer to-be-reconstructed sub-table;
replacing the current lower layer data of the sub-table to be reconstructed in the data set to be reconstructed with total table data to obtain a first state data set;
reconstructing the missing value of the current lower layer to-be-reconstructed sub-table based on the first state data set to obtain an initial lower layer reconstruction result, and updating the missing value of the current lower layer to-be-reconstructed sub-table in the to-be-reconstructed data set to the initial lower layer reconstruction result to obtain a second state data set;
when all the missing values of the sub-tables to be reconstructed at the lower layer are updated, the second state data set is used as an initial lower layer reconstruction data set; otherwise, designating the next sub-table to be reconstructed as the current sub-table to be reconstructed according to the lower reconstruction priority, updating the data set to be reconstructed into the second state data set, and returning to the step of replacing the current sub-table to be reconstructed in the data set to be reconstructed with the total table data to obtain the first state data set.
4. The ammeter data reconstruction method according to claim 1, wherein the step of sequentially reconstructing the missing value of each upper-layer to-be-reconstructed sub-table based on the lower-layer reconstruction data set, in descending order of upper-layer reconstruction priority, to obtain an upper-layer reconstruction result of the missing value, and updating the missing value of the lower-layer reconstruction data set to the upper-layer reconstruction result to obtain an upper-layer reconstruction data set, specifically comprises:
selecting the upper-layer to-be-reconstructed sub-table with the highest upper-layer reconstruction priority as the current upper-layer to-be-reconstructed sub-table;
replacing the data of the current upper-layer to-be-reconstructed sub-table in the lower-layer reconstruction result with the total table data to obtain a third state data set;
dividing the data of the current upper-layer to-be-reconstructed sub-table into complete data and missing data, taking the complete data as the labels of a training set, taking the part of the third state data set corresponding to the complete data as the training set, and taking the part of the third state data set corresponding to the missing data as a test set;
inputting the training set into a pre-established machine learning model to obtain a trained reconstruction network;
inputting the test set into the trained reconstruction network to obtain an upper-layer reconstruction result of the missing values, and updating the missing values in the lower-layer reconstruction result to the upper-layer reconstruction result to obtain a fourth state data set;
when the missing values of all upper-layer to-be-reconstructed sub-tables have been updated, taking the fourth state data set as the upper-layer reconstruction result; otherwise, designating the next upper-layer to-be-reconstructed sub-table as the current upper-layer to-be-reconstructed sub-table according to the upper-layer reconstruction priority, updating the lower-layer reconstruction result to the fourth state data set, and returning to the step of replacing the data of the current upper-layer to-be-reconstructed sub-table in the lower-layer reconstruction result with the total table data to obtain the third state data set.
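The training/test split in claim 4 can be illustrated with a short sketch. The claim only speaks of a "pre-established machine learning model", so the 1-nearest-neighbour learner below is a stand-in chosen for brevity, and the names `NearestNeighborModel` and `reconstruct_upper_sub_table` are hypothetical.

```python
class NearestNeighborModel:
    """Placeholder learner (assumption); any model with fit/predict could be used."""

    def fit(self, X, y):
        self.X, self.y = X, y
        return self

    def predict(self, X):
        preds = []
        for query in X:
            nearest = min(
                range(len(self.X)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(self.X[i], query)),
            )
            preds.append(self.y[nearest])
        return preds


def reconstruct_upper_sub_table(lower_result, total, missing, j, model):
    """Rebuild the missing entries of sub-table j and return the fourth state data set."""
    # Third state data set: column j replaced by the total table data.
    third = [row[:j] + [t] + row[j + 1:] for row, t in zip(lower_result, total)]
    complete = [i for i in range(len(third)) if not missing[i][j]]
    absent = [i for i in range(len(third)) if missing[i][j]]
    train_X = [third[i] for i in complete]            # training set
    train_y = [lower_result[i][j] for i in complete]  # labels: complete data of sub-table j
    test_X = [third[i] for i in absent]               # test set
    model.fit(train_X, train_y)
    preds = model.predict(test_X)
    fourth = [row[:] for row in lower_result]
    for i, value in zip(absent, preds):
        fourth[i][j] = value
    return fourth
```

Calling the function once per upper-layer to-be-reconstructed sub-table, in priority order and feeding each returned data set into the next call, mirrors the loop described in the claim.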
5. The ammeter data reconstruction method according to claim 1, wherein the step of determining the upper and lower bounds of the missing value according to the ammeter error solving equation based on the law of energy conservation, and obtaining the upper and lower bound reconstruction result of the missing value according to the comparison between the double-layer reconstruction result of the missing value and the upper and lower bounds, specifically comprises:
determining the missing value upper bound of each sample to be reconstructed according to the ammeter error solving equation based on the law of energy conservation;
for the missing value of each sample to be reconstructed, obtaining a first comparison result between the double-layer reconstruction result of the missing value and the corresponding missing value upper bound, and adjusting the double-layer reconstruction result of the missing value with the preset upper bound constraint adjustment strategy corresponding to the first comparison result, to obtain an upper bound constraint reconstruction result of the missing value;
obtaining a second comparison result between the upper bound constraint reconstruction result of the missing value and a preset lower bound constraint threshold, and obtaining the upper and lower bound reconstruction result of the missing value corresponding to the second comparison result.
6. The ammeter data reconstruction method according to claim 5, wherein the upper bound constraint adjustment strategy is:
for the missing value of each sample to be reconstructed, when the double-layer reconstruction result of the missing value is greater than the corresponding missing value upper bound, setting the double-layer reconstruction result of the missing value equal to the missing value upper bound;
for the missing value of each sample to be reconstructed, when the sum of the double-layer reconstruction results of all missing values in the sample to be reconstructed is greater than or equal to the corresponding missing value upper bound, adjusting the double-layer reconstruction result of the missing value according to the following formula to obtain the upper bound constraint reconstruction result of the missing value;
wherein x'_j is the upper bound constraint reconstruction result of the missing value of the j-th missing sub-table in the sample to be reconstructed, x_j is the upper-layer reconstruction result of the missing value of the j-th missing sub-table in the sample to be reconstructed, B_up is the missing value upper bound, η is a preset loss coefficient, and k is the number of ammeters containing missing values in the sample to be reconstructed.
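A minimal sketch of the bound check in claims 5 and 6 follows. It covers only the first rule of claim 6 (capping a single value at its upper bound); the sum-based rule depends on the formula referenced in the claim and is not modelled. The lower-bound handling is an assumption: the claims mention a preset lower bound constraint threshold without stating the adjustment, so a simple clamp is used. The name `apply_bounds` and the default threshold of 0.0 are hypothetical.

```python
def apply_bounds(value, upper_bound, lower_threshold=0.0):
    """Cap a double-layer reconstruction result at its upper bound, then clamp it from below."""
    # Rule stated in claim 6: a result above the upper bound is set equal to the bound.
    bounded = min(value, upper_bound)
    # Assumption (not stated in the claims): results below the threshold are raised to it.
    return max(bounded, lower_threshold)
```

For example, `apply_bounds(7.5, 6.0)` returns `6.0`, while a value already inside the interval is returned unchanged.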
7. The ammeter data reconstruction method according to claim 1, wherein the mutation condition is: the difference between the maximum value and the minimum value among the 2r same-table data nearest to the position of the missing value is greater than a preset mutation threshold, wherein the same-table data are data of the table in which the missing value is located, and r ≥ 1;
the step of, when the upper and lower bound reconstruction result of the missing value meets the preset mutation condition, processing the upper and lower bound reconstruction result according to a preset mutation processing mechanism to obtain a final reconstruction result of the missing value, specifically comprises:
when the difference between the maximum value and the minimum value among the 2r data nearest to the position of the missing value is greater than the preset mutation threshold, calculating the average of the 2r nearest data and taking the average as the final reconstruction result of the missing value.
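The mutation rule of claim 7 is simple enough to sketch directly. The block assumes the neighbouring readings of the same table are already available as plain numbers, and it picks the 2r positions closest to the missing index (which near a series boundary may all lie on one side); that neighbour selection and the names `nearest_neighbors` and `apply_mutation_rule` are assumptions made for illustration.

```python
from statistics import mean


def nearest_neighbors(series, idx, r):
    """Return the 2r readings closest in position to idx, excluding idx itself."""
    others = [i for i in range(len(series)) if i != idx]
    others.sort(key=lambda i: (abs(i - idx), i))
    return [series[i] for i in others[: 2 * r]]


def apply_mutation_rule(series, idx, bounded_value, r=1, threshold=1.0):
    """Replace the bounded result by the neighbour mean when the neighbours jump sharply."""
    neighbors = nearest_neighbors(series, idx, r)
    if max(neighbors) - min(neighbors) > threshold:
        return mean(neighbors)  # mutation detected: use the average of the 2r neighbours
    return bounded_value        # no mutation: keep the bound-constrained result
```

With `series = [1.0, 1.1, 0.0, 5.0, 5.2]`, `idx = 2` and `r = 1`, the neighbours are 1.1 and 5.0, their spread exceeds the default threshold, and the function returns their mean.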
8. An ammeter data reconstruction device, comprising:
a to-be-reconstructed data set screening module, configured to acquire a station area ammeter data set containing missing values, and to screen out a data set to be reconstructed according to the missing mechanism and missing mode of each sample of the station area ammeter data set; the data set to be reconstructed comprises a plurality of samples to be reconstructed, and the station area ammeters comprise a total table and at least one sub-table; the data set to be reconstructed consists of the sample sub-table data whose missing mechanism is unbiased missing and whose missing mode is non-user-variable missing;
a lower-layer reconstruction data set acquisition module, configured to perform lower-layer reconstruction on the missing values based on the data set to be reconstructed by using a preset lower-layer cyclic regression reconstruction algorithm to obtain a lower-layer reconstruction result, and to replace the missing values of the data set to be reconstructed with the lower-layer reconstruction result to obtain a lower-layer reconstruction data set;
a double-layer reconstruction result acquisition module, configured to perform upper-layer reconstruction on the missing values based on the lower-layer reconstruction data set by using a preset upper-layer cyclic regression reconstruction algorithm to obtain a double-layer reconstruction result of the missing value;
an upper and lower bound reconstruction result acquisition module, configured to determine the upper and lower bounds of the missing value according to the ammeter error solving equation based on the law of energy conservation, and to obtain the upper and lower bound reconstruction result of the missing value according to the comparison between the double-layer reconstruction result of the missing value and the upper and lower bounds;
a final reconstruction result acquisition module, configured to, when the upper and lower bound reconstruction result of the missing value meets the preset mutation condition, process the upper and lower bound reconstruction result according to a preset mutation processing mechanism to obtain the final reconstruction result of the missing value;
wherein the lower-layer cyclic regression reconstruction algorithm specifically comprises:
screening out the sub-tables containing missing values from the data set to be reconstructed to obtain at least one lower-layer to-be-reconstructed sub-table, and determining the lower-layer reconstruction priority by comparing the numbers of missing values, wherein the lower-layer to-be-reconstructed sub-table with the fewest missing values has the highest priority;
sequentially reconstructing the missing value of each lower-layer to-be-reconstructed sub-table based on the data set to be reconstructed, in descending order of lower-layer reconstruction priority, to obtain an initial lower-layer reconstruction result of the missing value, and updating the missing value of the data set to be reconstructed to the initial lower-layer reconstruction result to obtain an initial lower-layer reconstruction data set;
when the deviation between the initial lower-layer reconstruction result obtained by any lower-layer reconstruction and the initial lower-layer reconstruction result obtained by the previous lower-layer reconstruction is within a preset range, taking the initial lower-layer reconstruction result as the lower-layer reconstruction result of the missing value and taking the initial lower-layer reconstruction data set as the lower-layer reconstruction data set; otherwise, updating the data set to be reconstructed to the initial lower-layer reconstruction data set, and returning to the step of sequentially reconstructing the missing value of each lower-layer to-be-reconstructed sub-table based on the data set to be reconstructed, in descending order of lower-layer reconstruction priority;
wherein the upper-layer cyclic regression reconstruction algorithm specifically comprises:
taking the sub-tables containing missing values as upper-layer to-be-reconstructed sub-tables to obtain at least one upper-layer to-be-reconstructed sub-table;
determining the upper-layer reconstruction priority by comparing the numbers of missing values, wherein the upper-layer to-be-reconstructed sub-table with the fewest missing values has the highest priority;
sequentially reconstructing the missing value of each upper-layer to-be-reconstructed sub-table based on the lower-layer reconstruction data set, in descending order of upper-layer reconstruction priority, to obtain an upper-layer reconstruction result of the missing value, and updating the missing value of the lower-layer reconstruction data set to the upper-layer reconstruction result to obtain an upper-layer reconstruction data set;
when the deviation between the upper-layer reconstruction result obtained by any upper-layer reconstruction and the upper-layer reconstruction result obtained by the previous upper-layer reconstruction is within a preset range, taking the upper-layer reconstruction result as the double-layer reconstruction result of the missing value; otherwise, updating the lower-layer reconstruction data set to the upper-layer reconstruction data set, and returning to the step of sequentially reconstructing the missing value of each upper-layer to-be-reconstructed sub-table based on the lower-layer reconstruction data set, in descending order of upper-layer reconstruction priority.
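The module chain of the device in claim 8 can be pictured as a pipeline in which each module consumes the output of the previous one. The sketch below only shows that wiring; the class name `ReconstructionDevice`, its field names, and the idea of passing each module in as a callable are illustrative assumptions, and the actual work of each stage is left to whatever functions are supplied.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ReconstructionDevice:
    """Hypothetical wiring of the five modules named in the device claim."""

    screen: Callable[[Any], Any]       # to-be-reconstructed data set screening module
    lower_layer: Callable[[Any], Any]  # lower-layer reconstruction data set acquisition module
    upper_layer: Callable[[Any], Any]  # double-layer reconstruction result acquisition module
    bound: Callable[[Any], Any]        # upper and lower bound reconstruction result module
    mutation: Callable[[Any], Any]     # final reconstruction result acquisition module

    def run(self, meter_data: Any) -> Any:
        to_reconstruct = self.screen(meter_data)
        lower = self.lower_layer(to_reconstruct)
        double = self.upper_layer(lower)
        bounded = self.bound(double)
        return self.mutation(bounded)
```

Keeping the stages as injected callables makes it easy to swap in a different regression or learning model for either layer without touching the surrounding flow.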
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111311460.2A CN114189313B (en) | 2021-11-08 | 2021-11-08 | Ammeter data reconstruction method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111311460.2A CN114189313B (en) | 2021-11-08 | 2021-11-08 | Ammeter data reconstruction method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114189313A (en) | 2022-03-15 |
CN114189313B (en) | 2023-11-24 |
Family ID: 80601901
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111311460.2A Active CN114189313B (en) | 2021-11-08 | 2021-11-08 | Ammeter data reconstruction method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114189313B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114839586B (en) * | 2022-05-12 | 2023-07-18 | 烟台东方威思顿电气有限公司 | Low-voltage station metering device misalignment calculation method based on EM algorithm |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108805193A (en) * | 2018-06-01 | 2018-11-13 | 广东电网有限责任公司 | A kind of power loss data filling method based on mixed strategy |
CN110991866A (en) * | 2019-11-29 | 2020-04-10 | 国网江苏省电力有限公司电力科学研究院 | Machine learning-based platform area data missing value completion method and device |
US10733515B1 (en) * | 2017-02-21 | 2020-08-04 | Amazon Technologies, Inc. | Imputing missing values in machine learning models |
CN113469189A (en) * | 2021-09-02 | 2021-10-01 | 国网江西省电力有限公司供电服务管理中心 | Method, system and device for filling missing values of power utilization acquisition data |
CN113468796A (en) * | 2021-04-13 | 2021-10-01 | 广西电网有限责任公司南宁供电局 | Voltage missing data identification method based on improved random forest algorithm |
2021-11-08: CN application CN202111311460.2A, patent CN114189313B (en), status Active
Non-Patent Citations (1)
Title |
---|
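Research on missing data imputation methods based on hierarchical models; Yu Lichao; Jin Yongjin; Statistical Research (统计研究) (11); full text * |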
Also Published As
Publication number | Publication date |
---|---|
CN114189313A (en) | 2022-03-15 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN110263866B (en) | Power consumer load interval prediction method based on deep learning | |
EP3678065A1 (en) | Chinese medicine production process knowledge system | |
CN110070282B (en) | Low-voltage transformer area line loss influence factor analysis method based on comprehensive relevance | |
Zhu et al. | Construction of membership functions for predictive soil mapping under fuzzy logic | |
CN107038292B (en) | Multi-wind-field output correlation modeling method based on self-adaptive multivariate nonparametric kernel density estimation | |
CN113126019B (en) | Remote estimation method, system, terminal and storage medium for error of intelligent ammeter | |
CN114548509B (en) | Multi-type load joint prediction method and system for multi-energy system | |
CN107895100B (en) | Drainage basin water quality comprehensive evaluation method and system | |
CN108647807B (en) | River flow prediction method | |
CN114004137A (en) | Multi-source meteorological data fusion and pretreatment method | |
CN103676649A (en) | Local self-adaptive WNN (Wavelet Neural Network) training system, device and method | |
CN114189313B (en) | Ammeter data reconstruction method and device | |
CN111932081B (en) | Method and system for evaluating running state of power information system | |
CN113379116A (en) | Cluster and convolutional neural network-based line loss prediction method for transformer area | |
CN104008433A (en) | Method for predicting medium-and-long-term power loads on basis of Bayes dynamic model | |
CN117195505A (en) | Evaluation method and system for informatization evaluation calibration model of electric energy meter | |
CN116341911A (en) | Alternating-current interference corrosion risk evaluation method and system based on FAHP-SVM | |
Zhang et al. | A representativeness heuristic for mitigating spatial bias in existing soil samples for digital soil mapping | |
CN117078114A (en) | Water quality evaluation method and system for water-bearing lakes under influence of diversion engineering | |
CN113139337B (en) | Partition interpolation processing method and device for lake topography simulation | |
CN110222098A (en) | Electric power high amount of traffic abnormality detection based on flow data clustering algorithm | |
CN117909932A (en) | System and method for analyzing line loss of transformer area of comprehensive big data | |
CN114970698B (en) | Metering equipment operation performance prediction method based on improved LWPS | |
CN110717623A (en) | Photovoltaic power generation power prediction method, device and equipment integrating multiple weather conditions | |
CN107203493A (en) | Multiple target battle field situation method based on complicated ratio evaluation method | |
Legal Events
Code | Title | Description | Date |
---|---|---|---|
PB01 | Publication | | |
SE01 | Entry into force of request for substantive examination | | |
GR01 | Patent grant | | |