CN110995396B - Compression method of communication messages of electricity consumption information acquisition system based on hierarchical structure - Google Patents

Compression method of communication messages of electricity consumption information acquisition system based on hierarchical structure

Info

Publication number
CN110995396B
Authority
CN
China
Prior art keywords
data
original
compressed
voltage
self
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911315950.2A
Other languages
Chinese (zh)
Other versions
CN110995396A (en)
Inventor
窦健
郑国权
卢继哲
郄爽
李然
陆春光
覃剑
黄天聪
苏航
荆向月
胡浩星
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
State Grid Zhejiang Electric Power Co Ltd
China Electric Power Research Institute Co Ltd CEPRI
Original Assignee
State Grid Corp of China SGCC
State Grid Zhejiang Electric Power Co Ltd
China Electric Power Research Institute Co Ltd CEPRI
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, State Grid Zhejiang Electric Power Co Ltd, China Electric Power Research Institute Co Ltd CEPRI
Priority to CN201911315950.2A
Publication of CN110995396A
Application granted
Publication of CN110995396B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/22 Parsing or analysis of headers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 Arrangements for detecting or preventing errors in the information received
    • H04L1/0078 Avoidance of errors by organising the transmitted data in a format specifically designed to deal with errors, e.g. location
    • H04L1/0083 Formatting with frames or packets; Protocol or part of protocol for error control
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/04 Protocols for data compression, e.g. ROHC
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/30 Definitions, standards or architectural aspects of layered protocol stacks

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses a hierarchical compression method for communication messages of an electricity consumption information acquisition system. The electricity consumption protocol data acquired at the user side is divided into control type data and numerical type data. The control type data mostly responds to downlink calling information; it is highly random, must not contain errors, and shows no discernible correlation with the downlink calling information, so it bypasses the earlier layers and is compressed as a whole in the last layer. The numerical type data is analyzed and classified into four subclasses, which are compressed with algorithms such as a self-coding network and differential coding according to their distribution and correlation characteristics. Because the numerical data follows internal regularities, correlation algorithms can capture these regularities to detect abnormal data and compensate for lost data, so that the data decompressed at the receiving end is more reasonable and reliable and the stability of the electricity consumption information acquisition system is guaranteed. Finally, the reintegrated data is losslessly compressed, which conforms to the original rules of the transmission system, improves efficiency and saves time.

Description

Compression method of communication messages of electricity consumption information acquisition system based on hierarchical structure
Technical Field
The invention relates to the technical field of data compression, in particular to a compression method of communication messages of a power utilization information acquisition system based on a hierarchical structure.
Background
In the daily operation of an electric power system, in order to effectively monitor and analyze the electric energy consumption condition of a user, an electric power company needs to collect and process the electricity consumption data of a large number of users. Along with the large-scale construction of the smart power grid, the smart electric meter replaces manual meter reading and is gradually popularized. Compared with traditional electric meter data, the data generated by the intelligent electric meter has the characteristics of high acquisition frequency, more transmission data, complex message structure and the like. In the process of transmitting and processing a large amount of power data, the data distribution is uneven, so that data redundancy is inevitably brought, the redundant data not only occupies bandwidth, affects the transmission speed, increases the expense, but also brings unnecessary time expense to the retrieval of information. The data compression technology can not only eliminate digital information redundancy, but also ensure the quality of transmitted data, reduce the load of a power communication system, save bandwidth and improve the operation efficiency.
The existing research on compressing the power data usually adopts only one algorithm to compress the whole packaged protocol data. The method is rough in all aspects, and the internal structure of the data is not analyzed in detail in combination with the protocol, so that the compression effect is not ideal. It is very urgent to solve the data compression and storage problems, and only finding a proper data compression method can reduce the data redundancy to the maximum extent.
Disclosure of Invention
In order to solve the technical problem, the invention provides a compression method of communication messages of a power consumption information acquisition system based on a hierarchical structure.
In order to achieve the purpose, the invention adopts the following specific technical scheme:
a compression method of communication messages of a power utilization information acquisition system based on a hierarchical structure is characterized by comprising the following steps:
S1: in a first layer of the data architecture, an electricity consumption data acquisition terminal divides the acquired electricity consumption protocol data into control type data and numerical type data, and analyzes and classifies the numerical type data to obtain original voltage data, original current data, original electric energy and rate data, and original power data;
S2: the original electric energy and rate data are compressed with a differential coding data compression algorithm to obtain first compressed data, and a self-coding neural network model is obtained by training on the original voltage data and the original current data;
S3: entering a second layer of the data architecture, the original voltage data and the original current data are processed with the trained self-coding neural network model to obtain second compressed data and data abnormal positions;
S4: data correction is performed according to the first compressed data, the second compressed data and the data abnormal positions to obtain third compressed data;
S5: entering a third layer of the data architecture, the third compressed data, the original power data and the control type data are rearranged according to the original time stamps in the protocol to obtain data to be compressed, which is compressed with a lossless compression algorithm to obtain fourth compressed data;
S6: the electricity consumption data acquisition terminal transmits the fourth compressed data to the master station.
Further, the self-coding neural network model comprises a coding network and a decoding network, the coding network is used for outputting second compressed data, and the decoding network is used for outputting data abnormal positions.
Furthermore, the coding network is provided with four layers, and the number of the neurons is 64, 32, 24 and 16 in sequence;
the decoding network is provided with four layers, and the number of the neurons is 16, 24, 32 and 64 in sequence.
Furthermore, the original voltage data and the original current data are independently trained to obtain a first self-coding neural network model and a second self-coding neural network model respectively, the first self-coding neural network model is used for determining second compressed data of the original voltage data and positions of data anomalies of the second compressed data, and the second self-coding neural network model is used for determining second compressed data of the original current data and positions of the data anomalies of the second compressed data.
Furthermore, the original voltage data and the original current data are normalized and data enhanced, and then are input into the self-coding neural network model.
Further, comparing the decoded output data of the first self-coding neural network model with corresponding voltage input data to generate a first fitting curve, and determining the position of data abnormality in the original voltage data by using the first fitting curve;
and comparing the decoded output data of the second self-coding neural network model with the corresponding current input data to generate a second fitting curve, and determining the position of data abnormality in the original current data by using the second fitting curve.
Further, a plurality of pieces of output data of the first self-coding neural network model within a period of time are acquired, a voltage data decoding curve over that time range is generated for each position point in the output data, and the position of data abnormality in the original voltage data is determined based on the voltage data decoding curve and a preset voltage abnormality threshold;
a plurality of pieces of output data of the second self-coding neural network model within a period of time are acquired, a current data decoding curve over that time range is generated for each position point in the output data, and the position of data abnormality in the original current data is determined based on the current data decoding curve and a preset current abnormality threshold.
Further, normalizing the original voltage data comprises: normalizing the original voltage data to the range 215-235 and retaining 3 decimal places in the result;
normalizing the original current data comprises: normalizing the original current data to the range 0-25 and retaining 3 decimal places in the result.
Further, step S5 includes:
S51: removing repeated characters in the data to be compressed with a run-length coding algorithm to obtain primary data of the fourth compressed data;
S52: replacing the frame header and the frame tail in the primary data of the fourth compressed data with a uniform character, and then coding with the Huffman algorithm to obtain secondary data of the fourth compressed data;
S53: coding the secondary data of the fourth compressed data with the LZ77 algorithm to obtain the final compressed data to be sent to the master station.
Further, step S4 includes:
when data loss occurs in the original voltage data or the original current data, the lost voltage or current data is determined from the voltage or current values near the point of loss and added at the corresponding position in the original voltage data or the original current data to form complete data, and the complete data is input into the self-coding neural network model again to obtain the third compressed data.
The compression method of communication messages of the electricity consumption information acquisition system based on the hierarchical structure treats different kinds of power data according to their different characteristics, takes both the overall structure of the protocol data and the correlation among its internal values into account, markedly improves the compression ratio, and protects information in which errors are not allowed, such as control frames and address frames. The numerical correlation is used to mark, correct and compensate abnormal data, which ensures the stability of the electricity consumption information acquisition system. The integrated compression performed in the last layer conforms to the rules of the original transmission system, which improves transmission efficiency and saves transmission time.
Drawings
The invention will be further described with reference to the accompanying drawings and examples, in which:
FIG. 1 is a schematic flowchart of a compression method of communication messages of an electricity consumption information acquisition system based on a hierarchical structure according to an embodiment of the present invention;
FIG. 2 is a diagram of the data architecture of the present invention;
FIG. 3 is a schematic diagram of a self-coding neural network;
FIG. 4 is a schematic diagram of a fitted curve output using the first self-coding neural network;
FIG. 5-1 is a schematic diagram of a peak anomaly in voltage data when a longitudinal comparison is performed;
FIG. 5-2 is a schematic diagram of a trough anomaly in voltage data when a longitudinal comparison is performed;
FIG. 6-1 is a schematic diagram of a first fitted curve output using the second self-coding neural network;
FIG. 6-2 is a schematic diagram of a second fitted curve output using the second self-coding neural network;
FIG. 6-3 is a schematic diagram of a third fitted curve output using the second self-coding neural network;
FIG. 7 is a schematic diagram of a peak anomaly in current data when a longitudinal comparison is performed;
FIG. 8-1 is a diagram of power data collected by an acquisition terminal;
FIG. 8-2 is a diagram of the data obtained by correcting the power data in FIG. 8-1.
Detailed Description
In order to make the technical problems, technical solutions and advantages of the present invention more apparent, the following detailed description is given with reference to the accompanying drawings and specific embodiments, it being understood that the specific embodiments described herein are merely illustrative of the present invention and are not intended to limit the present invention.
The present embodiment provides a method for compressing communication messages of an electricity consumption information acquisition system based on a hierarchical structure. Referring to FIG. 1, the method includes the following steps:
S1: in a first layer of the data architecture, an electricity consumption data acquisition terminal divides the acquired electricity consumption protocol data into control type data and numerical type data, and analyzes and classifies the numerical type data to obtain original voltage data, original current data, original electric energy and rate data, and original power data;
Before data compression, the acquired complete protocol data is parsed and split into a control part and a numerical part, and the numerical data is then classified. This embodiment takes power protocol data of the DL/T698.45 protocol as an example. Referring to FIG. 2, the original electricity consumption protocol data at the transmitting end (i.e., the acquisition terminal) is first split into numerical data and control data, and the numerical data is then split by attribute type into current data, voltage data, electric energy and rate data, and power data.
The power data of the DL/T698.45 protocol consists of instructions issued by the master station to the concentrator and responses reported by the concentrator to the master station, which ensures stable transmission and reception between the master station and the terminal. The frame format of the object-oriented DL/T698.45 protocol is shown in Table 1.
TABLE 1
Figure BDA0002325833170000061
It can be seen that the frame header and the frame tail of both the sent data and the response data comprise a length field L, a control field C, an address field A, a frame header check HCS and a frame check FCS. These fields have a fixed format and a fixed byte length, show no regularity or similarity between adjacent data, and must be transmitted completely and without error. They are collectively referred to as "control information"; they bypass the processing of the first two layers and undergo overall lossless compression directly in the last layer.
The link user data (application layer) contains the key residential electricity consumption data, from which the actual recorded meter values of several categories can be extracted. These data are collectively referred to as "numerical information". The method provided by this embodiment compresses them based on data correlation and improves the overall compression effect at the cost of a certain loss of precision.
S2: the original electric energy and rate data are compressed with a differential coding data compression algorithm to obtain first compressed data, and a self-coding neural network model is obtained by training on the original voltage data and the original current data;
When the self-coding neural network model is trained and used, the data usually needs to be normalized and enhanced. This embodiment may adopt an existing normalization algorithm, or normalize the data as follows:
the original voltage data is normalized to the range 215-235 and the original current data to the range 0-25, with the result retained to 3 decimal places in both cases.
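The differential coding applied to the electric energy and rate data in step S2 can be illustrated with a minimal sketch. The source does not specify the exact variant, so this assumes plain first-order differences over integer-scaled cumulative readings; the function names and the sample readings are hypothetical.

```python
from typing import List


def delta_encode(values: List[int]) -> List[int]:
    """Keep the first value and replace each later value by its difference from the previous one."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas: List[int]) -> List[int]:
    """Rebuild the original series by accumulating the differences."""
    out: List[int] = []
    total = 0
    for d in deltas:
        total += d
        out.append(total)
    return out


# Hypothetical cumulative energy readings, scaled to integers (e.g. units of 0.01 kWh).
readings = [123456, 123460, 123467, 123467, 123475]
encoded = delta_encode(readings)   # one large first value followed by small differences
restored = delta_decode(encoded)   # identical to the original readings
```

Because cumulative energy grows slowly, the differences are small numbers that a later entropy coder can represent with fewer bits than the raw readings.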
S3: entering a second layer data architecture, and processing the original voltage data and the original current data by using the trained self-coding neural network model to obtain second compressed data and a data abnormal position;
S4: data correction is performed according to the first compressed data, the second compressed data and the data abnormal positions to obtain third compressed data;
When data loss occurs in the original voltage data or the original current data, the lost voltage or current data is determined from the voltage or current values near the point of loss and added at the corresponding position in the original voltage data or the original current data to form complete data, and the complete data is input into the self-coding neural network model again to obtain the third compressed data.
S5: entering a third layer of data architecture, rearranging the third compressed data, the original power data and the control type data according to the original time stamp in the protocol to obtain data to be compressed, and performing compression processing by using a lossless compression algorithm to obtain fourth compressed data;
Specifically, step S5 may be performed as follows:
S51: removing repeated characters in the data to be compressed with a run-length coding algorithm to obtain primary data of the fourth compressed data;
S52: replacing the frame header and the frame tail in the primary data of the fourth compressed data with a uniform character, and then coding with the Huffman algorithm to obtain secondary data of the fourth compressed data;
S53: coding the secondary data of the fourth compressed data with the LZ77 algorithm to obtain the final compressed data to be sent to the master station.
S6: the electricity consumption data acquisition terminal transmits the fourth compressed data to the master station.
In specific implementation, the voltage input data and the current input data can be respectively input into the trained first self-coding neural network and the trained second self-coding neural network, the position of abnormal data in the original voltage data is determined according to the output data of the first self-coding neural network, and the position of abnormal data in the original current data is determined according to the output data of the second self-coding neural network.
In this embodiment, the first self-coding neural network is configured to encode voltage input data and then decode and output the voltage encoding result, and the second self-coding neural network is configured to encode current input data and then decode and output the current encoding result.
It should be noted that the first self-coding neural network in this embodiment may be obtained by training and testing on enhanced voltage data, which includes: acquiring a plurality of pieces of original voltage data for training, performing voltage data enhancement, normalizing the enhanced data, and inputting the normalized voltage data into a neural network for training to obtain the first self-coding neural network;
the second self-coding neural network may be obtained by training and testing on enhanced current data, which includes: acquiring a plurality of pieces of original current data for training, performing current data enhancement, normalizing the enhanced data, and inputting the normalized current data into a neural network for training to obtain the second self-coding neural network.
Each self-coding neural network in this embodiment may be composed of four layers of coding sub-networks and four layers of decoding sub-networks. Optionally, the number of output neurons at the lowest dimension in each self-coding neural network is 16 or 24.
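As a rough illustration of the four-layer coding and four-layer decoding structure, the following sketch runs one 96-point record through an untrained network. The layer widths (96 to 64, 32, 24, 16 and back) combine the neuron counts stated in the disclosure with the 96-point input used in the experiment; the sigmoid activation, the random initial weights and all names are assumptions, and no training is shown.

```python
import math
import random

# Assumed layer widths: 96-point input, four coding layers, four decoding layers.
ENCODER_SIZES = [96, 64, 32, 24, 16]
DECODER_SIZES = [16, 24, 32, 64, 96]


def init_layers(sizes, rng):
    """Create (weights, biases) for each pair of consecutive layer widths."""
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(x, layers):
    """Apply each dense layer followed by a sigmoid (the activation is an assumption)."""
    for weights, biases in layers:
        x = [1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(row, x)) + b)))
             for row, b in zip(weights, biases)]
    return x


rng = random.Random(0)
encoder = init_layers(ENCODER_SIZES, rng)
decoder = init_layers(DECODER_SIZES, rng)

record = [rng.random() for _ in range(96)]  # one normalized day of 96 samples
code = forward(record, encoder)             # 16 values: the compressed representation
restored = forward(code, decoder)           # 96 values: meaningful only after training
```

Only the 16-value code would need to be transmitted; the receiving end would run the decoding half to recover an approximation of the 96 samples.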
The process of determining the position of abnormal data in the original data from the output data of the self-coding neural network is described in detail below:
In a first example, the output data of the first self-coding neural network may be compared with the corresponding voltage input data to generate a first fitted curve, which is used to determine the position of abnormal data in the original voltage data; likewise, the output data of the second self-coding neural network may be compared with the corresponding current input data to generate a second fitted curve, which is used to determine the position of abnormal data in the original current data.
In a second example, a plurality of pieces of output data of the first self-coding neural network within a period of time may be acquired, a voltage data decoding curve over that time range is generated for each position point in the output data, and the position of abnormal data in the original voltage data is determined based on the voltage data decoding curve and a preset voltage abnormality threshold; a plurality of pieces of output data of the second self-coding neural network within a period of time are acquired, a current data decoding curve over that time range is generated for each position point in the output data, and the position of abnormal data in the original current data is determined based on the current data decoding curve and a preset current abnormality threshold.
In a third example, the methods of the first two examples may be combined. For the voltage data, the position of abnormal data is determined once according to the first fitted curve and once according to the voltage data decoding curve and the preset voltage abnormality threshold, and the union of the two results is taken as the positions where abnormalities actually occur in the original voltage data. For the current data, the position of abnormal data is first determined according to the second fitted curve; if the abnormal data so determined reaches 80%, 90% or another preset percentage of the total original current data, the position of abnormal data is then determined according to the current data decoding curve and the preset current abnormality threshold.
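The two comparisons and their union can be sketched as follows. The source does not give the exact decision rules, so this assumes that the comparison against the input flags points whose reconstruction error exceeds a tolerance, and that the longitudinal comparison flags points that deviate from the median of the same position across several decoded records by more than a threshold. The function names, tolerances and sample values are all hypothetical.

```python
from statistics import median
from typing import List, Set, Tuple


def transverse_anomalies(original: List[float], decoded: List[float],
                         tolerance: float) -> Set[int]:
    """Positions where the input and the decoded output differ by more than `tolerance`."""
    return {i for i, (x, y) in enumerate(zip(original, decoded)) if abs(x - y) > tolerance}


def longitudinal_anomalies(decoded_records: List[List[float]],
                           threshold: float) -> Set[Tuple[int, int]]:
    """(record, position) pairs deviating from that position's median by more than `threshold`."""
    flagged = set()
    for pos in range(len(decoded_records[0])):
        series = [rec[pos] for rec in decoded_records]
        centre = median(series)
        for r, value in enumerate(series):
            if abs(value - centre) > threshold:
                flagged.add((r, pos))
    return flagged


# Hypothetical decoded voltage records (three records, three position points each).
decoded_records = [
    [220.0, 221.0, 219.5],
    [220.2, 221.1, 219.7],
    [220.1, 236.0, 219.6],
]
original_last = [220.1, 236.2, 245.0]  # hypothetical input that produced the last record

by_fit = transverse_anomalies(original_last, decoded_records[2], tolerance=5.0)
by_curve = {pos for r, pos in longitudinal_anomalies(decoded_records, threshold=5.0) if r == 2}
combined = by_fit | by_curve  # union of both methods, as in the third example
```

In this toy data each method catches a point the other misses, which is the motivation for taking the union.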
Specifically, for an abnormal data point in the original rate data, an average value of normal values before and after the abnormal data point may be calculated, and the average value is used as the data after the abnormal data point is corrected. For example, the average of the previous data and the next data of the abnormal data point may be calculated, or the average of two or more previous data and two or more next data of the abnormal data point may be calculated.
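The averaging correction described above can be sketched as below. The `window` parameter selects how many normal values on each side are averaged (one, or two or more, as in the text); rounding to 3 decimal places and all names and sample values are assumptions made for illustration.

```python
from typing import Iterable, List


def correct_anomalies(values: List[float], anomalous: Iterable[int],
                      window: int = 1) -> List[float]:
    """Replace each abnormal point by the mean of up to `window` normal values on each side."""
    bad = set(anomalous)
    corrected = list(values)
    for i in sorted(bad):
        # Nearest normal values before and after the abnormal point.
        before = [values[j] for j in range(i - 1, -1, -1) if j not in bad][:window]
        after = [values[j] for j in range(i + 1, len(values)) if j not in bad][:window]
        neighbours = before + after
        if neighbours:  # leave the point unchanged if no normal neighbour exists
            corrected[i] = round(sum(neighbours) / len(neighbours), 3)
    return corrected


# Hypothetical series with one abnormal point at index 2.
series = [220.0, 220.4, 245.0, 220.8, 221.0]
fixed_one = correct_anomalies(series, [2], window=1)  # mean of 220.4 and 220.8
fixed_two = correct_anomalies(series, [2], window=2)  # mean of the two values on each side
```

Abnormal neighbours are skipped, so two adjacent abnormal points are both corrected from the nearest normal values rather than from each other.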
In this embodiment, run-length coding is first used to remove repeated characters in the data. Since the frame header and the frame tail of every piece of data are fixed, they can all be replaced by a uniform symbol. After this preliminary processing, Huffman coding is applied first and LZ77 compression second. The reason is that Huffman coding converts the data from its original hexadecimal format into a new binary format, which amounts to rearranging the data structure: the originally complex structure becomes uniform, which benefits the subsequent LZ77 coding, which relies on local data structure.
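A minimal sketch of a layered lossless stage is given below. It implements byte-level run-length coding by hand and then uses `zlib` from the Python standard library as a stand-in for the entropy and dictionary stages. Note the substitution: DEFLATE applies LZ77 first and Huffman coding second, which is the reverse of the order described in this embodiment, and the frame header and tail replacement step is not shown. The payload bytes and function names are hypothetical.

```python
import zlib


def rle_encode(data: bytes) -> bytes:
    """Encode runs of identical bytes as (count, byte) pairs, with runs capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and run < 255 and data[i + run] == data[i]:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decode(data: bytes) -> bytes:
    """Expand (count, byte) pairs back into the original byte string."""
    out = bytearray()
    for k in range(0, len(data), 2):
        out += bytes([data[k + 1]]) * data[k]
    return bytes(out)


def compress(payload: bytes) -> bytes:
    """Run-length coding followed by DEFLATE (stand-in for the Huffman and LZ77 stages)."""
    return zlib.compress(rle_encode(payload), 9)


def decompress(blob: bytes) -> bytes:
    """Invert `compress` exactly: the pipeline is lossless."""
    return rle_decode(zlib.decompress(blob))


# Hypothetical payload with a long run of zero bytes between two short fields.
payload = bytes.fromhex("68") + b"\x00" * 40 + bytes.fromhex("0102030416")
blob = compress(payload)
```

Whatever the stage order, the requirement that matters for the control type data is the exact round trip, which `decompress(compress(payload)) == payload` checks.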
Repeated tests show that the combined compression method provided by this embodiment outperforms any single method for both low-voltage uplink and dedicated-transformer uplink messages.
To verify the effectiveness of the method provided in this embodiment, experiments were carried out according to the method described above. The experiment uses power data conforming to the DL/T698.45 protocol collected from smart meters in one province; the electric energy consumption data was sampled every one minute or every five minutes over a whole day, for a total of 8820 pieces of data. The response data of one fixed terminal on that day was selected, with each piece of data representing the active power, reactive power, voltage, current or total-rate electric energy (i.e., the rate data) of the meter. The numerical data in the link user data, with the frame header and the frame tail removed, is analyzed as shown in Table 2. Since different data have different characteristics, the four types of decomposed numerical data are stored separately, and the subsequent compression experiments based on data correlation algorithms are performed on them.
TABLE 2
Figure BDA0002325833170000111
The first and second self-coding neural network models in the experiment both use a four-layer coding network and a four-layer decoding network, with the structure shown in FIG. 3. The classified DL/T698 data is parsed into data sets of 96 data points each. These 96 points are all the values (voltage or current) collected at 15-minute intervals over the 24 hours of a day.
After the 96-dimensional original data is input, its dimension is reduced by the coding sub-network and then raised by the decoding sub-network to restore a 96-dimensional output. The lowest-dimensional layer of the network is the output of the coding network; it is designed with 64, 32, 24 or 16 neurons, corresponding to data compression rates of approximately 2/3, 1/3, 1/4 and 1/6 respectively.
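The quoted figures follow directly from the sampling interval and the bottleneck widths. The short check below is only an arithmetic illustration; the variable names are hypothetical.

```python
from fractions import Fraction

# 24 hours sampled every 15 minutes gives the 96 points of one record.
points_per_day = 24 * 60 // 15

# Ratio of bottleneck width to input width for each of the four network types.
ratios = {width: Fraction(width, points_per_day) for width in (64, 32, 24, 16)}

for width, ratio in ratios.items():
    print(f"{width:2d} neurons -> {ratio} of the original dimension")
```

These ratios count values per record only; the ratio in bytes on the wire also depends on how each value is encoded, which is presumably why the text says "approximately".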
The electricity consumption data used in the experiment was parsed and classified according to the DL/T698 protocol, giving 728 pieces of data. After classifying the data by voltage, current and power attributes and removing data that was obviously abnormal or partially lost, the final data set contains 77 voltage records, 62 current records and 81 power records. Since each record has 96 points, this gives a total of 7392 voltage values, 5952 current values and 7776 power values.
Scatter plots of the different types of data show that the voltage data is concentrated and fairly uniformly distributed, whereas the current and power data are distributed randomly. The voltage and current data fluctuate within small intervals, while the power data can reach a maximum of about seven to eight thousand and a minimum that hovers near zero. The input data is batch-normalized before neural network training; this is reasonable for voltage and current, whose fluctuation intervals are small, but not for power. The designed network has difficulty capturing the regularities of the power data: the curve characteristics are unbalanced and the resulting error is large. The method provided by the invention therefore does not attempt to capture the regularities of the power data; it places the power data directly in the third layer, where it is spliced with the other data to form the complete data to be compressed, which is then losslessly compressed.
In the experiment, the data is normalized as described above before the self-coding neural network is trained. The voltage data is normalized over the range 215 to 235, and the result is retained to 3 decimal places. The current data is normalized over the range 0 to 25; the result is a floating-point number between 0 and 1, retained to 3 decimal places.
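One plausible reading of this normalization is min-max scaling against fixed bounds, sketched below. The source states the ranges and the 3 decimal places but not the formula, so the mapping of both voltage and current onto 0 to 1 is an assumption, as are the function names and sample values.

```python
from typing import List, Tuple

VOLTAGE_RANGE: Tuple[float, float] = (215.0, 235.0)  # bounds stated in the text
CURRENT_RANGE: Tuple[float, float] = (0.0, 25.0)


def normalize(values: List[float], bounds: Tuple[float, float], digits: int = 3) -> List[float]:
    """Min-max scale against fixed bounds and keep `digits` decimal places (assumed formula)."""
    low, high = bounds
    return [round((v - low) / (high - low), digits) for v in values]


def denormalize(values: List[float], bounds: Tuple[float, float], digits: int = 3) -> List[float]:
    """Map normalized values back to physical units."""
    low, high = bounds
    return [round(v * (high - low) + low, digits) for v in values]


volts = normalize([215.0, 225.0, 235.0], VOLTAGE_RANGE)  # hypothetical samples
amps = normalize([5.0], CURRENT_RANGE)
back = denormalize(volts, VOLTAGE_RANGE)
```

Values outside the stated bounds are not clipped in this sketch; they would simply fall outside 0 to 1, which may itself be a useful hint of abnormal data.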
Experience in machine learning shows that the amount of training data is very important, and a data set of several thousand samples or more is needed. To make the designed network work well, the data is also enhanced in the experiment. Specifically, for the voltage, a random number between -0.5 and 0.5 with 3 decimal places is added to each piece of original data (data before normalization), and this is repeated 29 times. Together with the 77 original pieces this yields 2310 pieces of training data; 10 pieces are discarded, 100 pieces form one batch, and 23 batches form one epoch. For the current, a random number between -0.6 and 0.6 with 3 decimal places is added to each piece of original data, again repeated 29 times. Together with the 62 original pieces this yields 1860 pieces of training data; 60 pieces are discarded, and 100 pieces form one batch.
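The enhancement and batching arithmetic can be reproduced with the sketch below. It assumes independent uniform noise per sample point and assumes that the discarded pieces are simply the trailing remainder; the source specifies neither. Function names and the placeholder data are hypothetical.

```python
import random
from typing import List


def augment(records: List[List[float]], amplitude: float,
            copies: int = 29, seed: int = 0) -> List[List[float]]:
    """Return the originals plus `copies` noisy versions of every record."""
    rng = random.Random(seed)
    out = [list(rec) for rec in records]
    for _ in range(copies):
        for rec in records:
            # Uniform noise in [-amplitude, amplitude], kept to 3 decimal places.
            out.append([round(v + rng.uniform(-amplitude, amplitude), 3) for v in rec])
    return out


def to_batches(records: List[List[float]], batch_size: int = 100) -> List[List[List[float]]]:
    """Drop the trailing remainder and split the rest into equal batches."""
    usable = len(records) - len(records) % batch_size
    return [records[i:i + batch_size] for i in range(0, usable, batch_size)]


# Placeholder data with the record counts reported in the experiment.
voltage_records = [[225.0] * 96 for _ in range(77)]
current_records = [[5.0] * 96 for _ in range(62)]

voltage_batches = to_batches(augment(voltage_records, 0.5))  # 2310 pieces -> 23 batches
current_batches = to_batches(augment(current_records, 0.6))  # 1860 pieces -> 18 batches
```

With 77 and 62 original records, 29 noisy copies plus the originals give 30 times as many pieces, which matches the 2310 and 1860 figures in the text.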
The three-phase voltage data are compressed first. The 2300 enhanced and normalized voltage input records are used, and the network structures are divided into four types for training and testing, whose encoders have 16, 24, 32 and 64 output neurons, respectively. Repeated testing and analysis showed that, on the basis of the existing network structure, a separate set of optimal parameters has to be set for each type of network.
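To give a feel for the trade-off among the four encoder sizes, the sketch below computes a nominal compression ratio. It assumes each input record is a 96-point curve, as the text states elsewhere, and it counts values only: it ignores how many bits each value occupies, so it is not the measured compression rate reported in the tables. The name `nominal_ratio` is hypothetical.

```python
POINTS_PER_RECORD = 96  # one record is a 96-point curve, per the text


def nominal_ratio(code_size, points=POINTS_PER_RECORD):
    """Ratio of input values to encoder output values (value counts only)."""
    return points / code_size


# The four encoder output sizes used in the experiment.
ratios = {k: nominal_ratio(k) for k in (16, 24, 32, 64)}
```

A smaller encoder output gives a higher nominal ratio, at the cost of reconstruction accuracy.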
Using the selected optimal parameters and the designed network structure, the four types of networks were each trained for several hours on a CPU-only computer. The loss drops sharply and levels off within the first 1000 iterations. Training nevertheless continues for tens of thousands of iterations, because the loss was observed to keep fluctuating and decreasing slowly over that span, and because the longer run also allows an objective evaluation of the training results of the four types of networks. The results for the different types of self-coding neural networks are shown in Table 3:
Table 3 (provided only as an image in the source: Figure BDA0002325833170000131)
Comparing the fitting plots of the four network structures shows that the fit improves as the number of encoder output neurons increases, which is consistent with the nature of self-coding networks. If the accuracy requirement is not particularly strict, a network whose encoder has 16 or 24 output neurons can be chosen, giving a higher compression rate without an excessive loss of precision.
Next, 1800 pieces of current input data are taken, and the network structure is again divided into four types for training and testing, with 16, 24, 32 and 64 encoder output neurons. Recovery curves for the current data are obtained by the same procedure, and the results show that the current data fit differently from the voltage data. For some current records the curve fit is close to saturation, while for others it can be very poor, because the current data do not follow the same patterns as the voltage data. The details are shown in Table 4:
Table 4 (provided only as an image in the source: Figure BDA0002325833170000141)
In actual meter-reading (data calling) operations on the national power grid, data are often missing or incomplete. When, for example, a real 96-point record contains only 90 points, the actual data are recovered using the data correlations learned by the self-coding neural network. First, the incomplete record is completed by filling each missing point with an average value; the completed record is then input into the trained self-coding neural network and decoded, and the de-normalized output is the predicted recovery data.
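The completion step that precedes decoding can be sketched as below. The text does not say exactly which average is used, so this sketch assumes the mean of the points that were observed in the same record; an average of neighbouring points would be an equally plausible reading. The function name `fill_missing` and the placeholder record are hypothetical, and the trained network itself is not shown.

```python
def fill_missing(record):
    """Replace None entries with the mean of the observed points (3 decimals).

    Assumption: the fill value is the record-wide mean of observed points.
    """
    observed = [x for x in record if x is not None]
    if not observed:
        raise ValueError("record has no observed points")
    mean = round(sum(observed) / len(observed), 3)
    return [mean if x is None else x for x in record]


# Placeholder record: 96 slots with 6 missing points (not real measurements).
record = [0.5] * 96
for k in (3, 17, 40, 41, 70, 95):
    record[k] = None

completed = fill_missing(record)  # would then be fed to the trained network
```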
In a verification experiment, decoding and recovery of the voltage data were quite successful, and the fitted recovery curve learned essentially all the patterns of the original data curve. Comparing it with the original data reveals anomalies in some of the real data. Taking a self-coding neural network whose encoder has 32 output neurons as an example, the acquisition terminal performs anomaly analysis on the voltage data. Specifically, referring to Fig. 4, the boxes in Fig. 4 mark the abnormal voltage data found by this method, which are in fact the parts where the fit is poor. Most of the discrepancies occur at small peaks in the raw data. For abnormal data in the real data, the marks of those data can be returned during the actual data compression process.
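A minimal sketch of this comparison is given below. It assumes that a point is flagged when the absolute difference between the original and decoded curves exceeds a fixed threshold; the text does not state the decision rule or a threshold value, so `anomaly_positions`, the threshold of 1.0 and the toy curves are all illustrative.

```python
def anomaly_positions(original, decoded, threshold):
    """Indices where |original - decoded| exceeds the threshold."""
    if len(original) != len(decoded):
        raise ValueError("curves must have the same length")
    return [k for k, (a, b) in enumerate(zip(original, decoded)) if abs(a - b) > threshold]


original = [220.1, 220.3, 223.9, 220.2, 220.0]  # toy values with one small peak
decoded = [220.0, 220.2, 220.4, 220.1, 220.1]   # toy reconstruction that misses the peak

flags = anomaly_positions(original, decoded, threshold=1.0)
```

The returned indices play the role of the marks that are sent along with the compressed data.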
In addition to the transverse comparison, a longitudinal comparison can be used, in which the 96 points are analyzed one by one across records (each point has 77 values). Examining these 96 voltage data points reveals two kinds of data anomaly: the boxes in Fig. 5-1 and Fig. 5-2 mark a "peak anomaly" and a "valley anomaly", respectively. Once an anomaly threshold has been set, these points are marked and transmitted together with the compressed data, so that the data center of the master station can issue an early warning. The specific anomaly determination results are shown in Figs. 5-1 and 5-2.
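The point-by-point analysis can be sketched as follows. The text only says that a threshold is set, so this sketch makes an assumption about what the threshold is compared against: the deviation of each value from the median of all values at the same point position. The function name, the threshold of 5.0 and the flat placeholder records are hypothetical.

```python
from statistics import median


def longitudinal_anomalies(records, threshold):
    """Return (record_index, point_index, kind) for values far from the per-point median.

    Assumption: "peak" means above the median by more than `threshold`,
    "valley" means below it by more than `threshold`.
    """
    found = []
    for p in range(len(records[0])):
        column = [r[p] for r in records]  # all values at point position p
        centre = median(column)
        for n, value in enumerate(column):
            if value - centre > threshold:
                found.append((n, p, "peak"))
            elif centre - value > threshold:
                found.append((n, p, "valley"))
    return found


# Placeholder data: 77 flat records of 96 points with two injected outliers.
records = [[220.0] * 96 for _ in range(77)]
records[5][10] = 228.0
records[8][50] = 213.0

result = longitudinal_anomalies(records, threshold=5.0)
```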
The preceding analysis suggests that the decoded fit of the current data is good, but the actual effect is quite different. The curve trends of different current records vary widely, in contrast to the voltage data, so the criterion of judging anomalies solely from the output of the self-coding neural network no longer applies. In Figs. 6-1, 6-2 and 6-3 the boxes mark some of the differences between the curves; the recovered curve fit for the current data is either good or bad, that is, the results are polarized. The fits in Figs. 6-1 and 6-3 are very good, with few anomalies, whereas Fig. 6-2 is anomalous throughout. This is probably because the neural network learned a "fluctuation" pattern during training: it is reasonable for it to remember fluctuation, but when flat segments appear the test performance suddenly deteriorates. Results that are either entirely wrong or entirely right are of little use for analyzing data anomalies. Because the given data set is limited, the designed network has learned this overfitted behavior; better results would require more, and more representative, current data so that the trained network becomes more robust. As a further refinement, the longitudinal comparison can be applied after the transverse comparison of the current data, analyzing the 96 points one by one (each point has 62 values). Unlike the voltage analysis, only "peak anomalies" were observed in the experiment, as in Fig. 7. As with the voltage, a threshold is set, and any "peak point" exceeding the threshold is judged to be a "peak anomaly".
By the definition of total-rate data, electric energy readings rise continuously under normal conditions. In actual data calling, however, some of the energy data collected by the terminal are abnormal for various special reasons. Fig. 8-1 shows one abnormal record, with boxes marking the specific abnormal data points. These points need to be repaired. One specific correction is to replace the abnormal point with the average of the nearby normal values before and after it. The abnormal values are corrected during compression, so decompression returns the repaired data; the result of the repair is shown in Fig. 8-2.
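The repair rule can be sketched as below. The averaging of the neighbouring normal values follows the text; the detection rule is an assumption, namely that a reading is abnormal when it falls outside the interval formed by its predecessor and successor while those two are themselves in rising order. The sketch handles isolated abnormal points only, and `repair_energy` and the toy readings are hypothetical.

```python
def repair_energy(readings):
    """Replace isolated out-of-order readings with the mean of their neighbours.

    Assumption: a reading is abnormal if it lies outside [previous, next]
    while previous <= next. Runs of consecutive bad points are not handled.
    """
    fixed = list(readings)
    for k in range(1, len(fixed) - 1):
        prev_ok, nxt = fixed[k - 1], fixed[k + 1]
        if prev_ok <= nxt and not (prev_ok <= fixed[k] <= nxt):
            fixed[k] = round((prev_ok + nxt) / 2, 3)
    return fixed


readings = [100.0, 101.0, 150.0, 103.0, 104.0]  # toy cumulative energy with one spike
repaired = repair_energy(readings)
```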
After the above steps, the numerical data that were split out and compressed according to their correlations are fully rearranged according to the marks and time stamps in the protocol. At this point the arranged data still contain control frames, address frames, and other data that must not be lost or corrupted in transmission, and these data are in fact redundant. Finally, the acquisition terminal compresses the complete data to be compressed with a lossless compression algorithm and sends the result to the master station.
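The final lossless stage can be illustrated with Python's standard `zlib` module, whose DEFLATE format combines LZ77 matching with Huffman coding. This is a stand-in chosen for illustration, not the specific sequence of run-length, Huffman and LZ77 steps recited in the claims, and the frame bytes below are arbitrary placeholders rather than real protocol data.

```python
import zlib


def compress_frame(payload: bytes) -> bytes:
    """Losslessly compress a spliced message (DEFLATE as a stand-in)."""
    return zlib.compress(payload, 9)


def decompress_frame(blob: bytes) -> bytes:
    """Recover the exact original bytes at the master station."""
    return zlib.decompress(blob)


# Toy payload: 20 repeated 42-byte frames with highly redundant content.
frame = bytes.fromhex("68" + "00" * 40 + "16") * 20

packed = compress_frame(frame)
restored = decompress_frame(packed)
```

Because the round trip is exact, control and address fields survive unchanged, which is the property the third layer relies on.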
The data compression method of the electricity consumption information acquisition system provided by this embodiment has three layers in total. First, each piece of power protocol data collected at the user side is divided into two parts: control data and numerical data. Control data must remain error-free and show little correlation with one another, so they are not processed in the first and second layers. Taking the DL/T698.45 protocol as the experimental case, the numerical data are analyzed and classified into four types, which are compressed with a self-coding network, differential coding, and so on, according to their respective characteristics. Because these correlation-based algorithms capture the regularities among the data, those regularities can also be used to mark and compensate for data anomalies and losses, so that the decompressed data at the receiving end are more reasonable and reliable. In the last layer, the numerical data and control data are spliced together again using information such as the marks and time stamps in the protocol, and an overall compression scheme suited to the structure of power utilization protocol data is designed with reference to the principles of three traditional compression algorithms.
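The differential coding applied to the electric energy and rate data can be sketched as follows. The sketch assumes the readings are held as integers (for example, scaled fixed-point values) so that the round trip is exact; the text does not specify the representation, and `delta_encode`, `delta_decode` and the toy readings are hypothetical.

```python
def delta_encode(values):
    """Keep the first value, then store successive differences."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Rebuild the original sequence by cumulative summation."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out


energy = [120345, 120350, 120358, 120358, 120371]  # toy cumulative readings
encoded = delta_encode(energy)
```

Since cumulative energy rises slowly, the differences are small numbers that need far fewer digits than the raw readings.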
While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (10)

1. A compression method of communication messages of a power utilization information acquisition system based on a hierarchical structure is characterized by comprising the following steps:
S1: in a first-layer data architecture, an electricity consumption data acquisition terminal divides the acquired electricity consumption protocol data into control data and numerical data, and analyzes and classifies the numerical data to obtain original voltage data, original current data, original electric energy and rate data, and original power data;
S2: the original electric energy and rate data are compressed with a differential coding data compression algorithm to obtain first compressed data, and a self-coding neural network model is obtained by training on the original voltage data and the original current data;
S3: entering a second-layer data architecture, the original voltage data and the original current data are processed with the trained self-coding neural network model to obtain second compressed data and data anomaly positions;
S4: data correction is performed according to the first compressed data, the second compressed data and the data anomaly positions to obtain third compressed data;
S5: entering a third-layer data architecture, the third compressed data, the original power data and the control data are rearranged according to the original time stamps in the protocol to obtain data to be compressed, which are compressed with a lossless compression algorithm to obtain fourth compressed data;
S6: the electricity consumption data acquisition terminal transmits the fourth compressed data to the master station.
2. The method for compressing communication messages of the power consumption information collection system based on the hierarchical structure according to claim 1, wherein:
the self-coding neural network model comprises a coding network and a decoding network, wherein the coding network is used for outputting second compressed data, and the decoding network is used for outputting data abnormal positions.
3. The method for compressing communication messages of the power consumption information collection system based on the hierarchical structure according to claim 2, wherein:
the coding network is provided with four layers, and the number of the neurons is 64, 32, 24 and 16 in sequence;
the decoding network is provided with four layers, and the number of the neurons is 16, 24, 32 and 64 in sequence.
4. The method for compressing communication messages of the power consumption information acquisition system based on the hierarchical structure according to claim 2 or 3, wherein: the original voltage data and the original current data are each trained independently to obtain a first self-coding neural network model and a second self-coding neural network model, respectively; the first self-coding neural network model is used for determining the second compressed data of the original voltage data and the positions of data anomalies therein, and the second self-coding neural network model is used for determining the second compressed data of the original current data and the positions of data anomalies therein.
5. The method for compressing communication messages of the electricity consumption information collection system based on the hierarchical structure according to claim 4, wherein: the original voltage data and the original current data are normalized and enhanced, and are then input into the self-coding neural network model.
6. The method for compressing communication messages of the electricity consumption information collection system based on the hierarchical structure according to claim 5, wherein:
comparing the decoded output data of the first self-coding neural network model with the corresponding voltage input data to generate a first fitting curve, and determining the position of data abnormality in the original voltage data by using the first fitting curve;
and comparing the decoded output data of the second self-coding neural network model with the corresponding current input data to generate a second fitting curve, and determining the position of data abnormality in the original current data by using the second fitting curve.
7. The method for compressing communication messages of the electricity consumption information collection system based on the hierarchical structure according to claim 5, wherein: acquiring a plurality of pieces of output data of a first self-coding neural network model in a time range, generating a voltage data decoding curve in the time range aiming at each position point in the output data, and determining the position of data abnormality in the original voltage data based on the voltage data decoding curve and a preset voltage abnormality threshold;
acquiring a plurality of pieces of output data of the second self-coding neural network model in a period of time, generating a current data decoding curve in the time range aiming at each position point in the output data, and determining the position of data abnormality in the original current data based on the current data decoding curve and a preset voltage abnormality threshold.
8. The method for compressing communication messages of the electricity consumption information collection system based on the hierarchical structure according to claim 5, wherein:
normalizing the original voltage data comprises: normalizing the original voltage data over the range 215 to 235 and retaining 3 decimal places in the result;
normalizing the original current data comprises: normalizing the original current data over the range 0 to 25 and retaining 3 decimal places in the result.
9. The method for compressing communication messages of the power consumption information collection system based on the hierarchical structure as claimed in claim 1, wherein the step S5 includes:
S51: removing repeated characters from the data to be compressed with a run-length coding algorithm to obtain primary data of the fourth compressed data;
S52: replacing the frame header and frame tail in the primary data of the fourth compressed data with a uniform character, and then coding with a Huffman algorithm to obtain secondary data of the fourth compressed data;
S53: coding the secondary data of the fourth compressed data with an LZ77 algorithm to obtain the final compressed data to be sent to the master station.
10. The method for compressing communication messages of the power consumption information collection system based on the hierarchical structure as claimed in claim 1, wherein the step S4 includes:
when data are missing from the original voltage data or the original current data, determining the missing voltage data or current data according to the voltage values or current values near the data loss point, adding the missing voltage data or current data at the corresponding position in the original voltage data or the original current data to form complete data, and inputting the complete data into the self-coding neural network model again to obtain the third compressed data.
CN201911315950.2A 2019-12-19 2019-12-19 Compression method of communication messages of electricity consumption information acquisition system based on hierarchical structure Active CN110995396B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911315950.2A CN110995396B (en) 2019-12-19 2019-12-19 Compression method of communication messages of electricity consumption information acquisition system based on hierarchical structure

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911315950.2A CN110995396B (en) 2019-12-19 2019-12-19 Compression method of communication messages of electricity consumption information acquisition system based on hierarchical structure

Publications (2)

Publication Number Publication Date
CN110995396A CN110995396A (en) 2020-04-10
CN110995396B true CN110995396B (en) 2022-01-11

Family

ID=70096056

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911315950.2A Active CN110995396B (en) 2019-12-19 2019-12-19 Compression method of communication messages of electricity consumption information acquisition system based on hierarchical structure

Country Status (1)

Country Link
CN (1) CN110995396B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112380676B (en) * 2020-10-29 2023-07-14 贵州电网有限责任公司 Digital twin data stream modeling and compression method for multi-energy system
CN114860797B (en) * 2022-03-16 2023-05-26 电子科技大学 Data derivatization processing method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105553625A (en) * 2016-01-19 2016-05-04 重庆邮电大学 Remote channel message compression method and system for electricity consumption collection system
CN106707099A (en) * 2016-11-30 2017-05-24 国网上海市电力公司 Monitoring and locating method based on abnormal electricity consumption detection module
CN107578124A (en) * 2017-08-28 2018-01-12 国网山东省电力公司电力科学研究院 The Short-Term Load Forecasting Method of GRU neutral nets is improved based on multilayer
CN110059357A (en) * 2019-03-19 2019-07-26 中国电力科学研究院有限公司 A kind of intelligent electric energy meter failure modes detection method and system based on autoencoder network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
《数据压缩在用电信息采集远程通信中的应用》;杜锦阳;《2018智能电网信息化建设研讨会》;20181201;全文 *

Also Published As

Publication number Publication date
CN110995396A (en) 2020-04-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20211028

Address after: 100085 Beijing city Haidian District Qinghe small Camp Road No. 15

Applicant after: CHINA ELECTRIC POWER RESEARCH INSTITUTE Co.,Ltd.

Applicant after: STATE GRID ZHEJIANG ELECTRIC POWER Co.,Ltd.

Applicant after: STATE GRID CORPORATION OF CHINA

Address before: 100085 Beijing city Haidian District Qinghe small Camp Road No. 15

Applicant before: CHINA ELECTRIC POWER RESEARCH INSTITUTE Co.,Ltd.

Applicant before: STATE GRID ZHEJIANG ELECTRIC POWER Co.,Ltd.

Applicant before: Chongqing University

Applicant before: STATE GRID CORPORATION OF CHINA

TA01 Transfer of patent application right
GR01 Patent grant