CN116483789A - Distributed photovoltaic data double compression method and device based on carrier communication - Google Patents
- Publication number
- CN116483789A (application CN202310482625.5A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/174—Redundancy elimination performed by the file system
- G06F16/1744—Redundancy elimination performed by the file system using compression, e.g. sparse files
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L69/00—Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
- H04L69/04—Protocols for data compression, e.g. ROHC
Abstract
The invention provides a distributed photovoltaic data double compression method and device for carrier communication, wherein the method comprises the following steps: collecting distributed photovoltaic original data, and denoising the distributed photovoltaic data to obtain first distributed photovoltaic data; constructing a time period matrix according to different time periods and a plurality of characteristic dimensions of the first distributed photovoltaic data; when a preset condition is met, performing data compression on the first distributed photovoltaic data based on the wavelet LZW dictionary to obtain second distributed photovoltaic data; constructing a characteristic value knowledge graph, calculating the inferable degree, respectively giving a first weight, and carrying out second data compression on the second distributed photovoltaic data according to the given weight. Compared with the prior art, by constructing a time period matrix and carrying out lossless data compression based on wavelet information entropy, the change of the time domain information entropy of the acquired data can be accurately captured, and the recognition capability of redundant data with small change of the information entropy of the front and rear adjacent time periods is improved.
Description
Technical Field
The invention relates to the field of power data compression, in particular to a distributed photovoltaic data double compression method and device based on carrier communication.
Background
With the accelerating construction of the new-type power system, massive numbers of distributed photovoltaic devices need to be connected to the power grid and finely managed. To support services such as accurate fault location of photovoltaic panels, peak shaving and frequency regulation, and grid connection and disconnection, the volume of monitoring data generated by distributed photovoltaic systems is rising exponentially. Monitoring data collected by the devices, such as voltage, current, active power, reactive power, inclination angle, temperature, humidity and illumination, must be aggregated at an edge gateway through power line carrier communication and uploaded to a master station. Because the carrier communication channel has limited capacity, a new data compression method is needed to support massive data transmission: one that compresses multi-source heterogeneous data as far as possible without losing useful information, thereby reducing data transmission and processing delay and relieving data storage pressure.
Traditional time-domain photovoltaic data compression methods cannot accurately capture the change in the time-domain information entropy of the collected data; they are poor at recognizing redundant data whose information entropy changes little between adjacent time periods, and a large amount of redundant information remains after compression.
Disclosure of Invention
The invention provides a distributed photovoltaic data double compression method and device based on carrier communication, which solve the technical problem of how to improve redundant data recognition capability by constructing a time segment matrix and calculating information entropy difference values between adjacent time periods and accurately capture time domain information entropy change of collected data.
In order to solve the above technical problems, an embodiment of the present invention provides a distributed photovoltaic data dual compression method for carrier communication, including:
collecting distributed photovoltaic original data, and denoising the distributed photovoltaic data to obtain first distributed photovoltaic data;
constructing a time period matrix according to different time periods of the first distributed photovoltaic data and a plurality of characteristic dimensions of the first distributed photovoltaic data; when the difference value of the information entropy between adjacent time periods with the same characteristic dimension in the time period matrix is smaller than a preset information entropy change threshold value, performing data compression based on a wavelet LZW dictionary on first distributed photovoltaic data corresponding to a time period before the adjacent time periods to obtain second distributed photovoltaic data; the characteristic dimension comprises a voltage characteristic dimension, a current characteristic dimension, an active power characteristic dimension, a reactive power characteristic dimension, an inclination characteristic dimension, a temperature characteristic dimension and a humidity characteristic dimension of the distributed photovoltaic;
and constructing a characteristic value knowledge graph according to the first distributed photovoltaic data, calculating the inferable degree of each main entity of the characteristic value knowledge graph, respectively endowing each main entity with a first weight, and carrying out second data compression on the second distributed photovoltaic data according to the endowed first weight.
Preferably, the characteristic value knowledge graph is $G=\{E_O, R, E_S, T\}$;
wherein $E_O$ is the main entity set, $E_S$ is the guest entity set, $R$ is the set of relations between main entities and guest entities, and $T$ is the timestamp set; the main entity set comprises a current feature, a voltage feature and a power feature; the guest entity set comprises light intensity, humidity and temperature features.
As a preferred solution, the calculating the inferable degree of each main entity of the eigenvalue knowledge graph specifically includes:
calculating the inferable degree $Y_o$ of the main entity $e_o$ according to the following formula:
[formula not reproduced in the source text]
wherein $e_S$ is a guest entity element related to the main entity; $h_{e_i}$ represents the set of entity embeddings involved, the entity embeddings comprising both guest and main entities; $h_{r_i}$ represents the set of main-guest relation embeddings involved; $h_{t_i}$ is the set of guest timestamps involved; the product operator denotes the Cartesian product; $U_i^o$ represents the number of neighboring main entities; and $\|\cdot\|_2$ denotes the 2-norm.
As a preferred solution, the second data compression is performed on the second distributed photovoltaic data according to the given first weight, specifically:
determining a characteristic value data compression threshold based on the assigned first weight and the calculated inferable degree;
when the first weight of a characteristic value of the second distributed photovoltaic data is smaller than the characteristic value data compression threshold, calculating the quantity of data to be transmitted for that characteristic value;
calculating a second weight corresponding to each moment of the characteristic value, sorting the data of all moments of the characteristic value according to the second weight, and deleting the data moment by moment, starting from the data with the smallest second weight, until the remaining data matches the quantity of data to be transmitted for the characteristic value.
As a preferred solution, the first weights are respectively given to the main entities, specifically:
weighting and ranking all the main entities according to the inferable degree and the information entropy, and giving each a first weight; wherein the information entropy $H(e_o)$ of the main entity is:
[formula not reproduced in the source text]
and the first weight $w_n$ of the main entity is:
[formula not reproduced in the source text]
wherein $T$ is the total number of moments, $P(x_{i,n})$ is the probability of occurrence of the $n$-th dimension feature at moment $i$, sigmoid is the activation function, and $N$ is the total number of feature dimensions.
As a preferred solution, the denoising processing is performed on the distributed photovoltaic data to obtain first distributed photovoltaic data, which specifically includes:
performing wavelet transformation on the distributed original photovoltaic data, and performing wavelet decomposition on the distributed photovoltaic data subjected to wavelet transformation through a wavelet function;
and denoising the detail coefficients of each layer through a wavelet threshold value, and reconstructing the approximation coefficient of the last layer and the detail coefficients of each layer according to the recorded wavelet to obtain the first distributed photovoltaic data.
As a preferred scheme, the wavelet decomposition is performed on the distributed photovoltaic data subjected to wavelet transformation through a wavelet function, specifically:
and carrying out wavelet decomposition on the distributed photovoltaic data subjected to wavelet transformation through a wavelet function, and carrying out whitening inspection on the detail coefficient after decomposition until the energy contained in the detail coefficient is greater than or equal to a set second threshold value.
Correspondingly, the embodiment of the invention also provides a distributed photovoltaic data double compression device based on carrier communication, which comprises a denoising module, a first compression module and a second compression module, wherein:
the denoising module is used for acquiring the distributed photovoltaic original data, denoising the distributed photovoltaic data and obtaining first distributed photovoltaic data;
the first compression module is used for constructing a time period matrix according to different time periods of the first distributed photovoltaic data and a plurality of characteristic dimensions of the first distributed photovoltaic data; when the difference value of the information entropy between adjacent time periods with the same characteristic dimension in the time period matrix is smaller than a preset information entropy change threshold value, performing data compression based on a wavelet LZW dictionary on first distributed photovoltaic data corresponding to a time period before the adjacent time periods to obtain second distributed photovoltaic data; the characteristic dimension comprises a voltage characteristic dimension, a current characteristic dimension, an active power characteristic dimension, a reactive power characteristic dimension, an inclination characteristic dimension, a temperature characteristic dimension and a humidity characteristic dimension of the distributed photovoltaic;
the second compression module is used for constructing a characteristic value knowledge graph according to the first distributed photovoltaic data, calculating the inferable degree of each main entity of the characteristic value knowledge graph, respectively endowing each main entity with a first weight, and carrying out second data compression on the second distributed photovoltaic data according to the endowed first weight.
Preferably, the characteristic value knowledge graph is $G=\{E_O, R, E_S, T\}$;
wherein $E_O$ is the main entity set, $E_S$ is the guest entity set, $R$ is the set of relations between main entities and guest entities, and $T$ is the timestamp set; the main entity set comprises a current feature, a voltage feature and a power feature; the guest entity set comprises light intensity, humidity and temperature features.
As a preferred solution, the second compression module calculates the inferable degree of each main entity of the feature value knowledge graph, specifically:
the second compression module calculates the inferable degree $Y_o$ of the main entity $e_o$ according to the following formula:
[formula not reproduced in the source text]
wherein $e_S$ is a guest entity element related to the main entity; $h_{e_i}$ represents the set of entity embeddings involved, the entity embeddings comprising both guest and main entities; $h_{r_i}$ represents the set of main-guest relation embeddings involved; $h_{t_i}$ is the set of guest timestamps involved; the product operator denotes the Cartesian product; $U_i^o$ represents the number of neighboring main entities; and $\|\cdot\|_2$ denotes the 2-norm.
Compared with the prior art, the embodiment of the invention has the following beneficial effects:
the embodiment of the invention provides a distributed photovoltaic data double compression method and device for carrier communication, wherein the distributed photovoltaic data double compression method comprises the following steps: collecting distributed photovoltaic original data, and denoising the distributed photovoltaic data to obtain first distributed photovoltaic data; constructing a time period matrix according to different time periods of the first distributed photovoltaic data and a plurality of characteristic dimensions of the first distributed photovoltaic data; when the difference value of the information entropy between adjacent time periods with the same characteristic dimension in the time period matrix is smaller than a preset information entropy change threshold value, performing data compression based on a wavelet LZW dictionary on first distributed photovoltaic data corresponding to a time period before the adjacent time periods to obtain second distributed photovoltaic data; the characteristic dimension comprises a voltage characteristic dimension, a current characteristic dimension, an active power characteristic dimension, a reactive power characteristic dimension, an inclination characteristic dimension, a temperature characteristic dimension and a humidity characteristic dimension of the distributed photovoltaic; and constructing a characteristic value knowledge graph according to the first distributed photovoltaic data, calculating the inferable degree of each main entity of the characteristic value knowledge graph, respectively endowing each main entity with a first weight, and carrying out second data compression on the second distributed photovoltaic data according to the endowed first weight. 
Compared with the existing photovoltaic data compression scheme based on the time domain, by constructing a time period matrix and carrying out lossless data compression based on wavelet information entropy, the change of the time domain information entropy of the acquired data can be accurately captured, the recognition capability of redundant data with small information entropy change of the front and rear adjacent time periods is improved, the residual quantity of the redundant information is reduced after the data compression is completed, and the quality of the compressed data is effectively ensured.
Further, the inferable property of different main entities is measured by the knowledge graph of the feature domain, the weight of the feature value can be calculated by combining the information entropy of different main entities, the feature value data compression threshold is set based on the weight and the inferable property, the secondary compression of the photovoltaic data is carried out, specifically, the feature value with low partial weight and strong inferable property is compressed, the data can be further simplified, and the data quantity required to be transmitted is reduced.
Further, the correlation between the time domain and the characteristic domain of the photovoltaic monitoring data is deeply mined by utilizing the continuity of time and space and the reasoning among the characteristics, and the redundancy of the data is reduced from the two dimensions of time and the characteristics so as to compress the data as much as possible, thereby realizing the efficient data transmission based on carrier waves and improving the compression efficiency.
Drawings
Fig. 1: the invention provides a flow diagram of one embodiment of a distributed photovoltaic data double compression method for carrier communication.
Fig. 2: the principle schematic diagram of an embodiment of the knowledge graph of the characteristic values of different dimensions at the moment t constructed by the embodiment of the invention.
Fig. 3: the invention provides a structural schematic diagram of one embodiment of a distributed photovoltaic data double compression device for carrier communication.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Embodiment one:
According to the related art, traditional data compression methods rely on a single domain. Time-domain photovoltaic data compression methods cannot accurately capture the change in the time-domain information entropy of the collected data; they are poor at recognizing redundant data whose information entropy changes little between adjacent time periods, and a large amount of redundant information remains after compression. Feature-domain photovoltaic data compression methods, on the other hand, cannot accurately mine the associations between different photovoltaic characteristic values and cannot compress redundant data according to the inferability among different features and the weight of the data, which reduces data compression efficiency. Moreover, the prior art ignores the close association between the time domain and the feature domain and compresses within a single domain only, so redundant data cannot be compressed to the maximum extent, and the transmission, storage and processing of massive photovoltaic monitoring data are difficult to support, which makes efficient data compression and efficient carrier-based transmission hard to achieve.
Referring to fig. 1, fig. 1 is a schematic flow diagram of a distributed photovoltaic data dual compression method for carrier communication according to an embodiment of the present invention, comprising steps S1 to S3, wherein:
Step S1, collecting distributed photovoltaic original data, and denoising the distributed photovoltaic data to obtain first distributed photovoltaic data.
Because the distributed photovoltaic original data contains certain noise, the embodiment performs denoising processing on the distributed photovoltaic data to obtain first distributed photovoltaic data, which specifically comprises the following steps:
Wavelet transformation is performed on the distributed original photovoltaic data: the data are decomposed with different wavelet functions, the ratio of the approximation coefficient energy to the total energy is calculated for each, and the wavelet function with the largest signal energy ratio is selected as the optimal wavelet function.
Then, wavelet decomposition is performed on the wavelet-transformed distributed photovoltaic data using the selected optimal wavelet function. As a preferred implementation of this embodiment:
wavelet decomposition is performed on the wavelet-transformed distributed photovoltaic data through the wavelet function, and whitening inspection is performed on the detail coefficients after each decomposition, until the energy contained in the detail coefficients is greater than or equal to a set second threshold. In this embodiment, as the number of wavelet decomposition layers increases, the resolution of the original signal gradually improves, which helps remove noise accurately in the subsequent denoising process. However, if the number of decomposition layers is too large, the useful signal gradually enters the detail coefficients and is removed together with the noise in the subsequent denoising process. Therefore, in this embodiment, whitening inspection is performed on the decomposed detail coefficients; when the energy contained in the detail coefficients is greater than or equal to the set threshold (the second threshold), the energy of the useful signal is considered to have entered the detail coefficients, so the decomposition should stop. In this way the number of wavelet decomposition layers is determined accurately.
The detail coefficients of each layer are denoised with a wavelet threshold, and the approximation coefficients of the last layer and the detail coefficients of each layer are reconstructed according to the recorded wavelet to obtain the first distributed photovoltaic data. This denoises the signal: the noisy distributed photovoltaic original data are processed into noise-free first distributed photovoltaic data. The wavelet threshold (the first threshold) is estimated adaptively using the steepest descent method.
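The decompose, threshold and reconstruct sequence of step S1 can be illustrated with a minimal sketch. It makes several assumptions that are not in the patent text: it uses the Haar wavelet rather than an energy-selected optimal wavelet, a fixed number of levels rather than the whitening-inspection stop rule, soft thresholding, and a fixed threshold `thr` rather than a steepest-descent estimate. The function names are hypothetical.

```python
import math

def haar_dwt(signal):
    """One level of the orthonormal Haar transform; len(signal) must be even."""
    s = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed sample pairs."""
    s = 1 / math.sqrt(2)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * s, (a - d) * s])
    return out

def soft_threshold(coeffs, thr):
    """Shrink each coefficient toward zero by thr (assumed thresholding rule)."""
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]

def denoise(signal, levels, thr):
    """Decompose `levels` times, threshold every detail layer, then reconstruct.

    len(signal) must be divisible by 2 ** levels.
    """
    approx, details = list(signal), []
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        details.append(soft_threshold(d, thr))
    # Rebuild from the last-layer approximation and each layer's detail coefficients.
    for d in reversed(details):
        approx = haar_idwt(approx, d)
    return approx
```

With `thr = 0.0` the function returns the input up to rounding error, which is a quick check that the reconstruction path is lossless before any thresholding is applied.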
Step S2, constructing a time period matrix according to different time periods of the first distributed photovoltaic data and a plurality of characteristic dimensions of the first distributed photovoltaic data; when the difference value of the information entropy between adjacent time periods with the same characteristic dimension in the time period matrix is smaller than a preset information entropy change threshold value, performing data compression based on a wavelet LZW dictionary on first distributed photovoltaic data corresponding to a time period before the adjacent time periods to obtain second distributed photovoltaic data; the characteristic dimensions comprise a voltage characteristic dimension, a current characteristic dimension, an active power characteristic dimension, a reactive power characteristic dimension, an inclination characteristic dimension, a temperature characteristic dimension and a humidity characteristic dimension of the distributed photovoltaic.
In this embodiment, a $T \times N$ matrix is first constructed for the first distributed photovoltaic data. It comprises $N$ characteristic dimensions, namely the voltage, current, active power, reactive power, inclination angle, temperature and humidity characteristic dimensions of the distributed photovoltaic, among others, as well as the time dimension. The data matrix $X$ is represented as:
$$X = \begin{bmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,N} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,N} \\ \vdots & \vdots & \ddots & \vdots \\ x_{T,1} & x_{T,2} & \cdots & x_{T,N} \end{bmatrix}$$
wherein $x_{t,n}$ is the data value of the $n$-th dimension feature at moment $t$; the $t$-th row is the $N$-dimensional photovoltaic grid-connected feature vector $x_t$ collected at moment $t$, and the $n$-th column is the characteristic value vector $x_n$ of the $n$-th dimension feature over moments 1 to $T$.
Then, the $T$ time points are divided into $M$ time periods, each of length $\lceil T/M \rceil$, where $\lceil \cdot \rceil$ denotes rounding up; if the $M$-th time period has too few time points, it is zero-padded. The constructed time period data matrix, namely the time period matrix, is:
$$\begin{bmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,N} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,N} \\ \vdots & \vdots & \ddots & \vdots \\ x_{M,1} & x_{M,2} & \cdots & x_{M,N} \end{bmatrix}$$
wherein $x_{m,n}$ is the data of the $n$-th feature in the $m$-th time period.
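The construction of the time period matrix can be sketched as follows. This is an illustrative sketch only: the function name `build_period_matrix` and the nested-list representation are assumptions, and zero padding is simply appended at the end of the time axis.

```python
import math

def build_period_matrix(X, M):
    """Split a T x N data matrix (list of rows) into an M x N matrix of segments.

    Each cell [m][n] holds the ceil(T / M) samples of feature n in period m;
    missing time points at the end are zero-padded.
    """
    T, N = len(X), len(X[0])
    L = math.ceil(T / M)                      # length of each time period
    padded = X + [[0.0] * N for _ in range(M * L - T)]
    return [
        [[padded[m * L + i][n] for i in range(L)] for n in range(N)]
        for m in range(M)
    ]

# Example: T = 5 moments, N = 2 features, M = 2 periods of length ceil(5 / 2) = 3.
X = [[1, 10], [2, 20], [3, 30], [4, 40], [5, 50]]
periods = build_period_matrix(X, 2)
# periods[0][0] is feature 0 in period 0; periods[1][1] is feature 1 in period 1.
```

Keeping each cell as a plain list of samples makes it straightforward to compute a per-period statistic, such as the information entropy, for one feature at a time.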
The information entropy of the $n$-th dimension feature in the $m$-th time period is calculated as follows:
[formula not reproduced in the source text]
wherein $P(x_{i,n})$ is the probability of occurrence of the $n$-th dimension feature at moment $i$ within the $m$-th time period.
Then, the difference in information entropy between time period $m$ and time period $m-1$ is calculated; an information entropy change threshold $\mathrm{entropy}_T$ can be preset.
When $|H(x_{m,n}) - H(x_{m-1,n})| < \mathrm{entropy}_T$, the information is considered highly repetitive, and data compression is performed on $x_{m-1,n}$.
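The entropy comparison between adjacent time periods can be sketched as below. The sketch assumes the standard Shannon entropy with base-2 logarithm and estimates probabilities from a histogram with a hypothetical `bin_width`; the patent text does not specify how the probabilities are obtained, so both choices are assumptions made for illustration.

```python
import math
from collections import Counter

def shannon_entropy(values, bin_width=0.5):
    """Shannon entropy (bits) of a segment, with probabilities from binned counts."""
    counts = Counter(math.floor(v / bin_width) for v in values)
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def redundant_periods(segments, threshold):
    """Indices m-1 of earlier periods whose entropy differs from period m by < threshold.

    `segments` is the list of per-period sample lists for ONE feature dimension.
    """
    H = [shannon_entropy(s) for s in segments]
    return [m - 1 for m in range(1, len(H)) if abs(H[m] - H[m - 1]) < threshold]

# Two nearly constant periods followed by a varying one.
segments = [[1.0, 1.0, 1.0, 1.0], [1.1, 1.1, 1.1, 1.1], [1.0, 2.0, 3.0, 4.0]]
flagged = redundant_periods(segments, threshold=0.1)
```

Here only period 0 is flagged, because its entropy matches that of period 1, while period 2 carries noticeably more information than period 1.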
Preferably, an LZW dictionary data compression method may be adopted, specifically:
wavelet reconstruction is performed on the elements of $x_{m-1,n}$:
[formula not reproduced in the source text]
wherein the reconstructed value is the $n$-th dimension feature at moment $p$ in time period $m-1$, and $h$ and $g$ are the wavelet reconstruction filters in the time domain. After wavelet reconstruction, a new character of $x_{m-1,n}$ is read in and combined with the character at the corresponding position in $x_{m,n}$ to form a character combination, and the dictionary preset by LZW is searched for this character combination.
If the character combination is in the dictionary, the next character of $x_{m-1,n}$ is read in; if not, a character code is output as follows:
[formula not reproduced in the source text]
wherein the addition operator denotes modulo-2 addition. Further, inverse reconstruction, namely $A^{-1}[x_{i,n}]$, is performed on the character code to obtain the final coding result. The above LZW dictionary-based data compression is repeated until all data, that is, all of $x_{m-1,n}$, are encoded.
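For orientation, the dictionary mechanism that LZW relies on can be sketched with the plain, textbook LZW algorithm. This is not the wavelet LZW variant described above: the wavelet reconstruction, the pairing with characters of $x_{m,n}$ and the modulo-2 coding step are omitted, and the quantized integer symbols and the `alphabet` argument are assumptions made for illustration.

```python
def lzw_compress(symbols, alphabet):
    """Textbook LZW: emit dictionary codes, growing the dictionary as phrases repeat.

    Every symbol in `symbols` must appear in `alphabet`.
    """
    dictionary = {(s,): i for i, s in enumerate(alphabet)}
    current, codes = (), []
    for s in symbols:
        candidate = current + (s,)
        if candidate in dictionary:
            current = candidate               # keep extending the known phrase
        else:
            codes.append(dictionary[current])
            dictionary[candidate] = len(dictionary)
            current = (s,)
    if current:
        codes.append(dictionary[current])
    return codes

def lzw_decompress(codes, alphabet):
    """Rebuild the symbol sequence by replaying the same dictionary growth."""
    dictionary = {i: (s,) for i, s in enumerate(alphabet)}
    if not codes:
        return []
    previous = dictionary[codes[0]]
    out = list(previous)
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        else:                                 # code refers to the phrase being built
            entry = previous + (previous[0],)
        out.extend(entry)
        dictionary[len(dictionary)] = previous + (entry[0],)
        previous = entry
    return out

# Quantized samples of a slowly varying feature compress well.
samples = [3, 3, 3, 3, 4, 3, 3, 4]
codes = lzw_compress(samples, alphabet=[3, 4])
```

Because the decoder rebuilds the dictionary from the code stream itself, only the codes and the initial alphabet need to be transmitted over the carrier channel.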
Step S3, constructing a characteristic value knowledge graph according to the first distributed photovoltaic data, calculating the inferable degree of each main entity of the characteristic value knowledge graph, respectively giving each main entity a first weight, and performing second data compression on the second distributed photovoltaic data according to the given first weight.
In this embodiment, the characteristic value knowledge graph $G=\{E_O, R, E_S, T\}$ is constructed for the denoised first distributed photovoltaic data;
wherein $E_O$ is the main entity set, $E_S$ is the guest entity set, $R$ is the set of relations between main entities and guest entities, and $T$ is the timestamp set; the main entity set comprises a current feature, a voltage feature and a power feature; the guest entity set comprises light intensity, humidity and temperature features. It should be noted that the main entity set contains all photovoltaic characteristic values that can be predicted, such as current, voltage and power; the guest entity set contains all characteristic values from which a main entity can be predicted, such as light intensity, humidity and temperature; and $T$ is the set of time points of all measured data. The construction of the knowledge graph of characteristic values of different dimensions at moment $t$ is shown in fig. 2.
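One way to hold such a graph in memory is as a list of (main entity, relation, guest entity, timestamp) quadruples. The sketch below is only an illustration of that data structure: the class names, the `predicted_by` relation label and the string entity names are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Quadruple:
    """One fact of the graph: a main entity related to a guest entity at a timestamp."""
    main_entity: str
    relation: str
    guest_entity: str
    timestamp: int

@dataclass
class FeatureGraph:
    main_entities: set                      # E_O: predictable features
    guest_entities: set                     # E_S: features used to predict them
    facts: list = field(default_factory=list)

    def add(self, main, relation, guest, t):
        """Record a fact after checking both entities belong to their sets."""
        if main not in self.main_entities or guest not in self.guest_entities:
            raise ValueError("unknown entity")
        self.facts.append(Quadruple(main, relation, guest, t))

    def guests_of(self, main, t):
        """Guest entities linked to `main` at timestamp `t`, sorted by name."""
        return sorted({q.guest_entity for q in self.facts
                       if q.main_entity == main and q.timestamp == t})

graph = FeatureGraph({"current", "voltage", "power"},
                     {"light_intensity", "humidity", "temperature"})
graph.add("power", "predicted_by", "light_intensity", 0)
graph.add("power", "predicted_by", "temperature", 0)
```

The relation set $R$ and the timestamp set $T$ are implicit here: they are the distinct `relation` and `timestamp` values that occur in `facts`.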
Then, using the constructed knowledge graph, the inferability of the different main entity features is measured, specifically:
the inferable degree $Y_o$ of the main entity $e_o$ is calculated according to the following formula:
[formula not reproduced in the source text]
wherein $e_S$ is a guest entity element related to the main entity; $h_{e_i}$ represents the set of entity embeddings involved, the entity embeddings comprising both guest and main entities; $h_{r_i}$ represents the set of main-guest relation embeddings involved; $h_{t_i}$ is the set of guest timestamps involved; the product operator denotes the Cartesian product and is used to connect the vectors; $U_i^o$ represents the number of neighboring main entities; and $\|\cdot\|_2$ denotes the 2-norm.
Next, the different main entities, that is, the feature values, are weighted and ranked according to their inferable degree and information entropy, and each is assigned a weight (the first weight). The information entropy $H(e_o)$ of a main entity is:
The first weight $w_n$ of the main entity is:
where $T$ is the total number of time instants, $P(x_{i,n})$ is the probability of occurrence of the feature in the $n$-th dimension at time $i$, Sigmoid is the activation function, and $N$ is the total number of feature dimensions. Depending on the application scenario, other activation functions may be used as required.
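The entropy and weighting step can be sketched as follows, purely as an illustration. The entropy function is the standard Shannon entropy with probabilities estimated from an equal-width histogram, which is an assumption about how $P(x_{i,n})$ is obtained. The formula for the first weight is not reproduced in this text, so `first_weights` uses a hypothetical stand-in (a sigmoid of entropy minus inferable degree, normalized over the features) that only mirrors the stated intent: high entropy raises the weight and high inferability lowers it.

```python
import math
from collections import Counter


def shannon_entropy(samples, bins=8):
    """Shannon entropy in bits; probabilities come from an equal-width histogram
    (the binning scheme is an assumption of this sketch)."""
    lo, hi = min(samples), max(samples)
    if hi == lo:
        return 0.0  # a constant series carries no information
    width = (hi - lo) / bins
    counts = Counter(min(int((x - lo) / width), bins - 1) for x in samples)
    total = len(samples)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def first_weights(entropies, inferabilities):
    """HYPOTHETICAL stand-in for the first-weight formula: sigmoid(H - Y),
    normalized so that the weights of all features sum to 1."""
    raw = [sigmoid(h - y) for h, y in zip(entropies, inferabilities)]
    norm = sum(raw)
    return [r / norm for r in raw]


weights = first_weights([2.0, 0.5], [0.1, 0.9])
```

In this example the first feature has high entropy and low inferability, so it receives the larger weight.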
As a preferred embodiment, the second data compression is performed on the second distributed photovoltaic data according to the assigned first weights, specifically:
a feature value data compression threshold TH is determined based on the assigned first weights and the calculated inferable degrees;
when the first weight $w_n$ of a feature value of the second distributed photovoltaic data is smaller than the feature value data compression threshold TH, the data quantity to be transmitted for that feature value is calculated by the following formula:
where $w_{\max}$ is the maximum of all the weights, and $\lfloor\cdot\rfloor$ denotes rounding down.
Further, the second weight corresponding to each time instant of the feature value, from time 1 to time $T$, is calculated:
The data of the feature value at all time instants ($T$ in total) are sorted by second weight, and the values at individual time instants are deleted one by one, starting from the data with the smallest second weight, until the number of remaining data matches the data quantity $p_n$ to be transmitted for the feature value. In this way, the encoded data of the main entity with the largest weight are retained at all instants from time 1 to time $T$, while the data of the remaining feature values, which have lower weight and stronger inferability, are reduced proportionally. It should be noted that the two weights differ as follows: the first weight is attached to a feature value, whereas the second weight is attached to each time instant of that feature value.
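A minimal sketch of this pruning procedure is given below for illustration. The formulas for $p_n$ and for the second weight are not reproduced in this text: `transmitted_count` assumes $p_n = \lfloor (w_n / w_{\max}) \cdot T \rfloor$, which is only a guess consistent with the symbols named above, and the second weights are taken as given inputs. All function and variable names are hypothetical.

```python
import math


def transmitted_count(w_n, w_max, total_moments):
    """ASSUMED form of p_n: floor(w_n / w_max * T)."""
    return math.floor(w_n / w_max * total_moments)


def prune_feature(values, second_weights, p_n):
    """Delete time instants in ascending order of second weight until p_n remain.
    Returns (time_index, value) pairs in their original time order."""
    order = sorted(range(len(values)), key=lambda i: second_weights[i])
    drop = set(order[: max(len(values) - p_n, 0)])
    return [(i, v) for i, v in enumerate(values) if i not in drop]


def second_compression(features, first_w, second_w, th):
    """Prune only the features whose first weight is below the threshold TH."""
    w_max = max(first_w.values())
    out = {}
    for name, values in features.items():
        if first_w[name] < th:
            p_n = transmitted_count(first_w[name], w_max, len(values))
            out[name] = prune_feature(values, second_w[name], p_n)
        else:
            out[name] = list(enumerate(values))  # kept at every time instant
    return out


features = {"power": [10, 11, 12, 13], "voltage": [220, 221, 220, 222]}
first_w = {"power": 0.8, "voltage": 0.4}
second_w = {"power": [0.5, 0.5, 0.5, 0.5], "voltage": [0.9, 0.1, 0.2, 0.7]}
compressed = second_compression(features, first_w, second_w, th=0.5)
```

Here the power feature has the largest first weight and is kept in full, while the voltage feature falls below the threshold and keeps only the two instants with the largest second weights. Time indices are retained alongside the values so that a receiver can tell which instants were dropped.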
Dual compression of the distributed photovoltaic data is thereby achieved in both the time dimension and the feature dimension, and the result can be used for carrier-based data transmission and communication. The dual compression method reduces data redundancy and compresses the data as far as possible while preserving the quality of the compressed data, so as to support efficient data transmission over the carrier.
Correspondingly, referring to Fig. 3, an embodiment of the invention further provides a distributed photovoltaic data dual compression device based on carrier communication, comprising a denoising module 101, a first compression module 102 and a second compression module 103, wherein:
the denoising module 101 is configured to collect distributed photovoltaic raw data, denoise the distributed photovoltaic data, and obtain first distributed photovoltaic data;
the first compression module 102 is configured to construct a time period matrix according to different time periods of the first distributed photovoltaic data and a plurality of feature dimensions of the first distributed photovoltaic data; when the difference value of the information entropy between adjacent time periods with the same characteristic dimension in the time period matrix is smaller than a preset information entropy change threshold value, performing data compression based on a wavelet LZW dictionary on first distributed photovoltaic data corresponding to a time period before the adjacent time periods to obtain second distributed photovoltaic data; the characteristic dimension comprises a voltage characteristic dimension, a current characteristic dimension, an active power characteristic dimension, a reactive power characteristic dimension, an inclination characteristic dimension, a temperature characteristic dimension and a humidity characteristic dimension of the distributed photovoltaic;
the second compression module 103 is configured to construct a eigenvalue knowledge graph according to the first distributed photovoltaic data, calculate the inferable degree of each main entity of the eigenvalue knowledge graph, assign a first weight to each main entity, and perform a second data compression on the second distributed photovoltaic data according to the assigned first weight.
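The entropy-change test applied by the first compression module 102 can be illustrated with the sketch below. It assumes that each cell of the time period matrix holds already-quantized samples, that the entropy is the Shannon entropy of their empirical distribution, and that the period selected for compression is the earlier one of each adjacent pair; the function names and the dictionary layout are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy in bits of the empirical distribution of discrete values."""
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in Counter(values).values())


def compressible_periods(period_matrix, threshold):
    """period_matrix maps a feature dimension to its list of time periods,
    each period being a list of quantized samples.
    For every feature, return the indices of the earlier period of each
    adjacent pair whose entropy difference is below the threshold."""
    result = {}
    for feature, periods in period_matrix.items():
        h = [entropy(p) for p in periods]
        result[feature] = [
            k for k in range(len(h) - 1) if abs(h[k + 1] - h[k]) < threshold
        ]
    return result


matrix = {"voltage": [[1, 1, 2, 2], [3, 3, 4, 4], [1, 2, 3, 4]]}
selected = compressible_periods(matrix, threshold=0.1)
```

The first two voltage periods both have an entropy of 1 bit, so period 0 is selected; the third period has 2 bits, so period 1 is not.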
Preferably, the feature value knowledge graph is $G = \{E_O, R, E_S, T\}$;
where $E_O$ is the main entity set, $E_S$ is the guest entity set, $R$ is the set of relations between main entities and guest entities, and $T$ is the timestamp set; the main entity set comprises a current feature, a voltage feature and a power feature; the guest entity set comprises a light intensity feature, a humidity feature and a temperature feature.
As a preferred solution, the second compression module 103 calculates the inferable degree of each main entity of the feature value knowledge graph, specifically:
the second compression module 103 calculates the inferable degree $Y_o$ of the main entity $e_o$ according to the following formula:
where $e_S$ is a guest entity element related to the main entity; $h_{ei}$ denotes the set of embeddings of the entities involved, which includes both guest and main entities; $h_{ri}$ denotes the set of embeddings of the main-guest relations involved; $h_{ti}$ is the set of guest timestamps involved; the formula further uses the Cartesian product of two embeddings and the number of neighboring main entities; and $\|\cdot\|_2$ denotes the 2-norm.
Compared with the prior art, the embodiment of the invention has the following beneficial effects:
the embodiment of the invention provides a distributed photovoltaic data double compression method and device for carrier communication, wherein the distributed photovoltaic data double compression method comprises the following steps: collecting distributed photovoltaic original data, and denoising the distributed photovoltaic data to obtain first distributed photovoltaic data; constructing a time period matrix according to different time periods of the first distributed photovoltaic data and a plurality of characteristic dimensions of the first distributed photovoltaic data; when the difference value of the information entropy between adjacent time periods with the same characteristic dimension in the time period matrix is smaller than a preset information entropy change threshold value, performing data compression based on a wavelet LZW dictionary on first distributed photovoltaic data corresponding to a time period before the adjacent time periods to obtain second distributed photovoltaic data; the characteristic dimension comprises a voltage characteristic dimension, a current characteristic dimension, an active power characteristic dimension, a reactive power characteristic dimension, an inclination characteristic dimension, a temperature characteristic dimension and a humidity characteristic dimension of the distributed photovoltaic; and constructing a characteristic value knowledge graph according to the first distributed photovoltaic data, calculating the inferable degree of each main entity of the characteristic value knowledge graph, respectively endowing each main entity with a first weight, and carrying out second data compression on the second distributed photovoltaic data according to the endowed first weight. 
Compared with the existing photovoltaic data compression scheme based on the time domain, by constructing a time period matrix and carrying out lossless data compression based on wavelet information entropy, the change of the time domain information entropy of the acquired data can be accurately captured, the recognition capability of redundant data with small information entropy change of the front and rear adjacent time periods is improved, the residual quantity of the redundant information is reduced after the data compression is completed, and the quality of the compressed data is effectively ensured.
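For orientation, the dictionary stage of such a lossless scheme can be illustrated with the textbook LZW algorithm sketched below. This is plain byte-oriented LZW and not the wavelet LZW dictionary variant of the invention, whose details are not given in this passage; the function names are hypothetical.

```python
def lzw_compress(data: bytes) -> list:
    """Textbook LZW: emit one integer code per longest known byte string."""
    table = {bytes([i]): i for i in range(256)}
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate
        else:
            codes.append(table[current])
            table[candidate] = len(table)  # learn the new string
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list) -> bytes:
    """Inverse of lzw_compress; rebuilds the same dictionary on the fly."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            entry = previous + previous[:1]  # code defined by the current step
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(out)


sample = b"220,220,220,221,220,220,220,221"
codes = lzw_compress(sample)
```

Repetitive readings such as `sample` yield fewer codes than input bytes, and `lzw_decompress(codes)` restores the input exactly, which is the lossless property relied on above.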
Further, the inferability of the different main entities is measured with the knowledge graph of the feature domain, and the weight of each feature value is calculated in combination with the information entropy of the main entities. A feature value data compression threshold is set based on the weights and the inferability, and a second compression of the photovoltaic data is performed; specifically, the data of feature values with low weight and strong inferability are partially removed, which further simplifies the data and reduces the amount of data to be transmitted.
Further, the correlation between the time domain and the feature domain of the photovoltaic monitoring data is mined using the continuity of time and space and the inferability among features. Data redundancy is reduced along both the time dimension and the feature dimension so that the data are compressed as far as possible, thereby achieving efficient carrier-based data transmission and improving compression efficiency.
The foregoing embodiments have been provided for the purpose of illustrating the general principles of the present invention, and are not to be construed as limiting the scope of the invention. It should be noted that any modifications, equivalent substitutions, improvements, etc. made by those skilled in the art without departing from the spirit and principles of the present invention are intended to be included in the scope of the present invention.
Claims (10)
1. A distributed photovoltaic data double compression method for carrier communication, comprising:
collecting distributed photovoltaic original data, and denoising the distributed photovoltaic data to obtain first distributed photovoltaic data;
constructing a time period matrix according to different time periods of the first distributed photovoltaic data and a plurality of characteristic dimensions of the first distributed photovoltaic data; when the difference value of the information entropy between adjacent time periods with the same characteristic dimension in the time period matrix is smaller than a preset information entropy change threshold value, performing data compression based on a wavelet LZW dictionary on first distributed photovoltaic data corresponding to a time period before the adjacent time periods to obtain second distributed photovoltaic data; the characteristic dimension comprises a voltage characteristic dimension, a current characteristic dimension, an active power characteristic dimension, a reactive power characteristic dimension, an inclination characteristic dimension, a temperature characteristic dimension and a humidity characteristic dimension of the distributed photovoltaic;
and constructing a characteristic value knowledge graph according to the first distributed photovoltaic data, calculating the inferable degree of each main entity of the characteristic value knowledge graph, respectively endowing each main entity with a first weight, and carrying out second data compression on the second distributed photovoltaic data according to the endowed first weight.
2. The method for double compression of distributed photovoltaic data based on carrier communication according to claim 1, wherein the feature value knowledge graph is $G = \{E_O, R, E_S, T\}$;
where $E_O$ is the main entity set, $E_S$ is the guest entity set, $R$ is the set of relations between main entities and guest entities, and $T$ is the timestamp set; the main entity set comprises a current feature, a voltage feature and a power feature; the guest entity set comprises a light intensity feature, a humidity feature and a temperature feature.
3. The method for double compression of distributed photovoltaic data based on carrier communication according to claim 2, wherein the calculating the inferable degree of each main entity of the eigenvalue knowledge graph is specifically as follows:
calculating the inferable degree $Y_o$ of the main entity $e_o$ according to the following formula:
where $e_S$ is a guest entity element related to the main entity; $h_{ei}$ denotes the set of embeddings of the entities involved, which includes both guest and main entities; $h_{ri}$ denotes the set of embeddings of the main-guest relations involved; $h_{ti}$ is the set of guest timestamps involved; the formula further uses the Cartesian product of two embeddings and the number of neighboring main entities; and $\|\cdot\|_2$ denotes the 2-norm.
4. A method for dual compression of distributed photovoltaic data based on carrier communication according to claim 3, wherein the second data compression is performed on the second distributed photovoltaic data according to the given first weight, specifically:
determining a eigenvalue data compression threshold based on the assigned first weight and the calculated inferable degree;
when the first weight of the characteristic value of the second distributed photovoltaic data is smaller than the characteristic value data compression threshold value, calculating the data quantity transmitted by the characteristic value;
calculating a second weight corresponding to each moment of the characteristic value, sorting the data of all the moments of the characteristic value according to the second weight, and deleting the characteristic value of each moment in sequence from the data with the minimum second weight until the quantity of the data transmitted by the characteristic value is met.
5. A method for dual compression of distributed photovoltaic data based on carrier communication according to claim 3, wherein each of the main entities is given a first weight, specifically:
weighting and ranking all the main entities according to the inferable degree and the information entropy, and assigning a first weight to each; wherein the information entropy $H(e_o)$ of the main entity is:
and the first weight $w_n$ of the main entity is:
where $T$ is the total number of time instants, $P(x_{i,n})$ is the probability of occurrence of the feature in the $n$-th dimension at time $i$, Sigmoid is the activation function, and $N$ is the total number of feature dimensions.
6. The method for dual compression of distributed photovoltaic data based on carrier communication according to any one of claims 1 to 5, wherein the denoising process is performed on the distributed photovoltaic data to obtain first distributed photovoltaic data, specifically:
performing wavelet transformation on the distributed original photovoltaic data, and performing wavelet decomposition on the distributed photovoltaic data subjected to wavelet transformation through a wavelet function;
and denoising the detail coefficients of each layer through a wavelet threshold value, and reconstructing the approximation coefficient of the last layer and the detail coefficients of each layer according to the recorded wavelet to obtain the first distributed photovoltaic data.
7. The carrier communication-based distributed photovoltaic data dual compression method according to claim 6, wherein the wavelet decomposition is performed on the distributed photovoltaic data subjected to wavelet transformation by a wavelet function, specifically:
and carrying out wavelet decomposition on the distributed photovoltaic data subjected to wavelet transformation through a wavelet function, and carrying out whitening inspection on the detail coefficient after decomposition until the energy contained in the detail coefficient is greater than or equal to a set second threshold value.
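The denoising of claims 6 and 7 can be illustrated, under stated assumptions, by the sketch below. It uses the Haar wavelet, soft thresholding with a single fixed threshold, and a fixed number of decomposition levels; the claims name neither the wavelet function nor the thresholding rule, and the whitening inspection that decides the decomposition depth is not modeled here. All names are hypothetical.

```python
import math


def haar_step(signal):
    """One level of Haar decomposition: returns (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Exact inverse of haar_step."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend(((a + d) / s, (a - d) / s))
    return out


def soft_threshold(coeffs, thr):
    """Shrink every coefficient toward zero by thr (soft thresholding)."""
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]


def denoise(signal, levels, thr):
    """Decompose, threshold the detail coefficients of each level, then
    reconstruct from the last approximation and the thresholded details.
    The signal length must be divisible by 2**levels."""
    details = []
    approx = list(signal)
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(soft_threshold(d, thr))
    for d in reversed(details):
        approx = haar_inverse(approx, d)
    return approx


noisy = [5.1, 4.9] * 4          # constant level 5.0 with small alternating noise
clean = denoise(noisy, levels=2, thr=0.2)
```

With the threshold set to zero the function reproduces its input, which gives a quick check that decomposition and reconstruction are consistent.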
8. A distributed photovoltaic data double compression device based on carrier communication, characterized by comprising a denoising module, a first compression module and a second compression module, wherein:
the denoising module is used for acquiring the distributed photovoltaic original data, denoising the distributed photovoltaic data and obtaining first distributed photovoltaic data;
the first compression module is used for constructing a time period matrix according to different time periods of the first distributed photovoltaic data and a plurality of characteristic dimensions of the first distributed photovoltaic data; when the difference value of the information entropy between adjacent time periods with the same characteristic dimension in the time period matrix is smaller than a preset information entropy change threshold value, performing data compression based on a wavelet LZW dictionary on first distributed photovoltaic data corresponding to a time period before the adjacent time periods to obtain second distributed photovoltaic data; the characteristic dimension comprises a voltage characteristic dimension, a current characteristic dimension, an active power characteristic dimension, a reactive power characteristic dimension, an inclination characteristic dimension, a temperature characteristic dimension and a humidity characteristic dimension of the distributed photovoltaic;
the second compression module is used for constructing a characteristic value knowledge graph according to the first distributed photovoltaic data, calculating the inferable degree of each main entity of the characteristic value knowledge graph, respectively endowing each main entity with a first weight, and carrying out second data compression on the second distributed photovoltaic data according to the endowed first weight.
9. The carrier communication-based distributed photovoltaic data double compression device as claimed in claim 8, wherein the feature value knowledge graph is $G = \{E_O, R, E_S, T\}$;
where $E_O$ is the main entity set, $E_S$ is the guest entity set, $R$ is the set of relations between main entities and guest entities, and $T$ is the timestamp set; the main entity set comprises a current feature, a voltage feature and a power feature; the guest entity set comprises a light intensity feature, a humidity feature and a temperature feature.
10. The distributed photovoltaic data dual compression device based on carrier communication according to claim 9, wherein the second compression module calculates the inferable degree of each main entity of the eigenvalue knowledge graph, specifically:
the second compression module calculates the inferable degree $Y_o$ of the main entity $e_o$ according to the following formula:
where $e_S$ is a guest entity element related to the main entity; $h_{ei}$ denotes the set of embeddings of the entities involved, which includes both guest and main entities; $h_{ri}$ denotes the set of embeddings of the main-guest relations involved; $h_{ti}$ is the set of guest timestamps involved; the formula further uses the Cartesian product of two embeddings and the number of neighboring main entities; and $\|\cdot\|_2$ denotes the 2-norm.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310482625.5A CN116483789A (en) | 2023-04-28 | 2023-04-28 | Distributed photovoltaic data double compression method and device based on carrier communication |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116483789A true CN116483789A (en) | 2023-07-25 |
Family
ID=87221265
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310482625.5A Pending CN116483789A (en) | 2023-04-28 | 2023-04-28 | Distributed photovoltaic data double compression method and device based on carrier communication |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116483789A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117421699A (en) * | 2023-12-15 | 2024-01-19 | 佳源科技股份有限公司 | Electric energy meter fault fusion prediction method and system |
CN117421699B (en) * | 2023-12-15 | 2024-03-01 | 佳源科技股份有限公司 | Electric energy meter fault fusion prediction method and system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109145961B (en) | Pattern recognition method and system for unstructured partial discharge data | |
CN101944362B (en) | Integer wavelet transform-based audio lossless compression encoding and decoding method | |
CN116483789A (en) | Distributed photovoltaic data double compression method and device based on carrier communication | |
CN103398295B (en) | A kind of pipeline magnetic flux leakage signal data compression device and method | |
JPH08275165A (en) | Method and apparatus for coding video signal | |
CN112560699B (en) | Gear vibration information source underdetermined blind source separation method based on density and compressed sensing | |
CN115514343B (en) | Power grid waveform filtering system and filtering method thereof | |
CN110827198A (en) | Multi-camera panoramic image construction method based on compressed sensing and super-resolution reconstruction | |
CN116567269A (en) | Spectrum monitoring data compression method based on signal-to-noise separation | |
WO1985005514A1 (en) | Signal processing system | |
CN112468154A (en) | Data compression method suitable for visualization of oceanographic weather | |
CN111341331B (en) | Voice enhancement method, device and medium based on local attention mechanism | |
Pal et al. | A hybrid 2d ecg compression algorithm using dct and embedded zero tree wavelet | |
JPH0556070B2 (en) | ||
CN115389888B (en) | Partial discharge real-time monitoring system based on high-voltage cable | |
CN111224938A (en) | Wireless seismograph network compressed data transmission method | |
CN105021277A (en) | Wavelet-packet-correlation-dimension-combination-based vibration signal feature extraction method of high-voltage circuit breaker | |
WO2024011426A1 (en) | Point cloud geometry data augmentation method and apparatus, encoding method and apparatus, decoding method and apparatus, and encoding and decoding system | |
CN106160944B (en) | A kind of variable rate coding compression method of ultrasonic wave local discharge signal | |
Kok et al. | Multirate filter banks and transform coding gain | |
RU2227324C2 (en) | Device and method for coding and decoding graphical animation key data | |
Komatsu et al. | 3-d mean-separation-type short-time dft with its application to moving-image denoising | |
CN113949880A (en) | Extremely-low-bit-rate man-machine collaborative image coding training method and coding and decoding method | |
Acar et al. | Image coding using a weak membrane model of images | |
Sriraam et al. | Lossless compression of EEG data using neural network predictors |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||