CN116483789A - Distributed photovoltaic data double compression method and device based on carrier communication - Google Patents

Info

Publication number
CN116483789A
Authority
CN
China
Prior art keywords
data
distributed photovoltaic
entity
photovoltaic data
compression
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310482625.5A
Other languages
Chinese (zh)
Inventor
李波
施展
邓晓智
杨志花
杨嘉明
黄平
钱鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Power Grid Co Ltd
Electric Power Dispatch Control Center of Guangdong Power Grid Co Ltd
Original Assignee
Guangdong Power Grid Co Ltd
Electric Power Dispatch Control Center of Guangdong Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Power Grid Co Ltd, Electric Power Dispatch Control Center of Guangdong Power Grid Co Ltd
Priority to CN202310482625.5A
Publication of CN116483789A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/174 Redundancy elimination performed by the file system
    • G06F16/1744 Redundancy elimination performed by the file system using compression, e.g. sparse files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/04 Protocols for data compression, e.g. ROHC

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Computational Linguistics (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Photovoltaic Devices (AREA)

Abstract

The invention provides a distributed photovoltaic data double compression method and device for carrier communication. The method comprises the following steps: collecting distributed photovoltaic original data and denoising it to obtain first distributed photovoltaic data; constructing a time period matrix from the different time periods and the plurality of feature dimensions of the first distributed photovoltaic data; when a preset condition is met, compressing the first distributed photovoltaic data based on a wavelet LZW dictionary to obtain second distributed photovoltaic data; constructing a feature value knowledge graph, calculating the inferable degree of each main entity, assigning each main entity a first weight, and performing a second data compression on the second distributed photovoltaic data according to the assigned weights. Compared with the prior art, constructing a time period matrix and performing lossless data compression based on wavelet information entropy makes it possible to accurately capture changes in the time-domain information entropy of the collected data and improves the ability to recognize redundant data whose information entropy changes little between adjacent time periods.

Description

Distributed photovoltaic data double compression method and device based on carrier communication
Technical Field
The invention relates to the field of power data compression, in particular to a distributed photovoltaic data double compression method and device based on carrier communication.
Background
With the accelerated construction of the new-type power system, a massive number of distributed photovoltaic devices must be connected to the power grid and managed in a fine-grained way. To support services such as accurate fault location, peak shaving and frequency regulation, and grid connection and disconnection of photovoltaic panels, the monitoring data generated by distributed photovoltaic systems is growing exponentially. Monitoring data collected by the devices, such as voltage, current, active power, reactive power, inclination angle, temperature, humidity and illumination, must be aggregated at an edge gateway through power line carrier communication and uploaded to a master station. Because the capacity of the carrier communication channel is limited, a new data compression method is needed to support the transmission of massive data: one that compresses multi-source heterogeneous data as far as possible without losing useful information, thereby reducing data transmission and processing delay and relieving data storage pressure.
Traditional time-domain photovoltaic data compression methods cannot accurately capture changes in the time-domain information entropy of the collected data. They are poor at recognizing redundant data whose information entropy changes little between adjacent time periods, so a large amount of redundant information remains after compression.
Disclosure of Invention
The invention provides a distributed photovoltaic data double compression method and device based on carrier communication. By constructing a time period matrix and calculating the information entropy difference between adjacent time periods, it solves the technical problem of how to improve the ability to recognize redundant data and accurately capture changes in the time-domain information entropy of the collected data.
In order to solve the above technical problems, an embodiment of the present invention provides a distributed photovoltaic data dual compression method for carrier communication, including:
collecting distributed photovoltaic original data, and denoising the distributed photovoltaic data to obtain first distributed photovoltaic data;
constructing a time period matrix according to different time periods of the first distributed photovoltaic data and a plurality of characteristic dimensions of the first distributed photovoltaic data; when the difference value of the information entropy between adjacent time periods with the same characteristic dimension in the time period matrix is smaller than a preset information entropy change threshold value, performing data compression based on a wavelet LZW dictionary on first distributed photovoltaic data corresponding to a time period before the adjacent time periods to obtain second distributed photovoltaic data; the characteristic dimension comprises a voltage characteristic dimension, a current characteristic dimension, an active power characteristic dimension, a reactive power characteristic dimension, an inclination characteristic dimension, a temperature characteristic dimension and a humidity characteristic dimension of the distributed photovoltaic;
and constructing a characteristic value knowledge graph according to the first distributed photovoltaic data, calculating the inferable degree of each main entity of the characteristic value knowledge graph, respectively endowing each main entity with a first weight, and carrying out second data compression on the second distributed photovoltaic data according to the endowed first weight.
Preferably, the feature value knowledge graph is $G = \{E_O, R, E_S, T\}$;
where $E_O$ is the main entity set, $E_S$ is the guest entity set, $R$ is the set of relations between main entities and guest entities, and $T$ is the timestamp set; the main entity set comprises a current feature, a voltage feature and a power feature; the guest entity set comprises a light intensity feature, a humidity feature and a temperature feature.
As a preferred solution, the calculating the inferable degree of each main entity of the eigenvalue knowledge graph specifically includes:
calculating the inferable degree $Y_o$ of the main entity $e_o$ according to the following formula:
where $e_S$ is a guest entity element related to the main entity; $h_{ei}$ denotes the set of entity embeddings involved, each entity embedding combining a guest entity and a main entity; $h_{ri}$ denotes the set of main-guest relation embeddings involved; $h_{ti}$ is the set of guest timestamps involved; the product operator in the formula denotes the Cartesian product; $U_i^o$ denotes the number of neighboring main entities; and $\|\cdot\|_2$ denotes the 2-norm.
As a preferred solution, the second data compression is performed on the second distributed photovoltaic data according to the given first weight, specifically:
determining a eigenvalue data compression threshold based on the assigned first weight and the calculated inferable degree;
when the first weight of the characteristic value of the second distributed photovoltaic data is smaller than the characteristic value data compression threshold value, calculating the data quantity transmitted by the characteristic value;
calculating a second weight corresponding to each moment of the characteristic value, sorting the data of all the moments of the characteristic value according to the second weight, and deleting the characteristic value of each moment in sequence from the data with the minimum second weight until the quantity of the data transmitted by the characteristic value is met.
As a preferred solution, the first weights are respectively given to the main entities, specifically:
weighting and ranking all main entities according to the inferable degree and the information entropy, and assigning each a first weight; wherein the information entropy $H(e_o)$ of the main entity is:
the first weight $w_n$ of the main entity is:
where $T$ is the total number of time instants, $P(x_{i,n})$ is the probability of occurrence of the n-th dimensional feature at moment $i$, sigmoid is the activation function, and $N$ is the total number of feature dimensions.
As a preferred solution, the denoising processing is performed on the distributed photovoltaic data to obtain first distributed photovoltaic data, which specifically includes:
performing wavelet transformation on the distributed original photovoltaic data, and performing wavelet decomposition on the distributed photovoltaic data subjected to wavelet transformation through a wavelet function;
and denoising the detail coefficients of each layer through a wavelet threshold value, and reconstructing the approximation coefficient of the last layer and the detail coefficients of each layer according to the recorded wavelet to obtain the first distributed photovoltaic data.
As a preferred scheme, the wavelet decomposition is performed on the distributed photovoltaic data subjected to wavelet transformation through a wavelet function, specifically:
and carrying out wavelet decomposition on the distributed photovoltaic data subjected to wavelet transformation through a wavelet function, and carrying out whitening inspection on the detail coefficient after decomposition until the energy contained in the detail coefficient is greater than or equal to a set second threshold value.
Correspondingly, the embodiment of the invention also provides a distributed photovoltaic data double compression device based on carrier communication, which comprises a denoising module, a first compression module and a second compression module; wherein,
the denoising module is used for acquiring the distributed photovoltaic original data, denoising the distributed photovoltaic data and obtaining first distributed photovoltaic data;
the first compression module is used for constructing a time period matrix according to different time periods of the first distributed photovoltaic data and a plurality of characteristic dimensions of the first distributed photovoltaic data; when the difference value of the information entropy between adjacent time periods with the same characteristic dimension in the time period matrix is smaller than a preset information entropy change threshold value, performing data compression based on a wavelet LZW dictionary on first distributed photovoltaic data corresponding to a time period before the adjacent time periods to obtain second distributed photovoltaic data; the characteristic dimension comprises a voltage characteristic dimension, a current characteristic dimension, an active power characteristic dimension, a reactive power characteristic dimension, an inclination characteristic dimension, a temperature characteristic dimension and a humidity characteristic dimension of the distributed photovoltaic;
the second compression module is used for constructing a characteristic value knowledge graph according to the first distributed photovoltaic data, calculating the inferable degree of each main entity of the characteristic value knowledge graph, respectively endowing each main entity with a first weight, and carrying out second data compression on the second distributed photovoltaic data according to the endowed first weight.
Preferably, the feature value knowledge graph is $G = \{E_O, R, E_S, T\}$;
where $E_O$ is the main entity set, $E_S$ is the guest entity set, $R$ is the set of relations between main entities and guest entities, and $T$ is the timestamp set; the main entity set comprises a current feature, a voltage feature and a power feature; the guest entity set comprises a light intensity feature, a humidity feature and a temperature feature.
As a preferred solution, the second compression module calculates the inferable degree of each main entity of the feature value knowledge graph, specifically:
the second compression module calculates the inferable degree $Y_o$ of the main entity $e_o$ according to the following formula:
where $e_S$ is a guest entity element related to the main entity; $h_{ei}$ denotes the set of entity embeddings involved, each entity embedding combining a guest entity and a main entity; $h_{ri}$ denotes the set of main-guest relation embeddings involved; $h_{ti}$ is the set of guest timestamps involved; the product operator in the formula denotes the Cartesian product; $U_i^o$ denotes the number of neighboring main entities; and $\|\cdot\|_2$ denotes the 2-norm.
Compared with the prior art, the embodiment of the invention has the following beneficial effects:
the embodiment of the invention provides a distributed photovoltaic data double compression method and device for carrier communication, wherein the distributed photovoltaic data double compression method comprises the following steps: collecting distributed photovoltaic original data, and denoising the distributed photovoltaic data to obtain first distributed photovoltaic data; constructing a time period matrix according to different time periods of the first distributed photovoltaic data and a plurality of characteristic dimensions of the first distributed photovoltaic data; when the difference value of the information entropy between adjacent time periods with the same characteristic dimension in the time period matrix is smaller than a preset information entropy change threshold value, performing data compression based on a wavelet LZW dictionary on first distributed photovoltaic data corresponding to a time period before the adjacent time periods to obtain second distributed photovoltaic data; the characteristic dimension comprises a voltage characteristic dimension, a current characteristic dimension, an active power characteristic dimension, a reactive power characteristic dimension, an inclination characteristic dimension, a temperature characteristic dimension and a humidity characteristic dimension of the distributed photovoltaic; and constructing a characteristic value knowledge graph according to the first distributed photovoltaic data, calculating the inferable degree of each main entity of the characteristic value knowledge graph, respectively endowing each main entity with a first weight, and carrying out second data compression on the second distributed photovoltaic data according to the endowed first weight. 
Compared with the existing photovoltaic data compression scheme based on the time domain, by constructing a time period matrix and carrying out lossless data compression based on wavelet information entropy, the change of the time domain information entropy of the acquired data can be accurately captured, the recognition capability of redundant data with small information entropy change of the front and rear adjacent time periods is improved, the residual quantity of the redundant information is reduced after the data compression is completed, and the quality of the compressed data is effectively ensured.
Further, the inferable property of different main entities is measured by the knowledge graph of the feature domain, the weight of the feature value can be calculated by combining the information entropy of different main entities, the feature value data compression threshold is set based on the weight and the inferable property, the secondary compression of the photovoltaic data is carried out, specifically, the feature value with low partial weight and strong inferable property is compressed, the data can be further simplified, and the data quantity required to be transmitted is reduced.
Further, the correlation between the time domain and the characteristic domain of the photovoltaic monitoring data is deeply mined by utilizing the continuity of time and space and the reasoning among the characteristics, and the redundancy of the data is reduced from the two dimensions of time and the characteristics so as to compress the data as much as possible, thereby realizing the efficient data transmission based on carrier waves and improving the compression efficiency.
Drawings
Fig. 1: the invention provides a flow diagram of one embodiment of a distributed photovoltaic data double compression method for carrier communication.
Fig. 2: the principle schematic diagram of an embodiment of the knowledge graph of the characteristic values of different dimensions at the moment t constructed by the embodiment of the invention.
Fig. 3: the invention provides a structural schematic diagram of one embodiment of a distributed photovoltaic data double compression device for carrier communication.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Embodiment one:
In the related art, traditional data compression methods work in a single domain. Time-domain photovoltaic data compression methods cannot accurately capture changes in the time-domain information entropy of the collected data; they are poor at recognizing redundant data whose information entropy changes little between adjacent time periods, so a large amount of redundant information remains after compression. Feature-domain photovoltaic data compression methods, on the other hand, cannot accurately mine the associations between different photovoltaic feature values, and cannot compress redundant data according to the inferability among different features and the weight of the data, which reduces compression efficiency. Moreover, the prior art ignores the close association between the time domain and the feature domain and relies on a single domain for compression. It therefore cannot achieve maximum compression of redundant data, has difficulty supporting the transmission, storage and processing of massive photovoltaic monitoring data, and makes efficient data compression and efficient carrier-based transmission hard to achieve.
Referring to fig. 1, which is a flow diagram of a distributed photovoltaic data double compression method for carrier communication according to an embodiment of the present invention, the method comprises steps S1 to S3; wherein,
step S1, collecting distributed photovoltaic original data, and denoising the distributed photovoltaic data to obtain first distributed photovoltaic data.
Because the distributed photovoltaic original data contains certain noise, the embodiment performs denoising processing on the distributed photovoltaic data to obtain first distributed photovoltaic data, which specifically comprises the following steps:
Wavelet transformation is performed on the distributed photovoltaic original data. After the data are decomposed with different wavelet functions, the ratio of the approximation coefficient energy to the energy of all wavelet functions is calculated, and the wavelet function with the largest signal energy ratio is selected as the optimal wavelet function.
Then, wavelet decomposition is performed on the wavelet-transformed distributed photovoltaic data using the selected optimal wavelet function. As a preferred implementation of this embodiment:
wavelet decomposition is performed on the wavelet-transformed distributed photovoltaic data through the wavelet function, and a whitening inspection is performed on the detail coefficients after each decomposition, until the energy contained in the detail coefficients is greater than or equal to a set second threshold. In this embodiment, as the number of wavelet decomposition layers increases, the resolution of the original signal gradually improves, which helps remove noise accurately in the subsequent denoising. However, if the number of decomposition layers is too large, the useful signal gradually enters the detail coefficients and is removed together with the noise in the subsequent denoising. Therefore, in this embodiment, the whitening inspection is performed on the decomposed detail coefficients: when the energy contained in the detail coefficients is greater than or equal to the set threshold (the second threshold), energy of the useful signal is considered to have entered the detail coefficients and the decomposition is stopped. In this way the number of wavelet decomposition layers is determined accurately.
The detail coefficients of each layer are then denoised with a wavelet threshold, and the approximation coefficients of the last layer and the detail coefficients of each layer are reconstructed according to the recorded wavelet to obtain the first distributed photovoltaic data. This denoises the signal: the noisy distributed photovoltaic original data is processed into noise-free first distributed photovoltaic data. The wavelet threshold (the first threshold) is estimated adaptively using the steepest descent method.
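The decompose, threshold and reconstruct flow of step S1 can be illustrated with a minimal sketch. It assumes a fixed Haar wavelet, a fixed number of layers and a fixed soft threshold; the embodiment's optimal wavelet selection, whitening-based stopping rule and steepest-descent threshold estimation are not implemented here, and all function names are hypothetical.

```python
import numpy as np

def haar_decompose(signal, levels):
    """Multi-level Haar decomposition.

    Returns the last-layer approximation coefficients and a list of
    detail coefficients, one array per layer (finest layer first).
    Assumes len(signal) is divisible by 2**levels.
    """
    approx = np.asarray(signal, dtype=float)
    details = []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))
        approx = (even + odd) / np.sqrt(2.0)
    return approx, details

def haar_reconstruct(approx, details):
    """Invert haar_decompose, starting from the coarsest layer."""
    for detail in reversed(details):
        out = np.empty(2 * len(approx))
        out[0::2] = (approx + detail) / np.sqrt(2.0)
        out[1::2] = (approx - detail) / np.sqrt(2.0)
        approx = out
    return approx

def soft_threshold(coeffs, thr):
    """Shrink coefficients toward zero by thr (soft thresholding)."""
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - thr, 0.0)

def denoise(signal, levels=2, thr=0.1):
    """Threshold the detail coefficients of every layer, then reconstruct."""
    approx, details = haar_decompose(signal, levels)
    details = [soft_threshold(d, thr) for d in details]
    return haar_reconstruct(approx, details)
```

With a threshold of zero the sketch reproduces its input exactly, which is a quick way to check that the decomposition and reconstruction are inverses of each other.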
Step S2, constructing a time period matrix according to different time periods of the first distributed photovoltaic data and a plurality of characteristic dimensions of the first distributed photovoltaic data; when the difference value of the information entropy between adjacent time periods with the same characteristic dimension in the time period matrix is smaller than a preset information entropy change threshold value, performing data compression based on a wavelet LZW dictionary on first distributed photovoltaic data corresponding to a time period before the adjacent time periods to obtain second distributed photovoltaic data; the characteristic dimensions comprise a voltage characteristic dimension, a current characteristic dimension, an active power characteristic dimension, a reactive power characteristic dimension, an inclination characteristic dimension, a temperature characteristic dimension and a humidity characteristic dimension of the distributed photovoltaic.
In this embodiment, a $T \times N$ matrix is first constructed for the first distributed photovoltaic data. The matrix comprises N feature dimensions, namely the voltage, current, active power, reactive power, inclination angle, temperature and humidity feature dimensions of the distributed photovoltaic, among others, together with a time dimension. The data matrix $X$ is expressed as:

$$X = \begin{bmatrix} x_{1,1} & \cdots & x_{1,N} \\ \vdots & \ddots & \vdots \\ x_{T,1} & \cdots & x_{T,N} \end{bmatrix}$$

where $x_{t,n}$ is the data value of the n-th dimensional feature at moment $t$; the t-th row is the N-dimensional photovoltaic grid-connected feature vector $x_t$ collected at moment $t$, and the n-th column is the feature value vector $x_n$ of the n-th dimensional feature over moments 1 to $T$.
Then, the $T$ time points are divided into $M$ time periods, each of length $\lceil T/M \rceil$, where $\lceil \cdot \rceil$ denotes rounding up; if the M-th time period has too few time points, it is zero-padded. The constructed time period data matrix, i.e. the time period matrix, is:

$$\begin{bmatrix} x_{1,1} & \cdots & x_{1,N} \\ \vdots & \ddots & \vdots \\ x_{M,1} & \cdots & x_{M,N} \end{bmatrix}$$

where $x_{m,n}$ is the data of the n-th feature in the m-th time period.
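As an illustration of the segmentation just described, the following sketch splits a T x N data matrix into M zero-padded time periods. The function name and the three-dimensional array layout are assumptions chosen for convenience, not part of the source.

```python
import math
import numpy as np

def build_period_tensor(X, M):
    """Split a T x N data matrix into M periods of length ceil(T / M).

    Missing time points at the end of the last period are zero-padded.
    The result has shape (M, L, N); result[m, :, n] holds the samples of
    feature n in period m, i.e. the segment written x_{m,n} in the text.
    """
    X = np.asarray(X, dtype=float)
    T, N = X.shape
    L = math.ceil(T / M)            # period length, rounded up
    padded = np.zeros((M * L, N))   # M * L >= T always holds
    padded[:T] = X
    return padded.reshape(M, L, N)
```

For example, 10 time points split into 4 periods gives periods of length 3, with the last two rows of the fourth period filled with zeros.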
The information entropy of the n-th dimensional feature in the m-th time period is calculated as follows:
where $P(x_{i,n})$ is the probability of occurrence of the n-th dimensional feature at moment $i$ within the m-th time period.
Then, the difference in information entropy between time period $m$ and time period $m-1$ is calculated; an information entropy change threshold $\mathrm{entropy}_T$ can be preset.
When $|H(x_{m,n}) - H(x_{m-1,n})| < \mathrm{entropy}_T$, the information is considered highly repetitive, and data compression is performed on $x_{m-1,n}$.
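The threshold test on adjacent periods can be sketched as follows. The sketch assumes Shannon entropy with a base-2 logarithm and estimates the probabilities from a histogram over a shared value range; the source does not state how the probabilities are obtained, so the binning, the `bins` parameter and the function names are all assumptions.

```python
import numpy as np

def segment_entropy(segment, bins=8, value_range=None):
    """Shannon entropy (bits) of one segment, from a histogram estimate."""
    counts, _ = np.histogram(segment, bins=bins, range=value_range)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

def redundant_periods(series_periods, entropy_threshold, bins=8):
    """Indices m-1 of periods selected for compression.

    series_periods has shape (M, L): one feature, M periods of length L.
    Period m-1 is selected when |H(x_m) - H(x_{m-1})| is below the threshold.
    """
    lo = float(np.min(series_periods))
    hi = float(np.max(series_periods))
    if lo == hi:                    # avoid a zero-width histogram range
        hi = lo + 1.0
    H = [segment_entropy(seg, bins, (lo, hi)) for seg in series_periods]
    return [m - 1 for m in range(1, len(H))
            if abs(H[m] - H[m - 1]) < entropy_threshold]
```

Using one value range for every period keeps the entropies comparable across periods; per-period ranges would make a flat segment and a varying segment look more alike than they are.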
Preferably, a data compression method of the LZW dictionary may be adopted, specifically:
Wavelet reconstruction is performed on the elements of $x_{m-1,n}$:
where the reconstructed quantity is the n-th dimensional feature at moment $p$ in time period $m-1$, and $h$ and $g$ are the time-domain wavelet reconstruction filters. After a new character of the wavelet-reconstructed $x_{m-1,n}$ is read in, it is combined with the character at the corresponding position in $x_{m,n}$ to form a character combination, and the dictionary preset by LZW is searched for this combination.
If the character combination is in the dictionary, the next character of $x_{m-1,n}$ is read in; if not, a character code is output as follows:
where the operator in the formula is modulo-2 addition. The result is then inversely reconstructed, i.e. $A^{-1}[x_{i,n}]$, to obtain the final coding result. The above LZW-dictionary-based data compression is repeated until all data (that is, all $x_{m-1,n}$) have been encoded.
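For readers unfamiliar with dictionary coding, the sketch below shows plain LZW over a sequence of integer symbols (for instance, quantized sample values). It is the textbook algorithm only: the wavelet reconstruction step, the pairing with characters of the next period and the modulo-2 addition described above are not reproduced, and the function names and the `alphabet_size` parameter are assumptions.

```python
def lzw_encode(symbols, alphabet_size):
    """Plain LZW: symbols are ints in range(alphabet_size); returns codes."""
    dictionary = {(s,): s for s in range(alphabet_size)}
    current = ()
    codes = []
    for s in symbols:
        candidate = current + (s,)
        if candidate in dictionary:
            current = candidate          # keep extending the match
        else:
            codes.append(dictionary[current])
            dictionary[candidate] = len(dictionary)  # learn the new phrase
            current = (s,)
    if current:
        codes.append(dictionary[current])
    return codes

def lzw_decode(codes, alphabet_size):
    """Inverse of lzw_encode; rebuilds the dictionary while decoding."""
    if not codes:
        return []
    dictionary = {s: (s,) for s in range(alphabet_size)}
    previous = dictionary[codes[0]]
    out = list(previous)
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        else:
            # Code not yet known: it must be previous + its own first symbol.
            entry = previous + (previous[0],)
        out.extend(entry)
        dictionary[len(dictionary)] = previous + (entry[0],)
        previous = entry
    return out
```

Highly repetitive input, which is what the entropy test selects, is where the dictionary pays off: repeated phrases collapse to single codes, and decoding recovers the symbols exactly.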
And S3, constructing a characteristic value knowledge graph according to the first distributed photovoltaic data, calculating the inferable degree of each main entity of the characteristic value knowledge graph, respectively endowing each main entity with a first weight, and performing second data compression on the second distributed photovoltaic data according to the endowed first weight.
In this embodiment, the feature value knowledge graph $G = \{E_O, R, E_S, T\}$ is constructed for the denoised first distributed photovoltaic data;
where $E_O$ is the main entity set, $E_S$ is the guest entity set, $R$ is the set of relations between main entities and guest entities, and $T$ is the timestamp set; the main entity set comprises a current feature, a voltage feature and a power feature; the guest entity set comprises a light intensity feature, a humidity feature and a temperature feature. It should be noted that the main entity set contains all photovoltaic feature values that can be predicted, such as current, voltage and power; the guest entity set contains all feature values that can be used to predict the main entities, such as light intensity, humidity and temperature; and $T$ is the set of time points of all measured data. The construction of the knowledge graph of feature values of different dimensions at moment $t$ is shown in fig. 2.
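One simple way to hold such a graph in memory is as a list of (main entity, relation, guest entity, timestamp) quadruples. The sketch below assumes every main entity is linked to every guest entity at every timestamp and uses a single hypothetical relation label, `predicted_by`; the source does not specify the relation set or the connectivity.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Quadruple:
    main_entity: str    # element of E_O, e.g. "current"
    relation: str       # element of R (label here is hypothetical)
    guest_entity: str   # element of E_S, e.g. "light_intensity"
    timestamp: int      # element of T

def build_graph(main_entities, guest_entities, timestamps,
                relation="predicted_by"):
    """Fully connected main-guest graph, one copy per timestamp (assumed)."""
    return [Quadruple(o, relation, s, t)
            for t in timestamps
            for o in main_entities
            for s in guest_entities]

def guests_of(graph, main_entity, timestamp):
    """Guest entities linked to one main entity at one timestamp."""
    return sorted({q.guest_entity for q in graph
                   if q.main_entity == main_entity
                   and q.timestamp == timestamp})
```

A quadruple list like this is the usual input format for temporal knowledge graph embedding models, which is the kind of representation the entity, relation and timestamp embeddings in the following inferability measure operate on.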
Then, using the constructed knowledge graph, the corollary measurement is carried out on different main entity characteristics, in particular:
the inferable degree $Y_o$ of the main entity $e_o$ is calculated according to the following formula:
where $e_S$ is a guest entity element related to the main entity; $h_{ei}$ denotes the set of entity embeddings involved, each entity embedding combining a guest entity and a main entity; $h_{ri}$ denotes the set of main-guest relation embeddings involved; $h_{ti}$ is the set of guest timestamps involved; the product operator in the formula denotes the Cartesian product, which is used to concatenate the vectors; $U_i^o$ denotes the number of neighboring main entities; and $\|\cdot\|_2$ denotes the 2-norm.
Then the different main entities, i.e. the feature values, are weighted and ranked according to inferability and information entropy, and each is assigned a different weight (the first weight). The information entropy $H(e_o)$ of the main entity is:
the first weight $w_n$ of the main entity is:
where $T$ is the total number of time instants, $P(x_{i,n})$ is the probability of occurrence of the n-th dimensional feature at moment $i$, sigmoid is the activation function, and $N$ is the total number of feature dimensions. Depending on the application scenario, other activation functions may be used as needed.
As a preferred embodiment, the second data compression is performed on the second distributed photovoltaic data according to the given first weight, specifically:
determining a feature value data compression threshold TH based on the assigned first weight and the calculated inferable degree;
when the first weight w_n of a feature value of the second distributed photovoltaic data is smaller than the feature value data compression threshold TH, calculating the amount of data p_n to be transmitted for that feature value, using the following formula:
where w_max is the maximum of all the weights, and ⌊·⌋ denotes rounding down.
Further, the second weight corresponding to each moment of the feature value, from moment 1 to moment T, is calculated:
The data of the feature value at all moments (T moments in total) are sorted by second weight, and the feature value at each moment is deleted in turn, starting from the data with the smallest second weight, until the amount of data matches the transmission amount p_n of the feature value. In this way, the encoded data of the main entity with the largest weight are retained for all moments from 1 to T, while the data of the remaining feature values, whose weights are lower and whose inferability is strong, are reduced proportionally. Note the difference between the two weights: the first weight corresponds to a feature value, whereas the second weight corresponds to each moment of that feature value.
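The pruning step can be sketched as follows. Because the formulas for p_n and for the second weight are not reproduced in this text, the sketch makes two loud assumptions: p_n is taken to have the proportional form floor(w_n / w_max * T), which is only a guess consistent with the description of w_max, the rounding-down operator, and the proportional reduction; and the second weights are passed in as ready-made inputs. The function names are hypothetical.

```python
import math


def transmitted_count(w_n, w_max, total_moments):
    """Number of moments kept for one feature value.

    ASSUMPTION: p_n = floor(w_n / w_max * T). The patent's own formula is
    not available in this text; this proportional form is illustrative.
    """
    return math.floor(w_n / w_max * total_moments)


def compress_feature(values, second_weights, w_n, w_max, threshold):
    """Return the retained (moment_index, value) pairs in time order.

    A feature whose first weight w_n is at least the threshold TH is kept
    whole. Otherwise the moments with the smallest second weights are
    dropped until only p_n moments remain.
    """
    total = len(values)
    if w_n >= threshold:
        return list(enumerate(values))
    keep = transmitted_count(w_n, w_max, total)
    # Deleting from the smallest second weight upward is equivalent to
    # keeping the `keep` moments with the largest second weights.
    ranked = sorted(range(total), key=lambda i: second_weights[i], reverse=True)
    kept = sorted(ranked[:keep])
    return [(i, values[i]) for i in kept]
```

Keeping the moment indices alongside the values lets a receiver place the surviving samples back on the original time axis.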
In this way, the distributed photovoltaic data are doubly compressed along both the time and feature dimensions and can then be used for carrier-based data transmission and communication. The dual compression method reduces data redundancy and compresses the data as far as possible while preserving the quality of the compressed data, thereby ensuring efficient data transmission over the carrier.
Correspondingly, referring to Fig. 3, an embodiment of the invention further provides a distributed photovoltaic data dual compression device based on carrier communication, comprising a denoising module 101, a first compression module 102, and a second compression module 103, wherein:
the denoising module 101 is configured to collect distributed photovoltaic raw data, denoise the distributed photovoltaic data, and obtain first distributed photovoltaic data;
the first compression module 102 is configured to construct a time period matrix according to different time periods of the first distributed photovoltaic data and a plurality of feature dimensions of the first distributed photovoltaic data; when the difference value of the information entropy between adjacent time periods with the same characteristic dimension in the time period matrix is smaller than a preset information entropy change threshold value, performing data compression based on a wavelet LZW dictionary on first distributed photovoltaic data corresponding to a time period before the adjacent time periods to obtain second distributed photovoltaic data; the characteristic dimension comprises a voltage characteristic dimension, a current characteristic dimension, an active power characteristic dimension, a reactive power characteristic dimension, an inclination characteristic dimension, a temperature characteristic dimension and a humidity characteristic dimension of the distributed photovoltaic;
the second compression module 103 is configured to construct a eigenvalue knowledge graph according to the first distributed photovoltaic data, calculate the inferable degree of each main entity of the eigenvalue knowledge graph, assign a first weight to each main entity, and perform a second data compression on the second distributed photovoltaic data according to the assigned first weight.
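The gating rule applied by the first compression module can be sketched as below: for each feature dimension, the information entropy of every time period is computed, and the earlier period of each adjacent pair is flagged for compression when the entropy difference falls under the preset threshold. The sketch assumes the samples are already quantized to discrete values so that probabilities can be estimated by simple counting; the nested-list layout of the time period matrix and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(samples):
    """Shannon entropy in bits over the distinct values of one time period.

    ASSUMPTION: samples are discrete (already quantized).
    """
    total = len(samples)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(samples).values())


def periods_to_compress(period_matrix, threshold):
    """Flag (dimension, period) pairs whose entropy barely changes.

    period_matrix[d][k] holds the samples of feature dimension d in time
    period k. A pair (d, k) is returned when |H(k+1) - H(k)| < threshold,
    where k is the earlier of the two adjacent periods.
    """
    flagged = []
    for d, periods in enumerate(period_matrix):
        h = [entropy(p) for p in periods]
        for k in range(len(h) - 1):
            if abs(h[k + 1] - h[k]) < threshold:
                flagged.append((d, k))
    return flagged
```

Only the flagged periods would then be handed to the dictionary coder; periods with a large entropy change are left untouched by this step.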
Preferably, the feature value knowledge graph is G = {E_O, R, E_S, T};
where E_O is the main entity set, E_S is the guest entity set, R is the set of relations between main and guest entities, and T is the timestamp set; the main entity set comprises the current, voltage, and power features; the guest entity set comprises the light intensity, humidity, and temperature features.
As a preferred solution, the second compression module 103 calculates the inferable degree of each main entity of the feature value knowledge graph, specifically:
the second compression module 103 calculates the inferable degree Y_o of the main entity e_o according to the following formula:
where e_S is a guest entity element related to the main entity; h_ei denotes the set of entity embeddings involved, which covers both guest and main entities; h_ri denotes the set of embedded relations between the main and guest entities involved; h_ti is the set of guest timestamps involved; the product operator in the formula denotes the Cartesian product; the count term denotes the number of neighboring main entities; and ‖·‖_2 denotes the 2-norm.
Compared with the prior art, the embodiment of the invention has the following beneficial effects:
the embodiment of the invention provides a distributed photovoltaic data double compression method and device for carrier communication, wherein the distributed photovoltaic data double compression method comprises the following steps: collecting distributed photovoltaic original data, and denoising the distributed photovoltaic data to obtain first distributed photovoltaic data; constructing a time period matrix according to different time periods of the first distributed photovoltaic data and a plurality of characteristic dimensions of the first distributed photovoltaic data; when the difference value of the information entropy between adjacent time periods with the same characteristic dimension in the time period matrix is smaller than a preset information entropy change threshold value, performing data compression based on a wavelet LZW dictionary on first distributed photovoltaic data corresponding to a time period before the adjacent time periods to obtain second distributed photovoltaic data; the characteristic dimension comprises a voltage characteristic dimension, a current characteristic dimension, an active power characteristic dimension, a reactive power characteristic dimension, an inclination characteristic dimension, a temperature characteristic dimension and a humidity characteristic dimension of the distributed photovoltaic; and constructing a characteristic value knowledge graph according to the first distributed photovoltaic data, calculating the inferable degree of each main entity of the characteristic value knowledge graph, respectively endowing each main entity with a first weight, and carrying out second data compression on the second distributed photovoltaic data according to the endowed first weight. 
Compared with the existing photovoltaic data compression scheme based on the time domain, by constructing a time period matrix and carrying out lossless data compression based on wavelet information entropy, the change of the time domain information entropy of the acquired data can be accurately captured, the recognition capability of redundant data with small information entropy change of the front and rear adjacent time periods is improved, the residual quantity of the redundant information is reduced after the data compression is completed, and the quality of the compressed data is effectively ensured.
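For readers unfamiliar with dictionary coding, the lossless property mentioned above can be illustrated with the textbook LZW algorithm. This is plain LZW over a character string, not the wavelet LZW dictionary variant of the embodiment, whose details are not given in this passage; the sketch also assumes the input contains only characters with code points below 256.

```python
def lzw_encode(text):
    """Textbook LZW: emit one integer code per longest known prefix.

    ASSUMPTION: every character of `text` has a code point below 256.
    """
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    current = ""
    codes = []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate  # keep extending the match
        else:
            codes.append(dictionary[current])
            dictionary[candidate] = next_code  # learn the new phrase
            next_code += 1
            current = ch
    if current:
        codes.append(dictionary[current])
    return codes


def lzw_decode(codes):
    """Rebuild the original string by replaying the dictionary growth."""
    if not codes:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            # Code refers to the phrase being defined in this very step.
            entry = previous + previous[0]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        dictionary[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(out)
```

Repetitive readings, such as a voltage that stays at the same value across a period, produce long dictionary phrases and therefore few output codes, which is why low-entropy-change periods are good candidates for this kind of coding.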
Further, the inferable property of different main entities is measured by the knowledge graph of the feature domain, the weight of the feature value can be calculated by combining the information entropy of different main entities, the feature value data compression threshold is set based on the weight and the inferable property, the secondary compression of the photovoltaic data is carried out, specifically, the feature value with low partial weight and strong inferable property is compressed, the data can be further simplified, and the data quantity required to be transmitted is reduced.
Further, the correlation between the time domain and the characteristic domain of the photovoltaic monitoring data is deeply mined by utilizing the continuity of time and space and the reasoning among the characteristics, and the redundancy of the data is reduced from the two dimensions of time and the characteristics so as to compress the data as much as possible, thereby realizing the efficient data transmission based on carrier waves and improving the compression efficiency.
The foregoing embodiments have been provided for the purpose of illustrating the general principles of the present invention, and are not to be construed as limiting the scope of the invention. It should be noted that any modifications, equivalent substitutions, improvements, etc. made by those skilled in the art without departing from the spirit and principles of the present invention are intended to be included in the scope of the present invention.

Claims (10)

1. A distributed photovoltaic data double compression method for carrier communication, comprising:
collecting distributed photovoltaic original data, and denoising the distributed photovoltaic data to obtain first distributed photovoltaic data;
constructing a time period matrix according to different time periods of the first distributed photovoltaic data and a plurality of characteristic dimensions of the first distributed photovoltaic data; when the difference value of the information entropy between adjacent time periods with the same characteristic dimension in the time period matrix is smaller than a preset information entropy change threshold value, performing data compression based on a wavelet LZW dictionary on first distributed photovoltaic data corresponding to a time period before the adjacent time periods to obtain second distributed photovoltaic data; the characteristic dimension comprises a voltage characteristic dimension, a current characteristic dimension, an active power characteristic dimension, a reactive power characteristic dimension, an inclination characteristic dimension, a temperature characteristic dimension and a humidity characteristic dimension of the distributed photovoltaic;
and constructing a characteristic value knowledge graph according to the first distributed photovoltaic data, calculating the inferable degree of each main entity of the characteristic value knowledge graph, respectively endowing each main entity with a first weight, and carrying out second data compression on the second distributed photovoltaic data according to the endowed first weight.
2. The method for double compression of distributed photovoltaic data based on carrier communication according to claim 1, wherein the feature value knowledge graph is G = {E_O, R, E_S, T};
wherein E_O is the main entity set, E_S is the guest entity set, R is the set of relations between main and guest entities, and T is the timestamp set; the main entity set comprises the current, voltage, and power features; the guest entity set comprises the light intensity, humidity, and temperature features.
3. The method for double compression of distributed photovoltaic data based on carrier communication according to claim 2, wherein the calculating the inferable degree of each main entity of the eigenvalue knowledge graph is specifically as follows:
calculating the inferable degree Y_o of the main entity e_o according to the following formula:
wherein e_S is a guest entity element related to the main entity; h_ei denotes the set of entity embeddings involved, which covers both guest and main entities; h_ri denotes the set of embedded relations between the main and guest entities involved; h_ti is the set of guest timestamps involved; the product operator in the formula denotes the Cartesian product; the count term denotes the number of neighboring main entities; and ‖·‖_2 denotes the 2-norm.
4. A method for dual compression of distributed photovoltaic data based on carrier communication according to claim 3, wherein the second data compression is performed on the second distributed photovoltaic data according to the given first weight, specifically:
determining a eigenvalue data compression threshold based on the assigned first weight and the calculated inferable degree;
when the first weight of the characteristic value of the second distributed photovoltaic data is smaller than the characteristic value data compression threshold value, calculating the data quantity transmitted by the characteristic value;
calculating a second weight corresponding to each moment of the characteristic value, sorting the data of all the moments of the characteristic value according to the second weight, and deleting the characteristic value of each moment in sequence from the data with the minimum second weight until the quantity of the data transmitted by the characteristic value is met.
5. A method for dual compression of distributed photovoltaic data based on carrier communication according to claim 3, wherein each of the main entities is given a first weight, specifically:
weighting and ranking all the main entities according to the inferable degree and the information entropy, and assigning each a first weight; wherein the information entropy H(e_o) of the main entity is:
the first weight w_n of the main entity is:
wherein T is the total number of moments, P(x_{i,n}) is the probability that the feature value in the n-th dimension occurs at moment i, sigmoid is the activation function, and N is the total number of feature dimensions.
6. The method for dual compression of distributed photovoltaic data based on carrier communication according to any one of claims 1 to 5, wherein the denoising process is performed on the distributed photovoltaic data to obtain first distributed photovoltaic data, specifically:
performing wavelet transformation on the distributed original photovoltaic data, and performing wavelet decomposition on the distributed photovoltaic data subjected to wavelet transformation through a wavelet function;
and denoising the detail coefficients of each layer through a wavelet threshold value, and reconstructing the approximation coefficient of the last layer and the detail coefficients of each layer according to the recorded wavelet to obtain the first distributed photovoltaic data.
7. The carrier communication-based distributed photovoltaic data dual compression method according to claim 6, wherein the wavelet decomposition is performed on the distributed photovoltaic data subjected to wavelet transformation by a wavelet function, specifically:
and carrying out wavelet decomposition on the distributed photovoltaic data subjected to wavelet transformation through a wavelet function, and carrying out whitening inspection on the detail coefficient after decomposition until the energy contained in the detail coefficient is greater than or equal to a set second threshold value.
8. A distributed photovoltaic data double compression device based on carrier communication, characterized by comprising a denoising module, a first compression module, and a second compression module, wherein:
the denoising module is used for acquiring the distributed photovoltaic original data, denoising the distributed photovoltaic data and obtaining first distributed photovoltaic data;
the first compression module is used for constructing a time period matrix according to different time periods of the first distributed photovoltaic data and a plurality of characteristic dimensions of the first distributed photovoltaic data; when the difference value of the information entropy between adjacent time periods with the same characteristic dimension in the time period matrix is smaller than a preset information entropy change threshold value, performing data compression based on a wavelet LZW dictionary on first distributed photovoltaic data corresponding to a time period before the adjacent time periods to obtain second distributed photovoltaic data; the characteristic dimension comprises a voltage characteristic dimension, a current characteristic dimension, an active power characteristic dimension, a reactive power characteristic dimension, an inclination characteristic dimension, a temperature characteristic dimension and a humidity characteristic dimension of the distributed photovoltaic;
the second compression module is used for constructing a characteristic value knowledge graph according to the first distributed photovoltaic data, calculating the inferable degree of each main entity of the characteristic value knowledge graph, respectively endowing each main entity with a first weight, and carrying out second data compression on the second distributed photovoltaic data according to the endowed first weight.
9. The carrier communication-based distributed photovoltaic data double compression device as claimed in claim 8, wherein the feature value knowledge graph is G = {E_O, R, E_S, T};
wherein E_O is the main entity set, E_S is the guest entity set, R is the set of relations between main and guest entities, and T is the timestamp set; the main entity set comprises the current, voltage, and power features; the guest entity set comprises the light intensity, humidity, and temperature features.
10. The distributed photovoltaic data dual compression device based on carrier communication according to claim 9, wherein the second compression module calculates the inferable degree of each main entity of the eigenvalue knowledge graph, specifically:
the second compression module calculates the inferable degree Y_o of the main entity e_o according to the following formula:
wherein e_S is a guest entity element related to the main entity; h_ei denotes the set of entity embeddings involved, which covers both guest and main entities; h_ri denotes the set of embedded relations between the main and guest entities involved; h_ti is the set of guest timestamps involved; the product operator in the formula denotes the Cartesian product; the count term denotes the number of neighboring main entities; and ‖·‖_2 denotes the 2-norm.
CN202310482625.5A 2023-04-28 2023-04-28 Distributed photovoltaic data double compression method and device based on carrier communication Pending CN116483789A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310482625.5A CN116483789A (en) 2023-04-28 2023-04-28 Distributed photovoltaic data double compression method and device based on carrier communication

Publications (1)

Publication Number Publication Date
CN116483789A true CN116483789A (en) 2023-07-25

Family

ID=87221265

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310482625.5A Pending CN116483789A (en) 2023-04-28 2023-04-28 Distributed photovoltaic data double compression method and device based on carrier communication

Country Status (1)

Country Link
CN (1) CN116483789A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117421699A (en) * 2023-12-15 2024-01-19 佳源科技股份有限公司 Electric energy meter fault fusion prediction method and system
CN117421699B (en) * 2023-12-15 2024-03-01 佳源科技股份有限公司 Electric energy meter fault fusion prediction method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination