CN115718906A - Multi-energy system multi-source heterogeneous data fusion method and system - Google Patents

Multi-energy system multi-source heterogeneous data fusion method and system

Info

Publication number
CN115718906A
Authority
CN
China
Prior art keywords
data
sensor
density
energy system
fusion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211474597.4A
Other languages
Chinese (zh)
Inventor
赵猛
张文斌
李野
佘家驹
田润泽
王毅
王凯
游跃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Green Energy Co ltd
State Grid Corp of China SGCC
State Grid Tianjin Electric Power Co Ltd
Information and Telecommunication Branch of State Grid Tianjin Electric Power Co Ltd
Original Assignee
State Grid Green Energy Co ltd
State Grid Corp of China SGCC
State Grid Tianjin Electric Power Co Ltd
Information and Telecommunication Branch of State Grid Tianjin Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Green Energy Co ltd, State Grid Corp of China SGCC, State Grid Tianjin Electric Power Co Ltd, Information and Telecommunication Branch of State Grid Tianjin Electric Power Co Ltd filed Critical State Grid Green Energy Co ltd
Priority to CN202211474597.4A priority Critical patent/CN115718906A/en
Publication of CN115718906A publication Critical patent/CN115718906A/en
Pending legal-status Critical Current

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a multi-energy system multi-source heterogeneous data fusion method and system. The method comprises the following steps: step 1, acquiring load data and environment monitoring data of a multi-energy system, and preliminarily cleaning abnormal data; step 2, because the sampling periods of the sensors differ, performing data time registration on the multi-energy system load data and environment monitoring data cleaned in step 1; step 3, because the measurement data acquired by each sensor may contain missing values, performing data filling on the data registered in step 2 to obtain a more complete sensor data set; step 4, performing data noise reduction on the complete data set filled in step 3 using the DBSCAN algorithm; and step 5, fusing the denoised multi-source heterogeneous data obtained in step 4 using a Kalman filtering algorithm. The method can effectively improve the fusion efficiency of multi-source heterogeneous data and reduce the data fusion error.

Description

Multi-energy system multi-source heterogeneous data fusion method and system
Technical Field
The invention belongs to the technical field of data processing of a multi-energy system, and relates to a multi-energy system multi-source heterogeneous data fusion method and system.
Background
With the development of condition monitoring technology, the growing structural complexity of the power grid, and the spread of artificial intelligence applications in the grid, the data acquired from intelligent power equipment are characterized by large scale, rapid updating, multi-source heterogeneity, and low value density. Under the new trend of explosive growth in power equipment big data, traditional data processing technology cannot safely and accurately extract knowledge and analyze information from massive data, so multi-source heterogeneous big data cleaning and fusion technology plays an important role in the stable, safe, and reliable operation of the smart grid.
Scholars at home and abroad have carried out related research on the fusion of multi-source heterogeneous data. Nie Qingke proposed a multi-source heterogeneous monitoring data fusion method that preprocesses and denoises different types of raw monitoring data using wavelet decomposition, and fuses monitoring data of the same type from different monitoring points into a single monitoring sequence by means of the entropy weight method. Wang Gang proposed a new method for analyzing and judging the state of a utility power cabin, which uses middleware technology to fuse, at the data layer, multiple distributed data sources from the utility power cabin. Ngiam extracted features from audio data and video data with a deep learning model, then combined the audio and video features into an integrated feature vector of the target object, realizing nonlinear fusion of heterogeneous feature sources. Mo Huiling proposed a multi-source heterogeneous data fusion algorithm based on Tucker decomposition in federated learning, introducing tensor Tucker decomposition theory to construct a high-order tensor with heterogeneous spatial dimension characteristics and thereby fuse multi-source heterogeneous data in federated learning.
It can be seen that current research lacks an effective multi-source heterogeneous data fusion technology: existing fusion methods for multi-energy systems suffer from large fusion errors and long processing times. Meanwhile, the large volume of monitoring data from various sensors lacks effective processing. Data preprocessing technologies such as data registration, filling, and noise reduction therefore need to be studied to lay a good foundation for data fusion.
Disclosure of Invention
The invention aims to overcome the defects in the prior art, and provides a multi-energy system multi-source heterogeneous data fusion method and system, which can effectively improve the fusion efficiency of multi-source heterogeneous data and reduce the error of data fusion.
The invention solves the practical problem by adopting the following technical scheme:
a multi-energy system multi-source heterogeneous data fusion method comprises the following steps:
step 1, collecting load data and environment monitoring data of a multi-energy system, and cleaning abnormal data;
step 2, carrying out data time registration on the cleaned multi-energy system load data and the environment monitoring data in the step 1;
step 3, data filling is carried out on the data registered in the step 2, and a more complete sensor data set is obtained;
step 4, carrying out data noise reduction on the complete data set filled in the step 3;
step 5, fusing the denoised multi-source heterogeneous data obtained in the step 4.
Further, the specific steps of step 1 include:
(1) Acquiring load data and environmental monitoring data of a multi-energy system;
the load data comprises attribute parameters of equipment and real-time operation state data of a power grid; the environment monitoring data comprises temperature, humidity and vibration data;
(2) Using a local outlier factor (LOF) algorithm, assigning each acquired data point an outlier score that depends on the density of its neighborhood, and calculating the local reachability density and the local outlier factor of each data point;
the local reachability density ρ_i(x) and the local outlier factor LOF_i(x) of the data are calculated as follows:
Figure BDA0003959279870000031
Figure BDA0003959279870000032
in the formulas, N_i(x) is the i-th distance neighborhood of data point x; ρ_i(x) is the local reachability density; s_i(x, f_j) is the distance between the monitoring data x and the synchronous monitoring data f_j of day j;
(3) Judging from the local outlier factor value whether each data point is abnormal, thereby obtaining the data after preliminary cleaning of abnormal data;
the judgment criteria are as follows:
1) LOF_i(x) close to 1 indicates that x probably belongs to the same cluster as its neighborhood points;
2) LOF_i(x) < 1 indicates that the density of x is higher than that of its neighborhood points, so x is a dense point;
3) LOF_i(x) > 1 indicates that the density of x is lower than that of its neighborhood points, so x is an outlier.
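For reference, the standard local outlier factor definitions can be written in the symbols used above. This is a sketch: it assumes that s_i(x, f_j) plays the role of the reachability distance and that the sums run over the neighborhood N_i(x), neither of which the text states explicitly.

```latex
\rho_i(x) = \frac{|N_i(x)|}{\sum_{f_j \in N_i(x)} s_i(x, f_j)},
\qquad
\mathrm{LOF}_i(x) = \frac{1}{|N_i(x)|} \sum_{f_j \in N_i(x)} \frac{\rho_i(f_j)}{\rho_i(x)}
```

Under these definitions, a point whose neighbors are on average denser than the point itself has LOF_i(x) > 1, which matches criterion 3).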
Further, the specific steps of step 2 include:
(1) Performing data time registration on the multi-energy system load data and environment monitoring data cleaned in step 1 using the least squares method: let the sampling periods of sensors A1 and A2 be T_1 and T_2 respectively, and denote the latest monitoring time of sensor A1 as (k-1)T_1; the current time is then kT_1 = (k-1)T_1 + nT_2, that is, within one period of sensor A1, sensor A2 performs n data monitorings;
(2) Let the monitoring value of sensor A1 be y_n and the monitoring sequence of sensor A2 be Y_n = (y_1, y_2, …, y_n)^T; the data set formed by the fusion value of the n monitoring values and the reciprocal thereof is
Figure BDA0003959279870000033
The processed data is
y_i = Y_n + (i-n)·T·Y_n + o_i
In the formula: o_i is the noise value arising during data measurement; y_n is the monitoring value of sensor A1; T is the sampling period; Y_n is the monitoring sequence of sensor A2;
(3) After time registration against the sensor A1 data, the sensor A2 data and measurement vector are expressed as
Figure BDA0003959279870000041
Wherein T' is the fusion time.
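As a point of reference, the textbook least-squares registration of n equally spaced samples has a closed form. The sketch below assumes a linear model in which the i-th sample equals a fused value at the time of the n-th sample plus (i-n)·T times a rate of change plus noise. The symbols \hat{y}, c_1 and c_2 are introduced here for illustration and are not taken from the source.

```latex
\hat{y} = c_1 \sum_{i=1}^{n} y_i + c_2 \sum_{i=1}^{n} i\, y_i,
\qquad c_1 = -\frac{2}{n}, \quad c_2 = \frac{6}{n(n+1)}
```

For n = 2 this gives \hat{y} = -(y_1 + y_2) + (y_1 + 2y_2) = y_2, that is, a line through two samples evaluated at the latest one.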
Further, in step 3 the data registered in step 2 are filled using a KNN algorithm to obtain the filled time-series data set; the specific steps include:
(1) Jointly computing over the historical data collected by the sensors neighboring the missing-measurement sensor m_i, to obtain the spatial correlation between measurement sensors:
selecting the k sensors most highly correlated with the missing-measurement sensor m_i, the spatial correlation coefficient between sensor m_i and a neighbor sensor m_j is defined as:
Figure BDA0003959279870000042
in the formula: r is the spatial correlation between the two measurement sensors; y_{i,t-1} is the measurement data of sensor m_i at time t-1; y_{j,t-1} is the measurement data of sensor m_j at time t-1;
(2) Based on the spatial correlation coefficient obtained by calculation, calculating the filling weight of each sensor relative to the measurement missing sensor, specifically:
Figure BDA0003959279870000043
in the formula: w_i is the weight coefficient; r is the spatial correlation between the two sensors;
(3) Based on the spatial correlation of the KNN algorithm, the measurement data of the k neighbor sensors and the corresponding weight coefficients are obtained, and the filled value of the missing sensor data y_{i,t} can be written as
Figure BDA0003959279870000051
In the formula:
Figure BDA0003959279870000052
is the filling result for the missing data of sensor m_i at time t;
by the above formula, the filled value of the missing data of sensor m_i at time t can be obtained.
Further, the specific method of step 4 is as follows:
performing data noise reduction on the complete data set filled in step 3 using the DBSCAN algorithm;
the specific steps are as follows:
(1) Inputting the filled complete data set B = {y_1, y_2, …, y_n}, where ε is the radius parameter and MinPts is the minimum-object parameter, and marking all objects in data set B as unread;
(2) Taking from data set B a data set B_i containing an arbitrary number of data objects p, where B_i ∈ B, i = 1, 2, 3, …, and marking B_i as read;
(3) Judging p against the ε and MinPts parameters: if p is a core object, finding all data objects density-reachable from p and marking them as read; if p is not a core object and p is not density-reachable from any object, marking p as noise data;
(4) While the condition
Figure BDA0003959279870000053
is satisfied, repeating (2) and (3) until all data are marked as read;
(5) Taking one core object as a seed and grouping all points density-reachable from it into one class, forming a larger set of data objects, also called a cluster;
(6) Repeating (5) until all core objects have been traversed, completing the noise reduction of the filled complete data set.
Further, the specific steps of step 5 include:
(1) Using a Kalman filtering algorithm, exchanging and fusing the denoised data with adjacent data sequences, where the information matrix used is:
Figure BDA0003959279870000061
in the formula:
Figure BDA0003959279870000062
is the posterior estimate covariance matrix at time k;
Figure BDA0003959279870000063
is the state estimate at time k;
(2) The update of time series i is realized by the following algorithm:
Figure BDA0003959279870000064
In the formula: Q and R are the covariance matrices of the system noise and the observation noise, respectively. In actual fusion, each node i can receive data from the nodes of its subset and can use its local posterior covariance
Figure BDA0003959279870000065
together with the covariance matrix sent by adjacent nodes
Figure BDA0003959279870000066
to carry out data fusion;
(3) Letting the total denoised data set be N, data fusion is performed using each datum and its adjacent data, calculated as follows:
Figure BDA0003959279870000067
In the formula: W_{i,j} is the combination weight of the data;
(4) Repeating the above process until the multi-source heterogeneous data fusion of the multi-energy system is completed.
A multi-energy system multi-source heterogeneous data fusion system, characterized in that it comprises:
the acquisition module is used for acquiring load data and environment monitoring data of the multi-energy system and cleaning abnormal data;
the registration module is used for carrying out data time registration on the cleaned multi-energy system load data and the cleaned environmental monitoring data;
the filling module is used for filling data of the registered data to obtain a more complete sensor data set;
the noise reduction module is used for carrying out data noise reduction on the filled complete data set;
and the fusion module is used for fusing the multi-source heterogeneous data subjected to noise reduction.
Moreover, the acquisition module further comprises:
the data acquisition module is used for acquiring load data and environment monitoring data of the multi-energy system;
the load data comprises attribute parameters of equipment and real-time operation state data of a power grid; the environment monitoring data comprises temperature, humidity and vibration data;
the abnormal data cleaning module uses a local outlier factor (LOF) algorithm to assign each acquired data point an outlier score that depends on the density of its neighborhood, and calculates the local reachability density and the local outlier factor of each data point;
the local reachability density ρ_i(x) and the local outlier factor LOF_i(x) of the data are calculated as follows:
Figure BDA0003959279870000071
Figure BDA0003959279870000072
in the formulas, N_i(x) is the i-th distance neighborhood of data point x; ρ_i(x) is the local reachability density; s_i(x, f_j) is the distance between the monitoring data x and the synchronous monitoring data f_j of day j;
the available data acquisition module judges from the local outlier factor value whether each data point is abnormal, thereby obtaining the available data after preliminary cleaning of abnormal data;
the judgment criteria are as follows:
1) LOF_i(x) close to 1 indicates that x probably belongs to the same cluster as its neighborhood points;
2) LOF_i(x) < 1 indicates that the density of x is higher than that of its neighborhood points, so x is a dense point;
3) LOF_i(x) > 1 indicates that the density of x is lower than that of its neighborhood points, so x is an outlier.
A computer-readable storage medium on which a computer program is stored, the computer program implementing the steps of the above method when executed by a processor.
An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the above method when executing the program.
The invention has the advantages and beneficial effects that:
1. The invention provides a multi-energy system multi-source heterogeneous data fusion method that can effectively improve the fusion efficiency of multi-source heterogeneous data and reduce the data fusion error. Specifically, the abnormal data cleaning in step 1 reduces the influence of abnormal monitoring data on the fusion result; the data registration in step 2 aligns the data in time; the data filling in step 3 makes the data density consistent across sensors; the data noise reduction in step 4 improves the accuracy of the filled data; and step 5, building on this processing, achieves high-precision fusion of the multi-source heterogeneous data.
2. According to the invention, the least square method, the KNN algorithm and the DBSCAN algorithm are respectively adopted to carry out registration, filling and noise reduction on the acquired data, so that a key technical support is provided for the fusion of multi-source heterogeneous data, and the information accuracy and fault tolerance of the multi-energy system are improved.
Drawings
Fig. 1 is a processing flow chart of a multi-source heterogeneous data fusion method of a multi-energy system according to the present invention.
Detailed Description
The embodiments of the invention will be described in further detail below with reference to the accompanying drawings:
a multi-source heterogeneous data fusion method of a multi-energy system is shown in figure 1 and comprises the following steps:
step 1, acquiring load data and environment monitoring data of a multi-energy system, and preliminarily cleaning abnormal data;
the specific steps of the step 1 comprise:
(1) Acquiring multi-energy system load data including attribute parameters of equipment and real-time running state data of a power grid; collecting environmental monitoring data including temperature, humidity and vibration data;
(2) Using a local outlier factor (LOF) algorithm, assigning each acquired data point an outlier score that depends on the density of its neighborhood, and calculating the local reachability density and the local outlier factor of each data point;
the local reachability density ρ_i(x) and the local outlier factor LOF_i(x) of the data are calculated as follows:
Figure BDA0003959279870000091
Figure BDA0003959279870000092
in the formulas, N_i(x) is the i-th distance neighborhood of data point x; ρ_i(x) is the local reachability density; s_i(x, f_j) is the distance between the monitoring data x and the synchronous monitoring data f_j of day j;
(3) Judging from the local outlier factor value whether each data point is abnormal, thereby obtaining the available data after preliminary cleaning of abnormal data;
the judgment criteria are as follows:
1) LOF_i(x) close to 1 indicates that x probably belongs to the same cluster as its neighborhood points;
2) LOF_i(x) < 1 indicates that the density of x is higher than that of its neighborhood points, so x is a dense point;
3) LOF_i(x) > 1 indicates that the density of x is lower than that of its neighborhood points, so x is an outlier;
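The cleaning procedure of step 1 can be sketched in a few lines of NumPy. This is a minimal illustration, not the patented implementation: it uses the standard k-nearest-neighbor form of LOF with Euclidean distance, ignores distance ties, and the names `local_outlier_factor`, `clean_outliers`, `k` and `threshold` are hypothetical. In particular, the cut-off above which a point counts as abnormal is an assumption, since the text only says that values greater than 1 indicate outliers.

```python
import numpy as np

def local_outlier_factor(X, k=3):
    """Return one LOF score per row of X (standard k-nearest-neighbour definition)."""
    X = np.asarray(X, dtype=float)
    n = len(X)
    # Pairwise Euclidean distances; a point is never its own neighbour.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    rows = np.arange(n)[:, None]
    neighbours = np.argsort(dist, axis=1)[:, :k]      # indices of the k nearest points
    neighbour_dist = dist[rows, neighbours]           # distances to those points
    k_dist = neighbour_dist[:, -1]                    # distance to the k-th neighbour
    # Reachability distance: reach(p, o) = max(k_dist(o), d(p, o)).
    reach = np.maximum(k_dist[neighbours], neighbour_dist)
    lrd = 1.0 / (reach.mean(axis=1) + 1e-12)          # local reachability density
    # LOF = mean density of the neighbours divided by the point's own density.
    return lrd[neighbours].mean(axis=1) / lrd

def clean_outliers(X, k=3, threshold=2.0):
    """Drop rows whose LOF exceeds the (assumed) threshold; return kept rows and all scores."""
    X = np.asarray(X, dtype=float)
    scores = local_outlier_factor(X, k)
    return X[scores <= threshold], scores
```

Calling `clean_outliers(readings, k=2)` on a column of readings returns the retained rows together with the score of every input row, so the threshold can be tuned against the three criteria listed above.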
step 2, considering that sampling periods of all sensors are different, carrying out data time registration on the cleaned multi-energy system load data and the environment monitoring data in the step 1;
the specific steps of the step 2 comprise:
(1) Performing data time registration on the multi-energy system load data and environment monitoring data cleaned in step 1 using the least squares method: let the sampling periods of sensors A1 and A2 be T_1 and T_2 respectively, and denote the latest monitoring time of sensor A1 as (k-1)T_1; the current time is then kT_1 = (k-1)T_1 + nT_2, that is, within one period of sensor A1, sensor A2 performs n data monitorings;
(2) Let the monitoring value of sensor A1 be y_n and the monitoring sequence of sensor A2 be Y_n = (y_1, y_2, …, y_n)^T; the data set formed by the fusion value of the n monitoring values and the reciprocal thereof is
Figure BDA0003959279870000101
The processed data is
y_i = Y_n + (i-n)·T·Y_n + o_i
In the formula: o_i is the noise value arising during data measurement; y_n is the monitoring value of sensor A1; T is the sampling period; Y_n is the monitoring sequence of sensor A2.
(3) After time registration against the sensor A1 data, the sensor A2 data and measurement vector are expressed as
Figure BDA0003959279870000102
Wherein T' is the fusion time.
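The registration of step 2 can be illustrated with an ordinary least-squares fit. The sketch below assumes the common formulation in which the n samples that sensor A2 takes during one period of sensor A1 follow a straight line, and the fitted value at the time of the last sample is used as the registered measurement. The function name `register_least_squares` and the returned `rate` are hypothetical and not taken from the source.

```python
import numpy as np

def register_least_squares(samples, period):
    """Collapse n equally spaced fast-sensor samples into one value at the last sample time.

    Fits y_i = value + (i - n) * period * rate + noise for i = 1..n and
    returns (value, rate), where value is the registered measurement.
    """
    y = np.asarray(samples, dtype=float)
    n = len(y)
    i = np.arange(1, n + 1)
    # Column 1 multiplies the fused value, column 2 multiplies its rate of change.
    design = np.column_stack([np.ones(n), (i - n) * period])
    (value, rate), *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(value), float(rate)
```

For example, `register_least_squares(a2_samples, T2)` would be called once per A1 period, with `a2_samples` holding the n readings of sensor A2 that fall inside that period.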
Step 3, because the measurement data acquired by each sensor may be missing, performing data filling on the data registered in step 2 to obtain a more complete sensor data set, improving the information accuracy and fault tolerance of the multi-energy system;
in step 3, the data registered in step 2 are filled using a KNN algorithm to obtain the filled time-series data set; the specific steps include:
(1) To fill the missing measurement data y_{i,t} in the multi-energy system load data and environment monitoring data registered in step 2, jointly computing over the historical data collected by the sensors neighboring the missing-measurement sensor m_i to obtain the spatial correlation between measurement sensors:
selecting the k sensors most highly correlated with the missing-measurement sensor m_i, the spatial correlation coefficient between sensor m_i and a neighbor sensor m_j is defined as:
Figure BDA0003959279870000103
in the formula: r is the spatial correlation between the two measurement sensors; y_{i,t-1} is the measurement data of sensor m_i at time t-1; y_{j,t-1} is the measurement data of sensor m_j at time t-1.
(2) Based on the spatial correlation coefficient obtained by calculation, calculating the filling weight of each sensor relative to the measurement missing sensor, specifically:
Figure BDA0003959279870000111
in the formula: w_i is the weight coefficient; r is the spatial correlation between the two sensors;
(3) Based on the spatial correlation of the KNN algorithm, the measurement data of the k neighbor sensors and the corresponding weight coefficients are obtained, and the filled value of the missing sensor data y_{i,t} can be written as
Figure BDA0003959279870000112
In the formula:
Figure BDA0003959279870000113
is the filling result for the missing data of sensor m_i at time t.
By the above formula, the filled value of the missing data of sensor m_i at time t can be obtained.
In this embodiment, step 3 uses the KNN algorithm for data filling: to fill missing measurement data, the historical data collected by the sensors adjacent to the missing-measurement sensor are jointly computed to obtain the spatial correlation between measurement sensors; based on the spatial correlation coefficients, the filling weight of each sensor relative to the missing-measurement sensor is calculated, and the missing sensor data are filled.
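The filling procedure described above can be sketched as follows. Because the correlation and weight formulas are only available as images, this example makes three labeled assumptions: the spatial correlation is the Pearson correlation of the sensors' historical series, the weights are the correlations of the k selected neighbors normalized to sum to 1, and those correlations are positive. The names `knn_fill`, `history`, `current` and `target` are hypothetical.

```python
import numpy as np

def knn_fill(history, current, target, k=2):
    """Estimate the missing reading of sensor `target` from its k most correlated neighbours.

    history: (T, S) array of past readings, one column per sensor.
    current: length-S readings at time t; missing entries are NaN.
    Returns (filled_value, neighbour_indices, weights).
    """
    history = np.asarray(history, dtype=float)
    current = np.asarray(current, dtype=float)
    # Assumed spatial correlation: Pearson correlation between historical series.
    corr = np.corrcoef(history, rowvar=False)[target].copy()
    corr[target] = -np.inf                 # a sensor is not its own neighbour
    corr[np.isnan(current)] = -np.inf      # skip neighbours that are missing too
    neighbours = np.argsort(corr)[::-1][:k]
    r = corr[neighbours]
    weights = r / r.sum()                  # assumed weighting: normalised correlation
    filled = float(weights @ current[neighbours])
    return filled, neighbours, weights
```

The weighted sum in the last step mirrors item (3): the k neighbor readings at time t are combined with their weight coefficients to produce the filled value for sensor m_i.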
Step 4, based on the complete data set filled in the step 3, adopting a DBSCAN algorithm to perform data noise reduction;
The core of the DBSCAN algorithm lies in the parameters ε (Eps) and MinPts: these two parameters determine the neighborhood and the core objects, density-reachable points are then found from the core objects, the data objects are thereby clustered, and data noise reduction is completed.
The specific steps of the step 4 comprise:
(1) Input the filled complete data set B = {y_1, y_2, …, y_n}, where ε is the radius parameter and MinPts is the minimum-object parameter, and mark all objects in data set B as unread.
(2) Take from data set B a data set B_i containing an arbitrary number of data objects p, where B_i ∈ B, i = 1, 2, 3, …, and mark B_i as read.
(3) Judge p against the ε and MinPts parameters: if p is a core object, find all data objects density-reachable from p and mark them as read. If p is not a core object and p is not density-reachable from any object, mark p as noise data.
(4) While the condition
Figure BDA0003959279870000121
is satisfied, repeat (2) and (3) until all data are marked as read.
(5) Take one core object as a seed and group all points density-reachable from it into one class, forming a larger set of data objects, also called a cluster.
(6) Repeat (5) until all core objects have been traversed, completing the noise reduction of the filled complete data set.
The working principle of the step 4 is as follows:
ε characterizes the threshold on the distance between individuals; MinPts characterizes the threshold on the number of individuals within this distance.
For the sample set B = (y_1, y_2, …, y_n), the DBSCAN algorithm uses the following definitions:
1) ε-neighborhood: for an individual y_i in B, its ε-neighborhood is the subset of samples in B whose distance from y_i does not exceed ε, namely:
N_ε(y_i) = {y_j ∈ B | dist(y_i, y_j) ≤ ε} (1)
|N_ε(y_i)| denotes the number of individuals in this ε-neighborhood;
2) Core point: for an individual y_i in B, if the number of individuals in its ε-neighborhood is not less than MinPts, that is,
|N_ε(y_i)| ≥ MinPts (2)
then y_i is called a core point;
3) Directly density-reachable: for a core point y_i in B, all points in its ε-neighborhood are directly density-reachable from that core point;
4) Density-reachable: for y_i and y_j, if there is a sample sequence p_1, p_2, …, p_m satisfying p_1 = y_i and p_m = y_j, in which each p_{t+1} is directly density-reachable from p_t, then y_j is said to be density-reachable from y_i; that is, density-reachability is transitive;
5) Density-connected: for y_i and y_j, if there is a core point y_k such that y_i and y_j are both density-reachable from y_k, then y_i and y_j are density-connected.
In this embodiment, step 4 uses the DBSCAN algorithm for data noise reduction. Samples are clustered according to data density; using density or spatial distance as the criterion, outliers with a small data volume can be identified in multidimensional space and removed to achieve noise reduction. All points found to be density-connected form the main body of the sample, and the remaining data are outliers, that is, noise points.
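The definitions above translate directly into a short routine. The following is a minimal sketch rather than the patented implementation: it uses Euclidean distance, counts a point as a member of its own ε-neighborhood, and the names `dbscan_denoise`, `eps` and `min_pts` are hypothetical stand-ins for ε and MinPts.

```python
import numpy as np

def dbscan_denoise(points, eps, min_pts):
    """Cluster points with DBSCAN and drop the ones labelled as noise.

    Returns (kept_points, labels), where labels[i] is the cluster index of
    point i, or -1 if the point is noise.
    """
    X = np.asarray(points, dtype=float)
    if X.ndim == 1:
        X = X[:, None]                     # treat a flat list as 1-D samples
    n = len(X)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # eps-neighbourhood of every point (the point itself is included).
    neighbourhoods = [np.flatnonzero(dist[i] <= eps) for i in range(n)]
    is_core = np.array([len(nb) >= min_pts for nb in neighbourhoods])
    labels = np.full(n, -1)
    cluster = 0
    for seed in range(n):
        if not is_core[seed] or labels[seed] != -1:
            continue                       # only unassigned core points start a cluster
        labels[seed] = cluster
        stack = [seed]
        while stack:                       # collect everything density-reachable from the seed
            p = stack.pop()
            for q in neighbourhoods[p]:
                if labels[q] == -1:
                    labels[q] = cluster
                    if is_core[q]:
                        stack.append(q)    # only core points extend the cluster further
        cluster += 1
    return X[labels != -1], labels
```

Points that are never reached from any core point keep the label -1 and are removed, which corresponds to discarding the noise points in step 4.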
Step 5, fusing the denoised multi-source heterogeneous data obtained in step 4 using a Kalman filtering algorithm.
The specific steps of the step 5 comprise:
(1) Exchanging and fusing the denoised data with adjacent data sequences, where the information matrix used is:
Figure BDA0003959279870000131
in the formula:
Figure BDA0003959279870000132
is the posterior estimate covariance matrix at time k;
Figure BDA0003959279870000133
is the state estimate at time k.
(2) The update of time series i is realized by the following algorithm:
Figure BDA0003959279870000134
In the formula: Q and R are the covariance matrices of the system noise and the observation noise, respectively. In actual fusion, each node i can receive data from the nodes of its subset and can use its local posterior covariance
Figure BDA0003959279870000135
together with the covariance matrix sent by adjacent nodes
Figure BDA0003959279870000136
to carry out data fusion.
(3) Letting the total denoised data set be N, data fusion is performed using each datum and its adjacent data, calculated as follows:
Figure BDA0003959279870000141
In the formula: W_{i,j} is the combination weight of the data.
(4) Repeating the above process until the fusion of the multi-source heterogeneous data of the multi-energy system is completed.
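The fusion of step 5 can be illustrated with a minimal scalar sketch. It is not the patented algorithm: it assumes a random-walk state model, assumes the errors of different nodes are independent, and combines node estimates by inverse-covariance (information) weighting as a stand-in for the combination weights W_{i,j}. The function names `kalman_step` and `fuse_information` are hypothetical.

```python
import numpy as np

def kalman_step(x, P, z, Q, R):
    """One predict/update cycle of a scalar Kalman filter with a random-walk model."""
    P_pred = P + Q                  # predict: state unchanged, uncertainty grows by Q
    K = P_pred / (P_pred + R)       # Kalman gain
    x_new = x + K * (z - x)         # correct the estimate with measurement z
    P_new = (1.0 - K) * P_pred      # posterior covariance
    return x_new, P_new

def fuse_information(estimates, covariances):
    """Fuse scalar node estimates by weighting each with its inverse covariance."""
    info = 1.0 / np.asarray(covariances, dtype=float)     # information of each node
    P_fused = 1.0 / info.sum()
    x_fused = P_fused * float((info * np.asarray(estimates, dtype=float)).sum())
    return x_fused, P_fused
```

In this sketch each node would run `kalman_step` on its own denoised series and then call `fuse_information` on its posterior and the posteriors received from adjacent nodes; a node with a smaller covariance contributes more to the fused value.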
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

Claims (10)

1. A multi-energy system multi-source heterogeneous data fusion method, characterized by comprising the following steps:
step 1, acquiring load data and environment monitoring data of a multi-energy system, and cleaning abnormal data;
step 2, carrying out data time registration on the cleaned multi-energy system load data and the environment monitoring data in the step 1;
step 3, data filling is carried out on the data registered in the step 2, and a more complete sensor data set is obtained;
step 4, carrying out data noise reduction on the complete data set filled in the step 3;
step 5, fusing the denoised multi-source heterogeneous data obtained in the step 4.
2. The multi-energy system multi-source heterogeneous data fusion method according to claim 1, characterized in that: the specific steps of the step 1 comprise:
(1) Acquiring load data and environmental monitoring data of a multi-energy system;
the load data comprises attribute parameters of equipment and real-time operation state data of a power grid; the environment monitoring data comprises temperature, humidity and vibration data;
(2) Adopting a local anomaly factor (LOF) algorithm, which assigns to each acquired data point an outlier score that depends on the density of its neighborhood, to calculate the local reachable density and the local anomaly factor of each data point;
the local reachable density ρ_i(x) and the local anomaly factor LOF_i(x) of the data are calculated as follows:
Figure FDA0003959279860000011
Figure FDA0003959279860000012
in the formula, N_i(x) is the i-th distance neighborhood of data point x; ρ_i(x) is the local reachable density; s_i(x, f_j) is the distance between monitoring data x and the contemporaneous monitoring data f_j of day j;
(3) Judging from the local anomaly factor value whether each data point is abnormal, and thereby obtaining the data after preliminary cleaning of abnormal data;
the judgment criteria are as follows:
1) LOF_i(x) close to 1 indicates that x probably lies in the same cluster as its neighborhood points;
2) LOF_i(x) < 1 indicates that the density of x is higher than that of its neighborhood points, and x is taken as a dense point;
3) LOF_i(x) > 1 indicates that the density of x is lower than that of its neighborhood points, and x is taken as an outlier.
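The cleaning step of claim 2 can be illustrated with a minimal Python sketch. The claim's own formulas are only available as figures, so the sketch assumes the standard textbook definition of the local outlier factor (k-distance, reachability distance, local reachability density) with Euclidean distance. The function name `local_outlier_factor` and the parameter `k` are illustrative and do not come from the application.

```python
import numpy as np

def local_outlier_factor(values, k=3):
    """Return one LOF score per sample (assumed standard LOF definition)."""
    X = np.asarray(values, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    n = len(X)
    # Pairwise Euclidean distances; a point is never its own neighbour.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    # Indices of the k nearest neighbours and each point's k-distance.
    neighbours = np.argsort(dist, axis=1)[:, :k]
    k_distance = np.sort(dist, axis=1)[:, k - 1]
    # Reachability distance of x from neighbour o: max(k_distance(o), d(x, o)).
    d_to_neighbours = dist[np.arange(n)[:, None], neighbours]
    reach = np.maximum(k_distance[neighbours], d_to_neighbours)
    # Local reachable density: inverse of the mean reachability distance.
    density = 1.0 / (reach.mean(axis=1) + 1e-12)
    # LOF: mean density of the neighbours divided by the point's own density.
    return density[neighbours].mean(axis=1) / density

readings = [10.0, 10.1, 9.9, 10.2, 9.8, 10.05, 25.0]
scores = local_outlier_factor(readings, k=3)
```

Under these assumptions the isolated reading 25.0 receives a score far above 1, while the clustered readings stay near 1, which matches the three judgment criteria above. The cut-off above which a point is discarded is a tuning choice that the claim does not fix.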
3. The multi-energy system multi-source heterogeneous data fusion method according to claim 1, characterized in that: the specific steps of the step 2 comprise:
(1) Performing data time registration, by a least squares method, on the multi-energy system load data and the environment monitoring data cleaned in step 1: let the sampling periods of sensors A1 and A2 be T1 and T2 respectively, denote the latest monitoring time of sensor A1 as (k-1)T1, and express the current time as kT1 = (k-1)T1 + nT2, that is, sensor A2 takes n monitoring values within one period of sensor A1;
(2) Let the monitoring value of sensor A1 be y_n and the monitoring sequence of sensor A2 be Y_n = (y_1, y_2, …, y_n)^T; the data set formed by the fused value of the n monitoring values and its derivative is
Figure FDA0003959279860000022
The processed data is
y_i = Y_n + (i - n) T Y_n + o_i
In the formula: o_i is the noise value arising during data measurement; y_n is the monitoring value of sensor A1; T is the sampling period; Y_n is the monitoring sequence of sensor A2;
(3) After time registration referenced to the sensor A1 data, the sensor A2 data and the measurement vector are expressed as
Figure FDA0003959279860000021
Wherein T' is the fusion time.
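A hedged sketch of the least-squares registration in claim 3 follows. It assumes the usual formulation in which the n samples that sensor A2 takes during one period of sensor A1 are modelled as a fused value plus a rate of change multiplied by the time offset (i - n)T, and both unknowns are estimated by ordinary least squares. The names `register_to_slow_sensor`, `fast_samples` and `fast_period` are illustrative only.

```python
import numpy as np

def register_to_slow_sensor(fast_samples, fast_period):
    """Fuse n samples of the faster sensor into one value aligned with the
    latest sample time of the slower sensor (assumed linear local model)."""
    y = np.asarray(fast_samples, dtype=float)
    n = len(y)
    # Time offsets (i - n) * T for i = 1..n; the last sample has offset 0.
    offsets = (np.arange(1, n + 1) - n) * fast_period
    # Design matrix for y_i = value + offset_i * rate + noise.
    design = np.column_stack([np.ones(n), offsets])
    (value, rate), *_ = np.linalg.lstsq(design, y, rcond=None)
    return value, rate

# Four A2 samples taken 0.5 time units apart within one A1 period.
value, rate = register_to_slow_sensor([1.0, 1.5, 2.0, 2.5], 0.5)
```

For the noise-free ramp in the example, the fit returns the last sample (2.5) as the registered value and a rate of 1.0 per time unit. With noisy samples the same call averages the noise across the n readings.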
4. The multi-energy system multi-source heterogeneous data fusion method according to claim 1, characterized in that: step 3 performs data filling on the data registered in step 2 by using a KNN algorithm to obtain a filled time-series data set, and specifically comprises the following steps:
(1) Performing a joint calculation on the historical data collected by the sensor m_i with missing measurements and by its neighboring sensors, to obtain the spatial correlation among the measurement sensors:
selecting the k sensors most highly correlated with the sensor m_i with missing measurements, the spatial correlation coefficient between sensor m_i and a neighbor sensor m_j being defined as:
Figure FDA0003959279860000031
in the formula: r is the spatial correlation between the two measurement sensors; y_{i,t-1} is the measurement data of sensor m_i at time t-1; y_{j,t-1} is the measurement data of sensor m_j at time t-1;
(2) Based on the calculated spatial correlation coefficients, calculating the filling weight of each sensor relative to the sensor with missing measurements, specifically:
Figure FDA0003959279860000032
in the formula: w_i is the weight coefficient; r is the spatial correlation between the two sensors;
(3) Based on the spatial correlation of the KNN algorithm, obtaining the measurement data of the k neighbor sensors and the corresponding weight coefficients, and filling the missing sensor data y_{i,t}, which can be written as
Figure FDA0003959279860000033
In the formula:
Figure FDA0003959279860000034
is the filling result for the missing data of sensor m_i at time t;
the above formula thus yields the filling result for the missing data of sensor m_i at time t.
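The filling procedure of claim 4 can be sketched as below. Because the correlation and weight formulas survive only as figures, the sketch makes two labeled assumptions: the spatial correlation is taken to be the Pearson coefficient over the shared history, and each neighbor's weight is its correlation divided by the sum over the k selected neighbors (which presumes positive correlations). The function `knn_fill` and its arguments are hypothetical names.

```python
import numpy as np

def knn_fill(history, target, current, k=2):
    """Estimate the missing reading of sensor `target` at the current time.

    history: (T, S) array of past readings, one column per sensor.
    current: length-S readings at the current time; the target entry is ignored.
    """
    history = np.asarray(history, dtype=float)
    current = np.asarray(current, dtype=float)
    # Assumed spatial correlation: Pearson coefficient between sensor columns.
    corr = np.corrcoef(history, rowvar=False)[target].copy()
    corr[target] = -np.inf  # the sensor cannot be its own neighbour
    # The k sensors most strongly correlated with the target.
    neighbours = np.argsort(corr)[::-1][:k]
    # Assumed weighting: correlations normalised to sum to one.
    weights = corr[neighbours] / corr[neighbours].sum()
    return float(weights @ current[neighbours]), neighbours, weights

history = [
    [1.0, 1.1, 2.0, 0.9],
    [2.0, 2.0, 1.0, 2.1],
    [3.0, 3.1, 2.0, 2.9],
    [4.0, 3.9, 1.0, 4.1],
    [5.0, 5.0, 2.0, 5.1],
]
filled, neighbours, weights = knn_fill(history, target=0,
                                       current=[np.nan, 6.0, 1.0, 6.2], k=2)
```

In this example sensors 1 and 3 track sensor 0 closely while sensor 2 does not, so the filled value is a weighted average of the readings 6.0 and 6.2.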
5. The multi-energy system multi-source heterogeneous data fusion method according to claim 1, characterized in that: the specific method of the step 4 comprises the following steps:
carrying out data noise reduction on the complete data set filled in the step 3 by adopting a DBSCAN algorithm;
the method comprises the following specific steps:
(1) Inputting the filled complete data set B = {y_1, y_2, …, y_n}, where ε is the radius parameter and MinPts is the minimum object count parameter, and marking all objects in data set B as unread;
(2) Taking from data set B a subset B_i containing an arbitrary number of data objects p, where B_i ∈ B, i = 1, 2, 3, …, and marking B_i as read;
(3) Judging p by the parameters ε and MinPts: if p is a core object, finding all data objects that are density-reachable from p and marking them as read; if p is not a core object and is not density-reachable from any object, marking p as noise data;
(4) While the condition
Figure FDA0003959279860000041
is satisfied, repeating (2) and (3) until all data are marked as read;
(5) Taking one core object as a seed, and classifying all density reachable points of the object into one class to form a data object set with a larger range, which is also called a cluster;
(6) Repeating (5) until all core objects have been traversed, thereby completing the noise reduction of the filled complete data set.
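The noise-reduction steps of claim 5 follow the general shape of DBSCAN, and a compact sketch is given below. It is an illustration under assumptions rather than the applicant's implementation: Euclidean distance is used, a point counts itself among its ε-neighbors, and the "unread/read" marks of the claim are represented by a `visited` array. The names `dbscan_labels` and `drop_noise` are illustrative.

```python
import numpy as np

NOISE = -1

def dbscan_labels(values, eps, min_pts):
    """Label each sample with a cluster id, or NOISE if no core object reaches it."""
    X = np.asarray(values, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    n = len(X)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # eps-neighbourhood of every point (the point itself is included).
    neighbours = [np.flatnonzero(dist[i] <= eps) for i in range(n)]
    labels = np.full(n, NOISE)
    visited = np.zeros(n, dtype=bool)  # "read" marks from the claim
    cluster = 0
    for p in range(n):
        if visited[p]:
            continue
        visited[p] = True
        if len(neighbours[p]) < min_pts:
            continue  # not a core object; stays noise unless reached later
        # p is a core object: grow a cluster from all density-reachable points.
        labels[p] = cluster
        queue = list(neighbours[p])
        while queue:
            q = queue.pop()
            if labels[q] == NOISE:
                labels[q] = cluster
            if not visited[q]:
                visited[q] = True
                if len(neighbours[q]) >= min_pts:
                    queue.extend(neighbours[q])
        cluster += 1
    return labels

def drop_noise(values, eps, min_pts):
    """Return the samples that belong to some cluster."""
    values = np.asarray(values, dtype=float)
    return values[dbscan_labels(values, eps, min_pts) != NOISE]

data = [1.0, 1.1, 1.2, 1.15, 5.0, 5.1, 5.05, 5.2, 20.0]
clean = drop_noise(data, eps=0.3, min_pts=3)
```

With these parameters the readings near 1 and near 5 form two clusters and the isolated value 20.0 is discarded as noise. Suitable values of ε and MinPts depend on the sensor scale and are not specified in the claim.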
6. The multi-energy system multi-source heterogeneous data fusion method according to claim 1, characterized in that: the specific steps of the step 5 comprise:
(1) Using a Kalman filtering algorithm, exchanging and fusing the denoised data with adjacent data sequences, where the information matrix adopted is:
Figure FDA0003959279860000042
in the formula:
Figure FDA0003959279860000043
is the a posteriori estimation covariance matrix at time k;
Figure FDA0003959279860000044
is the state estimate at time k;
(2) The update of time sequence i is performed by the following algorithm
Figure FDA0003959279860000051
In the formula: q and R are covariance matrices of system noise and observation noise respectively, and in actual fusion, each node i can receive data from nodes of the subset and can generate local posterior covariance of the node i
Figure FDA0003959279860000052
Covariance matrix sent to adjacent nodes
Figure FDA0003959279860000053
Carrying out data fusion;
(3) Setting the total data set after noise reduction as N, performing data fusion by using the data and the adjacent data thereof, and calculating by using the method
Figure FDA0003959279860000054
In the formula: w is a group of i,j A combining weight for the data;
(4) And continuously repeating the process until the multi-source heterogeneous data fusion of the multi-energy system is completed.
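The fusion step of claim 6 can be sketched as follows. The claim's update equations are only available as figures, so the code is an assumption-laden illustration: the local filter uses a random-walk state observed directly (transition and observation matrices both equal to the identity), and neighboring estimates are combined in information form, that is, weighted by their inverse covariances and by combining weights standing in for W_{i,j}. The names `kalman_update` and `fuse` are illustrative.

```python
import numpy as np

def kalman_update(x_prev, P_prev, z, Q, R):
    """One predict/update step, assuming identity transition and observation."""
    x_pred, P_pred = x_prev, P_prev + Q           # predict
    K = P_pred @ np.linalg.inv(P_pred + R)        # Kalman gain
    x_post = x_pred + K @ (z - x_pred)            # posterior state estimate
    P_post = (np.eye(len(x_prev)) - K) @ P_pred   # posterior covariance
    return x_post, P_post

def fuse(estimates, covariances, weights):
    """Combine neighbouring estimates in information (inverse-covariance) form."""
    info = sum(w * np.linalg.inv(P) for w, P in zip(weights, covariances))
    vec = sum(w * np.linalg.inv(P) @ x
              for w, P, x in zip(weights, covariances, estimates))
    P_fused = np.linalg.inv(info)
    return P_fused @ vec, P_fused

# Local update at one node, then fusion with a less certain neighbour.
x1, P1 = kalman_update(np.array([0.0]), np.array([[1.0]]),
                       np.array([2.0]), np.array([[0.0]]), np.array([[1.0]]))
x2, P2 = np.array([3.0]), np.array([[1.5]])
x_fused, P_fused = fuse([x1, x2], [P1, P2], weights=[1.0, 1.0])
```

The fused estimate lies between the two local estimates and closer to the one with the smaller covariance, which is the intended effect of exchanging covariance matrices between adjacent nodes. How the combining weights are chosen is left open here, as it is in the recoverable text of the claim.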
7. The utility model provides a heterogeneous data fusion system of multipotency source system multisource which characterized in that: the method comprises the following steps:
the acquisition module is used for acquiring load data and environment monitoring data of the multi-energy system and cleaning abnormal data;
the registration module is used for carrying out data time registration on the cleaned multi-energy system load data and the cleaned environmental monitoring data;
the filling module is used for filling data of the registered data to obtain a more complete sensor data set;
the noise reduction module is used for carrying out data noise reduction on the filled complete data set;
and the fusion module is used for fusing the multi-source heterogeneous data subjected to noise reduction.
8. The multi-energy system multi-source heterogeneous data fusion system according to claim 7, wherein: the acquisition module further comprises:
the data acquisition module is used for acquiring load data and environmental monitoring data of the multi-energy system;
the load data comprises attribute parameters of equipment and real-time operation state data of a power grid; the environment monitoring data comprises temperature, humidity and vibration data;
the abnormal data cleaning module is used for adopting a local anomaly factor (LOF) algorithm, which assigns to each acquired data point an outlier score that depends on the density of its neighborhood, to calculate the local reachable density and the local anomaly factor of each data point;
the local reachable density ρ_i(x) and the local anomaly factor LOF_i(x) of the data are calculated as follows:
Figure FDA0003959279860000061
Figure FDA0003959279860000062
in the formula, N_i(x) is the i-th distance neighborhood of data point x; ρ_i(x) is the local reachable density; s_i(x, f_j) is the distance between monitoring data x and the contemporaneous monitoring data f_j of day j;
the available data acquisition module is used for judging from the local anomaly factor value whether each data point is abnormal, and thereby obtaining the available data after preliminary cleaning of abnormal data;
the judgment criteria are as follows:
1) LOF_i(x) close to 1 indicates that x probably lies in the same cluster as its neighborhood points;
2) LOF_i(x) < 1 indicates that the density of x is higher than that of its neighborhood points, and x is taken as a dense point;
3) LOF_i(x) > 1 indicates that the density of x is lower than that of its neighborhood points, and x is taken as an outlier.
9. A computer-readable storage medium having stored thereon a computer program, characterized in that: the program, when executed by a processor, implements the steps of the method of any one of claims 1 to 6.
10. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that: the processor, when executing the program, implements the steps of the method of any one of claims 1 to 6.
CN202211474597.4A 2022-11-23 2022-11-23 Multi-energy system multi-source heterogeneous data fusion method and system Pending CN115718906A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211474597.4A CN115718906A (en) 2022-11-23 2022-11-23 Multi-energy system multi-source heterogeneous data fusion method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211474597.4A CN115718906A (en) 2022-11-23 2022-11-23 Multi-energy system multi-source heterogeneous data fusion method and system

Publications (1)

Publication Number Publication Date
CN115718906A true CN115718906A (en) 2023-02-28

Family

ID=85256078

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211474597.4A Pending CN115718906A (en) 2022-11-23 2022-11-23 Multi-energy system multi-source heterogeneous data fusion method and system

Country Status (1)

Country Link
CN (1) CN115718906A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116662326A (en) * 2023-07-26 2023-08-29 江西省检验检测认证总院计量科学研究院 Multi-energy variety data cleaning and collecting method
CN116662326B (en) * 2023-07-26 2023-10-20 江西省检验检测认证总院计量科学研究院 Multi-energy variety data cleaning and collecting method
CN117942079A (en) * 2024-03-27 2024-04-30 山东大学 Emotion intelligence classification method and system based on multidimensional sensing and fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination