CN115718906A - Multi-energy system multi-source heterogeneous data fusion method and system - Google Patents
- Publication number
- CN115718906A (application CN202211474597.4A)
- Authority
- CN
- China
- Prior art keywords
- data
- sensor
- density
- energy system
- fusion
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention relates to a multi-source heterogeneous data fusion method and system for a multi-energy system. The method comprises the following steps: step 1, acquiring load data and environmental monitoring data of the multi-energy system, and preliminarily cleaning abnormal data; step 2, because the sampling periods of the sensors differ, performing time registration on the load data and environmental monitoring data cleaned in step 1; step 3, because the measurement data acquired by each sensor may be missing, filling the data registered in step 2 to obtain a more complete sensor data set; step 4, denoising the complete data set filled in step 3 with the DBSCAN algorithm; and step 5, fusing the denoised multi-source heterogeneous data obtained in step 4 with a Kalman filtering algorithm. The method can effectively improve the fusion efficiency of multi-source heterogeneous data and reduce the data fusion error.
Description
Technical Field
The invention belongs to the technical field of data processing of a multi-energy system, and relates to a multi-energy system multi-source heterogeneous data fusion method and system.
Background
With the development of condition monitoring technology, the growing structural complexity of the power grid, and the spread of artificial intelligence applications in the grid, the data acquired by intelligent power equipment are large in scale, fast-changing, multi-source and heterogeneous, and low in value density. As power equipment big data grows explosively, traditional data processing techniques can no longer extract knowledge and analyze information from massive data safely and accurately, so multi-source heterogeneous big data cleaning and fusion technology plays an important role in the stable, safe and reliable operation of the smart grid.
Scholars at home and abroad have studied the fusion of multi-source heterogeneous data in multi-energy systems. Nie Qingke proposed a multi-source heterogeneous monitoring data fusion method that preprocesses and denoises different types of raw monitoring data with wavelet decomposition and then fuses monitoring data of the same type from different monitoring points into a single monitoring sequence by the entropy weight method. Wang Gang proposed a new method for analyzing and judging the state of a utility power cabin, which uses middleware technology to fuse, at the data layer, multiple distributed data sources from the utility power cabin. Ngiam extracted features from audio and video data with a deep learning model and then combined the audio and video features into an integrated feature vector of the target object, realizing nonlinear fusion of heterogeneous information feature sources. Mo Huiling proposed a multi-source heterogeneous data fusion algorithm based on Tucker decomposition in federated learning, which introduces tensor Tucker decomposition theory to construct a high-order tensor with heterogeneous spatial dimension characteristics, thereby fusing multi-source heterogeneous data in federated learning.
In summary, current research lacks an effective multi-source heterogeneous data fusion technology, and the existing multi-source heterogeneous data fusion methods for multi-energy systems suffer from large fusion errors and long fusion times. Meanwhile, the large volume of monitoring data from the various sensors is not processed effectively. Data preprocessing techniques such as data registration, filling and noise reduction therefore need to be studied to lay a good foundation for data fusion.
Disclosure of Invention
The invention aims to overcome the defects in the prior art, and provides a multi-energy system multi-source heterogeneous data fusion method and system, which can effectively improve the fusion efficiency of multi-source heterogeneous data and reduce the error of data fusion.
The invention solves the practical problem by adopting the following technical scheme:
A multi-energy system multi-source heterogeneous data fusion method comprises the following steps:
step 1, acquiring load data and environment monitoring data of a multi-energy system, and cleaning abnormal data;
step 2, carrying out data time registration on the multi-energy system load data and the environment monitoring data cleaned in step 1;
step 3, carrying out data filling on the data registered in step 2 to obtain a more complete sensor data set;
step 4, carrying out data noise reduction on the complete data set filled in step 3;
step 5, fusing the denoised multi-source heterogeneous data obtained in step 4.
Further, the specific steps of step 1 include:
(1) Acquiring load data and environmental monitoring data of a multi-energy system;
the load data comprises attribute parameters of equipment and real-time operation state data of a power grid; the environment monitoring data comprises temperature, humidity and vibration data;
(2) Using the local outlier factor (LOF) algorithm, assigning each acquired data point an outlier score that depends on the density of its neighborhood, and calculating the local reachable density and the local outlier factor of each data point;
the local reachable density $\rho_i(x)$ and the local outlier factor $\mathrm{LOF}_i(x)$ of the data are calculated as follows:
in the formula, $N_i(x)$ is the $i$-th distance neighborhood of data point $x$; $\rho_i(x)$ is the local reachable density; $s_i(x, f_j)$ is the distance between the monitoring data $x$ and the synchronous monitoring data $f_j$ of day $j$;
(3) Judging whether each data point is abnormal from its local outlier factor value, thereby obtaining the available data after the abnormal data are preliminarily cleaned;
the judgment criteria are as follows:
1) $\mathrm{LOF}_i(x)$ close to 1 indicates that $x$ probably belongs to the same cluster as its neighborhood points;
2) $\mathrm{LOF}_i(x) < 1$ indicates that the density of $x$ is higher than that of its neighborhood points, so $x$ is a dense point;
3) $\mathrm{LOF}_i(x) > 1$ indicates that the density of $x$ is lower than that of its neighborhood points, so $x$ is an outlier.
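The cleaning rule in sub-steps (2) and (3) can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: it uses Euclidean distance and the standard reachability-distance form of LOF, and the cut-off `threshold=1.5` is a hypothetical choice (the text only states that values greater than 1 indicate outliers). The names `lof_scores` and `clean_outliers` are illustrative and do not come from the original.

```python
import math

def lof_scores(points, k):
    """Standard local outlier factor of every point (Euclidean distance).

    Assumes more than k points and no group of k+1 identical points,
    otherwise a reachability sum could be zero.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    neigh, kdist = [], []
    for i in range(n):
        # k nearest neighbours of point i (excluding itself) and its k-distance
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
        neigh.append(order)
        kdist.append(dist[i][order[-1]])
    lrd = []
    for i in range(n):
        # local reachable density = inverse of the mean reachability distance
        reach = [max(kdist[j], dist[i][j]) for j in neigh[i]]
        lrd.append(len(reach) / sum(reach))
    # LOF = mean ratio of the neighbours' density to the point's own density
    return [sum(lrd[j] for j in neigh[i]) / (len(neigh[i]) * lrd[i]) for i in range(n)]

def clean_outliers(points, k=2, threshold=1.5):
    """Keep only points whose LOF does not exceed the (assumed) threshold."""
    scores = lof_scores(points, k)
    return [p for p, s in zip(points, scores) if s <= threshold]

readings = [(0.0,), (0.1,), (0.2,), (0.3,), (5.0,)]
print(clean_outliers(readings))  # the isolated reading 5.0 is dropped
```

Points in the dense group score close to 1, while the isolated reading scores far above 1, which matches the three judgment criteria listed above.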
Further, the specific steps of step 2 include:
(1) Performing time registration on the multi-energy system load data and environment monitoring data cleaned in step 1 by the least squares method: let the sampling periods of sensors A1 and A2 be $T_1$ and $T_2$ respectively, and let the latest monitoring time of sensor A1 be $(k-1)T_1$; the current time is then $kT_1 = (k-1)T_1 + nT_2$, that is, sensor A2 takes $n$ measurements within one period of sensor A1;
(2) Let the monitoring value of sensor A1 be $y_n$ and the monitoring sequence of sensor A2 be $Y_n = (y_1, y_2, \ldots, y_n)^{T}$; the fused value $\bar{y}$ of the $n$ monitoring values and its derivative $\dot{\bar{y}}$ form a data set, and the processed data are
$$y_i = \bar{y} + (i-n)\,T\,\dot{\bar{y}} + o_i$$
in the formula: $o_i$ is the noise arising in the data measurement process; $y_n$ is the monitoring value of sensor A1; $T$ is the sampling period; $Y_n$ is the monitoring sequence of sensor A2;
(3) After time registration based on the sensor A1 data, the sensor A2 data and the measurement vector are expressed as
where $T'$ is the fusion time.
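A least-squares registration of this kind can be sketched in Python as follows. The sketch assumes the linear model described in sub-step (2) (a fused value plus a slope term over the $n$ equally spaced A2 samples) and uses the well-known closed-form least-squares solution for the value at the last sampling instant; it is not claimed to be the exact expression of the original. The function name `register_to_slow_sensor` is hypothetical.

```python
def register_to_slow_sensor(fast_samples):
    """Fuse the n samples that the faster sensor A2 produced during one
    period of A1 into a single value aligned with A1's sampling instant.

    Fits a straight line to the samples by least squares and evaluates it
    at the last sample time (i = n). For equally spaced samples the result
    reduces to c1 * sum(y_i) + c2 * sum(i * y_i); the sampling period
    cancels out of the fused value.
    """
    n = len(fast_samples)
    if n == 0:
        raise ValueError("at least one sample is required")
    c1 = -2.0 / n
    c2 = 6.0 / (n * (n + 1))
    s0 = sum(fast_samples)
    s1 = sum(i * y for i, y in enumerate(fast_samples, start=1))
    return c1 * s0 + c2 * s1

# Three A2 readings rising linearly within one A1 period
print(register_to_slow_sensor([1.0, 2.0, 3.0]))
```

For noise-free linear data the fused value equals the last sample, and for constant data it equals that constant, which is a quick sanity check on the coefficients.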
Moreover, in step 3 the data registered in step 2 are filled using the KNN algorithm, and the specific steps of obtaining the filled time-series data set comprise:
(1) Performing a joint calculation on the historical data collected by the neighbor sensors of the sensor $m_i$ with missing measurements to obtain the spatial correlation between the measurement sensors:
selecting the $k$ sensors most highly correlated with the sensor $m_i$ with missing measurements, and defining the spatial correlation coefficient between sensor $m_i$ and a neighbor sensor $m_j$ as:
in the formula: $R$ is the spatial correlation between the two measurement sensors; $y_{i,t-1}$ is the measurement data of sensor $m_i$ at time $t-1$; $y_{j,t-1}$ is the measurement data of sensor $m_j$ at time $t-1$;
(2) Based on the calculated spatial correlation coefficients, calculating the filling weight of each sensor relative to the sensor with missing measurements, specifically:
in the formula: $w_i$ is the weight coefficient; $R$ is the spatial correlation between the two sensors;
(3) Based on the KNN spatial correlation, obtaining the measurement data of the $k$ neighbor sensors and the corresponding weight coefficients, and filling the missing sensor data $y_{i,t}$, which can be written as
by the above formula, the filling result of the missing data of sensor $m_i$ at time $t$ is obtained.
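The three filling sub-steps can be sketched in Python as follows. Several choices here are assumptions made for illustration only: the spatial correlation is taken to be the Pearson coefficient of the sensors' histories, the weights are the selected correlations normalized to sum to 1, and the neighbors are assumed to be positively correlated with, and on the same scale as, the sensor being filled. The names `pearson` and `knn_fill` are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equally long histories (assumed measure)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def knn_fill(target_history, neighbor_histories, neighbor_now, k):
    """Estimate the missing reading of a sensor at the current time.

    target_history: past readings of the sensor with the missing value
    neighbor_histories: {name: past readings} for candidate neighbors
    neighbor_now: {name: current reading} for the same neighbors
    """
    corr = {name: pearson(target_history, hist)
            for name, hist in neighbor_histories.items()}
    # the k neighbors with the highest correlation
    nearest = sorted(corr, key=corr.get, reverse=True)[:k]
    # normalized correlation as filling weight (assumes positive correlations)
    total = sum(corr[name] for name in nearest)
    return sum(corr[name] / total * neighbor_now[name] for name in nearest)

history = [1.0, 2.0, 3.0, 4.0]
neighbors = {"a": [1.0, 2.0, 3.0, 4.0], "b": [2.0, 3.0, 4.0, 5.0], "c": [3.0, 1.0, 4.0, 1.0]}
current = {"a": 5.0, "b": 6.0, "c": 0.0}
print(knn_fill(history, neighbors, current, k=2))
```

In this toy case sensors `a` and `b` are selected with equal weight and the poorly correlated sensor `c` is ignored.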
Moreover, the specific method of step 4 is as follows:
carrying out data noise reduction on the complete data set filled in step 3 by adopting the DBSCAN algorithm;
the specific steps are as follows:
(1) Inputting the filled complete data set $B = \{y_1, y_2, \ldots, y_n\}$, where $\varepsilon$ is the radius parameter and MinPts is the minimum number of objects, and marking all objects in data set $B$ as unvisited;
(2) Taking from data set $B$ a subset $B_i$ containing an arbitrary number of data objects $p$, where $B_i \subseteq B$, $i = 1, 2, 3, \ldots$, and marking $B_i$ as visited;
(3) Judging $p$ through the parameters $\varepsilon$ and MinPts: if $p$ is a core object, finding all data objects density-reachable from $p$ and marking them as visited; if $p$ is not a core object and is not density-reachable from any object, marking $p$ as noise data;
(5) Taking one core object as a seed and grouping all points density-reachable from it into one class, forming a larger set of data objects, also called a cluster;
(6) Repeating step (5) until all core objects have been traversed, which completes the noise reduction of the filled complete data set.
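The noise-reduction procedure above can be sketched in Python as follows. This is a plain textbook DBSCAN written for illustration, assuming Euclidean distance and an $\varepsilon$-neighborhood that includes the point itself; the function name `dbscan_denoise` and the label convention (`-1` for noise) are assumptions, not taken from the original.

```python
import math

def dbscan_denoise(points, eps, min_pts):
    """Cluster points with DBSCAN and drop the ones labelled as noise.

    Returns (kept_points, labels); labels[i] is a cluster id, or -1 for noise.
    """
    n = len(points)
    labels = [None] * n  # None means "unvisited"

    def neighbors(i):
        # eps-neighborhood of point i (includes i itself)
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise; may later become a border point
            continue
        labels[i] = cluster
        while seeds:
            j = seeds.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            elif labels[j] is None:
                labels[j] = cluster
                reach = neighbors(j)
                if len(reach) >= min_pts:  # j is a core point: keep expanding
                    seeds.extend(reach)
        cluster += 1
    kept = [p for p, lab in zip(points, labels) if lab != -1]
    return kept, labels

data = [(0.0,), (0.1,), (0.2,), (10.0,), (10.1,), (10.2,), (50.0,)]
kept, labels = dbscan_denoise(data, eps=0.15, min_pts=2)
print(labels)
```

The two dense groups form two clusters and the isolated reading is labelled as noise and removed from `kept`.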
Further, the specific steps of step 5 include:
(1) Using a Kalman filtering algorithm, exchanging and fusing the denoised data with the adjacent data sequences, wherein the information matrix used is:
in the formula, the two quantities are the posterior estimate covariance matrix at time $k$ and the state estimate at time $k$;
(2) The update of time sequence $i$ is realized by the following algorithm
in the formula: $Q$ and $R$ are the covariance matrices of the system noise and the observation noise respectively; in actual fusion, each node $i$ can receive data from the nodes of its subset, and fuses its local posterior covariance with the covariance matrices sent by the adjacent nodes;
(3) Letting the total denoised data set be $N$, performing data fusion using each datum and its adjacent data, calculated as follows
in the formula: $W_{i,j}$ is the combination weight of the data;
(4) Repeating the above process until the multi-source heterogeneous data fusion of the multi-energy system is completed.
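The fusion step can be illustrated with a scalar Python sketch. It is an assumption-laden simplification, not the original algorithm: the state is a single value following a random-walk model, each node runs an ordinary Kalman predict/update with process variance `q` and measurement variance `r`, and neighboring estimates are combined in information form (inverse variance) with combination weights that sum to 1. The names `kalman_step` and `fuse_information` are hypothetical.

```python
def kalman_step(x, p, z, q, r):
    """One scalar Kalman predict/update for a random-walk state.

    x, p: previous state estimate and its variance
    z:    new measurement; q, r: process and measurement noise variances
    """
    p_pred = p + q                    # predict: variance grows by the process noise
    gain = p_pred / (p_pred + r)      # Kalman gain
    x_new = x + gain * (z - x)        # update with the innovation
    p_new = (1.0 - gain) * p_pred     # posterior variance
    return x_new, p_new

def fuse_information(estimates, weights):
    """Combine (estimate, variance) pairs from a node and its neighbors.

    Works in information form: information = 1/variance and
    information vector = estimate/variance, each weighted by W_ij.
    """
    info = sum(w / p for (x, p), w in zip(estimates, weights))
    info_vec = sum(w * x / p for (x, p), w in zip(estimates, weights))
    return info_vec / info, 1.0 / info

local = kalman_step(x=0.0, p=1.0, z=2.0, q=0.0, r=1.0)
fused = fuse_information([local, (3.0, 0.5)], weights=[0.5, 0.5])
print(local, fused)
```

With equal weights and equal variances the fused estimate is simply the mean of the two node estimates; unequal variances shift it toward the more certain node.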
A multi-energy system multi-source heterogeneous data fusion system, characterized by comprising:
an acquisition module, used for acquiring load data and environment monitoring data of the multi-energy system and cleaning abnormal data;
a registration module, used for carrying out data time registration on the cleaned multi-energy system load data and environment monitoring data;
a filling module, used for filling the registered data to obtain a more complete sensor data set;
a noise reduction module, used for carrying out data noise reduction on the filled complete data set;
and a fusion module, used for fusing the denoised multi-source heterogeneous data.
Moreover, the acquisition module further comprises:
a data acquisition module, used for acquiring load data and environment monitoring data of the multi-energy system;
the load data comprise attribute parameters of equipment and real-time operation state data of the power grid; the environment monitoring data comprise temperature, humidity and vibration data;
an abnormal data cleaning module, which uses the local outlier factor (LOF) algorithm to assign each acquired data point an outlier score that depends on the density of its neighborhood, and calculates the local reachable density and the local outlier factor of each data point;
the local reachable density $\rho_i(x)$ and the local outlier factor $\mathrm{LOF}_i(x)$ of the data are calculated as follows:
in the formula, $N_i(x)$ is the $i$-th distance neighborhood of data point $x$; $\rho_i(x)$ is the local reachable density; $s_i(x, f_j)$ is the distance between the monitoring data $x$ and the synchronous monitoring data $f_j$ of day $j$;
an available data acquisition module, which judges whether each data point is abnormal from its local outlier factor value, thereby obtaining the available data after the abnormal data are preliminarily cleaned;
the judgment criteria are as follows:
1) $\mathrm{LOF}_i(x)$ close to 1 indicates that $x$ probably belongs to the same cluster as its neighborhood points;
2) $\mathrm{LOF}_i(x) < 1$ indicates that the density of $x$ is higher than that of its neighborhood points, so $x$ is a dense point;
3) $\mathrm{LOF}_i(x) > 1$ indicates that the density of $x$ is lower than that of its neighborhood points, so $x$ is an outlier.
A computer-readable storage medium on which a computer program is stored, the computer program implementing the steps of the above method when executed by a processor.
An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the above method when executing the program.
The invention has the advantages and beneficial effects that:
1. The invention provides a multi-source heterogeneous data fusion method for a multi-energy system, which can effectively improve the fusion efficiency of multi-source heterogeneous data and reduce the data fusion error. The cleaning of abnormal data in step 1 reduces the influence of abnormal monitoring data on the fusion result; the data registration in step 2 aligns the data in time; the data filling in step 3 makes the data density consistent across sensors; the data noise reduction in step 4 improves the accuracy of the filled data; and step 5, building on this data processing, achieves high-precision fusion of the multi-source heterogeneous data.
2. The invention uses the least squares method, the KNN algorithm and the DBSCAN algorithm for registration, filling and noise reduction of the acquired data respectively, providing key technical support for the fusion of multi-source heterogeneous data and improving the information accuracy and fault tolerance of the multi-energy system.
Drawings
Fig. 1 is a processing flow chart of a multi-source heterogeneous data fusion method of a multi-energy system according to the present invention.
Detailed Description
The embodiments of the invention will be described in further detail below with reference to the accompanying drawings:
A multi-source heterogeneous data fusion method of a multi-energy system, as shown in Fig. 1, comprises the following steps:
Step 1, acquiring load data and environment monitoring data of the multi-energy system, and preliminarily cleaning abnormal data;
the specific steps of step 1 comprise:
(1) Acquiring multi-energy system load data, including attribute parameters of equipment and real-time operation state data of the power grid; collecting environment monitoring data, including temperature, humidity and vibration data;
(2) Using the local outlier factor (LOF) algorithm, assigning each acquired data point an outlier score that depends on the density of its neighborhood, and calculating the local reachable density and the local outlier factor of each data point;
the local reachable density $\rho_i(x)$ and the local outlier factor $\mathrm{LOF}_i(x)$ of the data are calculated as follows:
in the formula, $N_i(x)$ is the $i$-th distance neighborhood of data point $x$; $\rho_i(x)$ is the local reachable density; $s_i(x, f_j)$ is the distance between the monitoring data $x$ and the synchronous monitoring data $f_j$ of day $j$;
(3) Judging whether each data point is abnormal from its local outlier factor value, thereby obtaining the available data after the abnormal data are preliminarily cleaned;
the judgment criteria are as follows:
1) $\mathrm{LOF}_i(x)$ close to 1 indicates that $x$ probably belongs to the same cluster as its neighborhood points;
2) $\mathrm{LOF}_i(x) < 1$ indicates that the density of $x$ is higher than that of its neighborhood points, so $x$ is a dense point;
3) $\mathrm{LOF}_i(x) > 1$ indicates that the density of $x$ is lower than that of its neighborhood points, so $x$ is an outlier;
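For reference, the standard textbook form of the two quantities, written with the symbols defined above, is given below. It assumes that $s_i(x, f_j)$ plays the role of the reachability distance and that $|N_i(x)|$ is the number of points in the neighborhood; it is offered as a plausible reading, not as the exact expression of the original.

```latex
\[
\rho_i(x) = \left( \frac{1}{|N_i(x)|} \sum_{f_j \in N_i(x)} s_i(x, f_j) \right)^{-1},
\qquad
\mathrm{LOF}_i(x) = \frac{1}{|N_i(x)|} \sum_{f_j \in N_i(x)} \frac{\rho_i(f_j)}{\rho_i(x)}
\]
```

Under this form, a point whose neighbors are about as dense as itself has a ratio near 1, and a point in a sparser region than its neighbors has a ratio above 1, which is the basis of the three criteria.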
Step 2, because the sampling periods of the sensors differ, carrying out data time registration on the multi-energy system load data and environment monitoring data cleaned in step 1;
the specific steps of step 2 comprise:
(1) Performing time registration on the multi-energy system load data and environment monitoring data cleaned in step 1 by the least squares method: let the sampling periods of sensors A1 and A2 be $T_1$ and $T_2$ respectively, and let the latest monitoring time of sensor A1 be $(k-1)T_1$; the current time is then $kT_1 = (k-1)T_1 + nT_2$, that is, sensor A2 takes $n$ measurements within one period of sensor A1;
(2) Let the monitoring value of sensor A1 be $y_n$ and the monitoring sequence of sensor A2 be $Y_n = (y_1, y_2, \ldots, y_n)^{T}$; the fused value $\bar{y}$ of the $n$ monitoring values and its derivative $\dot{\bar{y}}$ form a data set, and the processed data are
$$y_i = \bar{y} + (i-n)\,T\,\dot{\bar{y}} + o_i$$
in the formula: $o_i$ is the noise arising in the data measurement process; $y_n$ is the monitoring value of sensor A1; $T$ is the sampling period; $Y_n$ is the monitoring sequence of sensor A2.
(3) After time registration based on the sensor A1 data, the sensor A2 data and the measurement vector are expressed as
where $T'$ is the fusion time.
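As a worked illustration of least-squares registration, the usual textbook derivation is summarized below. It assumes the linear model of sub-step (2), writes $\bar{y}$ for the fused value and $\dot{\bar{y}}$ for its derivative, collects the noise terms in a vector $O_n$, and uses $\top$ for the transpose to avoid confusion with the sampling period $T$. The symbols $U$, $W_n$, $O_n$, $c_1$ and $c_2$ are introduced here for illustration and may differ from the original notation.

```latex
\[
Y_n = W_n U + O_n, \qquad
U = \begin{pmatrix} \bar{y} \\ \dot{\bar{y}} \end{pmatrix}, \qquad
W_n = \begin{pmatrix} 1 & (1-n)T \\ 1 & (2-n)T \\ \vdots & \vdots \\ 1 & (n-n)T \end{pmatrix}
\]
\[
\hat{U} = \left( W_n^{\top} W_n \right)^{-1} W_n^{\top} Y_n
\]
\[
\hat{\bar{y}} = c_1 \sum_{i=1}^{n} y_i + c_2 \sum_{i=1}^{n} i\, y_i, \qquad
c_1 = -\frac{2}{n}, \quad c_2 = \frac{6}{n(n+1)}
\]
```

A quick check with $n = 2$ gives $c_1 = -1$ and $c_2 = 1$, so the fused value is $y_2$, the sample aligned with the A1 sampling instant, as expected for a line through two points.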
Step 3, because the measurement data acquired by each sensor may be missing, performing data filling on the data registered in step 2 to obtain a more complete sensor data set, improving the information accuracy and fault tolerance of the multi-energy system;
in step 3, the data registered in step 2 are filled using the KNN algorithm, and the specific steps of obtaining the filled time-series data set comprise:
(1) For the missing measurement data $y_{i,t}$ in the multi-energy system load data and environment monitoring data registered in step 2, performing a joint calculation on the historical data collected by the neighbor sensors of the sensor $m_i$ with missing measurements to obtain the spatial correlation between the measurement sensors:
selecting the $k$ sensors most highly correlated with the sensor $m_i$ with missing measurements, and defining the spatial correlation coefficient between sensor $m_i$ and a neighbor sensor $m_j$ as:
in the formula: $R$ is the spatial correlation between the two measurement sensors; $y_{i,t-1}$ is the measurement data of sensor $m_i$ at time $t-1$; $y_{j,t-1}$ is the measurement data of sensor $m_j$ at time $t-1$.
(2) Based on the calculated spatial correlation coefficients, calculating the filling weight of each sensor relative to the sensor with missing measurements, specifically:
in the formula: $w_i$ is the weight coefficient; $R$ is the spatial correlation between the two sensors;
(3) Based on the KNN spatial correlation, obtaining the measurement data of the $k$ neighbor sensors and the corresponding weight coefficients, and filling the missing sensor data $y_{i,t}$, which can be written as
By the above formula, the filling result of the missing data of sensor $m_i$ at time $t$ is obtained.
In this embodiment, step 3 uses the KNN algorithm for data filling: to fill missing measurement data, a joint calculation is performed on the historical data collected by the sensors adjacent to the sensor with missing measurements to obtain the spatial correlation between the measurement sensors, and, based on the spatial correlation coefficients, the filling weight of each sensor relative to the sensor with missing measurements is calculated and used to fill the missing sensor data.
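One plausible concrete form of the three quantities is sketched below. It is an assumption for illustration: the correlation is taken as a Pearson coefficient over the historical samples, the weights are the correlations of the $k$ selected neighbors normalized to sum to 1, and the filled value is their weighted sum. The subscripted notation $R_{ij}$, $w_j$, $\bar{y}_i$ and $\hat{y}_{i,t}$ is introduced here and may differ from the original.

```latex
\[
R_{ij} = \frac{\sum_{t} \left( y_{i,t-1} - \bar{y}_i \right)\left( y_{j,t-1} - \bar{y}_j \right)}
{\sqrt{\sum_{t} \left( y_{i,t-1} - \bar{y}_i \right)^2}\,\sqrt{\sum_{t} \left( y_{j,t-1} - \bar{y}_j \right)^2}},
\qquad
w_j = \frac{R_{ij}}{\sum_{l=1}^{k} R_{il}},
\qquad
\hat{y}_{i,t} = \sum_{j=1}^{k} w_j\, y_{j,t}
\]
```

Here $\bar{y}_i$ and $\bar{y}_j$ denote the historical means of the two sensors. With two equally correlated neighbors reading 5 and 6, for example, the filled value would be 5.5.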
Step 4, based on the complete data set filled in step 3, adopting the DBSCAN algorithm to perform data noise reduction;
the core of the DBSCAN algorithm lies in the parameters $\varepsilon$ and MinPts: the neighborhood and the core objects are determined through these two parameters, and the density-reachable points are then found from the core objects, so that the data objects are clustered and the data noise reduction is completed.
The specific steps of step 4 comprise:
(1) Inputting the filled complete data set $B = \{y_1, y_2, \ldots, y_n\}$, where $\varepsilon$ is the radius parameter and MinPts is the minimum number of objects, and marking all objects in data set $B$ as unvisited.
(2) Taking from data set $B$ a subset $B_i$ containing an arbitrary number of data objects $p$, where $B_i \subseteq B$, $i = 1, 2, 3, \ldots$, and marking $B_i$ as visited.
(3) Judging $p$ through the parameters $\varepsilon$ and MinPts: if $p$ is a core object, finding all data objects density-reachable from $p$ and marking them as visited. If $p$ is not a core object and is not density-reachable from any object, $p$ is marked as noise data.
(5) Taking one core object as a seed and grouping all points density-reachable from it into one class, forming a larger set of data objects, also called a cluster.
(6) Repeating step (5) until all core objects have been traversed, which completes the noise reduction of the filled complete data set.
The working principle of step 4 is as follows:
$\varepsilon$ characterizes the threshold on the distance between individuals; MinPts characterizes the threshold on the number of individuals within this distance.
For the sample set $B = (y_1, y_2, \ldots, y_n)$, the DBSCAN algorithm uses the following definitions:
1) $\varepsilon$-neighborhood: for an individual $y_i$ in $B$, its $\varepsilon$-neighborhood is the subset of samples in $B$ whose distance from $y_i$ does not exceed $\varepsilon$, namely:
$$N_\varepsilon(y_i) = \{\, y_j \in B \mid \mathrm{dist}(y_i, y_j) \le \varepsilon \,\} \qquad (1)$$
and $|N_\varepsilon(y_i)|$ denotes the number of individuals in this $\varepsilon$-neighborhood;
2) Core point: for an individual $y_i$ in $B$, if the number of individuals in its $\varepsilon$-neighborhood is not less than MinPts, that is,
$$|N_\varepsilon(y_i)| \ge \mathrm{MinPts} \qquad (2)$$
then $y_i$ is called a core point;
3) Directly density-reachable: for a core point $y_i$ in $B$, all points in its $\varepsilon$-neighborhood are directly density-reachable from that core point;
4) Density-reachable: for $y_i$ and $y_j$, if there is a sample sequence $p_1, p_2, \ldots, p_m$ satisfying $p_1 = y_i$, $p_m = y_j$, in which each $p_{t+1}$ is directly density-reachable from $p_t$, then $y_j$ is density-reachable from $y_i$; that is, density-reachability is transitive;
5) Density-connected: for $y_i$ and $y_j$, if there is a core point $y_k$ such that both $y_i$ and $y_j$ are density-reachable from $y_k$, then $y_i$ and $y_j$ are density-connected.
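Definitions 1) to 4) translate almost directly into code. The following Python sketch assumes Euclidean distance and represents samples by their index in a list; the function names `eps_neighborhood`, `is_core` and `density_reachable` are illustrative only.

```python
import math
from collections import deque

def eps_neighborhood(points, i, eps):
    """Indices of all samples within distance eps of sample i (definition 1)."""
    return [j for j in range(len(points)) if math.dist(points[i], points[j]) <= eps]

def is_core(points, i, eps, min_pts):
    """Sample i is a core point if its neighborhood holds at least min_pts samples (definition 2)."""
    return len(eps_neighborhood(points, i, eps)) >= min_pts

def density_reachable(points, src, dst, eps, min_pts):
    """True if dst is density-reachable from src (definition 4).

    Follows chains of directly density-reachable steps: every intermediate
    sample in the chain must itself be a core point.
    """
    if not is_core(points, src, eps, min_pts):
        return False
    seen = {src}
    queue = deque([src])
    while queue:
        cur = queue.popleft()
        for nxt in eps_neighborhood(points, cur, eps):
            if nxt == dst:
                return True
            if nxt not in seen:
                seen.add(nxt)
                if is_core(points, nxt, eps, min_pts):
                    queue.append(nxt)  # only core points extend the chain
    return False

samples = [(0.0,), (1.0,), (2.0,), (3.0,), (10.0,)]
print(density_reachable(samples, 1, 3, eps=1.0, min_pts=3))  # chain 1 -> 2 -> 3
print(density_reachable(samples, 3, 1, eps=1.0, min_pts=3))  # sample 3 is not a core point
```

The second call shows that density-reachability is not symmetric: a border sample can be reached from a core point but cannot start a chain itself, which is why definition 5) is needed to describe a whole cluster.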
In this embodiment, step 4 uses the DBSCAN algorithm for data noise reduction. Samples are clustered according to their density; taking density or spatial distance as the reference, small groups of outliers can be identified in a multidimensional space and removed to achieve noise reduction. All density-connected points that are found form the sample body, and the remaining data are outliers, that is, noise points.
Step 5, fusing the denoised multi-source heterogeneous data obtained in step 4 by adopting a Kalman filtering algorithm.
The specific steps of step 5 comprise:
(1) Exchanging and fusing the denoised data with the adjacent data sequences, wherein the information matrix used is:
in the formula, the two quantities are the posterior estimate covariance matrix at time $k$ and the state estimate at time $k$.
(2) The update of time sequence $i$ is realized by the following algorithm
in the formula: $Q$ and $R$ are the covariance matrices of the system noise and the observation noise respectively; in actual fusion, each node $i$ can receive data from the nodes of its subset, and fuses its local posterior covariance with the covariance matrices sent by the adjacent nodes.
(3) Letting the total denoised data set be $N$, performing data fusion using each datum and its adjacent data, calculated as follows
in the formula: $W_{i,j}$ is the combination weight of the data.
(4) Repeating the above process until the fusion of the multi-source heterogeneous data of the multi-energy system is completed.
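For orientation, a common information-form Kalman filter with weighted neighbor fusion is written out below. It is a sketch under assumptions, not the original formulas: $F$ and $H$ denote a state transition matrix and an observation matrix, $z_k$ a measurement, $P$ a covariance matrix, $\hat{x}$ a state estimate, and $\mathcal{N}_i$ the set formed by node $i$ and its adjacent nodes; only $Q$, $R$ and $W_{i,j}$ are symbols named in the text.

```latex
% information matrix and information vector
\[
Y_{k|k} = P_{k|k}^{-1}, \qquad \hat{y}_{k|k} = P_{k|k}^{-1}\,\hat{x}_{k|k}
\]
% prediction and measurement update
\[
P_{k|k-1} = F P_{k-1|k-1} F^{\top} + Q, \qquad
Y_{k|k} = P_{k|k-1}^{-1} + H^{\top} R^{-1} H, \qquad
\hat{y}_{k|k} = P_{k|k-1}^{-1}\,\hat{x}_{k|k-1} + H^{\top} R^{-1} z_k
\]
% weighted fusion of node i with its adjacent nodes
\[
Y_i^{\mathrm{fused}} = \sum_{j \in \mathcal{N}_i} W_{i,j}\, Y_j, \qquad
\hat{y}_i^{\mathrm{fused}} = \sum_{j \in \mathcal{N}_i} W_{i,j}\, \hat{y}_j, \qquad
\hat{x}_i^{\mathrm{fused}} = \left( Y_i^{\mathrm{fused}} \right)^{-1} \hat{y}_i^{\mathrm{fused}}
\]
```

Working in information form makes the fusion step additive: each node only has to send its information matrix and information vector to its neighbors, and the combination weights $W_{i,j}$ control how much each neighbor contributes.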
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Claims (10)
1. A multi-energy system multi-source heterogeneous data fusion method, characterized by comprising the following steps:
step 1, acquiring load data and environment monitoring data of a multi-energy system, and cleaning abnormal data;
step 2, carrying out data time registration on the multi-energy system load data and the environment monitoring data cleaned in step 1;
step 3, carrying out data filling on the data registered in step 2 to obtain a more complete sensor data set;
step 4, carrying out data noise reduction on the complete data set filled in step 3;
step 5, fusing the denoised multi-source heterogeneous data obtained in step 4.
2. The multi-energy system multi-source heterogeneous data fusion method according to claim 1, characterized in that the specific steps of step 1 comprise:
(1) Acquiring load data and environment monitoring data of the multi-energy system;
the load data comprise attribute parameters of equipment and real-time operation state data of the power grid; the environment monitoring data comprise temperature, humidity and vibration data;
(2) Using the local outlier factor (LOF) algorithm, assigning each acquired data point an outlier score that depends on the density of its neighborhood, and calculating the local reachable density and the local outlier factor of each data point;
the local reachable density $\rho_i(x)$ and the local outlier factor $\mathrm{LOF}_i(x)$ of the data are calculated as follows:
in the formula, $N_i(x)$ is the $i$-th distance neighborhood of data point $x$; $\rho_i(x)$ is the local reachable density; $s_i(x, f_j)$ is the distance between the monitoring data $x$ and the synchronous monitoring data $f_j$ of day $j$;
(3) Judging whether each data point is abnormal from its local outlier factor value, thereby obtaining the available data after the abnormal data are preliminarily cleaned;
the judgment criteria are as follows:
1) $\mathrm{LOF}_i(x)$ close to 1 indicates that $x$ probably belongs to the same cluster as its neighborhood points;
2) $\mathrm{LOF}_i(x) < 1$ indicates that the density of $x$ is higher than that of its neighborhood points, so $x$ is a dense point;
3) $\mathrm{LOF}_i(x) > 1$ indicates that the density of $x$ is lower than that of its neighborhood points, so $x$ is an outlier.
3. The multi-energy system multi-source heterogeneous data fusion method according to claim 1, characterized in that the specific steps of step 2 comprise:
(1) Performing time registration on the multi-energy system load data and environment monitoring data cleaned in step 1 by the least squares method: let the sampling periods of sensors A1 and A2 be $T_1$ and $T_2$ respectively, and let the latest monitoring time of sensor A1 be $(k-1)T_1$; the current time is then $kT_1 = (k-1)T_1 + nT_2$, that is, sensor A2 takes $n$ measurements within one period of sensor A1;
(2) Let the monitoring value of sensor A1 be $y_n$ and the monitoring sequence of sensor A2 be $Y_n = (y_1, y_2, \ldots, y_n)^{T}$; the fused value $\bar{y}$ of the $n$ monitoring values and its derivative $\dot{\bar{y}}$ form a data set, and the processed data are
$$y_i = \bar{y} + (i-n)\,T\,\dot{\bar{y}} + o_i$$
in the formula: $o_i$ is the noise arising in the data measurement process; $y_n$ is the monitoring value of sensor A1; $T$ is the sampling period; $Y_n$ is the monitoring sequence of sensor A2;
(3) After time registration based on the sensor A1 data, the sensor A2 data and the measurement vector are expressed as
where $T'$ is the fusion time.
4. The multi-energy system multi-source heterogeneous data fusion method according to claim 1, characterized in that: in step 3, data filling is performed on the data registered in step 2 by a KNN algorithm to obtain a filled time-series data set, which specifically comprises the following steps:
(1) Jointly computing, from the historical data collected by the measurement-missing sensor $m_i$ and by its adjacent sensors, the spatial correlation among the measurement sensors:
selecting the k sensors most highly correlated with the measurement-missing sensor $m_i$, and defining the spatial correlation coefficient between sensor $m_i$ and a neighbor sensor $m_j$ as:
where r is the spatial correlation between the two measurement sensors; $y_{i,t-1}$ is the measurement data of sensor $m_i$ at time t-1; $y_{j,t-1}$ is the measurement data of sensor $m_j$ at time t-1;
(2) Based on the calculated spatial correlation coefficients, calculating the filling weight of each sensor relative to the measurement-missing sensor, specifically:
where $w_i$ is the weight coefficient; r is the spatial correlation between the two sensors;
(3) Based on the spatial correlation of the KNN algorithm, obtaining the measurement data of the k neighbor sensors and the corresponding weight coefficients, and filling the data $y_{i,t}$ of the measurement-missing sensor, which can be written as
by the above formula, the filling result for the missing data of sensor $m_i$ at time t is obtained.
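The filling steps above can be sketched as follows. The sketch rests on several assumptions that are not stated in the claim: the spatial correlation coefficient is taken to be the Pearson correlation of the historical series, each weight is the neighbor's correlation divided by the sum over the k selected neighbors, and the selected correlations are positive. The names `pearson` and `knn_fill` and the sensor labels are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equally long series (assumed correlation measure)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def knn_fill(target_history, neighbor_histories, neighbor_current, k):
    """Estimate the missing reading from the k most correlated neighbor sensors."""
    corr = {name: pearson(target_history, hist)
            for name, hist in neighbor_histories.items()}
    top = sorted(corr, key=corr.get, reverse=True)[:k]   # k most correlated sensors
    total = sum(corr[name] for name in top)
    weights = {name: corr[name] / total for name in top}  # weights sum to 1
    estimate = sum(weights[name] * neighbor_current[name] for name in top)
    return estimate, weights


history_missing = [1.0, 2.0, 3.0, 4.0]
histories = {"a": [1.0, 2.0, 3.0, 4.0],
             "b": [1.1, 2.1, 2.9, 4.2],
             "c": [4.0, 3.0, 2.0, 1.0]}
current = {"a": 5.0, "b": 5.2, "c": 0.0}
filled, used_weights = knn_fill(history_missing, histories, current, k=2)
```

Sensor "c" is negatively correlated with the target and is therefore left out when k = 2, so the filled value is a weighted average of the current readings of "a" and "b".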
5. The multi-energy system multi-source heterogeneous data fusion method according to claim 1, characterized in that: the specific method of the step 4 comprises the following steps:
carrying out data noise reduction on the complete data set filled in the step 3 by adopting a DBSCAN algorithm;
the method comprises the following specific steps:
(1) Inputting the filled complete data set $B = \{y_1, y_2, \ldots, y_n\}$, where $\varepsilon$ is the radius parameter and MinPts is the minimum object count parameter, and marking all objects in the data set B as unread;
(2) Taking from the data set B a set $B_i$ containing an arbitrary number of data objects p, where $B_i \in B$, i = 1, 2, 3, …, and marking $B_i$ as read;
(3) Judging p by the $\varepsilon$ and MinPts parameters: if p is a core object, finding all data objects density-reachable from p and marking them as read; if p is not a core object and p is not density-reachable from any object, marking p as noise data;
(5) Taking a core object as a seed and grouping all points density-reachable from it into one class, forming a larger set of data objects, also called a cluster;
(6) Repeating step (5) until all core objects have been traversed, completing the noise reduction of the filled complete data set.
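The noise reduction procedure can be sketched with a compact DBSCAN on one-dimensional values. This is an illustration under stated assumptions: distance is the absolute difference, MinPts counts the point itself, and the names `dbscan_1d` and `denoise` are hypothetical. Points that end with label -1 are treated as noise and dropped.

```python
def dbscan_1d(values, eps, min_pts):
    """Return one label per value: a cluster id (0, 1, ...) or -1 for noise."""
    n = len(values)
    unread, noise = None, -1
    labels = [unread] * n

    def region(i):
        # All points within eps of point i, including i itself
        return [j for j in range(n) if abs(values[i] - values[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not unread:
            continue
        neighbors = region(i)
        if len(neighbors) < min_pts:
            labels[i] = noise          # provisional; may later become a border point
            continue
        cluster += 1                   # i is a core object: start a new cluster
        labels[i] = cluster
        seeds = list(neighbors)
        while seeds:
            j = seeds.pop()
            if labels[j] == noise:
                labels[j] = cluster    # density-reachable border point
            if labels[j] is not unread:
                continue
            labels[j] = cluster
            reachable = region(j)
            if len(reachable) >= min_pts:
                seeds.extend(reachable)  # j is also a core object: keep expanding
    return labels


def denoise(values, eps, min_pts):
    """Keep only the values that belong to some cluster."""
    labels = dbscan_1d(values, eps, min_pts)
    return [v for v, label in zip(values, labels) if label != -1]


data = [1.0, 1.1, 1.2, 5.0, 5.1, 5.2, 20.0]
labels = dbscan_1d(data, eps=0.25, min_pts=3)
clean = denoise(data, eps=0.25, min_pts=3)
```

With these parameters the two groups near 1 and 5 form separate clusters and the isolated value 20.0 is removed as noise.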
6. The multi-energy system multi-source heterogeneous data fusion method according to claim 1, characterized in that: the specific steps of the step 5 comprise:
(1) Performing exchange fusion between the noise-reduced data and the adjacent data sequences by a Kalman filtering algorithm, wherein the information matrix adopted is as follows:
in the formula, the quantities are the posterior estimate covariance matrix at time k and the state estimate at time k;
(2) The update of the time sequence i is realized by the following algorithm
where Q and R are the covariance matrices of the system noise and the observation noise respectively; in actual fusion, each node i can receive data from the nodes of its subset, generate its local posterior covariance, and send the covariance matrix to adjacent nodes for data fusion;
(3) Letting the total noise-reduced data set be N, performing data fusion using each datum and its adjacent data, calculated as follows
where $W_{i,j}$ is the combining weight of the data;
(4) Repeating the above process until the multi-source heterogeneous data fusion of the multi-energy system is completed.
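The fusion steps can be sketched in scalar form. The sketch assumes a random-walk state model for the local Kalman update and a weighted information-form combination of a node's estimate with those of its neighbors; both are common textbook forms chosen for illustration, not formulas taken from the claim. The names `kalman_step` and `fuse_information` are hypothetical, while `q`, `r` and `weights` stand in for Q, R and $W_{i,j}$.

```python
def kalman_step(x, p, z, q, r):
    """One scalar predict/update cycle, assuming the state follows a random walk.

    x, p: previous state estimate and its variance
    z:    new measurement
    q, r: system noise and observation noise variances
    """
    p_pred = p + q                  # predict: state unchanged, uncertainty grows by q
    gain = p_pred / (p_pred + r)    # Kalman gain
    x_new = x + gain * (z - x)      # correct the estimate with the innovation
    p_new = (1.0 - gain) * p_pred   # posterior variance
    return x_new, p_new


def fuse_information(estimates, weights):
    """Combine (state, variance) pairs from a node and its neighbors.

    Each estimate contributes its information (inverse variance) scaled by its weight.
    """
    info_matrix = sum(w / p for (x, p), w in zip(estimates, weights))
    info_vector = sum(w * x / p for (x, p), w in zip(estimates, weights))
    fused_state = info_vector / info_matrix
    fused_variance = 1.0 / info_matrix
    return fused_state, fused_variance


local = kalman_step(x=0.0, p=1.0, z=2.0, q=0.0, r=1.0)
fused = fuse_information([local, (3.0, 0.5)], weights=[0.5, 0.5])
```

Here the local update moves the estimate halfway toward the measurement, and the fusion step averages two equally confident estimates with equal weights.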
7. A multi-energy system multi-source heterogeneous data fusion system, characterized in that it comprises:
the acquisition module is used for acquiring load data and environment monitoring data of the multi-energy system and cleaning abnormal data;
the registration module is used for carrying out data time registration on the cleaned multi-energy system load data and the cleaned environmental monitoring data;
the filling module is used for filling data of the registered data to obtain a more complete sensor data set;
the noise reduction module is used for carrying out data noise reduction on the filled complete data set;
and the fusion module is used for fusing the multi-source heterogeneous data subjected to noise reduction.
8. The multi-energy system multi-source heterogeneous data fusion system according to claim 7, wherein: the acquisition module further comprises:
the data acquisition module is used for acquiring load data and environmental monitoring data of the multi-energy system;
the load data comprises attribute parameters of equipment and real-time operation state data of a power grid; the environment monitoring data comprises temperature, humidity and vibration data;
the abnormal data cleaning module adopts a local anomaly factor algorithm, which assigns to each acquired datum an outlier degree that depends on the density of its neighborhood, and calculates the local reachable density and the local anomaly factor of each datum;
the local reachable density $\rho_i(x)$ and the local anomaly factor $LOF_i(x)$ of the data are calculated as follows:
where $N_i(x)$ is the i-th distance neighborhood of data point x; $\rho_i(x)$ is the local reachable density; $s_i(x, f_j)$ is the distance between the monitoring data x and the monitoring data $f_j$ at the same time on day j;
the available data acquisition module judges whether each data point is abnormal according to its local anomaly factor value, and obtains the available data after preliminary cleaning of abnormal data;
the judgment criteria are as follows:
1) $LOF_i(x)$ close to 1 indicates that x is likely in the same cluster as its neighborhood points;
2) $LOF_i(x) < 1$ indicates that the density of x is higher than that of its neighborhood points, and x is a dense point;
3) $LOF_i(x) > 1$ indicates that the density of x is lower than that of its neighborhood points, and x is an outlier.
9. A computer-readable storage medium having stored thereon a computer program, characterized in that: the program when executed by a processor implementing the steps of the method of any one of claims 1 to 6.
10. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein: the processor when executing the program realizes the steps of the method of any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211474597.4A CN115718906A (en) | 2022-11-23 | 2022-11-23 | Multi-energy system multi-source heterogeneous data fusion method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115718906A (en) | 2023-02-28 |
Family
ID=85256078
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211474597.4A Pending CN115718906A (en) | 2022-11-23 | 2022-11-23 | Multi-energy system multi-source heterogeneous data fusion method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115718906A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116662326A (en) * | 2023-07-26 | 2023-08-29 | 江西省检验检测认证总院计量科学研究院 | Multi-energy variety data cleaning and collecting method |
CN116662326B (en) * | 2023-07-26 | 2023-10-20 | 江西省检验检测认证总院计量科学研究院 | Multi-energy variety data cleaning and collecting method |
CN117942079A (en) * | 2024-03-27 | 2024-04-30 | 山东大学 | Emotion intelligence classification method and system based on multidimensional sensing and fusion |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||