CN112699601B - Space-time reconstruction method for sensor network data - Google Patents
Space-time reconstruction method for sensor network data
- Publication number
- CN112699601B (application CN202011576661.0A)
- Authority
- CN
- China
- Prior art keywords
- data
- time
- space
- sensor
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06F30/27—Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
- G06F16/284—Relational databases
- G06F16/285—Clustering or classification
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- Y02D30/70—Reducing energy consumption in communication networks in wireless communication networks
Abstract
The invention discloses a space-time reconstruction method for sensor network data, comprising the following steps: S1, collecting space-time data from the sensors; S2, preprocessing the acquired space-time data; S3, determining the length of the training data; S4, constructing a training data set and a test data set; S5, establishing a space-time data model; S6, training the space-time data model; and S7, performing space-time data reconstruction with the trained model. The method can reconstruct data at positions without sensors and at unsampled times from sensor data collected at a limited number of measurement positions and times; by estimating the temporal decorrelation length of each sensor's data and using it to set the time length of the training data, the computation required for model training and testing is reduced.
Description
Technical Field
The invention relates to a data processing method of a sensor network, in particular to a space-time reconstruction method of sensor network data.
Background
The rapid popularization and application of sensor network technologies, such as the Internet of Things (IoT), optical sensor networks, and radio/radar sensor networks, provide new capabilities and opportunities for the large-scale collection, analysis, and utilization of spatio-temporal sensing big data, making social life, industrial manufacturing, security, and other fields more intelligent, safe, and efficient.
According to the motion characteristics of the sensor platform, sensor data acquisition modes can be roughly divided into two types. In the first, the sensors are fixed, installed in the infrastructure of the sensing area, so that the coverage of each sensor node is determined in advance. In the second, the sensors are mobile, carried on a moving platform or a person; the coverage of each sensor node changes dynamically, and the platform or carrier must travel through the sensing area to acquire measurements at all positions. However, limited by the number of sensor nodes, the data transmission rate, energy consumption, and other factors, the collected sensor data are in either case local and discontinuous in time and space: the sensor network can only sense a portion of the locations in the sensing region at a series of discrete time instants. Both sensing modes therefore inherently carry a missed-detection or false-alarm risk for events occurring anywhere in time and space. This poses a serious safety hazard for sensor network applications aimed at disaster, safety, and hazard monitoring, such as mine production, structural health monitoring of buildings, forest fire or geological disaster early warning, environmental pollution monitoring, and intrusion detection.
Given the above limitations of sensor networks, a data processing method that can recover the state of the sensed object at any time and position using only the limited space-time sensing data collected would have very important application value.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a space-time reconstruction method for sensor network data that can reconstruct data at positions without sensors and at unsampled times from sensor data collected at a limited number of measurement positions and times.
The purpose of the invention is realized by the following technical scheme: a space-time reconstruction method of sensor network data comprises the following steps:
S1, collecting space-time data from the sensors;
S2, preprocessing the acquired space-time data;
S3, determining the length of the training data;
S4, constructing a training data set and a test data set;
S5, establishing a space-time data model;
S6, training the space-time data model;
and S7, performing space-time data reconstruction with the trained model.
Further, the specific implementation of step S1 is as follows: sensor $s_i$ collects time-series data represented as $d_i=[d_i(1),d_i(2),\ldots,d_i(K_i)]^T$, where $d_i(k)$, $k=1,2,\ldots,K_i$, denotes the data collected by sensor $s_i$ at time $k$, $K_i$ is the data length of $s_i$, and $T$ is the transpose operator;
the set of all sensor data collected by the sensor network is denoted $D^{(0)}=\{d_1^{(0)},d_2^{(0)},\ldots,d_N^{(0)}\}$, where the superscript $(0)$ denotes raw acquired data, $d_i^{(0)}$ denotes the time-series data acquired by sensor $s_i$, $i=1,\ldots,N$, and $N$ is the number of sensors.
Further, the specific implementation method of step S2 is as follows:
S21, removing outliers from the sensor data: an outlier of each sensor is defined as a data value deviating by more than n times the standard deviation of that sensor's data, with n in the range 2-10; each outlier is replaced with the value of the nearest normal data point;
S22, filling missing data: the mean of the time-series segment immediately preceding the gap, of the same length as the missing segment, is taken as the filling value of the missing data segment;
S23, data aggregation: the data of each sensor are aggregated into time-series data with time interval $\Delta T$, keeping the start time of each aggregation period as the timestamp of the aggregated data point; the aggregation operation takes the mean, the median, or a specified quantile of the data within the aggregation period;
S24, truncating all sensor data to start from the same timestamp and end at the same timestamp;
after data preprocessing, the time-series data of all sensors are of equal length, and the timestamps and time intervals of the data points are identical; the data set is denoted $D^{(1)}=\{d_1^{(1)},d_2^{(1)},\ldots,d_N^{(1)}\}$, where the superscript $(1)$ denotes preprocessed data.
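As an illustration only, the preprocessing steps S21-S23 might be sketched as follows (function and parameter names are hypothetical; NaN stands for missing samples, and "nearest normal point" is read here as the nearest preceding normal value):

```python
import numpy as np

def preprocess(series, n_sigma=5, agg=3):
    """Sketch of S21-S23: outlier removal, gap filling, aggregation."""
    d = np.asarray(series, dtype=float)  # NaN marks missing samples
    # S21: flag values deviating from the mean by more than n_sigma std devs
    mu, sd = np.nanmean(d), np.nanstd(d)
    outlier = np.abs(d - mu) > n_sigma * sd
    # replace each outlier with the nearest preceding normal value
    for k in np.flatnonzero(outlier):
        prev = np.flatnonzero(~outlier[:k] & ~np.isnan(d[:k]))
        if prev.size:
            d[k] = d[prev[-1]]
    # S22: fill each missing segment with the mean of the equally long
    # segment immediately before it
    miss = np.isnan(d)
    k = 0
    while k < d.size:
        if miss[k]:
            j = k
            while j < d.size and miss[j]:
                j += 1
            seg = j - k
            d[k:j] = np.mean(d[max(0, k - seg):k]) if k else np.nanmean(d)
            k = j
        else:
            k += 1
    # S23: aggregate into time bins of length `agg`; the timestamp of each
    # aggregated point is the bin start (here implicitly the bin index)
    m = d.size // agg
    return d[:m * agg].reshape(m, agg).mean(axis=1)
```

After this step all series can be truncated to a common timestamp range (S24) so that every sensor contributes an equal-length series.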
Further, the specific implementation method of step S3 is as follows:
S31, calculating the temporal correlation sequence of each sensor:

$$r_i(\tau)=\frac{E\big[(d_i^{(1)}(k)-\bar d_i^{(1)})(d_i^{(1)}(k+\tau)-\bar d_{i,\tau}^{(1)})\big]}{\sqrt{E\big[(d_i^{(1)}(k)-\bar d_i^{(1)})^2\big]\,E\big[(d_i^{(1)}(k+\tau)-\bar d_{i,\tau}^{(1)})^2\big]}}\qquad(1)$$

where $E[\cdot]$ denotes the expectation operator; $\tau$ is the time-delay length, $\tau=0,1,\ldots,K_\tau$, and $K_\tau$ is the maximum delay; $\bar d_i^{(1)}$ and $\bar d_{i,\tau}^{(1)}$ are the means of the time series $d_i^{(1)}$ and of its copy delayed by $\tau$ time units, respectively;
S32, estimating the decorrelation length $\tau_{c,i}$ of each sensor $s_i$'s time-series data; there are two cases:
Case 1: if the correlation sequence $r_i(\tau)$ is aperiodic, the maximum time-delay length satisfying $r_i(\tau)\le 0.05$ is taken as the decorrelation length $\tau_{c,i}$;
Case 2: if the correlation sequence $r_i(\tau)$ is periodic, the interval between successive peaks (the period) is taken as the decorrelation length $\tau_{c,i}$;
S33, determining the maximum decorrelation length of the sensor data as:

$$\tau_{c,\max}=\max_{i=1,\ldots,N}\tau_{c,i}\qquad(2)$$

S34, taking the length L of the training data as $L=n\cdot\tau_{c,\max}$, where n ranges from 1.5 to 10.
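The decorrelation-length estimate of steps S31-S32 can be sketched as follows for the aperiodic case (a hypothetical helper; the Case 1 threshold $r_i(\tau)\le 0.05$ is read here as the first lag at which the correlation has decayed to that level):

```python
import numpy as np

def decorrelation_length(d, max_lag=None):
    """Normalized autocorrelation r(tau) per S31, and the first lag with
    r(tau) <= 0.05 as the decorrelation length (one reading of S32 Case 1)."""
    d = np.asarray(d, dtype=float)
    K = d.size
    if max_lag is None:
        max_lag = K // 2
    r = np.empty(max_lag + 1)
    for tau in range(max_lag + 1):
        a, b = d[:K - tau], d[tau:]      # series and its tau-delayed copy
        a = a - a.mean()
        b = b - b.mean()
        r[tau] = (a * b).mean() / np.sqrt((a ** 2).mean() * (b ** 2).mean())
    below = np.flatnonzero(r <= 0.05)
    tau_c = int(below[0]) if below.size else max_lag
    return r, tau_c
```

The training length then follows S34 as `L = n * max(tau_c over all sensors)`.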
Further, the specific implementation method of step S4 is as follows:
S41, constructing the training data set: the datum of sensor $s_i$ at time $k$, $d_i^{(1)}(k)$, corresponds to the space-time position $(x_i,y_i;k)$; the length-$K$ time series of the $N$ sensors contain $NK$ data points in total;
the sensor data are normalized as follows:

$$d_i^{(2)}(k)=\frac{d_i^{(1)}(k)-\bar d_i^{(1)}}{\sigma_i^{(1)}}\qquad(3)$$

where $\bar d_i^{(1)}$ is the mean of the data sequence $d_i^{(1)}$, $\sigma_i^{(1)}$ is its standard deviation, and the superscript $(2)$ denotes normalized data;
the training data set $\Omega$ is constructed from the normalized sensor data:

$$\Omega=\big\{\big((x_i,y_i;k),\,d_i^{(2)}(k)\big)\,\big|\,i=1,\ldots,N;\;k=1,\ldots,K\big\}\qquad(4)$$

where $d_i^{(2)}(k)$ denotes the normalized datum acquired at time $k$ by sensor $s_i$ located at spatial coordinates $(x_i,y_i)$; the training data set $\Omega$ contains $NK$ data points;
S42, constructing the test data set: the test data set is the collection of data points to be reconstructed, whose positions $(x^*,y^*,k^*)$ lie on the space-time grid $(s_g;t_g)$, i.e. $(x^*,y^*)\in s_g$ and $k^*\in t_g$, where $s_g$ is the spatial grid and $t_g$ is the time grid;
the test data set $\Omega^*$ is defined as:

$$\Omega^*=\big\{\big((x^*_{m_x},y^*_{m_y};k^*),\,d^*(x^*_{m_x},y^*_{m_y};k^*)\big)\,\big|\,m_x=1,\ldots,M_x;\;m_y=1,\ldots,M_y;\;k^*=1,\ldots,M_t\big\}\qquad(5)$$

where $(x^*_{m_x},y^*_{m_y})$ is the spatial position of the data to be reconstructed, $k^*$ is the temporal position, and $d^*(\cdot)$ is the datum at the point to be reconstructed; $M_x$, $M_y$, $M_t$ are the maximum indices of the test data on the x, y, and time axes, so the total number of test data points is $M_xM_yM_t$; when $1\le k^*\le K$, the reconstructed data are regarded as a temporal interpolation of the training data; when $k^*>K$, they are regarded as a prediction from the training data;
to reduce computational complexity, the sensor data of L time sections are selected from $\Omega$ as training data in each training pass, giving NL training data points; the spatial grids of $L^*$ time sections are selected from $\Omega^*$ as the test points to be reconstructed, giving $M_xM_yL^*$ test data points; reconstructing all $M_xM_yM_t$ test points therefore requires $\lceil M_t/L^*\rceil$ passes, where $\lceil\cdot\rceil$ denotes the ceiling operation and $L^*$ is the reconstructed data length.
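A minimal sketch of the training-set construction of S41, assuming the preprocessed data arrive as an N × K array (function and variable names are illustrative only):

```python
import numpy as np

def build_training_set(positions, data):
    """Sketch of S41: z-score each sensor's series, then flatten into
    (x, y, k) -> value training pairs."""
    positions = np.asarray(positions, float)   # shape (N, 2): sensor (x, y)
    data = np.asarray(data, float)             # shape (N, K): series per sensor
    mu = data.mean(axis=1, keepdims=True)
    sd = data.std(axis=1, keepdims=True)
    z = (data - mu) / sd                       # normalized data d^(2)
    N, K = data.shape
    V = np.column_stack([
        np.repeat(positions[:, 0], K),         # x_i
        np.repeat(positions[:, 1], K),         # y_i
        np.tile(np.arange(1, K + 1), N),       # time index k
    ])
    return V, z.ravel(), mu.ravel(), sd.ravel()
```

The returned means and standard deviations allow reconstructed values to be mapped back to physical units after prediction.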
Further, the specific implementation method of step S5 is as follows:
S51, constructing the space-time covariance matrix of the data: the space-time kernel function $\ker(v_p,v_q)$ encodes the space-time correlation of the data, where $v_p=(x_p,y_p,k_p)$ and $v_q=(x_q,y_q,k_q)$ denote two different space-time positions $p$ and $q$; since the space-time data are three-dimensional in the x, y, and k dimensions, a product tensor kernel is used to represent $\ker(v_p,v_q)$, namely:

$$\ker(v_p,v_q)=\ker_x(x_p,x_q)\,\ker_y(y_p,y_q)\,\ker_k(k_p,k_q)\qquad(6)$$

where $\ker_x(\cdot,\cdot)$, $\ker_y(\cdot,\cdot)$, and $\ker_k(\cdot,\cdot)$ are kernel functions in the x, y, and k dimensions, respectively;
the space-time covariance matrices of the training and test data are constructed with the space-time kernel $\ker(v_p,v_q)$:

$$\begin{bmatrix}\Sigma(V,V) & \Sigma(V,V^*)\\ \Sigma(V^*,V) & \Sigma(V^*,V^*)\end{bmatrix}\qquad(7)$$

where $V$ denotes the positions of the NL training data points, so $\Sigma(V,V)$ is the $NL\times NL$ training covariance matrix; $V^*$ denotes the positions of the $M_xM_yL^*$ test points, so $\Sigma(V^*,V^*)$ is the $M_xM_yL^*\times M_xM_yL^*$ test covariance matrix; $\Sigma(V,V^*)$ and $\Sigma(V^*,V)$ are the $NL\times M_xM_yL^*$ and $M_xM_yL^*\times NL$ cross-covariance matrices;
S52, establishing the space-time data model with a Gaussian process: under the Gaussian process model, the training data and the test data obey a joint Gaussian distribution, namely:

$$\begin{bmatrix} d_{NL\times 1}\\ d^*\end{bmatrix}\sim N\!\left(\mathbf 0,\begin{bmatrix}\Sigma(V,V) & \Sigma(V,V^*)\\ \Sigma(V^*,V) & \Sigma(V^*,V^*)\end{bmatrix}\right)\qquad(8)$$

where $d_{NL\times1}=\mathrm{Vec}\big([d_1^{(2)},d_2^{(2)},\ldots,d_N^{(2)}]\big)$ is the training data vector, $\mathrm{Vec}(\cdot)$ is the vectorization operator, $d_i^{(2)}$ is the vector of the L training data points of sensor $s_i$, and $d^*$ is the test data vector at the positions $V^*$.
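To make the product tensor kernel of formula (6) concrete, here is a small sketch using squared-exponential factors in each dimension (the kernel choice and default hyperparameters are illustrative, not mandated by the method):

```python
import numpy as np

def ker1d(a, b, s2=1.0, ell=1.0):
    """One-dimensional squared-exponential kernel, an allowed choice for
    ker_x, ker_y, or ker_k (hyperparameters s2, ell are illustrative)."""
    return s2 * np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * ell ** 2))

def spacetime_cov(Vp, Vq):
    """Formula (6): product tensor kernel ker = ker_x * ker_y * ker_k between
    two sets of space-time positions, each row being (x, y, k)."""
    return (ker1d(Vp[:, 0], Vq[:, 0])
            * ker1d(Vp[:, 1], Vq[:, 1])
            * ker1d(Vp[:, 2], Vq[:, 2]))
```

Calling `spacetime_cov` on the training and test position sets yields the $\Sigma(V,V)$, $\Sigma(V,V^*)$, and $\Sigma(V^*,V^*)$ blocks elementwise.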
Further, the specific implementation of step S6 is as follows: the hyperparameters in the kernel function are computed; from the Gaussian prior, the training data $d_{NL\times1}$ are Gaussian distributed, i.e. $d_{NL\times1}\mid V\sim N(\mathbf 0,\Sigma(V,V))$; the log marginal likelihood of $d_{NL\times1}$ is:

$$\log p(d_{NL\times1}\mid V,\theta)=-\frac12\,d_{NL\times1}^T\,[\Sigma(V,V)]^{-1}\,d_{NL\times1}-\frac12\log\big|\Sigma(V,V)\big|-\frac{NL}{2}\log 2\pi\qquad(9)$$

where $[\cdot]^{-1}$ denotes matrix inversion;
the hyperparameters are solved by a numerical method as those maximizing the log marginal likelihood:

$$\hat\theta=\arg\max_\theta\,\log p(d_{NL\times1}\mid V,\theta)\qquad(10)$$
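A sketch of this training step: the log marginal likelihood above evaluated via a Cholesky factorization, with a simple grid search standing in for the unspecified numerical maximization (the kernel signature and grid are hypothetical):

```python
import numpy as np

def log_marginal_likelihood(V, d, kernel, theta, jitter=1e-8):
    """Log p(d | V, theta) under the zero-mean GP prior of step S6."""
    n = d.size
    S = kernel(V, V, theta) + jitter * np.eye(n)   # Sigma(V, V)
    L = np.linalg.cholesky(S)                      # stable solve and log-det
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, d))
    return (-0.5 * d @ alpha
            - np.log(np.diag(L)).sum()             # = -0.5 * log|Sigma|
            - 0.5 * n * np.log(2 * np.pi))

def fit_hyperparams(V, d, kernel, grid):
    """Pick theta maximizing the log marginal likelihood; a grid search
    stands in here for the patent's unspecified numerical method."""
    return max(grid, key=lambda th: log_marginal_likelihood(V, d, kernel, th))
```

In practice a gradient-based optimizer over θ would replace the grid, but the objective is the same.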
Further, the specific implementation of step S7 is as follows: according to the space-time data model, the space-time data reconstructed or predicted at the test-point positions are:

$$\hat d^*=\Sigma(V^*,V)\,[\Sigma(V,V)]^{-1}\,d_{NL\times1}\qquad(11)$$

The covariance of the reconstructed data is:

$$\mathrm{Cov}(d^*)=\Sigma(V^*,V^*)-\Sigma(V^*,V)\,[\Sigma(V,V)]^{-1}\,\Sigma(V,V^*)\qquad(12)$$
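The reconstruction mean of formula (11) and the covariance above are the standard Gaussian-process posterior; a numerically stable sketch (matrix names follow the $\Sigma(\cdot,\cdot)$ blocks, a small jitter is an assumption for conditioning):

```python
import numpy as np

def gp_reconstruct(K_vv, K_sv, K_ss, d, jitter=1e-8):
    """Posterior mean (formula 11) and covariance at the test points.
    K_vv = Sigma(V,V), K_sv = Sigma(V*,V), K_ss = Sigma(V*,V*)."""
    n = K_vv.shape[0]
    L = np.linalg.cholesky(K_vv + jitter * np.eye(n))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, d))
    mean = K_sv @ alpha                        # Sigma(V*,V) Sigma(V,V)^-1 d
    W = np.linalg.solve(L, K_sv.T)
    cov = K_ss - W.T @ W                       # posterior covariance
    return mean, cov
```

At a test point coinciding with a training point, the noiseless posterior reproduces the training datum with near-zero variance, which is a quick sanity check for the implementation.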
the invention has the beneficial effects that:
1. The space-time reconstruction method can reconstruct data at positions without sensors and at unsampled times from sensor data collected at a limited number of measurement positions and times;
2. depending on the data samples selected from the training data set, the reconstruction can act as interpolation of the training data or as extrapolation (prediction) beyond it;
3. the space-time data model exploits the space-time correlation between training and test data and has good interpretability;
4. with the product tensor kernel model, a high-dimensional space-time kernel function can be computed from low-dimensional kernel functions;
5. by estimating the temporal decorrelation length of each sensor's data and using it to set the time length of the training data, the computation required for model training and testing is reduced.
Drawings
Fig. 1 is a flowchart of a space-time reconstruction method of sensor network data according to the present invention;
FIG. 2 is a diagram of a sensor arrangement for Internet of things in a room;
FIG. 3 is a schematic diagram of a sensor network data structure according to the present invention;
Fig. 4 shows the spatial panel data collected by the sensor network at time t = 5 h in this embodiment;
Fig. 5 shows the spatial panel data reconstructed by the present method at time t = 5 h in this embodiment;
Fig. 6 shows comparison curves between the temperature values predicted by the method and the actual values at the positions of sensors No. 1, 4, 5, and 12, for t = 6-15 h and t = 15-20 h in this embodiment;
Fig. 7 shows the training data slices for t = 0-15 h and the space-time field data slices reconstructed at the test points for t = 2 h and t = 7.5 h in this embodiment;
Fig. 8 shows the training data slices for t = 0-15 h and the space-time field data slices predicted at the test points for t = 16 h and t = 18.5 h in this embodiment.
Detailed Description
The technical scheme of the invention is further explained by combining the attached drawings.
Assume the sensor network is deployed in a two-dimensional space described by a Cartesian coordinate system O-xy, containing N sensor nodes $s_i$, $i=1,\ldots,N$, with node $i$ at position coordinates $(x_i,y_i)$. Each sensor node may contain several sensors of different types; to simplify the description of the proposed method, each node is assumed to contain only one sensor of the same type.
The space and time to be sensed by the network are divided by a grid method, with the grid-cell size determined by the specific application requirements. Specifically, the spatial grid $s_g$ has $M_x\times M_y$ cells of size $\Delta_x\times\Delta_y$, where $\Delta_x$ and $\Delta_y$ are the cell lengths in the x and y dimensions. Typically, the number of spatial grid cells satisfies $M_xM_y\gg N$. The time grid $t_g$ has $M_t$ cells of size $\Delta_t$. The space-time grid $(s_g;t_g)$ thus specifies the space-time positions of the data points to be reconstructed: each space-time grid cell represents one space-time data point to be reconstructed, for a total of $M_xM_yM_t$ reconstructed data points.
As shown in fig. 1, a method for space-time reconstruction of sensor network data according to the present invention includes the following steps:
S1, collecting space-time data from the sensors;
The specific implementation is as follows: sensor $s_i$ collects time-series data $d_i=[d_i(1),d_i(2),\ldots,d_i(K_i)]^T$, where $d_i(k)$, $k=1,2,\ldots,K_i$, denotes the data collected by sensor $s_i$ at time $k$, $K_i$ is the data length of $s_i$, and $T$ is the transpose operator;
the set of all sensor data collected by the sensor network is denoted $D^{(0)}=\{d_1^{(0)},\ldots,d_N^{(0)}\}$, where the superscript $(0)$ denotes raw acquired data, $d_i^{(0)}$ denotes the time series acquired by sensor $s_i$, $i=1,\ldots,N$, and $N$ is the number of sensors. $D^{(0)}$ also carries the position coordinates and timestamps of the sensor network.
In this embodiment, the working scene is an indoor Internet-of-Things deployment with overall dimensions 40 m × 32 m, in which 54 sensors are arranged as shown in Fig. 2. The space-time grid sizes are $\Delta_x=2$ m, $\Delta_y=2$ m, and time-grid size $\Delta T=1$ h; the space-time position of sensor $s_i$'s datum is $(x_i,y_i;k)$, $i=1,2,\ldots,54$, $k=1,2,\ldots,15$; the data structure is shown in Fig. 3.
S2, preprocessing the acquired space-time data;
the specific implementation method comprises the following steps:
S21, removing outliers from the sensor data: an outlier of each sensor is defined as a data value deviating by more than n times the standard deviation of that sensor's data, with n in the range 2-10; in this embodiment n = 5; each outlier is replaced with the value of the nearest normal data point;
S22, filling missing data: the mean of the time-series segment immediately preceding the gap, of the same length as the missing segment, is taken as the filling value of the missing data segment;
S23, data aggregation: the data of each sensor are aggregated into time-series data with time interval $\Delta T$, keeping the start time of each aggregation period as the timestamp of the aggregated data point; the aggregation operation takes the mean, the median, or a specified quantile of the data within the aggregation period;
S24, truncating all sensor data to start from the same timestamp and end at the same timestamp;
after data preprocessing, the time-series data of all sensors are of equal length, and the timestamps and time intervals of the data points are identical; the data set is denoted $D^{(1)}$, where the superscript $(1)$ denotes preprocessed data, and the time-series length of each sensor is K.
S3, determining the length of training data;
the specific implementation method comprises the following steps:
S31, calculating the temporal correlation sequence of each sensor:

$$r_i(\tau)=\frac{E\big[(d_i^{(1)}(k)-\bar d_i^{(1)})(d_i^{(1)}(k+\tau)-\bar d_{i,\tau}^{(1)})\big]}{\sqrt{E\big[(d_i^{(1)}(k)-\bar d_i^{(1)})^2\big]\,E\big[(d_i^{(1)}(k+\tau)-\bar d_{i,\tau}^{(1)})^2\big]}}\qquad(1)$$

where $E[\cdot]$ denotes the expectation operator; $\tau$ is the time-delay length, $\tau=0,1,\ldots,K_\tau$, and $K_\tau$ is the maximum delay; $\bar d_i^{(1)}$ and $\bar d_{i,\tau}^{(1)}$ are the means of the time series $d_i^{(1)}$ and of its copy delayed by $\tau$ time units, respectively;
S32, estimating the decorrelation length $\tau_{c,i}$ of each sensor $s_i$'s time-series data; there are two cases:
Case 1: if the correlation sequence $r_i(\tau)$ is aperiodic, the maximum time-delay length satisfying $r_i(\tau)\le 0.05$ is taken as the decorrelation length $\tau_{c,i}$;
Case 2: if the correlation sequence $r_i(\tau)$ is periodic, the interval between successive peaks (the period) is taken as the decorrelation length $\tau_{c,i}$;
S33, determining the maximum decorrelation length of the sensor data as:

$$\tau_{c,\max}=\max_{i=1,\ldots,N}\tau_{c,i}\qquad(2)$$

In this embodiment, calculation shows that the correlation sequences $r_i(\tau)$ are aperiodic, so the decorrelation lengths $\tau_{c,i}$ are found by Case 1, and the maximum decorrelation length obtained from formula (2) is $\tau_{c,\max}=1$;
S34, taking the length L of the training data as $L=n\cdot\tau_{c,\max}$, where n ranges from 1.5 to 10; in this embodiment n = 10, so L = 10.
S4, constructing a training data set and a testing data set;
the specific implementation method comprises the following steps:
S41, constructing the training data set: the datum of sensor $s_i$ at time $k$, $d_i^{(1)}(k)$, corresponds to the space-time position $(x_i,y_i;k)$; the length-15 time series of the 54 sensors contain 810 data points in total;
the sensor data are normalized as follows:

$$d_i^{(2)}(k)=\frac{d_i^{(1)}(k)-\bar d_i^{(1)}}{\sigma_i^{(1)}}\qquad(3)$$

where $\bar d_i^{(1)}$ is the mean of the data sequence $d_i^{(1)}$, $\sigma_i^{(1)}$ is its standard deviation, and the superscript $(2)$ denotes normalized data;
the training data set $\Omega$ is constructed from the normalized sensor data:

$$\Omega=\big\{\big((x_i,y_i;k),\,d_i^{(2)}(k)\big)\,\big|\,i=1,\ldots,N;\;k=1,\ldots,K\big\}\qquad(4)$$

where $d_i^{(2)}(k)$ denotes the normalized datum acquired at time $k$ by sensor $s_i$ located at spatial coordinates $(x_i,y_i)$; the training data set $\Omega$ contains $NK$ data points;
S42, constructing the test data set: the test data set is the collection of data points to be reconstructed, whose positions $(x^*,y^*,k^*)$ lie on the space-time grid $(s_g;t_g)$, i.e. $(x^*,y^*)\in s_g$ and $k^*\in t_g$, where $s_g$ is the spatial grid and $t_g$ is the time grid;
the test data set $\Omega^*$ is defined as:

$$\Omega^*=\big\{\big((x^*_{m_x},y^*_{m_y};k^*),\,d^*(x^*_{m_x},y^*_{m_y};k^*)\big)\,\big|\,m_x=1,\ldots,M_x;\;m_y=1,\ldots,M_y;\;k^*=1,\ldots,M_t\big\}\qquad(5)$$

where $(x^*_{m_x},y^*_{m_y})$ is the spatial position of the data to be reconstructed, $k^*$ is the temporal position, and $d^*(\cdot)$ is the datum at the point to be reconstructed; $M_x$, $M_y$, $M_t$ are the maximum indices of the test data on the x, y, and time axes, so the total number of test data points is $M_xM_yM_t$; when $1\le k^*\le K$, the reconstructed data are regarded as a temporal interpolation of the training data, with $m_x=1,\ldots,M_x$, $m_y=1,\ldots,M_y$, $k^*=1,2,\ldots,15$; when $k^*>K$, they are regarded as a prediction from the training data, with $k^*=16,17,\ldots,20$;
to reduce computational complexity, the sensor data of L time sections are selected from $\Omega$ as training data in each training pass, giving NL training data points; the spatial grids of $L^*$ time sections are selected from $\Omega^*$ as the test points to be reconstructed, giving $M_xM_yL^*$ test data points; reconstructing all $M_xM_yM_t$ test points requires $\lceil M_t/L^*\rceil$ passes, where $\lceil\cdot\rceil$ denotes the ceiling operation and $L^*$ is the reconstructed data length;
in the training process, the sensor data of the time sections $k=6,7,\ldots,15$, i.e. L = 10 time slices, are selected from $\Omega$ as training data, giving NL = 540 training data points; in the reconstruction operation of this embodiment, the spatial grids of the time sections $k^*=6,7,\ldots,15$, i.e. $L^*=10$, are selected from $\Omega^*$ as the test points to be reconstructed, giving $M_xM_yL^*=2660$ test data points; in the final prediction operation, the spatial grids of the time sections $k^*=16,17,\ldots,20$, i.e. $L^*=5$, are selected from $\Omega^*$, giving $M_xM_yL^*=270$ test data points.
S5, establishing a space-time data model;
the specific implementation method comprises the following steps:
S51, constructing the space-time covariance matrix of the data: the space-time kernel function $\ker(v_p,v_q)$ encodes the space-time correlation of the data, where $v_p=(x_p,y_p,k_p)$ and $v_q=(x_q,y_q,k_q)$ denote two different space-time positions $p$ and $q$; since the space-time data are three-dimensional in the x, y, and k dimensions, a product tensor kernel is used:

$$\ker(v_p,v_q)=\ker_x(x_p,x_q)\,\ker_y(y_p,y_q)\,\ker_k(k_p,k_q)\qquad(6)$$

where $\ker_x(\cdot,\cdot)$, $\ker_y(\cdot,\cdot)$, and $\ker_k(\cdot,\cdot)$ are kernel functions in the x, y, and k dimensions, respectively; the specific form of each kernel can be fixed in advance from the correlation of historical data — e.g. a squared-exponential kernel, a Matérn kernel, or a periodic kernel — or estimated from the acquired data, e.g. a spectral mixture kernel; the hyperparameters of the kernels are collected in a parameter vector $\theta$.
In this embodiment, a squared exponential (SE) kernel is chosen empirically for the x and y dimensions:

$$\ker_{\mathrm{SE}}(a,b)=s^2\exp\!\left(-\frac{(a-b)^2}{2l^2}\right)$$

where the hyperparameters $s^2$ and $l$ denote the variance and the length scale of the training data; under the weight-space view, the length scale measures the ratio of distances between samples before and after the feature-space mapping.
A spectral mixture (SM) kernel is chosen empirically for the k dimension:

$$\ker_{\mathrm{SM}}(k_p,k_q)=\sum_{e=1}^{Q} w^{(e)}\exp\!\big(-2\pi^2\tau^2\,\Sigma^{(e)}\big)\cos\!\big(2\pi\tau\,\mu^{(e)}\big),\qquad \tau=k_p-k_q$$

where $w=[w^{(1)},\ldots,w^{(Q)}]$, $\Sigma_e=[\Sigma^{(1)},\ldots,\Sigma^{(Q)}]$, and $\mu_e=[\mu^{(1)},\ldots,\mu^{(Q)}]$ are the weight, variance, and frequency parameters of the e-th mixture term, and Q is the number of mixture terms, taken as the empirical value 5 in this embodiment. Before use, the SM kernel is initialized as follows: all elements of $w$ are initialized to the data variance; the elements of $\mu_e$ are each initialized to the reciprocal of the minimum sampling interval; the elements of $\Sigma_e$ are each initialized to the inverse of the length scale.
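As an illustrative sketch of the embodiment's kernel choices, the SE kernel, a spectral-mixture kernel in the standard one-dimensional form (assumed here, since the description gives only the parameter roles), and the stated initialization rule might look like:

```python
import numpy as np

def se_kernel(a, b, s2, ell):
    """SE kernel with variance s2 and length scale ell."""
    return s2 * np.exp(-(a - b) ** 2 / (2 * ell ** 2))

def sm_kernel(kp, kq, w, sig, mu):
    """Spectral-mixture kernel over the time index: a weighted sum of
    Gaussian-windowed cosines with Q terms (standard 1-D form, assumed)."""
    tau = kp - kq
    return sum(w_e * np.exp(-2 * np.pi ** 2 * tau ** 2 * s_e)
               * np.cos(2 * np.pi * tau * m_e)
               for w_e, s_e, m_e in zip(w, sig, mu))

def init_sm(data, dt, ell, Q=5):
    """Initialization rule from the embodiment: weights <- data variance,
    variances <- 1 / length scale, frequencies <- 1 / min sampling interval."""
    var = float(np.var(data))
    return [var] * Q, [1.0 / ell] * Q, [1.0 / dt] * Q
```

At zero lag the SM kernel reduces to the sum of the mixture weights, which is a convenient check of the initialization.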
The space-time covariance matrices of the training and test data are constructed with the space-time kernel $\ker(v_p,v_q)$:

$$\begin{bmatrix}\Sigma(V,V) & \Sigma(V,V^*)\\ \Sigma(V^*,V) & \Sigma(V^*,V^*)\end{bmatrix}\qquad(7)$$

where $V$ denotes the positions of the NL training data points, so $\Sigma(V,V)$ is the $NL\times NL$ training covariance matrix; $V^*$ denotes the positions of the $M_xM_yL^*$ test points, so $\Sigma(V^*,V^*)$ is the $M_xM_yL^*\times M_xM_yL^*$ test covariance matrix; $\Sigma(V,V^*)$ and $\Sigma(V^*,V)$ are the $NL\times M_xM_yL^*$ and $M_xM_yL^*\times NL$ cross-covariance matrices;
S52, establishing the space-time data model with a Gaussian process: under the Gaussian process model, the training data and the test data obey a joint Gaussian distribution:

$$\begin{bmatrix} d_{NL\times 1}\\ d^*\end{bmatrix}\sim N\!\left(\mathbf 0,\begin{bmatrix}\Sigma(V,V) & \Sigma(V,V^*)\\ \Sigma(V^*,V) & \Sigma(V^*,V^*)\end{bmatrix}\right)\qquad(8)$$

where $d_{NL\times1}=\mathrm{Vec}\big([d_1^{(2)},d_2^{(2)},\ldots,d_N^{(2)}]\big)$ is the training data vector, $\mathrm{Vec}(\cdot)$ is the vectorization operator, $d_i^{(2)}$ is the vector of the L training data points of sensor $s_i$, and $d^*$ is the test data vector at the positions $V^*$.
S6, training a space-time data model;
The specific implementation is as follows: the hyperparameters in the kernel function are computed; from the Gaussian prior, the training data $d_{NL\times1}$ are Gaussian distributed, i.e. $d_{NL\times1}\mid V\sim N(\mathbf 0,\Sigma(V,V))$; the log marginal likelihood of $d_{NL\times1}$ is:

$$\log p(d_{NL\times1}\mid V,\theta)=-\frac12\,d_{NL\times1}^T\,[\Sigma(V,V)]^{-1}\,d_{NL\times1}-\frac12\log\big|\Sigma(V,V)\big|-\frac{NL}{2}\log 2\pi\qquad(9)$$

where $[\cdot]^{-1}$ denotes matrix inversion;
the hyperparameters are solved by a numerical method as those maximizing the log marginal likelihood:

$$\hat\theta=\arg\max_\theta\,\log p(d_{NL\times1}\mid V,\theta)\qquad(10)$$
S7, performing space-time data reconstruction with the model;
The specific implementation is as follows: according to the space-time data model, the space-time data reconstructed or predicted at the test-point positions are:

$$\hat d^*=\Sigma(V^*,V)\,[\Sigma(V,V)]^{-1}\,d_{NL\times1}\qquad(11)$$
Fig. 4 shows the sensor network data slice at time t = 5 h in this embodiment; the space-time field slice reconstructed by the present method at t = 5 h is shown in Fig. 5.
The space-time data model can also be applied, via formula (11), to predict the space-time data at the predicted test-point positions. Comparison curves between the temperature values predicted by the method and the actual values at the positions of sensors No. 1, 4, 5, and 12, for t = 6-15 h and t = 15-20 h, are shown in Fig. 6.
The training data slices for t = 0-15 h and the space-time field data slices reconstructed at the test points for t = 2 h and t = 7.5 h are shown in Fig. 7; the training data slices for t = 0-15 h and the data slices predicted at the test points for t = 16 h and t = 18.5 h are shown in Fig. 8.
As the figures show, the Gaussian-process-based space-time field reconstruction method provided by the invention can reconstruct a space-time field from known node data within the field, without relying on a physical model.
The covariance of the reconstructed data is:

$$\mathrm{Cov}(d^*)=\Sigma(V^*,V^*)-\Sigma(V^*,V)\,[\Sigma(V,V)]^{-1}\,\Sigma(V,V^*)\qquad(12)$$
it will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to assist the reader in understanding the principles of the invention and are to be construed as being without limitation to such specifically recited embodiments and examples. Those skilled in the art can make various other specific changes and combinations based on the teachings of the present invention without departing from the spirit of the invention, and these changes and combinations are within the scope of the invention.
Claims (2)
1. A space-time reconstruction method of sensor network data is characterized by comprising the following steps:
s1, collecting space-time data of the sensor; the specific implementation method comprises the following steps: sensor siTime series data d for data collectioni=[di(1),di(2),...,di(Ki)]TIs represented by (a) wherein di(k),k=1,2,...,KiIndicating sensor siData collected at time K, KiIs siT is the transpose operator;
the set of all sensor data collected by the sensor network is denoted $D^{(0)} = \{d_i^{(0)}\}_{i=1}^{N}$, where the superscript $(0)$ indicates raw acquisition data, $d_i^{(0)}$ denotes the time series data acquired by sensor $s_i$, $i = 1, \ldots, N$, and $N$ is the number of sensors;
s2, preprocessing the acquired space-time data; the specific implementation method comprises the following steps:
s21, removing outliers from the sensor data: the outliers of each sensor are defined as data values deviating by more than $n$ times the standard deviation of that sensor's data, where $n$ ranges from 2 to 10; each outlier is eliminated by replacing it with the value of the nearest normal data point;
s22, filling missing data: the mean of the time series segment immediately preceding the gap, of the same length as the missing segment, is taken as the filling value for the missing data segment;
s23, data aggregation: the data of each sensor are aggregated into time series data with time interval $\Delta T$, and the start time of each aggregation period is kept as the timestamp of the aggregated data point; the aggregation operation takes the mean, the median, or a specified quantile of the data within the aggregation period;
s24, truncating all sensor data so that they start from the same timestamp and end at the same timestamp;
after preprocessing, the time series data of all sensors have equal length, with identical timestamps and time intervals for the data points; the data set is represented as $D^{(1)} = \{d_i^{(1)}\}_{i=1}^{N}$, where the superscript $(1)$ denotes the preprocessed data;
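The preprocessing steps s21-s23 can be sketched as follows. This is a minimal illustration only: the function name `preprocess`, the choice of the mean as the deviation reference for outliers, and the aggregation by averaging are assumptions not fixed by the text (which also allows median or quantile aggregation).

```python
import numpy as np

def preprocess(series, n=3, agg=4):
    """Illustrative sketch of s21-s23 for one sensor's time series.

    `series` uses NaN for missing samples; `n` is the outlier threshold in
    standard deviations (the method allows 2-10); `agg` is the number of raw
    samples per aggregation interval Delta T."""
    s = np.asarray(series, dtype=float).copy()

    # s21: flag values more than n standard deviations from the mean as
    # outliers (one reading of the criterion) and replace each with the
    # value of the nearest normal data point.
    mu, sd = np.nanmean(s), np.nanstd(s)
    outlier = np.abs(s - mu) > n * sd
    normal_idx = np.flatnonzero(~outlier & ~np.isnan(s))
    for i in np.flatnonzero(outlier):
        s[i] = s[normal_idx[np.argmin(np.abs(normal_idx - i))]]

    # s22: fill each missing segment with the mean of the preceding
    # segment of equal length.
    missing = np.isnan(s)
    i = 0
    while i < len(s):
        if missing[i]:
            j = i
            while j < len(s) and missing[j]:
                j += 1
            gap = j - i
            s[i:j] = np.mean(s[max(0, i - gap):i])
            i = j
        else:
            i += 1

    # s23: aggregate by averaging; the timestamp of each aggregated point
    # is the start of its aggregation window.
    m = len(s) // agg
    return s[:m * agg].reshape(m, agg).mean(axis=1)
```

Step s24 (truncation to a common start and end timestamp) is then a simple slice applied to all sensors jointly.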
s3, determining the length of training data;
s4, constructing a training data set and a testing data set; the specific implementation method comprises the following steps:
s41, constructing the training data set: the preprocessed data of sensor $s_i$ is represented as $d_i^{(1)} = [d_i^{(1)}(1), \ldots, d_i^{(1)}(K)]^T$, abbreviated $d_i^{(1)}(k)$, which corresponds to the data at the space-time position $(x_i, y_i; k)$; the length-$K$ time series of the $N$ sensors give $NK$ data points in total;
the sensor data are normalized as:

$$d_i^{(2)}(k) = \frac{d_i^{(1)}(k) - \mu_i}{\sigma_i}$$

where $\mu_i$ is the mean of the data sequence $d_i^{(1)}$, $\sigma_i$ is its standard deviation, and the superscript $(2)$ denotes the normalized data;
the training data set $\Omega$ is constructed from the normalized sensor data:

$$\Omega = \left\{ \left(x_i, y_i, k;\; d_i^{(2)}(k)\right) \;\middle|\; i = 1, \ldots, N;\; k = 1, \ldots, K \right\}$$

where $d_i^{(2)}(k)$ denotes the normalized data acquired at time $k$ by sensor $s_i$ located at spatial coordinates $(x_i, y_i)$; the training data set $\Omega$ contains $NK$ data points;
s42, constructing the test data set: the test data set is the collection of data points to be reconstructed, whose test positions $(x^*, y^*, k^*)$ lie on the space-time grid $(s_g; t_g)$, i.e. $(x^*, y^*) \in s_g$ and $k^* \in t_g$, where $s_g$ is the spatial grid and $t_g$ is the temporal grid;
the test data set $\Omega^*$ is defined as:

$$\Omega^* = \left\{ \left(x^*, y^*, k^*;\; d^*(x^*, y^*, k^*)\right) \;\middle|\; (x^*, y^*) \in s_g,\; k^* \in t_g \right\}$$

where $d^*$ is the data sequence to be reconstructed, $(x^*, y^*)$ is the spatial position of the data to be reconstructed, $k^*$ is the temporal position of the data to be reconstructed, and $d^*(x^*, y^*, k^*)$ is the data at the point to be reconstructed; $M_x$, $M_y$ and $M_t$ denote the maximum indices of the test data along the x axis, y axis and time axis t respectively, so the total number of test data points is $M_x M_y M_t$; when $1 \le k^* \le K$, the reconstructed data is regarded as a temporal interpolation of the training data; when $k^* > K$, the reconstructed data is regarded as a prediction beyond the training data;
in order to reduce computational complexity, the sensor data of $L$ time sections are selected from $\Omega$ as training data during training, giving $NL$ training data points in total; the spatial grids of $L^*$ time sections are selected from $\Omega^*$ as the test data points to be reconstructed, giving $M_x M_y L^*$ test data points in total; reconstructing all $M_x M_y M_t$ test point data therefore requires $\lceil M_t / L^* \rceil$ tests, where $\lceil \cdot \rceil$ denotes the ceiling operation and $L^*$ is the reconstructed data length per test;
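The construction of the normalized training set $\Omega$ in step s41 can be sketched as follows. The sensor positions and values here are synthetic placeholders, not data from the embodiment.

```python
import numpy as np

# Illustrative construction of the training set Omega (step s41): N sensors
# at positions (x_i, y_i), L time slices, z-score normalized per sensor to
# give the superscript-(2) data. All values below are synthetic.
rng = np.random.default_rng(0)
positions = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]   # N = 3 sensor locations
L = 5                                              # time slices used
raw = rng.normal(20.0, 2.0, size=(len(positions), L))

# Per-sensor normalization: subtract the mean, divide by the std deviation.
norm = (raw - raw.mean(axis=1, keepdims=True)) / raw.std(axis=1, keepdims=True)

# Omega: one row (x_i, y_i, k, d) per space-time training point, NL rows.
Omega = np.array([(x, y, k, norm[i, k])
                  for i, (x, y) in enumerate(positions)
                  for k in range(L)])
```

The test set $\Omega^*$ is built the same way, except its rows carry only grid positions $(x^*, y^*, k^*)$, with the values to be filled in by the reconstruction of step s7.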
s5, establishing a space-time data model; the specific implementation method comprises the following steps:
s51, constructing the space-time covariance matrix of the data: the space-time kernel function $\mathrm{ker}(v_p, v_q)$ encodes the space-time correlation of the data, where $v_p = (x_p, y_p, k_p)$ and $v_q = (x_q, y_q, k_q)$ denote two different space-time positions $p$ and $q$; the space-time data is three-dimensional in the dimensions x, y and k, so $\mathrm{ker}(v_p, v_q)$ is represented by a product tensor kernel, namely:
$$\mathrm{ker}(v_p, v_q) = \mathrm{ker}_x(x_p, x_q)\,\mathrm{ker}_y(y_p, y_q)\,\mathrm{ker}_k(k_p, k_q) \quad (6)$$
where $\mathrm{ker}_x(\cdot,\cdot)$, $\mathrm{ker}_y(\cdot,\cdot)$ and $\mathrm{ker}_k(\cdot,\cdot)$ are kernel functions in the dimensions x, y and k respectively;
the space-time kernel function $\mathrm{ker}(v_p, v_q)$ is used to construct the space-time covariance matrix of the training and test data, expressed as:

$$\Sigma = \begin{bmatrix} \Sigma(V, V) & \Sigma(V, V^*) \\ \Sigma(V^*, V) & \Sigma(V^*, V^*) \end{bmatrix}$$

where $V$ denotes the positions of the $NL$ training data points, so $\Sigma(V, V)$ is an $NL \times NL$ training covariance matrix; $V^*$ denotes the positions of the $M_x M_y L^*$ test points, so $\Sigma(V^*, V^*)$ is an $M_x M_y L^* \times M_x M_y L^*$ test covariance matrix; $\Sigma(V, V^*)$ and $\Sigma(V^*, V)$ are the $NL \times M_x M_y L^*$ and $M_x M_y L^* \times NL$ cross-covariance matrices;
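The product tensor kernel of formula (6) can be sketched as below. The squared-exponential (RBF) form and the length-scales `ells` are assumptions for illustration; the claims do not fix the kernel family for $\mathrm{ker}_x$, $\mathrm{ker}_y$, $\mathrm{ker}_k$.

```python
import numpy as np

def rbf(a, b, ell):
    """1-D squared-exponential kernel, one illustrative choice for
    ker_x, ker_y and ker_k (the patent leaves the kernel family open)."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

def st_cov(Va, Vb, ells=(1.0, 1.0, 1.0)):
    """Product tensor kernel of formula (6),
    ker(v_p, v_q) = ker_x(x_p, x_q) * ker_y(y_p, y_q) * ker_k(k_p, k_q),
    evaluated between two point sets Va, Vb of shape (m, 3) holding
    (x, y, k) space-time positions; `ells` are assumed length-scales."""
    return (rbf(Va[:, 0], Vb[:, 0], ells[0])
            * rbf(Va[:, 1], Vb[:, 1], ells[1])
            * rbf(Va[:, 2], Vb[:, 2], ells[2]))
```

With training positions `V` (shape NL x 3) and test positions `Vs`, the calls `st_cov(V, V)`, `st_cov(V, Vs)` and `st_cov(Vs, Vs)` give the blocks of the joint covariance of training and test data.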
s52, establishing the space-time data model using a Gaussian process: under the Gaussian process model, the training data and the test data obey a joint Gaussian distribution, namely:

$$\begin{bmatrix} d_{NL \times 1} \\ d^* \end{bmatrix} \sim N\!\left( 0, \begin{bmatrix} \Sigma(V, V) & \Sigma(V, V^*) \\ \Sigma(V^*, V) & \Sigma(V^*, V^*) \end{bmatrix} \right)$$

where $d_{NL \times 1}$ is the training data vector, i.e. $d_{NL \times 1} = \mathrm{Vec}([d_1, d_2, \ldots, d_N])$, with $\mathrm{Vec}(\cdot)$ the vectorization operator and $d_i$ the training data vector consisting of the $L$ data points of sensor $s_i$; $d^*$ is the test data vector consisting of the data at the $M_x M_y L^*$ test points;
s6, training the space-time data model; the specific implementation is as follows: the hyper-parameters in the kernel function are calculated; from the Gaussian prior, the distribution of the training data $d_{NL \times 1}$ is Gaussian, i.e. $d_{NL \times 1} \mid V \sim N(0, \Sigma(V, V))$; the log marginal likelihood of $d_{NL \times 1}$ is:

$$\log p(d_{NL \times 1} \mid V) = -\frac{1}{2} d_{NL \times 1}^{T} [\Sigma(V, V)]^{-1} d_{NL \times 1} - \frac{1}{2} \log \lvert \Sigma(V, V) \rvert - \frac{NL}{2} \log 2\pi$$
where $[\cdot]^{-1}$ denotes the matrix inversion operator;
the hyper-parameters are solved by a numerical method, i.e. as the hyper-parameters maximizing the log marginal likelihood:

$$\hat{\theta} = \arg\max_{\theta}\; \log p(d_{NL \times 1} \mid V, \theta)$$
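Step s6 can be sketched as follows: compute the log marginal likelihood of the training data and maximize it numerically. A single isotropic RBF length-scale `ell` and a noise level `sn` stand in for the hyper-parameters here; the patent does not fix the kernel parameterization, and the positions and values are synthetic.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_marginal(theta, V, d):
    """Negative log marginal likelihood -log p(d | V) under an assumed
    isotropic RBF kernel with log-space parameters theta = (log ell, log sn)."""
    ell, sn = np.exp(theta)
    sq = np.sum((V[:, None, :] - V[None, :, :]) ** 2, axis=-1)
    # Covariance plus noise; a small jitter keeps the Cholesky stable.
    K = np.exp(-0.5 * sq / ell ** 2) + (sn ** 2 + 1e-8) * np.eye(len(d))
    Lc = np.linalg.cholesky(K)
    alpha = np.linalg.solve(Lc.T, np.linalg.solve(Lc, d))
    # 0.5 d^T K^{-1} d + 0.5 log|K| + (NL/2) log(2 pi)
    return (0.5 * d @ alpha + np.log(np.diag(Lc)).sum()
            + 0.5 * len(d) * np.log(2 * np.pi))

rng = np.random.default_rng(1)
V = rng.uniform(0.0, 3.0, size=(25, 3))           # synthetic (x, y, k) positions
d = np.sin(V.sum(axis=1)) + 0.05 * rng.normal(size=25)

res = minimize(neg_log_marginal, x0=np.log([1.0, 0.1]), args=(V, d))
ell_hat, sn_hat = np.exp(res.x)                   # fitted hyper-parameters
```

Optimizing in log-space keeps both hyper-parameters positive without explicit bound constraints.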
s7, performing space-time data reconstruction with the model; the specific implementation is as follows: according to the space-time data model, the space-time data reconstructed or predicted at the test point positions is:

$$\hat{d}^* = \Sigma(V^*, V)\,[\Sigma(V, V)]^{-1}\, d_{NL \times 1} \quad (11)$$
the covariance of the reconstructed data is:

$$\Sigma^* = \Sigma(V^*, V^*) - \Sigma(V^*, V)\,[\Sigma(V, V)]^{-1}\,\Sigma(V, V^*)$$
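Step s7 reduces to the standard Gaussian-process posterior, which can be sketched as below. An isotropic RBF kernel over $(x, y, k)$ stands in for the product kernel, and the `noise` term is an assumed regularizer; both are illustrative choices.

```python
import numpy as np

def kern(A, B, ell=1.0):
    """Assumed isotropic RBF kernel over (x, y, k) positions."""
    sq = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq / ell ** 2)

def gp_reconstruct(V, d, Vs, noise=1e-4):
    """Posterior mean and covariance at test positions Vs given training
    positions V and training values d:
      mean = Sigma(V*, V) Sigma(V, V)^{-1} d
      cov  = Sigma(V*, V*) - Sigma(V*, V) Sigma(V, V)^{-1} Sigma(V, V*)."""
    Kvv = kern(V, V) + noise * np.eye(len(V))
    Ksv = kern(Vs, V)
    Kss = kern(Vs, Vs)
    alpha = np.linalg.solve(Kvv, d)
    mean = Ksv @ alpha
    cov = Kss - Ksv @ np.linalg.solve(Kvv, Ksv.T)
    return mean, cov
```

Evaluated at a training position, the posterior mean returns (approximately) the training value and the posterior variance collapses toward the noise level, which is the behaviour the reconstruction relies on between sensors.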
2. The space-time reconstruction method of sensor network data according to claim 1, wherein step s3 is specifically implemented as follows:
s31, calculating the time correlation sequence of each sensor:

$$r_i(\tau) = \frac{E\!\left[\left(d_i^{(1)}(k) - \mu_i\right)\left(d_i^{(1)}(k + \tau) - \mu_{i,\tau}\right)\right]}{\sigma_i\, \sigma_{i,\tau}}$$

where $E[\cdot]$ denotes the expectation operator; $\tau$ is the time delay length, $\tau = 0, 1, \ldots, K_\tau$, with $K_\tau$ the maximum delay time; $\mu_i$ and $\mu_{i,\tau}$ are the means, and $\sigma_i$ and $\sigma_{i,\tau}$ the standard deviations, of the time series $d_i^{(1)}(k)$ and $d_i^{(1)}(k+\tau)$ respectively, where $d_i^{(1)}(k+\tau)$ denotes the time series obtained by delaying $d_i^{(1)}(k)$ by $\tau$ time units;
s32, estimating the decorrelation length $\tau_{c,i}$ of the time series data of each sensor $s_i$; there are two cases:
case 1: if the time correlation sequence ri(τ) is aperiodic, then find the condition riThe maximum time delay length with (tau) less than or equal to 0.05 is used as the decorrelation length tauc,i;
Case 2: if the time correlation sequence $r_i(\tau)$ is periodic, take the period of the interval between peaks as the decorrelation length $\tau_{c,i}$;
S33, determining the maximum decorrelation length of the sensor data:

$$\tau_{c,\max} = \max_{i = 1, \ldots, N} \tau_{c,i}$$
s34, taking the length $L$ of the training data as $L = n \cdot \tau_{c,\max}$, where $n$ ranges from 1.5 to 10.
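Steps s31-s32 (aperiodic case) can be sketched as follows. The function name and the reading of case 1 as "the first delay at which the normalized correlation falls to the threshold" are illustrative assumptions; the periodic case 2 (peak-to-peak period) is not handled here.

```python
import numpy as np

def decorrelation_length(d, K_tau=None, thresh=0.05):
    """Sketch of s31-s32, aperiodic case: compute the normalized time
    correlation sequence r_i(tau) for tau = 0..K_tau, then return the first
    delay at which it falls to `thresh` or below (K_tau if it never does)."""
    d = np.asarray(d, dtype=float)
    K_tau = K_tau or len(d) // 2
    r = np.empty(K_tau + 1)
    for tau in range(K_tau + 1):
        a, b = d[:len(d) - tau], d[tau:]          # series and its tau-delay
        r[tau] = np.mean((a - a.mean()) * (b - b.mean())) / (a.std() * b.std())
    below = np.flatnonzero(r <= thresh)
    return int(below[0]) if below.size else K_tau
```

Applying this per sensor, taking the maximum over sensors (s33), and scaling by $n$ (s34) yields the training data length $L$.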
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011576661.0A CN112699601B (en) | 2020-12-28 | 2020-12-28 | Space-time reconstruction method for sensor network data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011576661.0A CN112699601B (en) | 2020-12-28 | 2020-12-28 | Space-time reconstruction method for sensor network data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112699601A CN112699601A (en) | 2021-04-23 |
CN112699601B true CN112699601B (en) | 2022-05-31 |
Family
ID=75512190
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011576661.0A Active CN112699601B (en) | 2020-12-28 | 2020-12-28 | Space-time reconstruction method for sensor network data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112699601B (en) |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7981424B2 (en) * | 2006-05-05 | 2011-07-19 | Transtech Pharma, Inc. | RAGE fusion proteins, formulations, and methods of use thereof |
CN104156615A (en) * | 2014-08-25 | 2014-11-19 | 哈尔滨工业大学 | Sensor test data point anomaly detection method based on LS-SVM |
CN106295703B (en) * | 2016-08-15 | 2022-03-25 | 清华大学 | Method for modeling and identifying time sequence |
CN107704962B (en) * | 2017-10-11 | 2021-03-26 | 大连理工大学 | Steam flow interval prediction method based on incomplete training data set |
CN109583386B (en) * | 2018-11-30 | 2020-08-25 | 中南大学 | Intelligent rotating machinery fault depth network feature identification method |
CN112067294A (en) * | 2019-09-20 | 2020-12-11 | 宫文峰 | Rolling bearing intelligent fault diagnosis method based on deep learning |
CN111861002A (en) * | 2020-07-22 | 2020-10-30 | 上海明华电力科技有限公司 | Building cold and hot load prediction method based on data-driven Gaussian learning technology |
-
2020
- 2020-12-28 CN CN202011576661.0A patent/CN112699601B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN112699601A (en) | 2021-04-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109492822B (en) | Air pollutant concentration time-space domain correlation prediction method | |
Le et al. | Spatiotemporal deep learning model for citywide air pollution interpolation and prediction | |
Trebing et al. | Wind speed prediction using multidimensional convolutional neural networks | |
Wu et al. | Promoting wind energy for sustainable development by precise wind speed prediction based on graph neural networks | |
CN106559749B (en) | Multi-target passive positioning method based on radio frequency tomography | |
CN114220271A (en) | Traffic flow prediction method, equipment and storage medium based on dynamic space-time graph convolution cycle network | |
CN111079977A (en) | Heterogeneous federated learning mine electromagnetic radiation trend tracking method based on SVD algorithm | |
CN113705880A (en) | Traffic speed prediction method and device based on space-time attention diagram convolutional network | |
Sun et al. | Device-free wireless localization using artificial neural networks in wireless sensor networks | |
CN107967487A (en) | A kind of colliding data fusion method based on evidence distance and uncertainty | |
CN110008508A (en) | Three-dimensional temperature field monitoring method based on space-time condition dynamic modeling | |
CN103298156A (en) | Passive multi-target detecting and tracking method based on wireless sensor networks | |
CN108647643A (en) | A kind of packed tower liquid flooding state on-line identification method based on deep learning | |
CN114297907A (en) | Greenhouse environment spatial distribution prediction method and device | |
CN113033654A (en) | Indoor intrusion detection method and system based on WiFi channel state information | |
CN110796360A (en) | Fixed traffic detection source multi-scale data fusion method | |
Chen et al. | WSN sampling optimization for signal reconstruction using spatiotemporal autoencoder | |
CN111209968A (en) | Multi-meteorological factor mode forecast temperature correction method and system based on deep learning | |
CN112699601B (en) | Space-time reconstruction method for sensor network data | |
CN117156442B (en) | Cloud data security protection method and system based on 5G network | |
Chen et al. | Temperature monitoring and prediction under different transmission modes | |
CN113538239B (en) | Interpolation method based on space-time autoregressive neural network model | |
Xuegang et al. | Missing Data Reconstruction Based on Spectral k-Support Norm Minimization for NB-IoT Data | |
Li et al. | Missing data reconstruction in attitude for quadrotor unmanned aerial vehicle based on deep regression model with different sensor failures | |
Zhu et al. | Multi-resolution spatio-temporal prediction with application to wind power generation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||