Time-space delay correlation visualization method
Technical Field
The invention provides a time-space delay correlation visualization method, which can visualize and assist in analyzing the delay correlation of a plurality of time series data with spatial relationship and belongs to the field of information visualization.
Background
Spatio-temporal data are data carrying time and space labels, and their volume has grown sharply with the spread of sensor technology. Analyzing spatio-temporal data can reveal trends and anomalies in the data. Visualization methods present data in an intuitive form and are a considerable aid to understanding the information behind complex data.
Station-based spatio-temporal data are time-varying data acquired by a data acquisition device fixed at a specific position, such as a ground-based air quality monitoring station or a temperature sensor in an office area. The spatial label in such data is fixed: it is the position of the sensor. Station-based spatio-temporal data can therefore be understood as time series data with fixed spatial information.
Delay correlation is a very important pattern in time series data with spatial relationships. For two time series A and B, if A over the time range (t1 - t, t2 - t) is correlated with B over the time range (t1, t2), then A and B are said to have a delay correlation, and the length of the delay is t. Analyzing the delay correlation of multiple time series can help discover causal relationships among different spatial regions, which is of great practical significance.
However, there is currently no good method for visualizing multiple time series with spatial relationships, and few visualization methods support assisted analysis of the delay correlation of such data.
Disclosure of Invention
In view of the above problems, the present invention provides a method for visualizing a spatio-temporal delay correlation, which can visualize the delay correlations of a plurality of time series data having a spatial relationship.
The technical scheme adopted by the invention is as follows:
a spatio-temporal delay correlation visualization method comprises the following steps:
1) selecting one time series from the multiple time series having a spatial relationship, and calculating the distance between the spatial position of each other time series and the spatial position of the selected time series;
2) selecting the time series of a target site S and a corresponding time range (t1, t2), and determining the delay time range (-h1, h2) over which the leading correlation and the lagging correlation are calculated, wherein h1 represents the leading time and h2 represents the lagging time; for each site s whose correlation is to be calculated, calculating the correlation between the time series of site S from time t1 to t2 and the time series of site s from time t1 + i to t2 + i, wherein i takes values from -h1 to h2;
3) according to the calculation results of step 1) and step 2), expressing the correlation at different delay times and the correlation of sites with different spatial relationships by a rectangular-block-based visualization method.
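As an illustration of step 1), the following Python sketch computes the distance from a selected station to every other station. It assumes that each spatial position is given as a (latitude, longitude) pair in degrees and that the great-circle (haversine) distance is an acceptable measure. The method itself does not prescribe a distance formula, and the names haversine_km and distances_to_target are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distances_to_target(target, others):
    """Map each station name to its distance from the target position.

    target: (lat, lon) of the selected station.
    others: dict of station name -> (lat, lon).
    """
    return {
        name: haversine_km(target[0], target[1], lat, lon)
        for name, (lat, lon) in others.items()
    }
```

If the stations are described by planar coordinates instead, a Euclidean distance could be substituted without changing the later steps.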
Further, step 3) represents the correlation of sites with different spatial relationships through hierarchical division and distance mapping of rectangular bands, and represents the correlation at each delay time by contiguous rectangular blocks with different color mappings.
Further, the specific method for expressing the relevance of the sites with different spatial relationships in step 3) is as follows:
a) based on the distance between the target station and other adjacent stations, performing hierarchical division on the adjacent stations, and dividing the adjacent stations into l layers;
b) drawing (l + 1) rectangular frames, wherein l is an integer greater than or equal to 0, and one rectangular frame presents the time series of the target site using a conventional time series visualization method;
c) using the remaining l rectangular frames, arranged from far to near according to distance from the rectangular frame already drawn, to visualize respectively the correlation of the different layers divided in step a).
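The hierarchical division of step a) can be sketched as follows. The method does not state how layer boundaries are chosen, so this sketch assumes that the analyst supplies ascending distance thresholds; the function name divide_into_layers, the thresholds, and the convention that the first layer holds the nearest stations are all assumptions made for illustration.

```python
from bisect import bisect_left


def divide_into_layers(distances, boundaries):
    """Group stations into len(boundaries) + 1 layers by distance.

    distances: dict of station name -> distance from the target station.
    boundaries: ascending distance thresholds (hypothetical, analyst-chosen).
    Returns a list of layers; layer 0 holds the nearest stations, and the
    stations inside each layer are ordered from near to far.
    """
    layers = [[] for _ in range(len(boundaries) + 1)]
    for name, dist in sorted(distances.items(), key=lambda item: item[1]):
        # bisect_left returns how many thresholds are strictly below dist.
        layers[bisect_left(boundaries, dist)].append(name)
    return layers
```

With made-up distances of 10, 25, 30, 50, 55 and 70 for stations S1 to S6 and thresholds of 20 and 40, this yields the layers [S1], [S2, S3] and [S4, S5, S6].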
Further, a specific method for expressing the correlation of each different delay time in step 3) is as follows:
a1) for all stations in a layer, representing the delay correlation between each station and the target station by one rectangular band, the bands being arranged from far to near according to distance from the drawn rectangular frame representing the target station; the width of each rectangular band is the width of the rectangular frame divided by the number of stations contained in the layer;
b1) dividing each rectangular band evenly into (h1 + h2 + 1) rectangular blocks, the blocks from left to right representing in turn the delay times from -h1 to h2, and the color of each block representing the correlation at the corresponding delay time.
Further, a specific method for expressing the correlation of each different delay time in step 3) is as follows:
a2) for all stations in a layer, representing the delay correlation between each station and the target station by one rectangular band, the bands being arranged from far to near according to distance from the drawn rectangular frame representing the target station; the width of each rectangular band is the width of the rectangular frame divided by the number of stations contained in the layer;
b2) for each rectangular band, dividing the rectangular blocks according to the proportion of the actual distance between each station and the target station, the color of each block representing the correlation at the corresponding delay time.
Further, the spatial mapping is divided into two mutually corresponding directions, and the presentation of the delay correlation is performed in each direction.
Further, the spatial region is divided into several equal parts, and the presentation of the delay correlation is performed in each region.
The invention has the following beneficial effects:
The method is a rectangle-based visualization method: the correlation at different delay times is expressed by contiguous rectangular blocks with different color mappings, and the correlation of sites with different spatial relationships is expressed by hierarchical division and distance mapping of rectangular bands (each band being the set of rectangular blocks corresponding to one site). The method makes it possible to visualize the delay correlation of time series data with spatial relationships and helps an analyst discover the causal relationships that may lie behind the correlation.
Drawings
Fig. 1 is a schematic diagram of a target station and its neighboring stations in the embodiment.
Fig. 2 is a schematic diagram of drawing a rectangular frame in the embodiment.
Fig. 3 is a schematic diagram of dividing a rectangular band in the embodiment.
Fig. 4 is a schematic diagram of dividing a rectangular band evenly into rectangular blocks in the embodiment.
Fig. 5 is a schematic diagram of dividing rectangular blocks according to the proportion of the actual distance from the target station in the embodiment.
Fig. 6 is a schematic diagram of dividing the spatial mapping into two mutually corresponding directions for delay correlation presentation in the embodiment.
Fig. 7 is a schematic diagram of dividing the spatial region into equal parts for delay correlation presentation in the embodiment.
Fig. 8 is a schematic diagram of the existing correlation matrix method.
Fig. 9 is a schematic diagram of a conventional time-series correlation visualization method.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below.
The method is based on a rectangular block-based visualization method, and the relevance of different delay times is expressed through continuous rectangular blocks mapped by different colors; the relevance of sites with different spatial relationships is represented by hierarchical division and distance mapping of rectangular bands (a plurality of rectangular blocks corresponding to each site).
1. Data preparation
a. The invention addresses multiple time series having a spatial relationship. One time series is selected, and the distance between its spatial position and the spatial position of each other time series is calculated.
b. Select the time series of a target site S and a corresponding time range (t1, t2), and determine the range (-h1, h2) of delay times over which the leading correlation and the lagging (lag) correlation are calculated, where h1 represents the leading time and h2 represents the lagging time. For each site s whose correlation is to be computed, and for each i from -h1 to h2, compute the correlation ci:
ci=Correlation(S(t1,t2),s(t1+i,t2+i))
where Correlation(S(t1, t2), s(t1 + i, t2 + i)) denotes the correlation between the time series of site S from time t1 to t2 and the time series of site s from time t1 + i to t2 + i. The correlation is calculated with the Pearson correlation coefficient (other correlation measures may also be used). The values are saved for subsequent visualization.
After the calculation is complete, each time series carries, in addition to its own values, (h1 + h2 + 1) delay correlation values together with the corresponding delay lengths.
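The calculation of ci in step b can be sketched in Python with NumPy as follows. The sketch assumes that both series are sampled at the same regular interval, that t1, t2, h1 and h2 are expressed as sample indices, and that the time range (t1, t2) includes both end points. The function name delay_correlations is hypothetical.

```python
import numpy as np


def delay_correlations(target, other, t1, t2, h1, h2):
    """Return {i: c_i} for i from -h1 to h2.

    c_i is the Pearson correlation between target[t1..t2] (site S) and
    other[t1+i..t2+i] (site s), with both end points included.
    """
    target = np.asarray(target, dtype=float)
    other = np.asarray(other, dtype=float)
    if t1 < 0 or t2 < t1 or t2 >= len(target):
        raise ValueError("time range (t1, t2) falls outside the target series")
    if t1 - h1 < 0 or t2 + h2 >= len(other):
        raise ValueError("a shifted window falls outside the other series")

    reference = target[t1:t2 + 1]
    correlations = {}
    for i in range(-h1, h2 + 1):
        shifted = other[t1 + i:t2 + i + 1]
        # np.corrcoef returns the 2x2 correlation matrix; [0, 1] is the
        # Pearson coefficient between the two windows.
        correlations[i] = float(np.corrcoef(reference, shifted)[0, 1])
    return correlations
```

The returned dictionary holds the (h1 + h2 + 1) delay correlation values keyed by delay length, which is the form used by the visualization steps. If the second series is simply the first one delayed by three samples, the largest value appears at i = 3.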
2. Delay correlation visualization method
a. Based on the distances between the target station and its neighboring stations, the neighboring stations are divided hierarchically into l layers, where l is an integer greater than or equal to 0.
b. Draw (l + 1) rectangular boxes. One box presents the time series of the target station using a conventional time series visualization method (such as a line chart or an area chart). The remaining l boxes, arranged from far to near according to distance from the box already drawn, visualize respectively the correlation of the different layers divided in step a.
c. Since a layer may contain more than one station, the delay correlation between each station in the layer and the target station is represented by one rectangular band, and the bands are arranged from far to near according to distance from the drawn rectangular box representing the target station. The width of each rectangular band is the width of the rectangular box divided by the number of stations contained in the layer.
d. Each rectangular band is divided evenly into (h1 + h2 + 1) rectangular blocks, which from left to right represent in turn the delay times from -h1 to h2. The color of each block represents the correlation at the corresponding delay time. As an alternative to even division in step d, the layout and division can follow the proportion of the actual distance between each station and the target station.
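Steps c and d with even division can be sketched as a small geometry routine that turns one layer's rectangular box into bands and blocks. The sketch assumes that the bands of a layer are stacked vertically inside the box, so that the "width" of a band is read as its vertical extent, and that the blocks run from left to right along the horizontal axis. The names Block and layout_layer and the coordinate convention are hypothetical.

```python
from typing import NamedTuple


class Block(NamedTuple):
    station: str
    delay: int      # delay time, from -h1 to h2
    x: float        # left edge
    y: float        # top edge
    width: float
    height: float


def layout_layer(frame_x, frame_y, frame_w, frame_h, stations_far_to_near, h1, h2):
    """Divide one layer's box into one band per station and
    (h1 + h2 + 1) equal blocks per band."""
    if not stations_far_to_near:
        raise ValueError("a layer needs at least one station to be drawn")
    n_blocks = h1 + h2 + 1
    band_h = frame_h / len(stations_far_to_near)   # band extent = box extent / station count
    block_w = frame_w / n_blocks
    blocks = []
    for row, station in enumerate(stations_far_to_near):
        for col, delay in enumerate(range(-h1, h2 + 1)):
            blocks.append(Block(station, delay,
                                frame_x + col * block_w,
                                frame_y + row * band_h,
                                block_w, band_h))
    return blocks
```

Each returned block can then be filled with the color corresponding to the delay correlation value of its station at its delay time.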
The method is illustrated below with an example of 3 layers (l = 3), 6 stations, and a delay time range of (-12 hours, 12 hours). As shown in Fig. 1, the solid circle is the selected target station, and its neighboring stations are S1 to S6.
a. The scene is divided into 3 layers: layer L1 contains station S1, layer L2 contains stations S2 and S3, and layer L3 contains stations S4, S5 and S6.
b. Four rectangular boxes are drawn. One presents the time series of the target station as a line chart. The remaining 3 boxes, L1, L2 and L3, are drawn from far to near relative to that box, as shown in Fig. 2.
c. The rectangular boxes representing the three layers are divided into the corresponding rectangular bands: one band for S1 in L1, two bands for S2 and S3 in L2, and three bands for S4, S5 and S6 in L3. Within each box, the bands are arranged from far to near according to each station's distance from the target station. The result is shown in Fig. 3.
d. Each rectangular band is divided evenly into 25 parts (12 + 12 + 1). The color of each block represents the correlation at the corresponding delay time. In this example, red through white to blue indicates a correlation changing from 1 through 0 to -1. The different shades of gray in Fig. 4 stand for different colors.
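The red-white-blue mapping used in this example can be sketched as a simple function from a correlation value to an RGB color. The example only fixes the three anchor colors (1 is red, 0 is white, -1 is blue); the linear interpolation in RGB space between them and the function name correlation_to_rgb are assumptions of this sketch.

```python
def correlation_to_rgb(c):
    """Map a correlation in [-1, 1] to an (r, g, b) tuple of 0-255 integers.

    +1 maps to red, 0 to white and -1 to blue, with linear interpolation
    in between. Values outside [-1, 1] are clamped.
    """
    c = max(-1.0, min(1.0, float(c)))
    # The two non-dominant channels fade from 255 (white) to 0 (pure color).
    fade = round(255 * (1 - abs(c)))
    if c >= 0:
        return (255, fade, fade)
    return (fade, fade, 255)
```

Applied to the 25 delay correlation values of one station, this yields the 25 block colors of that station's rectangular band.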
Besides even division, step d may also lay out and divide according to the proportion of the actual distance between each station and the target station; the result is shown in Fig. 5.
Fig. 6 shows another embodiment of the invention, in which the spatial mapping is further divided into two mutually corresponding directions (north and south, east and west, etc.), and in each direction the delay correlation is presented using the steps of the method described above.
Fig. 7 shows another embodiment of the invention, in which the spatial region is further divided into several equal parts according to actual requirements. In the figure the spatial region is divided into 8 parts, and each region is drawn using the steps of the method described above.
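One way to assign stations to the regions of this embodiment is sketched below. The sketch assumes that the equal parts are equal angular sectors centered on the target station, that positions are planar (x to the east, y to the north), and that sectors are numbered clockwise starting from north; none of these conventions is stated in the method, and the function name sector_index is hypothetical.

```python
import math


def sector_index(target_xy, station_xy, n_sectors=8):
    """Return the index (0 .. n_sectors - 1) of the angular sector around
    the target station that contains the given station.

    Sector 0 starts at due north and sectors are numbered clockwise.
    """
    dx = station_xy[0] - target_xy[0]   # eastward offset
    dy = station_xy[1] - target_xy[1]   # northward offset
    # atan2(dx, dy) measures the angle clockwise from north.
    bearing = math.degrees(math.atan2(dx, dy)) % 360.0
    # The final modulo guards against a bearing that rounds up to 360.0.
    return int(bearing // (360.0 / n_sectors)) % n_sectors
```

The stations falling in each sector can then be layered and drawn sector by sector with the same steps as in the basic embodiment.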
The method of the present invention is compared below with two conventional correlation visualization methods.
1) Compared with the existing correlation matrix method, the present method can present the correlation of continuous time series at one spatial position (station), which a correlation matrix cannot express. In addition, the present method expresses spatial information (relative distance) better. Fig. 8 shows the correlations among 7 sites (S and S1 to S6) drawn with the existing correlation matrix method, using the same color mapping as in the embodiment. With this figure, considerable domain expertise is needed to find how the correlation varies in space, and how it varies over time is hard to present.
2) Compared with the existing time series correlation visualization method, the present method introduces the spatial relationship through graphical variation, color mapping and similar means, and so presents the correlation of multiple spatial positions (sites). Fig. 9 shows a time series correlation visualization for two sites, in which the horizontal axis represents time, the same as in the embodiment, and the vertical axis represents the correlation value in the range (-1, 1). Such a method has difficulty presenting multi-site correlation: for example, when several histograms are used, the continuous correlation among multiple sites cannot be found because the drawing space is insufficient.
The above embodiments are only intended to illustrate the technical solution of the present invention and not to limit the same, and a person skilled in the art can modify the technical solution of the present invention or substitute the same without departing from the spirit and scope of the present invention, and the scope of the present invention should be determined by the claims.