CN110389982A - A kind of spatiotemporal mode visual analysis system and method based on air quality data - Google Patents

A spatiotemporal mode visual analysis system and method based on air quality data

Info

Publication number
CN110389982A
Authority
CN
China
Prior art keywords
mode
air quality
data
quality data
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910678017.5A
Other languages
Chinese (zh)
Inventor
张慧杰
任珂
曲德展
吕程
蔺依铭
王蓉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northeastern University China
Northeast Normal University
Original Assignee
Northeast Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northeast Normal University
Priority to CN201910678017.5A
Publication of CN110389982A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/26 Visual data mining; Browsing structured data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services


Abstract

The invention relates to the field of visualization technology and aims to provide a spatiotemporal mode visual analysis system based on air quality data. The system comprises a data preprocessing module, a data analysis module and a visualization module. Visual analysis proceeds along two lines: the temporal variation of a single city, and, for a group of geographically adjacent cities, the temporal variation of the different modes together with the spatial correlation between cities within a mode. The analysis results are coordinated through linked multiple views. Interactive views are designed to characterize modes with specific data distribution features and to explore regular and abnormal modes in the spatiotemporal features; these modes are then analyzed further to extract valuable information. The system helps analysts examine the normal modes of air quality data intuitively and comprehensively, uncover hidden data modes, and explore mode distribution features and time-varying trends. It provides decision support for analysts and a scientific basis for air pollution control policy making.

Description

A spatiotemporal mode visual analysis system and method based on air quality data
Technical field
The invention relates to the field of visualization, and in particular to a spatiotemporal mode visual analysis system and method based on air quality data.
Background art
In recent years, global air pollution has become increasingly serious and has drawn wide attention from experts and scholars. Visual analysis is an effective means of analyzing big data and helps users explore the inherent regularities of data intuitively. Researchers have so far proposed various analysis methods for studying air quality problems. However, existing air quality studies focus mainly on the time-varying features of a single pollutant. They seldom analyze the correlation between pollutants, or go on to examine how a subspace of strongly correlated dimensions influences the distribution of time-varying modes in the data. In addition, for the spatiotemporal mode analysis of city groups, most existing visual analysis methods for air quality data ignore the spatial correlation between cities, which makes it difficult to explore the neighborhood information and temporal behavior of air quality spatiotemporal modes effectively. Visual analysis of air quality spatiotemporal modes helps analysts explore air pollution modes and their causes, and provides a scientific basis for air pollution control policy making.
With the development of science and technology, large-scale air quality data can be obtained through multiple means such as sensors and monitoring stations, which gives researchers a reliable basis for analyzing air pollution problems. However, the large scale, multivariate nature and time variation of the data pose a great challenge to air quality spatiotemporal mode analysis.
The invention proposes a spatiotemporal mode visual analysis system based on air quality data and introduces visualization techniques to help users explore air quality spatiotemporal modes and their multidimensional features from multiple perspectives.
Summary of the invention
The purpose of the invention is to provide a spatiotemporal mode visual analysis system based on air quality data. It proposes a method for partitioning air quality data into dimension subspaces based on information entropy and mutual information, a method for identifying the air quality time-varying modes of a single city based on concentration and continuity, and a method for identifying the air quality spatiotemporal modes of a city group based on concentration and continuity. An integrated visualization system is designed and developed to explore air quality spatiotemporal modes systematically: for a single city, a time varying characteristic view is designed to visualize its air quality time-varying modes; for a group of geographically adjacent cities, a space-time characteristic view is designed for interactive exploration of its air quality spatiotemporal modes.
To achieve the above purpose, the technical solution adopted by the invention is a spatiotemporal mode visual analysis system based on air quality data, comprising a data preprocessing module, a data analysis module and a visualization module. The data preprocessing module selects urban air quality data along the spatiotemporal and attribute dimensions. The data analysis module extracts the features of the urban air quality data by similarity clustering. The visualization module designs multiple visualization views according to the features extracted by the data analysis module, for interactive exploration of the distribution features of the data at different times and places. The visualization views include a class time-space attribute scatter plot, a spatiotemporal mode arrangement view, a stream view and a time varying characteristic view. The class time-space attribute scatter plot allows modes to be selected by the aggregation degree and continuity degree of their spatiotemporal distribution; the spatiotemporal mode arrangement view is used to find cities that frequently appear in the same mode at the same time; the stream view is used to explore the time-varying trends between modes; the time varying characteristic view is used to find pollution modes that occur in concentrated fashion over continuous time periods; and the analysis results are coordinated through linked multiple views.
Preferably, the data preprocessing module collects air quality data from multiple ground monitoring stations in a city; each air quality record contains the concentrations of six pollutants (PM2.5, PM10, NO2, SO2, O3 and CO) and the geographic location of the ground monitoring station.
Preferably, the location information of a ground monitoring station comprises its longitude and latitude.
Preferably, the data analysis module selects a city group by spatial scale and a time period by time granularity, and clusters the corresponding air quality data with a clustering algorithm; the clustering algorithm uses the Canopy algorithm to obtain the initial cluster centers and the number of clusters K for the K-means clustering algorithm, and proceeds as follows:
Input: air quality data D; initial distance thresholds T1 and T2, with T1 > T2
Output: the center point set N obtained by Canopy clustering; the cluster set C
Step1: traverse all samples in D and compute the distance Dis between samples using the Euclidean distance;
Step2: take a sample d from D, make d the first class (Canopy), and mark d;
Step3: take a new sample P from D and look up the distance Dis between P and each Canopy;
Step4: if Dis < T1, assign P to that Canopy; if the distance from P to every Canopy satisfies Dis > T1, make P a new Canopy; if the distance from P to a Canopy is less than T2, mark P;
Step5: repeat Step3 and Step4 until every point in D is marked, at which point the Canopy algorithm ends;
Step6: compute the mean of the sample attributes of each Canopy as the center of that Canopy, and output the center point set N;
Step7: compute the distance Dis from every sample point to each center point using the Euclidean distance;
Step8: assign each sample d to the cluster whose center is nearest to it;
Step9: compute the mean vector of the sample points in each cluster and take it as the new class center;
Step10: repeat Step7 to Step9 until the center points no longer change;
Step11: output the cluster set C;
Step12: the algorithm ends;
K-means clustering is thus used to further refine the clustering result.
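The steps above can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the Canopy pass follows the commonly described formulation (each pass picks one unmarked point as a center, collects points within T1, and removes points within T2 from further consideration), which differs slightly in step ordering from the listing above. The function names and the sample thresholds are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_point(points):
    """Attribute-wise mean of a non-empty list of points."""
    dim = len(points[0])
    return tuple(sum(p[i] for p in points) / len(points) for i in range(dim))

def canopy_centers(data, t1, t2):
    """Canopy pass: returns one center per canopy (assumes t1 > t2)."""
    if t1 <= t2:
        raise ValueError("T1 must be greater than T2")
    remaining = list(data)
    centers = []
    while remaining:
        seed = remaining.pop(0)          # next unmarked sample starts a canopy
        members = [seed]
        unmarked = []
        for p in remaining:
            dist = euclidean(p, seed)
            if dist < t1:                # loosely bound: belongs to this canopy
                members.append(p)
            if dist >= t2:               # not tightly bound: may seed a later canopy
                unmarked.append(p)
        remaining = unmarked
        centers.append(mean_point(members))
    return centers

def kmeans(data, centers, max_iter=100):
    """Standard K-means refinement starting from the given centers."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in data:
            nearest = min(range(len(centers)), key=lambda i: euclidean(p, centers[i]))
            clusters[nearest].append(p)
        new_centers = [mean_point(c) if c else centers[i] for i, c in enumerate(clusters)]
        if new_centers == centers:       # centers stopped moving
            break
        centers = new_centers
    return centers, clusters

def canopy_kmeans(data, t1, t2):
    """Canopy supplies K and the initial centers; K-means refines them."""
    return kmeans(data, canopy_centers(data, t1, t2))
```

Calling `canopy_kmeans(points, t1, t2)` on a list of numeric tuples returns the refined centers and the members of each cluster; the number of clusters is whatever the Canopy pass produces for the chosen thresholds.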
Preferably, in the visualization module, a time-varying spiral chart, a dandelion chart and a radar chart are used for a single city to show the temporal distribution of its modes at daily granularity.
Preferably, in the visualization module, a time-varying Voronoi diagram, a Sankey diagram and a radar chart are used for a city group to show how its modes change over time.
A spatiotemporal mode visual analysis method based on air quality data, based on the spatiotemporal mode visual analysis system based on air quality data described in claim 1, comprises the partition of dimension subspaces, the Canopy+K-means clustering algorithm, and the identification of air quality spatiotemporal modes by concentration and continuity. Urban air quality data are extracted and partitioned into dimension subspaces, the dimension subspaces of the air quality data being processed according to information entropy and mutual information. The Canopy+K-means clustering algorithm assigns data with similar features in the air quality data to the same class, which yields the spatiotemporal modes of the classes. In identifying air quality spatiotemporal modes by concentration and continuity, concentration and continuity respectively measure the aggregation degree and the continuity degree of a mode's distribution in space and time.
Preferably, the distribution of each mode over bins is counted by partitioning time and space into interval bins, and concentration and continuity are computed by finding the effective bins and locating the valid intervals.
Preferably, the information entropy is obtained as follows:
For the air quality data X, the information entropy H(X) is defined as
$$H(X) = -\sum_{x \in X} p(x) \log p(x)$$
where x is a sample point of X and p(x) is the probability of x. The mutual information is obtained as follows:
For two sets of air quality data X and Y, the mutual information I(X, Y) is defined as
$$I(X, Y) = \sum_{x \in X} \sum_{y \in Y} p(x, y) \log \frac{p(x, y)}{p(x)\, p(y)}$$
where x is a sample point of X, y is a sample point of Y, and p(x, y) is the joint probability density of the scalar values x and y.
Compared with the prior art, the beneficial effects of the invention are as follows: visualization and visual analysis are used to investigate the distribution of spatiotemporal modes in air quality data, which helps analysts examine the normal modes of air quality data intuitively and comprehensively, uncover hidden data modes, and explore mode distribution features and time-varying trends, thereby providing decision support for analysts.
Brief description of the drawings
Fig. 1 is a framework diagram of the spatiotemporal mode visual analysis system based on air quality data of the invention;
Fig. 2 is the time varying characteristic view of one embodiment of the invention;
Fig. 3 is a schematic diagram of the dimension subspace partition of one embodiment of the invention;
Fig. 4 is the class spatiotemporal characteristic diagram of a city group in one embodiment of the invention;
Fig. 5 is the view of time-varying trends between modes in one embodiment of the invention;
Fig. 6 is the spatiotemporal distribution view of the different modes in one embodiment of the invention;
Fig. 7 is a schematic diagram of a mode of continuously declining air quality in one embodiment of the invention;
Fig. 8 is the single-city spatiotemporal distribution view of one embodiment of the invention.
Specific embodiment
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to Figs. 1 to 8. The described embodiments are evidently only some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the invention.
In the description of the invention, it should be understood that terms such as "counterclockwise", "clockwise", "longitudinal", "transverse", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner" and "outer" indicate orientations or positional relationships based on those shown in the drawings. They are used only to simplify the description and do not indicate or imply that the device or element referred to must have a particular orientation or be constructed and operated in a particular orientation; they should therefore not be understood as limiting the invention.
Referring to Fig. 1, a framework diagram of the system: the spatiotemporal mode visual analysis system based on air quality data comprises a data preprocessing module, a data analysis module and a visualization module. The data preprocessing module selects urban air quality data along the spatiotemporal and attribute dimensions. The data analysis module extracts the features of the urban air quality data by similarity clustering. The visualization module designs multiple visualization views according to the features extracted by the data analysis module, for interactive exploration of the distribution features of the data at different times and places. The visualization views include a class time-space attribute scatter plot, a spatiotemporal mode arrangement view, a stream view and a time varying characteristic view. The class time-space attribute scatter plot allows modes to be selected by the aggregation degree and continuity degree of their spatiotemporal distribution; the spatiotemporal mode arrangement view is used to find cities that frequently appear in the same mode at the same time; the stream view is used to explore the time-varying trends between modes; the time varying characteristic view is used to find pollution modes that occur in concentrated fashion over continuous time periods; and the analysis results are coordinated through linked multiple views.
The data preprocessing module collects air quality data from multiple ground monitoring stations in a city. Each air quality record contains the concentrations of six pollutants (PM2.5, PM10, NO2, SO2, O3 and CO) and the longitude and latitude of the ground monitoring station. Because the raw data contain missing or non-numeric values, the station data are cleaned and the missing values are interpolated. The mean concentration of a pollutant across all stations of a city reflects the city's daily concentration of that pollutant, so the city-level mean of each of the six pollutants is extracted and the individual air quality index (IAQI) of each pollutant is computed; its range is 0 to 500.
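The cleaning and IAQI steps can be illustrated with a short sketch. The patent does not state which interpolation method or which IAQI breakpoints it uses, so both are assumptions here: gaps are filled by linear interpolation, and the PM2.5 breakpoints are the 24-hour values commonly cited from the Chinese standard HJ 633-2012. The function names are hypothetical.

```python
# Assumed 24-hour PM2.5 breakpoints (ug/m3) and the matching IAQI levels,
# as commonly cited from HJ 633-2012; they are not given in the patent text.
PM25_BREAKPOINTS = [0, 35, 75, 115, 150, 250, 350, 500]
IAQI_LEVELS = [0, 50, 100, 150, 200, 300, 400, 500]

def fill_missing(values):
    """Fill None entries by linear interpolation between the nearest valid
    neighbors; gaps at either end copy the nearest valid value."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no valid readings to interpolate from")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled

def iaqi(conc, breakpoints=PM25_BREAKPOINTS, levels=IAQI_LEVELS):
    """Piecewise-linear IAQI for one pollutant concentration, clamped to 0..500."""
    conc = min(max(conc, breakpoints[0]), breakpoints[-1])
    for j in range(1, len(breakpoints)):
        if conc <= breakpoints[j]:
            lo, hi = breakpoints[j - 1], breakpoints[j]
            return levels[j - 1] + (levels[j] - levels[j - 1]) * (conc - lo) / (hi - lo)
```

Each of the other five pollutants would need its own breakpoint list passed as `breakpoints`; only PM2.5 is shown to keep the sketch short.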
The specific embodiments of the invention verify the effectiveness of the visualization system through the exploration of single-city time-varying modes and of city-group spatiotemporal modes respectively.
Embodiment one, visual exploration of single-city time-varying modes: in this embodiment, 52 weeks of data for Baoding, Hebei Province, from January 5, 2015 to January 3, 2016 are selected to mine the distribution of the city's time-varying modes. For a single city, all air quality data of the city are clustered and its time-varying modes are then explored.
It is worth noting that information entropy and mutual information are used to measure, respectively, the uncertainty of a variable and the correlation between variables. A large information entropy means that the data carry more information and may directly affect the partition of the clustering result; a larger mutual information between variables means a stronger correlation between them. On this basis the variables are divided into smaller groups, which guides the user in exploring dimension subspaces of the air pollution data. Information theory provides a way of using information entropy to quantify how much information a random variable contains. For a random variable X, the entropy H(X) is defined as:
$$H(X) = -\sum_{x \in X} p(x) \log p(x)$$
where x is a sample point of X and p(x) is the probability of x. Since the air quality index values lie in the range [0, 500], this range is divided evenly into 10 intervals, the probability distribution p(x) of the samples x of each dimension is computed, and from it the information entropy of each dimension.
Variable subspaces are explored by analyzing the degree of information correlation between variables. For two random variables X and Y, the mutual information I(X, Y) is defined as the correlation measure; it measures the information about the other random variable Y that is associated with a specific scalar value x, x ∈ X. I(X, Y) is defined as follows:
$$I(X, Y) = \sum_{x \in X} \sum_{y \in Y} p(x, y) \log \frac{p(x, y)}{p(x)\, p(y)}$$
where y is a sample point of Y, y ∈ Y, and p(x, y) is the joint probability density of the scalar values x and y.
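As a hedged illustration of these two measures, the sketch below discretizes index values in [0, 500] into 10 equal-width bins, as described above, and then estimates entropy and mutual information from the bin frequencies. The logarithm base (2) and the function names are assumptions; the patent does not specify them.

```python
import math
from collections import Counter

def discretize(values, lo=0.0, hi=500.0, bins=10):
    """Map each value to the index of its equal-width bin on [lo, hi]."""
    width = (hi - lo) / bins
    return [min(max(int((v - lo) / width), 0), bins - 1) for v in values]

def entropy(labels):
    """Shannon entropy (base 2, an assumed choice) of a discrete sequence."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def mutual_information(xs, ys):
    """Mutual information (base 2) between two aligned discrete sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    # p(x,y) / (p(x) p(y)) simplifies to c * n / (count_x * count_y)
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())
```

For two pollutant series of equal length, one would call `mutual_information(discretize(pm10), discretize(so2))` and compare the result across pollutant pairs to decide which dimensions to group into a subspace.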
It is worth noting that for a city group, all air quality data of the selected city group are clustered and the spatiotemporal modes of the city group are then explored. A clustering algorithm is generally used to assign data with similar spatiotemporal features to the same class in order to identify spatiotemporal modes. Unlike other clustering algorithms, the Canopy algorithm does not require the number of clusters to be specified; it divides the classes according to two initial distance thresholds T1 and T2 and so arrives at a suitable number of clusters. Because the Canopy algorithm needs only one iteration, it runs more efficiently than other algorithms, but its clustering result is not accurate enough, and an air quality data sample may be assigned to several clusters. The K-means algorithm is fast and efficient on large-scale data sets, but it requires the number of clusters K to be specified in advance. In addition, the clustering quality of K-means depends on the initial center points: different initial center points produce different clustering results. This embodiment therefore computes the clustering result with a Canopy+K-means algorithm: the Canopy algorithm obtains an initial clustering result and the number of classes K in a single iteration, and the class centers computed after that clustering are used to initialize the K value and class centers of the K-means algorithm, which yields a stable and accurate clustering result.
Referring to part b of Fig. 2, class 5 is characterized by high concentration, high continuity and many data samples. Part d of Fig. 2 shows the dandelion view: the distribution of the outer flowers shows that this class occurs throughout 2015 except in winter, and the inner radar chart shows that its pollutant concentrations are significantly lower than those of the other classes. It is therefore concluded that this mode is the normal mode of good air quality in Baoding. The higher concentration value indicates that this air quality condition occurred on many consecutive days within a given period; the relatively lower continuity arises because air quality was comparatively poor in a few weeks. Part b of Fig. 2 shows that class 4 is also distributed throughout the year except in winter; judged by its pollutant concentration values, it can be defined as a light pollution mode of the city. This mode has relatively high concentration and the highest continuity, which shows that the light pollution mode in Baoding occurs continuously over short periods. The high concentration and continuity arise because this light pollution mode is concentrated in summer and autumn and occurs in almost every week. Class 2 has low concentration and high continuity: the low concentration is due to its low frequency within the time bins, and the high continuity to its being concentrated mainly in winter. The radar chart shows that its pollutant values are high, so it is a heavy pollution mode; because Baoding is a heating city, winter heating makes air quality comparatively poor. Class 3 is a moderate pollution mode; although it occurs infrequently, it is concentrated in winter. Class 1 is characterized by low concentration and low continuity; its air quality index is off the charts, and the mode occurs mainly in winter.
To investigate how the correlation within a subspace affects the data modes, the dimension subspace is partitioned using mutual information and information entropy. As shown in part a of Fig. 2, for one year of Baoding air quality data, PM10 and SO2 have large information entropy, and the mutual information among PM10, SO2 and O3 is high. These three dimensions are selected by a brushing operation, as shown in part a of Fig. 3. The data of the selected subspace are then clustered and the clustering result is measured with CO3; part d of Fig. 3 visualizes the clustering result obtained after partitioning the dimension subspace. Comparing part d of Fig. 3 with part d of Fig. 2, from left to right, classes 1, 2 and 3 in part d of Fig. 3 have attribute values close to those of classes 1, 2 and 3 obtained by clustering on all dimensions in part d of Fig. 2, and the corresponding classes are almost identical in their temporal distribution. All three classes are distributed in the winter months of January and December. This shows that PM10, SO2 and O3 can accurately separate the winter classes. For the remaining classes, the attribute values are very close but the temporal distributions differ slightly, which shows that for these two classes, attributes other than those of the subspace PM10, SO2 and O3 also influence the clustering result.
In summary, the principle of this embodiment is as follows: for a single city, all air quality data of the city are clustered and its time-varying modes are then explored. For one year of Baoding air quality data, PM10 and SO2 are found to have large information entropy, and the mutual information among PM10, SO2 and O3 is high; these three dimensions are selected by a brushing operation, the data of the selected subspace are clustered, and the clustering result is measured with CO3.
Embodiment two, visual exploration of city-group spatiotemporal modes: 14 weeks of data, from September 28, 2015 to January 3, 2016, are selected from the map for 84 cities including Beijing, Baoding and Langfang, in order to examine the aggregation degree and continuity of the different modes of the city-group data in their spatial and temporal distribution. Referring to Fig. 4, the city group is selected by a circular brushing operation; part b of Fig. 4 is the scatter plot of the spatiotemporal distribution features of the classes after clustering, which helps users select classes of interest. Fig. 5, the view of time-varying trends between modes, helps researchers analyze the factors behind dynamic changes in air quality. Fig. 6, the spatiotemporal distribution view of the modes, helps researchers explore how the cities in different modes are distributed in space and time.
It is worth noting that exploring the spatiotemporal modes of multiple cities requires partitioning the time and space dimensions into granules at the same time, which is similar to constructing a space-time volume. Unlike the traditional space-time cube built on a grid partition, this embodiment constructs an irregular space-time volume in which geographic space is partitioned by a Voronoi diagram: the time dimension is divided according to its intuitive geometric properties, and space is divided according to the Voronoi partition.
A method is proposed for identifying the spatiotemporal characteristics of air quality modes based on concentration and continuity. Time and space are partitioned into interval bins, the distribution of each mode over the bins is counted, and concentration and continuity are computed by finding the effective bins and locating the valid intervals. When the time-varying modes of a single city are explored, concentration denotes the average occupancy of the air quality mode of that class in the effective time bins of the city, and continuity denotes the continuity degree of the air quality mode in time. When the spatiotemporal modes of a city group are explored, concentration denotes the average occupancy of the air quality mode in the effective time bins carrying geographic labels, and continuity denotes the continuity degree of the air quality mode in time and space. A mode that has high concentration and is distributed over few bins is highly representative in its spatiotemporal distribution. The concentration C1 is calculated by the following formula (the formula image is not reproduced in this text):
where B_sum denotes the number of samples of the mode, P_sum denotes the percentage of B_sum accounted for by the samples in the effective bins, and K is the number of effective bins.
For continuity, effective bins that are adjacent in time or space are joined to form a valid interval. The continuity C2 is calculated by the following formula (the formula image is not reproduced in this text):
where R denotes the number of valid intervals and K is the number of effective bins. Existing continuity measures are effective for modes in time alone or in space alone, but they do not consider the distribution over the two dimensions of space and time together. Based on the irregular space-time volume constructed above, the bins of the space-time volume are defined, and an improved seed filling algorithm is used to find the valid intervals and compute the spatiotemporal continuity.
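The grouping of effective bins into valid intervals can be sketched with a plain seed (flood) fill over space-time bins. This is not the patent's improved algorithm, whose details are not given, and it does not compute C1 or C2 because their formulas are missing from the text. It only counts K (effective bins) and R (valid intervals) under two assumptions: an effective bin is a (city, week) pair in which the mode occurs, and two bins are connected if they share a city in consecutive weeks or share a week in adjacent cities. All names are hypothetical.

```python
from collections import deque

def valid_intervals(effective_bins, neighbors):
    """Group effective (city, week) bins into connected regions.

    effective_bins: iterable of (city, week) pairs where the mode occurs.
    neighbors: dict mapping a city to the set of its adjacent cities
               (for example, cities whose Voronoi cells share an edge).
    Returns a list of sets, one set of bins per valid interval.
    """
    unvisited = set(effective_bins)
    regions = []
    while unvisited:
        seed = unvisited.pop()
        region, queue = {seed}, deque([seed])
        while queue:
            city, week = queue.popleft()
            # temporal neighbors: same city, previous or next week
            candidates = [(city, week - 1), (city, week + 1)]
            # spatial neighbors: adjacent cities in the same week
            candidates += [(other, week) for other in neighbors.get(city, ())]
            for candidate in candidates:
                if candidate in unvisited:
                    unvisited.remove(candidate)
                    region.add(candidate)
                    queue.append(candidate)
        regions.append(region)
    return regions
```

With the regions in hand, K is the total number of effective bins and R is `len(regions)`; these are the two quantities the continuity formula is stated to depend on.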
Time-varying trend visualization is often used to explore how different event sequences evolve over time. As shown in Fig. 5, the number of cities with light and moderate pollution increases after the heating period begins, and the heavy pollution mode occurs mainly during the heating period. Compared with other periods, winter (December, January and February) is the season in which haze is frequent. The root cause of the serious air pollution is the large amount of coal burned during the heating period, which raises the concentrations of sulfur dioxide, nitrogen dioxide and other pollutants. Another reason for the frequent winter haze is temperature inversion: inversion layers form easily in winter because radiative cooling at night is pronounced, which weakens atmospheric exchange and circulation, so pollutants discharged into the air accumulate in the lower atmosphere and cannot diffuse upward, which aggravates the pollution. Although winter pollution is comparatively serious, there are cases in the heating period where air quality becomes good within a short time after sustained heavy pollution: November 22 to November 24, 2015 (box on the left of Fig. 5), December 3, 2015 (box in the middle of Fig. 5) and December 16, 2015 (box on the right of Fig. 5) are turning points in the pollution modes of the winter heating period. Checking the weather on those days shows that the good conditions in the left and middle boxes coincided with heavy, long-lasting snowfall across the country; snowfall helps pollutants settle and relieves haze, so air quality improved collectively. The good air quality of many cities at the time marked by the right box was due to sustained strong wind on that day, which helped the haze disperse and made overall air quality good. Weather factors such as wind, the amount of rainfall and temperature affect air pollution to some extent.
It is worth noting that the temporal variation of the pollution modes is examined by analyzing three kinds of trend: rising, falling and steady. As shown in Fig. 7, the size of a node indicates the number of cities in the corresponding mode at the current time, and the thickness of a link indicates the size of the flow. After a sudden improvement in air quality (a turning point), air quality does not return directly to heavy pollution at the next time step but changes gradually over time.
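The node sizes and link thicknesses of such a flow view can be derived from per-time-step mode labels, as in the following sketch. The input layout (one dictionary of city-to-mode labels per time step) and the function name are assumptions made for illustration; the patent does not describe its data structures.

```python
from collections import Counter

def mode_flows(labels_by_time):
    """Compute node sizes and flows for a Sankey-style view.

    labels_by_time: list of dicts, one per time step, mapping city -> mode label.
    Returns (node_sizes, flows): node_sizes[t][mode] is the number of cities in
    that mode at step t; flows[t][(a, b)] is the number of cities that move from
    mode a at step t to mode b at step t + 1.
    """
    node_sizes = [Counter(step.values()) for step in labels_by_time]
    flows = []
    for prev, curr in zip(labels_by_time, labels_by_time[1:]):
        # only cities present at both steps contribute to a flow
        flows.append(Counter((prev[city], curr[city]) for city in prev if city in curr))
    return node_sizes, flows
```

A flow between two different modes corresponds to a rising or falling trend, depending on the pollution level of each mode, and a flow from a mode to itself corresponds to a steady trend.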
Referring to part b of Fig. 4, the scatter plot of mode spatiotemporal distribution features, modes of interest are selected. The last class in Fig. 6 has low concentration and low continuity, and few members. Compared with the other classes, its PM10 value is higher than its PM2.5 value and is significantly larger than the PM10 value of the other classes, whereas in all other classes PM2.5 is the highest attribute value. The time-varying Voronoi diagram further shows that this mode occurred mainly in two cities, Xingtai and Anyang, on December 23, 2015, and appeared again in Xingtai on January 2, 2016. News reports show that in those days severe haze pushed air pollution in the two cities off the charts and forced primary schools and kindergartens to suspend classes. For the first class, the attribute values in the radar chart show that it is the mode of good air quality; it is widely distributed in the first 9 weeks, but only a few cities remain in it after the heating period begins. In week 7 the number of cities with good air quality drops markedly and some cities shift to a light pollution mode; this light pollution mode of week 7, the second mode, can be seen to cover many cities. The second mode has high concentration and high continuity and many members; it is the normal mode of the selected region over the 14 weeks, and its distribution across cities is relatively even in the different periods.
It is worth noting that cities adjacent in geographic space are subject to similar factors such as geographic location, environment, and economic development, so their air quality may show a certain similarity. Beijing is selected in the map view to observe whether the distribution of its air pollution modes is similar to that of adjacent cities. Referring to the single-city spatiotemporal distribution view shown in Fig. 8, the time-varying Voronoi diagram is used to check how the city is distributed over the different modes and which other cities appear in the same mode as it. From the positions of the cities on the map it can be inferred that, from September 28, 2015 to October 3, 2015 and from October 8, 2015 to October 12, 2015, Beijing and its adjacent cities, including Tianjin, Langfang, Tangshan, Cangzhou, and Baoding, had similar time-varying modes. The figure also shows that before the heating period the distribution of air quality modes in Beijing is similar to that of adjacent cities.
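One simple way to quantify whether a selected city shares its modes with other cities is to count how often their mode labels coincide. The sketch below is an assumption-laden illustration: the function name `co_occurrence`, the integer mode labels, and the sample sequences are hypothetical, and the patent does not state that similarity is computed this way.

```python
def co_occurrence(target, sequences):
    """Fraction of time steps in which each other city is in the target city's mode.

    sequences maps a city name to its list of mode labels, one per time step
    (hypothetical input format).
    """
    reference = sequences[target]
    result = {}
    for city, labels in sequences.items():
        if city == target:
            continue
        # Compare only the time steps both cities have in common.
        steps = min(len(reference), len(labels))
        same = sum(1 for a, b in zip(reference, labels) if a == b)
        result[city] = same / steps if steps else 0.0
    return result


# Hypothetical mode sequences, for illustration only.
sequences = {
    "Beijing": [0, 0, 1, 2],
    "Tianjin": [0, 0, 1, 1],
    "Xingtai": [2, 2, 2, 2],
}
similarity = co_occurrence("Beijing", sequences)
```

Cities with a high fraction are candidates for having a time-varying mode similar to the selected city.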
In conclusion the implementation principle of the present embodiment is to show city during time-evolution by time-varying Voronoi diagram Group checks aggregation extent and continuity of the city on room and time in different mode in the distribution situation of different mode, opens up Show current class in the distribution situation in different moments city, wherein Voronoi diagram divides geographical zone according to city longitude and latitude and uses One Polygons Representation, the corresponding city of each polygon.
In embodiment three, a spatiotemporal mode visual analysis method based on air quality data, which is based on the spatiotemporal mode visual analysis system of embodiment one, comprises division of dimension subspaces, the Canopy+K-means clustering algorithm, and identification of air quality spatiotemporal modes by concentration and continuity. Urban air quality data are extracted and divided into dimension subspaces, and the dimension subspaces of the air quality data are processed according to information entropy and mutual information. The Canopy+K-means clustering algorithm groups data with similar features in the air quality data into the same class, thereby obtaining the spatiotemporal modes of the classes. Concentration and continuity are then used to judge, respectively, the degree of aggregation and the degree of continuity of the distribution of each spatiotemporal mode.
The purpose of dimension subspace division: to guide the user in exploring the dimension subspaces of multivariate air quality data according to information entropy and mutual information. A multivariate information view is designed and implemented to show the information entropy of each variable and the mutual information between variables. The view uses a node-link diagram with a force-directed layout: each node represents one of the 6 pollutants in the air quality data, node size encodes the information entropy of the variable, and link width encodes the mutual information between the two linked variables. Based on the information entropy values and the mutual information between variables, the user selects a dimension subspace of interest by brushing and explores how changing the data in the subspace affects the distribution of data modes.
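The node sizes and link widths of this view depend on entropy and mutual information estimates. A minimal sketch of how they might be computed is given below, assuming equal-width binning of the pollutant values and base-2 logarithms; neither choice is specified in the text, and the function names are hypothetical.

```python
import math
from collections import Counter


def discretize(values, bins):
    """Map continuous values to equal-width bin indices (an assumed binning)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid a zero width for constant data
    return [min(int((v - lo) / width), bins - 1) for v in values]


def entropy(xs):
    """Shannon entropy H(X) = -sum p(x) log2 p(x) of a discrete sequence."""
    n = len(xs)
    return -sum((c / n) * math.log2(c / n) for c in Counter(xs).values())


def mutual_information(xs, ys):
    """I(X;Y) = sum p(x,y) log2( p(x,y) / (p(x) p(y)) ) of two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


# Hypothetical readings, for illustration only.
pm25_bins = discretize([10, 20, 30, 40], 2)
node_size = entropy(pm25_bins)
link_width = mutual_information(pm25_bins, pm25_bins)
```

A variable paired with itself has mutual information equal to its entropy, while two independent variables have mutual information close to zero, which is why thick links suggest redundant dimensions.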
The purpose of the Canopy+K-means clustering algorithm: to group data with similar features in the air quality data into the same class, thereby identifying the spatiotemporal modes of the classes. The Canopy algorithm is first used to obtain the initial cluster centers and the number of clusters K for the K-means clustering algorithm, and K-means clustering is then used to refine the clustering result. For a single city, the air quality data of that city within the time range are clustered at daily granularity; for a city group, all air quality data of the city group within the selected time range are clustered.
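The two-stage idea can be sketched as follows. This is an illustrative sketch, not the patented implementation: the Canopy pass uses the common seed-driven formulation, whose iteration order may differ from the steps recited in claim 4, and the thresholds, function names, and sample points are assumptions.

```python
import math


def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def canopy_centers(data, t1, t2):
    """Canopy pass with thresholds T1 > T2; returns one center per canopy."""
    assert t1 > t2
    remaining = list(data)
    canopies = []
    while remaining:
        seed = remaining.pop(0)
        members = [seed]
        unmarked = []
        for p in remaining:
            d = dist(seed, p)
            if d < t1:
                members.append(p)   # loosely bound: joins this canopy
            if d >= t2:
                unmarked.append(p)  # not tightly bound: stays available
        remaining = unmarked
        canopies.append(members)
    return [mean(c) for c in canopies]


def kmeans(data, centers, max_iter=100):
    """K-means refinement starting from the Canopy centers."""
    centers = list(centers)
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in data:
            idx = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
            clusters[idx].append(p)
        new_centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
        if new_centers == centers:
            break  # centers no longer change
        centers = new_centers
    return centers, clusters


# Hypothetical two-dimensional samples, for illustration only.
data = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9), (8.5, 9.5)]
initial = canopy_centers(data, t1=4.0, t2=2.0)
centers, clusters = kmeans(data, initial)
```

The number of canopies fixes K, so the user does not have to choose it by hand; the thresholds T1 and T2 take over that role.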
The purpose of identifying air quality spatiotemporal modes by concentration and continuity: concentration and continuity measure, respectively, the degree of aggregation and the degree of continuity of a mode's distribution. To explore the time-varying modes of a single city, a time-varying spiral chart, a dandelion chart, and a radar chart are used to show the temporal distribution of modes at daily granularity. Each dandelion icon represents one class. The flower at the top of the dandelion shows the temporal distribution of the mode at daily granularity. Modes are sorted by the maximum IAQI in the mode, with the maximum IAQI decreasing from left to right, so that the pollution level ranges from serious to good. The inner radar chart shows the mean of each attribute, that is, the mean IAQI of each pollutant. The pie chart around the radar chart shows the temporal distribution of the mode at weekly granularity, helping the user observe the degree of continuity and concentration of the class over time. The time axis of the pie chart starts at the 0 o'clock position of a clock face and proceeds clockwise. To explore the spatiotemporal modes of a city group, a time-varying Voronoi diagram, a Sankey diagram, and a radar chart are used to show how modes change over time. The Voronoi diagram divides geographic space according to city longitude and latitude and represents each region by a polygon, one polygon per city. Each row shows the distribution of one mode in geographic space over time. Each Voronoi diagram shows how frequently each city appears in the current mode within one time bin; the redder the color, the higher the frequency. The Sankey diagram analyzes how often the current city and other cities appear in the same mode, and thus the similarity of their mode distributions. The radar chart shows the mean of each pollutant in the class to visualize the pollution mode of the current class. Modes are sorted by the maximum IAQI among the six pollutants; the maximum pollutant value increases from top to bottom, indicating increasingly serious pollution.
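The text does not give formulas for concentration and continuity, so the sketch below only illustrates one possible reading based on counting a mode's occurrences per time bin. All definitions here are assumptions: a bin is treated as effective when the mode occurs in it at least `min_count` times, concentration is taken as the fill ratio of the effective bins, and continuity as the longest run of consecutive effective bins divided by the number of effective bins.

```python
def concentration_and_continuity(counts, capacity, min_count=1):
    """Illustrative concentration and continuity of one mode over time bins.

    counts[i] is the number of times the mode occurs in bin i, and capacity
    is the maximum possible count per bin (for example 7 days per weekly bin).
    These definitions are assumptions, not the patent's formulas.
    """
    effective = [c >= min_count for c in counts]
    n_effective = sum(effective)
    if n_effective == 0:
        return 0.0, 0.0
    # Concentration: how densely the mode fills its effective bins.
    total = sum(c for c, e in zip(counts, effective) if e)
    concentration = total / (n_effective * capacity)
    # Continuity: longest run of consecutive effective bins, normalized.
    longest = run = 0
    for e in effective:
        run = run + 1 if e else 0
        longest = max(longest, run)
    continuity = longest / n_effective
    return concentration, continuity


# Hypothetical weekly counts of one mode, for illustration only.
weekly_counts = [0, 5, 6, 7, 0, 0, 1]
conc, cont = concentration_and_continuity(weekly_counts, capacity=7)
```

Under these assumed definitions, a mode that fills a few adjacent weeks scores high on both measures, while a mode that appears on isolated days scattered over many weeks scores low on both.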
It is worth noting that the views are designed to characterize modes with specific data distribution features, to explore regular modes and abnormal modes in the spatiotemporal features, and to analyze these modes further in order to extract valuable information. Theoretical value: multivariate spatiotemporal data sets are limited by dimensionality, are difficult to present visually, and have complex spatiotemporal modes that are hard to analyze; to address these problems, visual analysis views that support exploration from multiple angles are designed. Practical significance: using visualization and visual analysis to probe the spatiotemporal mode distribution of air quality data helps analysts analyze the normal modes of air quality data intuitively and comprehensively, mine hidden data modes, explore mode distribution features and time-varying trends, and obtain decision support.
In conclusion the implementation principle of the present embodiment is to be clustered by the division of dimension subspace, Canopy+K-means Algorithm and concentration and continuity identify air quality spatiotemporal mode, carry out dimension by extracting urban air-quality data The division in space handles the dimension subspace of the air quality data according to comentropy and mutual information;Pass through institute Data in the air quality data with similar features are divided into same class by Canopy+K-means clustering algorithm, from And obtain the spatiotemporal mode of class;By to concentration and continuity in the concentration and continuity identification air quality spatiotemporal mode The aggregation extent being distributed in the spatiotemporal mode and continuity degree are judged respectively.

Claims (9)

1. A spatiotemporal mode visual analysis system based on air quality data, characterized by comprising a data preprocessing module, a data analysis module, and a visualization module, wherein the data preprocessing module selects urban air quality data along the spatiotemporal and attribute dimensions; the data analysis module extracts features of the urban air quality data by similarity clustering; and the visualization module provides multiple visualization views designed according to the features extracted by the data analysis module, for interactively exploring the distribution features of data at different times and in different spaces, the visualization views comprising a class spatiotemporal attribute scatter plot, a spatiotemporal mode distribution view, a flow view, and a time-varying feature view, wherein the class spatiotemporal attribute scatter plot allows modes to be selected according to the degree of aggregation and the degree of continuity of their spatiotemporal distribution, the spatiotemporal mode distribution view is used to find cities that frequently appear together in the same mode, the flow view is used to explore time-varying trends between modes, the time-varying feature view is used to find pollution modes concentrated in continuous time periods, and the analysis results are coordinated through multi-view linkage.
2. The spatiotemporal mode visual analysis system based on air quality data according to claim 1, characterized in that the data preprocessing module collects air quality data from multiple ground monitoring stations in a city, each air quality record comprising concentration information of six pollutants, PM2.5, PM10, NO2, SO2, O3, and CO, and geographic location information of the ground monitoring station.
3. The spatiotemporal mode visual analysis system based on air quality data according to claim 2, characterized in that the location information of the ground monitoring station comprises the longitude and latitude of the ground monitoring station.
4. The spatiotemporal mode visual analysis system based on air quality data according to claim 1, characterized in that the data analysis module selects a city group at the spatial scale and selects a time period at different time granularities, and clusters the corresponding air quality data using a clustering algorithm, the clustering algorithm comprising using the Canopy algorithm to obtain the initial cluster centers and the number of clusters K of the K-means clustering algorithm, the clustering algorithm being as follows:
Input: air quality data D; initial distance thresholds T1 and T2, with T1 > T2
Output: center point set N obtained by Canopy clustering; cluster set C
Step1: traverse all samples in D and compute the distance Dis between samples using the Euclidean distance;
Step2: take a sample d from D, use d as the first class, i.e. the first Canopy, and mark d;
Step3: continue to take a new sample P from D and look up the distance Dis between P and each Canopy;
Step4: if Dis < T1, assign P to that Canopy; if the distance Dis from P to every Canopy is greater than T1, make P a new Canopy; if the distance from P to a Canopy is less than T2, mark P;
Step5: repeat Step3 to Step4 until all points in D are marked, at which point the Canopy algorithm ends;
Step6: compute the mean of the sample attributes of each Canopy as the center of that Canopy, and output the center point set N;
Step7: compute the distance Dis from every sample point to each center point using the Euclidean distance;
Step8: assign each sample point d to the cluster nearest to it;
Step9: compute the mean vector of the sample points in each cluster and use that vector as the new class center point;
Step10: repeat Step7 to Step9 until the center points no longer change;
Step11: output the cluster set C;
Step12: the algorithm ends;
K-means clustering is thus used to further refine the clustering result.
5. The spatiotemporal mode visual analysis system based on air quality data according to claim 1, characterized in that, in the visualization module, a time-varying spiral chart, a dandelion chart, and a radar chart are used for a single city to show the temporal distribution of modes at daily granularity.
6. The spatiotemporal mode visual analysis system based on air quality data according to claim 1, characterized in that, in the visualization module, a time-varying Voronoi diagram, a Sankey diagram, and a radar chart are used for a city group to show how modes change over time.
7. A spatiotemporal mode visual analysis method based on air quality data, characterized by being based on the spatiotemporal mode visual analysis system based on air quality data according to claim 1 and comprising division of dimension subspaces, the Canopy+K-means clustering algorithm, and identification of air quality spatiotemporal modes by concentration and continuity, wherein urban air quality data are extracted and divided into the dimension subspaces, and the dimension subspaces of the air quality data are processed according to information entropy and mutual information; the Canopy+K-means clustering algorithm groups data with similar features in the air quality data into the same class, thereby obtaining the spatiotemporal modes of the classes; and the concentration and continuity are used to judge, respectively, the degree of aggregation and the degree of continuity of the distribution of the spatiotemporal modes.
8. The spatiotemporal mode visual analysis method based on air quality data according to claim 7, characterized in that time and space are divided into interval bins, the distribution of the different modes over the interval bins is counted, the effective interval bins are determined, and the effective interval bins are located in order to calculate the concentration and continuity.
9. The spatiotemporal mode visual analysis method based on air quality data according to claim 7, characterized in that the information entropy is obtained by the following calculation:
For the air quality data X, the information entropy H(X) is defined as
$$H(X) = -\sum_{x \in X} p(x) \log p(x)$$
where x is a sample point of X and p(x) is the probability of x; the mutual information is obtained by the following calculation:
$$I(X;Y) = \sum_{x \in X} \sum_{y \in Y} p(x,y) \log \frac{p(x,y)}{p(x)\,p(y)}$$
For two sets of the air quality data X and Y, x is a sample point of X, y is a sample point of Y, and p(x, y) is the joint probability density of the scalar values x and y.
CN201910678017.5A 2019-07-25 2019-07-25 A kind of spatiotemporal mode visual analysis system and method based on air quality data Pending CN110389982A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910678017.5A CN110389982A (en) 2019-07-25 2019-07-25 A kind of spatiotemporal mode visual analysis system and method based on air quality data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910678017.5A CN110389982A (en) 2019-07-25 2019-07-25 A kind of spatiotemporal mode visual analysis system and method based on air quality data

Publications (1)

Publication Number Publication Date
CN110389982A true CN110389982A (en) 2019-10-29

Family

ID=68287526

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910678017.5A Pending CN110389982A (en) 2019-07-25 2019-07-25 A kind of spatiotemporal mode visual analysis system and method based on air quality data

Country Status (1)

Country Link
CN (1) CN110389982A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105426461A (en) * 2015-11-12 2016-03-23 中国科学院遥感与数字地球研究所 Map visualization system and method for performing knowledge mining based on big spatial data
CN107038236A (en) * 2017-04-19 2017-08-11 合肥学院 A kind of air quality data visualization system
CN107588462A (en) * 2017-10-11 2018-01-16 西安交通大学 A kind of haze controlling device and administering method
CN108564110A (en) * 2018-03-26 2018-09-21 上海电力学院 A kind of Air Quality Forecast method based on clustering algorithm

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
王程: "基于参与感知的Web气象服务系统", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *
王蓉: "基于空气质量数据的时空模式可视分析", 《中国优秀硕士学位论文全文数据库 工程科技Ⅰ辑》 *
胡亚娟: "基于城市群的空气质量数据的可视分析方法研究", 《中国优秀硕士学位论文全文数据库 工程科技Ⅰ辑》 *

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110895526A (en) * 2019-11-29 2020-03-20 南京信息工程大学 Method for correcting data abnormity in atmosphere monitoring system
CN111125194A (en) * 2019-12-25 2020-05-08 中国建筑科学研究院有限公司 Data construction method and device applied to city-level clean heating
CN111783832B (en) * 2020-06-03 2022-07-15 浙江工业大学 Interactive selection method of space-time data prediction model
CN111783832A (en) * 2020-06-03 2020-10-16 浙江工业大学 Interactive selection method of space-time data prediction model
CN111639243A (en) * 2020-06-04 2020-09-08 东北师范大学 Space-time data progressive multi-dimensional mode extraction and anomaly detection visual analysis method
CN111639243B (en) * 2020-06-04 2021-03-09 东北师范大学 Space-time data progressive multi-dimensional mode extraction and anomaly detection visual analysis method
CN111581325A (en) * 2020-07-13 2020-08-25 重庆大学 K-means station area division method based on space-time influence distance
CN111581325B (en) * 2020-07-13 2021-02-02 重庆大学 K-means station area division method based on space-time influence distance
CN111899106A (en) * 2020-08-06 2020-11-06 天津大学 Visual analysis system for futures big data
CN112634113A (en) * 2020-12-22 2021-04-09 山西大学 Polluted waste gas correlation analysis method based on dynamic sliding window
CN112634113B (en) * 2020-12-22 2023-09-26 山西大学 Pollution waste gas correlation analysis method based on dynamic sliding window
CN113009086A (en) * 2021-03-08 2021-06-22 重庆邮电大学 Method for exploring urban atmospheric pollutant source based on backward trajectory mode
CN113269675A (en) * 2021-05-18 2021-08-17 东北师范大学 Time-variant data time super-resolution visualization method based on deep learning model
CN113269675B (en) * 2021-05-18 2022-05-13 东北师范大学 Time-variant data time super-resolution visualization method based on deep learning model
CN113326472B (en) * 2021-05-28 2022-07-15 东北师范大学 Pattern extraction and evolution visual analysis method based on time sequence multivariable data
CN113326472A (en) * 2021-05-28 2021-08-31 东北师范大学 Pattern extraction and evolution visual analysis method based on time sequence multivariable data
CN114062618B (en) * 2022-01-17 2022-03-18 京友科技(深圳)有限公司 Internet of things-based system for realizing air monitoring and gathering in multiple environments
CN114062618A (en) * 2022-01-17 2022-02-18 京友科技(深圳)有限公司 Internet of things-based system for realizing air monitoring and gathering in multiple environments
CN115983725A (en) * 2023-03-20 2023-04-18 四川国蓝中天环境科技集团有限公司 Pollutant spatial distribution trend mining method based on mobile station monitoring data
CN115983725B (en) * 2023-03-20 2023-06-16 四川国蓝中天环境科技集团有限公司 Pollutant space distribution trend mining method based on mobile station monitoring data
CN117114245A (en) * 2023-10-18 2023-11-24 长春市联心花信息科技有限公司 Urban data integration method based on digital twinning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination