CN106991525B - Air quality and resident trip visual analysis method and system - Google Patents

Air quality and resident trip visual analysis method and system

Info

Publication number
CN106991525B
Authority
CN
China
Prior art keywords
poi
data
activity
air quality
weight
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710173669.4A
Other languages
Chinese (zh)
Other versions
CN106991525A (en
Inventor
谢波
姜波
潘伟丰
王家乐
殷骏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Gongshang University
Original Assignee
Zhejiang Gongshang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Gongshang University
Priority to CN201710173669.4A
Publication of CN106991525A
Application granted
Publication of CN106991525B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services

Landscapes

  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Tourism & Hospitality (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Development Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Educational Administration (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Game Theory and Decision Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Instructional Devices (AREA)

Abstract

The invention discloses a big-data-driven visual analysis method and system for air quality and resident travel. The method comprises the following steps: (1) reconstructing the original air quality data, temperature data, POI data and taxi-hailing difficulty data; (2) calculating the POI weighted activity and the offset rate, where the POI weighted activity reflects the flow of people around a POI and the offset rate reflects how the POI weighted activity changes; (3) clustering POIs of the same type; and (4) visually analyzing air quality and resident travel. The invention offers low cost, simple maintenance, rapid deployment and varied visual interface interaction, and gives each user a multi-granularity analysis of air quality and resident travel.

Description

Air quality and resident trip visual analysis method and system
Technical Field
The invention relates to a big-data-driven visual analysis method and system for air quality and resident travel.
Background
With the advance of industrialization in China, air pollution from industrial emissions, mainly sulfur oxides (SOx), nitrogen oxides (NOx), ozone (O3), carbon oxides (COx) and particulate matter (particle sizes of 10 microns or less and of 2.5 microns or less), has become increasingly serious and greatly affects people's daily travel and life.
With the development of science and technology, data are collected and stored in large quantities and data volumes are growing explosively, so extracting valuable information from the data has become an urgent problem. Faced with large and complex data, traditional data mining and data analysis methods are inadequate for exploring it. To obtain the value contained in the data, various data analysis and mining methods have been applied.
An effective method for solving these problems is therefore needed. In recent years, visual analysis, the science of analytical reasoning supported by interactive visual interfaces, has provided a new means for data mining and data analysis. Because it is interactive and visual, it is popular with researchers and has gradually become a research hotspot.
Visual research on air quality and resident travel is therefore important for studying the relationship between the two: it can provide an important reference for exploring residents' travel behavior and can also draw the attention of transportation, medical and other relevant departments to air quality. Such research thus has significant value in both theory and practical application.
Disclosure of Invention
To address the problems of analyzing air quality and resident travel, the invention provides a big-data-driven visual analysis method and system for air quality and resident travel that better helps transportation, medical and other departments carry out this analysis. The visual analysis system helps users analyze the characteristics of air quality and of resident travel by displaying an air quality bar chart, a temperature box plot, a stacked graph and stream graph of POI (point of interest) activity, a calendar heat map of the POI activity offset rate, and a multidimensional histogram, so that users can explore urban air quality and resident travel. The purpose of the invention is achieved by the following technical scheme: a big-data-driven visual analysis method for air quality and resident travel, comprising the following steps:
(1) Reconstructing the original air quality data, temperature data, POI data and taxi-hailing difficulty data: first, the air quality data, temperature data, POI data and taxi-hailing difficulty data are each cleaned and sorted. Data cleaning mainly consists of finding and removing abnormal values and missing values in each data source; all data are then sorted chronologically by timestamp, which facilitates the subsequent visualization of time-series data. The taxi-hailing difficulty data comprise the geographic coordinates and weights of the taxi-hailing difficulty distribution points. The POI data comprise the geographic coordinates of the POI distribution points and the POI types.
(2) Calculating the POI weighted activity and the offset rate: the POI weighted activity reflects the flow of people around a POI; the offset rate reflects how the POI weighted activity changes.
The POI weighted activity is calculated as follows:
(2.1) Calculate the Euclidean distance between each taxi-hailing difficulty distribution point and each POI distribution point and determine whether it is smaller than a preset threshold T; if so, take the weight of the taxi-hailing difficulty distribution point as the activity weight of that POI.
(2.2) For each POI type, sum the activity of all POIs of that type, and take the sum as the weighted activity of that POI type.
The offset rate of the POI weighted activity is calculated as follows:
$\mathrm{Offset}_t = (\mathrm{POIWeight}_t - \mathrm{Aver}_{week,hour}) / (\mathrm{POIWeight}_t) - 1$
where $\mathrm{Aver}_{week,hour}$ is the average POI weighted activity for each hour of each week, $\mathrm{POIWeight}_t$ is the POI weighted activity for the current hour, and $\mathrm{Offset}_t$ is the offset rate.
(3) Clustering POIs of the same type: find all POI distribution points whose Euclidean distance from each taxi-hailing difficulty distribution point is less than or equal to T, denoted POI_didi. Tally the POI distribution points of the same type in POI_didi, compute the position of their cluster center, and set the weight of the taxi-hailing difficulty distribution point as the weight of the cluster center. The POI distribution points are clustered with a k-means-based clustering algorithm, and the newly computed longitude and latitude of each cluster center are taken as the longitude and latitude of the POI center position.
(4) Visual analysis of air quality and resident travel, specifically:
(4.1) Color visual coding: because air quality index (AQI) values differ, colors are mapped with a dynamic scheme, that is, the color is adjusted dynamically according to the AQI value:
Figure BDA0001251762300000031
where Color_rect is the fill color of the rectangle.
(4.2) Bar-box plot analysis component: the air quality index of each day is shown as a rectangle, and the order of the rectangles from left to right follows the order of the days; the fill color of each rectangle is determined by the scheme of step 4.1, and its height by the air quality index (AQI). The box plot represents the hourly temperatures of each week; from left to right it shows the dates and times of the week. The upper and lower dashed lines of the box plot represent the upper-quarter and lower-quarter data ranges respectively, the small rectangle in the center represents the data range from the first quartile to the third quartile, and the horizontal line inside the small rectangle marks the median of the data.
(4.3) Stream graph-stacked graph analysis component: the abscissa of both the stacked graph and the stream graph is time at hourly resolution, with the week as the basic scale. The ordinate is the POI weighted activity value. The stacked graph represents different POI types as area charts in different colors, laid out on one side of the coordinate axis, and shows how the weighted activity of one or more POIs changes within a specified time range. The stream graph is laid out on both sides of the coordinate axis and likewise shows how the weighted activity of one or more POIs changes within the specified time range.
(4.4) Scatter matrix-GeoMap-calendar heat map analysis component: the scatter matrix extends the scatter plot to high-dimensional data and is used to display air quality, temperature and POI weighted activity. The calendar heat map presents multidimensional data in two-dimensional form, with the magnitude of each value encoded by the shade of its color; it shows how the offset rate of the weighted activity of a given POI changes under different air quality and temperature conditions. The GeoMap displays the activity weights and geographic distribution of the clusters of POIs of the same type.
A big-data-driven visual analysis system for air quality and resident travel comprises the following components:
(1) Bar-box plot analysis component: the air quality index of each day is shown as a rectangle, and the order of the rectangles from left to right follows the order of the days; the height of each rectangle is determined by the air quality index (AQI), and the fill color uses a dynamic mapping scheme, that is, the color is adjusted dynamically according to the AQI value:
Figure BDA0001251762300000041
where Color_rect is the fill color of the rectangle.
The box plot represents the hourly temperatures of each week; from left to right it shows the dates and times of the week. The upper and lower dashed lines of the box plot represent the upper-quarter and lower-quarter data ranges respectively, the small rectangle in the center represents the data range from the first quartile to the third quartile, and the horizontal line inside the small rectangle marks the median of the data.
(2) Stream graph-stacked graph analysis component: the abscissa of both the stacked graph and the stream graph is time at hourly resolution, with the week as the basic scale. The ordinate is the POI weighted activity value. The stacked graph represents different POI types as area charts in different colors, laid out on one side of the coordinate axis, and shows how the weighted activity of one or more POIs changes within a specified time range. The stream graph is laid out on both sides of the coordinate axis and likewise shows how the weighted activity of one or more POIs changes within the specified time range. The POI weighted activity is calculated as follows:
(2.1) Calculate the Euclidean distance between each taxi-hailing difficulty distribution point and each POI distribution point and determine whether it is smaller than a preset threshold T; if so, take the weight of the taxi-hailing difficulty distribution point as the activity weight of that POI.
(2.2) For each POI type, sum the activity of all POIs of that type, and take the sum as the weighted activity of that POI type.
(3) Scatter matrix-GeoMap-calendar heat map analysis component: the scatter matrix extends the scatter plot to high-dimensional data and is used to display air quality, temperature and POI weighted activity. The calendar heat map presents multidimensional data in two-dimensional form, with the magnitude of each value encoded by the shade of its color; it shows how the offset rate of the weighted activity of a given POI changes under different air quality and temperature conditions. The GeoMap displays the activity weights and geographic distribution of the clusters of POIs of the same type.
The activity weight of a cluster of POIs of the same type is calculated as follows: find all POI distribution points whose Euclidean distance from each taxi-hailing difficulty distribution point is less than or equal to T, denoted POI_didi. Tally the POI distribution points of the same type in POI_didi, compute the position of their cluster center, and set the weight of the taxi-hailing difficulty distribution point as the weight of the cluster center. The POI distribution points are clustered with a k-means-based clustering algorithm, and the newly computed longitude and latitude of each cluster center are taken as the longitude and latitude of the POI center position.
The beneficial effects of the invention are as follows. Unlike traditional air quality visualization, the invention visualizes air quality together with resident travel data, so that a user can explore, from global to local and back to global, how air quality changes the activity of different areas of a city, and can thereby analyze how air quality affects residents' travel destinations. The interactive design lowers the cost of using the system for an analyst and gives a good display effect, and the system can reveal various patterns of air quality and resident travel at four levels: air quality, temperature, POI weighted activity and offset rate.
Drawings
FIG. 1 shows the bar-box plot analysis component;
FIG. 2 shows the stream graph-stacked graph analysis component;
FIG. 3 shows the scatter matrix-GeoMap-calendar heat map analysis component;
FIG. 4 is a front-end dependency diagram of the system.
Detailed Description
The following detailed description is made with reference to the embodiments and the accompanying drawings.
The invention is based on the following data. The air quality data are published by environmental protection administrative departments at various levels and above, or by environmental monitoring stations authorized by them, and comprise daily reports and hourly reports. The hourly reports have a period of 1 hour: each monitoring station publishes a real-time report on the hour, whose indicators include the concentrations of SO2, NO2, O3, CO, PM2.5 and PM10; the daily reports give the 24-hour mean concentrations of SO2, NO2, O3, CO, PM2.5 and PM10 for the day. The atmospheric environment data are published by meteorological administrative departments at various levels and above, or by meteorological monitoring stations authorized by them, and likewise comprise daily reports and hourly reports. The hourly reports have a period of 1 hour: each monitoring station publishes a real-time report on the hour, whose indicators include air pressure, temperature, humidity, precipitation, wind direction and other data; the daily reports give the 24-hour averages of air pressure, temperature, humidity, precipitation and wind direction for the day. The resident travel data are taxi-hailing difficulty data provided by the Didi Cangqiong big data platform, with a period of 1 hour: the taxi-hailing difficulty at different locations is provided on the hour, and each on-the-hour record includes longitude, latitude and taxi-hailing difficulty. The POI distribution data are detailed POI data comprising the POI address, POI name, POI longitude, POI latitude and POI type.
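As an illustration of how the daily reports relate to the hourly reports, the following minimal Python sketch averages hourly concentration readings into a daily mean. The function name `daily_means`, the record layout and the sample values are assumptions made for this example and are not taken from the patent.

```python
from collections import defaultdict
from statistics import fmean

def daily_means(hourly):
    """Average hourly readings per calendar day.

    hourly: list of (ISO timestamp string, value); the layout is hypothetical.
    """
    by_day = defaultdict(list)
    for stamp, value in hourly:
        by_day[stamp[:10]].append(value)   # "YYYY-MM-DD" prefix identifies the day
    return {day: fmean(values) for day, values in by_day.items()}

# Illustrative PM2.5 readings (made-up numbers).
pm25_hourly = [
    ("2016-03-01T00:00", 30.0),
    ("2016-03-01T01:00", 50.0),
    ("2016-03-02T00:00", 20.0),
]
pm25_daily = daily_means(pm25_hourly)
```

A plain arithmetic mean suits concentrations, pressure or temperature; averaging wind direction would need a circular mean, which this sketch does not attempt.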
The invention provides a big-data-driven visual analysis method for air quality and resident travel, comprising the following steps:
(1) Reconstructing the original air quality data, temperature data, POI data and taxi-hailing difficulty data: first, the air quality data, temperature data, POI data and taxi-hailing difficulty data are each cleaned and sorted. Data cleaning mainly consists of finding and removing abnormal values and missing values in each data source; all data are then sorted chronologically by timestamp, which facilitates the subsequent visualization of time-series data. The taxi-hailing difficulty data comprise the geographic coordinates and weights of the taxi-hailing difficulty distribution points. The POI data comprise the geographic coordinates of the POI distribution points and the POI types.
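The cleaning and sorting of step (1) could be sketched as below. This is only a minimal example under stated assumptions: the record layout, the function name `clean_and_sort` and the plausibility range used to detect abnormal values are hypothetical, since the text does not say how abnormal values are identified.

```python
from datetime import datetime

def clean_and_sort(records, value_key, lo, hi):
    """Drop records with missing or out-of-range values, then sort by timestamp.

    The [lo, hi] plausibility range is an assumption of this sketch.
    """
    kept = []
    for rec in records:
        value = rec.get(value_key)
        if value is None:               # missing value
            continue
        if not (lo <= value <= hi):     # abnormal value
            continue
        kept.append(rec)
    return sorted(kept, key=lambda rec: datetime.fromisoformat(rec["time"]))

raw = [
    {"time": "2016-03-01T02:00:00", "aqi": 80},
    {"time": "2016-03-01T00:00:00", "aqi": 95},
    {"time": "2016-03-01T01:00:00", "aqi": None},   # missing
    {"time": "2016-03-01T03:00:00", "aqi": -5},     # abnormal
]
cleaned = clean_and_sort(raw, "aqi", lo=0, hi=500)
```

The same routine can be applied to each data source in turn by changing `value_key` and the range.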
(2) Calculating the POI weighted activity and the offset rate: the POI weighted activity reflects the flow of people around a POI; the offset rate reflects how the POI weighted activity changes.
The POI weighted activity is calculated as follows:
(2.1) Calculate the Euclidean distance between each taxi-hailing difficulty distribution point and each POI distribution point and determine whether it is smaller than a preset threshold T, where T may be 0.5 km; if so, take the weight of the taxi-hailing difficulty distribution point as the activity weight of that POI.
(2.2) For each POI type, sum the activity of all POIs of that type, and take the sum as the weighted activity of that POI type.
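Steps (2.1) and (2.2) can be sketched in Python as follows. The sketch assumes that coordinates have already been projected to planar kilometres so that a Euclidean distance can be compared directly with T = 0.5 km; the function name `poi_weighted_activity` and the tuple layouts are hypothetical.

```python
import math
from collections import defaultdict

def poi_weighted_activity(taxi_points, pois, threshold_km=0.5):
    """Assign taxi-hailing weights to nearby POIs and total them per POI type.

    taxi_points: list of (x_km, y_km, weight)    # layout is an assumption
    pois:        list of (x_km, y_km, poi_type)  # layout is an assumption
    """
    activity_by_type = defaultdict(float)
    for tx, ty, weight in taxi_points:
        for px, py, poi_type in pois:
            # Step 2.1: a POI closer than T receives the point's weight.
            if math.dist((tx, ty), (px, py)) < threshold_km:
                # Step 2.2: accumulate the activity per POI type.
                activity_by_type[poi_type] += weight
    return dict(activity_by_type)

taxi_points = [(0.0, 0.0, 3.0), (2.0, 2.0, 1.5)]
pois = [(0.1, 0.1, "restaurant"), (0.2, 0.0, "school"),
        (2.1, 2.0, "restaurant"), (5.0, 5.0, "hospital")]
activity = poi_weighted_activity(taxi_points, pois)
```

The double loop is quadratic in the number of points; for city-scale data a spatial index would be a natural replacement, but that is outside what the text describes.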
The offset rate of the POI weighted activity is calculated as follows:
$\mathrm{Offset}_t = (\mathrm{POIWeight}_t - \mathrm{Aver}_{week,hour}) / (\mathrm{POIWeight}_t) - 1$
where $\mathrm{Aver}_{week,hour}$ is the average POI weighted activity for each hour of each week, $\mathrm{POIWeight}_t$ is the POI weighted activity for the current hour, and $\mathrm{Offset}_t$ is the offset rate.
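A small numeric sketch of the offset rate follows. It reads the trailing "- 1" in the formula as a subtraction, which is an interpretation of the extracted text rather than a confirmed reading; the helper names, the way the historical average is formed and the zero-activity handling are likewise assumptions.

```python
from statistics import fmean

def slot_average(history):
    """Mean weighted activity over past observations of the same week-hour slot."""
    return fmean(history)

def offset_rate(poi_weight_t, aver_week_hour):
    """Offset_t = (POIWeight_t - Aver_{week,hour}) / POIWeight_t - 1, as read from the text."""
    if poi_weight_t == 0:
        return None   # undefined for zero activity; returning None is an assumption
    return (poi_weight_t - aver_week_hour) / poi_weight_t - 1

history = [3.0, 4.0, 5.0]          # made-up past values for one slot
aver = slot_average(history)
rate = offset_rate(8.0, aver)
```

With these illustrative numbers the average is 4.0 and the current activity is 8.0, so the expression evaluates to (8 - 4) / 8 - 1 = -0.5.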
(3) Clustering POIs of the same type: find all POI distribution points whose Euclidean distance from each taxi-hailing difficulty distribution point is less than or equal to T, denoted POI_didi. Tally the POI distribution points of the same type in POI_didi, compute the position of their cluster center, and set the weight of the taxi-hailing difficulty distribution point as the weight of the cluster center. The POI distribution points are clustered with a k-means-based clustering algorithm, and the newly computed longitude and latitude of each cluster center are taken as the longitude and latitude of the POI center position.
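The clustering step can be illustrated with a plain Lloyd-style k-means on the coordinates of same-type POIs. The text only says that a k-means-based algorithm is used, so the initialization (first k points), the iteration cap and the treatment of longitude and latitude as planar coordinates are all assumptions of this sketch.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means on 2-D points; centers start at the first k points (an assumption)."""
    centers = [points[i] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        new_centers = []
        for i, members in enumerate(clusters):
            if members:
                new_centers.append((sum(m[0] for m in members) / len(members),
                                    sum(m[1] for m in members) / len(members)))
            else:
                new_centers.append(centers[i])   # keep the center of an empty cluster
        if new_centers == centers:               # converged
            break
        centers = new_centers
    return centers

# Made-up (longitude, latitude) pairs for POIs of one type.
restaurants = [(120.10, 30.20), (120.12, 30.22), (120.50, 30.60), (120.52, 30.62)]
centers = kmeans(restaurants, k=2)
```

For these four points the centers settle near (120.11, 30.21) and (120.51, 30.61), which would then serve as the POI center positions shown on the map.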
(4) Visual analysis of air quality and resident travel, specifically:
(4.1) Color visual coding: because air quality index (AQI) values differ, colors are mapped with a dynamic scheme, that is, the color is adjusted dynamically according to the AQI value:
Figure BDA0001251762300000071
where Color_rect is the fill color of the rectangle.
(4.2) Bar-box plot analysis component: the air quality index of each day is shown as a rectangle, and the order of the rectangles from left to right follows the order of the days; the fill color of each rectangle is determined by the scheme of step 4.1, and its height by the air quality index (AQI). The box plot represents the hourly temperatures of each week; from left to right it shows the dates and times of the week. The dashed lines of the box plot represent the upper-quarter and lower-quarter data ranges respectively, the small rectangle in the center represents the data range from the first quartile to the third quartile, and the horizontal line inside the small rectangle marks the median of the data, as shown in FIG. 1.
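The quantities that the box plot encodes can be computed as in the sketch below. It assumes the dashed lines run to the minimum and maximum (no outlier rule is stated in the text) and uses the inclusive quartile method of Python's `statistics.quantiles`; the function name `box_stats` and the sample temperatures are illustrative only.

```python
from statistics import quantiles

def box_stats(values):
    """Five-number summary behind a box plot: whisker ends, quartiles and median."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median,
            "q3": q3, "max": max(values)}

temps = [10, 12, 14, 16, 18]   # made-up hourly temperatures
stats = box_stats(temps)
```

Here the lower dashed line would span 10 to 12, the central rectangle 12 to 16 with the median line at 14, and the upper dashed line 16 to 18.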
(4.3) Stream graph-stacked graph analysis component: the abscissa of both the stacked graph and the stream graph is time at hourly resolution, with the week as the basic scale. The ordinate is the POI weighted activity value. The stacked graph represents different POI types as area charts in different colors, laid out on one side of the coordinate axis, and shows how the weighted activity of one or more POIs changes within a specified time range. The stream graph is laid out on both sides of the coordinate axis and likewise shows how the weighted activity of one or more POIs changes within the specified time range, as shown in FIG. 2.
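The difference between the one-sided stacked graph and the two-sided stream graph comes down to the baseline on which the layers are piled. The sketch below computes layer boundaries for both layouts; centering the stream graph at minus half the total is one common choice and is an assumption here, as is the function name `stack_layers`.

```python
def stack_layers(series, symmetric=False):
    """Return (lower, upper) boundaries per layer per time step.

    series: list of layers, each a list of non-negative values per time step.
    symmetric=False: stacked graph, baseline at 0 (one side of the axis).
    symmetric=True:  stream graph, baseline at -total/2 (both sides of the axis).
    """
    steps = len(series[0])
    totals = [sum(layer[t] for layer in series) for t in range(steps)]
    lower = [-total / 2 if symmetric else 0.0 for total in totals]
    layers = []
    for layer in series:
        upper = [lo + value for lo, value in zip(lower, layer)]
        layers.append((lower, upper))
        lower = upper            # the next layer sits on top of this one
    return layers

activity = [[2.0, 4.0], [2.0, 2.0]]   # two POI types, two hourly steps (made up)
stacked = stack_layers(activity)
stream = stack_layers(activity, symmetric=True)
```

Each (lower, upper) pair gives the bottom and top edge of one colored area, so a drawing library only needs to fill between the two curves.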
(4.4) Scatter matrix-GeoMap-calendar heat map analysis component: the scatter matrix extends the scatter plot to high-dimensional data and is used to display air quality, temperature and POI weighted activity. The calendar heat map presents multidimensional data in two-dimensional form, with the magnitude of each value encoded by the shade of its color; it shows how the offset rate of the weighted activity of a given POI changes under different air quality and temperature conditions. The GeoMap displays the activity weights and geographic distribution of the clusters of POIs of the same type, as shown in FIG. 3.
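One way to prepare data for a calendar heat map is to arrange hourly offset rates into a day-by-hour grid and map each value to a shade. The grid layout, the linear shade mapping and both function names below are assumptions made for illustration; the text does not specify them.

```python
from datetime import datetime

def calendar_grid(samples):
    """Arrange (ISO timestamp, value) samples into {date: [24 hourly cells]}."""
    grid = {}
    for stamp, value in samples:
        moment = datetime.fromisoformat(stamp)
        row = grid.setdefault(moment.date().isoformat(), [None] * 24)
        row[moment.hour] = value
    return grid

def shade(value, vmin, vmax):
    """Map a value linearly to a 0..1 shade; larger values get a deeper shade."""
    if vmax == vmin:
        return 0.0
    return (value - vmin) / (vmax - vmin)

# Made-up offset rates for one POI.
samples = [("2016-03-01T08:00:00", -0.5),
           ("2016-03-01T09:00:00", 0.25),
           ("2016-03-02T08:00:00", 0.0)]
grid = calendar_grid(samples)
```

Cells left as `None` mark hours without data, which a renderer could draw in a neutral color.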
A big-data-driven visual analysis system for air quality and resident travel comprises the following components:
(1) Bar-box plot analysis component: the air quality index of each day is shown as a rectangle, and the order of the rectangles from left to right follows the order of the days; the height of each rectangle is determined by the air quality index (AQI), and the fill color uses a dynamic mapping scheme, that is, the color is adjusted dynamically according to the AQI value:
Figure BDA0001251762300000081
where Color_rect is the fill color of the rectangle.
The box plot represents the hourly temperatures of each week; from left to right it shows the dates and times of the week. The dashed lines of the box plot represent the upper-quarter and lower-quarter data ranges respectively, the small rectangle in the center represents the data range from the first quartile to the third quartile, and the horizontal line inside the small rectangle marks the median of the data, as shown in FIG. 1.
(2) Stream graph-stacked graph analysis component: the abscissa of both the stacked graph and the stream graph is time at hourly resolution, with the week as the basic scale. The ordinate is the POI weighted activity value. The stacked graph represents different POI types as area charts in different colors, laid out on one side of the coordinate axis, and shows how the weighted activity of one or more POIs changes within a specified time range. The stream graph is laid out on both sides of the coordinate axis and likewise shows how the weighted activity of one or more POIs changes within the specified time range, as shown in FIG. 2. The POI weighted activity is calculated as follows:
(2.1) Calculate the Euclidean distance between each taxi-hailing difficulty distribution point and each POI distribution point and determine whether it is smaller than a preset threshold T; if so, take the weight of the taxi-hailing difficulty distribution point as the activity weight of that POI.
(2.2) For each POI type, sum the activity of all POIs of that type, and take the sum as the weighted activity of that POI type.
(3) Scatter matrix-GeoMap-calendar heat map analysis component: the scatter matrix extends the scatter plot to high-dimensional data and is used to display air quality, temperature and POI weighted activity. The calendar heat map presents multidimensional data in two-dimensional form, with the magnitude of each value encoded by the shade of its color; it shows how the offset rate of the weighted activity of a given POI changes under different air quality and temperature conditions. The GeoMap displays the activity weights and geographic distribution of the clusters of POIs of the same type, as shown in FIG. 3.
The activity weight of a cluster of POIs of the same type is calculated as follows: find all POI distribution points whose Euclidean distance from each taxi-hailing difficulty distribution point is less than or equal to T, denoted POI_didi. Tally the POI distribution points of the same type in POI_didi, compute the position of their cluster center, and set the weight of the taxi-hailing difficulty distribution point as the weight of the cluster center. The POI distribution points are clustered with a k-means-based clustering algorithm, and the newly computed longitude and latitude of each cluster center are taken as the longitude and latitude of the POI center position.
In the preprocessing stage of the method, the POI weighted activity is obtained mainly by accumulating, around each taxi-hailing difficulty point, the counts of POIs of each type, which yields a measure of POI weighted activity; the offset rate of the POI weighted activity mainly measures how far the real-time POI activity deviates from the historical mean of the POI weighted activity. With the bar-box plot, the stacked graph-stream graph and the scatter matrix-GeoMap-calendar heat map, and through interaction among these visual views, users can obtain an important reference for exploring residents' travel behavior; the views can also draw the attention of transportation, medical and other relevant departments to air quality and provide them with constructive suggestions.
Although the invention has been described with reference to a single embodiment that shows the various aspects of the visualization components, the invention is clearly not limited to that embodiment and can be modified in many ways without departing from its basic spirit and scope.

Claims (2)

1. A big data driving-based air quality and resident travel visual analysis method is characterized by comprising the following steps:
(1) original air quality data, temperature data, POI data and taxi taking difficulty data are reconstructed: firstly, respectively carrying out data cleaning and sorting on air quality data, temperature data, POI data and taxi taking difficulty data, wherein the data cleaning mainly comprises the steps of searching and removing abnormal data and missing values in various data sources, and then sorting all data according to time according to a timestamp; the taxi taking difficulty data comprise geographic coordinates and weights of taxi taking difficulty distribution points; the POI data comprises the geographic coordinates and POI types of the POI distribution points;
(2) calculating the POI zone weight activity and the deviation rate: the POI weighted activity reflects the flow of people around the POI; the deviation rate reflects the change condition of the POI zone weight activity;
the POI weighted activity is calculated as follows:
(2.1) calculating the Euclidean distance between each taxi-hailing difficulty distribution point and each POI distribution point, and judging whether it is smaller than a preset threshold T; if so, the weight of the taxi-hailing difficulty distribution point is taken as the activity weight of that POI;
(2.2) for each POI type, summing the activity weights of the POIs of that type and taking the sum as the weighted activity of that POI type;
the offset rate is calculated as follows:
Offset_t = (POIWeight_t - Aver_{week,hour}) / POIWeight_t - 1
wherein Aver_{week,hour} is the mean POI weighted activity for each hour of each week, POIWeight_t is the POI weighted activity for the current hour, and Offset_t is the offset rate;
(3) clustering POIs of the same type: finding all POI distribution points whose Euclidean distance from each taxi-hailing difficulty distribution point is less than or equal to T, denoted POI_didi; gathering the POI distribution points of the same type in POI_didi, calculating the position of their clustering center, and setting the weight of the taxi-hailing difficulty distribution point as the weight of the clustering center; clustering the POI distribution points with a k-means clustering algorithm, and taking the longitude and latitude of each newly calculated clustering center as the longitude and latitude of the POI center position;
(4) visual analysis of air quality and resident travel, specifically:
(4.1) color visual encoding: because Air Quality Index (AQI) values differ, a dynamic color mapping scheme is adopted, i.e., the color is adjusted dynamically according to the AQI value:
Figure FDA0002980052260000021
wherein Color_rect is the fill color of the rectangle;
(4.2) bar-box plot analysis component: the air quality index of each day is shown as a rectangle, the left-to-right order of the rectangles follows the order of the days, the fill color of each rectangle is determined by the scheme of step (4.1), and its height is determined by the air quality index AQI; the box plot represents the hourly temperature of each week, and from left to right the box plots follow the date and time of each week; the upper and lower dashed lines of a box plot represent the upper-quartile and lower-quartile data ranges respectively, the small rectangle at the center of the box plot represents the data range from the first quartile to the third quartile, and the horizontal line inside the small rectangle marks the median of the data;
(4.3) stream graph-stacked graph analysis component: the abscissa of the stacked graph and the stream graph is the hourly time axis over a selected time range, with one week as the basic scale unit, and the ordinate is the POI weighted activity value; in the stacked graph, area charts of different colors represent different POI types, the areas are stacked on one side of the coordinate axis, and the graph shows how the weighted activity of one or more POIs changes over the specified time range; the stream graph is laid out on both sides of the coordinate axis and likewise shows how the weighted activity of one or more POIs changes over the specified time range;
(4.4) scatter matrix-GeoMap-calendar heat map analysis component: the scatter matrix is the extension of the scatter plot to higher dimensions and is used to display air quality, temperature and POI weighted activity; the calendar heat map presents multidimensional data in two-dimensional form, with the magnitude of a value encoded by the depth of the color, and shows how the offset rate of the same POI changes under different air quality and temperature conditions; the GeoMap displays the activity weight and the geographic distribution of the POI clusters of the same type.
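The statistics that the box plot in step (4.2) draws can be sketched as follows. This is an illustrative sketch only: the function name `box_plot_summary` and the returned keys are hypothetical, the whiskers are assumed to reach the minimum and maximum because the claim describes them as covering the lower and upper quarter of the data, and the quartile interpolation method is an assumption since the source does not specify one.

```python
import statistics


def box_plot_summary(temperatures):
    """Return the five values a box plot of hourly temperatures needs."""
    data = sorted(temperatures)
    if len(data) < 2:
        raise ValueError("need at least two temperature readings")
    # Quartiles with the 'inclusive' method (an assumed choice).
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "lower_whisker": data[0],   # bottom of the lower-quarter range
        "q1": q1,                   # bottom edge of the central rectangle
        "median": median,           # horizontal line inside the rectangle
        "q3": q3,                   # top edge of the central rectangle
        "upper_whisker": data[-1],  # top of the upper-quarter range
    }


if __name__ == "__main__":
    print(box_plot_summary([1, 2, 3, 4, 5]))
```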
2. A big-data-driven visual analysis system for air quality and resident travel, characterized by comprising the following components:
(1) bar-box plot analysis component: the air quality index of each day is shown as a rectangle, and the left-to-right order of the rectangles follows the order of the days; the height of each rectangle is determined by the air quality index AQI, and the fill color uses a dynamic mapping scheme, i.e., the fill color is adjusted dynamically according to the AQI value:
Figure FDA0002980052260000031
the box plot represents the hourly temperature of each week, and from left to right the box plots follow the date and time of each week; the upper and lower dashed lines of a box plot represent the upper-quartile and lower-quartile data ranges respectively, the small rectangle at the center of the box plot represents the data range from the first quartile to the third quartile, and the horizontal line inside the small rectangle marks the median of the data;
(2) stream graph-stacked graph analysis component: the abscissa of the stacked graph and the stream graph is the hourly time axis over a selected time range, with one week as the basic scale unit, and the ordinate is the POI weighted activity value; in the stacked graph, area charts of different colors represent different POI types, the areas are stacked on one side of the coordinate axis, and the graph shows how the weighted activity of one or more POIs changes over the specified time range; the stream graph is laid out on both sides of the coordinate axis and likewise shows how the weighted activity of one or more POIs changes over the specified time range; the POI weighted activity is calculated as follows:
(2.1) calculating the Euclidean distance between each taxi-hailing difficulty distribution point and each POI distribution point, and judging whether it is smaller than a preset threshold T; if so, the weight of the taxi-hailing difficulty distribution point is taken as the activity weight of that POI;
(2.2) for each POI type, summing the activity weights of the POIs of that type and taking the sum as the weighted activity of that POI type;
(3) scatter matrix-GeoMap-calendar heat map analysis component: the scatter matrix is the extension of the scatter plot to higher dimensions and is used to display air quality, temperature and POI weighted activity; the calendar heat map presents multidimensional data in two-dimensional form, with the magnitude of a value encoded by the depth of the color, and shows how the offset rate of the same POI changes under different air quality and temperature conditions; the GeoMap displays the activity weight and the geographic distribution of the POI clusters of the same type;
the offset rate is calculated as follows:
Offset_t = (POIWeight_t - Aver_{week,hour}) / POIWeight_t - 1
wherein Aver_{week,hour} is the mean POI weighted activity for each hour of each week, POIWeight_t is the POI weighted activity for the current hour, and Offset_t is the offset rate;
the activity weight of the POI clusters of the same type is calculated as follows: finding all POI distribution points whose Euclidean distance from each taxi-hailing difficulty distribution point is less than or equal to T, denoted POI_didi; gathering the POI distribution points of the same type in POI_didi, calculating the position of their clustering center, and setting the weight of the taxi-hailing difficulty distribution point as the weight of the clustering center; and clustering the POI distribution points with a k-means clustering algorithm, taking the longitude and latitude of each newly calculated clustering center as the longitude and latitude of the POI center position.
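The clustering step can be sketched as below. This is one possible reading of the claim, not the patented implementation: the names `local_centers` and `kmeans`, the tuple layouts, the planar treatment of coordinates, the deterministic initialization from the first k points, and the iteration cap are all assumptions introduced for illustration.

```python
import math
from collections import defaultdict


def local_centers(hail_points, pois, threshold):
    """For each hailing point, average nearby same-type POIs into one center.

    hail_points: iterable of (x, y, weight); pois: iterable of (x, y, poi_type).
    Returns a list of (poi_type, center_x, center_y, weight).
    """
    centers = []
    for hx, hy, weight in hail_points:
        by_type = defaultdict(list)
        for px, py, poi_type in pois:
            # POI_didi: POIs within distance T (inclusive) of the hailing point.
            if math.dist((hx, hy), (px, py)) <= threshold:
                by_type[poi_type].append((px, py))
        for poi_type, members in by_type.items():
            cx = sum(p[0] for p in members) / len(members)
            cy = sum(p[1] for p in members) / len(members)
            # The hailing point's weight becomes the center's weight.
            centers.append((poi_type, cx, cy, weight))
    return centers


def kmeans(points, k, iterations=20):
    """Plain k-means on 2-D points; returns the final cluster centers."""
    if k < 1 or k > len(points):
        raise ValueError("k must be between 1 and the number of points")
    centers = list(points[:k])  # assumed deterministic initialization
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        updated = []
        for i, members in enumerate(clusters):
            if members:
                updated.append((sum(m[0] for m in members) / len(members),
                                sum(m[1] for m in members) / len(members)))
            else:
                updated.append(centers[i])  # keep the center of an empty cluster
        if updated == centers:
            break  # assignments are stable
        centers = updated
    return centers
```

In this reading, `kmeans` would be run per POI type on the coordinates of that type's points, and the resulting centers would supply the longitude and latitude shown on the GeoMap.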

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710173669.4A CN106991525B (en) 2017-03-22 2017-03-22 Air quality and resident trip visual analysis method and system

Publications (2)

Publication Number Publication Date
CN106991525A CN106991525A (en) 2017-07-28
CN106991525B true CN106991525B (en) 2021-06-18

Family

ID=59411741

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710173669.4A Active CN106991525B (en) 2017-03-22 2017-03-22 Air quality and resident trip visual analysis method and system

Country Status (1)

Country Link
CN (1) CN106991525B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110286663B (en) * 2019-06-28 2021-05-25 云南中烟工业有限责任公司 Regional cigarette physical index standardized production improving method
CN112699284B (en) * 2021-01-11 2022-08-30 四川大学 Bus stop optimization visualization method based on multi-source data

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7826965B2 (en) * 2005-06-16 2010-11-02 Yahoo! Inc. Systems and methods for determining a relevance rank for a point of interest
US7991561B2 (en) * 2005-09-29 2011-08-02 Roche Molecular Systems, Inc. Ct determination by cluster analysis with variable cluster endpoint
US8669884B2 (en) * 2011-02-02 2014-03-11 Mapquest, Inc. Systems and methods for generating electronic map displays with points of-interest information
WO2014194480A1 (en) * 2013-06-05 2014-12-11 Microsoft Corporation Air quality inference using multiple data sources
CN105679009B (en) * 2016-02-03 2017-12-26 西安交通大学 A kind of call a taxi/order POI commending systems and method excavated based on GPS data from taxi
CN105825672B (en) * 2016-04-11 2019-06-14 中山大学 A kind of city guide method for extracting region based on floating car data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant