WO2018214060A1 - Small-scale air quality index prediction method and system for city - Google Patents

Small-scale air quality index prediction method and system for city

Info

Publication number
WO2018214060A1
WO2018214060A1 PCT/CN2017/085715 CN2017085715W WO2018214060A1 WO 2018214060 A1 WO2018214060 A1 WO 2018214060A1 CN 2017085715 W CN2017085715 W CN 2017085715W WO 2018214060 A1 WO2018214060 A1 WO 2018214060A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
air quality
prediction
predicted
time
Prior art date
Application number
PCT/CN2017/085715
Other languages
French (fr)
Chinese (zh)
Inventor
王绍鑫
陈矿
吴建东
曹袭亚
林爱德华·罗伯特
Original Assignee
北京质享科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京质享科技有限公司
Priority to PCT/CN2017/085715 (WO2018214060A1)
Priority to CN201780005024.8A (CN108701274B)
Publication of WO2018214060A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16Z INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS, NOT OTHERWISE PROVIDED FOR
    • G16Z99/00 Subject matter not provided for in other main groups of this subclass
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Definitions

  • The invention relates to the technical field of air quality index prediction, and in particular to a machine-learning-based method and system for predicting the air quality index at small scale within a city.
  • AQI: air quality index
  • SO2: sulfur dioxide
  • NO2: nitrogen dioxide
  • NO: nitrogen monoxide
  • CO: carbon monoxide
  • O3: ozone
  • PM2.5: suspended particulate matter
  • Atmospheric dispersion modeling: this class of mathematical model simulates the transport, diffusion and migration of atmospheric pollutants, and predicts the temporal and spatial distribution of a pollutant's concentration under different pollution source, meteorological and underlying-surface conditions.
  • It is a simplified mathematical description of how pollutants migrate and diffuse in the lower atmosphere. The form of the model varies with the modeling theory, the migration and diffusion processes considered, and the objects described.
  • The SPRINTARS method (Spectral Radiation-Transport Model for Aerosol Species), developed by Kyushu University, Japan, is a typical representative.
  • It is a numerical model developed to simulate, on a global scale, the effects of atmospheric suspended particulate matter on the climate system and on atmospheric pollution.
  • Based on the air-sea coupled model MIROC, it studies the atmospheric aerosols present in the troposphere.
  • Such methods have a sound scientific basis but the following disadvantages. They consider pollutant diffusion mainly in terms of macroscopic atmospheric circulation, so the specific climate conditions of key areas (such as cities) are hard to resolve. Conditions within one area change with season, time period and even human factors; for example, pollutant emission and accumulation differ significantly before and after a new chemical plant is built. It is therefore difficult to predict accurately for a specific area.
  • In addition, such a method requires a very large amount of data collection and computation: at minimum, detailed information on a large number of pollution sources and satellite meteorological information must be gathered, and high-performance hardware is needed to process the data. The cost and expertise required make it unsuitable for ordinary users.
  • The technical problem to be solved by the present invention is to use a cooperative training algorithm that combines several prediction methods to predict the air quality index at every geographical point within a city, not only at air quality monitoring base stations, improving prediction accuracy while keeping computational complexity low.
  • the technical solution adopted by the invention is: a city small-scale air quality index prediction method, comprising:
  • the time prediction model corresponding to the current-time prediction characterizes the relationship between the historical monitoring data and the current monitoring data; the time prediction model corresponding to the future-time prediction characterizes the relationship between the historical and current monitoring data and the monitoring data at each future moment, and comprises one time prediction model per moment within the specified future time span;
  • the spatial prediction model characterizes the relationship between the real-time monitoring data of each known location or base station and the air quality index data of the to-be-predicted point whose real-time monitoring data is unknown.
  • the invention realizes the air quality prediction of the location outside the base station through the establishment of each model and the fusion of the prediction results of each model at the time of prediction, and the prediction result integrates the influence of various related factors, and the accuracy is higher.
  • the present invention also includes:
  • the monitoring data recorded by the air quality monitoring base stations in the present invention includes date and time, base station name, base station latitude and longitude, AQI data, temperature, air pressure, wind, humidity, and weather type data.
  • where historical AQI data is missing, it can be filled in by local time-series interpolation.
  • step S3 a multivariate linear regression method is used to establish a temporal prediction model corresponding to the current time prediction and the future time prediction.
  • step S3 includes the steps of:
  • β i is the regression coefficient
  • X i is the model input data
  • Y 1 is the air quality index of the time to be predicted
  • for the current-time prediction, the model input data is the L 1 hours of historical AQI data together with the temperature, air pressure, wind, humidity and weather type data of the preceding hour;
  • for a future-time prediction, the model input data is the current-time AQI data, the L 1 hours of historical AQI data, and the current temperature, air pressure, wind, humidity and weather type data.
  • each regression coefficient in each initial multiple linear regression model can be obtained, thereby obtaining each initial multiple linear regression model, that is, an initial time prediction model.
  • the present invention may use a numerical code for the weather type data, such as 0 for sunny days, 1 for cloudy or overcast days, 2 for rainy days, and so on. Other existing data processing and representation methods can also be employed.
  • a two-dimensional linear interpolation method is used to establish a spatial prediction model for air quality prediction at a specified coordinate, including:
  • the input of the model is S 2
  • the output of the model is the air quality index of the location to be predicted
  • the griddata () represents the two-dimensional interpolation function.
  • the initial training data set of the spatial prediction model contains only the air quality index at the base station.
  • the traffic data includes the lengths of the smooth, slow, and congested road segments around each to-be-predicted location and each air quality monitoring base station.
  • the geographic point of interest data includes the distribution of geographic object entities within a set radius around each to-be-predicted location and each air quality monitoring base station; the entity types include schools, banks, restaurants, and gas stations. Other types of geographic object entity may also be included and are not listed exhaustively here.
  • step S5 of the present invention establishes a dynamic prediction model by using a multiple linear regression method, including the steps of:
  • S51 Obtain the traffic data and geographic interest point data within a given radius of each base station for each time point in the historical database. The traffic data comprises the proportions of road length made up by smooth, slow, and congested segments, defined as T 1 , T 2 , T 3 . The geographic interest point data comprises the number of geographic interest points of each type within the given radius of the base station, defined as T 4 , T 5 ,..., T q . Together with the air quality index monitoring data of the corresponding base station at the corresponding time, these establish an initial training set S 3 ;
  • the model input quantity is the traffic data and geographic interest point data within a given radius of the to-be-predicted location at the specified time
  • the model output quantity Y 3 is the air quality index of the point to be predicted.
  • the initial training data S 3 of the dynamic prediction model contains only the data at the base stations in the historical database.
  • the values of the regression coefficients in the dynamic prediction model are obtained by training on the training set before each prediction; the resulting dynamic prediction model is then used to obtain air quality index data for the current and future moments.
  • existing techniques can forecast traffic data for future times. When the present invention performs dynamic prediction for a future time, it can therefore use such forecast traffic data directly as input.
  • indoor air quality and outdoor air quality have various types of numerical relationships. This depends on a variety of conditions: the type of building environment, the floor, the distance from the main road, whether to open the central air conditioning, whether to open the window ventilation, whether to open the air purifier, etc. will affect the relationship between indoor and outdoor air quality index.
  • the regression tree algorithm is used to establish an indoor and outdoor prediction model, including the steps:
  • the model input quantity is the indoor air quality index M shared by the user acquired at the time to be predicted, and the indoor air quality index related data, and the model output quantity Y 4 is the air quality index data of the to-be-predicted place at the time to be predicted.
  • under different conditions, the model coefficients of the regression tree RT() also differ.
  • the present invention trains on the input and output data to obtain the regression tree model, and its coefficients, that characterizes the relationship between the indoor and outdoor air quality index under each condition; this model is then applied to subsequent predictions of the air quality index at to-be-predicted locations under the same conditions.
  • for a future-time prediction, the input data may be model input data obtained with existing techniques for the corresponding future time.
  • the indoor and outdoor prediction model is established as follows:
  • indoor air quality is about 60% of outdoor air quality.
  • step S7 of the present invention performs collaborative training on the established time prediction model, spatial prediction model, dynamic prediction model and indoor and outdoor prediction model, including:
  • the time prediction model, the spatial prediction model, the dynamic prediction model, and the indoor and outdoor prediction models are predictors F 1 , F 2 , F 3 , and F 4 , respectively, and the training sets of each predictor are respectively recorded as L 1 , L 2 , L 3 , L 4 , initialize the training set to:
  • the weight vector for initializing each predictor prediction result is [w 1 , w 2 , w 3 , w 4 ], and the sum of the four weighting factors is equal to 1.
  • the AQI fusion value at the time to be predicted is:
  • the present invention performs at least one round of training for each time prediction.
  • after each round of training is completed, the training data set of each model is updated before the next round, so that subsequent rounds give more accurate predictions.
  • the data newly added to each training data set is the relevant data at the to-be-predicted location for which, in the previous round of training, the sum of the deviations between the individual predictors' results and the cooperative training result was smallest.
  • for the spatial prediction model, the newly added training data is the coordinates and AQI data at the predicted location obtained from the previous round of training; for the dynamic prediction model, the newly added training data is the historical air quality index data, traffic data, and geographic interest point data at the predicted location; and so on.
  • the AQI fusion value is calculated using the following formula:
  • the invention also provides a city small-scale air quality index prediction system, comprising:
  • the area dividing module divides the urban area into a grid of to-be-predicted areas, the grid intersection points being the locations whose air quality index is to be predicted;
  • the historical monitoring data acquisition module acquires historical monitoring data of the air quality monitoring base stations and establishes a historical database; the historical monitoring data includes AQI data, meteorological data, and weather type data;
  • a time prediction model building module which establishes a time prediction model based on a historical database
  • the spatial prediction model building module acquires real-time monitoring data of each air quality monitoring base station and establishes a spatial prediction model
  • the dynamic prediction model establishing module acquires traffic data and geographic interest point data of each to-be-predicted location and air quality monitoring base station, and establishes a dynamic prediction model
  • the indoor and outdoor prediction model building module acquires the indoor air quality index shared by the user and establishes an indoor and outdoor prediction model
  • the collaborative training module cooperatively trains the established time prediction model, spatial prediction model, dynamic prediction model and indoor and outdoor prediction model, fusing the prediction results of all models to obtain air quality index forecasts for all to-be-predicted locations at the current time and over a future period.
  • the urban small-scale air quality prediction method provided by the invention has the following advantages:
  • the invention combines multiple data sources and multiple prediction models, avoiding the limitations of a single prediction model and ensuring the accuracy of the model;
  • the invention builds the prediction models separately and combines them only at the final stage, which reduces overall computational complexity and shortens calculation time.
  • Figure 1 is a schematic flow chart of the method of the present invention.
  • the urban small-scale air quality index prediction method of the invention comprises the steps of:
  • the time prediction model corresponding to the current-time prediction characterizes the relationship between the historical monitoring data and the current monitoring data; the time prediction model corresponding to the future-time prediction characterizes the relationship between the historical and current monitoring data and the monitoring data at each future moment, and comprises one time prediction model per moment within the specified future time span;
  • the spatial prediction model characterizes the relationship between the real-time monitoring data of each known location or base station and the air quality index data of the to-be-predicted point whose real-time monitoring data is unknown.
  • the invention realizes the air quality prediction of the location outside the base station through the establishment of each model and the fusion of the prediction results of each model at the time of prediction, and the prediction result integrates the influence of various related factors, and the accuracy is higher.
  • Figure 1 is a flow chart of the present invention. As shown in Figure 1, the present invention uses a cooperative training algorithm with multiple prediction models to predict the air quality index. The individual prediction models, the cooperative training algorithm, and the final accuracy evaluation are described in detail below.
  • a square grid system is built in the area to be predicted.
  • in this embodiment the to-be-predicted area is the area within Beijing's Fifth Ring Road; a square grid system is established over it, with a cell size of one square kilometer.
  • the grid intersection is the location where the air quality index is to be predicted.
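The grid construction described above can be sketched as follows. This is a minimal illustration, assuming the area has already been projected to planar kilometre coordinates; the function name `build_grid` and the bounding box values are hypothetical and not taken from the source.

```python
def build_grid(x_min_km, y_min_km, x_max_km, y_max_km, cell_km=1.0):
    """Return the grid intersection points (x, y), in km, covering a bounding box."""
    nx = int((x_max_km - x_min_km) / cell_km)  # number of cells along x
    ny = int((y_max_km - y_min_km) / cell_km)  # number of cells along y
    return [
        (x_min_km + i * cell_km, y_min_km + j * cell_km)
        for j in range(ny + 1)
        for i in range(nx + 1)
    ]


# Hypothetical 3 km x 2 km box with 1 km cells: 4 x 3 intersections.
points = build_grid(0.0, 0.0, 3.0, 2.0)
print(len(points))  # 12
```

Each returned point is one location whose air quality index is to be predicted; converting latitude and longitude to planar coordinates is left outside this sketch.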
  • the number of air quality monitoring base stations is recorded as N. In this embodiment, there are 36 air quality monitoring base stations in Beijing.
  • the sampling time interval for the historical data is preferably 1 hour.
  • the local time series interpolation is completed for the case of missing historical data.
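The text does not say which interpolation scheme is used for missing historical values. As one plausible reading, the sketch below fills gaps in an hourly AQI series by linear interpolation between the nearest known neighbours; the function name `fill_gaps` and the sample values are illustrative assumptions.

```python
def fill_gaps(series):
    """Linearly interpolate None entries lying between two known values.

    Leading or trailing None entries are left untouched, since there is
    no neighbour on one side to interpolate from.
    """
    filled = list(series)
    known = [i for i, value in enumerate(filled) if value is not None]
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            step = (filled[right] - filled[left]) * (i - left) / span
            filled[i] = filled[left] + step
    return filled


hourly_aqi = [80, None, None, 110, 100, None, 90]  # made-up readings
print(fill_gaps(hourly_aqi))
```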
  • a unified time series prediction model is established for each air quality monitoring base station, and is used to predict the air quality index of the specified predicted location at a certain point in time in the future. This step further includes the following substeps:
  • specify the length of the historical sequence used and the length of the forecast period.
  • the data at the current time is recorded as x n
  • the length of the historical sequence is L 1
  • the history sequence is recorded as
  • the future sequence length is L 2
  • the future sequence is recorded as
  • the length of the historical sequence is selected to be 6
  • the length of the prediction period is selected to be 6. That is, at any time, the preceding 6 hours of historical data are used to predict the air quality index for the next 6 hours.
  • all consecutive sequences of L 1 +1+L 2 hours are extracted from the historical database to constitute the training data set S 1 .
  • multivariate linear regression models are used to predict the current time and each of the next 6 hours.
  • a multivariate linear regression model is established for each predicted time point, that is, there are a total of seven time prediction models.
  • for the current-time prediction, the input data S 1 is the 6 hours of historical AQI data and the temperature, air pressure, wind, humidity, and weather type of the previous hour.
  • for each future-time prediction, the input data S 1 is the current-time AQI and the 6 hours of historical AQI data, together with the current-time temperature, air pressure, wind, humidity, and weather type.
  • the output of the multiple linear regression model is the AQI data at the time point that needs to be predicted.
  • Multiple linear regression models can be written in the following form,
  • β i is the regression coefficient
  • X i is the input data
  • Y 1 is the air quality index of the point to be predicted.
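The regression formula itself is missing from this text. The sketch below therefore assumes the usual form, an intercept plus a weighted sum of the inputs, and fits it by ordinary least squares. For brevity it uses only AQI values as inputs (the weather inputs are omitted), and the helper names `solve`, `fit_linear` and `windows`, as well as the sample series, are hypothetical.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_linear(rows, targets):
    """Least-squares coefficients [b0, b1, ...] for y = b0 + sum(bi * xi)."""
    xs = [[1.0] + list(row) for row in rows]
    p = len(xs[0])
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(p)] for i in range(p)]
    xty = [sum(x[i] * y for x, y in zip(xs, targets)) for i in range(p)]
    return solve(xtx, xty)  # normal equations


def windows(series, history_len, horizon):
    """Build (inputs, target) pairs from an hourly series.

    horizon == 0: inputs are the history only, target is the current hour.
    horizon >= 1: inputs are the history plus the current hour, target is
    the value `horizon` hours ahead.
    """
    rows, targets = [], []
    for n in range(history_len, len(series) - horizon):
        end = n if horizon == 0 else n + 1
        rows.append(series[n - history_len:end])
        targets.append(series[n + horizon])
    return rows, targets


# Made-up hourly AQI readings; one model per horizon, here horizon = 1 hour.
aqi = [62, 70, 85, 91, 88, 76, 69, 73, 95, 120, 134, 128, 110, 97]
rows, targets = windows(aqi, history_len=2, horizon=1)
coeffs = fit_linear(rows, targets)
latest = aqi[-3:]  # two history hours plus the current hour
forecast = coeffs[0] + sum(b * x for b, x in zip(coeffs[1:], latest))
print(round(forecast, 1))
```

Calling `windows` and `fit_linear` once for each horizon from 0 to 6 would give the seven time prediction models mentioned above.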
  • the spatial prediction model uses a two-dimensional linear interpolation algorithm.
  • the input data S 2 is the latitude, longitude, and AQI of each base station or grid point whose AQI value is known.
  • the spatial prediction model can be expressed as:
  • x, y are the coordinates of the point to be predicted
  • S 2 is the input data, that is, the training set
  • Y 2 is the air quality index of the point to be predicted.
  • the initial training data S 2 of the spatial prediction model contains only relevant data at the base station.
  • the griddata function is an existing interpolation function.
  • initially S 2 contains only base station data. When the training set is updated, the data added is the previous round's predicted value at the to-be-predicted location whose prediction results showed the smallest deviation in that round.
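The text names a griddata function without saying which library it comes from. Two-dimensional linear interpolation over scattered stations is commonly done by triangulating the stations and interpolating linearly inside each triangle; SciPy's `scipy.interpolate.griddata` with `method="linear"` is one existing routine that works this way. The sketch below shows only the per-triangle step in plain Python, with a hypothetical function name and made-up station values.

```python
def interpolate_in_triangle(point, stations):
    """Linearly interpolate AQI at `point` from three (x, y, aqi) stations.

    Uses barycentric weights. Returns None when the point lies outside
    the triangle, because linear interpolation does not extrapolate.
    """
    (x1, y1, v1), (x2, y2, v2), (x3, y3, v3) = stations
    x, y = point
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("stations are collinear")
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    if min(w1, w2, w3) < 0:
        return None
    return w1 * v1 + w2 * v2 + w3 * v3


# Three hypothetical stations (x km, y km, AQI) and one grid point between them.
stations = [(0.0, 0.0, 100.0), (2.0, 0.0, 140.0), (0.0, 2.0, 60.0)]
print(interpolate_in_triangle((0.5, 0.5), stations))  # 100.0
```

Grid points that fall outside every triangle of stations get no value from this step, which is one reason the other predictors are useful at the edge of the station network.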
  • the traffic data includes the lengths of unblocked, slow, and congested road, converted into proportions;
  • the geographic interest point data includes the distribution of various types of geographic object entities within a given radius of the designated location, such as the number of schools, banks, restaurants, gas stations, etc.;
  • the dynamic prediction model is established by using multiple linear regression models.
  • the input data is traffic data and geographic interest point data, and the output data is AQI data.
  • the model form is as follows,
  • T 1 , T 2 , T 3 are the ratio of smooth, slow, and congested road segments
  • T 4 , T 5 ,..., T q are the number of geographic interest points of various types
  • Y 3 is the air quality index of the point to be predicted.
  • the initial dynamic prediction model training data S 3 only contains the data at the base station.
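The inputs T 1 to T q of the dynamic model can be assembled as in the sketch below, which turns road lengths into proportions and counts points of interest by type. The function names, the ordering of the point-of-interest types, and the coefficient values are illustrative assumptions; in practice the coefficients would come from fitting on the training set S 3, and the intercept term is assumed because the model formula is not shown in this text.

```python
POI_TYPES = ["school", "bank", "restaurant", "gas_station"]  # assumed ordering


def dynamic_features(road_lengths_km, poi_counts, poi_types=POI_TYPES):
    """Return [T1, T2, T3, T4, ..., Tq] for one location and time.

    road_lengths_km: (smooth, slow, congested) road lengths within the radius.
    poi_counts: mapping from point-of-interest type to its count within the radius.
    """
    smooth, slow, congested = road_lengths_km
    total = smooth + slow + congested
    if total == 0:
        ratios = [0.0, 0.0, 0.0]  # no road within the radius
    else:
        ratios = [smooth / total, slow / total, congested / total]
    return ratios + [poi_counts.get(kind, 0) for kind in poi_types]


def predict_dynamic(coeffs, features):
    """Evaluate Y3 = b0 + sum(bi * Ti) for already fitted coefficients."""
    return coeffs[0] + sum(b * t for b, t in zip(coeffs[1:], features))


features = dynamic_features((6.0, 3.0, 1.0), {"school": 2, "gas_station": 1})
made_up_coeffs = [50.0, -10.0, 20.0, 60.0, 0.5, 0.2, 0.3, 4.0]  # not fitted values
print(features, predict_dynamic(made_up_coeffs, features))
```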
  • the indoor air quality index is measured by an air quality sensor placed on an air purifier that is compatible with the software system.
  • indoor air quality and outdoor air quality have various types of numerical relationships. This depends on a variety of conditions: the type of building environment, the floor, the distance from the main road, whether the central air conditioning is turned on, whether the window is ventilated, whether the air purifier is turned on, etc.
  • the regression tree algorithm was used to fit the indoor and outdoor air quality index relationships under each category.
  • indoor and outdoor prediction models can be expressed as
  • RT is the regression tree algorithm
  • M is the indoor air quality index measured by the sensor
  • Y 4 is the outdoor air quality index to be predicted.
  • the indoor and outdoor prediction model is obtained using the following method. According to the statistical relationship between indoor and outdoor air quality published by the US Environmental Protection Agency [1], the indoor air quality index is about 60% of the outdoor air quality index, namely M ≈ 0.6 × Y 4 , where:
  • M is the indoor air quality index measured by the sensor and Y 4 is the outdoor air quality index to be predicted.
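Under the 60% relationship stated above, the outdoor index follows from the indoor reading by a single division. The sketch below illustrates that simplified model only; the function and constant names are hypothetical, and the regression-tree variant described earlier would replace the fixed ratio with a model fitted per condition.

```python
INDOOR_TO_OUTDOOR_RATIO = 0.6  # indoor AQI is about 60% of outdoor AQI


def outdoor_from_indoor(indoor_aqi, ratio=INDOOR_TO_OUTDOOR_RATIO):
    """Estimate the outdoor AQI Y4 from an indoor sensor reading M via M = ratio * Y4."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return indoor_aqi / ratio


print(round(outdoor_from_indoor(60.0), 1))  # 100.0
```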
  • the cooperative training (co-training) algorithm is a semi-supervised learning algorithm whose main purpose is to train the predictors efficiently using a small amount of labeled data and a large amount of unlabeled data.
  • This embodiment uses a simplified version of the collaborative training algorithm. The specific implementation steps are as follows:
  • the time prediction model, the spatial prediction model, the dynamic prediction model, and the indoor and outdoor prediction models are predictors F 1 , F 2 , F 3 , and F 4 , respectively, and the training sets of each predictor are respectively recorded as L 1 , L 2 , L 3 , L 4 , initialize the training set to:
  • the weight vector for initializing each predictor prediction result is [w 1 , w 2 , w 3 , w 4 ], and the sum of the four weighting factors is equal to 1.
  • the AQI fusion value at the time to be predicted is:
  • the present invention performs at least one round of training for each time prediction.
  • after each round of training is completed, the training data set of each model is updated before the next round, so that subsequent rounds give more accurate predictions.
  • the data newly added to each training data set is the relevant data at the to-be-predicted location for which, in the previous round of training, the sum of the deviations between the individual predictors' results and the cooperative training result was smallest.
  • for the spatial prediction model, the newly added training data is the coordinates and AQI data at the predicted location obtained from the previous round of training; for the dynamic prediction model, the newly added training data is the historical air quality index data, traffic data, and geographic interest point data at the predicted location; and so on.
  • the AQI fusion value is calculated using the following formula:
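The fusion formula is not reproduced in this text. Given a weight vector whose entries sum to 1, the sketch below assumes the fused AQI is the weighted sum of the four predictor outputs, and it selects the location to add to the training sets by the smallest total absolute deviation from the fused value. Both choices, along with the function names, equal weights and sample predictions, are assumptions for illustration.

```python
def fuse(predictions, weights):
    """Weighted sum of the predictor outputs F1..F4; weights must sum to 1."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * p for w, p in zip(weights, predictions))


def most_consistent_location(predictions_by_location, weights):
    """Pick the location whose predictors deviate least, in total, from the fused value.

    Returns (location, fused_aqi, total_deviation).
    """
    best = None
    for location, predictions in predictions_by_location.items():
        fused = fuse(predictions, weights)
        deviation = sum(abs(p - fused) for p in predictions)
        if best is None or deviation < best[2]:
            best = (location, fused, deviation)
    return best


weights = [0.25, 0.25, 0.25, 0.25]  # assumed equal initial weights
candidates = {  # made-up outputs of F1..F4 at two grid points
    (116.40, 39.91): [100.0, 110.0, 90.0, 100.0],
    (116.41, 39.91): [100.0, 101.0, 99.0, 100.0],
}
location, fused_aqi, _ = most_consistent_location(candidates, weights)
print(location, fused_aqi)
```

After each round, the selected location and its fused value would be appended to the predictors' training sets before the models are refitted.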
  • Step S81 evaluates the accuracy of the system's current-time prediction
  • each base station is withheld in turn; with the k-th base station withheld, step S7 is performed to obtain the predicted AQI values at that base station's location at the current time, and these are recorded
  • with the measured AQI values of the k-th base station denoted y 1 , y 2 , . . . , y c , the accuracy of the system's current-time prediction with the k-th base station removed is described by the following indicator:
  • Step S82 evaluates the accuracy of AQI prediction in future time
  • after performing step S7, the predicted values at a specified future time for the grid cells in which the base stations are located are recorded.
  • the actual measured values of the base station are z 1 , z 2 ,..., z N , and the accuracy of the prediction system for future predictions is:
  • the present invention uses a cooperative training algorithm that combines several prediction methods to predict the air quality index at every geographical point within a city, not only at air quality monitoring base stations, improving prediction accuracy while keeping computational complexity low.
  • the invention also provides a city small-scale air quality index prediction system, comprising:
  • the area dividing module divides the urban area into a grid of to-be-predicted areas, the grid intersection points being the locations whose air quality index is to be predicted;
  • the historical monitoring data acquisition module acquires historical monitoring data of the air quality monitoring base stations and establishes a historical database; the historical monitoring data includes AQI data, meteorological data, and weather type data;
  • a time prediction model building module which establishes a time prediction model based on a historical database
  • the spatial prediction model building module acquires real-time monitoring data of each air quality monitoring base station and establishes a spatial prediction model
  • the dynamic prediction model establishing module acquires traffic data and geographic interest point data of each to-be-predicted location and air quality monitoring base station, and establishes a dynamic prediction model
  • the indoor and outdoor prediction model building module acquires the indoor air quality index shared by the user and establishes an indoor and outdoor prediction model
  • the collaborative training module cooperatively trains the established time prediction model, spatial prediction model, dynamic prediction model and indoor and outdoor prediction model, fusing the prediction results of all models to obtain air quality index forecasts for all to-be-predicted locations at the current time and over a future period.
  • embodiments of the present application can be provided as a method, system, or computer program product.
  • the present application can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment in combination of software and hardware.
  • the application can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) including computer usable program code.
  • the computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising an instruction device that implements the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
  • these computer program instructions can also be loaded onto a computer or other programmable data processing device, such that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing; the instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Development Economics (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention discloses a small-scale air quality index prediction method and system for a city. The method comprises: firstly dividing a city area into a grid of multiple locations to undergo prediction; acquiring historical data related to each model and creating the following models on the basis of the historical data: a time prediction model respectively corresponding to a prediction relating to a current moment and predictions relating to each moment in a future period of time, a spatial prediction model for performing air quality predictions with respect to locations at specified coordinates, a dynamic prediction model for characterizing a relationship of traffic data and data of a geographical location of interest to an air quality index, and an indoor and outdoor prediction model for characterizing a relationship of an indoor air quality index to an outdoor air quality index; when performing prediction, performing coordinated training on the created time prediction model, the spatial prediction model, the dynamic prediction model and the indoor and outdoor prediction model with respect to any one of the locations to undergo prediction at any real-time moment, so as to merge prediction results of all of the models and obtain air quality index predicted values of each of the locations to undergo prediction at a corresponding current moment, and at each moment in the future period of time.

Description

Method and system for predicting urban small-scale air quality index
Technical field
The invention relates to the technical field of air quality index prediction, and in particular to a machine-learning-based method and system for predicting the air quality index at small scale within a city.
Background art
With the advance of urbanization and industrialization, environmental pollution has become increasingly serious. In recent years, widespread and severe air pollution has directly threatened people's health and hindered green, sustainable socio-economic development. At present, most regions provide air quality index forecasts only at city level, which cannot resolve individual geographical locations within a city. For city residents, accurate and reasonable air quality forecasts help in planning work and daily life, adjusting travel, and taking appropriate protective measures, which reduces the harm air pollutants do to the body and improves the overall health of society.
AQI (air quality index) is a dimensionless indicator that describes air quality quantitatively, and is currently the most widely used measure of air quality. Under the national standard HJ633-2012, AQI is calculated as a function of the concentrations of several pollutants: sulfur dioxide (SO2), nitrogen dioxide (NO2), nitrogen monoxide (NO), carbon monoxide (CO), ozone (O3), and the suspended particulate matter PM2.5 and PM10. AQI ranges from 0 to 500; a larger value indicates more serious air pollution.
Current methods for AQI prediction fall into the following categories:
1. Atmospheric dispersion modeling: models of this type simulate the transport, diffusion and migration of atmospheric pollutants and predict the spatio-temporal distribution of a pollutant's concentration under different pollution source, meteorological and underlying-surface conditions; they are simplified mathematical descriptions of how pollutants migrate and diffuse in the lower atmosphere. The form of the model varies with the modeling theory, the migration and diffusion processes considered, and the object being described. The SPRINTARS method (Spectral Radiation-Transport Model for Aerosol Species), developed by Kyushu University in Japan, is a typical representative. It is a numerical model developed to simulate, on a global scale, the effects of atmospheric suspended particulate matter on the climate system and the state of air pollution; it is based on the coupled ocean-atmosphere model MIROC and studies the atmospheric aerosols present in the troposphere. Methods of this type are scientifically grounded but have the following shortcomings. They consider pollutant diffusion mainly in terms of large-scale atmospheric circulation, and can hardly resolve the specific climatic conditions of areas of particular interest, such as cities. Because conditions in one and the same area change with season, time of day and even human factors (for example, pollutant emission and accumulation differ markedly before and after a new chemical plant is built in an area), such methods struggle to make accurate predictions for a specific area. In addition, they require huge volumes of data collection and computation: at a minimum, a large amount of detailed pollution source information and satellite meteorological information must be collected, and high-performance hardware must be provisioned for data processing. They are therefore costly and highly specialized, and unsuitable for ordinary users.
2. Statistical models based on historical data, such as linear regression and artificial neural networks. Methods of this type usually use only data collected near air quality monitoring base stations to predict the local air quality index near those stations. Their shortcoming is that they consider only the vicinity of base stations, so no prediction model can be built for locations without one. Moreover, because they consider only geographically local information and seldom account for the spatial diffusion of pollutants, the prediction models for different locations may differ greatly and prediction accuracy is hard to keep at a high level.
Summary of the invention
The technical problem to be solved by the present invention is: using a co-training algorithm that fuses several prediction methods, to predict the air quality index at every geographic location within a city, not only near air quality monitoring base stations, improving prediction accuracy while keeping computational complexity low.
The technical solution adopted by the invention is a method for urban small-scale air quality index prediction, comprising:
S1, dividing the urban area into a grid, the grid intersections corresponding to the locations whose air quality index is to be predicted;
S2, acquiring historical monitoring data from each air quality monitoring base station and establishing a historical database;
S3, based on the monitoring data of multiple time series for each base station in the historical database, establishing time prediction models corresponding respectively to prediction of the current moment and of each moment in a future period;
S4, based on the monitoring data of all air quality monitoring base stations at the same moment in the historical database, using two-dimensional linear interpolation to establish a spatial prediction model for predicting air quality at specified coordinates;
S5, acquiring traffic data and geographic point-of-interest (POI) data for each location to be predicted and each air quality monitoring base station, together with the air quality index data of those locations and base stations at the corresponding moments; and, based on the acquired data, establishing a dynamic prediction model characterizing the relationship of the traffic data and POI data to the air quality index;
S6, acquiring the indoor air quality index shared by users, data on the users' living environment, and the air quality index data of the corresponding locations, and establishing an indoor-outdoor prediction model characterizing the relationship between the indoor and outdoor air quality indices;
S7, for any location to be predicted at any real-time moment for which the air quality index is to be predicted, co-training the established time prediction model, spatial prediction model, dynamic prediction model and indoor-outdoor prediction model so as to fuse the prediction results of all models, thereby obtaining the predicted air quality index of each location to be predicted at the current moment and at each moment in a future period.
In the present invention, the time prediction model for the current moment characterizes the relationship between historical monitoring data and current monitoring data, while the time prediction models for the future period characterize the relationship between historical and current monitoring data on the one hand and the monitoring data at each moment of the future period on the other; depending on the time span specified for the future period, there are multiple time prediction models, one for each moment;
the spatial prediction model characterizes the relationship between the real-time monitoring data of the known locations or base stations and the air quality index at points to be predicted, for which no real-time monitoring data exist.
By establishing these models and fusing their results at prediction time, the invention predicts air quality at locations away from base stations, and because the prediction combines the influence of the various relevant factors, its accuracy is higher.
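The grid division of step S1 can be sketched as follows. This is a minimal illustration that assumes the city is approximated by a latitude/longitude bounding box with evenly spaced grid lines; the function name, the bounding box and the grid resolution are hypothetical and are not specified in the text.

```python
def grid_intersections(lat_min, lat_max, lon_min, lon_max, n_rows, n_cols):
    """Return (lat, lon) pairs at the intersections of an evenly spaced grid.

    Assumption: the city is approximated by a rectangular bounding box;
    the source does not say how the grid is laid out.
    """
    if n_rows < 2 or n_cols < 2:
        raise ValueError("need at least two grid lines in each direction")
    lat_step = (lat_max - lat_min) / (n_rows - 1)
    lon_step = (lon_max - lon_min) / (n_cols - 1)
    return [
        (lat_min + i * lat_step, lon_min + j * lon_step)
        for i in range(n_rows)
        for j in range(n_cols)
    ]


# Hypothetical bounding box; each returned point is one location to be predicted.
points = grid_intersections(39.8, 40.0, 116.2, 116.6, n_rows=3, n_cols=5)
```

Each returned pair then serves as one "location to be predicted" in the later steps.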
Further, the present invention also comprises:
S8, evaluating in real time the accuracy of the predicted air quality index values, comprising:
S81, using a K-fold cross-validation algorithm to evaluate the accuracy of the predicted values for the current moment:
S811, letting the number of base stations be $N$, dividing all base stations evenly into $K$ parts numbered $1, 2, \ldots, k, k+1, \ldots, K$, each part containing $c = N/K$ base stations;
S812, removing the $k$-th part from the $K$ parts, the base stations of the remaining $K-1$ parts serving as known data;
S813, based on the data of the known $K-1$ parts, obtaining the predicted AQI value at the current moment for each base station in the removed $k$-th part, denoted
Figure PCTCN2017085715-appb-000001
S814, acquiring the measured AQI values $y_1, y_2, \ldots, y_c$ of the base stations in the $k$-th part; the accuracy of the predicted values for the current moment is then described by the following index $\eta_k$:
Figure PCTCN2017085715-appb-000002
S815, letting $k$ run from 1 to $K$ and repeating steps S812 to S814 for each $k$, the accuracy index $\eta$ of the prediction system at the current moment is then obtained as:
Figure PCTCN2017085715-appb-000003
the closer $\eta$ is to 1, the more accurate the system's prediction for the current moment;
S82, evaluating the accuracy of the predicted values for the future period:
assume that the predicted values obtained for all base stations at a specified moment in the future period are
Figure PCTCN2017085715-appb-000004
and that, when that moment arrives, the values actually measured at the base stations are $z_1, z_2, \ldots, z_N$; the accuracy of the prediction system for that future moment is then:
Figure PCTCN2017085715-appb-000005
the closer $\psi$ is to 1, the more accurate the system's prediction for future moments.
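The station-level K-fold procedure of steps S811 to S815 can be sketched as below. The formulas for $\eta_k$ and $\eta$ are given only as images in the source, so the scoring function is passed in by the caller; `relative_accuracy` (one minus the mean relative error) and `mean_predictor` are illustrative assumptions, not the patent's formulas, and all function names are hypothetical.

```python
def split_into_folds(station_ids, k):
    """Split stations into k consecutive folds of equal size c = N / k (S811)."""
    n = len(station_ids)
    if k < 2 or n % k != 0:
        raise ValueError("this sketch assumes N is divisible by K, with K >= 2")
    c = n // k
    return [station_ids[i * c:(i + 1) * c] for i in range(k)]


def station_kfold_scores(observed, k, predict, score):
    """Hold out each fold in turn (S812 to S815) and score its predictions.

    observed: dict mapping station id -> measured AQI at the current moment
    predict:  callable(known: dict, station_id) -> predicted AQI
    score:    callable(predicted: list, measured: list) -> float
    """
    scores = []
    for held_out in split_into_folds(sorted(observed), k):
        known = {s: v for s, v in observed.items() if s not in held_out}
        predicted = [predict(known, s) for s in held_out]
        measured = [observed[s] for s in held_out]
        scores.append(score(predicted, measured))
    return scores


def mean_predictor(known, station_id):
    # Placeholder predictor: the mean of the known stations.
    return sum(known.values()) / len(known)


def relative_accuracy(predicted, measured):
    # Assumed metric (NOT taken from the source): 1 - mean relative error.
    errors = [abs(p - m) / m for p, m in zip(predicted, measured)]
    return 1.0 - sum(errors) / len(errors)
```

In a real evaluation `predict` would be the fused predictor described later in this section; how the per-fold scores combine into $\eta$ follows the source's formula image, which is not reproduced here.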
Preferably, in the present invention the monitoring data recorded by the air quality monitoring base stations include date and time, base station name, base station longitude and latitude, AQI data, air temperature, air pressure, wind force, humidity and weather type. Where historical data are missing, the gaps may be filled by interpolation over the local time series.
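One simple way to fill gaps in an hourly series is linear interpolation between the nearest known neighbours. The sketch below assumes missing readings are represented as `None`; the source does not specify the interpolation scheme, so this is only one possible choice, and the function name is hypothetical.

```python
def fill_gaps(series):
    """Linearly interpolate None entries that lie between two known values.

    Leading and trailing gaps are left as None, since there is nothing
    to interpolate between.
    """
    filled = list(series)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            t = (i - left) / span
            filled[i] = filled[left] + t * (filled[right] - filled[left])
    return filled


# Three missing hourly AQI readings between 50 and 90.
hourly = fill_gaps([50.0, None, None, None, 90.0])
```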
Preferably, in step S3 multiple linear regression is used to establish the time prediction models corresponding respectively to prediction of the current moment and of the future period.
Establishing the time prediction models from the historical database in step S3 comprises the steps of:
S31, specifying the history sequence length $l_1$ and the prediction horizon, i.e. the future sequence length, $l_2$; denoting the data at the current moment by $x_n$, the history sequence is
Figure PCTCN2017085715-appb-000006
and the future sequence is
Figure PCTCN2017085715-appb-000007
all runs of $l_1 + 1 + l_2$ consecutive hours of data in the historical database are extracted to form the initial training data set $S_1$;
S32, establishing $l_2 + 1$ multiple linear regression models, corresponding respectively to prediction of the current moment and of each moment in the next $l_2$ hours, expressed as:
$Y_1 = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_p X_p$
where $\beta_i$ are the regression coefficients, $X_i$ are the model inputs, and $Y_1$ is the air quality index at the moment to be predicted;
for prediction of the current moment, the model inputs are the $l_1$ hours of historical AQI data and the air temperature, air pressure, wind force, humidity and weather type data of the hour preceding the current moment;
for prediction of each moment in the next $l_2$ hours, the model inputs are the AQI data of the current moment, the $l_1$ hours of historical AQI data, and the air temperature, air pressure, wind force, humidity and weather type data of the current moment.
Training each multiple linear regression model on the sequences in the initial training data set $S_1$ yields its regression coefficients, and hence the initial multiple linear regression models, i.e. the initial time prediction models.
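The construction of the training set $S_1$ and of the inputs for each of the $l_2 + 1$ models can be sketched as follows. For brevity the sketch covers only the AQI part of the inputs and omits the meteorological variables; the function names are hypothetical, and skipping windows that contain a gap is an assumption, not something the source prescribes.

```python
def extract_windows(hourly_aqi, l1, l2):
    """Return every run of l1 + 1 + l2 consecutive hours as (history, current, future)."""
    width = l1 + 1 + l2
    windows = []
    for start in range(len(hourly_aqi) - width + 1):
        chunk = hourly_aqi[start:start + width]
        if any(v is None for v in chunk):
            continue  # assumption: skip windows that still contain a gap
        windows.append((chunk[:l1], chunk[l1], chunk[l1 + 1:]))
    return windows


def training_pairs(windows, horizon):
    """Build (inputs, target) pairs for the model that predicts `horizon` hours ahead.

    horizon = 0 targets the current moment and uses only the history as input;
    horizon >= 1 targets a future hour and uses history plus the current value.
    """
    pairs = []
    for history, current, future in windows:
        if horizon == 0:
            pairs.append((list(history), current))
        else:
            pairs.append((list(history) + [current], future[horizon - 1]))
    return pairs


windows = extract_windows([1, 2, 3, 4, 5, 6], l1=2, l2=1)
now_pairs = training_pairs(windows, horizon=0)   # inputs: 2 history hours
next_pairs = training_pairs(windows, horizon=1)  # inputs: 2 history hours + current
```

Each list of pairs would then be handed to a least-squares fit to obtain the coefficients $\beta_i$ of the corresponding model.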
In the present invention, weather type data may be encoded as numbers, for example 0 for sunny, 1 for cloudy or overcast, 2 for rainy, and so on. Other existing data processing and representation methods may also be used.
Preferably, in step S4 of the present invention, using two-dimensional linear interpolation to establish the spatial prediction model for predicting air quality at specified coordinates comprises:
S41, acquiring the real-time monitoring data, at one and the same moment, of all locations whose air quality index is known, together with the longitude and latitude of those locations, to form the training data set $S_2$ of the spatial prediction model;
S42, defining the coordinates of the location to be predicted as $(x, y)$, the spatial prediction model that predicts the air quality index at that location is expressed as:
$Y_2 = \mathrm{griddata}(x, y, S_2)$
where the model input is $S_2$, the model output is the air quality index at the location to be predicted, and griddata() denotes a two-dimensional interpolation function.
The initial training data set of the spatial prediction model contains only the air quality index at the base stations.
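The source does not say which implementation of griddata() is meant. Linear scattered-data interpolation of this kind (for example the linear modes of MATLAB's `griddata` and SciPy's `scipy.interpolate.griddata`) triangulates the known points and interpolates linearly inside each triangle. The sketch below shows only that inner step, for a single triangle of three stations, using barycentric weights; the triangulation itself is omitted and the function names are hypothetical.

```python
def barycentric_weights(p, a, b, c):
    """Barycentric coordinates of point p with respect to triangle (a, b, c)."""
    (x, y), (x1, y1), (x2, y2), (x3, y3) = p, a, b, c
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("the three stations are collinear")
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    return w1, w2, 1.0 - w1 - w2


def interpolate_in_triangle(p, stations):
    """Linearly interpolate AQI at p from three (x, y, aqi) stations.

    Returns None when p lies outside the triangle.
    """
    (x1, y1, v1), (x2, y2, v2), (x3, y3, v3) = stations
    w1, w2, w3 = barycentric_weights(p, (x1, y1), (x2, y2), (x3, y3))
    if min(w1, w2, w3) < 0:
        return None
    return w1 * v1 + w2 * v2 + w3 * v3


# Three stations whose readings follow AQI = 60 + 20x + 40y.
stations = [(0.0, 0.0, 60.0), (1.0, 0.0, 80.0), (0.0, 1.0, 100.0)]
estimate = interpolate_in_triangle((0.25, 0.25), stations)
```

Because the interpolant is linear inside the triangle, it reproduces any AQI field that is itself linear in the coordinates; points outside the convex hull of the stations get no estimate from this step.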
Preferably, in step S5, the traffic data include the lengths of free-flowing, slow and congested road segments within a set radius around each location to be predicted and each air quality monitoring base station.
The geographic point-of-interest (POI) data include the distribution of geographic object entities within a set radius around each location to be predicted and each air quality monitoring base station; the types of geographic object entity include schools, banks, restaurants and gas stations. Other geographic object entities may also be included and are not enumerated here.
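The two feature groups described above can be sketched as follows: the share of road length in each traffic state, and the number of POIs of each type within a radius. The data layout, category labels, radius and function names are illustrative assumptions; distances use the standard haversine formula with a mean Earth radius of 6371 km.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def poi_counts(center, pois, radius_m, categories):
    """Count POIs of each category within radius_m of center.

    pois: iterable of (lat, lon, category) tuples (hypothetical layout).
    """
    counts = {c: 0 for c in categories}
    for lat, lon, category in pois:
        if category in counts and haversine_m(center[0], center[1], lat, lon) <= radius_m:
            counts[category] += 1
    return counts


def traffic_shares(segments):
    """Share of total road length that is free-flowing, slow and congested.

    segments: iterable of (length_m, state), state in {"free", "slow", "congested"}.
    """
    totals = {"free": 0.0, "slow": 0.0, "congested": 0.0}
    for length_m, state in segments:
        totals[state] += length_m
    total = sum(totals.values())
    if total == 0:
        raise ValueError("no road segments in range")
    return totals["free"] / total, totals["slow"] / total, totals["congested"] / total
```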
Preferably, step S5 of the present invention uses multiple linear regression to establish the dynamic prediction model, comprising the steps of:
S51, acquiring, for multiple moments in the historical database, the traffic data and POI data within a given radius around each base station, the traffic data comprising the shares of road length that are free-flowing, slow and congested, defined as $T_1, T_2, T_3$, and the POI data comprising the number of each type of POI within the given radius around the base station, defined as $T_4, T_5, \ldots, T_q$, together with the air quality index monitoring data of the corresponding base station at the corresponding moment, to establish the initial training set $S_3$;
S52, establishing the dynamic prediction model, expressed as:
$Y_3 = \alpha_0 + \alpha_1 T_1 + \alpha_2 T_2 + \alpha_3 T_3 + \alpha_4 T_4 + \ldots + \alpha_q T_q$
where $\alpha_i$ are the regression coefficients, the model inputs are the traffic data and POI data within the given radius of the location to be predicted at the specified moment, and the model output $Y_3$ is the air quality index at the point to be predicted.
The initial training data $S_3$ of the dynamic prediction model contain only the data for the base stations in the historical database. Before each prediction, training on the training set yields the values of the regression coefficients and hence the dynamic prediction model, which is then used to obtain the air quality index for the current and future moments. For prediction of future moments, the prior art can already forecast traffic data, so when making dynamic predictions for a future time the present invention may directly use traffic forecast data obtained with the prior art as input.
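Fitting the coefficients $\alpha_i$ is an ordinary least-squares problem. The sketch below solves the normal equations with Gauss-Jordan elimination in plain Python; it is one standard way to fit such a model, not necessarily the procedure the inventors used, and the function names are hypothetical.

```python
def fit_linear(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) a = X^T y.

    rows: list of feature lists [T1, ..., Tq]; an intercept column is added.
    Returns [alpha_0, alpha_1, ..., alpha_q].
    """
    X = [[1.0] + [float(v) for v in row] for row in rows]
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(n)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("features are linearly dependent")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def predict_linear(alphas, features):
    """Evaluate Y3 = alpha_0 + sum(alpha_i * T_i)."""
    return alphas[0] + sum(a * t for a, t in zip(alphas[1:], features))


# Synthetic data generated from Y = 50 + 10*T1 + 20*T2.
alphas = fit_linear([[0, 0], [1, 0], [0, 1], [1, 1]], [50, 60, 70, 80])
```

One caveat worth noting: if the three traffic shares $T_1, T_2, T_3$ always sum to 1, they are collinear with the intercept and the normal equations become singular, in which case one share would have to be dropped or a regularized fit used.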
According to the data analysis report of an indoor air quality survey published by the Department of Electronic Engineering of Tsinghua University [2], indoor and outdoor air quality are related numerically in several different ways, depending on a number of conditions that affect the relationship between the indoor and outdoor air quality indices: type of building environment, floor, distance from a main road, whether central air conditioning is on, whether windows are open for ventilation, whether an air purifier is on, and so on.
Preferably, step S6 of the present invention uses a regression tree algorithm to establish the indoor-outdoor prediction model, comprising the steps of:
S61, acquiring the air quality index data monitored by each base station at multiple specified moments in the historical database, together with the indoor air quality index data and related data shared by users at the corresponding locations and moments, the related data comprising type of building environment, floor, distance from a main road, whether central air conditioning is on, whether windows are open for ventilation and whether an air purifier is on; and establishing, from the acquired data, the initial training set $S_4$ of the indoor-outdoor prediction model;
S62, establishing the indoor-outdoor prediction model, expressed as:
$Y_4 = \mathrm{RT}(M, S_4)$
The model inputs are the user-shared indoor air quality index $M$ acquired for the location to be predicted at the moment to be predicted, and the data related to the indoor air quality index; the model output $Y_4$ is the air quality index of the location to be predicted at the moment to be predicted.
In the indoor-outdoor prediction model, when the data related to the indoor air quality index differ across the training data, i.e. when the conditions affecting the indoor-outdoor relationship differ, the model coefficients of the regression tree RT() also differ. By training on the input and output data collected under each set of identical conditions, the present invention obtains, for each set of conditions, a regression tree model and coefficients characterizing the relationship between the indoor and outdoor air quality indices, which are then applied to predict the air quality index at prediction locations under the same conditions. When predicting a future moment for a location to be predicted, the inputs may be the values of the model input data at that future moment as obtained with the prior art.
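The core operation of a regression tree is choosing the split that minimizes the squared error of the two resulting groups. The sketch below performs that search for a single split on the indoor AQI value, i.e. a depth-one tree fitted to samples collected under one set of conditions; a full tree would apply the same search recursively and over all input variables. The function names and sample values are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(samples):
    """Find the indoor-AQI threshold that minimizes the summed squared error.

    samples: list of (indoor_aqi, outdoor_aqi) pairs.
    Returns (threshold, left_mean, right_mean).
    """
    xs = sorted({x for x, _ in samples})
    best = None
    for lo, hi in zip(xs, xs[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in samples if x <= threshold]
        right = [y for x, y in samples if x > threshold]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, threshold, sum(left) / len(left), sum(right) / len(right))
    if best is None:
        raise ValueError("need at least two distinct indoor AQI values")
    return best[1:]


def predict_stump(stump, indoor_aqi):
    """Predict outdoor AQI as the mean of the leaf the sample falls into."""
    threshold, left_mean, right_mean = stump
    return left_mean if indoor_aqi <= threshold else right_mean


# Made-up samples for one condition group, e.g. windows closed, purifier off.
stump = best_split([(30, 50), (32, 54), (90, 150), (94, 158)])
```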
In actual prediction, if the data related to the indoor air quality index cannot be obtained for the corresponding moment, the indoor-outdoor prediction model is established as:
$Y_4 = M / 60\%$
According to the statistical relationship between indoor and outdoor air quality published by the US Environmental Protection Agency [1], indoor air quality is approximately 60% of outdoor air quality.
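As a worked example of this fallback rule, an indoor reading of $M = 90$ gives an outdoor estimate of $90 / 0.6 = 150$. A minimal sketch follows, with the 60% ratio exposed as a parameter; the function and constant names are hypothetical.

```python
INDOOR_TO_OUTDOOR_RATIO = 0.60  # indoor AQI as a fraction of outdoor AQI, per the text


def outdoor_from_indoor(indoor_aqi, ratio=INDOOR_TO_OUTDOOR_RATIO):
    """Estimate outdoor AQI with the fallback rule Y4 = M / ratio."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return indoor_aqi / ratio


estimate = outdoor_from_indoor(90)  # approximately 150
```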
Preferably, in step S7 of the present invention, co-training the established time prediction model, spatial prediction model, dynamic prediction model and indoor-outdoor prediction model comprises:
S71, denoting the time, spatial, dynamic and indoor-outdoor prediction models as predictors $F_1, F_2, F_3, F_4$ respectively and their training sets as $L_1, L_2, L_3, L_4$, and initializing the training sets as:
$L_1 = S_1$, $L_2 = S_2$, $L_3 = S_3$, $L_4 = S_4$;
initializing the weight vector of the predictors' results as $[w_1, w_2, w_3, w_4]$, the four weights summing to 1;
S72, training $F_1, F_2, F_3, F_4$ on the training sets $L_1, L_2, L_3, L_4$ respectively;
S73, acquiring, for each location to be predicted, the model input data required by each predictor at the moment to be predicted, and using the four trained predictors to calculate the predicted values for that moment, denoted:
$Y_1 = F_1(x, y)$
$Y_2 = F_2(x, y)$
$Y_3 = F_3(x, y)$
$Y_4 = F_4(x, y)$
S74, for each location to be predicted, the fused AQI value at the moment to be predicted is:
Figure PCTCN2017085715-appb-000008
S75, defining a deviation threshold $R_{th}$ for the prediction results, and calculating the sum of the deviations of the four predictors' results:
Figure PCTCN2017085715-appb-000009
S76, for each location to be predicted, comparing the calculated $R_{x,y}$ with the deviation threshold $R_{th}$; if the following is satisfied:
Figure PCTCN2017085715-appb-000010
exiting the loop and taking $Y_0$ as the predicted air quality index of each location to be predicted at the moment to be predicted; otherwise proceeding to step S77;
S77, from all locations to be predicted, selecting $n$ locations in ascending order of their $R_{x,y}$, denoted:
$S = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$;
S78, updating the training sets of the predictors as $L_1 = \{L_1, S\}$, $L_2 = \{L_2, S\}$, $L_3 = \{L_3, S\}$, with $L_4$ remaining the current $S_4$; returning to step S72 and repeating steps S72 to S78 to continue training until, at step S76, the following is satisfied:
Figure PCTCN2017085715-appb-000011
whereupon the corresponding $Y_0$ is taken as the predicted air quality index of each location to be predicted at the moment to be predicted.
As the above method shows, the present invention performs at least one round of training for the prediction of each moment. During iterative training, once a round is complete, the data in each model's training set are updated before the next round, so that subsequent rounds yield more accurate predictions. The data newly added to each training set are the data for those prediction locations at which the sum of deviations between the individual predictors' results and the co-training result was smallest in the previous round. For the spatial prediction model, for example, the newly added training data are the coordinates of those locations and the AQI data obtained for them in the previous round; for the dynamic prediction model, they are the historical air quality index data, traffic data and POI data at those locations; and so on.
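The control flow of steps S72 to S78 can be sketched as below. The fusion, deviation and stopping formulas appear only as images in the source, so this sketch assumes a weighted sum for $Y_0$, a sum of absolute differences for $R_{x,y}$, and "every location at or below $R_{th}$" as the exit test. The `max_rounds` guard, the rule of adding each location at most once and all function names are further assumptions added for illustration.

```python
def fuse(predictions, weights):
    # Assumed fusion rule: weighted sum of the predictor outputs.
    return sum(w * p for w, p in zip(weights, predictions))


def deviation(predictions, fused):
    # Assumed deviation measure: sum of absolute differences from the fused value.
    return sum(abs(p - fused) for p in predictions)


def co_train(train_fns, initial_sets, locations, weights, r_th, n_add, max_rounds=10):
    """Iterate S72 to S78; returns ({location: fused AQI}, rounds used).

    train_fns[i](training_set) must return a predictor: location -> AQI.
    Training sets are lists of (location, aqi) pairs in this sketch.
    """
    sets = [list(s) for s in initial_sets]
    added = set()
    fused = {}
    rounds = 0
    for rounds in range(1, max_rounds + 1):
        predictors = [train(s) for train, s in zip(train_fns, sets)]  # S72
        deviations = {}
        for loc in locations:  # S73 to S75
            preds = [f(loc) for f in predictors]
            fused[loc] = fuse(preds, weights)
            deviations[loc] = deviation(preds, fused[loc])
        if all(r <= r_th for r in deviations.values()):  # S76 (assumed test)
            break
        candidates = [loc for loc in locations if loc not in added]
        chosen = sorted(candidates, key=deviations.get)[:n_add]  # S77
        if not chosen:
            break
        for s in sets[:3]:  # S78: L1..L3 grow, L4 is left unchanged
            s.extend((loc, fused[loc]) for loc in chosen)
        added.update(chosen)
    return fused, rounds


def train_mean(training_set):
    # Toy stand-in for a real predictor: always predicts the training mean.
    mean = sum(aqi for _, aqi in training_set) / len(training_set)
    return lambda location: mean
```

With the toy `train_mean` predictor, feeding the fused value back into the first three training sets pulls their outputs towards one another, which is the mechanism by which the deviation shrinks from round to round.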
Further, if the corresponding predicted AQI value cannot be obtained from predictor $F_4$ in step S73, the fused AQI value is calculated in step S74 using the following formula:
Figure PCTCN2017085715-appb-000012
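The formula for this case is available only as an image. One natural reading, shown here purely as an assumption, is to drop the missing predictor and renormalize the remaining weights so that they again sum to 1; the function name is hypothetical.

```python
def fuse_available(predictions, weights):
    """Weighted average over the predictors that returned a value.

    predictions: list in which an unavailable predictor is represented by None.
    Assumption: remaining weights are rescaled to sum to 1; the source's
    actual formula is not reproduced here.
    """
    pairs = [(w, p) for w, p in zip(weights, predictions) if p is not None]
    total = sum(w for w, _ in pairs)
    if total <= 0:
        raise ValueError("no usable prediction")
    return sum(w * p for w, p in pairs) / total


# F4 unavailable: only the first three predictions contribute.
y0 = fuse_available([100.0, 120.0, 80.0, None], [0.25, 0.25, 0.25, 0.25])
```

When all four predictions are present and the weights already sum to 1, this reduces to the plain weighted sum.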
The present invention also provides an urban small-scale air quality index prediction system, comprising:
a region division module, which divides the urban area into a grid, the grid intersections corresponding to the locations whose air quality index is to be predicted;
a historical monitoring data acquisition module, which acquires historical monitoring data from the air quality monitoring base stations and establishes a historical database, the historical monitoring data including AQI data, meteorological data and weather type data;
a time prediction model establishing module, which establishes the time prediction models based on the historical database;
a spatial prediction model establishing module, which acquires real-time monitoring data from each air quality monitoring base station and establishes the spatial prediction model;
a dynamic prediction model establishing module, which acquires traffic data and geographic point-of-interest (POI) data for each location to be predicted and each air quality monitoring base station and establishes the dynamic prediction model;
an indoor-outdoor prediction model establishing module, which acquires the indoor air quality index shared by users and establishes the indoor-outdoor prediction model;
a co-training module, which co-trains the established time prediction model, spatial prediction model, dynamic prediction model and indoor-outdoor prediction model so as to fuse the prediction results of all models and obtain the predicted air quality index of all locations to be predicted at the current time and over a future period.
Beneficial effects
Compared with the prior art, the urban small-scale air quality prediction method provided by the invention has the following advantages:
1. It can more accurately predict the air quality index at any location within the city, for the current moment and for a number of hours ahead, giving people accurate air quality forecasts;
2. It fuses multiple data sources and multiple prediction models, avoiding the limitations of any single prediction model and safeguarding the accuracy of the model;
3. It trains the prediction models separately and co-trains them only at the end, which reduces overall computational complexity and shortens computation time.
Brief description of the drawings
Figure 1 is a schematic flow chart of the method of the present invention.
Detailed description
The technical solution of the present invention is described in more detail below by way of embodiments and with reference to the accompanying drawing. It should be noted that the present invention is not limited to the specific embodiments described below. In addition, for brevity, detailed descriptions of some well-known techniques are omitted.
The urban small-scale air quality index prediction method of the invention comprises the steps of:
S1, dividing the urban area into a grid, the grid intersections corresponding to the locations whose air quality index is to be predicted;
S2, acquiring historical monitoring data from each air quality monitoring base station and establishing a historical database;
S3, based on the monitoring data of multiple time series for each base station in the historical database, establishing time prediction models corresponding respectively to prediction of the current moment and of each moment in a future period;
S4, based on the monitoring data of all air quality monitoring base stations at the same moment in the historical database, using two-dimensional linear interpolation to establish a spatial prediction model for predicting air quality at specified coordinates;
S5, acquiring traffic data and geographic point-of-interest (POI) data for each location to be predicted and each air quality monitoring base station, together with the air quality index data of those locations and base stations at the corresponding moments; and, based on the acquired data, establishing a dynamic prediction model characterizing the relationship of the traffic data and POI data to the air quality index;
S6, acquiring the indoor air quality index shared by users, data on the users' living environment, and the air quality index data of the corresponding locations, and establishing an indoor-outdoor prediction model characterizing the relationship between the indoor and outdoor air quality indices;
S7, for any location to be predicted at any real-time moment for which the air quality index is to be predicted, co-training the established time prediction model, spatial prediction model, dynamic prediction model and indoor-outdoor prediction model so as to fuse the prediction results of all models, thereby obtaining the predicted air quality index of each location to be predicted at the current moment and at each moment in a future period.
In the present invention, the temporal prediction model for the current time characterizes the relationship between historical monitoring data and current monitoring data. The temporal prediction model for a future period characterizes the relationship between historical and current monitoring data on the one hand and the monitoring data at each time in the future period on the other; depending on the specified time span of the future period, it comprises multiple temporal prediction models, one for each time;
The spatial prediction model characterizes the relationship between the real-time monitoring data of known locations or base stations and the air quality index at locations to be predicted, whose real-time monitoring data is unknown.
By building these models and fusing their prediction results at prediction time, the invention predicts air quality at locations away from the base stations. Because the prediction result combines the influence of the various relevant factors, its accuracy is higher.
Example
Figure 1 is a flow chart of the present invention. As shown in Figure 1, the invention predicts the air quality index using a co-training algorithm over multiple prediction models. The individual prediction models used to predict air quality, the co-training algorithm, and the final accuracy evaluation are described in detail below.
First, a square grid system is established over the area to be predicted. In this embodiment, the area to be predicted is the area inside the Fifth Ring Road of Beijing, and the grid cell size is one square kilometer. The grid intersections are the locations at which the air quality index is to be predicted. The number of air quality monitoring base stations is denoted $N$. In this embodiment, Beijing has a total of 36 air quality monitoring base stations.
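As a rough illustration of the grid construction, the following sketch generates grid intersections with approximately 1 km spacing over a rectangular bounding box. The bounding-box coordinates, the function name `grid_intersections`, and the flat-earth conversion from kilometers to degrees are assumptions made for illustration; the source does not specify how the grid is computed.

```python
import math

KM_PER_DEG_LAT = 111.32  # approximate length of one degree of latitude (assumption)

def grid_intersections(lat_min, lat_max, lon_min, lon_max, cell_km=1.0):
    """Return (lat, lon) grid intersections with roughly cell_km spacing."""
    mid_lat = math.radians((lat_min + lat_max) / 2.0)
    dlat = cell_km / KM_PER_DEG_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    dlon = cell_km / (KM_PER_DEG_LAT * math.cos(mid_lat))
    n_lat = int(math.floor((lat_max - lat_min) / dlat)) + 1
    n_lon = int(math.floor((lon_max - lon_min) / dlon)) + 1
    return [(lat_min + i * dlat, lon_min + j * dlon)
            for i in range(n_lat) for j in range(n_lon)]

# Hypothetical bounding box, not the real extent of the Fifth Ring Road.
points = grid_intersections(39.80, 39.85, 116.30, 116.36)
```

Each returned point would be one location at which the air quality index is to be predicted.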
Construction of the temporal prediction model $F_1$ in step S3
A historical database of the air quality monitoring base stations is obtained and built, containing date and time, base station name, base station latitude and longitude, AQI data, temperature, air pressure, wind force, humidity, and weather type. In this embodiment, the sampling interval of the historical data is preferably 1 hour. To keep the training samples complete, missing historical data is filled in by interpolation over the local time series.
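The gap filling mentioned above can be sketched as follows. The source only states that missing values are completed by interpolation over the local time series; linear interpolation between the nearest known neighbors, and nearest-value filling at the ends, are assumptions, as is the function name `fill_gaps`.

```python
def fill_gaps(series):
    """Fill None entries by linear interpolation between known neighbours.

    Leading or trailing gaps are filled with the nearest known value.
    """
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no known values")
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            t = (i - left) / (right - left)
            filled[i] = series[left] + t * (series[right] - series[left])
    return filled

# Made-up hourly AQI readings with missing hours marked as None.
hourly_aqi = [80, None, None, 110, 120, None]
filled = fill_gaps(hourly_aqi)
```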
Based on the historical database, a unified time-series prediction model is built for all air quality monitoring base stations and is used to predict the air quality index of a specified location at a future point in time. This step comprises the following sub-steps:
Specify the length of the historical sequence and the prediction horizon. The data at the current time is denoted $x_n$; the historical sequence has length $L_1$ and is denoted
Figure PCTCN2017085715-appb-000013
The future sequence has length $L_2$ and is denoted
Figure PCTCN2017085715-appb-000014
Preferably, the historical sequence length is 6 and the prediction horizon is 6. That is, for any time, the most recent 6 hours of historical data are used to predict the air quality index for the next 6 hours. Accordingly, all consecutive sequences of $L_1+1+L_2$ hours in the historical database are extracted to form the training data set $S_1$.
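The extraction of consecutive $L_1+1+L_2$ hour sequences can be sketched as a sliding window over an hourly series. The function name `extract_windows` and the sample readings are hypothetical, and the sketch assumes the series is already gap-free.

```python
def extract_windows(series, hist_len=6, future_len=6):
    """Split an hourly series into (history, current, future) samples."""
    width = hist_len + 1 + future_len
    samples = []
    for start in range(len(series) - width + 1):
        window = series[start:start + width]
        history = window[:hist_len]        # the L1 hours before the current time
        current = window[hist_len]         # x_n, the current-time value
        future = window[hist_len + 1:]     # the L2 hours after the current time
        samples.append((history, current, future))
    return samples

hourly_aqi = list(range(100, 120))  # 20 made-up hourly readings
samples = extract_windows(hourly_aqi)
```

With 20 readings and a window of 13 hours, this yields 8 training samples.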
A multiple linear regression model is used to predict the current time and the next 6 hours. One multiple linear regression model is built for each predicted time point, giving 7 temporal prediction models in total. For prediction of the current time, the input data $S_1$ is the 6 hours of historical AQI data together with the previous hour's temperature, air pressure, wind force, humidity, and weather type. For prediction of the next 6 hours, the input data $S_1$ is the AQI at the current time and the 6 hours of historical AQI data, together with the current temperature, air pressure, wind force, humidity, and weather type. The output of each multiple linear regression model is the AQI at the time point to be predicted. The multiple linear regression model can be written as
$$Y_1 = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p \qquad (1)$$
where $\beta_i$ are the regression coefficients, $X_i$ are the input data items, and $Y_1$ is the air quality index at the point to be predicted.
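A minimal sketch of fitting the coefficients of equation (1) by ordinary least squares is given below. The source does not say how the regression is fitted; solving the normal equations with Gaussian elimination is an assumption, and the function names and data are illustrative only.

```python
def fit_linear_regression(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y.

    rows: list of feature lists [X_1, ..., X_p]; an intercept column is added.
    Returns [beta_0, beta_1, ..., beta_p].
    """
    X = [[1.0] + [float(v) for v in row] for row in rows]
    n = len(X[0])
    # Build the augmented matrix [X^T X | X^T y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(X, targets))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("features are linearly dependent")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]

def predict(beta, row):
    """Evaluate Y = beta_0 + sum(beta_i * X_i)."""
    return beta[0] + sum(b * x for b, x in zip(beta[1:], row))

# Made-up data generated from Y = 5 + 2*X1 - 1*X2.
rows = [[1, 0], [0, 1], [2, 1], [3, 5], [4, 2]]
targets = [5 + 2 * a - b for a, b in rows]
beta = fit_linear_regression(rows, targets)
```

In the scheme described above, one such model would be fitted per prediction time point, so 7 coefficient vectors in the preferred embodiment. A categorical input such as weather type would first need a numeric encoding, which the source does not describe.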
Construction of the spatial prediction model $F_2$ in step S4
Real-time data of all base stations at the same time is obtained, containing date and time, base station name, base station latitude and longitude, and AQI data.
The spatial prediction model uses a two-dimensional linear interpolation algorithm. The input data $S_2$ is the latitude, longitude, and AQI of base stations or grid points whose AQI values are known. The spatial prediction model can be expressed as:
$$Y_2 = \mathrm{griddata}(x, y, S_2) \qquad (2)$$
where $x, y$ are the coordinates of the point to be predicted, $S_2$ is the input data, that is, the training set, and $Y_2$ is the air quality index at the point to be predicted. The griddata function is an existing interpolation function.
The initial training data $S_2$ contains only data at the base stations. After the training set is updated, the added data are the previous-round prediction results at the locations to be predicted whose prediction deviation was smallest in the previous round of training.
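To illustrate two-dimensional linear interpolation over scattered stations, the sketch below interpolates inside a single triangle of three stations using barycentric weights. Existing griddata-style functions typically triangulate all input points first and then interpolate linearly within each triangle; the triangulation step is omitted here, and the function name and station values are hypothetical.

```python
def interpolate_in_triangle(p, stations):
    """Linearly interpolate a value at point p inside a triangle.

    stations: three (x, y, value) tuples forming the triangle vertices.
    Returns None when p lies outside the triangle.
    """
    (x1, y1, v1), (x2, y2, v2), (x3, y3, v3) = stations
    x, y = p
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("stations are collinear")
    # Barycentric weights of p with respect to the three vertices.
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    if min(w1, w2, w3) < 0:
        return None
    return w1 * v1 + w2 * v2 + w3 * v3

# Three made-up stations given as (x, y, AQI).
stations = [(0.0, 0.0, 60.0), (2.0, 0.0, 100.0), (0.0, 2.0, 80.0)]
aqi_inside = interpolate_in_triangle((1.0, 0.5), stations)
aqi_outside = interpolate_in_triangle((3.0, 3.0), stations)
```

A point outside every triangle of stations has no linear interpolant, which is one reason the other predictors are useful near the edge of the monitored area.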
Construction of the dynamic prediction model $F_3$ in step S5
Traffic data and geographic point-of-interest data are obtained within a given radius around every base station and every grid point to be predicted. The traffic data comprises the lengths of free-flowing, slow, and congested road segments, converted into proportions. The geographic point-of-interest data comprises the distribution of various types of geographic entities within the given radius of the specified location, such as the numbers of schools, banks, restaurants, and gas stations;
The dynamic prediction model is built as a multiple linear regression model whose input data is the traffic data and geographic point-of-interest data and whose output data is the AQI. The model has the form
$$Y_3 = \alpha_0 + \alpha_1 T_1 + \alpha_2 T_2 + \alpha_3 T_3 + \alpha_4 T_4 + \dots + \alpha_q T_q \qquad (3)$$
where $\alpha_i$ are the regression coefficients, $T_1, T_2, T_3$ are the proportions of free-flowing, slow, and congested road segments, $T_4, T_5, \dots, T_q$ are the numbers of geographic points of interest of each type, and $Y_3$ is the air quality index at the point to be predicted. The initial training data $S_3$ of the dynamic prediction model contains only data at the base stations.
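The input vector $T_1, \dots, T_q$ of equation (3) could be assembled as in the sketch below, which converts road-segment lengths into proportions and appends the point-of-interest counts. The function name, the units, and the sample values are assumptions for illustration.

```python
def traffic_features(free_km, slow_km, jam_km, poi_counts):
    """Build [T_1, ..., T_q]: three traffic proportions followed by POI counts."""
    total = free_km + slow_km + jam_km
    if total <= 0:
        raise ValueError("no road length within the radius")
    ratios = [free_km / total, slow_km / total, jam_km / total]
    return ratios + [float(c) for c in poi_counts]

# Hypothetical location: 6 km free-flowing, 3 km slow, 1 km congested,
# and counts of schools, banks, restaurants, and gas stations.
features = traffic_features(6.0, 3.0, 1.0, poi_counts=[2, 5, 12, 1])
```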
Indoor-outdoor prediction model $F_4$ in step S6
The indoor air quality index shared by users is obtained. This index is measured by an air quality sensor mounted on an air purifier that works with the present software system. The set of all user-shared data is denoted $S_4$ and serves as the training data of this model.
According to a data analysis report on an indoor air quality survey published by the Department of Electronic Engineering of Tsinghua University, indoor and outdoor air quality are related numerically in several different ways, depending on conditions such as the type of building environment, the floor, the distance from a main road, and whether central air conditioning is on, windows are open for ventilation, or an air purifier is running. A regression tree algorithm is used to fit the indoor-outdoor air quality index relationship separately for each category. In general, the indoor-outdoor prediction model can be expressed as
$$Y_4 = \mathrm{RT}(M, S_4) \qquad (4\text{-}1)$$
where RT is the regression tree algorithm, $M$ is the indoor air quality index measured by the sensor, and $Y_4$ is the outdoor air quality index at the point to be predicted. Different combinations of condition states in $S_4$ yield different regression tree coefficients, so a prediction model is trained for each combination of conditions found in the historical data and is used for prediction under that combination.
If the training data $S_4$ is missing, or the measured data lacks information such as the type of building environment, the floor, the distance from a main road, or whether central air conditioning is on, windows are open, or an air purifier is running, the indoor-outdoor prediction model is obtained as follows. According to the statistical relationship between indoor and outdoor air quality published by the US Environmental Protection Agency [1], indoor air quality is about 60% of outdoor air quality, that is:
$$Y_4 = M / 60\% \qquad (4\text{-}2)$$
where $M$ is the indoor air quality index measured by the sensor and $Y_4$ is the outdoor air quality index at the point to be predicted.
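The fallback relation (4-2) amounts to a single division, shown here as a worked example. The function name and the sample reading are hypothetical; only the 60% ratio comes from the text.

```python
INDOOR_OUTDOOR_RATIO = 0.60  # indoor AQI is about 60% of outdoor AQI, per the text

def outdoor_aqi_fallback(indoor_aqi, ratio=INDOOR_OUTDOOR_RATIO):
    """Estimate outdoor AQI from an indoor sensor reading using Y4 = M / ratio."""
    if not 0 < ratio <= 1:
        raise ValueError("ratio must be in (0, 1]")
    return indoor_aqi / ratio

# An indoor reading of 45 corresponds to an estimated outdoor AQI of about 75.
estimate = outdoor_aqi_fallback(45.0)
```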
Co-training algorithm in step S7
Once the four prediction models above have been built, a co-training algorithm is used to fuse their results. In the process, the four models may each be updated to a different degree. Co-training is a semi-supervised learning algorithm whose main purpose is to train predictors efficiently using a small amount of labeled data and a large amount of unlabeled data. This embodiment uses a simplified version of the co-training algorithm. The implementation steps are as follows:
S71, denote the temporal, spatial, dynamic, and indoor-outdoor prediction models as predictors $F_1, F_2, F_3, F_4$ and their training sets as $L_1, L_2, L_3, L_4$ (here $L_i$ denotes a training set, not the sequence lengths used in step S3), and initialize the training sets as:
$L_1 = S_1$, $L_2 = S_2$, $L_3 = S_3$, $L_4 = S_4$;
Initialize the weight vector of the predictors' results as $[w_1, w_2, w_3, w_4]$, where the four weights sum to 1.
S72, train $F_1, F_2, F_3, F_4$ on the training sets $L_1, L_2, L_3, L_4$ respectively;
S73, obtain the model input data that each predictor requires for the location and time to be predicted and, for each location to be predicted, use the four trained predictors to compute the predicted values at the time to be predicted, denoted:
$Y_1 = F_1(x, y)$
$Y_2 = F_2(x, y)$
$Y_3 = F_3(x, y)$
$Y_4 = F_4(x, y)$
S74, for each location to be predicted, the fused AQI value at the time to be predicted is:
Figure PCTCN2017085715-appb-000015
S75, define a deviation threshold $R_{th}$ for the prediction results and compute the sum of the deviations of the four predictors' results:
Figure PCTCN2017085715-appb-000016
S76, for each location to be predicted, compare the computed $R_{x,y}$ with the deviation threshold $R_{th}$; if the following condition is satisfied:
Figure PCTCN2017085715-appb-000017
then exit the loop and take $Y_0$ as the predicted air quality index of each location at the time to be predicted; otherwise go to step S77;
S77, from all locations to be predicted, select the $n$ locations with the smallest $R_{x,y}$, in ascending order of $R_{x,y}$, denoted:
$S = \{(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)\}$;
S78, update the training sets of the predictors as $L_1 = \{L_1, S\}$, $L_2 = \{L_2, S\}$, $L_3 = \{L_3, S\}$, with $L_4$ set to the current $S_4$; return to step S72 and repeat steps S72 to S78 until the following condition is satisfied at step S76:
Figure PCTCN2017085715-appb-000018
The $Y_0$ obtained when the condition is satisfied is then taken as the predicted air quality index of each location at the time to be predicted.
As the method above shows, the invention performs at least one round of training for the prediction at each time. During iterative training, the data in each model's training set is updated after each round and before the next, so that subsequent rounds can produce more accurate predictions. The data newly added to each training set is the data at the locations where the sum of deviations between the predictors' results and the co-training result was smallest in the previous round. For the spatial prediction model, for example, the new training data is the coordinates of that location and the AQI obtained there in the previous round; for the dynamic prediction model, it is the historical air quality index data, traffic data, and geographic point-of-interest data at that location; and so on.
If predictor $F_4$ cannot produce an AQI prediction in step S73, the fused AQI value in step S74 is instead computed with the following formula:
Figure PCTCN2017085715-appb-000019
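The fusion and deviation formulas of steps S74 and S75 are available here only as image placeholders, so the sketch below shows one plausible reading rather than the patent's exact formulas. It assumes the fused value is the weighted sum of the predictor outputs, that the deviation is the sum of absolute differences from the fused value, and that the weights of the remaining predictors are renormalized when $F_4$ is unavailable. All names and numbers are hypothetical.

```python
def fuse_predictions(preds, weights):
    """Weighted fusion of predictor outputs; None marks a missing predictor.

    Weights of the available predictors are renormalised to sum to 1
    (an assumption, since the source formula is only given as an image).
    """
    pairs = [(p, w) for p, w in zip(preds, weights) if p is not None]
    total_w = sum(w for _, w in pairs)
    if total_w <= 0:
        raise ValueError("no usable predictor")
    return sum(p * w for p, w in pairs) / total_w

def deviation_sum(preds, fused):
    """Sum of absolute deviations of the available predictions from the fused value."""
    return sum(abs(p - fused) for p in preds if p is not None)

weights = [0.25, 0.25, 0.25, 0.25]
preds = [100.0, 110.0, 90.0, 120.0]          # made-up outputs of F1..F4
y0 = fuse_predictions(preds, weights)
r = deviation_sum(preds, y0)
y0_without_f4 = fuse_predictions([100.0, 110.0, 90.0, None], weights)
```

Under these assumptions, locations with a small deviation sum are the ones whose fused value the predictors agree on, which matches the selection rule in step S77.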
Step S81: evaluating the accuracy of the prediction system for the current time
For the AQI prediction at each grid point at the current time, the accuracy of the co-training algorithm is computed by cross-validation. The implementation steps are as follows:
S811, using K-fold cross-validation, randomly divide all base stations into $K$ equal parts, numbered 1, 2, ..., k, k+1, ..., K, each containing $c = N/K$ base stations. Preferably, $K$ is 18 in this embodiment, so each part contains $c = N/K = 36/18 = 2$ base stations;
S812, remove the $k$-th part from the $K$ parts; the measured values of the base stations in this part and the predicted values of the grid cells in which they are located are used in later steps to compute accuracy, and the remaining $K-1$ parts serve as known data;
S813, based on the data of the $K-1$ known parts, perform step S7 to obtain the predicted AQI at the current time for each base station in the held-out $k$-th part, denoted
Figure PCTCN2017085715-appb-000020
S814, obtain the measured AQI values $y_1, y_2, \dots, y_c$ of the base stations in the $k$-th part; the accuracy of the prediction system for the current time when the $k$-th part is held out is then described by the following indicator $\eta_k$:
Figure PCTCN2017085715-appb-000021
S815, let $k$ run from 1 to $K$ to obtain the accuracy indicator $\eta$ of the prediction system at the current time:
Figure PCTCN2017085715-appb-000022
The closer $\eta$ is to 1, the more accurate the system's prediction for the current time.
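The partitioning in steps S811 and S812 can be sketched as below for the preferred values $N = 36$ and $K = 18$. Only the fold construction is shown; the accuracy indicators $\eta_k$ and $\eta$ are defined in equation images that are not reproduced here, so they are not implemented. The function name, station identifiers, and fixed random seed are assumptions.

```python
import random

def k_fold_split(station_ids, k, seed=0):
    """Randomly split station ids into k equal-sized folds."""
    ids = list(station_ids)
    if len(ids) % k != 0:
        raise ValueError("number of stations must be divisible by k")
    random.Random(seed).shuffle(ids)
    size = len(ids) // k
    return [ids[i * size:(i + 1) * size] for i in range(k)]

folds = k_fold_split(range(36), k=18)

# For each held-out fold, the other K-1 folds act as known data for step S7.
known_sets = []
for held_out in folds:
    known = [s for fold in folds if fold is not held_out for s in fold]
    known_sets.append(known)
```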
Step S82: evaluating the accuracy of AQI prediction for future times
Let the predicted values for the grid cells of all base stations at a specified future time, obtained after performing step S7, be
Figure PCTCN2017085715-appb-000023
and let the actual measured values at the base stations be $z_1, z_2, \dots, z_N$; the accuracy of the prediction system for future prediction is then:
Figure PCTCN2017085715-appb-000024
The closer $\psi$ is to 1, the more accurate the system's prediction for future times.
Example 2
By using a co-training algorithm that fuses multiple prediction methods, the invention predicts the air quality index at geographic locations throughout the city, not only near air quality monitoring base stations, and improves prediction accuracy while keeping computational complexity low.
The invention also provides an urban small-scale air quality index prediction system, comprising:
a region division module, which divides the urban area into regions using a grid, where each grid intersection corresponds to a location whose air quality index is to be predicted;
a historical monitoring data acquisition module, which obtains the historical monitoring data of the air quality monitoring base stations and builds a historical database, the historical monitoring data comprising AQI data, meteorological data, and weather type data;
a temporal prediction model building module, which builds the temporal prediction model based on the historical database;
a spatial prediction model building module, which obtains the real-time monitoring data of each air quality monitoring base station and builds the spatial prediction model;
a dynamic prediction model building module, which obtains the traffic data and geographic point-of-interest data of each location to be predicted and each air quality monitoring base station and builds the dynamic prediction model;
an indoor-outdoor prediction model building module, which obtains the indoor air quality index shared by users and builds the indoor-outdoor prediction model;
a co-training module, which co-trains the established temporal, spatial, dynamic, and indoor-outdoor prediction models so as to fuse the prediction results of all models and obtain the predicted air quality index of all locations to be predicted at the current time and over a future period.
Those skilled in the art will appreciate that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Claims (9)

  1. An urban small-scale air quality index prediction method, characterized by comprising:
    S1, dividing the urban area into regions using a grid, where each grid intersection corresponds to a location whose air quality index is to be predicted;
    S2, obtaining the historical monitoring data of each air quality monitoring base station and building a historical database;
    S3, based on the monitoring data of each base station over multiple time series in the historical database, building temporal prediction models that correspond respectively to prediction of the current time and prediction of each time within a future period;
    S4, based on the monitoring data of all air quality monitoring base stations at the same time in the historical database, using a two-dimensional linear interpolation method to build a spatial prediction model that predicts air quality at specified coordinates;
    S5, obtaining traffic data and geographic point-of-interest data for each location to be predicted and each air quality monitoring base station, together with the air quality index data of those locations and base stations at the corresponding times;
    based on the obtained data, building a dynamic prediction model that characterizes the relationship between the traffic data and geographic point-of-interest data and the air quality index;
    S6, obtaining the indoor air quality index shared by users, data on the users' living environment, and the air quality index data of the corresponding locations, and building an indoor-outdoor prediction model that characterizes the relationship between the indoor and outdoor air quality indices;
    S7, for any location to be predicted at any real-time moment for which the air quality index is to be predicted, co-training the established temporal, spatial, dynamic, and indoor-outdoor prediction models so as to fuse the prediction results of all models, thereby obtaining the predicted air quality index of each location to be predicted at the current time and at each time within a future period.
  2. The method according to claim 1, characterized by further comprising:
    S8, evaluating in real time the accuracy of the predicted air quality index values:
    S81, using a K-fold cross-validation algorithm to evaluate the accuracy of the predicted values for the current time:
    S811, letting the number of base stations be $N$, dividing all base stations equally into $K$ parts numbered 1, 2, ..., k, k+1, ..., K, each part containing $c = N/K$ base stations;
    S812, removing the $k$-th part from the $K$ parts, the remaining $K-1$ parts serving as known data;
    S813, based on the data of the $K-1$ known parts, obtaining the predicted AQI at the current time for each base station in the removed $k$-th part, denoted
    Figure PCTCN2017085715-appb-100001
    S814, obtaining the measured AQI values $y_1, y_2, \dots, y_c$ of the base stations in the $k$-th part, the accuracy of the predicted values for the current time being described by the following indicator $\eta_k$:
    Figure PCTCN2017085715-appb-100002
    S815, letting $k$ run from 1 to $K$ and repeating steps S812 to S814 for each $k$, then obtaining the accuracy indicator $\eta$ of the prediction system at the current time as:
    Figure PCTCN2017085715-appb-100003
    the closer $\eta$ is to 1, the more accurate the system's prediction for the current time;
    S82, evaluating the accuracy of the predicted values for a future period:
    letting the predicted values for all base stations at a specified time within the future period be
    Figure PCTCN2017085715-appb-100004
    and the actual measured values of the base stations when that specified time arrives be $z_1, z_2, \dots, z_N$, the accuracy of the prediction system for future times is:
    Figure PCTCN2017085715-appb-100005
    the closer $\psi$ is to 1, the more accurate the system's prediction for future times.
  3. The method according to claim 1, characterized in that the monitoring data recorded by the air quality monitoring base stations comprise date and time, base station name, base station latitude and longitude, AQI data, temperature, air pressure, wind force, humidity and weather type data;
    establishing the time prediction model based on the historical database in step S3 comprises the steps of:
    S31: specify the historical sequence length $l_1$ and the prediction horizon, i.e. the future sequence length, $l_2$; denoting the data at the current time as $x_n$, the historical sequence is
    Figure PCTCN2017085715-appb-100006
    and the future sequence is
    Figure PCTCN2017085715-appb-100007
    Extract from the historical database all sequences of $l_1 + 1 + l_2$ consecutive hours to form the initial training data set $S_1$;
    S32: establish $l_2 + 1$ multiple linear regression models, corresponding respectively to the prediction for the current time and for each time in the next $l_2$ hours, expressed as:
    $$Y_1 = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p$$
    where $\beta_i$ are the regression coefficients, $X_i$ are the model inputs, and $Y_1$ is the air quality index at the time to be predicted;
    for the current-time prediction, the model inputs are the $l_1$ hours of historical AQI data and the temperature, air pressure, wind force, humidity and weather type data of the hour preceding the current time;
    for the prediction of each time in the next $l_2$ hours, the model inputs are the AQI data at the current time, the $l_1$ hours of historical AQI data, and the temperature, air pressure, wind force, humidity and weather type data at the current time.
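Steps S31 and S32 can be sketched in Python as follows, assuming NumPy is available. The helper names `build_design` and `fit_time_models`, the synthetic data and the use of two numeric weather columns are assumptions for illustration; the claim does not say how weather type is encoded or which solver is used, so ordinary least squares via `numpy.linalg.lstsq` is only one possible choice.

```python
import numpy as np


def build_design(aqi, weather, l1, l2, h):
    """Rows of [1, inputs...] and targets for horizon h (0 = current time).
    h = 0 uses l1 past AQI values and the previous hour's weather;
    h >= 1 also uses the current AQI, and uses the current hour's weather."""
    rows, targets = [], []
    for n in range(l1, len(aqi) - l2):  # n indexes the current time
        if h == 0:
            feats = np.concatenate([aqi[n - l1:n], weather[n - 1]])
        else:
            feats = np.concatenate([aqi[n - l1:n + 1], weather[n]])
        rows.append(np.concatenate([[1.0], feats]))  # leading 1 gives beta_0
        targets.append(aqi[n + h])
    return np.array(rows), np.array(targets)


def fit_time_models(aqi, weather, l1, l2):
    """Fit l2 + 1 least-squares models, one per prediction horizon."""
    models = []
    for h in range(l2 + 1):
        X, y = build_design(aqi, weather, l1, l2, h)
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        models.append(beta)
    return models


# Synthetic hourly data (illustrative only): AQI with a daily cycle and two
# numeric columns standing in for the encoded meteorological inputs.
rng = np.random.default_rng(0)
hours = np.arange(200)
aqi = 100 + 20 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 1, hours.size)
weather = rng.normal(size=(hours.size, 2))

l1, l2 = 6, 3
models = fit_time_models(aqi, weather, l1, l2)
```

Each training window spans hours n - l1 through n + l2, which matches the l1 + 1 + l2 consecutive hours extracted in S31.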
  4. The method according to claim 3, characterized in that in step S4, establishing by two-dimensional linear interpolation the spatial prediction model for air quality prediction at specified coordinates comprises:
    S41: acquire from the historical database the real-time monitoring data, at the same time instant, of all locations with a known air quality index, together with the latitude and longitude of those locations, to form the training data set $S_2$ of the spatial prediction model;
    S42: define the coordinates of the location to be predicted as $(x, y)$; the spatial prediction model for predicting the air quality index at that location is expressed as:
    $$Y_2 = \mathrm{griddata}(x, y, S_2)$$
    where the model input is $S_2$, the model output is the air quality index at the location to be predicted, and griddata() denotes a two-dimensional interpolation function.
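The claim does not say which implementation of griddata() is meant. As a minimal sketch, SciPy's `scipy.interpolate.griddata` with `method="linear"` performs two-dimensional linear interpolation over scattered points. The station coordinates and AQI values below are made up, `spatial_predict` is a hypothetical helper, and treating longitude and latitude as planar coordinates is an assumption.

```python
import numpy as np
from scipy.interpolate import griddata

# Station coordinates (longitude, latitude); illustrative values only.
station_xy = np.array([
    [116.30, 39.90],
    [116.50, 39.92],
    [116.32, 40.00],
    [116.48, 40.02],
    [116.40, 39.95],
])
# Synthetic AQI readings that lie on a plane, so the result is easy to check.
station_aqi = (80.0
               + 100.0 * (station_xy[:, 0] - 116.30)
               + 400.0 * (station_xy[:, 1] - 39.90))


def spatial_predict(x, y):
    """Linearly interpolate the AQI at (x, y) from the station readings.
    Returns NaN when (x, y) lies outside the convex hull of the stations."""
    query = np.array([[x, y]])
    return float(griddata(station_xy, station_aqi, query, method="linear")[0])


y2 = spatial_predict(116.40, 39.96)      # inside the hull of the stations
outside = spatial_predict(117.00, 39.00)  # outside the hull, so NaN
```

Because linear interpolation is undefined outside the convex hull of the stations, grid points near the city edge would need a fallback such as nearest-neighbour interpolation; the claim does not address this case.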
  5. The method according to claim 4, characterized in that in step S5, the traffic data comprise the lengths of free-flowing, slow and congested road segments within a set radius around each location to be predicted and each air quality monitoring base station.
    The geographic point-of-interest data comprise the distribution of geographic object entities within a set radius around each location to be predicted and each air quality monitoring base station; the geographic object entity types include schools, banks, restaurants and gas stations.
    Step S5 establishes the dynamic prediction model by multiple linear regression, comprising the steps of:
    S51: acquire, for multiple time instants in the historical database, the traffic data and geographic point-of-interest data within a given radius around each base station. The traffic data comprise the proportions of the lengths of free-flowing, slow and congested road segments, defined as $T_1, T_2, T_3$; the geographic point-of-interest data comprise the number of each type of point of interest within the given radius around the base station, defined as $T_4, T_5, \dots, T_q$. Together with the air quality index monitoring data of the corresponding base station at the corresponding time, these form the initial training set $S_3$;
    S52: establish the dynamic prediction model, expressed as:
    $$Y_3 = \alpha_0 + \alpha_1 T_1 + \alpha_2 T_2 + \alpha_3 T_3 + \alpha_4 T_4 + \dots + \alpha_q T_q$$
    where $\alpha_i$ are the regression coefficients, the model inputs are the traffic data and geographic point-of-interest data within the given radius of the location to be predicted at the specified time, and the model output $Y_3$ is the air quality index of the location to be predicted.
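A minimal Python sketch of fitting the dynamic model in S52, assuming NumPy. The feature rows, the synthetic targets and the helper `dynamic_predict` are illustrative assumptions, with only two point-of-interest types (q = 5). Since the three road-length shares sum to 1, they are collinear with the intercept; `numpy.linalg.lstsq` still returns a usable minimum-norm solution, which is one reason it is used here.

```python
import numpy as np

# Each row: [T1, T2, T3, T4, T5] = share of free-flowing, slow and congested
# road length, then counts of two point-of-interest types (made-up values).
T = np.array([
    [0.7, 0.2, 0.1, 3, 1],
    [0.5, 0.3, 0.2, 5, 0],
    [0.4, 0.3, 0.3, 2, 2],
    [0.6, 0.3, 0.1, 1, 1],
    [0.3, 0.3, 0.4, 4, 3],
    [0.8, 0.1, 0.1, 0, 0],
    [0.5, 0.2, 0.3, 6, 1],
], dtype=float)

# Synthetic AQI targets generated from an assumed linear rule, so that the
# fitted model can be checked against a known answer.
aqi = 50 + 100 * T[:, 2] + 2 * T[:, 3] + 5 * T[:, 4]

# Prepend a column of ones for alpha_0 and solve the least-squares problem.
X = np.column_stack([np.ones(len(T)), T])
alpha, *_ = np.linalg.lstsq(X, aqi, rcond=None)


def dynamic_predict(features):
    """Evaluate Y3 = alpha_0 + sum(alpha_i * T_i) for one location."""
    return float(alpha[0] + np.dot(alpha[1:], features))


y3 = dynamic_predict([0.6, 0.2, 0.2, 2, 1])
```

Dropping one of the three share columns would remove the collinearity and make the individual coefficients interpretable; the claim keeps all three.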
  6. The method according to claim 5, characterized in that step S6 establishes the indoor-outdoor prediction model using a regression tree algorithm, comprising the steps of:
    S61: acquire, for multiple specified time instants in the historical database, the air quality index data monitored by each base station, together with the indoor air quality index data and indoor-AQI-related data shared by users at the corresponding locations and times; the indoor-AQI-related data comprise building environment type, floor, distance from the main road, whether central air conditioning is on, whether windows are open for ventilation, and whether an air purifier is on; establish the initial training set $S_4$ of the indoor-outdoor prediction model from the acquired data;
    S62: establish the indoor-outdoor prediction model, expressed as:
    $$Y_4 = \mathrm{RT}(M, S_4)$$
    The model inputs are the user-shared indoor air quality index $M$ acquired for the location to be predicted at the time to be predicted, and the indoor-AQI-related data; the model output $Y_4$ is the air quality index of the location to be predicted at the time to be predicted.
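The claim names a regression tree but does not give its splitting rule or stopping criteria. The pure-Python sketch below uses the common choice of minimising the summed squared error of the two children; `fit_tree`, `predict`, the depth limit and the three-feature encoding (indoor AQI M, window open, purifier on) are all assumptions for illustration.

```python
from typing import List, Optional, Tuple

Row = List[float]


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


def sse(values: List[float]) -> float:
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(X: List[Row], y: List[float]) -> Optional[Tuple[int, float]]:
    """Return the (feature index, threshold) that minimises the summed squared
    error of the two children, or None if no split improves on the parent."""
    best, best_cost = None, sse(y)
    for j in range(len(X[0])):
        # Excluding the largest value keeps both children non-empty.
        for t in sorted({row[j] for row in X})[:-1]:
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            cost = sse(left) + sse(right)
            if cost < best_cost:
                best, best_cost = (j, t), cost
    return best


def fit_tree(X, y, depth=0, max_depth=3):
    """Recursively grow a tree; each leaf stores the mean target value."""
    split = best_split(X, y) if depth < max_depth else None
    if split is None:
        return mean(y)
    j, t = split
    left = [(r, v) for r, v in zip(X, y) if r[j] <= t]
    right = [(r, v) for r, v in zip(X, y) if r[j] > t]
    return (j, t,
            fit_tree([r for r, _ in left], [v for _, v in left], depth + 1, max_depth),
            fit_tree([r for r, _ in right], [v for _, v in right], depth + 1, max_depth))


def predict(tree, row: Row) -> float:
    """Walk from the root to a leaf and return the stored mean."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree


# Made-up samples: [indoor AQI M, window open (0/1), purifier on (0/1)],
# with the outdoor AQI measured at the matching base station as the target.
X_train = [[40, 0, 1], [42, 0, 1], [90, 1, 0], [95, 1, 0]]
y_train = [120.0, 120.0, 100.0, 100.0]
tree = fit_tree(X_train, y_train)
y4 = predict(tree, [41, 0, 1])
```

Categorical inputs such as building environment type would need a numeric encoding before being passed to a tree of this kind.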
  7. The method according to claim 6, characterized in that step S7 co-trains the established time prediction model, spatial prediction model, dynamic prediction model and indoor-outdoor prediction model, comprising:
    S71: denote the time, spatial, dynamic and indoor-outdoor prediction models as predictors $F_1, F_2, F_3, F_4$ respectively, and the training sets of the predictors as $L_1, L_2, L_3, L_4$; initialize the training sets as:
    $$L_1 = S_1,\ L_2 = S_2,\ L_3 = S_3,\ L_4 = S_4;$$
    Initialize the weight vector of the predictors' results as $[w_1, w_2, w_3, w_4]$, where the four weights sum to 1.
    S72: train $F_1, F_2, F_3, F_4$ on the training sets $L_1, L_2, L_3, L_4$ respectively;
    S73: acquire, for each location to be predicted at the time to be predicted, the model input data required by each predictor; using these data, compute for each location the predicted value at the time to be predicted with each of the four trained predictors, denoted:
    $$Y_1 = F_1(x, y)$$
    $$Y_2 = F_2(x, y)$$
    $$Y_3 = F_3(x, y)$$
    $$Y_4 = F_4(x, y)$$
    S74: for each location to be predicted, the fused AQI value at the time to be predicted is:
    Figure PCTCN2017085715-appb-100008
    S75: define the deviation threshold $R_{th}$ of the prediction results, and compute the sum of the deviations of the four predictors' results:
    Figure PCTCN2017085715-appb-100009
    S76: for each location to be predicted, compare the computed $R_{x,y}$ with the deviation threshold $R_{th}$; if the following is satisfied:
    Figure PCTCN2017085715-appb-100010
    then exit the loop and take $Y_0$ as the predicted air quality index of each location to be predicted at the time to be predicted; otherwise go to step S77;
    S77: from all locations to be predicted, select $n$ locations in ascending order of the corresponding $R_{x,y}$, denoted:
    $$S = \{(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)\};$$
    S78: update the training sets of the predictors as $L_1 = \{L_1, S\}$, $L_2 = \{L_2, S\}$, $L_3 = \{L_3, S\}$, with $L_4$ being the current $S_4$; go to step S72 and repeat steps S72 to S78 to continue training until, at step S76, the condition
    Figure PCTCN2017085715-appb-100011
    is satisfied; the $Y_0$ obtained at that point is then taken as the predicted air quality index of each location to be predicted at the time to be predicted.
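One pass through steps S74 to S77 can be sketched in Python as below. The fusion, deviation and exit-condition formulas appear in the source only as figures that are not reproduced in this text, so the weighted sum in `fuse`, the sum of absolute differences in `deviation` and the "every deviation at or below the threshold" exit test are assumptions; `co_training_round` and the sample values are hypothetical.

```python
from typing import Dict, List, Tuple

Location = Tuple[float, float]


def fuse(preds: List[float], weights: List[float]) -> float:
    """ASSUMED fusion rule: weighted sum of the predictor outputs."""
    return sum(w * p for w, p in zip(weights, preds))


def deviation(preds: List[float], fused: float) -> float:
    """ASSUMED deviation: sum of absolute differences from the fused value."""
    return sum(abs(p - fused) for p in preds)


def co_training_round(predictions: Dict[Location, List[float]],
                      weights: List[float], r_th: float, n: int):
    """One pass over S74 to S77. Returns the fused values, a converged flag,
    and the n most consistent locations to append to the training sets."""
    fused = {loc: fuse(p, weights) for loc, p in predictions.items()}
    devs = {loc: deviation(p, fused[loc]) for loc, p in predictions.items()}
    converged = all(r <= r_th for r in devs.values())  # ASSUMED exit test
    selected = [] if converged else sorted(devs, key=devs.get)[:n]
    return fused, converged, selected


# Made-up outputs of F1..F4 at two grid points.
weights = [0.25, 0.25, 0.25, 0.25]
predictions = {
    (116.30, 39.90): [100.0, 104.0, 98.0, 102.0],
    (116.40, 39.95): [90.0, 130.0, 70.0, 110.0],
}
fused, converged, selected = co_training_round(predictions, weights, r_th=10.0, n=1)
```

In a full loop, the selected locations and their fused values would be appended to the training sets as in S78, the predictors retrained as in S72, and the round repeated until `converged` is true.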
  8. The method according to claim 7, characterized in that if no AQI prediction can be obtained from predictor $F_4$ in step S73, the fused AQI value is calculated in step S74 using the following formula:
    Figure PCTCN2017085715-appb-100012
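The fallback formula of claim 8 is given only as a figure that is not reproduced in this text. One plausible reading, shown here purely as an assumption, is to fuse the predictors that did return a value and rescale their weights so that they again sum to 1; `fuse_available` is a hypothetical helper.

```python
from typing import List, Optional


def fuse_available(preds: List[Optional[float]], weights: List[float]) -> float:
    """Weighted fusion over the predictors that returned a value.
    ASSUMPTION: the weights of the available predictors are rescaled so that
    they sum to 1; the patent's own formula is not reproduced in this text."""
    pairs = [(w, p) for w, p in zip(weights, preds) if p is not None]
    total = sum(w for w, _ in pairs)
    if total == 0:
        raise ValueError("no predictor returned a value")
    return sum(w * p for w, p in pairs) / total


# F4 has no user-shared indoor data for this location, so it returns None.
y0 = fuse_available([100.0, 104.0, 96.0, None], [0.3, 0.3, 0.2, 0.2])
```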
  9. An urban small-scale air quality index prediction system, characterized by comprising:
    a region division module, which divides the urban area into a grid, the grid intersections corresponding to the locations whose air quality index is to be predicted;
    a historical monitoring data acquisition module, which acquires the historical monitoring data of each air quality monitoring base station and establishes a historical database;
    a time prediction model building module, which, based on the monitoring data of multiple time series of each base station in the historical database, establishes time prediction models corresponding respectively to the prediction for the current time and for each time in a future period;
    a spatial prediction model building module, which, based on the monitoring data of all air quality monitoring base stations at the same time instant in the historical database, establishes by two-dimensional linear interpolation a spatial prediction model for air quality prediction at specified coordinates;
    a dynamic prediction model building module, which acquires the traffic data and geographic point-of-interest data of each location to be predicted and each air quality monitoring base station, together with the air quality index data of those locations and base stations at the corresponding times, and, based on the acquired data, establishes a dynamic prediction model characterizing the relationship between the traffic and point-of-interest data and the air quality index;
    an indoor-outdoor prediction model building module, which acquires the user-shared indoor air quality index, the user's living-environment data, and the air quality index data of the corresponding location, and establishes an indoor-outdoor prediction model characterizing the relationship between the indoor and outdoor air quality indices;
    a co-training module, which, for any location to be predicted at any real-time instant at which the air quality index is to be predicted, co-trains the established time, spatial, dynamic and indoor-outdoor prediction models so as to fuse the prediction results of all models, thereby obtaining the predicted air quality index of each location to be predicted at the corresponding current time and at each time in a future period.
PCT/CN2017/085715 2017-05-24 2017-05-24 Small-scale air quality index prediction method and system for city WO2018214060A1 (en)

Priority Applications (2)

Application Number | Priority Date | Filing Date | Title
PCT/CN2017/085715 WO2018214060A1 (en) | 2017-05-24 | 2017-05-24 | Small-scale air quality index prediction method and system for city
CN201780005024.8A CN108701274B (en) | 2017-05-24 | 2017-05-24 | Urban small-scale air quality index prediction method and system

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
PCT/CN2017/085715 WO2018214060A1 (en) | 2017-05-24 | 2017-05-24 | Small-scale air quality index prediction method and system for city

Publications (1)

Publication Number | Publication Date
WO2018214060A1 (en) | 2018-11-29

Family

ID=63844053

Family Applications (1)

Application Number | Priority Date | Filing Date | Title
PCT/CN2017/085715 WO2018214060A1 (en) | 2017-05-24 | 2017-05-24 | Small-scale air quality index prediction method and system for city

Country Status (2)

Country | Link
CN (1) | CN108701274B (en)
WO (1) | WO2018214060A1 (en)

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020021343A1 (en) * 2018-07-25 2020-01-30 山东诺方电子科技有限公司 Method for evaluating credibility of data from environmental monitoring station
CN111178653B (en) * 2018-11-13 2022-12-02 百度在线网络技术(北京)有限公司 Method and device for determining a contaminated area
CN109657842A (en) * 2018-11-27 2019-04-19 平安科技(深圳)有限公司 The prediction technique and device of air pollutant concentration, electronic equipment
CN110162587A (en) * 2019-05-27 2019-08-23 北京气象在线科技有限公司 Meteorological benchmark index generation method towards outdoor physical exercises
CN110346518B (en) * 2019-07-25 2021-06-15 中南大学 Traffic emission pollution visualization early warning method and system thereof
CN110361505B (en) * 2019-07-25 2021-06-22 中南大学 Method of train passenger health early warning system in outside air pollution environment
CN110346517B (en) * 2019-07-25 2021-06-08 中南大学 Smart city industrial atmosphere pollution visual early warning method and system
CN110555551B (en) * 2019-08-23 2022-12-13 中南大学 Air quality big data management method and system for smart city
CN110784891B (en) * 2019-10-21 2022-08-26 中国联合网络通信集团有限公司 Data processing method and device
CN111081016B (en) * 2019-12-18 2021-07-06 北京航空航天大学 Urban traffic abnormity identification method based on complex network theory
CN111163430B (en) * 2019-12-30 2023-11-21 上海云瀚科技股份有限公司 Water quantity prediction method based on mobile phone base station user positioning data
CN111340288B (en) * 2020-02-25 2024-04-05 武汉墨锦创意科技有限公司 Urban air quality time sequence prediction method considering time-space correlation
CN111339392B (en) * 2020-03-27 2023-02-03 中国科学院大气物理研究所 Sky blue index determination method and system based on meteorological elements
CN111581602A (en) * 2020-05-07 2020-08-25 南京信息工程大学 Air quality index self-adaptive prediction voice system
CN111766347B (en) * 2020-07-24 2023-08-29 苍龙集团有限公司 Indoor air quality real-time monitoring method and device
CN112084286B (en) * 2020-09-14 2021-06-29 智慧足迹数据科技有限公司 Spatial data processing method and device, computer equipment and storage medium
CN112541302B (en) * 2020-12-23 2024-02-06 北京百度网讯科技有限公司 Air quality prediction model training method, air quality prediction method and device
CN112765229B (en) * 2020-12-25 2022-08-16 哈尔滨工程大学 Air quality inference method based on multilayer attention mechanism
CN112597144B (en) * 2020-12-29 2022-11-08 农业农村部环境保护科研监测所 Automatic cleaning method for production place environment monitoring data
CN113254417B (en) * 2021-06-29 2022-02-22 南京满星数据科技有限公司 Meteorological grid data service method and system based on big data technology
CN114674988B (en) * 2022-05-25 2022-09-02 维睿空气系统产品(深圳)有限公司 Air on-line monitoring system based on wireless network
CN115936242B (en) * 2022-12-26 2023-11-17 中科三清科技有限公司 Method and device for acquiring traceability relation data of air quality and traffic condition

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103514366A (en) * 2013-09-13 2014-01-15 中南大学 Urban air quality concentration monitoring missing data recovering method
CN104200104A (en) * 2014-09-04 2014-12-10 浙江鸿程计算机系统有限公司 Fine granularity air pollutant concentration area estimation method based on spatial characteristics
CN104200103A (en) * 2014-09-04 2014-12-10 浙江鸿程计算机系统有限公司 Urban air quality grade predicting method based on multi-field characteristics
CN105493109A (en) * 2013-06-05 2016-04-13 微软技术许可有限责任公司 Air quality inference using multiple data sources

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103615790B (en) * 2013-12-20 2016-03-30 山东钢铁股份有限公司 A kind of method and system utilizing natural conditions to regulate skyscraper air quality
CN104008278B (en) * 2014-05-14 2017-02-15 昆明理工大学 PM2.5 concentration prediction method based on feature vectors and least square support vector machine
CN105243444A (en) * 2015-10-09 2016-01-13 杭州尚青科技有限公司 City monitoring station air quality prediction method based on online multi-core regression

Cited By (71)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109657858A (en) * 2018-12-17 2019-04-19 杭州电子科技大学 Roadside Air Pollution Forecast method based on uneven amendment semi-supervised learning
CN109657858B (en) * 2018-12-17 2023-06-23 杭州电子科技大学 Road edge air pollution prediction method based on unbalance correction semi-supervised learning
CN110621026A (en) * 2019-02-18 2019-12-27 北京航空航天大学 Base station flow multi-time prediction method
CN110621026B (en) * 2019-02-18 2023-09-05 北京航空航天大学 Multi-moment prediction method for base station flow
CN110210681A (en) * 2019-06-11 2019-09-06 西安电子科技大学 A kind of prediction technique of the monitoring station PM2.5 value based on distance
CN110363350A (en) * 2019-07-15 2019-10-22 西华大学 A kind of regional air pollutant analysis method based on complex network
CN110363350B (en) * 2019-07-15 2023-10-10 西华大学 Regional air pollutant analysis method based on complex network
CN110399537A (en) * 2019-07-22 2019-11-01 苏州量盾信息科技有限公司 A kind of alert spatio-temporal prediction method based on artificial intelligence technology
CN110533239A (en) * 2019-08-23 2019-12-03 中南大学 A kind of smart city air quality high-precision measuring method
CN110531815A (en) * 2019-09-25 2019-12-03 中国农业科学院农业信息研究所 A kind of greenhouse intelligent pre-conditioned device and method merging indoor and outdoor surroundings parameter
CN110825754B (en) * 2019-10-23 2022-06-17 北京蛙鸣华清环保科技有限公司 Air quality spatial interpolation method, system, medium and device based on attributes
CN110825754A (en) * 2019-10-23 2020-02-21 北京蛙鸣华清环保科技有限公司 Air quality spatial interpolation method, system, medium and device based on attributes
CN111077048A (en) * 2019-11-27 2020-04-28 华南师范大学 Opportunistic group intelligent air quality monitoring and evaluating method based on mobile equipment
CN110929793A (en) * 2019-11-27 2020-03-27 谢国宇 Time-space domain model modeling method and system for ecological environment monitoring
CN113052353B (en) * 2019-12-27 2022-10-18 中移雄安信息通信科技有限公司 Air quality prediction and prediction model training method and device and storage medium
CN113052353A (en) * 2019-12-27 2021-06-29 中移雄安信息通信科技有限公司 Air quality prediction and prediction model training method and device and storage medium
KR20210086786A (en) * 2019-12-30 2021-07-09 전북대학교산학협력단 System and method for predicting fine dust and odor
KR102439038B1 (en) 2019-12-30 2022-09-02 전북대학교산학협력단 System and method for predicting fine dust and odor
CN111125937A (en) * 2020-01-13 2020-05-08 暨南大学 Near-ground atmospheric fine particulate matter concentration estimation method based on space-time weighted regression model
CN111125937B (en) * 2020-01-13 2023-05-02 暨南大学 Near-ground atmosphere fine particulate matter concentration estimation method based on space-time weighted regression model
CN111401605A (en) * 2020-02-17 2020-07-10 北京石油化工学院 Interpretable prediction method for atmospheric pollution
CN111401605B (en) * 2020-02-17 2023-05-02 北京石油化工学院 Interpreted prediction method for atmospheric pollution
CN111461163B (en) * 2020-02-25 2023-03-24 河南大学 Urban interior PM2.5 concentration simulation and population exposure evaluation method and device
CN111461163A (en) * 2020-02-25 2020-07-28 河南大学 Urban interior PM2.5 concentration simulation and population exposure evaluation method and device
CN112580859A (en) * 2020-06-01 2021-03-30 北京理工大学 Haze prediction method based on global attention mechanism
CN111832222B (en) * 2020-06-28 2023-07-25 成都佳华物链云科技有限公司 Pollutant concentration prediction model training method, pollutant concentration prediction method and pollutant concentration prediction device
CN111832222A (en) * 2020-06-28 2020-10-27 成都佳华物链云科技有限公司 Pollutant concentration prediction model training method, prediction method and device
CN112100913A (en) * 2020-09-08 2020-12-18 中国电力科学研究院有限公司 Data-driven clear electricity price boundary simulation method and system and storage medium
CN112417753B (en) * 2020-11-04 2024-03-29 中国科学技术大学 Urban public transport resource-based joint scheduling method
CN112417753A (en) * 2020-11-04 2021-02-26 中国科学技术大学 Urban public transport resource joint scheduling method
CN112308336B (en) * 2020-11-18 2023-12-19 浙江大学 High-speed railway strong wind speed limiting dynamic treatment method based on multi-step time sequence prediction
CN112308336A (en) * 2020-11-18 2021-02-02 浙江大学 High-speed railway high wind speed limit dynamic disposal method based on multi-step time sequence prediction
CN112418560A (en) * 2020-12-10 2021-02-26 长春理工大学 PM2.5 concentration prediction method and system
CN112418560B (en) * 2020-12-10 2024-05-14 长春理工大学 PM2.5 concentration prediction method and system
CN112561191B (en) * 2020-12-22 2024-02-27 北京百度网讯科技有限公司 Prediction model training method, prediction device, prediction apparatus, prediction program, and program
CN112561191A (en) * 2020-12-22 2021-03-26 北京百度网讯科技有限公司 Prediction model training method, prediction method, device, apparatus, program, and medium
CN112561199A (en) * 2020-12-23 2021-03-26 北京百度网讯科技有限公司 Weather parameter prediction model training method, weather parameter prediction method and device
CN112801366A (en) * 2021-01-27 2021-05-14 上海微亿智造科技有限公司 Industrial quality data index intelligent prediction method, system and medium
CN113077081A (en) * 2021-03-26 2021-07-06 航天科工智能运筹与信息安全研究院(武汉)有限公司 Traffic pollution emission prediction method
CN113033901A (en) * 2021-03-30 2021-06-25 上海眼控科技股份有限公司 Meteorological element prediction method, device, equipment and storage medium
CN113155190A (en) * 2021-04-16 2021-07-23 浙江农林大学 Foundation pit construction area environment monitoring device and method
CN113222236A (en) * 2021-04-30 2021-08-06 中国科学技术大学先进技术研究院 Data distribution self-adaptive cross-regional exhaust emission prediction method and system
CN113554105A (en) * 2021-07-28 2021-10-26 桂林电子科技大学 Missing data completion method for Internet of things based on space-time fusion
CN113554105B (en) * 2021-07-28 2023-04-18 桂林电子科技大学 Missing data completion method for Internet of things based on space-time fusion
CN113992525A (en) * 2021-10-12 2022-01-28 支付宝(杭州)信息技术有限公司 Method and device for adjusting number of applied containers
CN113919231A (en) * 2021-10-25 2022-01-11 北京航天创智科技有限公司 PM2.5 concentration space-time change prediction method and system based on space-time diagram neural network
CN113919234A (en) * 2021-10-29 2022-01-11 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) Mobile source emission prediction method, system and equipment based on time sequence characteristic migration
CN114330850A (en) * 2021-12-21 2022-04-12 南京大学 Abnormal relative tendency generation method and system for climate prediction
CN114330850B (en) * 2021-12-21 2023-11-17 南京大学 Abnormal relative trend generation method and system for climate prediction
CN114255392A (en) * 2021-12-21 2022-03-29 中国科学技术大学 Nitrogen dioxide concentration prediction system based on satellite hyperspectral remote sensing and artificial intelligence
CN114219345A (en) * 2021-12-24 2022-03-22 武汉工程大学 Secondary air quality prediction optimization method based on data mining
CN114462684A (en) * 2022-01-12 2022-05-10 东南大学 Wind speed multipoint synchronous prediction method for coupling numerical weather forecast and measured data
CN114462684B (en) * 2022-01-12 2024-06-07 东南大学 Wind speed multipoint synchronous prediction method coupling numerical weather forecast and measured data
CN115237896A (en) * 2022-07-12 2022-10-25 四川大学 Data preprocessing method and system for forecasting air quality based on deep learning
CN115237896B (en) * 2022-07-12 2023-07-11 四川大学 Data preprocessing method and system based on deep learning forecast air quality
WO2024077876A1 (en) * 2022-10-12 2024-04-18 华院计算技术(上海)股份有限公司 Adaptation-based local dynamic coke quality prediction method
CN115993488A (en) * 2023-03-24 2023-04-21 天津安力信通讯科技有限公司 Intelligent monitoring method and system for electromagnetic environment
CN116204805A (en) * 2023-04-24 2023-06-02 青岛鑫屋精密机械有限公司 Micro-pressure oxygen cabin and data management system
CN117268460A (en) * 2023-08-16 2023-12-22 广东省泰维思信息科技有限公司 Indoor and outdoor linkage monitoring method and system based on Internet of things
CN117268460B (en) * 2023-08-16 2024-04-09 广东省泰维思信息科技有限公司 Indoor and outdoor linkage monitoring method and system based on Internet of Things
CN117074627B (en) * 2023-10-16 2024-01-09 三科智能(山东)集团有限公司 Medical laboratory air quality monitoring system based on artificial intelligence
CN117074627A (en) * 2023-10-16 2023-11-17 三科智能(山东)集团有限公司 Medical laboratory air quality monitoring system based on artificial intelligence
CN117332901A (en) * 2023-10-17 2024-01-02 南方电网数字电网研究院有限公司 New energy small time scale power prediction method adopting layered time aggregation strategy
CN117129638B (en) * 2023-10-26 2024-01-12 江西怡杉环保股份有限公司 Regional air environment quality monitoring method and system
CN117129638A (en) * 2023-10-26 2023-11-28 江西怡杉环保股份有限公司 Regional air environment quality monitoring method and system
CN117250133B (en) * 2023-11-16 2024-02-20 国建大数据科技(辽宁)有限公司 Smart city large-scale data acquisition method and system
CN117250133A (en) * 2023-11-16 2023-12-19 国建大数据科技(辽宁)有限公司 Smart city large-scale data acquisition method and system
CN117370772A (en) * 2023-12-08 2024-01-09 北京英视睿达科技股份有限公司 PM2.5 diffusion analysis method and system based on urban street topography classification
CN117370772B (en) * 2023-12-08 2024-04-16 北京英视睿达科技股份有限公司 PM2.5 diffusion analysis method and system based on urban street topography classification
CN117805502A (en) * 2024-02-29 2024-04-02 深圳市瑞达检测技术有限公司 Urban electromagnetic radiation monitoring method and system based on big data
CN117805502B (en) * 2024-02-29 2024-06-11 深圳市瑞达检测技术有限公司 Urban electromagnetic radiation monitoring method and system based on big data

Also Published As

Publication number Publication date
CN108701274B (en) 2021-10-08
CN108701274A (en) 2018-10-23

Similar Documents

Publication Publication Date Title
WO2018214060A1 (en) Small-scale air quality index prediction method and system for city
Watson et al. Machine learning models accurately predict ozone exposure during wildfire events
Nouri et al. Predicting urban land use changes using a CA–Markov model
CN110782093B (en) PM2.5 hourly concentration prediction method and system fusing SSAE deep feature learning and LSTM
CN110598953A (en) Space-time correlation air quality prediction method
CN105243435B (en) A kind of soil moisture content prediction technique based on deep learning cellular Automation Model
Chen Water resources research in Northwest China
CN110346517B (en) Smart city industrial atmosphere pollution visual early warning method and system
CN104751242A (en) Method and device for predicting air quality index
CN105760970A (en) Method for predicting AQI
CN104850734A (en) Air quality index prediction method based on spatial and temporal distribution characteristics
Liu et al. Spatio-temporal prediction and factor identification of urban air quality using support vector machine
Zhang et al. Bayesian analysis of climate change effects on observed and projected airborne levels of birch pollen
CN110738354B (en) Method and device for predicting particulate matter concentration, storage medium and electronic equipment
CN101893726A (en) Aeolian sand disaster simulating device and method
Jiang et al. A Municipal PM2.5 Forecasting Method Based on Random Forest and WRF Model
CN109657988B (en) Tobacco leaf quality partitioning method based on HASM and Euclidean distance algorithm
Hu et al. SVR based dense air pollution estimation model using static and wireless sensor network
KR20190027567A (en) Method for predicting chlorophyll-a concentration in stream water based on data mining and spatial analysis
CN107247809A (en) A kind of new method of artificial forest different age forest space mapping
Bahari et al. Prediction of PM2.5 concentrations using temperature inversion effects based on an artificial neural network
Fadavi et al. Evaluation of AERMOD for distribution modeling of particulate matters (Case study: Ardestan Cement Factory)
Ongoma et al. An investigation of the transport and dispersion of atmospheric pollutants over Nairobi City
Nechausov The information model of the system for local atmospheric air pollution monitoring
Ren et al. Shift of potential natural vegetation against global climate change under historical, current and future scenarios

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 17911161; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 17911161; Country of ref document: EP; Kind code of ref document: A1)