CN114139719A - Multi-source artificial heat space-time quantization method based on machine learning - Google Patents
- Publication number
- CN114139719A (application number CN202111354918.2A)
- Authority
- CN
- China
- Prior art keywords
- heat
- ahf
- county
- data
- monthly
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/29—Geographical information databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
Artificial (anthropogenic) heat has a significant impact on urban climate and air quality, but there is currently no accurate and efficient method for estimating multi-source artificial heat. The invention improves the artificial heat modeling workflow and provides a machine-learning-based method for the spatiotemporal quantification of multi-source artificial heat. The method comprises the following steps: step 1) calculating the county-level annual average artificial heat flux (AHF) from energy consumption and socioeconomic data; step 2) temporally downscaling the artificial heat of each source using proxy data to obtain the county-level monthly AHF; step 3) calculating the monthly county-level means of the multi-source data related to artificial heat as explanatory variables and pairing them with the corresponding AHF to form training samples; step 4) training models with two machine learning algorithms, gradient boosting regression trees (GBDT) and Cubist, performing error analysis, and selecting the best algorithm for each heat source; and step 5) feeding the raster data for a given area and time into the best model to calculate the multi-source artificial heat flux of that area at that time.
Description
Technical Field
The invention relates to a machine-learning-based method for the spatiotemporal quantification of multi-source artificial heat.
Background
Anthropogenic (artificial) heat emission has a significant impact on urban climate and air quality and is also an important input for climate modeling. Accurate artificial heat data can serve as a surface boundary condition for regional or global climate simulations and allow the influence of human activity on the urban environment to be evaluated reasonably; it is therefore an important basis for addressing climate warming, the heat island effect, air pollution, and related problems. Artificial heat flux (AHF) is the artificial heat released per unit time and per unit area, and it is the main target of artificial heat estimation. Traditional AHF estimation methods include the energy balance equation method, building energy modeling, and the energy consumption inventory method. Of these, the energy consumption inventory method is relatively general and reliable. In addition, to simplify the AHF calculation and reduce repetitive work, variables that correlate strongly with heat emission, such as night-time light and air pollutants, are combined with machine learning algorithms to build more practical empirical models.
Although many of the methods proposed so far meet general research and application needs, simple models, such as simple linear regression on a single variable, cannot cope well with the complex spatiotemporal variation of multi-source artificial heat. Moreover, in previous studies AHF samples were usually estimated at the provincial or city level (Chen et al, 2012; Chen et al, 2020; Sailor et al, 2015). The administrative area of a province or city is generally large and its proportion of built-up area is low, so a unit with high total heat emission can still have a low AHF, which complicates the analysis of the estimation results. For the estimation model, the explanatory variables are processed to the same spatial scale, but averaging over such large areas reduces the already scarce medium- and high-value AHF samples. At the same time, an estimation model based on the annual average AHF tends to make the samples similar to one another and loses much of the temporal information. In short, current studies do not adequately account for the diversity and variability of the training samples. Furthermore, machine-learning-based AHF modeling is affected by the choice of algorithm: because different artificial heat sources differ in their spatiotemporal characteristics, the best modeling algorithm may differ from source to source, and this factor has not yet been considered or put into practice. To address these problems in existing artificial heat quantification methods, the invention constructs samples with greater spatiotemporal diversity and uses two more complex machine learning algorithms to build the best multi-source AHF estimation model.
The gradient boosting regression tree (GBDT) is a classic ensemble learning algorithm that combines weak learners into a strong learner, and it is widely used for data analysis and prediction in many fields. The Cubist algorithm was developed from the model tree (Quinlan, 1992; Quinlan, 1993; Quinlan, 1996) and is a rule-based algorithm in which each leaf node is a multiple linear regression model rather than a single value. Like GBDT, Cubist can also perform ensemble learning, and it performs well in traffic flow prediction, near-surface air temperature estimation, land cover estimation, leaf area index estimation, and artificial heat estimation.
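The difference between a leaf that stores a single value and a leaf that stores a linear model can be illustrated with a small sketch. This is not the Cubist implementation; the rule conditions, coefficients, and variable names (`night_light`, `road_density`) are hypothetical, and the toy uses two non-overlapping rules so that exactly one rule applies to each input.

```python
def model_tree_predict(rules, x):
    """Return the prediction of the first rule whose condition matches x.

    Each rule carries a linear model (intercept plus coefficients) instead of
    a constant, which is the model-tree idea described in the text.
    """
    for rule in rules:
        if rule["condition"](x):
            linear_part = sum(coef * x[name] for name, coef in rule["coefs"].items())
            return rule["intercept"] + linear_part
    raise ValueError("no rule covers this input")


# Hypothetical rules: all thresholds and coefficients are made up for illustration.
rules = [
    {
        "condition": lambda x: x["night_light"] <= 10,
        "intercept": 0.1,
        "coefs": {"night_light": 0.02},
    },
    {
        "condition": lambda x: x["night_light"] > 10,
        "intercept": -0.5,
        "coefs": {"night_light": 0.08, "road_density": 0.1},
    },
]

dim_cell = model_tree_predict(rules, {"night_light": 5.0, "road_density": 1.0})
bright_cell = model_tree_predict(rules, {"night_light": 20.0, "road_density": 2.0})
```

Within each rule the prediction still varies with the inputs, whereas a plain regression tree would return the same constant for every sample that reaches a given leaf.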
Disclosure of Invention
To address the lack of an accurate and efficient technique for estimating multi-source artificial heat, the invention provides a machine-learning-based method for the spatiotemporal quantification of multi-source artificial heat, which mainly comprises the following steps:
step 1) calculating the county-level annual average AHF with a top-down energy consumption inventory method based on energy consumption data and socioeconomic data;
step 2) temporally downscaling the artificial heat of each source using proxy data to obtain the county-level monthly AHF;
step 3) preprocessing the multi-source data set related to artificial heat, calculating the monthly county-level means as explanatory variables, and pairing them with the corresponding AHF to form training samples;
step 4) training models with two machine learning algorithms, the gradient boosting regression tree and Cubist, performing error analysis, and selecting the best algorithm for modeling each heat source, while using a simple linear regression model based on night-time light as the baseline for measuring the accuracy improvement;
and step 5) feeding the raster data for a given area and time into the best model to calculate the artificial heat flux of that area at that time, and outputting the result as a raster.
Drawings
FIG. 1 is a technical flow diagram;
FIG. 2 shows the error results of the models;
FIG. 3 shows the spatiotemporal distribution of the multi-source AHF output by the models.
Detailed Description
The invention, "Multi-source artificial heat space-time quantization method based on machine learning", is further explained below with reference to the accompanying drawings.
(I) Estimation of monthly AHF at county level
Artificial heat comes from four sources: industry, buildings, traffic, and human metabolism. The annual average AHF is estimated in turn at the provincial, city, and county levels using the energy inventory method. When the city-level AHF is downscaled to the county level, industrial heat is allocated according to the county's share of the city's industrial POIs, traffic heat and building heat are allocated according to the county's share of the population, and metabolic heat is estimated directly from the county population. The monthly AHF is calculated from the temporal variation of proxy data, which is a common temporal downscaling rule in the top-down energy inventory method. Because there is no heating in the study area, monthly building heat and industrial heat are estimated from monthly electricity consumption and monthly traffic heat is estimated from monthly freight volume; metabolic heat is held constant and is combined with building heat for model training. The specific calculation is as follows:
[Equation not reproduced in the source text.]
where the quantity on the left-hand side is the AHF of heat source type S in month m, i denotes the prefecture-level city, and j denotes the county; the monthly term is the emission fraction (%) of the corresponding heat source in month m; the annual terms are the annual heat emissions (J) from industry, traffic, and buildings, respectively; A_j is the area of the county (m²); and T_y is the length of the year (s). The calculation can be programmed in Python or R, or done directly in Excel.
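As a rough illustration of this step, the sketch below allocates a city's annual heat emission to a county by a proxy share and then converts it to a monthly flux. The exact equation is not available in this text, so the form used here (monthly energy divided by the county area and by one twelfth of the year) is an assumption, as are the function names and all input numbers.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # T_y, assuming a non-leap year


def allocate_to_county(city_annual_heat_j, county_share):
    """Split a city's annual heat emission (J) by a county's proxy share (0 to 1).

    The share would be, for example, the county's fraction of the city's
    industrial POIs (industrial heat) or of its population (traffic, buildings).
    """
    return city_annual_heat_j * county_share


def monthly_ahf(county_annual_heat_j, month_fraction, county_area_m2,
                seconds_per_year=SECONDS_PER_YEAR):
    """Monthly AHF in W/m^2 under the ASSUMED form described in the lead-in.

    month_fraction is the share of the annual emission released in the month
    (0 to 1), taken from proxy data such as monthly electricity consumption.
    """
    month_seconds = seconds_per_year / 12.0  # assumption: equal-length months
    return county_annual_heat_j * month_fraction / (county_area_m2 * month_seconds)


# Hypothetical inputs: a city emitting 3.1536e16 J of industrial heat per year,
# a county holding 20 of the city's 100 industrial POIs, and a 1000 km^2 county.
county_heat = allocate_to_county(3.1536e16, county_share=20 / 100)
uniform_month = monthly_ahf(county_heat, month_fraction=1 / 12, county_area_m2=1.0e9)
peak_month = monthly_ahf(county_heat, month_fraction=0.10, county_area_m2=1.0e9)
```

With a uniform monthly fraction of 1/12 the result equals the annual mean flux, which is a quick sanity check on whichever form of the equation is actually used.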
(II) construction of training samples
For roads, railways, and industrial POIs, a density raster is created with a search radius of 1000 meters and a distance raster is calculated as well, using the Point Density and Euclidean Distance tools in ArcGIS, respectively. Roads and railways are source-specific variables for traffic heat, industrial POIs and railways are source-specific variables for industrial heat, and the building area ratio is a source-specific variable for building and metabolic heat.
The common variables are used in estimating all three heat sources and are processed as follows: remote sensing data (land surface temperature, NDVI, etc.), meteorological data (air temperature, humidity, wind speed), and topographic data (DEM and slope) are filtered, composited monthly, clipped, reprojected, and resampled in Google Earth Engine (GEE). Finally, zonal statistics are computed on the rasters in ArcGIS to produce a table of explanatory variables, two categorical variables (the region and the month of each record) are added, and the table is combined with the monthly county-level AHF to form the training samples.
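The zonal-statistics and table-joining step can be sketched in plain Python. The text performs this step in ArcGIS; the code below only mirrors the logic on small in-memory grids, and the function names, the `nodata` convention, and the sample values are assumptions made for illustration.

```python
from collections import defaultdict


def zonal_mean(values, zones, nodata=None):
    """Mean of the raster cells in each zone; nodata and unzoned cells are skipped."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value_row, zone_row in zip(values, zones):
        for value, zone in zip(value_row, zone_row):
            if zone is None or value is None or value == nodata:
                continue
            sums[zone] += value
            counts[zone] += 1
    return {zone: sums[zone] / counts[zone] for zone in sums}


def build_samples(zonal_tables, ahf_by_county, region_by_county, month):
    """Join per-variable zonal means with the county monthly AHF into sample rows."""
    rows = []
    for county, ahf in sorted(ahf_by_county.items()):
        row = {"county": county, "region": region_by_county[county],
               "month": month, "ahf": ahf}
        for name, table in zonal_tables.items():
            row[name] = table.get(county)  # None if the county has no valid cells
        rows.append(row)
    return rows


# Hypothetical 2x2 night-light raster and county-id raster of the same shape.
night_light = [[1.0, 3.0], [5.0, -9999.0]]
county_ids = [["A", "A"], ["B", "B"]]
means = zonal_mean(night_light, county_ids, nodata=-9999.0)
samples = build_samples({"night_light": means},
                        ahf_by_county={"A": 0.4, "B": 1.2},
                        region_by_county={"A": "east", "B": "west"},
                        month=7)
```

Each row in `samples` then holds one county-month record: the explanatory variables, the two categorical variables, and the target AHF.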
(III) training and evaluation of models
The training samples are imported into R; 80% of the samples are randomly selected for training and testing, and the remaining samples are used for model validation. Both GBDT and Cubist have several tunable hyperparameters. The caret, gbm, and Cubist packages are called in R, and GBDT and Cubist models are trained separately for each AHF source. Ten-fold cross-validation repeated 10 times is used for parameter tuning, and the best model parameters are chosen by minimizing the training/testing error. This yields the multi-source artificial heat spatiotemporal quantification models (GBDT and Cubist), and the algorithm (GBDT or Cubist) finally used to model each heat source is determined from the validation error. A simple linear regression (SLR) model based on night-time light serves as the baseline for measuring the accuracy improvement of the two more complex algorithms. All of the above can be done in R.
(IV) Output of the results
The raster package is called in R, the explanatory variables in raster form are converted to a data frame, and a simple classification and regression tree (CART) model is built to fill in missing values. The trained model is then called to make predictions, the predictions are written into a raster, and the spatiotemporal distribution of multi-source artificial heat is finally output in raster form.
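The evaluation protocol (an 80/20 split, 10-fold cross-validation repeated 10 times, and a final choice based on validation error) can be sketched without any machine learning library. The text uses caret, gbm, and Cubist in R; here the two candidates are deliberately trivial stand-ins (a mean predictor and a one-variable linear regression), not GBDT or Cubist, and the data are synthetic. All names are hypothetical.

```python
import random
from statistics import mean


def rmse(y_true, y_pred):
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)) ** 0.5


def fit_mean(xs, ys):
    """Stand-in candidate: always predict the training mean."""
    m = mean(ys)
    return lambda x: m


def fit_slr(xs, ys):
    """Simple linear regression on one predictor (e.g. night-time light)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)


def repeated_kfold_rmse(fit, xs, ys, k=10, repeats=10, seed=0):
    """Average held-out RMSE over `repeats` reshuffled k-fold splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        idx = list(range(len(xs)))
        rng.shuffle(idx)
        for fold in range(k):
            test = idx[fold::k]
            held_out = set(test)
            train = [i for i in idx if i not in held_out]
            model = fit([xs[i] for i in train], [ys[i] for i in train])
            scores.append(rmse([ys[i] for i in test], [model(xs[i]) for i in test]))
    return mean(scores)


def select_algorithm(candidates, xs, ys, train_frac=0.8, seed=0):
    """Hold out 20% for validation and pick the candidate with the lowest validation RMSE."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    cut = int(train_frac * len(idx))
    tr, va = idx[:cut], idx[cut:]
    xtr, ytr = [xs[i] for i in tr], [ys[i] for i in tr]
    xva, yva = [xs[i] for i in va], [ys[i] for i in va]
    report = {}
    for name, fit in candidates.items():
        model = fit(xtr, ytr)
        report[name] = {
            "cv_rmse": repeated_kfold_rmse(fit, xtr, ytr, seed=seed),
            "validation_rmse": rmse(yva, [model(x) for x in xva]),
        }
    best = min(report, key=lambda n: report[n]["validation_rmse"])
    return best, report


# Synthetic, exactly linear data so the outcome of the comparison is obvious.
xs = [float(i) for i in range(50)]
ys = [2.0 * x + 1.0 for x in xs]
best, report = select_algorithm({"mean_only": fit_mean, "slr": fit_slr}, xs, ys)
```

In the workflow described above the cross-validation score would drive hyperparameter tuning inside each algorithm, while the held-out validation error decides between algorithms for each heat source; the stand-ins here have no hyperparameters, so the cross-validation score is only reported.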
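The output step amounts to flattening a stack of rasters into a table, filling gaps, predicting per cell, and reshaping the predictions back into a grid. The sketch below shows that round trip on nested lists. The text fills gaps with a classification and regression tree; this sketch substitutes a plain per-variable mean to stay short, and the toy prediction function and variable names are hypothetical.

```python
def raster_stack_to_rows(stack):
    """Flatten {name: 2-D grid} into one dict per cell, in row-major order."""
    names = sorted(stack)
    n_rows = len(stack[names[0]])
    n_cols = len(stack[names[0]][0])
    rows = [{name: stack[name][r][c] for name in names}
            for r in range(n_rows) for c in range(n_cols)]
    return rows, (n_rows, n_cols)


def fill_missing_with_mean(rows):
    """Replace None with the variable's mean (a stand-in for CART imputation)."""
    for name in list(rows[0]):
        present = [row[name] for row in rows if row[name] is not None]
        if not present:
            raise ValueError(f"variable {name!r} has no valid cells")
        fill = sum(present) / len(present)
        for row in rows:
            if row[name] is None:
                row[name] = fill
    return rows


def predict_grid(stack, predict):
    """Apply `predict` to every cell and return the predictions as a 2-D grid."""
    rows, (n_rows, n_cols) = raster_stack_to_rows(stack)
    rows = fill_missing_with_mean(rows)
    flat = [predict(row) for row in rows]
    return [flat[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]


# Hypothetical 2x2 explanatory rasters; None marks a missing cell.
stack = {
    "night_light": [[1.0, None], [3.0, 5.0]],
    "road_density": [[0.0, 2.0], [4.0, 6.0]],
}


def toy_model(row):
    # Made-up coefficients standing in for a trained GBDT or Cubist model.
    return 0.1 * row["night_light"] + 0.05 * row["road_density"]


ahf_grid = predict_grid(stack, toy_model)
```

Keeping the cells in row-major order is what allows the flat prediction vector to be written back into a grid of the original shape.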
Claims (6)
1. A machine-learning-based method for the spatiotemporal quantification of multi-source artificial heat, mainly comprising the following technical steps:
step 1) calculating the county-level annual average artificial heat flux (AHF) with a top-down energy consumption inventory method based on energy consumption data and socioeconomic data;
step 2) temporally downscaling the artificial heat of each source using proxy data to obtain the county-level monthly AHF;
step 3) preprocessing the multi-source data set related to artificial heat, calculating the monthly county-level means as explanatory variables, and pairing them with the corresponding AHF to form training samples;
step 4) training models with two machine learning algorithms, the gradient boosting regression tree and Cubist, performing error analysis, and selecting the best algorithm for modeling each heat source, while using a simple linear regression model based on night-time light as the baseline for measuring the accuracy improvement;
and step 5) feeding the raster data for a given area and time into the best model to calculate the artificial heat flux of that area at that time, and outputting the result as a raster.
2. The method of claim 1, wherein in step 1): artificial heat comes from four sources: industry, buildings, traffic, and human metabolism. The annual average AHF is estimated in turn at the provincial, city, and county levels using the energy inventory method. When the city-level AHF is downscaled to the county level, industrial heat is allocated according to the county's share of the city's industrial POIs, traffic heat and building heat are allocated according to the county's share of the population, and metabolic heat is estimated directly from the county population.
3. The method of claim 1, wherein in step 2): the monthly AHF is calculated from the temporal variation of proxy data, which is a common temporal downscaling rule in the top-down energy inventory method. Because there is no heating in the study area, monthly building heat and industrial heat are estimated from monthly electricity consumption and monthly traffic heat is estimated from monthly freight volume; metabolic heat is held constant and is combined with building heat for model training. The specific calculation is as follows:
[Equation not reproduced in the source text.]
where the quantity on the left-hand side is the AHF of heat source type S in month m, i denotes the prefecture-level city, and j denotes the county; the monthly term is the emission fraction (%) of the corresponding heat source in month m; the annual terms are the annual heat emissions (J) from industry, traffic, and buildings, respectively; A_j is the area of the county (m²); and T_y is the length of the year (s). The calculation can be programmed in Python or R, or done directly in Excel.
4. The method of claim 1, wherein in step 3): for roads, railways, and industrial POIs, a density raster is created with a search radius of 1000 meters and a distance raster is calculated as well, using the Point Density and Euclidean Distance tools in ArcGIS, respectively. Roads and railways are source-specific variables for traffic heat, industrial POIs and railways are source-specific variables for industrial heat, and the building area ratio is a source-specific variable for building and metabolic heat. The common variables are used in estimating all three heat sources and are processed as follows: remote sensing data (land surface temperature, NDVI, etc.), meteorological data (air temperature, humidity, wind speed), and topographic data (DEM and slope) are filtered, composited monthly, clipped, reprojected, and resampled in Google Earth Engine (GEE). Finally, zonal statistics are computed on the rasters in ArcGIS to produce a table of explanatory variables, two categorical variables (the region and the month of each record) are added, and the table is combined with the monthly county-level AHF to form the training samples.
5. The method of claim 1, wherein in step 4): the training samples are imported into R; 80% of the samples are randomly selected for training and testing, and the remaining samples are used for model validation. Both GBDT and Cubist have several tunable hyperparameters. The caret, gbm, and Cubist packages are called in R, and GBDT and Cubist models are trained separately for each AHF source. Ten-fold cross-validation repeated 10 times is used for parameter tuning, and the best model parameters are chosen by minimizing the training/testing error. This yields the multi-source artificial heat spatiotemporal quantification models (GBDT and Cubist), and the algorithm (GBDT or Cubist) finally used to model each heat source is determined from the validation error. A simple linear regression (SLR) model based on night-time light serves as the baseline for measuring the accuracy improvement of the two more complex algorithms. All of the above can be done in R.
6. The method of claim 1, wherein in step 5): the raster package is called in R, the explanatory variables in raster form are converted to a data frame, and a simple classification and regression tree (CART) model is built to fill in missing values. The trained model is then called to make predictions, the predictions are written into a raster, and the spatiotemporal distribution of multi-source artificial heat is finally output in raster form.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111354918.2A CN114139719A (en) | 2021-11-16 | 2021-11-16 | Multi-source artificial heat space-time quantization method based on machine learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114139719A true CN114139719A (en) | 2022-03-04 |
Family
ID=80393380
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111354918.2A Pending CN114139719A (en) | 2021-11-16 | 2021-11-16 | Multi-source artificial heat space-time quantization method based on machine learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114139719A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115204691A (en) * | 2022-07-13 | 2022-10-18 | 中国科学院地理科学与资源研究所 | Urban artificial heat emission estimation method based on machine learning and remote sensing technology |
CN115204691B (en) * | 2022-07-13 | 2023-02-03 | 中国科学院地理科学与资源研究所 | Urban artificial heat emission estimation method based on machine learning and remote sensing technology |
CN117313451A (en) * | 2023-09-01 | 2023-12-29 | 长安大学 | Crop canopy structure parameter inversion method based on E-INFORM model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||