CN113298439A - Population distribution-based environmental risk assessment method and device and computer equipment - Google Patents

Population distribution-based environmental risk assessment method and device and computer equipment

Info

Publication number
CN113298439A
Authority
CN
China
Prior art keywords
data
population
population distribution
pollutant concentration
distribution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110694081.XA
Other languages
Chinese (zh)
Inventor
李佳雯
郑越
黄俊斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Property and Casualty Insurance Company of China Ltd
Original Assignee
Ping An Property and Casualty Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services

Abstract

The application discloses a population distribution-based environmental risk assessment method and device and computer equipment, relates to the field of information processing, and can solve the technical problem that existing environmental risk assessment is inaccurate and unreasonable. The method comprises the following steps: obtaining regional sample data in a preset region, and training a population distribution prediction model by using the regional sample data; extracting first feature data in a target area, inputting the first feature data into the trained population distribution prediction model, and obtaining a prediction result corresponding to second feature data, wherein the first feature data at least comprise night light data, normalized vegetation index, digital elevation model data, slope, points of interest and insurance data, and the second feature data comprise population distribution data; and determining pollutant concentration data in the target area, and performing data overlay analysis on the population distribution data and the pollutant concentration data to obtain an environmental risk assessment result.

Description

Population distribution-based environmental risk assessment method and device and computer equipment
Technical Field
The present application relates to the field of information processing, and in particular, to a method and an apparatus for environmental risk assessment based on population distribution, and a computer device.
Background
Population growth and a very large population lead to excessive consumption of land, energy, mineral and other resources, and also cause serious environmental pollution, ecological damage and other problems. With repeated outbreaks of haze across the country, public awareness of the harm caused by air pollutants, including PM2.5, has risen rapidly, and the nationwide demand for haze control has reached an unprecedented level.
At present, population pollution exposure risk is generally evaluated using pollutant concentration indices from air quality monitoring stations. However, because the spatial distribution of pollutants and the spatial distribution of the population are not consistent, directly using pollutant concentration in place of the actual exposure level of people may introduce errors. For example, in some areas the pollutant concentration is high but few people live there, so the resulting population exposure may be small; in other areas the pollutant concentration may not be especially high, but a large population is exposed, so many people are affected and the health burden is heavy. Therefore, it is necessary to consider the sensitivity of population distribution to air pollution scientifically, to take the combined influence of air pollution distribution and population distribution into account, and to explore a people-oriented method for evaluating urban air pollution exposure risk.
Disclosure of Invention
In view of this, the present application provides an environmental risk assessment method, an environmental risk assessment device and a computer device based on population distribution, which can solve the technical problem that the current environmental risk assessment is inaccurate and unreasonable.
According to one aspect of the application, a population distribution-based environmental risk assessment method is provided, the method comprising:
obtaining regional sample data in a preset region, and training a population distribution prediction model by using the regional sample data;
extracting first feature data in a target area, inputting the first feature data into the trained population distribution prediction model, and obtaining a prediction result corresponding to second feature data, wherein the first feature data at least comprise night light data, normalized vegetation index, digital elevation model data, slope, points of interest and insurance data, and the second feature data comprise population distribution data;
and determining pollutant concentration data in the target area, and performing data overlay analysis on the population distribution data and the pollutant concentration data to obtain an environmental risk assessment result.
According to another aspect of the present application, there is provided an environmental risk assessment apparatus based on population distribution, the apparatus comprising:
the training module is used for acquiring regional sample data in a preset region and training a population distribution prediction model by using the regional sample data;
the acquisition module is used for extracting first feature data in a target area, inputting the first feature data into the trained population distribution prediction model, and obtaining a prediction result corresponding to second feature data, wherein the first feature data at least comprise night light data, normalized vegetation index, digital elevation model data, slope, points of interest and insurance data, and the second feature data comprise population distribution data;
and the analysis module is used for determining pollutant concentration data in the target area, and performing data overlay analysis on the population distribution data and the pollutant concentration data to obtain an environmental risk assessment result.
According to yet another aspect of the present application, there is provided a storage medium having stored thereon a computer program which, when executed by a processor, implements the above-described population distribution-based environmental risk assessment method.
According to yet another aspect of the present application, there is provided a computer device comprising a storage medium, a processor, and a computer program stored on the storage medium and executable on the processor, wherein the processor implements the above-described population distribution-based environmental risk assessment method when executing the program.
By means of the above technical solution, and in contrast to current environmental risk assessment approaches, the population distribution-based environmental risk assessment method, device and computer equipment provided by the application first train a population distribution prediction model using the regional sample data, and then use the model to determine, at fine granularity, the prediction result corresponding to the second feature data (that is, the population distribution data) from the first feature data of each dimension in the target area; data overlay analysis is then performed on the pollutant concentration data and the population distribution data in the target area to obtain an environmental risk assessment result. In this way, air pollution distribution data and population distribution data can be combined to carry out a spatialized assessment of population pollution exposure. Compared with the previous approach of directly using pollutant concentration to evaluate the actual exposure level, the application takes into account the central role of the population in air pollution exposure risk evaluation, and can improve the rationality and accuracy of that evaluation by performing data overlay analysis on population distribution data and pollutant concentration data.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application to the disclosed embodiment. In the drawings:
FIG. 1 is a flow chart of a method for population distribution-based environmental risk assessment according to an embodiment of the present disclosure;
FIG. 2 is a schematic flow chart diagram illustrating another method for population distribution-based environmental risk assessment according to an embodiment of the present disclosure;
FIG. 3 is a schematic structural diagram of an environmental risk assessment device based on population distribution according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of another population distribution-based environmental risk assessment device provided in an embodiment of the present application.
Detailed Description
The present application will be described in detail below with reference to the accompanying drawings in conjunction with embodiments. It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.
To address the technical problem that existing environmental risk assessment is inaccurate and unreasonable, the application provides a population distribution-based environmental risk assessment method. As shown in FIG. 1, the method comprises the following steps:
101. Obtaining regional sample data in a preset region, and training a population distribution prediction model by using the regional sample data.
The preset region is a geographic area for which accurate population distribution data have been determined, and the regional sample data are regional feature information carrying the model input features (night light data, normalized vegetation index, digital elevation model data, slope, points of interest and insurance data) and the model output feature (population distribution data). In the application, the population distribution prediction model can be trained with the regional sample data so that it accurately outputs feature data matching the model output feature from the model input features. Specifically, a refined population distribution model can be constructed using a random forest: night light data, normalized vegetation index, digital elevation model data, slope, points of interest and insurance data are introduced into the model as independent variables, and county-level census population data are used as the dependent variable. By combining machine learning with GIS, the contribution of each independent variable to the model is evaluated, and the optimal population spatial distribution result is finally output, realizing prediction of high-precision, high-resolution population distribution data.
A random forest is an ensemble model that trains on and predicts samples using multiple decision trees. It accepts a large number of input variables, learns quickly, outputs high-accuracy classification or regression results, evaluates the importance of each variable, and is less prone to overfitting than a single tree. These advantages make the random forest well suited to the spatialization of population data: it can quickly learn the relationship between the variable factors and the population data and give an importance evaluation for each factor. Compared with traditional regression methods, the population random forest model mitigates overfitting, tolerates outliers and noise, and can markedly improve prediction accuracy without markedly increasing the amount of computation, which makes high-precision, high-resolution gridded population spatial distribution research possible.
102. Extracting first feature data in the target area, inputting the first feature data into the trained population distribution prediction model, and obtaining a prediction result corresponding to the second feature data.
The target area may be an area preset by a user, a preset city, a preset country, or the like; for example, the target area may be a city or province such as Beijing, Shanghai, Shenzhen, or Guangdong. The first feature data at least comprise night light data, normalized vegetation index, digital elevation model data, slope, points of interest and insurance data, and the second feature data comprise population distribution data.
Specifically, night light (NTL) data are a light index and can be obtained from a new-generation NTL product; the data have higher spatial resolution and a wider radiometric range, can identify weak light sources, and can be used for research on the atmosphere, land surface processes, human activities and the like. The normalized vegetation index represents the degree of vegetation coverage; it is a value between 0 and 1 obtained by normalized band arithmetic, the closer the value is to 1 the higher the vegetation coverage, and it can be obtained from the United Nations Environment Programme. The digital elevation model data, which give elevation above sea level, can be acquired through a geographic national conditions monitoring cloud platform. The slope data are spatial data that can be calculated from the digital elevation model data. The point-of-interest data are building identification information, such as hospitals, shopping malls, schools and supermarkets, and can be acquired through an application programming interface. The insurance data are policy data for the corresponding year, with longitude and latitude, selected from a Hive database.
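As a rough illustration of how slope can be calculated from digital elevation model data, the following sketch estimates slope in degrees at interior cells with central differences. The function name `slope_degrees`, the list-of-rows grid layout and the sample elevations are assumptions made for illustration only; the application does not specify the slope algorithm, and a GIS platform may use a different kernel.

```python
import math


def slope_degrees(dem, cell_size):
    """Estimate slope (degrees) at interior cells of a DEM using central differences.

    dem: list of rows of elevations in metres; cell_size: cell edge length in metres.
    Border cells are left as None because they lack a full neighbourhood.
    """
    rows, cols = len(dem), len(dem[0])
    out = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Elevation change per metre in the east-west and north-south directions.
            dz_dx = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cell_size)
            dz_dy = (dem[r + 1][c] - dem[r - 1][c]) / (2 * cell_size)
            out[r][c] = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    return out


# Hypothetical 3 x 3 DEM rising 500 m per 500 m cell towards the east.
dem = [[0.0, 500.0, 1000.0] for _ in range(3)]
slopes = slope_degrees(dem, cell_size=500.0)
```

In this made-up grid the terrain rises one metre per metre travelled east, so the centre cell has a slope of 45 degrees.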
103. Determining pollutant concentration data in the target area, and performing data overlay analysis on the population distribution data and the pollutant concentration data to obtain an environmental risk assessment result.
In a specific application scenario, the air pollutants may include carbon monoxide (CO), nitrogen oxides (NOx), hydrocarbons (HC), sulfur oxides (SOx), particulate matter (PM) and the like. In this embodiment, the pollutant under study may specifically be PM2.5, and the PM2.5 concentration data may be obtained as open-source data from the Center for International Earth Science Information Network (CIESIN) at Columbia University. In this embodiment, after the concentration data are obtained, the population distribution data and the PM2.5 concentration data may be combined to carry out a spatial assessment of the pollution exposure level and to determine an environmental risk assessment result comprising the population exposure level, the population-weighted exposure level and a population pollution exposure risk zoning map. Compared with using the PM2.5 concentration in place of the actual exposure level of people, this approach supports macroscopic, continuous research and monitoring, covers the whole population over a wider area, and can be used for long-term analysis of spatiotemporal changes in the exposure level of the whole population.
With the population distribution-based environmental risk assessment method of this embodiment, a population distribution prediction model can first be trained using the regional sample data, and the model can then be used to determine, at fine granularity, the prediction result corresponding to the second feature data (that is, the population distribution data) from the first feature data of each dimension in the target area; data overlay analysis is then performed on the pollutant concentration data and the population distribution data in the target area to obtain an environmental risk assessment result. In this way, air pollution distribution data and population distribution data can be combined to carry out a spatialized assessment of population pollution exposure. Compared with the previous approach of directly using pollutant concentration to evaluate the actual exposure level, the application takes into account the central role of the population in air pollution exposure risk evaluation, and can improve the rationality and accuracy of that evaluation by performing data overlay analysis on population distribution data and pollutant concentration data.
Further, as a refinement and extension of the foregoing embodiment, and in order to fully illustrate its implementation, another population distribution-based environmental risk assessment method is provided. As shown in FIG. 2, the method comprises:
201. Obtaining regional sample data in a preset region, and training a population distribution prediction model by using the regional sample data.
For this embodiment, step 201 may specifically include: dividing the preset region into a plurality of first grid units with consistent grid attributes; extracting first feature data and second feature data in each first grid unit; determining the first feature data as the input features of the population distribution prediction model and the second feature data as its output feature, and training the population distribution prediction model; evaluating the trained population distribution prediction model with a cross-validation algorithm to obtain an evaluation result; and, if the training error calculated from the evaluation result is smaller than a preset threshold, determining that training of the population distribution prediction model is complete. The first feature data at least comprise night light data, normalized vegetation index, digital elevation model data, slope, points of interest and insurance data, and the second feature data at least comprise population distribution data.
Specifically, to obtain the best results at the highest resolution, the size of the first grid unit is set to 500 m × 500 m. The grid attributes may specifically include resolution, numbers of grid rows and columns, coordinate system and the like; the grid data in the first grid units may be preprocessed with a geographic information system platform through projection, masking, resampling and similar steps so that their grid attributes are uniform. The geographic information system platform may specifically be the ArcGIS platform, a comprehensive platform for building and applying geographic information systems, with which users can collect, organize, manage, analyze, share and publish geographic information.
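The grid division can be pictured with a small sketch that maps projected coordinates (in metres) to 500 m × 500 m grid units and counts points of interest per unit. The helper names `grid_shape` and `grid_index`, the bounding box, the sample points and the choice of a per-cell count as the point-of-interest feature are all assumptions for illustration; in the application this preprocessing is done on a GIS platform.

```python
import math
from collections import Counter


def grid_shape(x_min, y_min, x_max, y_max, cell_size=500.0):
    """Number of (rows, cols) of square cells needed to cover the bounding box."""
    rows = math.ceil((y_max - y_min) / cell_size)
    cols = math.ceil((x_max - x_min) / cell_size)
    return rows, cols


def grid_index(x, y, x_min, y_max, cell_size=500.0):
    """Return (row, col) of the cell containing projected point (x, y).

    Row 0 is the northernmost row, as in a typical raster layout.
    """
    col = int((x - x_min) // cell_size)
    row = int((y_max - y) // cell_size)
    return row, col


# Hypothetical study area of 2000 m x 1500 m and three made-up points of interest.
shape = grid_shape(0.0, 0.0, 2000.0, 1500.0)
pois = [(1250.0, 1400.0), (1300.0, 1450.0), (100.0, 100.0)]
poi_counts = Counter(grid_index(x, y, x_min=0.0, y_max=1500.0) for x, y in pois)
```

Using one shared origin and cell size for every layer is what keeps the rows and columns of the different feature grids aligned.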
Correspondingly, night light data, normalized vegetation index, digital elevation model data, slope, points of interest and insurance data are determined as the input features of the population distribution prediction model, and population distribution data are determined as its output feature. The population distribution prediction model can be trained in the R language: a refined population random forest distribution model is constructed with the random forest algorithm and, weighing computational performance against model accuracy, ntree is set to 500 and mtry is set to 2; a prediction result is obtained through model training. Furthermore, zonal weight allocation can be applied to the prediction result to obtain the 500 m × 500 m gridded population spatial distribution of the study area for the corresponding year. Specifically, the same grid layers may be fed into the fitted random forest (RF) model, the distribution weight of each 500 m × 500 m grid unit is calculated, the population of the study area is spatially disaggregated according to the weights, and the population is finally allocated to the grid by zonal density mapping. The calculation formula is as follows:
$$POP_{grid} = POP_{country} \times \frac{w_{grid}}{w_{country}}$$
in the formula, $w_{grid}$ is the population distribution weight of each grid point, $w_{country}$ is the sum of the population distribution weights in each region, $POP_{country}$ is the census population of each region, and $POP_{grid}$ is the predicted population of each grid area.
Finally, the model accuracy and the contribution of each variable can be judged from the percentage of variance explained by the population random forest model (abbreviated as % Var explained) and the percentage increase in mean square error (abbreviated as %IncMSE). A validation model is established with the random forest machine learning method, and the population distribution prediction model is evaluated by 10-fold cross-validation. The main idea is to divide the data randomly into 10 subsamples; each time, one subsample is taken as validation data and the remaining 9 as training data, the model is built and applied to the validation data, the current error rate is calculated, and these steps are repeated until every subsample has served as validation data. By performing linear regression and error analysis on the estimated results and the statistical results, the coefficient of determination R² and the error range are obtained. The coefficient of determination (R²) and the root mean square error (RMSE) are used as evaluation indices: R² represents the goodness of fit between the predicted population and the census population, and the RMSE reflects the reliability of the prediction. The formulas are as follows:
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (pop_i - po_i)^2}{\sum_{i=1}^{n} (pop_i - \overline{pop})^2}$$
$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (pop_i - po_i)^2}$$
wherein $R^2$ is the coefficient of determination, RMSE is the root mean square error, $n$ is the total number of grid points counted, $pop_i$ is the census population of grid point $i$, $po_i$ is the predicted population of grid point $i$, and $\overline{pop}$ is the average of the census data over the $n$ grid points.
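The 10-fold cross-validation and the two evaluation indices can be sketched as follows. This is a minimal sketch under stated assumptions: R² is computed in its common form 1 − SS_res/SS_tot and RMSE as the root mean square error, the helper names are hypothetical, and a one-variable least-squares line (`fit_line`) stands in for the random forest so that the example needs only the standard library. The data are synthetic.

```python
import math
import random


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(xs, ys, fit, k=10, seed=0):
    """Return out-of-fold predictions: each sample is predicted by a model that never saw it."""
    preds = [None] * len(xs)
    for fold in k_fold_indices(len(xs), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            preds[i] = model(xs[i])
    return preds


def fit_line(xs, ys):
    """Stand-in model (ordinary least squares on one feature), not a random forest."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return lambda x: my + slope * (x - mx)


# Synthetic, exactly linear data, so the stand-in model should fit almost perfectly.
xs = [float(i) for i in range(50)]
ys = [3.0 * x + 2.0 for x in xs]
preds = cross_validate(xs, ys, fit_line)
score_r2, score_rmse = r_squared(ys, preds), rmse(ys, preds)
```

Scoring only out-of-fold predictions is the point of the procedure: the indices then measure how the model generalizes rather than how well it memorizes the training cells.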
In a specific application scenario, if the cross-validation result shows that R² is greater than its preset threshold and RMSE is smaller than its preset threshold, it can be determined that the refined population distribution model has passed training. Target weight values are then determined for the 6 independent variables (night light data, digital elevation model data, normalized vegetation index, slope, points of interest and insurance data); the target weight values of the 6 independent variables sum to 1 and reflect the degree of influence of each variable factor on the predicted population density.
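One simple way to obtain weight values that sum to 1 is to normalize per-variable importance scores. The application does not state how the target weight values are derived, so the normalization below, the function name `normalize_weights` and the sample scores are assumptions for illustration only; the sketch also assumes non-negative scores.

```python
def normalize_weights(importance):
    """Scale non-negative importance scores so that they sum to 1."""
    total = sum(importance.values())
    if total <= 0:
        raise ValueError("importance scores must have a positive sum")
    return {name: value / total for name, value in importance.items()}


# Hypothetical importance scores for the 6 independent variables;
# these are made-up numbers, not results from the application.
importance = {
    "night_light": 40.0,
    "dem": 10.0,
    "ndvi": 15.0,
    "slope": 5.0,
    "poi": 20.0,
    "insurance": 10.0,
}
weights = normalize_weights(importance)
```

With these made-up scores, night light would carry a weight of 0.4 and slope a weight of 0.05.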
202. Extracting first feature data in the target area, inputting the first feature data into the trained population distribution prediction model, and obtaining a prediction result corresponding to the second feature data.
For this embodiment, after the target area is obtained, it first needs to be divided into grids. Specifically, the target area may be divided into a plurality of second grid units with consistent grid attributes. The second grid units are processed in the same way as the first grid units: their size is also set to 500 m × 500 m, the grid attributes may specifically include resolution, numbers of grid rows and columns, coordinate system and the like, and the grid data in the second grid units may be preprocessed with a geographic information system platform through projection, masking, resampling and similar steps so that their grid attributes are uniform. Night light data, digital elevation model data, normalized vegetation index, slope, points of interest and insurance data in each second grid unit can then be obtained in real time; these 6 independent variables are input into the trained population distribution prediction model, which outputs the population distribution data in the second grid unit, and the prediction result is then corrected by zonal density mapping to obtain a gridded population distribution result.
The zonal density mapping method redistributes the population of each grid according to the proportion, obtained from the random forest, of that grid's population to the total population of all grids in its administrative district. The calculation formula is as follows:
$$P_i = S_j \times \frac{D_i}{D_j}$$
in the formula, $P_i$ is the population within each grid, $S_j$ is the total population of the administrative district in which the grid is located, $D_i$ is the population of the grid estimated by the random forest model, and $D_j$ is the total population of all grids in the administrative district in which the grid is located, as estimated by the random forest model.
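The redistribution described here, in which each grid receives its administrative district's total population in proportion to its share of the model estimates (P_i = S_j × D_i / D_j), can be sketched as below. The function name, the dictionary layout and the sample numbers are assumptions for illustration.

```python
from collections import defaultdict


def zonal_density_correction(estimates, district_of, census_totals):
    """Rescale model estimates D_i so each district's cells sum to its census total S_j.

    estimates: {cell_id: D_i}; district_of: {cell_id: district_id};
    census_totals: {district_id: S_j}. Returns {cell_id: P_i}.
    """
    # D_j: sum of model estimates over all cells of each district.
    model_totals = defaultdict(float)
    for cell, d_i in estimates.items():
        model_totals[district_of[cell]] += d_i

    corrected = {}
    for cell, d_i in estimates.items():
        j = district_of[cell]
        d_j = model_totals[j]
        # Guard against districts where the model predicts no population at all.
        corrected[cell] = census_totals[j] * d_i / d_j if d_j > 0 else 0.0
    return corrected


# Made-up example: two districts with two grid cells each.
estimates = {"a": 100.0, "b": 300.0, "c": 50.0, "d": 50.0}
district_of = {"a": "X", "b": "X", "c": "Y", "d": "Y"}
census_totals = {"X": 1000.0, "Y": 200.0}
population = zonal_density_correction(estimates, district_of, census_totals)
```

After the correction the cell populations of each district add up to that district's census total, while the relative pattern predicted by the model inside the district is preserved.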
In a specific application scenario, when extracting the first feature data in the target area, step 202 may specifically include: dividing the target area into a plurality of second grid units with consistent grid attributes according to a preset grid division rule; and extracting first feature data in each second grid unit, wherein the first feature data at least comprise night light data, normalized vegetation index, digital elevation model data, slope, points of interest and insurance data. Correspondingly, when inputting the first feature data into the trained population distribution prediction model and obtaining the prediction result corresponding to the second feature data, step 202 may specifically include: extracting the target weight value corresponding to each item of first feature data from the trained population distribution prediction model; inputting the first feature data into the trained population distribution prediction model, and calculating the prediction result corresponding to the second feature data based on the target weight values; and correcting the prediction result by zonal density mapping to obtain the population distribution data.
203. Extracting, by masking with a geographic information system platform, the pollutant concentration data in each second grid unit from the global concentration data.
For this embodiment, the geographic information system platform can be used to extract, by masking, the PM2.5 concentration data in each second grid unit from the global concentration data; the PM2.5 concentration data are then allocated to the 500 m × 500 m grids to obtain air pollutant distribution data, which comprise the correspondence between each grid and its PM2.5 concentration data.
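Conceptually, mask extraction keeps the concentration values of the cells that fall inside the target area and discards the rest. The sketch below assumes the concentration raster and the mask have already been projected and resampled onto the same grid; the function names and the sample values are hypothetical, and a GIS platform would normally perform this step.

```python
def mask_extract(concentration, mask):
    """Keep concentration values where mask is True; other cells become None."""
    return [
        [value if inside else None for value, inside in zip(conc_row, mask_row)]
        for conc_row, mask_row in zip(concentration, mask)
    ]


def to_cell_table(grid):
    """Flatten a masked grid into {(row, col): value} for cells inside the target area."""
    return {
        (r, c): value
        for r, row in enumerate(grid)
        for c, value in enumerate(row)
        if value is not None
    }


# Made-up PM2.5 values on a 2 x 2 grid; the top-right cell lies outside the target area.
concentration = [[35.0, 40.0], [50.0, 55.0]]
mask = [[True, False], [True, True]]
masked = mask_extract(concentration, mask)
cell_table = to_cell_table(masked)
```

The resulting table is one possible representation of the correspondence between each grid and its concentration value.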
204. And calculating the population exposure intensity in each second grid unit based on the first calculation formula, population distribution data and pollutant concentration data.
The first calculation formula is:
$$E_i = P_i C_i$$
where $E_i$ is the population exposure intensity of the second grid cell $i$, $C_i$ is the pollutant concentration in the second grid cell $i$, and $P_i$ is the population density in the second grid cell $i$, as output by the population distribution prediction model based on the first feature data.
For this embodiment, after the population exposure intensity in each second grid cell is determined, the population exposure level in each grid may further be determined based on the PM2.5 population exposure intensity in each grid cell and the rating threshold corresponding to each exposure level.
205. Calculate the population-weighted exposure intensity in the target area based on the second calculation formula, the population distribution data, the pollutant concentration data, and the total population of the target area.
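As an illustrative, non-limiting sketch, the per-cell exposure intensity $E_i = P_i C_i$ and the subsequent rating against thresholds can be expressed as follows. The threshold values used in the example are hypothetical, as are the function names; the application does not specify the rating thresholds.

```python
from bisect import bisect_right


def exposure_intensity(population, concentration):
    """E_i = P_i * C_i for every grid cell present in both inputs."""
    return {
        cell: population[cell] * concentration[cell]
        for cell in population
        if cell in concentration
    }


def exposure_level(intensity, thresholds):
    """Return a 1-based level from ascending rating thresholds.

    Level 1 lies below the first threshold; a value equal to a threshold
    is assigned to the higher level (an assumption of this sketch).
    """
    return bisect_right(thresholds, intensity) + 1


if __name__ == "__main__":
    hypothetical_thresholds = [100.0, 500.0, 1000.0, 5000.0]
    e = exposure_intensity({"a": 10.0, "b": 200.0}, {"a": 35.0, "b": 50.0})
    for cell, value in e.items():
        print(cell, value, exposure_level(value, hypothetical_thresholds))
```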
For the present embodiment, a population-weighted PM2.5 pollution exposure model may also be employed to assess the population-weighted PM2.5 pollution level of the study area. Population-weighted PM2.5 pollution exposure evaluation uses the population at different exposure concentrations as the weight and can better reflect the actual impact of PM2.5 on the population. The second calculation formula is:
$$E = \frac{\sum_{i=1}^{n} P_i C_i}{P}$$
where $E$ is the population-weighted exposure intensity of the target area (region/province/city), $C_i$ is the pollutant mass concentration in the second grid cell $i$, $P_i$ is the population in the second grid cell $i$, $n$ is the total number of grid cells in the study area, and $P$ is the total population of the study area.
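As an illustrative, non-limiting sketch, the population-weighted exposure intensity can be computed as a weighted mean of the cell concentrations. The sketch assumes the second calculation formula is the usual population-weighted mean, $E = \sum_i P_i C_i / P$, which matches the symbol definitions above; the function name and the option of deriving $P$ from the cell populations are hypothetical.

```python
def population_weighted_exposure(population, concentration, total_population=None):
    """Population-weighted mean concentration over the grid cells.

    population:       {cell: P_i}
    concentration:    {cell: C_i}
    total_population: P; if omitted, the sum of the P_i is used (an assumption)
    """
    if total_population is None:
        total_population = sum(population.values())
    if total_population <= 0:
        raise ValueError("total population must be positive")

    weighted_sum = sum(p * concentration[cell] for cell, p in population.items())
    return weighted_sum / total_population


if __name__ == "__main__":
    pop = {"a": 100.0, "b": 300.0}
    conc = {"a": 20.0, "b": 60.0}
    print(population_weighted_exposure(pop, conc))
```

In the example, the unweighted mean concentration is 40, while the weighted value is pulled towards the more populous cell, which is the effect the weighting is meant to capture.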
206. Generate a population pollution exposure risk zone map according to the population distribution data and the pollutant concentration data.
The population pollution exposure risk zone map stores the air pollution exposure index corresponding to each second grid unit in the target area.
For this embodiment, performing data superposition analysis on the population distribution data and the pollutant distribution data effectively accounts for the central role of population factors in air pollution exposure risk evaluation, overcomes the defect that existing pollutant concentration indexes neglect population distribution, and further improves the rationality of air pollution exposure risk evaluation.
Correspondingly, in a specific application scenario, step 206 of this embodiment may specifically include: performing dimensionless normalization processing on the population distribution data and the pollutant concentration data to obtain dimensionless data; performing a first classification of population density and a first classification of pollutant concentration on the target area according to the dimensionless data to obtain a population density grade table and a pollution degree grade table; performing a second classification of population density and a second classification of pollutant concentration on the dimensionless data according to the population density grade table and the pollution degree grade table to obtain a population density grading map and a pollutant concentration grading map; and converting the population density grading map and the pollutant concentration grading map into grid data, and superposing the pollutant concentration parameters in the grid data onto the population density parameters to generate the population pollution exposure risk zone map.
In the pollution degree grade table, the degree of influence of the PM2.5 concentration on the space unit is classified into five categories: high, second highest, medium, second lowest, and low. In the population density grade table, the population density is classified into five categories: high density, second highest density, medium density, second lowest density, and low density. The grade indexes are quantized to the integers {1, 2, 3, 4, 5}, where 1 represents low, 2 represents second lowest, 3 represents medium, 4 represents second highest, and 5 represents high. In this embodiment, after the population distribution data and the pollutant concentration data are obtained, dimensionless normalization processing may be performed on the population distribution data and the pollutant concentration data in each second grid unit to obtain dimensionless data; the dimensionless data in each second grid unit is compared with the quantized scores of the grade indexes to obtain the population density grade table and the pollution degree grade table; further, according to the population density grade table and the pollution degree grade table, a second classification of population density and a second classification of pollutant concentration are performed on the dimensionless data in each second grid unit to obtain a population density grading map and a pollutant concentration grading map that display the grade division visually. In these maps, different grades may be marked with different colors. For example, in the population density grading map, the five categories high density, second highest density, medium density, second lowest density, and low density may be marked red, orange, yellow, green, and blue, respectively; in the pollutant concentration grading map, the five categories high, second highest, medium, second lowest, and low may likewise be marked red, orange, yellow, green, and blue. After the PM2.5 concentration grading map and the population density grading map are obtained, the conversion function of the geographic information system platform can be used to convert the PM2.5 concentration grading map and the population density grading map into grid data, and the grid computing function of the geographic information system platform can be used to superpose the PM2.5 concentration parameter onto the population density parameter, thereby obtaining the population exposure data for the PM2.5 concentration in each space unit.
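As an illustrative, non-limiting sketch, the five-level grading and the cell-by-cell superposition can be expressed as follows. The application does not state the breakpoints between grades or the arithmetic used when the concentration grade is superposed onto the population density grade; the breakpoints in the example and the combination rule (the mean of the two grades, rounded up) are therefore assumptions of this sketch, and all function names are hypothetical.

```python
from bisect import bisect_right

# Quantized grade indexes as described in the text: 1 = low ... 5 = high.
GRADE_NAMES = {1: "low", 2: "second lowest", 3: "medium", 4: "second highest", 5: "high"}


def grade(value, breakpoints):
    """Map a dimensionless value to an integer grade in 1..5.

    breakpoints: four ascending values separating the five grades (hypothetical).
    """
    if len(breakpoints) != 4:
        raise ValueError("expected four breakpoints for five grades")
    return bisect_right(breakpoints, value) + 1


def overlay(pop_grades, conc_grades):
    """Combine two grade layers cell by cell.

    Assumed rule (not from the application): mean of the two grades, rounded up.
    """
    return {
        cell: (pop_grades[cell] + conc_grades[cell] + 1) // 2
        for cell in pop_grades
        if cell in conc_grades
    }


if __name__ == "__main__":
    hypothetical_breakpoints = [0.5, 0.8, 1.2, 2.0]
    pop = {"a": 2.5, "b": 0.3}
    conc = {"a": 1.5, "b": 0.6}
    pop_g = {c: grade(v, hypothetical_breakpoints) for c, v in pop.items()}
    conc_g = {c: grade(v, hypothetical_breakpoints) for c, v in conc.items()}
    for cell, g in overlay(pop_g, conc_g).items():
        print(cell, GRADE_NAMES[g])
```

Any other combination rule, such as a lookup table indexed by the two grades, could be substituted in `overlay` without changing the rest of the sketch.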
Preferably, the population exposure data is classified into five types (high risk, second highest risk, medium risk, second lowest risk, and low risk) to obtain the air pollution exposure zone map, in which the pollution exposure index corresponding to each target zone is stored, so that the air pollution exposure of each space unit can be reflected visually.
Correspondingly, when the dimensionless normalization processing is performed on the population distribution data and the pollutant concentration data to obtain the dimensionless data, step 206 of this embodiment may specifically include: calculating the average pollutant concentration and the average population density over the second grid units in the target area; and dividing the population spatial distribution data by the average population density, and dividing the pollutant distribution data by the average pollutant concentration, to obtain the dimensionless data. By performing dimensionless normalization processing on the population data and the pollutant concentration data and then performing risk classification, the abnormal clustering of exposure risks into only the two extremes of "high" and "low" can be avoided, the pollution exposure risk in the assessed target area can be displayed visually, and key risk areas can be identified.
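As an illustrative, non-limiting sketch, the dimensionless normalization amounts to dividing each cell value by the mean over all cells of the target area. The function name `normalize_by_mean` is hypothetical.

```python
def normalize_by_mean(values):
    """Divide each cell value by the mean over all cells, giving dimensionless data.

    values: {cell: value}, for example population density or pollutant concentration
    """
    if not values:
        raise ValueError("no grid cells supplied")
    mean = sum(values.values()) / len(values)
    if mean == 0:
        raise ValueError("mean is zero; the data cannot be normalized")
    return {cell: value / mean for cell, value in values.items()}


if __name__ == "__main__":
    density = {"a": 10.0, "b": 30.0}
    print(normalize_by_mean(density))
```

After this step a value of 1 means "equal to the area average" for both layers, so population density and pollutant concentration become directly comparable despite their different units.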
By means of the environmental risk assessment method based on population distribution, a population distribution prediction model can be trained using regional sample data, and the trained model can then be used to determine, at fine granularity, the prediction result corresponding to the second feature data, namely the population distribution data, based on the first feature data of each dimension in the target region; data superposition analysis is then performed on the pollutant concentration data and the population distribution data in the target area to obtain an environmental risk assessment result. In this way, air pollution distribution data and population distribution data can be combined to carry out a spatialized assessment of population pollution exposure. Compared with the previous approach of using the pollutant concentration directly to evaluate the actual exposure level, the present application takes into account the central role of population factors in air pollution exposure risk evaluation, and can improve the rationality and accuracy of that evaluation by performing data superposition analysis on the population distribution data and the pollutant concentration data.
Further, as a specific implementation of the methods shown in fig. 1 and fig. 2, an embodiment of the present application provides an environmental risk assessment apparatus based on population distribution, as shown in fig. 3, the apparatus includes: a training module 31, an acquisition module 32, and an analysis module 33;
the training module 31 may be configured to obtain regional sample data in a preset region, and train a population distribution prediction model using the regional sample data;
the obtaining module 32 is configured to extract first feature data in the target area, input the first feature data into the trained population distribution prediction model, and obtain a prediction result corresponding to second feature data, where the first feature data at least includes night light data, a normalized vegetation index, digital elevation model data, slope, points of interest, and insurance data, and the second feature data includes population distribution data;
and the analysis module 33 may be configured to determine pollutant concentration data in the target region, and perform data superposition analysis on the population distribution data and the pollutant concentration data to obtain an environmental risk assessment result.
In a specific application scenario, in order to obtain a population distribution prediction model based on region sample data training in a preset region, as shown in fig. 4, the training module 31 may specifically include: a first dividing unit 311, a first extracting unit 312, a training unit 313, an evaluating unit 314, and a judging unit 315;
the first dividing unit 311 may be configured to divide the preset region into a plurality of first grid units with consistent grid attributes;
a first extraction unit 312, configured to extract first feature data and second feature data in each first grid cell;
a training unit 313, configured to determine the first feature data as an input feature of a population distribution prediction model, determine the second feature data as an output feature of the population distribution prediction model, and train the population distribution prediction model;
the evaluation unit 314 is configured to train and evaluate the population distribution prediction model based on a cross validation algorithm to obtain an evaluation result;
the determining unit 315 may be configured to determine that the population distribution prediction model is trained completely if the training error calculated according to the evaluation result is smaller than a preset threshold.
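As an illustrative, non-limiting sketch, the evaluation and determination performed by units 314 and 315 can be pictured as a k-fold cross validation followed by a comparison with a preset threshold. The application does not specify the error metric or the number of folds; the root-mean-square error, the contiguous fold layout, and all function names below are assumptions, and the stand-in model `fit_mean` merely predicts the training mean and is not the random forest model referred to elsewhere in this application.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    if k < 2 or k > n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validated_rmse(features, targets, fit, k=5):
    """Average held-out RMSE over k folds.

    fit(train_x, train_y) must return a callable predict(x) -> float.
    """
    errors = []
    for held_out in k_fold_indices(len(features), k):
        held = set(held_out)
        train_x = [x for i, x in enumerate(features) if i not in held]
        train_y = [y for i, y in enumerate(targets) if i not in held]
        predict = fit(train_x, train_y)
        squared = [(predict(features[i]) - targets[i]) ** 2 for i in held_out]
        errors.append((sum(squared) / len(squared)) ** 0.5)
    return sum(errors) / len(errors)


def is_trained(error, threshold):
    """Training is deemed complete when the error is below the preset threshold."""
    return error < threshold


def fit_mean(train_x, train_y):
    """Stand-in model (not a random forest): always predicts the training mean."""
    mean = sum(train_y) / len(train_y)
    return lambda x: mean


if __name__ == "__main__":
    xs = [[float(i)] for i in range(10)]
    ys = [5.0] * 10
    err = cross_validated_rmse(xs, ys, fit_mean, k=5)
    print(err, is_trained(err, threshold=0.1))
```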
In a specific application scenario, in order to extract and obtain the first feature data in the target region, as shown in fig. 4, the obtaining module 32 may specifically include: a second dividing unit 321, a second extracting unit 322;
a second dividing unit 321, configured to divide the target region into a plurality of second grid units with consistent grid attributes according to a preset grid division rule;
the second extraction unit 322 may be configured to extract first feature data in each second grid unit, where the first feature data at least includes night light data, a normalized vegetation index, digital elevation model data, slope, points of interest, and insurance data.
Correspondingly, in order to obtain the prediction result corresponding to the second feature data by inputting the first feature data into the trained population distribution prediction model, as shown in fig. 4, the obtaining module 32 may specifically include: a third extraction unit 323, an input unit 324, and a correction unit 325;
a third extracting unit 323, configured to extract a target weight value corresponding to each first feature data in the trained population distribution prediction model;
the input unit 324 may be configured to input the first feature data into a trained population distribution prediction model, and obtain a prediction result corresponding to the second feature data based on the target weight value;
the correcting unit 325 may be configured to correct the prediction result through the partition density mapping to obtain population distribution data.
In a specific application scenario, when the environmental risk assessment result is obtained by performing data superposition analysis on the population distribution data and the pollutant concentration data, as shown in fig. 4, the analysis module 33 may specifically include: a fourth extraction unit 331, a first calculation unit 332, a second calculation unit 333, and a generation unit 334;
a fourth extraction unit 331, configured to extract pollutant concentration data in each second grid unit from the global concentration data by using the geographic information system platform;
a first calculation unit 332, configured to calculate population exposure intensity in each second grid cell based on the first calculation formula, population distribution data, and pollutant concentration data;
a second calculating unit 333 operable to calculate a population weighted exposure intensity within the target area based on a second calculation formula and the population distribution data, the pollutant concentration data, the total population within the target area;
the generating unit 334 is configured to generate a population pollution exposure risk zone map according to the population distribution data and the pollutant concentration data.
Correspondingly, when generating the population pollution exposure risk zone map according to the population distribution data and the pollutant concentration data, the generating unit 334 is specifically configured to: perform dimensionless normalization processing on the population distribution data and the pollutant concentration data to obtain dimensionless data; perform a first classification of population density and a first classification of pollutant concentration on the target area according to the dimensionless data to obtain a population density grade table and a pollution degree grade table; perform a second classification of population density and a second classification of pollutant concentration on the dimensionless data according to the population density grade table and the pollution degree grade table to obtain a population density grading map and a pollutant concentration grading map; and convert the population density grading map and the pollutant concentration grading map into grid data, and superpose the pollutant concentration parameters in the grid data onto the population density parameters to generate the population pollution exposure risk zone map.
In a specific application scenario, when dimensionless normalization processing is performed on the population distribution data and the pollutant concentration data to obtain dimensionless data, the generating unit 334 is specifically configured to calculate an average pollutant concentration and an average population density of each second grid unit in the target area; and dividing the population space distribution data by the average population density, and dividing the pollutant distribution data by the average pollutant concentration to obtain dimensionless data.
It should be noted that other corresponding descriptions of the functional units related to the environmental risk assessment device based on population distribution provided in this embodiment may refer to the corresponding descriptions in fig. 1 to fig. 2, and are not described herein again.
Based on the method shown in fig. 1 to 2, correspondingly, the present embodiment further provides a storage medium, on which computer readable instructions are stored, and the computer readable instructions, when executed by a processor, implement the method for assessing environmental risk based on population distribution shown in fig. 1 to 2.
Based on such understanding, the technical solution of the present application may be embodied in the form of a software product, which may be stored in a storage medium (which may be a CD-ROM, a usb disk, a removable hard disk, or the like), and include several instructions to enable a computer device (which may be a personal computer, a server, or a network device, or the like) to execute various implementation scenarios in the present application, so as to implement the above-described environmental risk assessment method based on population distribution as shown in fig. 1 to fig. 2.
Based on the method shown in fig. 1 to fig. 2 and the virtual device embodiments shown in fig. 3 and fig. 4, in order to achieve the above object, the present embodiment further provides a computer device, where the computer device includes a storage medium and a processor; a storage medium for storing a computer program; a processor for executing a computer program to implement the above-described method for population distribution based environmental risk assessment as shown in fig. 1-2.
Optionally, the computer device may further include a user interface, a network interface, a camera, Radio Frequency (RF) circuitry, a sensor, audio circuitry, a WI-FI module, and so forth. The user interface may include a Display screen (Display), an input unit such as a keypad (Keyboard), etc., and the optional user interface may also include a USB interface, a card reader interface, etc. The network interface may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface), etc.
It will be understood by those skilled in the art that the present embodiment provides a computer device structure that is not limited to the physical device, and may include more or less components, or some components in combination, or a different arrangement of components.
The storage medium may further include an operating system and a network communication module. The operating system is a program that manages the hardware and software resources of the computer device described above, supporting the operation of information handling programs and other software and/or programs. The network communication module is used for realizing communication among components in the storage medium and communication with other hardware and software in the information processing entity device.
Through the above description of the embodiments, those skilled in the art will clearly understand that the present application can be implemented by software plus a necessary general hardware platform, and can also be implemented by hardware.
By applying the technical solution of the present application, compared with current approaches to environmental risk assessment, a population distribution prediction model can first be trained using regional sample data, and the trained model can then be used to determine, at fine granularity, the prediction result corresponding to the second feature data, namely the population distribution data, based on the first feature data of each dimension in the target area; data superposition analysis is then performed on the pollutant concentration data and the population distribution data in the target area to obtain an environmental risk assessment result. In this way, air pollution distribution data and population distribution data can be combined to carry out a spatialized assessment of population pollution exposure. Compared with the previous approach of using the pollutant concentration directly to evaluate the actual exposure level, the present application takes into account the central role of population factors in air pollution exposure risk evaluation, and can improve the rationality and accuracy of that evaluation by performing data superposition analysis on the population distribution data and the pollutant concentration data.
Those skilled in the art will appreciate that the figures are merely schematic representations of one preferred implementation scenario and that the blocks or flow diagrams in the figures are not necessarily required to practice the present application. Those skilled in the art will appreciate that the modules in the devices in the implementation scenario may be distributed in the devices in the implementation scenario according to the description of the implementation scenario, or may be located in one or more devices different from the present implementation scenario with corresponding changes. The modules of the implementation scenario may be combined into one module, or may be further split into a plurality of sub-modules.
The above application serial numbers are for description purposes only and do not represent the superiority or inferiority of the implementation scenarios. The above disclosure is only a few specific implementation scenarios of the present application, but the present application is not limited thereto, and any variations that can be made by those skilled in the art are intended to fall within the scope of the present application.

Claims (10)

1. An environmental risk assessment method based on population distribution, comprising:
obtaining regional sample data in a preset region, and training a population distribution prediction model by using the regional sample data;
extracting first feature data in a target area, inputting the first feature data into the trained population distribution prediction model, and obtaining a prediction result corresponding to second feature data, wherein the first feature data at least comprises night light data, a normalized vegetation index, digital elevation model data, slope, points of interest and insurance data, and the second feature data comprises population distribution data;
and determining pollutant concentration data in the target area, and performing data superposition analysis on the population distribution data and the pollutant concentration data to obtain an environmental risk assessment result.
2. The method according to claim 1, wherein the obtaining of region sample data in a preset region and training of a population distribution prediction model using the region sample data specifically comprises:
dividing the preset area into a plurality of first grid units with consistent grid attributes;
extracting first feature data and second feature data in each first grid unit;
determining the first feature data as the input feature of the population distribution prediction model, determining the second feature data as the output feature of the population distribution prediction model, and training the population distribution prediction model;
training and evaluating the population distribution prediction model based on a cross validation algorithm to obtain an evaluation result;
and if the training error calculated according to the evaluation result is smaller than a preset threshold value, judging that the population distribution prediction model is trained completely.
3. The method according to claim 1, wherein the extracting first feature data in the target region specifically includes:
dividing the target area into a plurality of second grid units with consistent grid attributes according to a preset grid division rule;
and extracting first feature data in each second grid unit, wherein the first feature data at least comprises night light data, a normalized vegetation index, digital elevation model data, slope, points of interest and insurance data.
4. The method according to claim 3, wherein the inputting the first feature data into a trained population distribution prediction model to obtain a prediction result corresponding to second feature data specifically comprises:
extracting, from the trained population distribution prediction model, a target weight value corresponding to each item of first feature data;
inputting the first feature data into a trained population distribution prediction model, and calculating to obtain a prediction result corresponding to the second feature data based on the target weight value;
and correcting the prediction result through partition density mapping to obtain population distribution data.
5. The method of claim 1, wherein the determining pollutant concentration data in the target area and performing data overlay analysis on the population distribution data and the pollutant concentration data to obtain an environmental risk assessment result comprises:
extracting by mask, using a geographic information system platform, the pollutant concentration data in each second grid unit from the global concentration data;
calculating population exposure intensities within the respective second grid cells based on a first calculation formula and the population distribution data and the pollutant concentration data;
calculating a population weighted exposure intensity in a target area based on a second calculation formula and the population distribution data, the pollutant concentration data, and a total population number in the target area;
and generating a population pollution exposure risk zone map according to the population distribution data and the pollutant concentration data.
6. The method of claim 5, wherein generating a population pollution exposure risk zone map from the population distribution data and the pollutant concentration data comprises:
performing dimensionless normalization processing on the population distribution data and the pollutant concentration data to obtain dimensionless data;
performing a first classification of population density and a first classification of pollutant concentration on the target area according to the dimensionless data to obtain a population density grade table and a pollution degree grade table;
performing a second classification of population density and a second classification of pollutant concentration on the dimensionless data according to the population density grade table and the pollution degree grade table to obtain a population density grading map and a pollutant concentration grading map;
and converting the population density grading map and the pollutant concentration grading map into grid data, and superposing pollutant concentration parameters in the grid data into population density parameters to generate a population pollution exposure risk zone map.
7. The method according to claim 6, wherein the performing dimensionless normalization on the population distribution data and the pollutant concentration data to obtain dimensionless data comprises:
calculating an average pollutant concentration and an average population density over the second grid units in the target area;
and dividing the population spatial distribution data by the average population density, and dividing the pollutant distribution data by the average pollutant concentration to obtain dimensionless data.
8. An environmental risk assessment device based on population distribution, comprising:
the training module is used for acquiring regional sample data in a preset region and training a population distribution prediction model by using the regional sample data;
the acquisition module is used for extracting first feature data in a target area, inputting the first feature data into the trained population distribution prediction model, and acquiring a prediction result corresponding to second feature data, wherein the first feature data at least comprises night light data, a normalized vegetation index, digital elevation model data, slope, points of interest and insurance data, and the second feature data comprises population distribution data;
and the analysis module is used for determining pollutant concentration data in the target area, and performing data superposition analysis on the population distribution data and the pollutant concentration data to obtain an environmental risk assessment result.
9. A storage medium having stored thereon a computer program, which when executed by a processor implements the method for population distribution based environmental risk assessment according to any one of claims 1 to 7.
10. A computer device comprising a storage medium, a processor, and a computer program stored on the storage medium and executable on the processor, wherein the processor, when executing the program, implements the population distribution based environmental risk assessment method of any one of claims 1 to 7.
CN202110694081.XA 2021-06-22 2021-06-22 Population distribution-based environmental risk assessment method and device and computer equipment Pending CN113298439A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110694081.XA CN113298439A (en) 2021-06-22 2021-06-22 Population distribution-based environmental risk assessment method and device and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110694081.XA CN113298439A (en) 2021-06-22 2021-06-22 Population distribution-based environmental risk assessment method and device and computer equipment

Publications (1)

Publication Number Publication Date
CN113298439A true CN113298439A (en) 2021-08-24

Family

ID=77329123

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110694081.XA Pending CN113298439A (en) 2021-06-22 2021-06-22 Population distribution-based environmental risk assessment method and device and computer equipment

Country Status (1)

Country Link
CN (1) CN113298439A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110458333A (en) * 2019-07-18 2019-11-15 华南农业大学 A kind of population spatial distribution prediction technique and system based on POIs data
CN111709646A (en) * 2020-06-17 2020-09-25 九江学院 Air pollution exposure risk evaluation method and system
CN111932036A (en) * 2020-09-23 2020-11-13 中国科学院地理科学与资源研究所 Fine spatio-temporal scale dynamic population prediction method and system based on position big data
CN112381332A (en) * 2020-12-02 2021-02-19 中国科学院空天信息创新研究院 Population spatial distribution prediction method based on settlement object

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113689048A (en) * 2021-08-31 2021-11-23 广东工业大学 Method, system and computer-readable storage medium for predicting refined spatial distribution of future population
CN114328782A (en) * 2021-12-24 2022-04-12 中科三清科技有限公司 Geographic vector data processing method, electronic device and storage medium
CN116504327A (en) * 2022-09-26 2023-07-28 中国疾病预防控制中心环境与健康相关产品安全所 Near-ground O3 crowd exposure space-time refined analysis and evaluation method and system
CN116504327B (en) * 2022-09-26 2024-01-30 中国疾病预防控制中心环境与健康相关产品安全所 Near-ground O3 crowd exposure space-time refined analysis and evaluation method and system

Similar Documents

Publication Publication Date Title
Araki et al. Spatiotemporal land use random forest model for estimating metropolitan NO2 exposure in Japan
Shi et al. Investigating the influence of urban land use and landscape pattern on PM2.5 spatial variation using mobile monitoring and WUDAPT
Aburas et al. Land suitability analysis of urban growth in Seremban Malaysia, using GIS based analytical hierarchy process
CN110796284B (en) Method and device for predicting pollution level of fine particulate matters and computer equipment
Huang et al. Development of land use regression models for PM2.5, SO2, NO2 and O3 in Nanjing, China
Tian et al. Analysis of spatial and seasonal distributions of air pollutants by incorporating urban morphological characteristics
CN113298439A (en) Population distribution-based environmental risk assessment method and device and computer equipment
Cuvelier et al. CityDelta: A model intercomparison study to explore the impact of emission reductions in European cities in 2010
Feng et al. Spatially-explicit modeling and intensity analysis of China's land use change 2000–2050
Das et al. Assessment of urban sprawl using landscape metrics and Shannon’s entropy model approach in town level of Barrackpore sub-divisional region, India
Mandal et al. Ensemble averaging based assessment of spatiotemporal variations in ambient PM2.5 concentrations over Delhi, India, during 2010–2016
Li et al. Integrated modelling of urban spatial development under uncertain climate futures: a case study in Hungary
Liu et al. Spatio-temporal prediction and factor identification of urban air quality using support vector machine
CN112669976B (en) Crowd health assessment method and system based on ecological environment change
CN115879630A (en) Method and device for immediately characterizing and predicting carbon emission based on land utilization
CN111709646B (en) Air pollution exposure risk evaluation method and system
CN115983522B (en) Rural habitat quality assessment and prediction method
Tseng et al. Assessing relocation strategies of urban air quality monitoring stations by GA-based compromise programming
CN113011455A (en) Air quality prediction SVM model construction method
Jin et al. Enriched spatial analysis of air pollution: Application to the city of Bogotá, Colombia
Georgati et al. Spatially explicit population projections: The case of Copenhagen, Denmark
Kii et al. Development of a suitability model for estimation of global urban land cover
Hysa Classifying the forest surfaces in metropolitan areas by their wildfire ignition probability and spreading capacity in support of forest fire risk reduction
CN111461163B (en) Urban interior PM2.5 concentration simulation and population exposure evaluation method and device
Leonenko Analyzing the spatial distribution of acute coronary syndrome cases using synthesized data on arterial hypertension prevalence

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination