CN115544199A - Foreign population space integration method and system for fine carrier extraction - Google Patents

Info

Publication number
CN115544199A
Authority
CN
China
Prior art keywords
population
data
administrative
value
foreign
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211498052.7A
Other languages
Chinese (zh)
Other versions
CN115544199B (en)
Inventor
张玉
董春
康风光
亢晓琛
赵荣
栗斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chinese Academy of Surveying and Mapping
Original Assignee
Chinese Academy of Surveying and Mapping
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chinese Academy of Surveying and Mapping
Priority to CN202211498052.7A
Publication of CN115544199A
Application granted
Publication of CN115544199B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29: Geographical information databases
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10: Services
    • G06Q50/26: Government or public services

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Educational Administration (AREA)
  • General Health & Medical Sciences (AREA)
  • Development Economics (AREA)
  • Data Mining & Analysis (AREA)
  • Remote Sensing (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application discloses a foreign population space integration method and system for fine carrier extraction, relating to the field of population spatialization. The method comprises the following steps: S1, acquiring and processing foreign multi-level administrative division demographic data and multi-source geospatial data, and constructing a standardized data structure framework; S2, spatially matching the foreign demographic data with administrative demarcation line vector data; S3, finely extracting foreign population distribution carriers based on global 30-meter surface coverage data fused with other multi-source data; S4, extracting the auxiliary factors for integrating spatial elements with population elements, based on administrative divisions and population distribution carriers; and S5, carrying out population space integration and accuracy verification using a multi-factor stepwise fusion modeling method based on a partition strategy. The scheme addresses three problems: foreign population distribution carriers are unknown, foreign demographic data are too coarse in scale, and fine-scale foreign population data are difficult to acquire.

Description

Foreign population space integration method and system for fine carrier extraction
Technical Field
The application relates to the field of population spatialization, in particular to a foreign population space integration method and system for carrier-oriented fine extraction.
Background
At present, research in China focuses mainly on population space integration, or spatialization, at different scales within China. Only one article in the literature addresses foreign population space integration; it reviews and evaluates the evolution mechanism of foreign population distribution in terms of theoretical systems, research methods, and theoretical explanations. Research on foreign population space integration is rarely reported, and no literature is available on foreign population distribution carrier extraction or population space integration technology. The main problems are as follows. First, demographic statistics expressed by administrative district can support government management and policy planning, but foreign administrative districts are large in scale, and expressing population as evenly distributed over such large areas masks the local information and detailed characteristics of foreign population distribution. Second, census areas are generally difficult to obtain, and they are divided in complicated ways and under different rules in different countries and regions, which blurs the geographic differences in population distribution and hinders the establishment of a uniform spatio-temporal reference frame for global population distribution. Third, overly coarse grid cells are not linked to the places where people actually live, so they fall short in expressing the actual, dynamic distribution of population. Fourth, limits on personnel, funding, cost, and institutional arrangements make it difficult to collect, acquire, and interpret residential buildings over large foreign areas, particularly rural areas.
As a result, the integration and comprehensive application of global land geographic information and population information under new circumstances is greatly limited. In recent years, big data acquisition technologies such as positioning and navigation, the internet, earth observation, and surveying and mapping geographic information have developed rapidly, and high-precision, multi-element, high-dimensional global geospatial big data have grown accordingly, providing an effective means for quantitative research on high-precision foreign population space integration. The global 30-meter surface coverage data accurately measure the spatial distribution of global artificial earth surfaces and can provide reliable basic data support for the fine extraction of foreign population distribution carriers and for population space integration. The present application develops a population space integration method for large foreign areas to address the problems that foreign population distribution carriers are unknown, foreign demographic data are too coarse in scale, and fine-scale foreign population data are difficult to acquire, so as to obtain the spatial position, range, and population-related attribute information of population distribution more accurately.
Disclosure of Invention
(I) Object of the application
Based on this, in order to obtain the spatial position of foreign population distribution, to solve the problems that foreign population distribution carriers are unknown, that foreign demographic data are too coarse in scale, and that fine-scale foreign population data are difficult to acquire, and to provide an effective reference for foreign population spatial research and application, the application discloses the following technical scheme.
(II) Technical scheme
The application discloses a foreign population space integration method for fine carrier extraction, which comprises the following steps:
S1, acquiring and processing foreign multi-level administrative division demographic data and multi-source geospatial data, and constructing a standardized data structure framework;
S2, spatially matching the foreign demographic data with administrative demarcation line vector data;
S3, finely extracting foreign population distribution carriers based on global 30-meter surface coverage data fused with DEM data and OSM data;
S4, extracting the auxiliary factors for integrating spatial elements with population elements, based on administrative divisions and population distribution carriers;
S5, carrying out population space integration and accuracy verification using a multi-factor stepwise fusion modeling method based on a partition strategy.
In a possible implementation, constructing the standardized data structure framework specifically includes:
S11: downloading, from websites, the latest-year demographic data of the corresponding country or region in their various formats;
S12: designing a standardized population data structure framework based on a table structure;
S13: entering the demographic information of the corresponding multi-level administrative districts according to the standardized table structure;
S14: translating place names from the languages of different countries into Chinese according to nationally published place name standards and existing place name databases;
S15: verifying the population data and filling in missing population data, so that the initial demographic values of administrative districts at the same level are complete, and forming a standardized data table.
In a possible embodiment, filling in the missing population data specifically includes:
S151: supplementing with the corresponding population data from public channels;
S152: if the corresponding population data cannot be obtained through public channels, analyzing the characteristics of the missing population data; for missing population data with a correlation coefficient greater than 0.5, establishing a regression model between the missing population variable and the existing variables in R software, and imputing the missing population values with the predicted values of the missing variable;
S153: imputing the missing population values that cannot be imputed in step S152 by a multiple imputation method based on repeated simulation.
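The regression-based filling in S152 can be sketched as follows. This is a minimal Python illustration, not the R workflow the application describes; the function names `pearson` and `regression_impute`, the single-predictor model, and the sample values are assumptions made for the example.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no linear signal
    return sxy / (sxx * syy) ** 0.5


def regression_impute(population, predictor, min_corr=0.5):
    """Fill None entries of `population` using a linear fit on `predictor`.

    The fit is applied only when the correlation on the observed pairs
    exceeds `min_corr` (0.5 in S152); otherwise the gaps stay None so
    that a later step (S153) can handle them.
    """
    known = [(x, y) for x, y in zip(predictor, population) if y is not None]
    xs = [x for x, _ in known]
    ys = [y for _, y in known]
    if len(known) < 2 or pearson(xs, ys) <= min_corr:
        return list(population)
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in known) / sum((x - mx) ** 2 for x in xs)
    intercept = my - slope * mx
    return [y if y is not None else intercept + slope * x
            for x, y in zip(predictor, population)]


# Hypothetical example: district population against number of households.
households = [1, 2, 3, 4, 5]
filled = regression_impute([10, 20, None, 40, 50], households)
unfilled = regression_impute([5, 1, None, 1, 5], households)
```

In the first call the observed pairs are strongly correlated, so the gap is filled from the fitted line; in the second the correlation is below the threshold and the gap is left for the multiple imputation step.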
In one possible embodiment, spatially matching the foreign demographic data with the administrative demarcation line vector data specifically includes:
S21: comparing the administrative division codes and names in the foreign demographic data with those in the administrative demarcation line vector data, and extracting the administrative division names that are inconsistent between the two; for the inconsistent names, querying information on changes to the country's administrative divisions, downloading the administrative demarcation line data of the corresponding country or region at all levels, determining the correspondence between the administrative demarcation line vector data and the demographic data, and correcting the administrative demarcation line vector data accordingly;
S22: after correction, appending the demographic information fields to the administrative division vector data layer by an attribute join on the common administrative division name field, which completes the spatial matching of the administrative demarcation line vector data with the demographic data.
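The name comparison in S21 and the attribute join in S22 can be illustrated with a small Python sketch. The record layout, the `normalize` rule, and the district names below are hypothetical; a real workflow would read the vector layer and the standardized data table with GIS tooling.

```python
def normalize(name):
    """Case-fold a name and collapse whitespace so trivial differences do not block a match."""
    return " ".join(name.split()).casefold()


def join_population(boundary_units, stat_rows, key="name"):
    """Attach population counts to boundary units by administrative division name.

    Returns (joined, unmatched): joined units carry a "population" field;
    unmatched lists the boundary names with no statistical counterpart,
    which S21 would resolve by checking administrative division changes.
    """
    stats = {normalize(row[key]): row for row in stat_rows}
    joined, unmatched = [], []
    for unit in boundary_units:
        row = stats.get(normalize(unit[key]))
        if row is None:
            unmatched.append(unit[key])
        else:
            joined.append({**unit, "population": row["population"]})
    return joined, unmatched


# Hypothetical records standing in for the vector layer and the data table.
boundaries = [
    {"code": "01", "name": "District A"},
    {"code": "02", "name": "District  b"},
    {"code": "03", "name": "Old District C"},
]
statistics_table = [
    {"name": "district a", "population": 1200},
    {"name": "District B", "population": 800},
    {"name": "District C", "population": 500},
]
joined, unmatched = join_population(boundaries, statistics_table)
```

Names left in `unmatched` are the ones that need manual correction before the join is repeated.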
In a possible implementation, the fine extraction of foreign population distribution carriers specifically includes:
S31: extracting, mosaicking and fusing the artificial earth surface patches: downloading the global 30-meter surface coverage data, building an index map of the national map sheets of the 30-meter surface coverage data, coding the sheets by row and column number, and establishing the correspondence between each country and its map sheets; extracting the artificial earth surface elements from the global 30-meter surface coverage data according to the 08 code value and converting them to vector format to obtain artificial earth surface vector data; mosaicking the data stored by map sheet and clipping it by the boundary of a specific country or of several countries to obtain the artificial earth surface data of that country or those countries; downloading land use data and extracting residential area patch data according to the flcas attribute field value; and fusing the artificial earth surface data with the residential area patch data while identifying the source of each;
S32: extracting a water area mask: using select-by-location together with water area elements of consistent currency in the OSM data, removing the artificial earth surfaces located entirely within water areas;
S33: identifying abnormal patches: selecting by location the artificial earth surface patches that intersect national boundaries, repeatedly adjusting an area threshold with the aid of visual inspection until all sliver patches at the boundaries are extracted, and then screening out and removing the small-area abnormal patches by attribute selection and erasing;
S34: removing non-residential areas: screening out and erasing the land use types unsuitable for living, using the land use types in the OSM data together with remote sensing imagery;
S35: identifying and removing high-altitude uninhabited areas: downloading global DEM elevation data, extracting the elevation value of each artificial earth surface patch from the DEM data, and removing the artificial earth surfaces whose elevation exceeds a certain threshold;
S36: identifying and removing long, narrow artificial earth surfaces: considering the size, distribution and morphological characteristics of the artificial earth surfaces through human-computer interaction, determining the inflection point of the artificial earth surface boundary density and setting a threshold at that inflection point, and removing the long, narrow artificial earth surfaces by combining the maximum and minimum values of the horizontal and vertical coordinates, the width value, and the kernel density distribution of the triangulated network centroids.
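The screening logic of S32, S33, S35 and S36 can be sketched as a chain of attribute filters. The Python below is a simplified illustration only: the patch fields, all threshold values, and the bounding-box elongation ratio (used here in place of the boundary density and kernel density criteria of S36) are assumptions, not values from the application.

```python
def bbox_elongation(coords):
    """Ratio of the longer to the shorter side of a patch's bounding box."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    width, height = max(xs) - min(xs), max(ys) - min(ys)
    short, long_side = sorted((width, height))
    return float("inf") if short == 0 else long_side / short


def keep_patch(patch, min_area, max_elevation, max_elongation):
    """Return True if the patch passes the water, area, elevation and shape filters."""
    return (not patch["in_water"]
            and patch["area"] >= min_area
            and patch["elevation"] <= max_elevation
            and bbox_elongation(patch["coords"]) <= max_elongation)


def square(side):
    """Corner coordinates of an axis-aligned square, for the hypothetical data below."""
    return [(0, 0), (side, 0), (side, side), (0, side)]


# Hypothetical patches; areas in square meters, elevations in meters.
patches = [
    {"id": 1, "in_water": False, "area": 40000, "elevation": 120, "coords": square(200)},
    {"id": 2, "in_water": True, "area": 40000, "elevation": 5, "coords": square(200)},
    {"id": 3, "in_water": False, "area": 400, "elevation": 300, "coords": square(20)},
    {"id": 4, "in_water": False, "area": 40000, "elevation": 5600, "coords": square(200)},
    {"id": 5, "in_water": False, "area": 90000, "elevation": 50,
     "coords": [(0, 0), (3000, 0), (3000, 30), (0, 30)]},
]
carriers = [p["id"] for p in patches
            if keep_patch(p, min_area=900, max_elevation=5000, max_elongation=10)]
```

Only patch 1 survives: the others are dropped for lying in water, being too small, being too high, or being too elongated, respectively.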
In a possible implementation, extracting the integration auxiliary factors specifically includes:
S41: obtaining the area of each population distribution carrier; downloading the corrected NPP (Suomi National Polar-orbiting Partnership) nighttime light remote sensing data and obtaining the light brightness value of each population distribution carrier unit by raster zonal statistics; and, according to the administrative district to which each carrier belongs, obtaining the total carrier area and the total nighttime light brightness of each administrative district;
S42: downloading road network data; calculating the area of each administrative district and the road length within it, and calculating the road network density of the administrative district as the ratio of length to area; constructing a spatial buffer centered on each population distribution carrier unit with the average distance to the road as its radius, calculating the road length within the buffer and the buffer area, calculating the road network density of the buffer, and assigning that density value to the population distribution carrier unit at the center of the buffer;
S43: downloading global 30-meter resolution DEM data, and calculating, with a surface analysis tool, the terrain factor of the administrative districts at the same scale as the demographic data and of each population distribution carrier;
S44: downloading POI data; calculating the area of each administrative district at the same scale as the demographic data and the number of POIs within it, and calculating the POI density of the administrative district as the ratio of POI count to area; constructing a weighted Voronoi diagram with the population distribution carrier units as generators and their areas as weights, counting the POIs in each Voronoi cell and calculating the cell area, calculating the POI density of each cell as the ratio of count to area, and assigning that density value to the population distribution carrier unit at the center of the cell.
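The per-district totals of S41 and the length-to-area and count-to-area ratios of S42 and S44 reduce to simple aggregations once the geometric measurements exist. A minimal Python sketch follows, assuming the areas, lengths, counts and brightness values have already been measured in a GIS; the field names and numbers are hypothetical.

```python
from collections import defaultdict


def district_totals(carriers):
    """Sum carrier area and nighttime light brightness per administrative district (S41)."""
    totals = defaultdict(lambda: {"area": 0.0, "light": 0.0})
    for carrier in carriers:
        totals[carrier["district"]]["area"] += carrier["area"]
        totals[carrier["district"]]["light"] += carrier["light"]
    return dict(totals)


def density(quantity, area):
    """Quantity per unit area, e.g. road length or POI count over a district, buffer or cell."""
    if area <= 0:
        raise ValueError("area must be positive")
    return quantity / area


# Hypothetical carriers: area in square kilometers, light as summed brightness.
carriers = [
    {"district": "A", "area": 2.0, "light": 30.0},
    {"district": "A", "area": 3.0, "light": 45.0},
    {"district": "B", "area": 1.5, "light": 10.0},
]
totals = district_totals(carriers)
road_density = density(120.0, 60.0)  # 120 km of road in a 60 km^2 district
poi_density = density(45, 1.5)       # 45 POIs in a 1.5 km^2 Voronoi cell
```

The same `density` helper serves the district, the buffer and the Voronoi cell; only the region over which length or count is measured changes.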
In a possible embodiment, carrying out the population space integration and accuracy verification specifically includes:
S51: calculating the initial population of each finely extracted artificial earth surface in proportion to its area, and recording it as Pai, that is, the initial population of a single artificial earth surface;
S52: taking the administrative district used for the initial allocation as the statistical unit, the demographic value of the administrative district should equal the sum of the initial populations of the artificial earth surfaces within it; if the two are not equal, taking the approximate value of their ratio as a correction coefficient and correcting it iteratively until the demographic value of the administrative district exactly equals the sum of the initial populations of all artificial earth surfaces within it; then multiplying the resulting correction coefficient by the initial population of each artificial earth surface to obtain the population estimate of each artificial earth surface;
S53: each administrative district unit contains several sub-administrative district units; based on the sub-administrative district units, calculating the population space integration relative error, that is, the mean over the sub-administrative districts of the absolute value of the ratio of the difference between the population estimate of the artificial earth surfaces in a sub-administrative district and the demographic value of that sub-administrative district to that demographic value:
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{\hat{P}_i - P_i}{P_i}\right|$$
where MAE is the population space integration relative error; $P_i$ is the demographic value of the ith sub-administrative district; $\hat{P}_i$ is the population estimate for the ith sub-administrative district; and $n$ is the number of sub-administrative districts;
S54: constructing a population space integration relative error data set, and identifying and extracting the high-error artificial earth surface patches in it using a box-and-whisker plot combined with the corrected Z-score method; specifically, first, sorting the population space integration relative error values $x_i$ in ascending order, dividing the data set into four equal parts, and calculating the first, second and third quartiles; then calculating the median $\tilde{x}$ of the relative errors and the absolute deviation $|x_i - \tilde{x}|$ of each value from the median, followed by the median of the absolute deviations, $\mathrm{MAD}$; and finally calculating the corrected Z-score, for which the standard form is:
$$M_i = \frac{0.6745\,(x_i - \tilde{x})}{\mathrm{MAD}}$$
taking the corrected Z-score value as a threshold, extracting the single artificial earth surfaces whose population space integration relative error is above the threshold or above the third quartile; on the basis of the single artificial earth surfaces of step S51, further fusing in the residential area patches extracted from the OSM data to obtain population distribution carriers; calculating the initial population of each population distribution carrier in proportion to its area and recording it as Paai; and executing S52 and S53 again;
S55: based on the sub-administrative district units, comparing the population space integration relative errors of the two methods, namely population distribution carriers based on a single artificial earth surface and carriers based on the fusion of artificial earth surfaces with residential area patches; calculating the corrected Z-score value as a threshold; where the error is below the threshold or below the third quartile, selecting the patch-scale population space integration result generated by the corresponding method and recording it as [symbol image not recoverable]; and identifying and extracting all population distribution carrier patches in the administrative districts that are above the threshold or above the third quartile;
S56: introducing light brightness: calculating the population of each population distribution carrier based on carrier area and light brightness, recording it as Pli, and executing S52 and S53;
S57: based on the sub-administrative district units, comparing the population space integration relative errors of the three methods, namely carriers based on a single artificial earth surface, carriers based on the fusion of artificial earth surfaces with residential area patches, and carriers with light brightness introduced; calculating the corrected Z-score value as a threshold; where the error is below the threshold or below the third quartile, selecting the patch-scale population space integration result generated by the corresponding method and recording it as [symbol image not recoverable]; and identifying and extracting all population distribution carrier patches in the administrative districts that are above the threshold or above the third quartile;
S58: introducing road network density and terrain factors: calculating the population of each population distribution carrier based on carrier area, light brightness, road network density and terrain factors, recording the results as Pri, Pei and Ppi respectively, and executing S52 and S53;
S59: based on the sub-administrative districts, comparing the population space integration relative errors of the different methods until the fitting index of the population space integration in all sub-administrative districts is greater than 0.5, and selecting the patch-scale population space integration result generated by the corresponding method as the final result. The fitting index represents the agreement between the estimated values and the statistical values of the population space integration. The calculation formula is as follows:
[formula image not recoverable from the source]
where the quantities in the formula are: the actual statistical value of the population data of the ith sub-administrative district; the estimated value of the population data of the ith sub-administrative district; the population mean; and the number of sub-administrative district units estimated.
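The core arithmetic of S51 to S54 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the application's implementation: the function names are hypothetical, the 0.6745 constant is the usual scaling of the median-based (modified) Z-score, and the 3.5 cutoff in the example is a common convention that the application does not specify. The fitting index of S59 is omitted because its formula is not recoverable from the source.

```python
from statistics import mean, median


def allocate(district_population, weights):
    """Split a district total across carriers in proportion to their weights (S51).

    With carrier area as the weight this yields Pai; the result sums to the
    district total, which is the constraint that S52 enforces.
    """
    total = sum(weights)
    return [district_population * w / total for w in weights]


def mean_abs_relative_error(estimates, statistic_values):
    """Mean of |estimate - statistic| / statistic over sub-districts (S53)."""
    return mean(abs(e - s) / s for e, s in zip(estimates, statistic_values))


def modified_z_scores(values):
    """Median and MAD based scores used to flag unusually large errors (S54)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [0.0 for _ in values]  # no spread, so nothing stands out
    return [0.6745 * (v - med) / mad for v in values]


# Hypothetical district of 1000 people with three carriers of area 1, 1 and 2.
initial = allocate(1000, [1.0, 1.0, 2.0])

# Hypothetical sub-district estimates against their census values.
mae = mean_abs_relative_error([110, 180], [100, 200])

# Hypothetical relative errors; the last one is an obvious outlier.
errors = [0.05, 0.08, 0.10, 0.12, 0.90]
scores = modified_z_scores(errors)
flagged = [i for i, z in enumerate(scores) if z > 3.5]  # 3.5 is an assumed cutoff
```

Carriers flagged this way are the ones that would be passed to the next modeling stage, where residential area patches, light brightness, or road and terrain factors are added as weights.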
As a second aspect, the present application also discloses a foreign population space integration system for fine carrier extraction, comprising:
a data preprocessing module, used for acquiring and processing foreign multi-level administrative division demographic data and multi-source geospatial data and constructing a standardized data structure framework;
a data matching module, used for spatially matching the foreign demographic data with administrative demarcation line vector data;
a carrier unit extraction module, used for finely extracting foreign population distribution carriers based on global 30-meter surface coverage data fused with DEM data and OSM data;
an auxiliary factor extraction module, used for extracting the auxiliary factors for integrating spatial elements with population elements, based on administrative divisions and population distribution carriers;
a population space integration module, used for carrying out population space integration and accuracy verification by a multi-factor stepwise fusion modeling method based on a partition strategy.
(III) Advantageous effects
1. The foreign population space integration method and system for fine carrier extraction disclosed in the application spatially match foreign demographic data with administrative demarcation line vector data and correct the latter, improving the accuracy of the administrative demarcation line vector data.
2. By extracting foreign population distribution carriers in depth, the method and system obtain the effective spatial position and range of population distribution, improving the reliability of foreign population spatialization research.
3. Through multi-factor stepwise fusion modeling and continuous accuracy verification, the method and system screen out the most accurate population space integration result, improving the level of detail of foreign population statistics.
Drawings
The embodiments described below with reference to the drawings are exemplary and intended to be used for explaining and illustrating the present application and should not be construed as limiting the scope of the present application.
Fig. 1 is a schematic flow chart of the foreign population space integration method for fine carrier extraction disclosed in the present application.
Fig. 2 is an example of the spatial matching association of Colombian city-level demographic data with administrative district vector data disclosed in the present application.
Fig. 3 is an example of water area mask extraction disclosed in the present application.
Fig. 4 is an example of abnormal patches at national boundary edges disclosed in the present application.
Fig. 5 is an example of non-residential area removal disclosed in the present application.
Fig. 6 is an example of high-altitude uninhabited area identification and removal disclosed in the present application.
Fig. 7 is the artificial earth surface boundary density index table disclosed in the present application.
Fig. 8 is the processing parameter table for horizontally or vertically elongated artificial earth surfaces disclosed in the present application.
Fig. 9 is the table of kernel density reclassification level thresholds disclosed in the present application.
Fig. 10 is an example of long, narrow artificial earth surface identification and extraction disclosed in the present application.
Fig. 11 is the accuracy statistics table for stepwise modeling of Colombian population space integration disclosed in the present application.
Fig. 12 is the spatial layout map of the Colombian population space integration results disclosed in the present application.
Detailed Description
In order to make the implementation objects, technical solutions and advantages of the present application clearer, the technical solutions in the embodiments of the present application will be described in more detail below with reference to the drawings in the embodiments of the present application.
An embodiment of the carrier-oriented fine extraction foreign population space integration method disclosed in the present application is described in detail below with reference to fig. 1. As shown in FIG. 1, the method disclosed in this embodiment mainly includes the following steps S1 to S5.
S1, acquiring and processing foreign multi-level administrative division demographic data and multi-source geospatial data, and constructing a standardized data structure framework.
Constructing the standardized data structure framework comprises steps S11-S15.
S11: downloading the latest-year demographic data in their various formats via the links to domestic and foreign statistical websites on the website of the National Bureau of Statistics of China;
S12: designing a standardized population data structure framework based on a table structure, whose fields include but are not limited to the administrative division identification code, the English and Chinese names of the administrative division, the population, and the year;
S13: entering the demographic information of the corresponding multi-level administrative districts according to the standardized table structure;
S14: translating place names from foreign languages such as Arabic, Spanish and Portuguese into Chinese according to nationally published place name standards and existing place name databases;
S15: verifying the population data and filling in missing population data, so that the initial demographic values of administrative districts at the same level are complete, and forming a standardized data table; filling in the missing population data comprises steps S151-S153.
S151: first following the links to domestic and foreign statistical websites on the website of the National Bureau of Statistics of China, and preferentially supplementing with the corresponding population data from public channels, which include the World Bank, the World Health Organization, national statistical bureaus, and the like.
S152: if the corresponding population data cannot be obtained through open channels such as foreign statistical websites and the like, analyzing population missing data characteristics, establishing a regression model for population missing data variables and existing variables through R software for the population missing data with the correlation coefficient larger than 0.5, and interpolating the population missing value by using the predicted value of the missing variable. Specifically, an any (is. Na ()) function is called to judge whether the population missing value exists in the data set, and then an lm () function is called to fill the population missing value with the missing data condition expected value.
S153: the missing population value that cannot be interpolated in step S152 is interpolated by a multiple interpolation method based on repetitive simulation. Specifically, a mic () function is called, and a complete data set is generated on the basis of original data by adopting a Monte Carlo method; calling a with () function to carry out statistical modeling on the data; calling a pool () function to integrate the statistical modeling analysis result; finally, calling lm (), coef () functions to compare the standard error of multiple interpolation estimation, and selecting the Nth interpolation result with the minimum standard error to fill in missing data.
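The regression interpolation of step S152 can be sketched in Python as a rough analogue of the R workflow named above. This is a minimal illustration under stated assumptions, not the implementation of the application: the function names `pearson` and `impute_by_regression`, the single-predictor model and the use of `None` to mark a missing value are all hypothetical choices made for the example.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no linear information
    return sxy / (sxx * syy) ** 0.5


def impute_by_regression(population, predictor, min_corr=0.5):
    """Fill None entries of `population` from `predictor` by simple linear regression.

    The gaps are left untouched when the observed pairs are too few or their
    correlation does not exceed `min_corr` (0.5 in step S152).
    """
    pairs = [(x, y) for x, y in zip(predictor, population) if y is not None]
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    if len(pairs) < 3 or pearson(xs, ys) <= min_corr:
        return list(population)
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in pairs) / sum((x - mx) ** 2 for x in xs)
    intercept = my - slope * mx
    # Observed values are kept; each gap receives the regression prediction.
    return [y if y is not None else intercept + slope * x
            for x, y in zip(predictor, population)]
```

Values that remain `None` after this call correspond to the cases that step S153 hands over to multiple imputation.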
S2, carrying out space matching on the foreign demographic data and the administrative demarcation line vector data, wherein the matching method comprises the specific steps of S21-S22.
S21: comparing the administrative division codes and names in the foreign demographic data with those in the administrative division vector data and extracting the administrative division names that are inconsistent; for each inconsistent name, downloading data on administrative division changes for the corresponding country or region from OSM (https://www.openstreetmap.org), determining the correspondence between the administrative division vector data and the demographic data, and correcting the administrative division vector data accordingly;
S22: after the correction, adding the demographic information fields to the administrative division vector data layer by an attribute join on the common field "administrative division name", which completes the spatial matching association between the administrative division vector data and the demographic data; fig. 2 shows the population distribution map obtained by performing this spatial matching association for Colombia.
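Steps S21 and S22 amount to reconciling names and then joining on the shared name field. The following Python sketch illustrates that idea only; the function `match_divisions`, the dictionary layout of the rows and the `renames` lookup (standing in for the manually researched division changes) are assumptions made for the example and do not come from the application.

```python
def match_divisions(stat_rows, vector_features, renames=None):
    """Join population rows to boundary features on the division name.

    stat_rows:       list of dicts with "name" and "population"
    vector_features: list of dicts holding the attributes of boundary polygons,
                     each with a "name" key
    renames:         hypothetical mapping from an outdated vector name to the
                     name used in the statistics
    Returns (joined_features, unmatched_names).
    """
    renames = renames or {}
    population_by_name = {
        row["name"].strip().lower(): row["population"] for row in stat_rows
    }
    joined, unmatched = [], []
    for feature in vector_features:
        corrected = renames.get(feature["name"], feature["name"])
        key = corrected.strip().lower()
        if key in population_by_name:
            # Correct the name on the feature and attach the population field.
            joined.append({**feature, "name": corrected,
                           "population": population_by_name[key]})
        else:
            unmatched.append(feature["name"])
    return joined, unmatched
```

A first call without `renames` lists the inconsistent names of step S21; after the corrections have been looked up, a second call with the filled-in mapping performs the join of step S22.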
S3, refining and extracting the foreign population distribution carriers based on global 30-meter surface coverage data fused with DEM data and OSM data, wherein the refining and extraction comprises the specific steps S31 to S36.
S31: extracting, mosaicking and fusing the artificial surface patches: downloading global 30-meter surface coverage data from the national geographic information resource directory service system (https://www.webmap.cn), establishing an index map of the 30-meter surface coverage map sheets, coding the sheets by row and column number, and establishing the correspondence between each country and its 30-meter surface coverage map sheets; extracting the artificial surface elements from the global 30-meter surface coverage data according to the '08' code value and converting them into vector format to obtain artificial surface vector data; mosaicking the data stored by map sheet and clipping it by the boundary of a specific country or of several countries to obtain the artificial surface data of that country or those countries; downloading land use data from OSM (https://www.openstreetmap.org/), extracting residential patch data according to the 'flcas' attribute field value, fusing the artificial surface data with the residential patch data, and marking the source of each;
S32: extracting a water area mask: merging the water elements in the OSM data that are consistent with the current situation, selecting by location, and removing the artificial surfaces located entirely within water areas; as shown in fig. 3, A in fig. 3 is the artificial surface before the water areas are removed, and B in fig. 3 is the artificial surface after the water areas are removed;
S33: identifying abnormal patches: selecting by location the artificial surface patches that intersect the national boundary, adjusting the area threshold step by step with visual inspection until all fragmentary patches along the boundary are captured, and then selecting by attribute and erasing the abnormal small-area patches; fig. 4 shows abnormal patches along the edge of the national boundary;
S34: removing non-residential land: combining the land use types in the OSM data with remote sensing imagery, selecting and erasing land use types unsuitable for habitation, such as cemeteries, quarries, forests and shrubland; as shown in fig. 5, A in fig. 5 is the artificial surface before non-residential land is removed, and B in fig. 5 is the artificial surface after non-residential land is removed;
S35: identifying and removing uninhabited high-altitude areas: downloading global DEM elevation data, extracting the elevation of each artificial surface patch from it, and removing artificial surfaces whose elevation exceeds a threshold. Under the international classification of altitude, very high altitudes seriously impair bodily functions and damage the body and its core organs, in part irreversibly, so almost no one lives at such altitudes for long periods. The mean DEM value within each artificial surface patch is calculated and taken as the elevation of that patch, and, with 5000 meters as the threshold, patches with an elevation greater than 5000 meters are treated as uninhabited and deleted; A in fig. 6 is the artificial surface before the uninhabited high-altitude areas are removed, and B in fig. 6 is the artificial surface after they are removed.
S36: identifying and removing long, narrow artificial surfaces: considering the size, distribution and morphological characteristics of the artificial surfaces through human-computer interaction, determining the inflection point of the artificial surface boundary density, setting a threshold according to the inflection point, and removing long, narrow artificial surfaces by combining the maximum, minimum and extent of the horizontal and vertical coordinates with the kernel density distribution of the triangulation centroids.
The method for removing long, narrow artificial surfaces comprises the specific steps S361 to S363.
S361: when an artificial surface has a long, narrow strip shape, calculating the boundary density of its polygon and setting a threshold parameter to remove it. In an ArcGIS environment, the perimeter and area of each artificial surface patch in the region are first calculated, and the boundary density index is calculated as the ratio of perimeter to area. The boundary density indices of all artificial surfaces are then plotted as a curve, the outlier inflection point of the curve is identified from the plot, and long, narrow artificial surfaces whose boundary density index is larger than the outlier inflection point are removed; the boundary density indices of the artificial surfaces are shown in fig. 7.
S362: after the long, narrow artificial surfaces whose boundary density index is larger than the outlier inflection point have been removed, long, narrow surfaces with a horizontal or vertical shape are extracted and removed as follows. The maximum, minimum and extent of the horizontal and vertical coordinates of the remaining artificial surface polygons are calculated, giving the long, narrow surface processing parameters shown in fig. 8. Let len_Y_X = Y_wid / X_wid; the len_Y_X field is sorted in descending order and plotted, and the vertically elongated surfaces below the threshold are selected according to the position of the inflection point and erased.
S363: a polygon whose two ends are convex hulls joined by a strip in the middle is generally taken to be an artificial surface in which a road connects built-up areas or other facilities. A triangulated network is constructed for the artificial surface polygon, the centroid of each triangle is extracted and kernel density analysis is performed, and the strip-shaped part of the polygon is then cut off according to a threshold so that the convex-hull surfaces at the two ends are kept.
Specifically, the vertices of the artificial surface polygon are ordered clockwise or counterclockwise to construct the triangulation, and each artificial surface polygon patch is divided into N triangles over the ordered nodes: a node P on the artificial surface polygon is taken as the split point, then the node O preceding P and the node Q following P are taken, so that the three points O, P and Q form the triangle OPQ. These steps are repeated until only 3 polygon nodes remain, at which point the subdivision is complete.
The centroid of each triangle is then extracted, and kernel density analysis is performed on the centroids with the triangle area as the weight. The mean and standard deviation of the kernel density data set are calculated, and the kernel density data set is reclassified into 5 levels by the standard deviation classification method. The criteria for the 5 levels are shown in fig. 9: the mean of the kernel density data set is the 1st threshold, and the mean plus 1, 2 and 3 standard deviations are the 2nd, 3rd and 4th thresholds respectively, which separates 5 value intervals. With the mean of the kernel density data set denoted A and the standard deviation denoted s, the calculation formulas and classification are as follows:
kernel density data set: $X = \{x_1, x_2, \ldots, x_m\}$
average value: $A = \frac{1}{m}\sum_{j=1}^{m} x_j$
standard deviation: $s = \sqrt{\frac{1}{m}\sum_{j=1}^{m}\left(x_j - A\right)^2}$
The kernel density reclassification data is vectorized, the vectorized layer is intersected with the artificial surface patches generated in step S362, and the triangle centroids of the artificial surface that fall in the kernel density reclassification regions of each level are counted; on the basis of the 4 threshold points, the number of kernel density classes is refined further and vectorized again, and once the change in the number of centroids within the class regions levels off, the reclassified patch corresponding to the inflection point is used to cut the artificial surface patches so that the long, narrow middle part is removed; as shown in fig. 10, A in fig. 10 is the artificial surface before vectorization, and B in fig. 10 is the artificial surface after vectorization.
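The subdivision and classification used in step S363 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the ArcGIS workflow of the application: the vertex-cutting loop is written as standard ear clipping (it adds convexity and containment checks so that it also works on concave patches), the kernel density surface itself is not computed and its values are simply passed in, and the population form of the standard deviation (`pstdev`) is an assumed choice. All function names are hypothetical.

```python
from statistics import mean, pstdev


def _cross(o, a, b):
    """Twice the signed area of triangle o-a-b (positive when counter-clockwise)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def _inside(p, a, b, c):
    """True when p lies inside or on the counter-clockwise triangle a-b-c."""
    return _cross(a, b, p) >= 0 and _cross(b, c, p) >= 0 and _cross(c, a, p) >= 0


def signed_area(pts):
    """Shoelace area of an open ring given as a list of (x, y) vertices."""
    return 0.5 * sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]))


def triangulate(ring):
    """Cut triangles O-P-Q off a simple polygon until three vertices remain."""
    pts = list(ring)
    if signed_area(pts) < 0:
        pts.reverse()  # work in counter-clockwise order
    triangles = []
    while len(pts) > 3:
        for i in range(len(pts)):
            o, p, q = pts[i - 1], pts[i], pts[(i + 1) % len(pts)]
            if _cross(o, p, q) <= 0:
                continue  # reflex or collinear vertex cannot be cut off
            if any(_inside(v, o, p, q) for v in pts if v not in (o, p, q)):
                continue  # another vertex blocks this triangle
            triangles.append((o, p, q))
            del pts[i]
            break
        else:
            raise ValueError("ring is not a simple polygon")
    triangles.append(tuple(pts))
    return triangles


def weighted_centroids(triangles):
    """Return ((x, y), area) per triangle; the area serves as the kernel weight."""
    result = []
    for a, b, c in triangles:
        centroid = ((a[0] + b[0] + c[0]) / 3, (a[1] + b[1] + c[1]) / 3)
        result.append((centroid, abs(_cross(a, b, c)) / 2))
    return result


def std_class_breaks(density_values):
    """Thresholds A, A + s, A + 2s, A + 3s that separate the 5 levels."""
    a, s = mean(density_values), pstdev(density_values)
    return [a, a + s, a + 2 * s, a + 3 * s]


def classify(value, breaks):
    """Level 1 to 5 of a kernel density value for the given thresholds."""
    return 1 + sum(value > b for b in breaks)
```

The weighted centroids would be fed to whatever kernel density tool is in use; `std_class_breaks` and `classify` then reproduce the five-level standard deviation reclassification on its output values.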
S4, extracting the auxiliary factors for spatially integrating spatial elements and population elements based on the administrative divisions and the population distribution carriers, wherein the extraction comprises the specific steps S41 to S44.
S41: obtaining the area of each population distribution carrier, downloading corrected NPP night-light remote sensing data from the website of the National Aeronautics and Space Administration (NASA), obtaining the light brightness value of each population distribution carrier unit by raster zonal statistics, and obtaining, for each administrative district, the total area of its population distribution carriers and its total night-light brightness according to the administrative district to which each carrier belongs. Because some population distribution carriers cross administrative district boundaries, the district is decided by area share so that each carrier is kept whole: if more than 50% of a carrier's area lies in an administrative district, the carrier is considered to belong to that district.
S42: downloading road network data from the OSM official website (https://www.openstreetmap.org/), calculating the area of each administrative district and the length of the roads within it, and calculating the road network density of the administrative district as the ratio of length to area; taking each population distribution carrier unit as the center and its average distance to the roads as the radius to build a spatial buffer, calculating the road length within the buffer and the area of the buffer, calculating the road network density of the buffer, and assigning that value to the population distribution carrier unit at the center of the buffer.
S43: downloading global 30-meter resolution DEM data from the national geographic information resource directory service system (https://www.webmap.cn), and using surface analysis tools to calculate the terrain factors, including elevation, slope and aspect, of the administrative districts at the same scale as the demographic data and of each population distribution carrier.
S44: downloading POI data from OSM (https://www.openstreetmap.org), calculating the area of each administrative district at the same scale as the demographic data and the number of POIs within it, and calculating the POI density of the administrative district as the ratio of POI count to area; taking the population distribution carrier units as centers and their areas as weights to construct a weighted Voronoi diagram, counting the POIs in each Voronoi cell and calculating the area of the cell, calculating the POI density of each Voronoi cell as the ratio of POI count to area, and assigning that value to the population distribution carrier unit at the center of the cell.
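The area-share rule of step S41 and the ratio-type densities of steps S42 and S44 can be illustrated with a short Python sketch. The input layout (overlap areas already computed per carrier and district), the function names `aggregate_by_district` and `density`, and the decision to skip a carrier that no district holds by more than half are assumptions made for the example; the application does not say how such a carrier is handled.

```python
from collections import defaultdict


def aggregate_by_district(overlaps, brightness):
    """Assign each carrier to the district holding more than 50% of its area.

    overlaps:   {carrier_id: {district_id: area of the carrier inside it}}
    brightness: {carrier_id: night-light brightness of the carrier}
    Returns {district_id: {"area": total carrier area, "brightness": total}}.
    """
    totals = defaultdict(lambda: {"area": 0.0, "brightness": 0.0})
    for carrier_id, parts in overlaps.items():
        carrier_area = sum(parts.values())
        if carrier_area == 0:
            continue
        district_id, largest = max(parts.items(), key=lambda item: item[1])
        if largest / carrier_area <= 0.5:
            continue  # assumption: no majority district, so the carrier is skipped
        # The whole carrier is counted in one district to keep it intact.
        totals[district_id]["area"] += carrier_area
        totals[district_id]["brightness"] += brightness[carrier_id]
    return dict(totals)


def density(quantity, area):
    """Ratio used for road network density (length / area) and POI density (count / area)."""
    return quantity / area
```

The same `density` ratio applies whether the zone is an administrative district, a buffer around a carrier or a Voronoi cell; only the zone geometry changes.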
S5, carrying out population spatial integration and accuracy verification by the partition-strategy-based multi-factor stepwise fusion modeling method, wherein the accuracy statistics of the stepwise modeling of population spatial integration for Colombia are shown in fig. 11, and the method comprises the specific steps S51 to S59.
S51: calculating the initial population number of each finely extracted artificial surface according to its share of area, recorded as Pai, namely the initial population number of a single artificial surface;
specifically, the initial population number of a single artificial surface is calculated as follows: an identification code is added to each artificial surface, the share of each artificial surface's area in the total artificial surface area of the corresponding administrative district is calculated, and the total population of that administrative district is distributed to each artificial surface in proportion to this share, giving the initial population number of each artificial surface.
S52: taking the administrative district used for the initial distribution as the statistical unit, the demographic value of the administrative district should equal the sum of the initial population numbers of its artificial surfaces; if it does not, the ratio of the two values is taken as an approximate correction coefficient, which is corrected iteratively until the demographic value of the administrative district is exactly equal to the sum of the initial population numbers of all artificial surfaces in the district, giving the final correction coefficient. The initial population number of each artificial surface is multiplied by the correction coefficient to obtain the population estimate of each artificial surface;
S53: the administrative district unit contains several sub-administrative district units; based on the sub-administrative district units, the relative error of population spatial integration is calculated as the mean, over the sub-administrative districts, of the absolute value of the ratio of the difference between the population estimate of the artificial surfaces in a sub-administrative district and the demographic value of that district to the demographic value of that district:
$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{\hat{P}_i - P_i}{P_i}\right|$$
where MAE is the relative error of population spatial integration; $P_i$ is the demographic value of the i-th sub-administrative district; $\hat{P}_i$ is the population estimate of the i-th sub-administrative district; and $n$ is the number of sub-administrative districts.
S54: constructing a data set of population spatial integration relative errors, and identifying and extracting the high-error artificial surface patches in it by means of a box-and-whisker plot and the modified Z-score method. Specifically, the relative error values $e_i$ are first sorted in ascending order, the data set is divided into four equal parts, and the first, second and third quartiles are calculated; then the median $\tilde{e}$ of the relative errors and the absolute deviation $|e_i - \tilde{e}|$ of each value from the median are calculated, followed by the median of the absolute deviations, $MAD$; finally, the modified Z-score is calculated as:
$$M_i = \frac{0.6745\,(e_i - \tilde{e})}{MAD}$$
Taking the modified Z-score as the threshold, the single artificial surfaces whose population spatial integration relative error is larger than the threshold or larger than the third quartile are extracted; the residential patches extracted from the OSM data are then fused with the single artificial surfaces of step S51 to obtain the population distribution carriers, the initial population number of each population distribution carrier is calculated according to its share of area and recorded as Paai, and S52 and S53 are executed again;
S55: based on the sub-administrative district units, comparing the population spatial integration relative errors of the two methods, namely the single artificial surface and the population distribution carrier that fuses artificial surfaces with residential patches; calculating the modified Z-score as the threshold, selecting the result produced by the method whose error is below the threshold or below the third quartile and recording it as the patch-scale population spatial integration result; and identifying and extracting all population distribution carrier patches in the administrative districts whose error is above the threshold or above the third quartile;
S56: introducing light brightness, calculating the population number of each population distribution carrier based on the carrier area and the light brightness, recorded as Pli, and executing S52 and S53;
S57: based on the sub-administrative district units, comparing the population spatial integration relative errors of the three methods, namely the single artificial surface, the population distribution carrier that fuses artificial surfaces with residential patches, and the introduction of light brightness; calculating the modified Z-score as the threshold, selecting the result produced by the method whose error is below the threshold or below the third quartile and recording it as the patch-scale population spatial integration result; and identifying and extracting all population distribution carrier patches in the administrative districts whose error is above the threshold or above the third quartile;
S58: introducing road network density and terrain factors, calculating the population number of each population distribution carrier based on the carrier area, the light brightness, the road network density and the terrain factors, recorded as Pri, Pei and Ppi respectively, and executing S52 and S53;
S59: comparing the population spatial integration relative errors of the different methods based on the sub-administrative districts until the fit index of population spatial integration is greater than 0.5 in all sub-administrative districts, and selecting the patch-scale population spatial integration result produced by the corresponding method as the final result. The fit index expresses the agreement between the estimated values and the statistical values of population spatial integration. The calculation formula is as follows:
[fit index formula: image in the original publication, not reproduced in this text]
In the formula, the terms are: the actual statistical value of the population data of the i-th sub-administrative district; the estimate of the population data of the i-th sub-administrative district; the population mean; and the number of sub-administrative district units estimated.
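Steps S51 to S53 above can be sketched in Python as an area-proportional allocation, a rescaling to the district total and a mean relative error. This is a simplified illustration: the function names are hypothetical, and the iterative correction of step S52 is replaced here by a single multiplication by the ratio of the two totals, which reaches the same equality in one step up to floating-point rounding.

```python
def allocate_population(district_population, carrier_areas):
    """Distribute a district total over its carriers in proportion to area (S51)."""
    total_area = sum(carrier_areas.values())
    return {carrier_id: district_population * area / total_area
            for carrier_id, area in carrier_areas.items()}


def rescale_to_total(estimates, district_population):
    """Multiply every estimate by one coefficient so that they sum to the district total (S52)."""
    coefficient = district_population / sum(estimates.values())
    return {carrier_id: value * coefficient
            for carrier_id, value in estimates.items()}


def mean_relative_error(statistics, estimates):
    """Mean of |estimate - statistic| / statistic over the sub-districts (S53)."""
    return sum(abs((estimates[key] - statistics[key]) / statistics[key])
               for key in statistics) / len(statistics)
```

For later stages the same three functions can be reused with a different weight in place of area, for example area multiplied by light brightness, which is again an assumption about how the factors are combined.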
Specifically, the result of the population spatial integration obtained for Colombia by zonal modeling with the above method is shown in fig. 12.
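The screening of high-error units in step S54 can be sketched in Python with the standard library. The sketch rests on assumptions: the constant 0.6745 and the cutoff of 3.5 are the values conventionally used with the modified Z-score and are not stated in the text, the quartile is computed with the default method of `statistics.quantiles`, and the function names are hypothetical.

```python
from statistics import median, quantiles


def modified_z_scores(values):
    """Modified Z-score of each value, based on the median and the MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [0.0 for _ in values]  # no spread around the median
    return [0.6745 * (v - med) / mad for v in values]


def high_error_indices(errors, z_threshold=3.5):
    """Indices whose relative error exceeds the Z threshold or the third quartile."""
    third_quartile = quantiles(errors, n=4)[2]
    scores = modified_z_scores(errors)
    return [index
            for index, (error, score) in enumerate(zip(errors, scores))
            if score > z_threshold or error > third_quartile]
```

The returned indices identify the units that would be carried into the next modeling stage, while the remaining units keep the result of the current stage.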
An embodiment of the carrier-oriented fine-extraction foreign-population-space integration system disclosed in the present application is described in detail below with reference to fig. 1. The system disclosed in the present embodiment includes:
the data preprocessing module is used for acquiring and processing foreign multi-level administrative division demographic data and multi-source geographic space data and constructing a standardized data structure frame;
the data matching module is used for carrying out space matching on the foreign demographic data and administrative region demarcation line vector data;
the carrier unit extraction module is used for refining and extracting foreign population distribution carriers based on global 30-meter surface coverage data and fusing DEM data and OSM data;
the auxiliary factor extraction module is used for extracting space elements and population element space integrated auxiliary factors based on administrative divisions and population distribution carriers;
the population space integration module is used for carrying out population spatial integration and accuracy verification by the partition-strategy-based multi-factor stepwise fusion modeling method.
In conclusion, spatially matching the foreign demographic data with the administrative boundary line vector data and correcting the latter improves the accuracy of the administrative boundary line vector data; extracting the foreign population distribution carriers in depth yields the effective spatial positions and extents of population distribution and improves the reliability of spatial studies of foreign populations; and stepwise multi-factor fusion modeling with continuous accuracy verification makes it possible to screen out the most accurate population spatial integration result, which improves the accuracy of foreign demographic data.
The division of modules, units or components herein is merely a logical division, and other divisions are possible in an actual implementation; for example, a plurality of modules and/or units may be combined or integrated into another system. Modules, units or components described as separate parts may or may not be physically separate. The components shown as units may or may not be physical units, and may be located in one place or distributed over a plurality of network units. Therefore, some or all of the units can be selected according to actual needs to implement the scheme of the embodiment.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (8)

1. The foreign population space integration method for carrier fine extraction is characterized by comprising the following steps:
S1, acquiring and processing foreign multi-level administrative division demographic data and multi-source geographic space data, and constructing a standardized data structure frame;
S2, carrying out space matching on the foreign demographic data and administrative region demarcation line vector data;
S3, refining and extracting a foreign population distribution carrier based on global 30-meter surface coverage data and by fusing DEM data and OSM data;
S4, extracting space elements and population element space integration auxiliary factors based on administrative divisions and population distribution carriers;
and S5, carrying out population space integration and precision verification by using a multi-factor stepwise fusion modeling method based on a partition strategy.
2. The method of claim 1, wherein the method of constructing a standardized data structure framework specifically comprises:
S11: downloading demographic data of the latest year and different formats of the corresponding country or region through a website;
S12: designing a standardized population data structure framework based on the table structure;
S13: inputting corresponding multi-level administrative region demographic information according to a standardized table structure;
S14: translating different national languages into Chinese according to the national published place name standard and the existing place name library;
S15: checking population data, filling missing population data, ensuring that the initial values of the population statistical data of administrative regions at the same level are complete, and forming a standardized data table.
3. The method of claim 2, wherein the method of filling in missing demographic data specifically comprises:
S151: population data corresponding to public channels are adopted for supplement;
S152: if the corresponding population data cannot be obtained through an open channel, analyzing population missing data characteristics, establishing a regression model for population missing data variables and existing variables through R software for the population missing data with the correlation coefficient larger than 0.5, and interpolating the population missing value by using the predicted value of the missing variable;
S153: the missing population value that cannot be interpolated in step S152 is interpolated by a multiple interpolation method based on repetitive simulation.
4. The method of claim 1, wherein the method of spatially matching foreign demographic data with administrative demarcation line vector data specifically comprises:
S21: comparing administrative division codes and administrative division names in foreign demographic data and administrative division line vector data, extracting inconsistent administrative division names in the data, for the inconsistent administrative division names, downloading all levels of administrative division line data of corresponding countries or regions by inquiring relevant information of administrative division change of the countries, determining the corresponding relation between the administrative division line vector data and the demographic data, and further correcting the administrative division line vector data;
S22: after correction, the demographic information field is added to the administrative division vector data layer by a correlation method through the public field of the administrative division name, and the spatial matching correlation of the administrative division line vector data and the demographic data is completed.
5. The method of claim 1, wherein the refining of the foreign population distribution carriers comprises:
S31: extracting, mosaicking and fusing artificial surface patches: downloading global 30-meter land cover data; building an index map of the 30-meter land cover tiles by country, coded by row and column number, and establishing the correspondence between each country and its land cover tiles; extracting the artificial surface elements from the global 30-meter land cover data according to the code value '08' and converting them to vector format to obtain artificial surface vector data; mosaicking the tile-based data and clipping it to the boundary of a specific country or of several countries to obtain the artificial surface data of that country or those countries; downloading land use data and extracting residential land patch data according to the value of the 'flcas' attribute field; and fusing the artificial surface data with the residential land patch data while labelling the source of each;
S32: extracting a water mask: using the water features in the OSM data whose currency matches that of the artificial surface data, selecting by location and removing the artificial surfaces that lie entirely within water bodies;
S33: identifying abnormal patches: selecting by location the artificial surface patches that intersect national boundaries; adjusting an area threshold repeatedly, with visual inspection, until all fragmented patches along the boundaries are captured; and then selecting by attribute and erasing the abnormal small-area patches;
S34: removing non-residential land: using the land use types in the OSM data together with remote sensing imagery to select and erase land use types unsuitable for habitation;
S35: identifying and removing high-altitude uninhabited areas: downloading global DEM elevation data, extracting the elevation of each artificial surface patch from the DEM data, and removing the artificial surfaces whose elevation exceeds a given threshold;
S36: identifying and removing long, narrow artificial surfaces: through human-computer interaction, considering the size, distribution and morphological characteristics of the artificial surfaces together, determining the inflection point of the artificial surface boundary density and setting a threshold from that inflection point; and removing the long, narrow artificial surfaces using the maximum and minimum horizontal and vertical coordinates, the width, and the kernel density distribution of the triangulated network centroids.
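The screening in S32 to S36 can be sketched, purely for illustration, as a chain of per-patch tests. Every threshold below, the `NON_RESIDENTIAL` set and the patch fields are hypothetical; in particular the bounding-box aspect ratio is a simplified stand-in for the boundary-density and kernel-density criteria of S36, not the claimed procedure.

```python
# Hypothetical land use classes treated as unsuitable for habitation (S34).
NON_RESIDENTIAL = {"industrial", "quarry", "cemetery"}


def keep_patch(patch, min_area_m2=900.0, max_elevation_m=4500.0, max_aspect_ratio=10.0):
    """Return True if a candidate artificial surface patch passes every screening rule."""
    xmin, ymin, xmax, ymax = patch["bbox"]
    short_side = min(xmax - xmin, ymax - ymin)
    long_side = max(xmax - xmin, ymax - ymin)
    if patch["inside_water"]:                    # S32: lies entirely within water
        return False
    if patch["area_m2"] < min_area_m2:           # S33: fragmented small patch
        return False
    if patch["land_use"] in NON_RESIDENTIAL:     # S34: non-residential land
        return False
    if patch["elevation_m"] > max_elevation_m:   # S35: high-altitude uninhabited area
        return False
    # S36 (simplified): reject long, narrow shapes by bounding-box aspect ratio.
    if short_side <= 0 or long_side / short_side > max_aspect_ratio:
        return False
    return True


# Hypothetical patches, one per rejection rule plus one valid patch.
patches = [
    {"id": "a", "bbox": (0, 0, 100, 120), "area_m2": 12000, "elevation_m": 350,
     "inside_water": False, "land_use": "residential"},
    {"id": "b", "bbox": (0, 0, 100, 120), "area_m2": 12000, "elevation_m": 2,
     "inside_water": True, "land_use": "residential"},
    {"id": "c", "bbox": (0, 0, 20, 20), "area_m2": 400, "elevation_m": 350,
     "inside_water": False, "land_use": "residential"},
    {"id": "d", "bbox": (0, 0, 100, 120), "area_m2": 12000, "elevation_m": 350,
     "inside_water": False, "land_use": "industrial"},
    {"id": "e", "bbox": (0, 0, 100, 120), "area_m2": 12000, "elevation_m": 5200,
     "inside_water": False, "land_use": "residential"},
    {"id": "f", "bbox": (0, 0, 2000, 30), "area_m2": 60000, "elevation_m": 350,
     "inside_water": False, "land_use": "residential"},
]

kept_ids = [p["id"] for p in patches if keep_patch(p)]
```

Ordering the tests from cheapest to most expensive keeps the sketch readable; the claim itself relies on interactive threshold tuning, which a fixed default cannot replace.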
6. The method according to claim 1, wherein extracting the auxiliary factors for the space integration specifically comprises:
S41: obtaining the area of each population distribution carrier; downloading corrected NPP (Suomi National Polar-orbiting Partnership) nighttime light remote sensing data and obtaining the light brightness value of each population distribution carrier unit by raster zonal statistics; and, according to the administrative district to which each population distribution carrier belongs, obtaining the total carrier area and the total nighttime light brightness within each administrative district;
S42: downloading road network data; calculating the area of each administrative district and the length of the roads within it, and calculating the road network density of the administrative district as the ratio of the length to the area; constructing a spatial buffer centered on each population distribution carrier unit with a radius equal to the average distance from the roads; measuring the road length within the buffer and the area of the buffer, calculating the road network density of the buffer, and assigning that density value to the population distribution carrier unit at the center of the buffer;
S43: downloading global 30-meter-resolution DEM data, and using a surface analysis tool to calculate the topographic factors of the administrative districts at the same scale as the demographic data and of each population distribution carrier;
S44: downloading POI data; calculating the area of each administrative district at the same scale as the demographic data and the number of POIs within it, and calculating the POI density of the administrative district as the ratio of the number of POIs to the area; constructing a weighted Voronoi diagram with each population distribution carrier unit as a center and its area as the weight; counting the POIs in each Voronoi cell and measuring the cell area, calculating the POI density of each cell as the ratio of the count to the area, and assigning that density value to the population distribution carrier unit at the center of the cell.
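The arithmetic behind S41, S42 and S44 reduces to per-district sums and quantity-to-area ratios. The following sketch shows only that arithmetic on hypothetical numbers; the field names, units and values are assumptions, and the geometric work (zonal statistics, buffers, weighted Voronoi cells) is taken as already done by a GIS tool.

```python
from collections import defaultdict


def district_totals(carriers):
    """Sum carrier area and nighttime light brightness per administrative district (S41)."""
    totals = defaultdict(lambda: {"area": 0.0, "light": 0.0})
    for carrier in carriers:
        totals[carrier["district"]]["area"] += carrier["area"]
        totals[carrier["district"]]["light"] += carrier["light"]
    return dict(totals)


def density(quantity, area):
    """Ratio used for both road network density (length / area) and POI density (count / area)."""
    if area <= 0:
        raise ValueError("area must be positive")
    return quantity / area


# Hypothetical carriers with areas in km^2 and summed light brightness values.
carriers = [
    {"id": 1, "district": "D1", "area": 2.0, "light": 30.0},
    {"id": 2, "district": "D1", "area": 3.0, "light": 50.0},
    {"id": 3, "district": "D2", "area": 1.5, "light": 12.0},
]

totals = district_totals(carriers)
road_density_d1 = density(quantity=42.0, area=120.0)  # S42: km of road per km^2
poi_density_cell = density(quantity=18, area=4.5)     # S44: POIs per km^2 in one Voronoi cell
```

The same `density` helper serves the district scale and the buffer or Voronoi-cell scale, since only the quantity and the area being divided change.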
7. The method of claim 1, wherein carrying out the population space integration and accuracy verification specifically comprises:
S51: calculating the initial population of each finely extracted artificial surface in proportion to its area, denoted Pai, i.e., the initial population of a single artificial surface;
S52: taking the administrative district used for the initial allocation as the statistical unit, in which the population statistic of the administrative district should equal the sum of the initial populations of the artificial surfaces within it; if the two are not equal, taking the approximate value of their ratio as a correction coefficient and correcting the coefficient iteratively until the population statistic of the administrative district exactly equals the sum of the initial populations of all artificial surfaces within it; and multiplying the resulting correction coefficient by the initial population of each artificial surface to obtain the population estimate of each artificial surface;
S53: each administrative district unit contains a plurality of sub-administrative district units; based on the sub-administrative district units, calculating the relative error of the population space integration, defined as the mean of the absolute value of the ratio of the difference between the population estimate of the artificial surfaces in a sub-administrative district and the population statistic of that sub-administrative district to that population statistic, by the formula:
$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{\hat{P}_i - P_i}{P_i}\right|$$
wherein $MAE$ is the relative error of the population space integration; $P_i$ is the population statistic of the i-th sub-administrative district; $\hat{P}_i$ is the population estimate of the i-th sub-administrative district; and $n$ is the number of sub-administrative districts;
S54: constructing a data set of the population space integration relative errors, and identifying and extracting the high-error artificial surface patches in that data set using a box-and-whisker plot combined with the modified Z-score method; specifically, first sorting the relative error values $E_i$ in ascending order, dividing the data set into four equal parts, and calculating the first, second and third quartiles; then calculating the median $\tilde{E}$ of the relative errors and the absolute deviation $|E_i - \tilde{E}|$ of each value from the median, and then the median of the absolute deviations, $MAD$; and finally calculating the modified Z-score by the formula:
$$M_i = \frac{0.6745\,(E_i - \tilde{E})}{MAD}$$
taking the modified Z-score as a threshold, extracting the single artificial surfaces with a high relative error of population space integration, namely those above the threshold or above the third quartile; further fusing the residential patches extracted from the OSM data with the single artificial surfaces of step S51 to obtain population distribution carriers; calculating the initial population of each population distribution carrier in proportion to its area, denoted Paai; and executing S52 and S53 again;
S55: based on the sub-administrative district units, comparing the relative errors of population space integration obtained with the two kinds of population distribution carrier, namely the single artificial surface and the fusion of the artificial surface with the residential patches; calculating the modified Z-score as a threshold; where the error is below the threshold or below the third quartile, selecting the patch-scale population space integration result produced by the corresponding method and recording it as [symbol image not reproduced]; and identifying and extracting all population distribution carrier patches in the administrative districts whose error is above the threshold or above the third quartile;
S56: introducing the light brightness, calculating the population of each population distribution carrier from the carrier area and the light brightness, denoted Pli, and executing S52 and S53;
S57: based on the sub-administrative district units, comparing the relative errors of population space integration obtained with the three methods, namely the single artificial surface, the population distribution carrier fusing the artificial surface with the residential patches, and the introduction of light brightness; calculating the modified Z-score as a threshold; where the error is below the threshold or below the third quartile, selecting the patch-scale population space integration result produced by the corresponding method and recording it as [symbol image not reproduced]; and identifying and extracting all population distribution carrier patches in the administrative districts whose error is above the threshold or above the third quartile;
S58: introducing the road network density and the topographic factors, calculating the population of each population distribution carrier from the carrier area, the light brightness, the road network density and the topographic factors, denoted Pri, Pei and Ppi respectively, and executing S52 and S53;
S59: based on the sub-administrative district units, comparing the relative errors of population space integration under the different methods until the fitting index of the population space integration in every sub-administrative district is greater than 0.5, and selecting the patch-scale population space integration result produced by the corresponding method as the final result; the fitting index expresses the consistency between the estimated value and the statistical value of the population space integration, and is calculated by the formula:
[Formula image not reproduced: fitting index]
wherein the formula is expressed in terms of the actual population statistic of the i-th sub-administrative district, the population estimate of the i-th sub-administrative district, the population mean, and the number of sub-administrative district units evaluated.
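For illustration only, the numerical core of S51 to S54 (area-proportional allocation, correction to the district total, mean relative error, and modified Z-score screening) can be sketched as follows. The function names and sample numbers are hypothetical, the single rescaling step stands in for the iterative correction of S52, and the 0.6745 scale factor and the 3.5 cutoff are the values conventionally used with the modified Z-score, assumed here rather than taken from the claims.

```python
from statistics import median


def allocate_by_area(district_population, areas):
    """S51: split a district's population across carriers in proportion to area."""
    total_area = sum(areas)
    return [district_population * area / total_area for area in areas]


def rescale_to_total(estimates, district_population):
    """S52 (simplified): one multiplicative correction so the estimates sum to the statistic."""
    factor = district_population / sum(estimates)
    return [estimate * factor for estimate in estimates]


def mean_relative_error(estimates, statistics):
    """S53: mean of |estimate - statistic| / statistic over the sub-districts."""
    return sum(abs((e - s) / s) for e, s in zip(estimates, statistics)) / len(statistics)


def modified_z_scores(values, scale=0.6745):
    """S54: modified Z-score based on the median and the median absolute deviation."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [0.0 for _ in values]  # no spread around the median, nothing to flag
    return [scale * (v - med) / mad for v in values]


# Hypothetical numbers (not from the patent).
initial = allocate_by_area(1000, [1.0, 3.0])
adjusted = rescale_to_total([200.0, 600.0], 1000)
error = mean_relative_error([90.0, 220.0], [100.0, 200.0])
scores = modified_z_scores([1.0, 2.0, 3.0, 4.0, 100.0])
flagged = [i for i, score in enumerate(scores) if score > 3.5]
```

Because the median and the median absolute deviation are insensitive to a few extreme values, the single large error in the sample is flagged without distorting the scores of the remaining values.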
8. A foreign population space integration system for fine carrier extraction, characterized by comprising:
a data preprocessing module for acquiring and processing foreign multi-level administrative division demographic data and multi-source geospatial data, and for constructing a standardized data structure framework;
a data matching module for spatially matching the foreign demographic data with the administrative division line vector data;
a carrier unit extraction module for the refined extraction of foreign population distribution carriers based on global 30-meter land cover data fused with DEM data and OSM data;
an auxiliary factor extraction module for extracting, based on the administrative divisions and the population distribution carriers, the auxiliary factors used to spatially integrate spatial elements with population elements;
a population space integration module for carrying out the population space integration and accuracy verification using a multi-factor stepwise fusion modeling method based on a partitioning strategy.
CN202211498052.7A 2022-11-28 2022-11-28 Foreign population space integration method and system for fine carrier extraction Active CN115544199B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211498052.7A CN115544199B (en) 2022-11-28 2022-11-28 Foreign population space integration method and system for fine carrier extraction

Publications (2)

Publication Number Publication Date
CN115544199A true CN115544199A (en) 2022-12-30
CN115544199B CN115544199B (en) 2023-03-31

Family

ID=84722493

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211498052.7A Active CN115544199B (en) 2022-11-28 2022-11-28 Foreign population space integration method and system for fine carrier extraction

Country Status (1)

Country Link
CN (1) CN115544199B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117688120A (en) * 2024-02-02 2024-03-12 中国测绘科学研究院 Method and system for finely dividing public population space data set based on multi-source data

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110428126A (en) * 2019-06-18 2019-11-08 华南农业大学 A kind of urban population spatialization processing method and system based on the open data of multi-source
CN114595428A (en) * 2022-03-11 2022-06-07 辽宁工程技术大学 Multi-source data population spatialization method based on noctilucent remote sensing
CN114881466A (en) * 2022-05-04 2022-08-09 辽宁工程技术大学 Multi-source data-based population space partition fitting method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
朱守杰等: "融合多源空间数据的城镇人口分布估算", 《地球信息科学学报》 *
杨晓荣等: "基于多源数据的福建省人口数据空间化研究", 《贵州大学学报(自然科学版)》 *
王雪梅等: "基于遥感和GIS的人口数据空间化研究进展及案例分析", 《遥感技术与应用》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117688120A (en) * 2024-02-02 2024-03-12 中国测绘科学研究院 Method and system for finely dividing public population space data set based on multi-source data
CN117688120B (en) * 2024-02-02 2024-04-19 中国测绘科学研究院 Method and system for finely dividing public population space data set based on multi-source data

Also Published As

Publication number Publication date
CN115544199B (en) 2023-03-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant