WO2021248335A1 - Method and system for measuring urban poverty spaces based on street view images and machine learning - Google Patents

Method and system for measuring urban poverty spaces based on street view images and machine learning

Info

Publication number
WO2021248335A1
Authority
WO
WIPO (PCT)
Prior art keywords
street view
image data
proportion
poverty
following formula
Prior art date
Application number
PCT/CN2020/095204
Other languages
French (fr)
Chinese (zh)
Inventor
袁媛
刘颖
牛通
Original Assignee
中山大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中山大学 filed Critical 中山大学
Priority to PCT/CN2020/095204 priority Critical patent/WO2021248335A1/en
Priority to CN202080001052.4A priority patent/CN111937016B/en
Publication of WO2021248335A1 publication Critical patent/WO2021248335A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29Geographical information databases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/26Government or public services
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/182Network patterns, e.g. roads or rivers
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A30/00Adapting or protecting infrastructure or their operation
    • Y02A30/60Planning or developing urban green infrastructure

Definitions

  • the present invention relates to the field of artificial intelligence machine learning, and more specifically, to a method and system for measuring urban internal poverty space based on street view pictures and machine learning.
  • the patent (2019102766003) discloses a method for obtaining remote sensing data of a target city through remote sensing satellites and combining POI data for poverty assessment.
  • the above method, however, does not use existing urban street view images for combined evaluation, and it covers fewer evaluation index dimensions.
  • the present invention proposes a method and system for measuring urban poverty space based on street view pictures and machine learning.
  • the invention effectively compensates for the shortcomings of existing research: it not only promotes the refinement of urban poverty research but also enriches the dimensions of urban poverty measurement indicators. It has practical significance for improving poor communities and promoting renewal planning, and it provides an accurate, reliable, and practical method for measuring urban poverty.
  • a method for measuring urban poverty space based on street view pictures and machine learning including the following steps:
  • map information database such as Baidu map, Gaode map, Google map, etc.
  • the street view image data of the target area is divided into several pieces of street view image data;
  • the principal factor is obtained, and the principal factor is defined as the street view factor
  • the multiple deprivation index IMD and street view factor are used as input variables of the machine learning algorithm to obtain the urban poverty score;
  • the invention collects street view image data from a map information database, uses picture segmentation technology to fully excavate element information in the street view image data, and combines mathematical models and computer algorithms to construct a machine learning model for measuring urban poverty.
  • the present invention effectively compensates for the shortcomings of existing measurements: it not only promotes the refinement of urban poverty research but also enriches the dimensions of urban poverty measurement indicators. It has practical significance for improving poor communities and promoting renewal planning, and it provides an accurate, reliable, and practical method for measuring urban poverty.
  • the "construction of the multiple deprivation index IMD based on census data" includes the following sub-contents:
  • each dimension of data corresponds to a proportional weight;
  • the multiple deprivation index IMD is the weighted sum of the dimension values, where E j represents the value of the j-th dimension of data.
  • the value of the income field data is E 1 , and the proportion of the income field data is 0.303; the value of the education field data is E 2 , and the proportion of the education field data is 0.212; the value of the employment field data is E 3 , and the proportion of the employment field data is 0.182; the value of the housing field data is E 4 , and the proportion of the housing field data is 0.303;
  • the multiple deprivation index IMD is expressed by the following formula:
  • IMD = E 1 *0.303 + E 2 *0.212 + E 3 *0.182 + E 4 *0.303.
  • the E 1 is expressed by the following formula:
  • E 1 = proportion of industrial workers j 11 + proportion of low-end service industries j 12 + proportion of divorced and widowed j 13
  • the industrial worker ratio j 11 is expressed by the following formula:
  • proportion of industrial workers j 11 = (number of people in the mining industry + number of people in the manufacturing industry)/total number of employees
  • the low-end service industry ratio j 12 is expressed by the following formula:
  • proportion of low-end service industry j 12 = (population of the electricity, gas and water production and supply industry + population of the wholesale and retail industry + population of the accommodation and catering industry + population of the real estate industry)/total number of employees
  • the divorce and widowhood ratio j 13 is expressed by the following formula:
  • divorce and widowhood ratio j 13 = number of divorced and widowed persons/(unmarried population aged 15 and above + population with a spouse).
  • the low education level j 21 is expressed by the following formula:
  • low education level j 21 = population whose highest education is no schooling, primary school, or junior high school/total population
  • proportion leaving school without a diploma j 22 = population without a diploma/total population.
  • E 4 = proportion of population living per square meter j 41 + proportion without clean energy j 42 + proportion without running water j 43 + proportion without kitchen j 44 + proportion without toilet j 45 + proportion without hot water j 46
  • the population-per-square-meter ratio j 41 is expressed by the following formula:
  • the non-clean-energy ratio j 42 is expressed by the following formula:
  • proportion without clean energy j 42 = number of households using coal, firewood, and other energy sources/total number of households
  • the no-running-water ratio j 43 is expressed by the following formula:
  • proportion without running water j 43 = number of households without running water/total number of households
  • the no-kitchen ratio j 44 is expressed by the following formula:
  • proportion without kitchen j 44 = number of households without kitchen/total number of households
  • proportion without toilet j 45 = number of households without toilet/total number of households
  • the no-hot-water ratio j 46 is expressed by the following formula:
  • proportion without hot water j 46 = number of households without hot water/total number of households.
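As a hedged illustration of the aggregation above, the IMD can be sketched in Python. The weights (0.303, 0.212, 0.182, 0.303) come from the text; all variable names and the sample proportions are illustrative, not real census values:

```python
def domain_score(*indicators):
    """A domain value E_j is the plain sum of its sub-indicator proportions."""
    return sum(indicators)

def imd(e1, e2, e3, e4):
    """Index of Multiple Deprivation with the weights given in the text:
    income 0.303, education 0.212, employment 0.182, housing 0.303."""
    return e1 * 0.303 + e2 * 0.212 + e3 * 0.182 + e4 * 0.303

# Illustrative (made-up) sub-indicator proportions:
e1 = domain_score(0.25, 0.10, 0.05)  # j11 + j12 + j13
e2 = domain_score(0.30, 0.08)        # j21 + j22
```

Because the four weights sum to 1.0, the IMD of a unit score in every domain is exactly 1, which is a quick sanity check on a reimplementation.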
  • the "acquiring street view image data of the target area in the map information database" includes the following sub-steps:
  • M*L pieces of image data are obtained for the sampling points of each target area, and the combined set of image data defining the sampling points of all target areas is the street view image data set of the target area.
  • the M*L pieces of image data represent M images taken in different directions under each vertical viewing angle, with L vertical viewing angles in total.
  • the distance D = 100 meters.
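The sampling scheme above (points every D = 100 m, M headings under each of L vertical angles) can be sketched as follows. The default headings (0°, 90°, 180°, 270°) and pitches (0°, 20°) are assumptions taken from the embodiment later in the document, and the helpers are illustrative, not a real map API:

```python
def sample_points(road_length_m, spacing_m=100):
    """Evenly spaced sampling positions along a road centerline, D = 100 m."""
    return [i * spacing_m for i in range(int(road_length_m // spacing_m) + 1)]

def view_requests(point, headings=(0, 90, 180, 270), pitches=(0, 20)):
    """M*L capture requests per sampling point:
    M horizontal headings under each of the L vertical viewing angles."""
    return [(point, h, p) for p in pitches for h in headings]
```

With M = 4 and L = 2 this yields 8 images per sampling point, matching the M*L description.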
  • the "segmentation of the street view image data of the target area into several pieces of street view image data through the image segmentation technology" includes the following sub-steps:
  • image segmentation is performed on the street view image data set corresponding to the sampling points of the target area using an optimal image segmentation technique, and the result obtained is defined as several pieces of street view image data.
  • the "based on several pieces of street view image data, combined with principal component analysis to obtain the main factor, and define the main factor as the street view factor" includes the following sub-steps:
  • street view indicators are obtained.
  • the street view indicators include the sky openness index P sky , green viewing rate P green , road ratio P road , building ratio P building , interface enclosure degree P enclosure , color elements, salient region feature SRS, and visual entropy VE, where the color elements include the hue and saturation of the street view image data;
  • the sky opening index P sky is calculated by the following formula:
  • the NS i is the number of pixels in the sky in the i-th block of street view image data; the N i is the total number of pixels in the i-th block of street view image data;
  • the green viewing rate P green is calculated by the following formula:
  • the NG i is the number of pixels of vegetation in the i-th block of street view image data
  • the ratio of the road surface P road is calculated by the following formula:
  • the NR i is the number of pixels of the road in the i-th block of street view image data
  • the NB i is the number of pixels of the building in the i-th block of street view image data
  • the interface enclosure degree P enclosure is calculated by the following formula:
  • the salient area feature SRS is calculated by the following formula:
  • the max (R, G, B) represents the maximum value among the color components in the i-th block of street view image data
  • the min (R, G, B) represents the minimum value among the color components in the i-th block of street view image data
  • the visual entropy VE is calculated by the following formula:
  • the P i represents the probability of the i-th block of street view image data and is used to compute the entropy value
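A minimal sketch of the pixel-ratio indicators (P sky, P green, P road, P building) and the visual entropy VE, assuming each image block comes with a semantic label mask from the segmentation step; the label names and the tiny mask are illustrative:

```python
import math
from collections import Counter

def pixel_ratio(labels, target):
    """Indicators of the form N_target_i / N_i, e.g. P_sky = N_sky / N."""
    flat = [lab for row in labels for lab in row]
    return flat.count(target) / len(flat)

def visual_entropy(labels):
    """VE = -sum(p_i * log2(p_i)) over the label distribution of one block."""
    flat = [lab for row in labels for lab in row]
    n = len(flat)
    return -sum((c / n) * math.log2(c / n) for c in Counter(flat).values())

# Illustrative 2x2 label mask for one street view block:
mask = [["sky", "sky"], ["road", "building"]]
```

On this mask, P sky = 0.5 and the entropy reflects three classes with probabilities 0.5, 0.25, 0.25.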
  • the street view index is used as the input variable of the principal component analysis method, and the main factor of the output variable is obtained.
  • the "machine learning algorithm" is a random forest algorithm.
  • the random forest algorithm uses random repeated sampling and random node splitting, performing classification and prediction by ensemble learning over a large number of tree structures; it is a simple, stable, and highly accurate algorithm.
  • the street view index is greatly affected by the position, location, angle of view, etc.
  • the present invention uses a random forest algorithm that belongs to a non-linear model to realize the simulation prediction of the urban poverty score with complex and multi-dimensional street view data. Since the random forest algorithm can evaluate all variables, there is no need to worry about the problem of multiple collinearity between variables.
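The two randomization ingredients named above, sampling with replacement and aggregating many trees by majority vote, can be shown in miniature. These are pure-Python stand-ins for illustration, not the full random forest algorithm:

```python
import random

def bootstrap_sample(data, rng):
    """Random repeated sampling: draw len(data) items with replacement,
    giving each tree its own training subset."""
    return [rng.choice(data) for _ in data]

def majority_vote(predictions):
    """Aggregate the trees' class outputs; the most frequent class wins."""
    return max(set(predictions), key=predictions.count)
```

In a full implementation each bootstrap sample would grow one decision tree, and `majority_vote` would combine the per-tree poverty-level predictions into the final output.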
  • the invention also discloses an urban poverty space measurement system based on street view pictures and machine learning based on the above method, which includes an image acquisition module, an image segmentation module, a picture combination module, a street view index module, and an urban poverty score calculation module, wherein:
  • the image acquisition module is used to acquire street view image data of the target area
  • the picture combination module is used to combine M image data in different directions with the same vertical viewing angle of the sampling point to obtain street view image data of the target area;
  • the image segmentation module is used to segment the street view image data of the target area into several pieces of street view image data
  • the street view index module is used to calculate the street view index of the target area
  • the urban poverty score calculation module uses the multiple deprivation index IMD and the street view factor as input variables of the machine learning algorithm to obtain the urban poverty score.
  • the street view indicator module includes an image element pixel ratio calculation module and a color complexity calculation module, wherein:
  • the image element pixel ratio calculation module is used to calculate the sky open index P sky , the green viewing rate P green , the road surface ratio P road , the building ratio P building , and the interface enclosure degree P enclosure ;
  • the color complexity calculation module is used to calculate the visual entropy VE.
  • the invention collects street view image data from a map information database, uses picture segmentation technology to fully excavate element information in the street view image data, and combines mathematical models and computer algorithms to construct a machine learning model for measuring urban poverty.
  • the present invention effectively compensates for the shortcomings of existing measurements: it not only promotes the refinement of urban poverty research but also enriches the dimensions of urban poverty measurement indicators. It has practical significance for improving poor communities and promoting renewal planning, and it provides an accurate, reliable, and practical method for measuring urban poverty.
  • Figure 1 is a flow chart of Example 1;
  • Figure 2 is a distribution map of urban poverty levels with multiple deprivation index IMD;
  • Figure 3 is a distribution map of sampling points for street view images;
  • Figure 4 is a schematic diagram of the process of segmentation and interpretation of street view images;
  • Figure 6 is the spatial distribution pattern of the sense of enclosure of streetscape buildings;
  • Figure 7 is the spatial distribution pattern of the sense of enclosure of streetscape vegetation;
  • Figure 8 is the spatial distribution pattern of the sense of openness of the streetscape sky;
  • Figure 9 is the spatial distribution pattern of the sense of openness of the streetscape roads;
  • Figure 10 is the spatial distribution pattern of the complex sense of streetscape color;
  • Figure 11 is the distribution map of the urban poverty level predicted by the street view.
  • a method for measuring urban poverty space based on street view pictures and machine learning includes the following steps:
  • map information database such as Baidu map, Gaode map, Google map, etc.
  • the street view image data of the target area is divided into several pieces of street view image data;
  • the principal factor is obtained, and the principal factor is defined as the street view factor
  • the multiple deprivation index IMD and the street view factor are used as input variables of the machine learning algorithm to obtain the urban poverty score.
  • Embodiment 1 collects street view image data from a map information database, uses image segmentation technology to fully excavate the element information in the street view image data, and combines mathematical models and computer algorithms to construct a machine learning model for measuring urban poverty.
  • the present invention effectively compensates for the shortcomings of existing measurements: it not only promotes the refinement of urban poverty research but also enriches the dimensions of urban poverty measurement indicators. It has practical significance for improving poor communities and promoting renewal planning, and it provides an accurate, reliable, and practical method for measuring urban poverty.
  • in Embodiment 1, "constructing a multiple deprivation index IMD based on census data" includes the following sub-contents:
  • each dimension of data corresponds to a proportional weight;
  • the multiple deprivation index IMD is the weighted sum of the dimension values, where E j represents the value of the j-th dimension of data.
  • the four dimensions of data are income field data, education field data, employment field data, and housing field data.
  • the value of the income field data is E 1 , and the proportion of the income field data is 0.303; the value of the education field data is E 2 , and the proportion of the education field data is 0.212; the value of the employment field data is E 3 , and the proportion of the employment field data is 0.182; the value of the housing field data is E 4 , and the proportion of the housing field data is 0.303;
  • the multiple deprivation index IMD is expressed by the following formula:
  • IMD = E 1 *0.303 + E 2 *0.212 + E 3 *0.182 + E 4 *0.303.
  • E 1 is expressed by the following formula:
  • E 1 = proportion of industrial workers j 11 + proportion of low-end service industries j 12 + proportion of divorced and widowed j 13
  • proportion of industrial workers j 11 = (number of people in the mining industry + number of people in the manufacturing industry)/total number of employees
  • proportion of low-end service industry j 12 = (population of the electricity, gas and water production and supply industry + population of the wholesale and retail industry + population of the accommodation and catering industry + population of the real estate industry)/total number of employees
  • the divorce and widowhood ratio j 13 is expressed by the following formula:
  • divorce and widowhood ratio j 13 = number of divorced and widowed persons/(unmarried population aged 15 and above + population with a spouse).
  • E 2 is expressed by the following formula:
  • the low education level j 21 is expressed by the following formula:
  • low education level j 21 = population whose highest education is no schooling, primary school, or junior high school/total population
  • the proportion leaving school without a diploma j 22 is expressed by the following formula:
  • proportion leaving school without a diploma j 22 = population without a diploma/total population.
  • E 3 is expressed by the following formula:
  • E 4 is expressed by the following formula:
  • E 4 = proportion of population living per square meter j 41 + proportion without clean energy j 42 + proportion without running water j 43 + proportion without kitchen j 44 + proportion without toilet j 45 + proportion without hot water j 46
  • the population-per-square-meter ratio j 41 is expressed by the following formula:
  • the non-clean-energy ratio j 42 is expressed by the following formula:
  • proportion without clean energy j 42 = number of households using coal, firewood, and other energy sources/total number of households
  • the no-running-water ratio j 43 is expressed by the following formula:
  • proportion without running water j 43 = number of households without running water/total number of households
  • proportion without kitchen j 44 = number of households without kitchen/total number of households
  • the no-toilet ratio j 45 is expressed by the following formula:
  • proportion without toilet j 45 = number of households without toilet/total number of households
  • the no-hot-water ratio j 46 is expressed by the following formula:
  • proportion without hot water j 46 = number of households without hot water/total number of households.
  • the combined set of image data defining the sampling points of all target areas is the street view image data set of the target area.
  • the M*L image data represent M images taken in different directions under each vertical viewing angle, with L vertical viewing angles in total.
  • image segmentation is performed on the street view image data set corresponding to the sampling points of the target area using an optimal image segmentation technique, and the result obtained is defined as several pieces of street view image data.
  • street view indicators include the sky openness index P sky , green viewing rate P green , road ratio P road , building ratio P building , interface enclosure P enclosure , color elements, salient region feature SRS, and visual entropy VE, where color elements include the hue and saturation of the street view image data;
  • the sky opening index P sky is calculated by the following formula:
  • NS i is the number of pixels in the sky in the i-th block of street view image data
  • N i is the total number of pixels in the i-th block of street view image data
  • the green viewing rate P green is calculated by the following formula:
  • NG i is the number of pixels of vegetation in the i-th block of street view image data
  • NR i is the number of pixels of the road in the i-th block of street view image data
  • the building proportion P building is calculated by the following formula:
  • NB i is the number of pixels of the building in the i-th block of street view image data
  • the salient area feature SRS is calculated by the following formula:
  • max(R,G,B) represents the maximum value of the color components in the i-th block of street view image data
  • min(R,G,B) represents the minimum value of the color components in the i-th block of street view image data
  • the visual entropy VE is calculated by the following formula:
  • P i represents the probability of the i-th block of street view image data and is used to compute the entropy value
  • the street view index is used as the input variable of the principal component analysis method, and the main factor of the output variable is obtained.
  • the random forest algorithm uses random repeated sampling and random node splitting, performing classification and prediction by ensemble learning over a large number of tree structures; it is a simple, stable, and highly accurate algorithm.
  • the street view index is greatly affected by the position, location, angle of view, etc.
  • the present invention uses a random forest algorithm that belongs to a non-linear model to realize the simulation prediction of the urban poverty score with complex and multi-dimensional street view data. Since the random forest algorithm can evaluate all variables, there is no need to worry about the problem of multiple collinearity between variables.
  • a method for measuring urban poverty space based on street view pictures and machine learning including the following steps:
  • Step 1 Calculate 11 indicators from the sixth national census data, construct a traditional indicator system for measuring urban poverty, and calculate the multiple deprivation index (IMD), as shown in Figure 2;
  • Step 2: Along the main roads, sub-main roads, and branch roads, street view sampling points are placed at a uniform spacing of 100 meters; at each sampling point, images are captured in four directions (0°, 90°, 180°, and 270°) at the horizontal viewing angle and at a 20° elevation angle. The acquisition time is close to that of the sixth national census. Baidu Map street views covering 8,536 sampling points and 286 communities were obtained, 61,864 pictures in total; their spatial distribution is shown in Figure 3;
  • Step 3: Randomly sample half of the street view images of the case communities and, supported by the TensorFlow deep learning framework widely used in computer vision, interpret them with artificial intelligence models based on FCN, SegNet, and PSPNet (as shown in Figure 4). Accuracy is evaluated with:
  • PA: Pixel Accuracy
  • MPA: Mean Pixel Accuracy
  • MIoU: Mean Intersection over Union
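These three segmentation metrics can all be read off a class confusion matrix. A hedged sketch follows; the `cm[true][pred]` layout is an assumption, and the example matrix is illustrative:

```python
def segmentation_metrics(cm):
    """Return (PA, MPA, MIoU) from a square confusion matrix cm[true][pred]."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    pa = sum(cm[i][i] for i in range(k)) / total              # pixel accuracy
    mpa = sum(cm[i][i] / sum(cm[i]) for i in range(k)) / k    # mean per-class accuracy
    miou = sum(                                               # mean IoU: TP / (TP + FP + FN)
        cm[i][i] / (sum(cm[i]) + sum(cm[r][i] for r in range(k)) - cm[i][i])
        for i in range(k)
    ) / k
    return pa, mpa, miou
```

For a symmetric two-class matrix [[3, 1], [1, 3]] this gives PA = MPA = 0.75 and MIoU = 0.6.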
  • Step 4 Summarize the characteristics of streetscape indicators of typical poor communities, and use the method of correlation analysis to determine the streetscape elements related to the degree of urban poverty.
  • the principal component analysis method is used for dimensionality reduction of the multi-view, multi-element street view indicators, and the factor loading matrix is rotated to extract and name the high-contribution street view factors, namely architectural enclosure, vegetation enclosure, sky openness, road openness, and color complexity, as shown in Figures 6-10.
  • Step 5: Take the important street view factors obtained in the previous step as independent variables and the multiple deprivation index (IMD) as the reference variable to construct a random forest prediction model. After testing against the remaining 50% of the sample data, this step is repeated to generate a large number of decision trees; when the model error reaches its smallest, stable state, the growth of the random forest is terminated, the urban poverty level is judged, and the most frequent classification result is output as the final street-view measurement of the degree of urban poverty. The average accuracy of the model reached 82.48%; the specific result is shown in Figure 11.
  • the urban poverty level is assigned a value from 0 to 5, with larger numbers indicating deeper poverty. The data are then stratified by community level in proportion, and 50% are drawn as the training sample. Random repeated sampling with replacement is used to select N data subsets of the same size as the existing training data and grow N independent decision tree models. After calculating the accuracy of the model predictions and the total model error, it is found that the average prediction error reaches its minimum when the number of tree nodes is 6; when the number of trees is varied from 0 to 100, the total model error stabilizes once 55 decision trees have been generated.
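The stratified 50% draw per poverty level described above might look like the following sketch; the `(level, features)` record layout and the seed are assumptions for illustration:

```python
import random

def stratified_split(records, frac=0.5, seed=0):
    """Draw `frac` of each poverty level (0-5) as the training sample.
    `records` is a list of (level, features) pairs; layout is illustrative."""
    rng = random.Random(seed)
    by_level = {}
    for rec in records:
        by_level.setdefault(rec[0], []).append(rec)  # group by poverty level
    train = []
    for group in by_level.values():
        rng.shuffle(group)
        train.extend(group[: int(len(group) * frac)])  # proportional draw
    return train
```

Drawing proportionally within each level keeps the class balance of the training sample identical to that of the full data set, which matters when the poverty levels are unevenly represented.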
  • the parameters of the random forest model are determined during this demonstration.
  • the generation of tree nodes is determined by adding variables one by one and comparing misjudgment rates, that is, selecting from the M available attributes and splitting on the most representative random feature variable.
  • This demonstration process compares 8 indicators at 0° and 20° in pairs, puts more important street view indicators into the model, allows all decision trees to grow as much as possible, and does not modify any parameters during the model building process. This helps to reduce the correlation between the decision tree used for classification and regression, enrich the comprehensiveness of the model and improve the classification ability.
  • this step is repeated to generate a large number of decision trees.
  • the model error tends to the smallest and stable state, the growth of the random forest is terminated.
  • the urban poverty level is judged, and the type with the highest frequency is finally output as the final output value of the random forest model, as shown in Table 1.
  • some street view indicators perform better at the 0° viewing angle, as do the color elements and the color-based salient region features.
  • the 20° angle of view index contributes more to the correct prediction of the model.
  • the model's predictive ability is improved.
  • Embodiment 2 is an application based on Embodiment 1: a system for measuring urban poverty space based on street view pictures and machine learning, including an image acquisition module, an image segmentation module, a picture combination module, a street view index module, and an urban poverty score calculation module, where:
  • the image acquisition module is used to acquire street view image data of the target area
  • the picture combination module is used to combine M image data of different directions with the same vertical viewing angle of the sampling points to obtain street view image data of the target area;
  • the image segmentation module is used to segment the street view image data of the target area into several pieces of street view image data
  • the street view index module is used to calculate the street view index of the target area
  • the urban poverty score calculation module uses the multiple deprivation index IMD and the street view factor as input variables of the machine learning algorithm to obtain the urban poverty score.
  • the street view index module includes a calculation module for the proportion of image elements and a color complexity calculation module, where,
  • the image element pixel ratio calculation module is used to calculate the sky open index P sky , the green viewing rate P green , the road surface ratio P road , the building ratio P building , and the interface enclosure degree P enclosure ;
  • the color complexity calculation module is used to calculate the visual entropy VE.


Abstract

A method and a system for measuring urban poverty spaces based on street view images and machine learning, comprising the following steps: on the basis of census data, constructing an index of multiple deprivation IMD; acquiring street view image data of a target area from a map information database; by means of image segmentation, segmenting the street view image data of the target area into several blocks of street view image data; on the basis of the several blocks of street view image data, applying principal component analysis to obtain principal factors, and defining the principal factors as street view factors; using the index of multiple deprivation IMD and the street view factors as input variables of a machine learning algorithm to obtain an urban poverty score; and evaluating the degree of urban poverty on the basis of the urban poverty score. Also disclosed is a system, based on the present method, for measuring intra-urban poverty space by means of street view images and machine learning. The present method and system advance the refinement of urban poverty research and enrich the dimensions of urban poverty measurement indicators.

Description

Method and system for measuring intra-urban poverty space based on street view images and machine learning

Technical Field
The present invention relates to the field of artificial intelligence and machine learning, and more specifically to a method and system for measuring intra-urban poverty space based on street view images and machine learning.
Background Art
Since the 1960s and 1970s, traditional urban poverty measurement, represented by the concept of multiple deprivation, has gradually matured; however, indicator systems built on socio-economic statistics often suffer from long update cycles, low availability, and a single data source. With the advent of the information age, Western scholars have begun to identify poverty spaces with big data such as remote sensing imagery, nighttime lights, bus smart-card records, online rents, and map points of interest. Existing domestic research, however, uses satellite imagery mainly for rural poverty, applying a single data type such as remote sensing images or nighttime lights to large regions or urban-rural belts, while new data and techniques are rarely used in urban poverty measurement. Data suited to urban areas are therefore needed to expand the range of urban poverty indicators and refine the measurement scale, so as to probe the spatial phenomenon of poverty in depth.
Patent application 2019102766003 discloses a method that acquires remote sensing data of a target city via remote sensing satellites and combines it with POI data for poverty assessment. That method does not incorporate existing urban street view images into the assessment, so its evaluation indicators cover fewer dimensions.
Summary of the Invention
To overcome the above shortcomings of the prior art, the present invention proposes a method and system for measuring intra-urban poverty space based on street view images and machine learning. The invention effectively remedies the deficiencies of existing research: it advances the refinement of urban poverty research, enriches the dimensions of urban poverty measurement indicators, has practical significance for improving poor communities and advancing renewal planning, and is an accurate, reliable, and practicable method for measuring intra-urban poverty.
To solve the above technical problems, the technical scheme of the present invention is as follows:
A method for measuring intra-urban poverty space based on street view images and machine learning, comprising the following steps:
constructing the index of multiple deprivation IMD from census data;
acquiring street view image data of the target area from a map information database (such as Baidu Maps, AMap, Google Maps, etc.);
segmenting the street view image data of the target area into several blocks of street view image data by image segmentation;
obtaining principal factors from the several blocks of street view image data by principal component analysis, and defining the principal factors as street view factors;
using the index of multiple deprivation IMD and the street view factors as input variables of a machine learning algorithm to obtain an urban poverty score;
evaluating the degree of urban poverty according to the urban poverty score.
The invention collects street view image data from a map information database, fully mines the element information in the street view image data using image segmentation, and combines mathematical models with computer algorithms to build a machine learning model that measures the degree of urban poverty. The invention effectively remedies the deficiencies of existing measures: it advances the refinement of urban poverty research, enriches the dimensions of urban poverty measurement indicators, has practical significance for improving poor communities and advancing renewal planning, and is an accurate, reliable, and practicable method for measuring intra-urban poverty.
In a preferred scheme, "constructing the index of multiple deprivation IMD from census data" includes the following:
obtaining data in P dimensions from the census data, where the data of each dimension corresponds to a proportional weight λ;
The index of multiple deprivation IMD is expressed by the following formula:

IMD = Σ_{j=1..P} λ_j × E_j

where E_j denotes the value of the data in the j-th dimension and λ_j its proportional weight.
In a preferred scheme, P = 4, and the four dimensions are income, education, employment, and housing. The value of the income dimension is E_1, and the weight of the income dimension is 0.303; the value of the education dimension is E_2, and its weight is 0.212; the value of the employment dimension is E_3, and its weight is 0.182; the value of the housing dimension is E_4, and its weight is 0.303. The index of multiple deprivation IMD is then expressed by the following formula:

IMD = E_1 × 0.303 + E_2 × 0.212 + E_3 × 0.182 + E_4 × 0.303.
In a preferred scheme, E_1 is expressed by the following formula:

E_1 = industrial worker proportion j_11 + low-end service industry proportion j_12 + divorced/widowed proportion j_13

The industrial worker proportion j_11 is expressed by the following formula:

industrial worker proportion j_11 = (population in mining + population in manufacturing) / total employed population

The low-end service industry proportion j_12 is expressed by the following formula:

low-end service industry proportion j_12 = (population in electricity, gas and water production and supply + population in wholesale and retail + population in accommodation and catering + population in real estate) / total employed population

The divorced/widowed proportion j_13 is expressed by the following formula:

divorced/widowed proportion j_13 = divorced and widowed population / (unmarried population aged 15 and above + population with a spouse).
In a preferred scheme, E_2 is expressed by the following formula:

E_2 = low education level j_21 + proportion leaving school without a diploma j_22

The low education level j_21 is expressed by the following formula:

low education level j_21 = population with no schooling, primary school only, or junior high school only / total population

The proportion leaving school without a diploma j_22 is expressed by the following formula:

proportion leaving school without a diploma j_22 = population without a diploma / total population.
In a preferred scheme, E_3 is expressed by the following formula:

E_3 = unemployment proportion j_31 = unemployed population / total population.
In a preferred scheme, E_4 is expressed by the following formula:

E_4 = population per square meter of housing j_41 + no-clean-energy proportion j_42 + no-tap-water proportion j_43 + no-kitchen proportion j_44 + no-toilet proportion j_45 + no-hot-water proportion j_46

The population per square meter of housing j_41 is expressed by the following formula:

population per square meter of housing j_41 = 1 / per capita housing floor area (square meters per person)

The no-clean-energy proportion j_42 is expressed by the following formula:

no-clean-energy proportion j_42 = households using coal, firewood, or other non-clean energy / total households

The no-tap-water proportion j_43 is expressed by the following formula:

no-tap-water proportion j_43 = households without tap water / total households

The no-kitchen proportion j_44 is expressed by the following formula:

no-kitchen proportion j_44 = households without a kitchen / total households

The no-toilet proportion j_45 is expressed by the following formula:

no-toilet proportion j_45 = households without a toilet / total households

The no-hot-water proportion j_46 is expressed by the following formula:

no-hot-water proportion j_46 = households without hot water / total households.
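As a concrete illustration, the indicator arithmetic above can be sketched in code. All field names and census counts below are invented placeholders, not data from the patent; the weights are those given in the text.

```python
# Minimal sketch of the IMD construction described above.
# The census counts below are illustrative placeholders, not real data.

def proportion(part, whole):
    """Safe ratio helper."""
    return part / whole if whole else 0.0

def compute_imd(c):
    """c: dict of census counts for one spatial unit (hypothetical keys)."""
    # Income dimension E1
    j11 = proportion(c["mining"] + c["manufacturing"], c["employed_total"])
    j12 = proportion(c["utilities"] + c["wholesale_retail"]
                     + c["hotel_catering"] + c["real_estate"], c["employed_total"])
    j13 = proportion(c["divorced_widowed"], c["unmarried_15plus"] + c["with_spouse"])
    e1 = j11 + j12 + j13
    # Education dimension E2
    e2 = (proportion(c["low_education"], c["population_total"])
          + proportion(c["no_diploma"], c["population_total"]))
    # Employment dimension E3
    e3 = proportion(c["unemployed"], c["population_total"])
    # Housing dimension E4: density term plus five household deprivation terms
    j41 = 1.0 / c["floor_area_per_capita"]
    housing_keys = ["no_clean_energy", "no_tap_water", "no_kitchen",
                    "no_toilet", "no_hot_water"]
    e4 = j41 + sum(proportion(c[k], c["households_total"]) for k in housing_keys)
    # Weighted sum with the weights given in the text
    return 0.303 * e1 + 0.212 * e2 + 0.182 * e3 + 0.303 * e4

counts = {
    "mining": 120, "manufacturing": 880, "employed_total": 5000,
    "utilities": 60, "wholesale_retail": 700, "hotel_catering": 300,
    "real_estate": 90, "divorced_widowed": 250, "unmarried_15plus": 2000,
    "with_spouse": 4500, "low_education": 2600, "no_diploma": 400,
    "population_total": 10000, "unemployed": 350,
    "floor_area_per_capita": 25.0, "households_total": 3000,
    "no_clean_energy": 200, "no_tap_water": 50, "no_kitchen": 120,
    "no_toilet": 90, "no_hot_water": 400,
}
print(round(compute_imd(counts), 4))
```

A real pipeline would run this per census tract and map the resulting IMD values, as Figure 2 does.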
In a preferred scheme, "acquiring street view image data of the target area from a map information database" includes the following sub-steps:

obtaining road network information of the target area from the map information database;

sampling along the road network at intervals of distance D to obtain the sampling points of the target area;

obtaining M*L images for each sampling point, and defining the union of the image data of all sampling points as the street view image data set of the target area, where M*L means that M images in mutually different horizontal directions are taken at each of L vertical viewing angles.

In a preferred scheme, the distance D = 100 meters.

In a preferred scheme, M = 4 and L = 2; each sampling point yields 8 images, taken in the four directions (front, back, left, right) at the first vertical viewing angle and the four directions at the second vertical viewing angle.
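The sampling scheme above (points every D = 100 m along the road network, then M*L = 4*2 = 8 views per point) can be sketched as follows. The coordinate handling assumes a projected, meter-based coordinate system, and the request URL is a hypothetical placeholder rather than a real map API endpoint.

```python
import math

def sample_polyline(points, spacing=100.0):
    """Place sampling points every `spacing` meters along a road polyline.
    `points` are (x, y) coordinates in a projected (meter-based) CRS."""
    samples = [points[0]]
    carried = 0.0  # distance accumulated since the last sample
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        d = spacing - carried
        while d <= seg:
            t = d / seg
            samples.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            d += spacing
        carried = (carried + seg) % spacing
    return samples

def panorama_requests(pt, headings=(0, 90, 180, 270), pitches=(0, 20)):
    """Build the M*L = 4*2 = 8 view requests for one sampling point.
    The URL template is a hypothetical placeholder, not a real API."""
    return [
        {"lnglat": pt, "heading": h, "pitch": p,
         "url": f"https://example-map-api/streetview?loc={pt}&heading={h}&pitch={p}"}
        for p in pitches for h in headings
    ]

road = [(0.0, 0.0), (250.0, 0.0), (250.0, 180.0)]
pts = sample_polyline(road, spacing=100.0)
print(len(pts), len(panorama_requests(pts[0])))
```

In practice the road network would come from the map database itself, and the per-view download would use that provider's documented static street view interface.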
In a preferred scheme, "segmenting the street view image data of the target area into several blocks of street view image data by image segmentation" includes the following sub-steps:

sampling the street view image data set of the target area to obtain a sampling result;

in the sampling result, stitching together the M images of mutually different directions at each vertical viewing angle of each sampled point, to obtain a panoramic image of that point at the given vertical viewing angle;

defining the set of panoramic images at every vertical viewing angle of all sampled points as the sample set of the target area's sampling points;

evaluating existing image segmentation techniques to determine which is best suited to this sample set, and defining the result as the optimal image segmentation technique for the sampling points of the target area;

applying the optimal image segmentation technique to the street view image data set of the target area, and defining the result as several blocks of street view image data.
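A minimal, runnable sketch of the "evaluate candidate techniques, keep the best" selection step is given below. The two toy segmenters stand in for real trained segmentation models (the patent does not name specific networks), and pixel accuracy stands in for whatever quality criterion is actually used.

```python
# Toy sketch of the "choose the best segmenter, then apply it" step.

def accuracy(pred, truth):
    """Pixel accuracy between two equally sized label grids."""
    flat_p = [v for row in pred for v in row]
    flat_t = [v for row in truth for v in row]
    return sum(p == t for p, t in zip(flat_p, flat_t)) / len(flat_t)

def segmenter_a(img):
    """Stand-in model: labels everything above the midline as sky."""
    h = len(img)
    return [["sky" if r < h // 2 else "road" for _ in row]
            for r, row in enumerate(img)]

def segmenter_b(img):
    """Stand-in model: labels every pixel as road."""
    return [["road" for _ in row] for row in img]

def pick_best(segmenters, sample_imgs, sample_truths):
    """Return the segmenter with the highest mean pixel accuracy
    on a small hand-labeled sample set."""
    def mean_acc(seg):
        accs = [accuracy(seg(im), tr)
                for im, tr in zip(sample_imgs, sample_truths)]
        return sum(accs) / len(accs)
    return max(segmenters, key=mean_acc)

img = [[0] * 4 for _ in range(4)]
truth = [["sky"] * 4, ["sky"] * 4, ["road"] * 4, ["road"] * 4]
best = pick_best([segmenter_a, segmenter_b], [img], [truth])
print(best is segmenter_a)
```

Once selected, the winning technique is applied to every panorama in the full data set, exactly as the final sub-step describes.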
In a preferred scheme, "obtaining principal factors from the several blocks of street view image data by principal component analysis, and defining the principal factors as street view factors" includes the following sub-steps:

computing street view indices from the several blocks of street view image data, the indices including the sky openness index P_sky, the green view rate P_green, the road proportion P_road, the building proportion P_building, the interface enclosure degree P_enclosure, color elements, the salient region feature SRS, and the visual entropy VE, where the color elements include the lightness and saturation of the street view image data;

the sky openness index P_sky is calculated by the following formula:

P_sky = Σ_i NS_i / Σ_i N_i

where NS_i is the number of sky pixels in the i-th block of street view image data, and N_i is the total number of pixels in the i-th block;

the green view rate P_green is calculated by the following formula:

P_green = Σ_i NG_i / Σ_i N_i

where NG_i is the number of vegetation pixels in the i-th block of street view image data;

the road proportion P_road is calculated by the following formula:

P_road = Σ_i NR_i / Σ_i N_i

where NR_i is the number of road pixels in the i-th block of street view image data;

the building proportion P_building is calculated by the following formula:

P_building = Σ_i NB_i / Σ_i N_i

where NB_i is the number of building pixels in the i-th block of street view image data;

the interface enclosure degree P_enclosure is calculated by the following formula:

P_enclosure = P_green + P_building

the salient region feature SRS is calculated by the following formula:

SRS = (max(R,G,B) - min(R,G,B)) / max(R,G,B)

where max(R,G,B) is the maximum of the color components of the i-th block of street view image data, and min(R,G,B) is the minimum;

the visual entropy VE is calculated by the following formula:

VE = -Σ_i P_i log P_i

where P_i is the probability of the i-th block of street view image data, used to characterize the entropy value;

using the street view indices as input variables of principal component analysis, the principal factors are obtained as output variables.
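Assuming the segmenter outputs a per-pixel class label, the proportion indices and the visual entropy can be computed as sketched below. Interpreting P_i in the entropy formula as the frequency of each segmented class, and using log base 2, are assumptions on our part; the resulting index table would then be passed to a principal component analysis (e.g. a library implementation) to obtain the street view factors.

```python
import math
from collections import Counter

def street_view_indices(label_grid):
    """Compute the pixel-proportion indices and visual entropy for one
    segmented street view image, given a grid of per-pixel class labels."""
    counts = Counter(v for row in label_grid for v in row)
    n = sum(counts.values())
    p_sky = counts["sky"] / n
    p_green = counts["vegetation"] / n
    p_road = counts["road"] / n
    p_building = counts["building"] / n
    p_enclosure = p_green + p_building  # interface enclosure degree
    # Visual entropy over the class frequencies (an assumption):
    # VE = -sum p_i * log2(p_i)
    ve = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return {"P_sky": p_sky, "P_green": p_green, "P_road": p_road,
            "P_building": p_building, "P_enclosure": p_enclosure, "VE": ve}

grid = [["sky", "sky", "building", "building"],
        ["sky", "vegetation", "building", "building"],
        ["road", "road", "road", "vegetation"]]
idx = street_view_indices(grid)
print({k: round(v, 3) for k, v in idx.items()})
```

Each sampling point thus yields one row of indices, and PCA over all rows produces the street view factors fed to the machine learning step.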
In a preferred scheme, the "machine learning algorithm" is the random forest algorithm.

In this preferred scheme, the random forest algorithm classifies and predicts by bootstrap resampling and random node splitting, based on ensemble learning over a large number of trees; it is a simple, stable algorithm with high accuracy. Since street view indices are strongly affected by orientation, location, viewing angle, and so on, the present invention uses the random forest algorithm, a non-linear model, to simulate and predict urban poverty scores from complex, multi-dimensional street view data. Because the random forest algorithm can evaluate all variables, multicollinearity among variables is not a concern.
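A hedged sketch of the final modeling step follows, using scikit-learn's RandomForestRegressor on synthetic data as a stand-in; the patent does not specify an implementation, and the factor and target construction here is purely illustrative.

```python
# Sketch: fit a random forest mapping street view factors to an
# IMD-like poverty score. Data are synthetic, for illustration only.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 400
# Synthetic "street view factors" (e.g. PCA factors of the indices).
X = rng.normal(size=(n, 3))
# Synthetic target with a non-linear dependence on the factors,
# mimicking the view/orientation effects the text mentions.
y = (0.5 * X[:, 0] - 0.3 * X[:, 1] ** 2 + 0.2 * np.abs(X[:, 2])
     + rng.normal(scale=0.05, size=n))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
r2 = model.score(X_te, y_te)
print(round(r2, 2))
```

With real data, X would hold the street view factors and y the IMD, and the fitted model would score areas where only street view imagery is available.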
Based on the above method, the invention also discloses a system for measuring intra-urban poverty space based on street view images and machine learning, comprising an image acquisition module, an image combination module, an image segmentation module, a street view index module, and an urban poverty score calculation module, where

the image acquisition module is used to acquire street view image data of the target area;

the image combination module is used to stitch together the M images of different directions sharing the same vertical viewing angle at each sampling point, to obtain the street view image data of the target area;

the image segmentation module is used to segment the street view image data of the target area into several blocks of street view image data;

the street view index module is used to calculate the street view indices of the target area;

the urban poverty score calculation module uses the index of multiple deprivation IMD and the street view factors as input variables of the machine learning algorithm to obtain the urban poverty score.

In a preferred scheme, the street view index module includes an image-element pixel-proportion calculation module and a color complexity calculation module, where

the image-element pixel-proportion calculation module is used to calculate the sky openness index P_sky, the green view rate P_green, the road proportion P_road, the building proportion P_building, and the interface enclosure degree P_enclosure;

the color complexity calculation module is used to calculate the visual entropy VE.
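The five modules can be wired together as in the following skeleton; every callable here is an illustrative placeholder for the corresponding module, not the patent's implementation.

```python
# Skeleton of the five-module system; each module is reduced to a
# callable so the pipeline wiring is explicit.
class PovertyMeasurementSystem:
    def __init__(self, acquire, combine, segment, indices, score):
        self.acquire = acquire    # image acquisition module
        self.combine = combine    # image combination module
        self.segment = segment    # image segmentation module
        self.indices = indices    # street view index module
        self.score = score        # urban poverty score calculation module

    def run(self, target_area, imd):
        raw = self.acquire(target_area)
        panoramas = self.combine(raw)
        blocks = self.segment(panoramas)
        factors = self.indices(blocks)
        return self.score(imd, factors)

system = PovertyMeasurementSystem(
    acquire=lambda area: [f"{area}-img{i}" for i in range(8)],
    combine=lambda imgs: [imgs[:4], imgs[4:]],        # one panorama per pitch
    segment=lambda pans: [f"seg({p})" for p in pans],
    indices=lambda blocks: [0.2, 0.5, 1.9],           # e.g. street view factors
    score=lambda imd, factors: imd,                   # placeholder for the model
)
print(system.run("districtA", 0.31))
```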
Compared with the prior art, the beneficial effects of the technical scheme of the present invention are:

The invention collects street view image data from a map information database, fully mines the element information in the street view image data using image segmentation, and combines mathematical models with computer algorithms to build a machine learning model that measures the degree of urban poverty. The invention effectively remedies the deficiencies of existing measures: it advances the refinement of urban poverty research, enriches the dimensions of urban poverty measurement indicators, has practical significance for improving poor communities and advancing renewal planning, and is an accurate, reliable, and practicable method for measuring intra-urban poverty.
Description of the Drawings

Figure 1 is a flow chart of embodiment 1; Figure 2 is a distribution map of urban poverty levels by the index of multiple deprivation IMD; Figure 3 is a distribution map of street view image sampling points; Figure 4 is a schematic flow chart of street view image segmentation and interpretation; Figure 5 is a comparison of example street view segmentation results from three models; Figure 6 shows the spatial distribution pattern of building enclosure in street views; Figure 7 shows the spatial distribution pattern of vegetation enclosure in street views; Figure 8 shows the spatial distribution pattern of sky openness in street views; Figure 9 shows the spatial distribution pattern of road openness in street views; Figure 10 shows the spatial distribution pattern of color complexity in street views; Figure 11 is a distribution map of urban poverty levels predicted from street views.
Detailed Description

The accompanying drawings are for illustration only and shall not be construed as limiting this patent; to better illustrate this embodiment, some parts in the drawings may be omitted, enlarged, or reduced, and do not represent the dimensions of the actual product;

for those skilled in the art, it is understandable that some well-known structures and their descriptions may be omitted in the drawings. The technical scheme of the present invention is further described below with reference to the drawings and embodiments.
Embodiment 1

As shown in Figure 1, a method for measuring intra-urban poverty space based on street view images and machine learning comprises the following steps:

constructing the index of multiple deprivation IMD from census data;

acquiring street view image data of the target area from a map information database (such as Baidu Maps, AMap, Google Maps, etc.);

segmenting the street view image data of the target area into several blocks of street view image data by image segmentation;

obtaining principal factors from the several blocks of street view image data by principal component analysis, and defining the principal factors as street view factors;

using the index of multiple deprivation IMD and the street view factors as input variables of a machine learning algorithm to obtain an urban poverty score;

evaluating the degree of urban poverty according to the urban poverty score.

Embodiment 1 collects street view image data from a map information database, fully mines the element information in the street view image data using image segmentation, and combines mathematical models with computer algorithms to build a machine learning model that measures the degree of urban poverty. The invention effectively remedies the deficiencies of existing measures: it advances the refinement of urban poverty research, enriches the dimensions of urban poverty measurement indicators, has practical significance for improving poor communities and advancing renewal planning, and is an accurate, reliable, and practicable method for measuring intra-urban poverty.
In embodiment 1, the following extension can also be made: "constructing the index of multiple deprivation IMD from census data" includes the following:

obtaining data in P dimensions from the census data, where the data of each dimension corresponds to a proportional weight λ;

the index of multiple deprivation IMD is expressed by the following formula:

IMD = Σ_{j=1..P} λ_j × E_j

where E_j denotes the value of the data in the j-th dimension.

In embodiment 1 and the above improved embodiment 1, the following extension can also be made: P = 4, and the four dimensions are income, education, employment, and housing; the value of the income dimension is E_1 with weight 0.303; the value of the education dimension is E_2 with weight 0.212; the value of the employment dimension is E_3 with weight 0.182; the value of the housing dimension is E_4 with weight 0.303; the index of multiple deprivation IMD is expressed by the following formula:

IMD = E_1 × 0.303 + E_2 × 0.212 + E_3 × 0.182 + E_4 × 0.303.
在实施例1及上述改进实施例1中,还可以进行以下扩展:E 1通过下式进行表达: In Embodiment 1 and the above-mentioned improved embodiment 1, the following extensions can also be made: E 1 is expressed by the following formula:
E 1=产业工人比例j 11+低端服务业比例j 12+离婚丧偶比例j 13 E 1 = the proportion of industrial workers j 11 + the proportion of low-end service industries j 12 + the proportion of divorces and widows j 13
产业工人比例j 11通过下式进行表达: The proportion of industrial workers j 11 is expressed by the following formula:
产业工人比例j 11=(采矿业的人口数+制造业的人口数)/就业总人数 The proportion of industrial workers j 11 = (the number of people in the mining industry + the number of people in the manufacturing industry)/total number of employees
产业工人比例j 11通过下式进行表达: The proportion of industrial workers j 11 is expressed by the following formula:
低端服务业比例j 121=(电力、煤气及水的生产和供应业的人口数+批发和零售业的人口数+住宿和餐饮业的人口数+房地产业的人口数)/就业总人数 Proportion of low-end service industry j 121 = (population of electricity, gas and water production and supply industry + population of wholesale and retail industry + population of accommodation and catering industry + population of real estate industry) / total number of employees
离婚丧偶比例j 13通过下式进行表达: The ratio of divorce and widowhood j 13 is expressed by the following formula:
离婚丧偶比例j 13=离婚及丧偶人口数/15岁及以上未婚人口与有配偶人口数之和。 Divorce and widowhood ratio j 13 = the number of divorced and widowed population/the sum of the unmarried population aged 15 and above and the population with a spouse.
在实施例1及上述改进实施例1中,还可以进行以下扩展:E 2通过下式进行表达: In Embodiment 1 and the above-mentioned improved embodiment 1, the following extensions can also be made: E 2 is expressed by the following formula:
E 2=低教育水平j 21+离校没有文凭比例j 22 E 2 = low education level j 21 + the proportion of school leavers without a diploma j 22
低教育水平j 21通过下式进行表达: The low education level j 21 is expressed by the following formula:
低教育水平j 21=未上过学、小学、初中的人口数/总人口 Low level of education j 21 = population without going to school, elementary school, and junior high school/total population
离校没有文凭比例j 22通过下式进行表达: The percentage of leaving school without a diploma j 22 is expressed by the following formula:
离校没有文凭比例j 22=没有文凭的人口数/总人口。 The proportion of leaving school without a diploma j 22 = the population without a diploma/total population.
在实施例1及上述改进实施例1中,还可以进行以下扩展:E 3通过下式进行表达: In Embodiment 1 and the above improved embodiment 1, the following extensions can also be made: E 3 is expressed by the following formula:
E 3=失业比例j 31=没有工作的人口数/总人口。 E 3 = Unemployment ratio j 31 = Number of unemployed population/total population.
In Embodiment 1 and the improved Embodiment 1 above, the following extension can also be made: E4 is expressed by the following formula:
E4 = population per square meter of housing j41 + no-clean-energy ratio j42 + no-tap-water ratio j43 + no-kitchen ratio j44 + no-toilet ratio j45 + no-hot-water ratio j46
The population per square meter of housing j41 is expressed by the following formula:
population per square meter of housing j41 = 1 / per-capita housing floor area (square meters per person)
The no-clean-energy ratio j42 is expressed by the following formula:
no-clean-energy ratio j42 = number of households using coal, firewood, or other non-clean energy / total number of households
The no-tap-water ratio j43 is expressed by the following formula:
no-tap-water ratio j43 = number of households without tap water / total number of households
The no-kitchen ratio j44 is expressed by the following formula:
no-kitchen ratio j44 = number of households without a kitchen / total number of households
The no-toilet ratio j45 is expressed by the following formula:
no-toilet ratio j45 = number of households without a toilet / total number of households
The no-hot-water ratio j46 is expressed by the following formula:
no-hot-water ratio j46 = number of households without hot water / total number of households.
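The domain scores E1 to E4 and their weighted combination into the IMD can be sketched in Python as follows. This is a minimal sketch, not the patented implementation: the weights (0.303, 0.212, 0.182, 0.303) are those given in the claims, while the input ratios are hypothetical illustration values.

```python
def domain_score(ratios):
    """A domain score E_j is the sum of its component deprivation ratios."""
    return sum(ratios)

def imd(scores, weights):
    """Index of multiple deprivation: weighted sum of the P domain scores."""
    assert len(scores) == len(weights)
    return sum(s * w for s, w in zip(scores, weights))

# Hypothetical census ratios for one community (illustration only).
E1 = domain_score([0.25, 0.18, 0.04])                       # income: j11 + j12 + j13
E2 = domain_score([0.30, 0.05])                             # education: j21 + j22
E3 = domain_score([0.07])                                   # employment: j31
E4 = domain_score([1 / 25, 0.10, 0.02, 0.05, 0.03, 0.12])   # housing: j41..j46
score = imd([E1, E2, E3, E4], [0.303, 0.212, 0.182, 0.303])
```

Because the four weights sum to 1, a community with every domain score equal to 1 receives an IMD of exactly 1.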
In Embodiment 1 and the improved Embodiment 1 above, the following extension can also be made: "acquiring street view image data of the target area from the map information database" includes the following sub-steps:
acquiring the road network information of the target area from the map information database;
sampling the road network of the target area at intervals of distance D to obtain the sampling points of the target area;
obtaining M*L pieces of image data for each sampling point of the target area, and defining the union of the image data of all sampling points as the street view image data set of the target area, where M*L pieces of image data means that M pieces of image data in mutually different directions are taken at each vertical viewing angle, and there are L vertical viewing angles.
In Embodiment 1 and the improved Embodiment 1 above, the following extension can also be made: distance D = 100 meters.
In Embodiment 1 and the improved Embodiment 1 above, the following extension can also be made: M = 4, L = 2; each sampling point yields 8 pieces of image data, taken in the front, back, left, and right directions at the first vertical viewing angle and in the front, back, left, and right directions at the second vertical viewing angle.
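The sampling scheme above can be sketched as follows. The `sample_polyline` helper and the (heading, pitch) parameter names are illustrative assumptions for this sketch, not the actual map-service API.

```python
import math

def sample_polyline(points, d):
    """Place sampling points every d meters along a road polyline
    given as a list of (x, y) coordinates in meters."""
    samples = [points[0]]
    carried = 0.0  # distance already walked since the last sample
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        pos = d - carried
        while pos <= seg:
            t = pos / seg
            samples.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            pos += d
        carried = (carried + seg) % d
    return samples

def view_parameters(headings=(0, 90, 180, 270), pitches=(0, 20)):
    """Enumerate the M*L camera views per sampling point (M=4, L=2 -> 8)."""
    return [(h, p) for p in pitches for h in headings]
```

With D = 100 meters, a 250-meter road segment yields sampling points at 0, 100, and 200 meters, and each point is photographed under the 8 (heading, pitch) combinations.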
In Embodiment 1 and the improved Embodiment 1 above, the following extension can also be made: "segmenting the street view image data of the target area into several pieces of street view image data through image segmentation technology" includes the following sub-steps:
sampling the street view image data set of the target area to obtain a sampling result;
in the sampling result, stitching together the M pieces of image data in mutually different directions at each vertical viewing angle of each sampled point to obtain a panoramic image of the corresponding sampled point at the set vertical viewing angle;
defining the set of panoramic images at each vertical viewing angle of all sampled points as the sample set of the sampling points of the target area;
evaluating the existing image segmentation techniques to determine the one best suited to the sample set of the sampling points of the target area, the result being defined as the optimal image segmentation technique for the sampling points of the target area;
performing image segmentation on the street view image data set corresponding to the sampling points of the target area with the optimal image segmentation technique, the result being defined as several pieces of street view image data.
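The model-selection sub-step above can be sketched as follows. The candidate models here are stand-in callables, not the FCN/SegNet/PSPNet implementations themselves, and the accuracy criterion is the simple pixel accuracy as one plausible choice.

```python
def pixel_accuracy(pred, truth):
    """Fraction of pixels labelled correctly (flattened label lists)."""
    assert len(pred) == len(truth)
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)

def select_best_model(models, samples):
    """models: {name: callable image -> flat label list};
    samples: [(image, ground-truth label list), ...].
    Returns the name of the model with the highest mean pixel accuracy,
    i.e. the 'optimal image segmentation technique' for the sample set."""
    def mean_acc(segment):
        return sum(pixel_accuracy(segment(img), truth)
                   for img, truth in samples) / len(samples)
    return max(models, key=lambda name: mean_acc(models[name]))
```

The winning model is then applied to the full street view image data set of the target area.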
In Embodiment 1 and the improved Embodiment 1 above, the following extension can also be made: "obtaining the principal factors based on several pieces of street view image data combined with principal component analysis, and defining the principal factors as street view factors" includes the following sub-steps:
obtaining street view indicators based on the several pieces of street view image data, the street view indicators including the sky openness index P_sky, the green view rate P_green, the road surface proportion P_road, the building proportion P_building, the interface enclosure degree P_enclosure, the color elements, the salient region feature SRS, and the visual entropy VE, where the color elements include the lightness and saturation of the street view image data;
The sky openness index P_sky is calculated by the following formula:
P_sky = (1/n) * Σ_{i=1..n} (NS_i / N_i)
where NS_i is the number of sky pixels in the i-th piece of street view image data, N_i is the total number of pixels in the i-th piece of street view image data, and n is the number of pieces;
The green view rate P_green is calculated by the following formula:
P_green = (1/n) * Σ_{i=1..n} (NG_i / N_i)
where NG_i is the number of vegetation pixels in the i-th piece of street view image data;
The road surface proportion P_road is calculated by the following formula:
P_road = (1/n) * Σ_{i=1..n} (NR_i / N_i)
where NR_i is the number of road pixels in the i-th piece of street view image data;
The building proportion P_building is calculated by the following formula:
P_building = (1/n) * Σ_{i=1..n} (NB_i / N_i)
where NB_i is the number of building pixels in the i-th piece of street view image data;
The interface enclosure degree P_enclosure is calculated by the following formula: P_enclosure = P_green + P_building
The salient region feature SRS is calculated by the following formula:
SRS = (max(R,G,B) - min(R,G,B)) / max(R,G,B)
where max(R,G,B) is the maximum of the color components in the i-th piece of street view image data and min(R,G,B) is the minimum of the color components in the i-th piece of street view image data;
The visual entropy VE is calculated by the following formula:
VE = -Σ_i P_i * log2(P_i)
where P_i is the probability of the i-th piece of street view image data and is used to characterize the entropy value;
using the street view indicators as the input variables of the principal component analysis to obtain the principal factors as the output variables.
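Computing the pixel-ratio indicators and the visual entropy from a segmentation label map can be sketched as follows. This is a minimal pure-Python sketch; the label names ("sky", "vegetation", "building") and the use of a base-2 logarithm are illustrative assumptions.

```python
import math
from collections import Counter

def class_ratio(labels, cls):
    """Share of pixels in a flattened label map belonging to one class,
    e.g. the sky openness of one piece for cls='sky'."""
    return sum(1 for v in labels if v == cls) / len(labels)

def enclosure(labels):
    """Interface enclosure degree: P_enclosure = P_green + P_building."""
    return class_ratio(labels, "vegetation") + class_ratio(labels, "building")

def visual_entropy(labels):
    """Shannon entropy of the label distribution: VE = -sum(p_i * log2 p_i)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())
```

The resulting indicator vectors (one per sampled view) form the input matrix of the principal component analysis.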
In Embodiment 1 and the improved Embodiment 1 above, the following extension can also be made: the "machine learning algorithm" is the random forest algorithm.
In this improved Embodiment 1, the random forest algorithm uses random repeated sampling and random node splitting, and performs classification and prediction through the ensemble learning of a large number of tree structures; it is a simple, stable algorithm with high accuracy. Street view indicators are strongly affected by orientation, location, viewing angle, and so on, so the present invention uses the random forest algorithm, a nonlinear model, to predict urban poverty scores from complex, multi-dimensional street view data. Because the random forest algorithm can evaluate all variables, there is no need to worry about multicollinearity among the variables.
Demonstration of Embodiment 1
Demonstration environment:
Communities in the four central districts of Guangzhou (Yuexiu, Liwan, Haizhu, Tianhe) were sampled as the research objects, covering poor and non-poor communities with a variety of built environments. On the one hand, as the political, economic, and cultural center of South China, Guangzhou has long been a typical case area for urban poverty research. On the other hand, considering differences in administrative boundaries, district functions, and development stages, the Yuexiu, Haizhu, Liwan, and Tianhe districts are suitable research objects. According to the 2010 sixth national population census, the four central districts contain 914 neighborhood/village committees (communities) with a total counted population of 4.833 million, 40% of the population of Guangzhou, so the research objects are highly representative.
Demonstration process:
A method for measuring intra-urban poverty space based on street view images and machine learning, including the following steps:
Step 1: compute 11 indicators from the sixth national population census data, construct a traditional indicator system for measuring the degree of urban poverty, and calculate the index of multiple deprivation (IMD), as shown in Figure 2;
Step 2: along arterial roads, sub-arterial roads, and branch roads, set the street view sampling interval to a uniform distance of 100 meters; at each sampling point, collect images in four headings (0°, 90°, 180°, 270°) and at two vertical angles (a 0° horizontal view and a 20° elevation view), at a collection time close to that of the sixth national population census. Baidu Maps street views covering 8536 sampling points and 286 communities were obtained, 61864 images in total, whose spatial distribution is shown in Figure 3;
Step 3: randomly sample the street view images of half of the case communities and, supported by the TensorFlow deep learning framework widely used in computer vision, interpret them with artificial intelligence models based on FCN, SegNet, and PSPNet (as shown in Figure 4). Calculate three efficiency evaluation indicators: pixel accuracy (PA), mean pixel accuracy (MPA), and mean intersection over union (MIOU). Select the model with the highest segmentation accuracy to segment all street view images (as shown in Figure 5).
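The three evaluation indicators of Step 3 can be sketched from a per-class confusion matrix, where `m[t][p]` counts pixels of true class t predicted as class p. This is a minimal sketch of the standard definitions, not the demonstration's actual evaluation code.

```python
def pa(m):
    """Pixel accuracy: correctly labelled pixels / all pixels."""
    total = sum(sum(row) for row in m)
    return sum(m[c][c] for c in range(len(m))) / total

def mpa(m):
    """Mean pixel accuracy: per-class accuracy averaged over classes."""
    return sum(m[c][c] / sum(m[c]) for c in range(len(m))) / len(m)

def miou(m):
    """Mean intersection over union, averaged over classes."""
    n = len(m)
    ious = []
    for c in range(n):
        inter = m[c][c]
        union = sum(m[c]) + sum(m[r][c] for r in range(n)) - inter
        ious.append(inter / union)
    return sum(ious) / n
```

The segmentation model scoring highest on these metrics over the sampled panoramas is the one applied to all 61864 images.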
Step 4: summarize the street view indicator characteristics of typical poor communities and use correlation analysis to determine the street view elements related to the degree of urban poverty. On the basis of the corresponding indicators of the important street view elements, apply principal component analysis to reduce the dimensionality of the multi-view, multi-element street view indicators, and rotate the factor loading matrix to extract and name the important street view factors with high contributions, namely the sense of building enclosure, the sense of vegetation enclosure, the sense of sky openness, the sense of road openness, and the sense of color complexity, as shown in Figures 6 to 10.
Step 5: take the important street view factors obtained in the previous step as independent variables and the index of multiple deprivation (IMD) as the reference variable to construct a random forest prediction model. After testing with the remaining 50% of the sample data, this step is repeated to generate a large number of decision trees; when the model error approaches its minimum and a stable state, growth of the random forest is terminated and the degree of urban poverty is discriminated. The classification result with the highest output frequency is taken as the final street-view-based measure of the degree of urban poverty; the average accuracy of the statistical model reached 82.48%, with the specific results shown in Figure 11.
In this demonstration, the degree of urban poverty was assigned values from 0 to 5, with larger numbers indicating greater poverty. The communities were then stratified proportionally by level, and 50% of the data were drawn as training samples. At the same time, random repeated sampling with replacement was used to draw N data subsets equal in size to the existing training data, in order to grow N independent decision tree models. Calculation of the model prediction accuracy and the total model error showed that when the number of tree node variables is 6, the average prediction error rate of the model reaches its minimum; meanwhile, varying the number of trees from 0 to 100 shows that the total model error stabilizes after 55 decision trees are generated. The parameters of the random forest model in this demonstration were thereby determined. Tree nodes are produced by adding variables one at a time and comparing misclassification rates, that is, by selecting the most representative random feature variable for splitting from the M available attributes. This demonstration compared the 8 indicators at 0° and 20° pairwise, placed the more important street view indicators in the model, allowed all decision trees to grow as far as possible, and modified no parameters during model construction. This helps to reduce the correlation among the decision trees used for classification and regression, enriching the comprehensiveness of the model and improving its classification ability.
After testing with the remaining 50% of the sample data, this step is repeated to generate a large number of decision trees; when the model error approaches its minimum and a stable state, growth of the random forest is terminated. The degree of urban poverty is discriminated, and the most frequently output class is taken as the final output value of the random forest model, as shown in Table 1. In the course of optimizing the model, it was found that for street view indicators computed from elements, such as the sky openness index and the green view rate, the 0° view stitching fits better, whereas for indicators computed from color, such as the color elements and the salient region feature, the 20° indicators contribute more to correct model prediction. As the number of attribute types increases, the predictive power of the model improves accordingly; after the eighth street view indicator was added, the average accuracy of the model reached 82.48%, exceeding the predictive performance of the first two models. Moreover, adding different attribute types raises the prediction accuracy by different amounts. The combined analyses show that the 0° sky openness index, 0° green view rate, 20° color elements, 0° building proportion, 0° road surface proportion, and 20° visual entropy have a relatively high influence on the prediction of urban poverty.
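The final classification step, outputting the most frequent class among the trees' votes, can be sketched as follows. This stands in only for the ensemble's voting stage; the individual tree predictions are assumed to be given.

```python
from collections import Counter

def forest_predict(tree_votes):
    """Given one poverty-level vote (0-5) per decision tree for a sample,
    return the most frequent class, i.e. the ensemble's final output."""
    return Counter(tree_votes).most_common(1)[0][0]

def forest_accuracy(all_votes, truths):
    """Average accuracy of the ensemble over a test set:
    all_votes is a list of per-sample vote lists."""
    preds = [forest_predict(v) for v in all_votes]
    return sum(p == t for p, t in zip(preds, truths)) / len(truths)
```

In the demonstration, `forest_accuracy` over the held-out 50% of communities corresponds to the reported average accuracy of 82.48%.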
Table 1 Evaluation parameter results of the random forest model
(The table is reproduced as an image in the original document.)
Embodiment 2
Embodiment 2 is an application based on Embodiment 1: a system for measuring intra-urban poverty space based on street view images and machine learning, comprising an image acquisition module, an image segmentation module, an image combination module, a street view indicator module, and an urban poverty score calculation module, wherein
the image acquisition module is used to acquire street view image data of the target area;
the image combination module is used to stitch together the M pieces of image data in different directions at the same vertical viewing angle of a sampling point to obtain the street view image data of the target area;
the image segmentation module is used to segment the street view image data of the target area into several pieces of street view image data;
the street view indicator module is used to calculate the street view indicators of the target area;
the urban poverty score calculation module uses the index of multiple deprivation IMD and the street view factors as the input variables of the machine learning algorithm to obtain the urban poverty score.
In Embodiment 2, the following extension can also be made: the street view indicator module includes an image element pixel proportion calculation module and a color complexity calculation module, wherein
the image element pixel proportion calculation module is used to calculate the sky openness index P_sky, the green view rate P_green, the road surface proportion P_road, the building proportion P_building, and the interface enclosure degree P_enclosure;
the color complexity calculation module is used to calculate the visual entropy VE.
In the specific content of the above embodiments, the technical features may be combined in any non-contradictory manner. For brevity, not all possible combinations of the above technical features are described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The same or similar reference numbers correspond to the same or similar parts.
The terms describing positional relationships in the drawings are used for exemplary illustration only and are not to be construed as limiting this patent; for example, the calculation formula of the water flow sensor in an embodiment is not limited to the formula exemplified in that embodiment, as different types of water flow sensors have different calculation formulas. The above limitations of the embodiments are not to be construed as limiting this patent.
Obviously, the above embodiments of the present invention are merely examples given to illustrate the present invention clearly and do not limit the implementation of the present invention. For those of ordinary skill in the art, other changes or variations in different forms can be made on the basis of the above description. It is neither necessary nor possible to enumerate all implementations here. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the claims of the present invention.

Claims (15)

  1. A method for measuring intra-urban poverty space based on street view images and machine learning, characterized in that it comprises the following steps:
    constructing the index of multiple deprivation IMD from census data;
    acquiring street view image data of a target area from a map information database;
    segmenting the street view image data of the target area into several pieces of street view image data through image segmentation technology;
    obtaining principal factors based on the several pieces of street view image data combined with principal component analysis, and defining the principal factors as street view factors;
    using the index of multiple deprivation IMD and the street view factors as input variables of a machine learning algorithm to obtain an urban poverty score;
    evaluating the degree of poverty of the city according to the urban poverty score.
  2. The method for measuring intra-urban poverty space according to claim 1, characterized in that said "constructing the index of multiple deprivation IMD from census data" includes the following sub-content:
    obtaining P dimensions of data from the census data, the data of each dimension corresponding to a proportional weight λ;
    said index of multiple deprivation IMD being expressed by the following formula:
    IMD = Σ_{j=1..P} λ_j * E_j
    where said E_j represents the value of the j-th dimension of data.
  3. The method for measuring intra-urban poverty space according to claim 2, characterized in that said P = 4, and the 4 dimensions of data are income domain data, education domain data, employment domain data, and housing domain data; the value of said income domain data is E1 and its weight is 0.303; the value of said education domain data is E2 and its weight is 0.212; the value of said employment domain data is E3 and its weight is 0.182; the value of said housing domain data is E4 and its weight is 0.303; said index of multiple deprivation IMD is expressed by the following formula:
    IMD = E1*0.303 + E2*0.212 + E3*0.182 + E4*0.303.
  4. The method for measuring intra-urban poverty space according to claim 3, characterized in that said E1 is expressed by the following formula:
    E1 = industrial worker ratio j11 + low-end service industry ratio j12 + divorced-or-widowed ratio j13
    said industrial worker ratio j11 being expressed by the following formula:
    industrial worker ratio j11 = (population in mining + population in manufacturing) / total employed population
    said low-end service industry ratio j12 being expressed by the following formula:
    low-end service industry ratio j12 = (population in the production and supply of electricity, gas, and water + population in wholesale and retail + population in accommodation and catering + population in real estate) / total employed population
    said divorced-or-widowed ratio j13 being expressed by the following formula:
    divorced-or-widowed ratio j13 = divorced and widowed population / sum of the unmarried population aged 15 and above and the population with a spouse.
  5. The method for measuring intra-urban poverty space according to claim 3, characterized in that said E2 is expressed by the following formula:
    E2 = low education level j21 + proportion leaving school without a diploma j22
    said low education level j21 being expressed by the following formula:
    low education level j21 = number of people with no schooling or at most primary or junior high school education / total population
    said proportion leaving school without a diploma j22 being expressed by the following formula:
    proportion leaving school without a diploma j22 = number of people without a diploma / total population.
  6. The method for measuring intra-urban poverty space according to claim 3, characterized in that said E3 is expressed by the following formula:
    E3 = unemployment ratio j31 = number of people without work / total population.
  7. The method for measuring intra-urban poverty space according to claim 3, characterized in that said E4 is expressed by the following formula:
    E4 = population per square meter of housing j41 + no-clean-energy ratio j42 + no-tap-water ratio j43 + no-kitchen ratio j44 + no-toilet ratio j45 + no-hot-water ratio j46
    said population per square meter of housing j41 being expressed by the following formula:
    population per square meter of housing j41 = 1 / per-capita housing floor area (square meters per person)
    said no-clean-energy ratio j42 being expressed by the following formula:
    no-clean-energy ratio j42 = number of households using coal, firewood, or other non-clean energy / total number of households
    said no-tap-water ratio j43 being expressed by the following formula:
    no-tap-water ratio j43 = number of households without tap water / total number of households
    said no-kitchen ratio j44 being expressed by the following formula:
    no-kitchen ratio j44 = number of households without a kitchen / total number of households
    said no-toilet ratio j45 being expressed by the following formula:
    no-toilet ratio j45 = number of households without a toilet / total number of households
    said no-hot-water ratio j46 being expressed by the following formula:
    no-hot-water ratio j46 = number of households without hot water / total number of households.
  8. The method for measuring intra-urban poverty space according to any one of claims 1 to 7, characterized in that said "acquiring street view image data of a target area from a map information database" includes the following sub-steps:
    acquiring the road network information of the target area from the map information database;
    sampling the road network of the target area at intervals of distance D to obtain the sampling points of the target area;
    obtaining M*L pieces of image data for each sampling point of the target area, and defining the union of the image data of all sampling points as the street view image data set of the target area, said M*L pieces of image data meaning that M pieces of image data in mutually different directions are taken at each vertical viewing angle, with L vertical viewing angles.
  9. The method for measuring intra-urban poverty space according to claim 8, characterized in that said distance D = 100 meters.
  10. The method for measuring intra-urban poverty space according to claim 8, characterized in that said M = 4 and L = 2; said sampling points yield 8 pieces of image data, taken in the front, back, left, and right directions at the first vertical viewing angle and in the front, back, left, and right directions at the second vertical viewing angle.
  11. The method for measuring urban poverty space according to claim 8, wherein the step of "segmenting the street view image data of the target area into several pieces of street view image data through image segmentation technology" comprises the following sub-steps:
    sampling the street view image data set of the target area to obtain a sampling result;
    in the sampling result, stitching together, for each sampled point and each vertical viewing angle, the M images taken in mutually different directions, to obtain a panoramic image of that sampled point at the given vertical viewing angle;
    defining the set of panoramic images of all sampled points at each vertical viewing angle as the sampling set of the sampling points of the target area;
    evaluating the existing image segmentation techniques to determine which is most suitable for this sampling set; the result is defined as the optimal image segmentation technique for the sampling points of the target area;
    segmenting the street view image data set of the target area corresponding to the sampling points with this optimal image segmentation technique; the result is defined as the several pieces of street view image data.
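Claim 11 leaves open how the "most suitable" segmentation technique is determined. One plausible realization, sketched here, scores each candidate by mean per-class intersection-over-union (IoU) on the sampled panoramas; the model names and callables are placeholders, not part of the claim:

```python
def class_iou(pred, truth, cls):
    """Intersection-over-union of one class between predicted and
    reference label maps (flat lists of per-pixel labels)."""
    inter = sum(1 for p, t in zip(pred, truth) if p == cls and t == cls)
    union = sum(1 for p, t in zip(pred, truth) if p == cls or t == cls)
    return inter / union if union else 1.0

def pick_best_segmenter(models, images, truths, classes):
    """Keep the candidate technique with the highest mean per-class IoU
    on the sampled panoramas. `models` maps a name to a callable
    image -> label map; all names here are illustrative."""
    def score(name):
        ious = [class_iou(models[name](img), t, c)
                for img, t in zip(images, truths) for c in classes]
        return sum(ious) / len(ious)
    return max(models, key=score)

# Tiny demonstration with hand-made label maps:
candidates = {
    "identity": lambda img: img,               # perfect on this sample
    "constant": lambda img: ["x"] * len(img),  # never matches
}
imgs = [["sky", "road", "building", "road"]]
refs = [["sky", "road", "building", "road"]]
best = pick_best_segmenter(candidates, imgs, refs, ["sky", "road", "building"])
```

Scoring on the small sampling set rather than the full image collection is what makes the model-selection step cheap.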
  12. The method for measuring urban poverty space according to claim 8, wherein the step of "obtaining principal factors based on the several pieces of street view image data combined with principal component analysis, and defining the principal factors as street view factors" comprises the following sub-steps:
    computing street view indicators from the several pieces of street view image data, the street view indicators comprising the sky openness index P_sky, the green view ratio P_green, the road surface proportion P_road, the building proportion P_building, the interface enclosure degree P_enclosure, the color elements, the salient region feature SRS, and the visual entropy VE, wherein the color elements comprise the lightness and saturation of the street view image data;
    the sky openness index P_sky is calculated by the following formula:

    P_sky = (1/n) · Σ_{i=1}^{n} (NS_i / N_i)

    where NS_i is the number of sky pixels in the i-th piece of street view image data, N_i is the total number of pixels in the i-th piece, and n is the number of pieces;
    the green view ratio P_green is calculated by the following formula:

    P_green = (1/n) · Σ_{i=1}^{n} (NG_i / N_i)

    where NG_i is the number of vegetation pixels in the i-th piece of street view image data;
    the road surface proportion P_road is calculated by the following formula:

    P_road = (1/n) · Σ_{i=1}^{n} (NR_i / N_i)

    where NR_i is the number of road pixels in the i-th piece of street view image data;
    the building proportion P_building is calculated by the following formula:

    P_building = (1/n) · Σ_{i=1}^{n} (NB_i / N_i)

    where NB_i is the number of building pixels in the i-th piece of street view image data;
    the interface enclosure degree P_enclosure is calculated by the following formula:

    P_enclosure = P_green + P_building
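The four pixel-proportion indicators and the enclosure degree can all be derived in one pass over a per-pixel label map. A sketch assuming a flat list of class labels per image piece (the class names are illustrative; a real segmenter would use its own label scheme):

```python
from collections import Counter

# Illustrative class names; a real segmentation model has its own scheme.
SKY, VEGETATION, ROAD, BUILDING = "sky", "vegetation", "road", "building"

def streetview_ratios(label_map):
    """Pixel-proportion indicators from a flat list of per-pixel class
    labels: P_sky, P_green, P_road, P_building, and the enclosure
    degree P_enclosure = P_green + P_building."""
    n = len(label_map)
    counts = Counter(label_map)
    p = {
        "P_sky": counts[SKY] / n,
        "P_green": counts[VEGETATION] / n,
        "P_road": counts[ROAD] / n,
        "P_building": counts[BUILDING] / n,
    }
    p["P_enclosure"] = p["P_green"] + p["P_building"]
    return p
```

Averaging these per-piece values over all pieces then gives the aggregate indicators used in the formulas above.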
    the salient region feature SRS is calculated by the following formula:

    SRS_i = (max(R, G, B) - min(R, G, B)) / max(R, G, B)

    where max(R, G, B) is the maximum of the color components in the i-th piece of street view image data and min(R, G, B) is the minimum;
    the visual entropy VE is calculated by the following formula:

    VE = -Σ_i P_i · log2(P_i)

    where P_i is the probability of the i-th piece of street view image data and characterizes the entropy value;
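The color and complexity indicators above reduce to two small computations: per-pixel HSV-style saturation aggregated over a region (one plausible reading of SRS; treating the aggregation as a mean is an assumption, since the claim gives only the per-pixel formula) and Shannon entropy over the block probabilities P_i:

```python
import math

def srs_saturation(pixels):
    """Mean HSV-style saturation, (max - min) / max per (R, G, B) pixel.
    Averaging over the region is an assumed aggregation for SRS."""
    vals = []
    for r, g, b in pixels:
        mx, mn = max(r, g, b), min(r, g, b)
        vals.append(0.0 if mx == 0 else (mx - mn) / mx)
    return sum(vals) / len(vals)

def visual_entropy(probabilities):
    """Shannon entropy VE = -sum(P_i * log2(P_i)) over block probabilities."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)
```

A pure red pixel has saturation 1, a gray pixel 0; entropy peaks when all blocks are equally likely, which is why VE serves as a color-complexity measure.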
    using the street view indicators as the input variables of the principal component analysis to obtain the principal factors as the output variables.
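Extracting the principal factor from the mean-centered indicator table is an eigenvector computation. A dependency-free power-iteration sketch of the first component (a full PCA implementation would also standardize variables and report explained variance):

```python
import math

def first_principal_component(rows, iters=200):
    """Loading vector of the first principal factor via power iteration
    on the covariance matrix of mean-centered indicator rows. A toy
    stand-in for a full PCA, not the patented implementation."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    x = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(x[i][a] * x[i][b] for i in range(n)) / (n - 1)
            for b in range(d)] for a in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]  # converges to the top eigenvector
    return v
```

Each row would hold one sampling unit's indicator values (P_sky, P_green, ..., VE); the returned loadings define the street view factor as a weighted combination of indicators.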
  13. The method for measuring urban poverty space according to claim 1, 2, 3, 4, 5, 6, 7, 9, 10, 11 or 12, wherein the "machine learning algorithm" is a random forest algorithm.
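Claim 13 names a random forest as the machine learning algorithm mapping the IMD and street view factors to a poverty score. A toy bagged ensemble of one-split regression stumps illustrates the bootstrap-aggregation idea behind it; a practical implementation would use full decision trees with random feature subsets (e.g. scikit-learn's RandomForestRegressor):

```python
import random

def fit_stump(data):
    """Fit a one-split regression stump on (features, target) pairs."""
    xs, ys = zip(*data)
    d = len(xs[0])
    best = None
    for j in range(d):
        for t in sorted({x[j] for x in xs}):
            left = [y for x, y in data if x[j] <= t]
            right = [y for x, y in data if x[j] > t]
            if not left or not right:
                continue
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            err = (sum((y - ml) ** 2 for y in left)
                   + sum((y - mr) ** 2 for y in right))
            if best is None or err < best[0]:
                best = (err, j, t, ml, mr)
    if best is None:  # degenerate bootstrap sample: predict the mean
        m = sum(ys) / len(ys)
        return lambda x: m
    _, j, t, ml, mr = best
    return lambda x: ml if x[j] <= t else mr

def forest_predict(data, x, n_trees=25, seed=0):
    """Average of stumps fit on bootstrap resamples: bagging, the core
    of a random forest (real forests use deep trees and random feature
    subsets at each split)."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_trees):
        boot = [rng.choice(data) for _ in data]
        preds.append(fit_stump(boot)(x))
    return sum(preds) / n_trees
```

In this application each feature vector would concatenate the multiple deprivation index with the street view factors, and the target would be the poverty score.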
  14. A system for measuring urban poverty space based on street view images and machine learning, implementing the method for measuring urban poverty space according to any one of claims 1 to 13, characterized in that it comprises an image acquisition module, an image segmentation module, a picture combination module, a street view indicator module, and an urban poverty score calculation module, wherein:
    the image acquisition module is used to acquire the street view image data of the target area;
    the picture combination module is used to stitch together the M images taken in different directions at the same vertical viewing angle of a sampling point, to obtain the street view image data of the target area;
    the image segmentation module is used to segment the street view image data of the target area into several pieces of street view image data;
    the street view indicator module is used to calculate the street view indicators of the target area;
    the urban poverty score calculation module uses the index of multiple deprivation IMD and the street view factors as the input variables of the machine learning algorithm to obtain the urban poverty score.
  15. The system for measuring urban poverty space according to claim 14, wherein the street view indicator module comprises an image element pixel proportion calculation module and a color complexity calculation module, wherein:
    the image element pixel proportion calculation module is used to calculate the sky openness index P_sky, the green view ratio P_green, the road surface proportion P_road, the building proportion P_building, and the interface enclosure degree P_enclosure;
    the color complexity calculation module is used to calculate the visual entropy VE.
PCT/CN2020/095204 2020-06-09 2020-06-09 Method and system for measuring urban poverty spaces based on street view images and machine learning WO2021248335A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2020/095204 WO2021248335A1 (en) 2020-06-09 2020-06-09 Method and system for measuring urban poverty spaces based on street view images and machine learning
CN202080001052.4A CN111937016B (en) 2020-06-09 2020-06-09 City internal poverty-poor space measuring method and system based on street view picture and machine learning

Publications (1)

Publication Number Publication Date
WO2021248335A1 2021-12-16

Family

ID=73333858

Country Status (2)

Country Link
CN (1) CN111937016B (en)
WO (1) WO2021248335A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114282934A (en) * 2021-03-30 2022-04-05 华南理工大学 Urban low-income crowd distribution prediction method and system based on mobile phone signaling data and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180053110A1 (en) * 2016-08-22 2018-02-22 The Catholic University Of Korea Industry-Academic Cooperation Foundation Method of predicting crime occurrence in prediction target region using big data
CN107944750A (en) * 2017-12-12 2018-04-20 中国石油大学(华东) A kind of poverty depth analysis method and system
CN109523125A (en) * 2018-10-15 2019-03-26 广州地理研究所 A kind of poor Measurement Method based on DMSP/OLS nighttime light data
CN109886103A (en) * 2019-01-14 2019-06-14 中山大学 Urban poverty measure of spread method
CN109948737A (en) * 2019-04-08 2019-06-28 河南大学 Poor spatial classification recognition methods and device based on big data and machine learning

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114565300A (en) * 2022-03-04 2022-05-31 中国科学院生态环境研究中心 Method and system for quantifying subjective emotion of public and electronic equipment
CN114565300B (en) * 2022-03-04 2022-12-23 中国科学院生态环境研究中心 Method and system for quantifying subjective emotion of public and electronic equipment
CN114358660A (en) * 2022-03-10 2022-04-15 武汉市规划研究院 Urban street quality evaluation method, system and storage medium
CN117079124A (en) * 2023-07-14 2023-11-17 北京大学 Urban and rural landscape image quantification and promotion method based on community differentiation
CN117079124B (en) * 2023-07-14 2024-04-30 北京大学 Urban and rural landscape image quantification and promotion method based on community differentiation

Also Published As

Publication number Publication date
CN111937016B (en) 2022-05-17
CN111937016A (en) 2020-11-13

Legal Events

Date Code Title Description
121  Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20940217; Country of ref document: EP; Kind code of ref document: A1)
NENP  Non-entry into the national phase (Ref country code: DE)
32PN  Ep: public notification in the ep bulletin as address of the addressee cannot be established (Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 26/04/2023))
122  Ep: pct application non-entry in european phase (Ref document number: 20940217; Country of ref document: EP; Kind code of ref document: A1)