CN114898089B - Functional area extraction and classification method fusing high-resolution images and POI data - Google Patents

Functional area extraction and classification method fusing high-resolution images and POI data

Info

Publication number
CN114898089B
Authority
CN
China
Prior art keywords
functional
scale
area
feature
semantic
Legal status: Active
Application number
CN202210543624.2A
Other languages
Chinese (zh)
Other versions
CN114898089A (en)
Inventor
Du Shouhang (杜守航)
Du Shihong (杜世宏)
Cui Ximin (崔希民)
Zhang Xiuyuan (张修远)
Liu Bo (刘波)
Li Wei (李炜)
Current Assignee
Peking University
China University of Mining and Technology Beijing CUMTB
Original Assignee
Peking University
China University of Mining and Technology Beijing CUMTB
Priority date: 2022-05-18
Filing date: 2022-05-18
Publication date: 2022-10-25
Application filed by Peking University and China University of Mining and Technology, Beijing (CUMTB)
Priority to CN202210543624.2A
Publication of CN114898089A
Application granted
Publication of CN114898089B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • G06V10/7625 Hierarchical techniques, i.e. dividing or merging patterns to obtain a tree-like representation; Dendrograms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G06V20/176 Urban or other man-made structures
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A30/00 Adapting or protecting infrastructure or their operation
    • Y02A30/60 Planning or developing urban green infrastructure

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method for extracting and classifying functional areas that fuses high-resolution images and POI data, comprising the following steps: A. collecting high-resolution image data, constructing a multi-scale deformable convolution network model, and extracting multi-scale functional-semantic feature image blocks; B. segmenting the functional-semantic feature image data into functional units with a multi-scale segmentation algorithm to obtain a number of functional-area units; C. calculating the unit attributes of each functional-area unit; D. classifying each functional-area unit with a random forest classifier. The method constructs a multi-scale deformable convolution network model to extract functional-semantic features, fuses the multi-scale deep features of the remote sensing imagery with kernel-density features of the POI data on the basis of functional-area units, and classifies the functional areas with a random forest classifier. It improves the precision and fineness of urban functional-area extraction and can be applied quickly and efficiently to large-scale urban functional-area extraction tasks, meeting practical application needs.

Description

Functional area extraction and classification method fusing high-resolution images and POI data
Technical Field
The invention relates to the field of high-resolution image processing and deep-learning semantic segmentation, and in particular to a method for extracting and classifying functional areas by fusing high-resolution images and POI (point of interest) data.
Background
In the urbanization process, similar socioeconomic activities tend to cluster in the same urban space, producing distinct urban functional zones that satisfy people's diverse socioeconomic needs (commercial districts, residential districts, industrial districts, and so on); reasonable planning of urban functional zones is therefore very important to urbanization. Urban functional-area data are essential for analyzing urban spatial patterns, revealing urban construction processes, and promoting sustainable urban development.
Existing thematic data and urban planning data cannot accurately reflect the actual state of urban functional areas, so research on automatic extraction of urban functional areas by remote sensing is necessary. Existing research can be divided into three categories according to the data source used: high-resolution remote sensing imagery, multi-source geographic data, and the fusion of the two. Urban functional-area analysis based on high-resolution remote sensing imagery mainly infers functional-area attributes from visual features such as spectrum and texture, spatial-relationship features, or deep features; however, image data capture only the natural physical attributes of ground objects and lack any depiction of socioeconomic attributes, so the category of a functional area cannot be inferred accurately. Land use is closely tied to human social and economic activity, and the emergence of crowd-sourced geographic data such as POI (Point of Interest) and OSM (OpenStreetMap) data allows the social attributes of urban areas to be used for identifying urban functions. Fusing high-resolution imagery with crowd-sourced geographic data can exploit both the natural physical attributes and the socioeconomic attributes of ground objects, but most existing studies use spatial grids or road blocks as the spatial units, which cannot accurately express the morphological structure of functional areas; moreover, for functional identification, the differences among these data sources are not fused effectively, so robust features cannot be extracted to classify functional attributes accurately. The extraction of urban functional areas by multi-source data fusion therefore remains an unsolved problem.
Disclosure of Invention
Aiming at the deficiencies of the prior art, the invention provides a method for extracting and classifying functional areas that fuses high-resolution images and POI data. A multi-scale deformable convolution network model is constructed that extracts features through a ResNet50 basic network, captures multi-scale features with an improved spatial pyramid pooling module, and uses a deformable convolution module to increase sensitivity to the shape and scale of functional areas, so that robust functional-semantic features can be extracted; a method for generating fine functional units by multi-scale segmentation of the functional-semantic features is also provided.
The object of the invention is achieved by the following technical solution:
a method for extracting and classifying functional areas fusing high-resolution images and POI data comprises the following steps:
A. collecting high-resolution image data of a research area and constructing a multi-scale deformable convolution network model comprising a ResNet50 basic network, a spatial pyramid pooling module, a deformable convolution module and a decoding module, and inputting the high-resolution image data into the multi-scale deformable convolution network model;
a1, dividing high-resolution image data into a plurality of image blocks, extracting low-level features and high-level features of each image block through a ResNet50 basic network of a multi-scale deformable convolution network model, and inputting the high-level features into a spatial pyramid pooling module;
A2. the spatial pyramid pooling module comprises a 1×1 convolution, several atrous (dilated) convolutions and global pooling; the atrous convolutions, each with a different sampling rate, perform feature extraction in a combined serial-parallel arrangement; the module channel-concatenates all feature maps to obtain multi-scale semantic information and applies a 1×1 convolution to obtain feature map A1;
A3. feature map A1 is input into the deformable convolution module, which outputs the nine offsets required by the deformable convolution through a 3×3 convolution and then applies the nine offsets to the convolution kernel to output feature map A2;
A4. feature map A2 is superimposed with the low-level features of the ResNet50 basic network, after which a 3×3 convolution and quadruple upsampling yield a semantic segmentation result map of the same size as the original high-resolution image, and a multi-scale functional-semantic feature image block is output;
B. splicing all multi-scale functional semantic feature image blocks in a research area to obtain functional semantic feature image data of the research area, and performing functional unit segmentation processing on the functional semantic feature image data of the research area by adopting a multi-scale segmentation algorithm to obtain a plurality of functional area units;
C. calculating the unit attributes of each functional area unit, wherein the unit attributes of the functional area units comprise POI data kernel density feature attributes and functional semantic features, and the method comprises the following steps:
C1. converting each category of POI data within each functional area unit into kernel-density image data using kernel density analysis, then calculating the mean and standard deviation of each category's kernel-density image data within each functional area unit and taking them as the POI-data kernel-density feature attributes of the functional area unit;
c2, acquiring functional semantic features in each functional area unit based on the multi-scale functional semantic features;
D. constructing unit attributes based on the functional area units, training a random forest classifier, and classifying each functional area unit with the random forest classifier.
To better implement the invention, the multi-scale deformable convolution network model is preferably trained according to the following technical solution:
sample data are collected to construct a training data set, and the multi-scale deformable convolution network model is trained with the cross-entropy loss function

$$Loss = -\frac{1}{S}\sum_{a=1}^{S}\sum_{c=1}^{K}\hat{y}_{a,c}\,\log\left(y_{a,c}\right)$$

where S is the number of samples, K is the number of classes, and $y_{a,c}$ is the predicted probability that sample a belongs to class c; $\hat{y}_{a,c}$ equals 1 if the one-hot encoding of sample a is class c, and 0 otherwise.
In step B, the functional-semantic depth-feature heterogeneity increment is used as the first criterion of the segmentation degree when the functional semantic feature image data of the research area are segmented into functional units;

B1. functional-semantic depth-feature heterogeneity increment $h_{deep}$: the depth-feature standard deviations of two adjacent objects in dimension i are denoted $\sigma_{1,i}$ and $\sigma_{2,i}$, the areas of the two objects $n_1$ and $n_2$, and the depth-feature standard deviation and area of the merged object $\sigma_{merg,i}$ and $n_m$; the depth-feature heterogeneity increment is

$$h_{deep} = \sum_i w_i\left[n_m\,\sigma_{merg,i} - \left(n_1\,\sigma_{1,i} + n_2\,\sigma_{2,i}\right)\right]$$

where i indexes the dimensions of the depth feature and $w_i$ is the weight of the ith dimension.
In step B, the shape-heterogeneity increment is used as the second criterion of the segmentation degree when segmenting the functional semantic feature image data of the research area into functional units; shape heterogeneity comprises the two shape indices smoothness and compactness;

B2. shape-heterogeneity increment $h_{shape}$:

$$h_{shape} = w_{smooth} \times h_{smooth} + (1 - w_{smooth}) \times h_{com}$$

where $w_{smooth}$ is the smoothness weight and $h_{smooth}$, $h_{com}$ are obtained as

$$h_{smooth} = n_{merg}\frac{l_{merg}}{b_{merg}} - \left(n_{11}\frac{l_{11}}{b_{11}} + n_{12}\frac{l_{12}}{b_{12}}\right)$$

$$h_{com} = n_{merg}\frac{l_{merg}}{\sqrt{n_{merg}}} - \left(n_{11}\frac{l_{11}}{\sqrt{n_{11}}} + n_{12}\frac{l_{12}}{\sqrt{n_{12}}}\right)$$

in which $l_{11}$, $l_{12}$ are the perimeters of the two adjacent objects; $b_{11}$, $b_{12}$ the perimeters of their minimum bounding rectangles; $n_{11}$, $n_{12}$ their areas; and $l_{merg}$, $b_{merg}$, $n_{merg}$ the perimeter, minimum-bounding-rectangle perimeter and area of the merged object;

B3. the total heterogeneity increment, used as the segmentation degree when segmenting the functional semantic feature image data of the research area into functional units, combines the two increments:

$$f = w_{deep} \times h_{deep} + (1 - w_{deep}) \times h_{shape}$$

where $w_{deep}$ is the weight of the functional-semantic depth feature in the total heterogeneity increment.
In step C1, kernel density analysis is performed on the POI data in the functional area unit; the kernel density D at spatial position (x, y) is

$$D(x,y) = \frac{3}{\pi R^2}\sum_{i:\,dist_i<R}\left[1-\left(\frac{dist_i}{R}\right)^2\right]^2$$

where i indexes the POI points within the search radius, $dist_i$ is the distance from point i to position (x, y), and R is the search radius.
In a preferred technical solution, in step D functional-area sample data are constructed, comprising functional-area units with different functional-category attributes, the functional-category attributes corresponding to the unit attributes of the functional-area units; the sample data are divided into a training set and a test set, the random forest classifier is trained on the training set, and its classification accuracy is verified on the test set, thereby obtaining a high-accuracy random forest classifier with which the functional areas are extracted and classified.
Compared with the prior art, the invention has the following advantages and beneficial effects:
(1) The invention provides a functional-area extraction and classification method for urban functional-semantic extraction, built around a multi-scale deformable convolution network model: features are extracted through a ResNet50 basic network, multi-scale features are captured by an improved spatial pyramid pooling module, and a deformable convolution module increases sensitivity to the shape and scale of functional areas, so that robust functional-semantic features can be extracted; a method for generating fine functional units by multi-scale segmentation of the functional-semantic features is also provided.
(2) Based on the functional-area units, the method fuses the multi-scale deep features of the remote sensing imagery with the kernel-density features of the POI data and classifies the functional areas with a trained random forest classifier; it improves the precision and fineness of urban functional-area extraction and can be applied quickly and efficiently to large-scale urban functional-area extraction tasks, meeting practical application needs.
(3) The multi-scale deformable convolution network model can extract feature information of functional areas at different scales, and the extracted functional-semantic features are robust to functional areas that are highly heterogeneous in shape and scale; meanwhile, the multi-scale segmentation of the functional-semantic feature image automatically aggregates the geometry of functional areas from the high-resolution image, so the segmentation better matches the actual shapes of functional areas and facilitates the subsequent classification of urban functional areas.
(4) The invention couples multi-source spatial data on the same spatial units, overcoming the limitation of classifying functional areas from a single data source; by considering both the natural physical features and the socioeconomic attributes of functional areas, the extraction precision is improved.
Drawings
FIG. 1 is a schematic flow chart of a method for extracting and classifying functional regions according to the present invention;
FIG. 2 is a schematic structural diagram of a multi-scale deformable convolution network model in an embodiment;
FIG. 3 is a schematic structural diagram of a ResNet50 basic network in an embodiment;
FIG. 4 is a schematic diagram of a spatial pyramid pooling module in an embodiment;
FIG. 5 is a schematic structural diagram of a deformable convolution module in an embodiment.
Detailed Description
The present invention is described in further detail below with reference to an embodiment:
Embodiment
As shown in fig. 1 to 5, a method for extracting and classifying a functional area fusing a high-resolution image and POI data includes the following steps:
A. High-resolution image data of the research area are collected and a multi-scale deformable convolution network model is constructed (the structure of the model in this embodiment is shown in fig. 2); the model comprises a ResNet50 basic network (fig. 3), a spatial pyramid pooling module, a deformable convolution module and a decoding module (corresponding to everything after the deformable convolution module in fig. 2), and the high-resolution image data are input into the model;
A1. The high-resolution image data are divided into a number of image blocks, low-level and high-level features are extracted from each image block by the ResNet50 basic network of the model, and the high-level features are input into the spatial pyramid pooling module. As shown in fig. 2 and fig. 3, the ResNet50 basic network sequentially extracts three low-level features and one high-level feature: the three low-level features are the outputs of the feature-extraction modules block1, block2 and block3, the high-level feature is the output of the feature-extraction module block4, and the block4 output is fed into the spatial pyramid pooling module. A minimal sketch of this feature extraction follows.
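For illustration only, the following PyTorch sketch taps the four stages of torchvision's standard ResNet50, which stands in here for the described basic network; keeping layer4 at stride 16 via dilation is an assumption made so the decoder's 4× upsampling arithmetic works out, not a detail stated in the patent.

```python
import torch
from torchvision.models import resnet50
from torchvision.models.feature_extraction import create_feature_extractor

# layer1-layer4 play the roles of block1-block4 in the text
backbone = resnet50(weights=None,
                    replace_stride_with_dilation=[False, False, True])
extractor = create_feature_extractor(
    backbone,
    return_nodes={"layer1": "block1", "layer2": "block2",
                  "layer3": "block3", "layer4": "block4"})

patch = torch.randn(1, 3, 512, 512)            # one high-resolution image block
feats = extractor(patch)
low1, low3 = feats["block1"], feats["block3"]  # low-level features (skip paths)
high = feats["block4"]                         # high-level feature -> ASPP input
```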
A2. The spatial pyramid pooling module comprises a 1×1 convolution, several atrous (dilated) convolutions and global pooling; the atrous convolutions, each with a different sampling rate, perform feature extraction in a combined serial-parallel arrangement. In this embodiment, atrous convolutions with different dilation rates capture receptive fields of different sizes and thus feature information at different scales, adapting to the multi-scale character of functional areas. The module is an improved spatial pyramid pooling module (improved ASPP module for short): the structure of fig. 4 alleviates the information loss caused by atrous convolution, since connecting atrous convolutions of different sampling rates both in series and in parallel covers the sampling gaps of the atrous kernels and prevents information loss. Following the technical principle of fig. 4, each atrous convolution in the module is decomposed into a 3×1 convolution and a 1×3 convolution, which reduces the computation of the module and speeds up the network. The module channel-concatenates all feature maps to obtain multi-scale semantic information and applies a 1×1 convolution (which can reduce the number of channels) to obtain feature map A1. A sketch of such a module follows.
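The following PyTorch sketch shows one way to realize the stated design (serial-parallel atrous convolutions decomposed into 3×1 and 1×3 factors, plus a 1×1 branch and global pooling); the channel widths and dilation rates are assumptions, not values from the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FactoredDilatedConv(nn.Module):
    """3x3 atrous conv factored into a 3x1 and a 1x3 conv, as described."""
    def __init__(self, ch, rate):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(ch, ch, (3, 1), padding=(rate, 0), dilation=(rate, 1)),
            nn.Conv2d(ch, ch, (1, 3), padding=(0, rate), dilation=(1, rate)),
            nn.BatchNorm2d(ch), nn.ReLU(inplace=True))
    def forward(self, x):
        return self.conv(x)

class ImprovedASPP(nn.Module):
    def __init__(self, in_ch=2048, ch=256, rates=(3, 6, 12)):  # rates assumed
        super().__init__()
        self.reduce = nn.Conv2d(in_ch, ch, 1)
        # a serial chain of atrous convs, each stage also tapped in parallel
        self.stages = nn.ModuleList(FactoredDilatedConv(ch, r) for r in rates)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.project = nn.Conv2d(ch * (len(rates) + 2), ch, 1)  # 1x1 after concat

    def forward(self, x):
        x = self.reduce(x)
        outs, y = [x], x
        for stage in self.stages:   # serial connection with parallel taps
            y = stage(y)
            outs.append(y)
        g = F.interpolate(self.pool(x), size=x.shape[-2:],
                          mode="bilinear", align_corners=False)
        outs.append(g)              # global-pooling branch
        return self.project(torch.cat(outs, dim=1))   # feature map A1
```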
A3. Feature map A1 is input into the deformable convolution module to increase sensitivity to differences in the shape and scale of ground objects (the structure of the module in this embodiment is shown in fig. 5); the module outputs the nine offsets required by the deformable convolution through a 3×3 convolution and then applies these offsets to the convolution kernel to output feature map A2, achieving the effect of a deformable convolution, as sketched below.
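A sketch of this step using torchvision's DeformConv2d, assumed here as a stand-in for the patent's module. Note that the "nine offsets" correspond to one (dy, dx) pair per position of the 3×3 kernel, i.e. 18 offset channels.

```python
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformableModule(nn.Module):
    """A plain 3x3 conv predicts offsets that displace the sampling grid of a
    3x3 deformable convolution (a sketch; channel width is an assumption)."""
    def __init__(self, ch=256):
        super().__init__()
        # 2 * 3 * 3 = 18 channels: one (dy, dx) pair per kernel position
        self.offset_conv = nn.Conv2d(ch, 18, kernel_size=3, padding=1)
        self.deform = DeformConv2d(ch, ch, kernel_size=3, padding=1)

    def forward(self, a1):                 # a1 = feature map A1 from the ASPP
        offsets = self.offset_conv(a1)
        return self.deform(a1, offsets)    # feature map A2
```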
A4. Feature map A2 is superimposed with the low-level features of the ResNet50 basic network, after which a 3×3 convolution and quadruple upsampling yield a semantic segmentation result map of the same size as the original high-resolution image, and a multi-scale functional-semantic feature image block is output. In this embodiment, the output feature map A2 is upsampled four times and then superimposed with the block1 low-level feature (extracted through a 1×1 convolution and max pooling in the ResNet50 basic network) and the block3 low-level feature (extracted through a 1×1 convolution and 2× upsampling). The low-level features carry fine edge information; the feature maps extracted by the deeper convolutions shrink and lose edge information, and superimposing the low-level features recovers accurate edges. The superimposed result is the multi-scale functional-semantic feature image block that is output, after which a 3×3 convolution and quadruple upsampling yield a semantic segmentation result map of the same size as the original high-resolution image. In this embodiment, the feature map obtained by superimposing the deformable-convolution output with the low-level features may be output as the multi-scale functional-semantic feature. A decoder sketch follows.
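A decoder sketch consistent with this description; the projection widths, class count, and the use of size-matching interpolation in place of the exact 2×/4× factors are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Decoder(nn.Module):
    """Fuse the 4x-upsampled A2 with projected block1/block3 low-level
    features, then apply a 3x3 conv and a final 4x upsampling (a sketch)."""
    def __init__(self, ch=256, low1_ch=256, low3_ch=1024, n_classes=10):
        super().__init__()
        self.proj1 = nn.Conv2d(low1_ch, 48, 1)   # block1 1x1 projection
        self.proj3 = nn.Conv2d(low3_ch, 48, 1)   # block3 1x1 projection
        self.fuse = nn.Conv2d(ch + 96, ch, 3, padding=1)
        self.head = nn.Conv2d(ch, n_classes, 1)

    def forward(self, a2, low1, low3):
        a2 = F.interpolate(a2, scale_factor=4, mode="bilinear",
                           align_corners=False)
        low1 = F.interpolate(self.proj1(low1), size=a2.shape[-2:],
                             mode="bilinear", align_corners=False)
        low3 = F.interpolate(self.proj3(low3), size=a2.shape[-2:],
                             mode="bilinear", align_corners=False)
        x = self.fuse(torch.cat([a2, low1, low3], dim=1))  # fused multi-scale
        return F.interpolate(self.head(x), scale_factor=4, # semantic features
                             mode="bilinear", align_corners=False)
```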
The preferred training method for the multi-scale deformable convolution network model is as follows:
sample data are collected to construct a training data set, and the multi-scale deformable convolution network model is trained with the cross-entropy loss function

$$Loss = -\frac{1}{S}\sum_{a=1}^{S}\sum_{c=1}^{K}\hat{y}_{a,c}\,\log\left(y_{a,c}\right)$$

where S is the number of samples, K is the number of classes, and $y_{a,c}$ is the predicted probability that sample a belongs to class c; $\hat{y}_{a,c}$ equals 1 if the one-hot encoding of sample a is class c (one-hot encoding converts a categorical variable into several binary columns), and 0 otherwise. Preferably, the Adam method is finally adopted for the optimization, giving high computational efficiency and low memory requirements. A minimal training-loop sketch follows.
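A minimal sketch of such a loop; PyTorch's nn.CrossEntropyLoss realizes the one-hot cross entropy above, and the hyperparameters here are placeholders, not values from the patent.

```python
import torch
import torch.nn as nn

def train(model, loader, n_epochs=50, lr=1e-4, device="cuda"):
    """Train the segmentation model with cross entropy and Adam (a sketch)."""
    criterion = nn.CrossEntropyLoss()   # per-pixel one-hot cross entropy
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.to(device).train()
    for epoch in range(n_epochs):
        for images, labels in loader:   # labels: per-pixel class indices
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            logits = model(images)      # (B, K, H, W) raw class scores
            loss = criterion(logits, labels)
            loss.backward()
            optimizer.step()
```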
B. All multi-scale functional-semantic feature image blocks of the research area are stitched together to obtain the functional-semantic feature image data of the research area, which are then segmented into functional units with a multi-scale segmentation algorithm, yielding a number of functional-area units. In the multi-scale segmentation, this embodiment uses two heterogeneity increments, the functional-semantic depth-feature heterogeneity increment and the shape-heterogeneity increment, as the measures of segmentation degree.
B1. Functional-semantic depth-feature heterogeneity increment $h_{deep}$ (used as the first criterion of segmentation degree when segmenting the functional-semantic feature image data of the research area into functional units): the depth-feature standard deviations of two adjacent objects in dimension i are denoted $\sigma_{1,i}$ and $\sigma_{2,i}$, the areas of the two objects (pixel counts may be used instead) $n_1$ and $n_2$, and the depth-feature (short for functional-semantic depth-feature) standard deviation and area of the merged object $\sigma_{merg,i}$ and $n_m$; the increment is

$$h_{deep} = \sum_i w_i\left[n_m\,\sigma_{merg,i} - \left(n_1\,\sigma_{1,i} + n_2\,\sigma_{2,i}\right)\right]$$

where i indexes the dimensions of the depth feature and $w_i$ is the weight of the ith dimension; for efficiency the weights may be set equal. Multi-scale segmentation in this embodiment refers to generating meaningful polygonal image objects (i.e., functional-area units) with minimum heterogeneity and maximum homogeneity at a given scale while losing as little image information as possible; it is a means of image abstraction (compression) in which the information carried by high-resolution pixels is retained in lower-resolution objects. The segmentation starts from individual pixels and forms objects bottom-up by region merging, so that small objects merge into larger ones over several steps; every merge must keep the heterogeneity increment of the merged object below a given threshold. Multi-scale segmentation can therefore be understood as a local optimization process. The homogeneity criterion of an object is determined by a functional-semantic depth-feature factor and a shape factor (the former being relatively more important); these represent the respective weights of the depth features and the shape during segmentation and sum to 1, and the shape factor in turn consists of smoothness and compactness, whose weights also sum to 1. Together these parameters determine the segmentation result.
B2. Shape-heterogeneity increment $h_{shape}$ (used as the second criterion of segmentation degree; shape heterogeneity comprises the two shape indices smoothness and compactness):

$$h_{shape} = w_{smooth} \times h_{smooth} + (1 - w_{smooth}) \times h_{com}$$

where $w_{smooth}$ is the smoothness weight and $h_{smooth}$, $h_{com}$ are obtained as

$$h_{smooth} = n_{merg}\frac{l_{merg}}{b_{merg}} - \left(n_{11}\frac{l_{11}}{b_{11}} + n_{12}\frac{l_{12}}{b_{12}}\right)$$

$$h_{com} = n_{merg}\frac{l_{merg}}{\sqrt{n_{merg}}} - \left(n_{11}\frac{l_{11}}{\sqrt{n_{11}}} + n_{12}\frac{l_{12}}{\sqrt{n_{12}}}\right)$$

in which $l_{11}$, $l_{12}$ are the perimeters of the two adjacent objects; $b_{11}$, $b_{12}$ the perimeters of their minimum bounding rectangles; $n_{11}$, $n_{12}$ their areas; and $l_{merg}$, $b_{merg}$, $n_{merg}$ the perimeter, minimum-bounding-rectangle perimeter and area of the merged object.

B3. The total heterogeneity increment, used as the segmentation degree (or segmentation scale) when segmenting the functional-semantic feature image data of the research area into functional units, combines the two increments:

$$f = w_{deep} \times h_{deep} + (1 - w_{deep}) \times h_{shape}$$

where $w_{deep}$ is the weight of the functional-semantic depth feature in the total heterogeneity increment. A sketch of the increment computation follows.
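The three increments can be computed directly from per-object statistics, as in the following sketch; the default weight values are illustrative assumptions.

```python
import math

def heterogeneity_increment(o1, o2, merged, w_i, w_smooth=0.5, w_deep=0.8):
    """Total heterogeneity increment f for a candidate merge, following the
    formulas above. Each object is a dict with area 'n', perimeter 'l',
    bounding-rectangle perimeter 'b' and per-dimension depth-feature standard
    deviations 'sigma' (a sketch; the default weights are assumptions)."""
    # depth term: sum_i w_i [n_m*sigma_m,i - (n1*sigma_1,i + n2*sigma_2,i)]
    h_deep = sum(
        w * (merged["n"] * sm - (o1["n"] * s1 + o2["n"] * s2))
        for w, sm, s1, s2 in zip(w_i, merged["sigma"], o1["sigma"], o2["sigma"]))
    # smoothness: perimeter relative to the minimum bounding rectangle's
    h_smooth = (merged["n"] * merged["l"] / merged["b"]
                - (o1["n"] * o1["l"] / o1["b"] + o2["n"] * o2["l"] / o2["b"]))
    # compactness: perimeter relative to the square root of the area
    h_com = (merged["n"] * merged["l"] / math.sqrt(merged["n"])
             - (o1["n"] * o1["l"] / math.sqrt(o1["n"])
                + o2["n"] * o2["l"] / math.sqrt(o2["n"])))
    h_shape = w_smooth * h_smooth + (1 - w_smooth) * h_com
    return w_deep * h_deep + (1 - w_deep) * h_shape
```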
In this embodiment, the multi-scale segmentation algorithm can be organized as an iterative merging process in which image objects are merged step by step. In each iteration, every object is examined to see whether it can be merged with one of its surrounding objects (an object already merged in the current iteration needs no further examination). If the heterogeneity increment between an object and one of its adjacent objects is the minimum among its neighbors and satisfies the merge criterion (i.e., the set segmentation scale), the two are merged into a new object; otherwise the object is left unchanged and another object is selected for processing. After all objects have been processed, the iteration is complete and the next one begins; in a new iteration, objects not merged in the previous iteration and all newly created objects are processed, until no further objects can be merged. A minimal sketch of this control flow follows.
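A sketch of that loop, with the object statistics, the increment function (e.g. heterogeneity_increment above, wrapped so it first builds the merged statistics) and the merge routine supplied by the caller; it is a skeleton of the control flow, not a full segmenter.

```python
def iterative_merge(objects, adjacency, increment, merge, scale):
    """Bottom-up region merging as described above (a sketch). `objects` maps
    an integer id to that object's statistics; `adjacency` maps an id to the
    set of its neighbours' ids; `increment(a_obj, b_obj)` scores a candidate
    merge; `merge(a_obj, b_obj)` returns the combined object's statistics."""
    next_id = max(objects) + 1
    changed = True
    while changed:                        # one pass of the loop = one iteration
        changed = False
        for a in list(objects):           # snapshot: new objects wait a pass
            if a not in objects:
                continue                  # consumed by an earlier merge
            nbrs = [b for b in adjacency[a] if b in objects]
            if not nbrs:
                continue
            scores = {b: increment(objects[a], objects[b]) for b in nbrs}
            b = min(scores, key=scores.get)   # minimum-increment neighbour
            if scores[b] >= scale:
                continue                  # merge criterion not satisfied
            new = next_id
            next_id += 1
            objects[new] = merge(objects[a], objects[b])
            adjacency[new] = (adjacency[a] | adjacency[b]) - {a, b}
            for c in adjacency[new]:      # rewire the neighbours' adjacency
                adjacency[c] = (adjacency[c] - {a, b}) | {new}
            del objects[a], objects[b], adjacency[a], adjacency[b]
            changed = True
    return objects
```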
C. Calculating the unit attributes of each functional area unit, wherein the unit attributes of the functional area units comprise POI data kernel density feature attributes and functional semantic features, and the method comprises the following steps:
C1. Each category of POI data within each functional-area unit is converted into kernel-density image data by kernel density analysis. (POI data generally cannot directly provide effective features to distinguish functional categories; to address this, the invention uses kernel density analysis to mine the spatial and semantic information of the POI data, performing the analysis separately for each POI category and converting it into a kernel-density image.) The mean and standard deviation of each category's kernel-density image data within each functional-area unit are then calculated and taken as the POI-data kernel-density feature attributes of that unit.
According to a preferred embodiment, in step C1 kernel density analysis is performed on the POI data in the functional-area unit; the kernel density D at spatial position (x, y) is

$$D(x,y) = \frac{3}{\pi R^2}\sum_{i:\,dist_i<R}\left[1-\left(\frac{dist_i}{R}\right)^2\right]^2$$

where i indexes the POI points within the search radius, $dist_i$ is the distance from point i to position (x, y), and R is the search radius. In this embodiment, after the kernel density of each POI category is computed, a kernel-density image is output; each POI category generates one kernel-density image, and the mean and standard deviation of the kernel-density image data within each functional-area unit are then calculated as that unit's feature attributes (i.e., the POI-data kernel-density feature attributes). A sketch of this computation follows.
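A sketch of the per-category density surface and the per-unit mean/standard-deviation attributes, assuming the research area has been rasterized to a regular grid; the quartic kernel here matches the formula above.

```python
import numpy as np

def kernel_density_surface(points, grid_x, grid_y, R):
    """Kernel-density image for one POI category over a regular grid
    (a sketch; grid handling and units are simplified)."""
    xx, yy = np.meshgrid(grid_x, grid_y)
    D = np.zeros_like(xx, dtype=float)
    for px, py in points:                         # one POI point at a time
        dist = np.hypot(xx - px, yy - py)
        inside = dist < R                         # only points within radius R
        D[inside] += (3.0 / (np.pi * R**2)) * (1 - (dist[inside] / R) ** 2) ** 2
    return D

def unit_density_features(D, unit_mask):
    """Mean and standard deviation of one category's density surface inside a
    functional-area unit (unit_mask is a boolean raster of the unit)."""
    vals = D[unit_mask]
    return vals.mean(), vals.std()
```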
C2. The functional-semantic features within each functional-area unit are obtained from the multi-scale functional-semantic features (produced by the multi-scale deformable convolution network model as described above).
D. Unit attributes are constructed based on the functional-area units, a random forest classifier is trained, and each functional-area unit is classified with the random forest classifier.
According to a preferred embodiment, the random forest classifier of step D is trained as follows: functional-area sample data are constructed, comprising functional-area units with different functional-category attributes, the functional-category attributes corresponding to the unit attributes of the functional-area units; the sample data are divided into a training set and a test set, the random forest classifier is trained on the training set, and its classification accuracy is verified on the test set, thereby obtaining a high-accuracy random forest classifier with which the functional areas are extracted and classified. The trained random forest classifier can then classify the unclassified functional areas, produce the functional-area extraction result, and be evaluated for accuracy with the test data, as sketched below.
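A scikit-learn sketch of step D with placeholder data; in practice each row of X would stack one unit's POI kernel-density means/standard deviations together with its functional-semantic features, and y would hold the functional-category labels.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 32))     # 500 sample units, 32 unit attributes
y = rng.integers(0, 5, size=500)   # 5 functional categories (placeholders)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)                  # train on the training set
print("test accuracy:", accuracy_score(y_test, clf.predict(X_test)))
# clf.predict(...) can then classify the unclassified functional-area units
```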
In summary, the invention provides a functional-area extraction and classification method for urban functional-semantic extraction, including a multi-scale deformable convolution network model that extracts features through a ResNet50 basic network, captures multi-scale features with an improved spatial pyramid pooling module, and increases sensitivity to the shape and scale of functional areas with a deformable convolution module, so that robust functional-semantic features can be extracted. In addition, to address the inability of the block-segmentation and spatial-grid units adopted by existing methods to express functional blocks accurately, the invention provides a method for generating fine functional units by multi-scale segmentation of the functional-semantic features. Finally, based on the functional-area units, the multi-scale deep features of the remote sensing imagery are fused with the kernel-density features of the POI data, and the functional areas are classified by a trained random forest classifier. The method improves the precision and fineness of urban functional-area extraction and can be applied quickly and efficiently to large-scale urban functional-area extraction tasks to meet practical application needs.
The above description covers preferred embodiments of the present invention and is not intended to limit it; all modifications, equivalent substitutions and improvements made within the spirit and scope of the invention are intended to be covered.

Claims (6)

1. A method for extracting and classifying functional areas fusing high-resolution images and POI data, characterized by comprising the following steps:
A. collecting high-resolution image data of a research area, constructing a multi-scale deformable convolution network model, wherein the multi-scale deformable convolution network model comprises a ResNet50 basic network, a spatial pyramid pooling module, a deformable convolution module and a decoding module, and inputting the high-resolution image data into the multi-scale deformable convolution network model;
a1, dividing high-resolution image data into a plurality of image blocks, extracting low-level features and high-level features of each image block through a ResNet50 basic network of a multi-scale deformable convolution network model, and inputting the high-level features into a spatial pyramid pooling module;
A2. the spatial pyramid pooling module comprises a 1×1 convolution, several atrous (dilated) convolutions and global pooling; the atrous convolutions, each with a different sampling rate, perform feature extraction in a combined serial-parallel arrangement; the module channel-concatenates all feature maps to obtain multi-scale semantic information and applies a 1×1 convolution to obtain feature map A1;
A3. feature map A1 is input into the deformable convolution module, which outputs the nine offsets required by the deformable convolution through a 3×3 convolution and then applies the nine offsets to the convolution kernel to output feature map A2;
A4. feature map A2 is superimposed with the low-level features of the ResNet50 basic network, a 3×3 convolution and quadruple upsampling yield a semantic segmentation result map of the same size as the original high-resolution image, and a multi-scale functional-semantic feature image block is output;
B. splicing all multi-scale functional semantic feature image blocks in a research area to obtain functional semantic feature image data of the research area, and performing functional unit segmentation processing on the functional semantic feature image data of the research area by adopting a multi-scale segmentation algorithm to obtain a plurality of functional area units;
C. calculating the unit attributes of each functional area unit, wherein the unit attributes of the functional area units comprise POI data kernel density feature attributes and functional semantic features, and the method comprises the following steps:
C1. converting each category of POI data within each functional area unit into kernel-density image data using kernel density analysis, then calculating the mean and standard deviation of each category's kernel-density image data within each functional area unit and taking them as the POI-data kernel-density feature attributes of the functional area unit;
c2, acquiring functional semantic features in each functional area unit based on the multi-scale functional semantic features;
D. constructing unit attributes based on the functional area units, training a random forest classifier, and classifying each functional area unit with the random forest classifier.
2. The method for extracting and classifying the functional area fusing the high-resolution image and the POI data according to claim 1, wherein:
collecting sample data and constructing a training data set, and training the multi-scale deformable convolution network model with the cross-entropy loss function

$$Loss = -\frac{1}{S}\sum_{a=1}^{S}\sum_{c=1}^{K}\hat{y}_{a,c}\,\log\left(y_{a,c}\right)$$

where S is the number of samples, K is the number of classes, and $y_{a,c}$ is the predicted probability that sample a belongs to class c; $\hat{y}_{a,c}$ equals 1 if the one-hot encoding of sample a is class c, and 0 otherwise.
3. The method for extracting and classifying functional areas fusing high-resolution images and POI data according to claim 1, wherein: in step B, the functional-semantic depth-feature heterogeneity increment is used as the first criterion of the segmentation degree when segmenting the functional semantic feature image data of the research area into functional units;

B1. functional-semantic depth-feature heterogeneity increment $h_{deep}$: the depth-feature standard deviations of two adjacent objects in dimension i are $\sigma_{1,i}$ and $\sigma_{2,i}$, the areas of the two objects are $n_1$ and $n_2$, and the depth-feature standard deviation and area of the merged object are $\sigma_{merg,i}$ and $n_m$; the depth-feature heterogeneity increment is

$$h_{deep} = \sum_i w_i\left[n_m\,\sigma_{merg,i} - \left(n_1\,\sigma_{1,i} + n_2\,\sigma_{2,i}\right)\right]$$

where i indexes the dimensions of the depth feature and $w_i$ is the weight of the ith dimension.
4. The method for extracting and classifying functional areas fusing high-resolution images and POI data according to claim 3, wherein: in step B, the shape-heterogeneity increment is used as the second criterion of the segmentation degree when segmenting the functional semantic feature image data of the research area into functional units, shape heterogeneity comprising the two shape indices smoothness and compactness;

B2. shape-heterogeneity increment $h_{shape}$:

$$h_{shape} = w_{smooth} \times h_{smooth} + (1 - w_{smooth}) \times h_{com}$$

where $w_{smooth}$ is the smoothness weight and $h_{smooth}$, $h_{com}$ are obtained as

$$h_{smooth} = n_{merg}\frac{l_{merg}}{b_{merg}} - \left(n_{11}\frac{l_{11}}{b_{11}} + n_{12}\frac{l_{12}}{b_{12}}\right)$$

$$h_{com} = n_{merg}\frac{l_{merg}}{\sqrt{n_{merg}}} - \left(n_{11}\frac{l_{11}}{\sqrt{n_{11}}} + n_{12}\frac{l_{12}}{\sqrt{n_{12}}}\right)$$

$l_{11}$, $l_{12}$ being the perimeters of the two adjacent objects; $b_{11}$, $b_{12}$ the perimeters of their minimum bounding rectangles; $n_{11}$, $n_{12}$ their areas; and $l_{merg}$, $b_{merg}$, $n_{merg}$ the perimeter, minimum-bounding-rectangle perimeter and area of the merged object;

B3. in segmenting the functional semantic feature image data of the research area into functional units, the total heterogeneity increment is used as the segmentation degree, the total heterogeneity increment being expressed by the functional-semantic depth-feature heterogeneity increment and the shape-heterogeneity increment:

$$f = w_{deep} \times h_{deep} + (1 - w_{deep}) \times h_{shape}$$

where $w_{deep}$ is the weight of the functional-semantic depth feature in the total heterogeneity increment.
5. The method for extracting and classifying functional areas fusing high-resolution images and POI data according to claim 1, wherein: in step C1, kernel density analysis is performed on the POI data in the functional area unit, the kernel density D at spatial position (x, y) being

$$D(x,y) = \frac{3}{\pi R^2}\sum_{i:\,dist_i<R}\left[1-\left(\frac{dist_i}{R}\right)^2\right]^2$$

where i indexes the POI points within the search radius, $dist_i$ is the distance from point i to position (x, y), and R is the search radius.
6. The method for extracting and classifying functional areas fusing high-resolution images and POI data according to claim 1, wherein: in step D, functional-area sample data are constructed, comprising functional-area units with different functional-category attributes, the functional-category attributes corresponding to the unit attributes of the functional-area units; the sample data are divided into a training set and a test set, the random forest classifier is trained on the training set, and its classification accuracy is verified on the test set, thereby obtaining a high-accuracy random forest classifier with which the functional areas are extracted and classified.
CN202210543624.2A 2022-05-18 2022-05-18 Functional area extraction and classification method fusing high-resolution images and POI data Active CN114898089B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210543624.2A CN114898089B (en) 2022-05-18 2022-05-18 Functional area extraction and classification method fusing high-resolution images and POI data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210543624.2A CN114898089B (en) 2022-05-18 2022-05-18 Functional area extraction and classification method fusing high-resolution images and POI data

Publications (2)

Publication Number Publication Date
CN114898089A CN114898089A (en) 2022-08-12
CN114898089B true CN114898089B (en) 2022-10-25

Family

ID=82722944

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210543624.2A Active CN114898089B (en) 2022-05-18 2022-05-18 Functional area extraction and classification method fusing high-resolution images and POI data

Country Status (1)

Country Link
CN (1) CN114898089B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115546649B (en) * 2022-10-24 2023-04-18 中国矿业大学(北京) Single-view remote sensing image height estimation and semantic segmentation multi-task prediction method
CN117036939A (en) * 2023-08-07 2023-11-10 宁波大学 Urban functional area identification method based on multi-source data collaboration of graph rolling network

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109508585A (en) * 2017-09-15 2019-03-22 中国科学院城市环境研究所 A method of urban function region is extracted based on POI and high-resolution remote sensing image
CN110705457A (en) * 2019-09-29 2020-01-17 核工业北京地质研究院 Remote sensing image building change detection method
CN111476170A (en) * 2020-04-09 2020-07-31 首都师范大学 Remote sensing image semantic segmentation method combining deep learning and random forest
CN113205520A (en) * 2021-04-22 2021-08-03 华中科技大学 Method and system for semantic segmentation of image
CN113610165A (en) * 2021-08-10 2021-11-05 河南大学 Urban land utilization classification determination method and system based on multi-source high-dimensional features
CN114445615A (en) * 2021-12-21 2022-05-06 国网甘肃省电力公司经济技术研究院 Rotary insulator target detection method based on scale invariant feature pyramid structure


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Urban functional area identification based on remote sensing imagery; Wu Shiyao et al.; China Computer & Communication (《信息与电脑》); 2021-05-25; Vol. 33, No. 10; pp. 169-172 *

Also Published As

Publication number Publication date
CN114898089A (en) 2022-08-12

Similar Documents

Publication Publication Date Title
CN110136170B (en) Remote sensing image building change detection method based on convolutional neural network
CN110929607B (en) Remote sensing identification method and system for urban building construction progress
CN114898089B (en) Functional area extraction and classification method fusing high-resolution images and POI data
CN110263717B (en) Method for determining land utilization category of street view image
CN111639587B (en) Hyperspectral image classification method based on multi-scale spectrum space convolution neural network
CN109635726B (en) Landslide identification method based on combination of symmetric deep network and multi-scale pooling
CN112232328A (en) Remote sensing image building area extraction method and device based on convolutional neural network
Li et al. Fusing taxi trajectories and RS images to build road map via DCNN
Xie et al. OpenStreetMap data quality assessment via deep learning and remote sensing imagery
CN113807278A (en) Deep learning-based land use classification and change prediction method
CN114283326A (en) Underwater target re-identification method combining local perception and high-order feature reconstruction
CN117372876A (en) Road damage evaluation method and system for multitasking remote sensing image
CN114511787A (en) Neural network-based remote sensing image ground feature information generation method and system
Li et al. Learning to holistically detect bridges from large-size vhr remote sensing imagery
CN115147726B (en) City form map generation method and device, electronic equipment and readable storage medium
CN109934103A (en) Method based on obvious object in dark channel prior and region covariance detection image
CN112989919B (en) Method and system for extracting target object from image
CN112528803B (en) Road feature extraction method, device, equipment and storage medium
CN111639672B (en) Deep learning city function classification method based on majority voting
Pang et al. PTRSegNet: A Patch-to-Region Bottom-Up Pyramid Framework for the Semantic Segmentation of Large-Format Remote Sensing Images
Zhong et al. Local Climate Zone Mapping by Coupling Multi-Level Features with Prior Knowledge Based on Remote Sensing Images
Liu et al. Identification of Damaged Building Regions from High-Resolution Images Using Superpixel-Based Gradient and Autocorrelation Analysis
Wu et al. DF4LCZ: A SAM-Empowered Data Fusion Framework for Scene-Level Local Climate Zone Classification
CN117115566B (en) Urban functional area identification method and system by utilizing full-season remote sensing images
Aman et al. Comparative analysis of different methodologies for local climate zone classification

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant