Background
Buildings are one of the main elements of geographic space. They play an important role in urban planning and construction, change detection, population density estimation and similar fields, and are an important target in remote sensing image interpretation. With the rapid development of high-spatial-resolution remote sensing (high-resolution remote sensing for short) satellite and aerial image acquisition technologies, high-resolution remote sensing images carry a very large amount of information, including ground feature details, geometric structures, spatial features, shapes and textures, and can describe typical ground features relatively accurately. This makes accurate identification of buildings from high-resolution remote sensing images possible. At present, methods for automatically extracting buildings from high-resolution remote sensing images fall into two groups: methods based on segmentation techniques (references [1-2]) and methods based on corner, line and shadow features (references [3-5]). Segmentation-based methods make full use of the spectral, texture and shape characteristics of objects and extract buildings by object-oriented segmentation, but the resulting building edges and corners are inaccurate because of noise, illumination, reflectivity and similar factors.
Corner-feature-based methods extract a building by combining corner detection with image segmentation. Line-feature-based methods extract line segments from the image with a line detection algorithm and then analyze the relations between the lines to extract the building automatically (references [6-9]). Shadow-feature-based methods first decompose the image into small homogeneous regions and then, guided by the positions of shadows, merge small regions with similar spectral features so that the merged shape is approximately rectangular, thereby extracting the building (references [10, 11]). However, these methods based on low-level features are suitable only for detached houses with simple shapes and little occlusion between ground objects. Because remote sensing images are complex and buildings are affected by noise, occlusion, shadow and low contrast, fully automatic extraction methods cannot always produce reliable results, especially in applications with high requirements on boundary accuracy.
One currently feasible way to solve this problem is to combine automatic extraction by computer with manual interaction, that is, interactive ground feature extraction. When extracting buildings from a remote sensing image, the initial information provided by the operator is fully used together with the computer's strength in image processing, and combining the two improves the efficiency of building extraction. Several interactive building extraction methods have been studied, for example: a semi-automatic extraction method for right-angled flat-roofed houses that combines geometric constraints with image segmentation (reference [12]); a semi-automatic building extraction method based on least-squares matching with object-space geometric constraints, in which the user specifies the initial position of the building and the algorithm finds the best match between the building edges and an object-space geometric model (reference [13]); extraction of a linear ground feature between two manually specified initial points by least-squares template matching (reference [14]); and a method combining Snake and dynamic programming, in which the user only needs to specify a few seed points at the corners to indicate the approximate position of the building in order to extract its accurate outline (reference [15]). In all of these methods the initial position of the house is given manually and the building is extracted mainly from edge information. The interaction is complex, however, and the user must give the position of the building accurately, which limits their application.
Based on the above analysis, and considering that most buildings in high-resolution remote sensing images are rectangular, the invention combines automatic extraction by computer with manual interaction: the user simply draws a line on a building, and the building is extracted.
Reference documents:
[1] Blaschke, T. and J. Strobl, What's wrong with pixels? Some recent developments interfacing remote sensing and GIS. GIS-Zeitschrift für Geoinformationssysteme, 2001. 14(6): p. 12-17.
[2] Kropatsch, W.G. and S. Ben Yacoub. A revision of pyramid segmentation. in Pattern Recognition, 1996., Proceedings of the 13th International Conference on. 1996.
[3] Huertas A, Nevatia R. Detecting buildings in aerial images [J]. Computer Vision Graphics & Image Processing, 1988, 41(2): 131-152.
[4] Song, Y., & Shan, J. (2010). Building extraction from high resolution color imagery based on edge flow driven active contour and JSEG.
[5] Li J, Wang K, Zhang Z, et al. A method of building information extraction based on mathematical morphology and multiscale [C]// International Conference on Intelligent Earth Observing and Applications. 2015: 98082S.
[6] Wang J, Yang X, Qin X, et al. An Efficient Approach for Automatic Rectangular Building Extraction From Very High Resolution Optical Satellite Imagery [J]. IEEE Geoscience & Remote Sensing Letters, 2015, 12(3): 487-491.
[7] Croitoru A, Doytsher Y. Right-Angle Rooftop Polygon Extraction in Regularised Urban Areas: Cutting the Corners [J]. Photogrammetric Record, 2004, 19(108): 311–341.
[8] Ngo T T, Collet C, Mazet V. Automatic rectangular building detection from VHR aerial imagery using shadow and image segmentation [C]// IEEE International Conference on Image Processing. IEEE, 2015: 1483-1487.
[9] ponkan sea, Li Guang, Zhao Jie, et al. Contour extraction of buildings in satellite pictures based on a straight line detection algorithm [J]. Computer Application, 2008, 28(b06): 190-.
[10] A contour information extraction method for high-resolution remote sensing images [J]. Remote Sensing of National Soil Resources, 2015, 27(3): 52-58.
[11] Singh G, Jouppi M, Zhang Z, et al. Shadow based building extraction from single satellite image [J]. Proceedings of SPIE-The International Society for Optical Engineering, 2015, 9401: 94010F-94010F-15.
[12] Zhangyu, Zhangzu. Rapid semi-automatic house extraction combining geometric constraints and image segmentation [J]. Wuhan University Journal (Information Science Edition), 2000, 25(3): 238-.
[13] Zhangzu, Zhang Jianqing, Hu Xiang Yun. Semi-automatic extraction of buildings based on least-squares matching with object-space geometric constraints [J]. Wuhan University Journal (Information Science Edition), 2001, 26(4): 290-295.
[14] Semi-automatic extraction of linear ground objects on aerial images [J]. Chinese Graphic Bulletin, 2002, 7(2): 137-140.
[15] Fazan A J, Poz A P D. Rectilinear building roof contour extraction based on snakes and dynamic programming [J]. International Journal of Applied Earth Observation & Geoinformation, 2013, 25(3): 1–10.
Detailed Description
To help those of ordinary skill in the art understand and implement the present invention, the invention is described in further detail below with reference to the accompanying drawings and examples. It is to be understood that the embodiments described herein merely illustrate and explain the present invention and do not limit it.
The invention provides an interactive, semi-automatic method for extracting buildings from high-resolution remote sensing images. The extraction is completed through the following steps: user interaction, main direction detection, superpixel segmentation, building patch acquisition, and regularization.
Step 1: the approximate extent of the building is determined through user interaction.
The invention requires the user to specify a seed line that meets three requirements: 1) the seed line runs along a diagonal of the building; 2) the length of the seed line is about two thirds of the length of that diagonal; and 3) the two end points of the seed line are symmetric about the center of the building. From these three conditions the approximate position of the target building can be determined, and the image of that area is taken as the image to be segmented, as shown in FIG. 1.
In FIG. 1, the curve on the building is the seed line drawn by the user, Wseed and Hseed are respectively the width and the height of the bounding rectangle S1 of the seed line, and the approximate position S2 of the building is a square centered on S1. Wbuilding is the side length of S2 and is calculated as follows:
$$W_{\text{building}} = 2\max(W_{\text{seed}}, H_{\text{seed}}) \qquad (1)$$
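Formula (1) can be illustrated with a short sketch. The function name `building_extent` and the representation of the seed line as a list of (x, y) points are assumptions made for illustration; they are not part of the method as described.

```python
def building_extent(seed_points):
    """Return (cx, cy, side) of the square S2 that approximates the building.

    seed_points: list of (x, y) pixel coordinates sampled along the seed line
    (hypothetical input format).
    """
    xs = [p[0] for p in seed_points]
    ys = [p[1] for p in seed_points]
    # Width and height of the bounding rectangle S1 of the seed line.
    w_seed = max(xs) - min(xs)
    h_seed = max(ys) - min(ys)
    # S2 is centered on S1.
    cx = (max(xs) + min(xs)) / 2.0
    cy = (max(ys) + min(ys)) / 2.0
    # Formula (1): side length of S2.
    side = 2 * max(w_seed, h_seed)
    return cx, cy, side
```

For a seed line through (10, 20), (40, 30) and (70, 60), S1 is 60 wide and 40 high, so S2 is a square of side 120 centered at (40, 40).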
Step 2: Detect straight lines within the building extent using a line detection algorithm, obtain the main direction of the building from a line-segment direction histogram, and rotate the building to the horizontal direction.
The adjacent sides of a rectangular building are mutually perpendicular, so the building has two mutually perpendicular main directions. The main direction can be determined from the line-segment direction histogram, after which the building is rotated to the horizontal direction so that its sides are horizontal and vertical, which simplifies segmentation. EDLines is an automatic, fast line detection algorithm that requires no parameter tuning and extracts straight lines from an image more accurately than the Hough transform and LSD line extraction algorithms. The invention therefore uses EDLines to extract the line segments L = (l1, l2, ..., ln) from the whole image. Because the seed line generally lies at the center of the building, a line segment closer to the center of the building is more likely to lie on the building, so line segments closer to the seed line receive a greater weight in the direction voting. The distances from the midpoints of all line segments within the building extent to the center of the building are calculated, the line segments are weighted using a distance map, and the line-segment direction histogram is computed as follows:
where θ ∈ [0,180) denotes the direction of a line segment and is also the horizontal axis of the histogram; nθ is the histogram value at direction θ; li denotes a line segment and len(li) its length; D denotes the normalized distance map of the seed line and D(x, y) the distance value at point (x, y); and (xli, yli) denotes the coordinates of the midpoint of line segment li. The original image is then rotated to the horizontal direction using bilinear interpolation so that the sides of the building are horizontal and vertical (FIG. 2), where the main direction θM of the building is computed by formula (3):
$$\theta_M = \arg\max_{\theta \in [0, 90)} \left( n_{\theta} + n_{\theta + 90} \right) \qquad (3)$$
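The direction voting and formula (3) can be sketched as follows. The weighting `len(li) * (1 - D)` is an assumption: the text states only that segments closer to the seed line receive a greater weight, so the exact form used by the invention may differ. The one-degree bin width, the function name `principal_direction` and the callable `dist` standing in for the distance map D are likewise illustrative.

```python
import math

def principal_direction(segments, dist):
    """Estimate the main building direction in degrees, in [0, 90).

    segments: list of ((x1, y1), (x2, y2)) line segments.
    dist: callable (x, y) -> normalized distance to the seed line in [0, 1]
          (hypothetical stand-in for the distance map D).
    """
    hist = [0.0] * 180  # one bin per degree, theta in [0, 180)
    for (x1, y1), (x2, y2) in segments:
        length = math.hypot(x2 - x1, y2 - y1)
        if length == 0:
            continue
        theta = math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0
        mx, my = (x1 + x2) / 2.0, (y1 + y2) / 2.0
        # ASSUMED weighting: longer and closer segments vote more strongly.
        weight = length * (1.0 - dist(mx, my))
        hist[int(round(theta)) % 180] += weight
    # Formula (3): a rectangle contributes to theta and theta + 90 together.
    return max(range(90), key=lambda t: hist[t] + hist[t + 90])
```

Summing the two perpendicular bins means that both the long and the short sides of a rectangular roof support the same candidate direction, which makes the vote more robust than taking the single strongest bin.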
Over-segmentation is often used as a preprocessing step to improve the quality and efficiency of high-resolution remote sensing image processing, and the invention uses the simple linear iterative clustering (SLIC) algorithm to pre-segment the original image. SLIC is a superpixel algorithm that is easy to implement, fast, and has few parameters for the user to tune. Pre-segmentation yields a set of uniform superpixels that have similar characteristics and preserve image boundary information. The superpixels are obtained by clustering pixels according to their color similarity and spatial proximity. Local clustering is performed in a five-dimensional space consisting of the [l, a, b] values of the CIELAB color space and the pixel coordinates [x, y]. In this five-dimensional space, a normalized distance metric that fuses color similarity and pixel position is used to cluster the pixels. The algorithm proceeds as follows:
1) Initialize the cluster centers. For an image with N pixels and K pre-segmented superpixels, each superpixel contains about N/K pixels and the spacing between superpixel centers is $S = \sqrt{N/K}$. The cluster centers are chosen as $C_k = [l_k, a_k, b_k, x_k, y_k]$, $k \in [1, K]$.
2) Move each cluster center to the gradient minimum of its neighborhood. To avoid placing a cluster center on an image edge and to reduce the chance of selecting a noisy pixel, the algorithm moves each cluster center to the position with the smallest gradient in its 3 × 3 neighborhood.
3) For each cluster center, compute the similarity between the center and every pixel within its 2S × 2S neighborhood, and assign each pixel to the most similar center. The similarity is measured by a distance that combines the Euclidean distance in color with the Euclidean distance in position. The SLIC cluster search is shown in FIG. 3.
4) Compute the new cluster centers and repeat the search and labeling.
5) Repeat steps 3) and 4) until the difference between the new cluster centers and the previous cluster centers is smaller than a threshold.
FIG. 4 shows an image over-segmented by SLIC according to an embodiment of the present invention.
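The distance used in step 3) is not spelled out in the text above. The sketch below uses the form from the original SLIC formulation, color distance plus (m/S) times spatial distance, as an assumption; the compactness parameter `m` and all function names are illustrative.

```python
import math

def grid_interval(num_pixels, num_superpixels):
    """Spacing S between initial cluster centers: S = sqrt(N / K)."""
    return math.sqrt(num_pixels / num_superpixels)

def slic_distance(center, pixel, S, m=10.0):
    """ASSUMED SLIC distance between two (l, a, b, x, y) vectors."""
    d_lab = math.sqrt(sum((c - p) ** 2 for c, p in zip(center[:3], pixel[:3])))
    d_xy = math.sqrt(sum((c - p) ** 2 for c, p in zip(center[3:], pixel[3:])))
    # m trades color similarity against spatial compactness.
    return d_lab + (m / S) * d_xy

def nearest_center(pixel, centers, S, m=10.0):
    """Index of the closest center whose 2S x 2S window contains the pixel."""
    best, best_d = None, float("inf")
    for k, c in enumerate(centers):
        if abs(c[3] - pixel[3]) > S or abs(c[4] - pixel[4]) > S:
            continue  # pixel lies outside the search window of center k
        d = slic_distance(c, pixel, S, m)
        if d < best_d:
            best, best_d = k, d
    return best
```

Restricting the search to a 2S × 2S window around each center is what keeps SLIC fast: each pixel is compared with only a handful of nearby centers rather than with all K.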
Step 3: Obtain the building patch using a superpixel-based GrabCut segmentation method.
The invention uses the GrabCut segmentation algorithm to obtain the building patch. GrabCut converts the image segmentation problem into an energy minimization problem and then into a minimum-cut problem, which can be solved using max-flow theory. Image features are described by color Gaussian mixture model (GMM) statistics instead of a histogram, and the covariance between the features of the sample points is taken into account, which makes the segmentation more accurate. Because the interaction is simple and the segmentation accuracy is high, the user can obtain a satisfactory segmentation with relatively little interactive work. The invention therefore selects the target foreground pixels and the background pixels from the initial superpixel segmentation result and uses the max-flow/min-cut algorithm to separate the target from the background, that is, to extract the target. The energy function is expressed as:
$$E(\alpha, k, \theta, z) = U(\alpha, k, \theta, z) + V(\alpha, z) \qquad (4)$$
Here k = (k1, k2, ..., kN), ki ∈ {1, 2, ..., K} (i ∈ [1, N]), indicates that the i-th pixel belongs to the ki-th Gaussian component of the GMM; K is the number of Gaussian functions in the GMM; α denotes the foreground/background labels; z denotes the image data; and θ denotes the parameters of the GMM, θ = {π(α, k), μ(α, k), Σ(α, k)},
where π is the weight of each Gaussian function in the GMM, and μ and Σ are respectively the mean and the covariance matrix of the Gaussian function.
The data term is defined as the penalty for classifying a pixel as background or as target, and is calculated as follows:
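In the standard GrabCut formulation, whose notation formula (4) follows, the data term takes the form below. This is given for orientation only, on the assumption that the invention uses the standard form unchanged; the index n runs over the pixels.

```latex
U(\alpha, k, \theta, z) = \sum_{n} D(\alpha_n, k_n, \theta, z_n)

D(\alpha_n, k_n, \theta, z_n) = -\log \pi(\alpha_n, k_n)
  + \tfrac{1}{2} \log \det \Sigma(\alpha_n, k_n)
  + \tfrac{1}{2} \left[ z_n - \mu(\alpha_n, k_n) \right]^{\top}
    \Sigma(\alpha_n, k_n)^{-1} \left[ z_n - \mu(\alpha_n, k_n) \right]
```

D is the negative log-likelihood of pixel z_n under the Gaussian component assigned to it, so a pixel whose color is well explained by the GMM of its label incurs a small penalty.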
The boundary term represents the discontinuity penalty between neighboring pixels m and n. If the difference between two neighboring pixels is small, they very probably belong to the same target or the same background. If the difference is large, they very probably lie on the edge between target and background and are likely to be separated, so the energy is low when the difference between the two neighboring pixels is large. The similarity between two pixels is measured in RGB space by the Euclidean distance (2-norm), and the boundary term is defined as follows:
$$V(\alpha, z) = \gamma \sum_{(m,n)} [\alpha_n \neq \alpha_m] \exp\left(-\beta \lVert z_m - z_n \rVert^2\right) \qquad (7)$$
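Formula (7) can be evaluated directly for a small labeled graph, as sketched below. The defaults are assumptions taken from the standard GrabCut formulation rather than from the text above: γ = 50, and β set to the inverse of twice the mean squared color difference over the neighbor pairs. The function name and input layout are illustrative.

```python
import math

def boundary_energy(labels, colors, pairs, gamma=50.0, beta=None):
    """Evaluate the boundary term V of formula (7).

    labels: list of 0/1 labels (alpha) per pixel.
    colors: list of (r, g, b) tuples (z) per pixel.
    pairs:  list of (m, n) index pairs of neighboring pixels.
    """
    sq = [sum((a - b) ** 2 for a, b in zip(colors[m], colors[n]))
          for m, n in pairs]
    if beta is None:
        # ASSUMED default: beta = 1 / (2 * mean squared color difference).
        mean_sq = sum(sq) / len(sq) if sq else 0.0
        beta = 1.0 / (2.0 * mean_sq) if mean_sq > 0 else 0.0
    # Only pairs whose labels differ (the cut edges) contribute.
    return gamma * sum(math.exp(-beta * d)
                       for (m, n), d in zip(pairs, sq)
                       if labels[m] != labels[n])
```

Cutting between two pixels of identical color costs the full γ, while cutting across a strong color edge costs much less, which is why the minimum cut tends to follow object boundaries.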
The GrabCut algorithm minimizes the energy iteratively. Each iteration improves the GMM parameters that model the foreground and the background, and with them the segmentation result. The algorithm completes image segmentation in three steps: initialization, iterative minimization, and user editing.
1) Initialization: the user specifies a rectangle on the image that contains the target. All pixels outside the rectangle are taken as background pixels and all pixels inside it as possible target pixels, and the foreground and background GMMs are estimated from these two sets of pixels respectively.
2) Iterative minimization: a Gaussian component of the GMM is assigned to each pixel, the parameters of the foreground and background GMMs are recalculated, a graph is constructed from the Gibbs energy of formula (4), and the energy function is minimized by the min-cut/max-flow method. This is repeated until convergence to obtain an initial segmentation estimate.
3) User editing: if the user is not satisfied with the segmentation result, the user re-specifies foreground and background on the result and step 2) is executed again. The whole iteration is repeated until the user is satisfied.
In the invention, the enlarged bounding rectangle of the seed line serves as the rectangle that the user provides during algorithm initialization, and this rectangle contains the building. The foreground pixels are the superpixels corresponding to the seed line and the background pixels are the superpixels outside the rectangle. The GMMs are built from these pixels, and the building is segmented by the min-cut/max-flow method.
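The superpixel-based initialization described above can be sketched as the construction of a label mask. The four label values mirror OpenCV's `cv2.GC_BGD`, `cv2.GC_FGD`, `cv2.GC_PR_BGD` and `cv2.GC_PR_FGD` constants so that such a mask could in principle be handed to a mask-initialized GrabCut implementation. Two points are assumptions: a superpixel counts as "outside the rectangle" only if none of its pixels fall inside it, and the remaining superpixels are marked as probable foreground. The function name and input layout are illustrative.

```python
GC_BGD, GC_FGD, GC_PR_BGD, GC_PR_FGD = 0, 1, 2, 3

def init_mask(sp_labels, seed_pixels, rect):
    """Build an initial GrabCut-style mask from superpixels.

    sp_labels:   2D list of superpixel ids, one per pixel.
    seed_pixels: iterable of (row, col) pixels on the seed line.
    rect:        (r0, c0, r1, c1), inclusive, the enlarged bounding rectangle.
    """
    r0, c0, r1, c1 = rect
    # Superpixels touched by the seed line are definite foreground.
    fg_ids = {sp_labels[r][c] for r, c in seed_pixels}
    # Superpixels with at least one pixel inside the rectangle.
    inside_ids = {sp_labels[r][c]
                  for r in range(r0, r1 + 1)
                  for c in range(c0, c1 + 1)}
    mask = []
    for row in sp_labels:
        out = []
        for sp in row:
            if sp in fg_ids:
                out.append(GC_FGD)
            elif sp not in inside_ids:
                out.append(GC_BGD)      # ASSUMED: entirely outside the rectangle
            else:
                out.append(GC_PR_FGD)   # ASSUMED: undecided, probable foreground
        mask.append(out)
    return mask
```

Labeling whole superpixels rather than single pixels gives the GMMs many more foreground samples than the thin seed line alone would provide.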
Step 4: Regularize the building patch according to the color features of the building to obtain an accurate building outline.
The invention is further illustrated by the following specific example.
1. Description of the building extraction flow
Interactive building extraction based on superpixel segmentation is completed in four stages: user interaction, main direction detection and image rotation, superpixel-based image segmentation, and patch regularization. The flow chart is shown in FIG. 5.
2. Implementation process
(1) User interaction
Because the original image is large and the invention extracts one building at a time, segmentation is performed on a small sub-image. The approximate extent of the building and the size of the image to be segmented are determined from the line drawn by the user. In FIG. 6, the curve on the building is the seed line and the square surrounding the building is the approximate extent of the building. The invention takes the image obtained by doubling this approximate extent as the image to be segmented.
(2) Main direction detection and image rotation
Straight lines are detected in the image to be segmented using the EDLines line detection algorithm. The distances from the midpoints of all line segments within the building extent to the center of the building are calculated, the line segments are weighted using the distance map, and the line-segment direction histogram is computed to determine the main direction of the building. The result is shown in FIG. 7.
(3) Superpixel segmentation
The image to be segmented is over-segmented using the SLIC (simple linear iterative clustering) superpixel algorithm to obtain superpixel objects. The result is shown in FIG. 8.
(4) Superpixel-based GrabCut image segmentation
The foreground pixels of the building target are selected from the initial superpixel segmentation result. The enlarged bounding rectangle of the seed line is used as the rectangle that the user must provide when the segmentation algorithm is initialized. The foreground pixels are the superpixels that share a label with the seed line pixels, and the background pixels are the superpixels outside the rectangle. The GrabCut segmentation algorithm then separates the target from the background, that is, extracts the building. In FIG. 9(a), S1 is the bounding rectangle of the seed line and S2 is the rectangle containing the building; FIG. 9(b) shows the building patch obtained by segmentation.
(5) Patch regularization
Because remote sensing images are complex and the ground objects surrounding a building may resemble it, the building patch obtained by image segmentation alone does not cover the complete building in the image and must be regularized. Considering the difference between the color of the building and the color of the surrounding ground, the invention computes, for each side of the bounding rectangle of the patch, the similarity (based on Euclidean distance) between the mean l, a, b values of the pixels on that side and the mean l, a, b values of the pixels inside the building. A similarity threshold is set (empirical value 0.75), and the four sides of the bounding rectangle are translated until the threshold is reached, which yields the complete building. The regularization process is shown in FIG. 10, where the green rectangle in (a) is the bounding rectangle of the building patch, the red lines in (b) are the line segments obtained by translating the green lines, and (c) is the regularized building.
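The side translation can be sketched for one side of the bounding rectangle as follows. Several details are assumptions, because the text does not fix them: the similarity is taken as 1 minus the Euclidean Lab distance divided by a hypothetical `scale` of 100, the side is moved outward one row at a time, and it stops as soon as the next row falls below the threshold. The function names are illustrative, and the other three sides would be handled analogously.

```python
import math

def mean_color(pixels):
    """Mean (l, a, b) of a non-empty list of Lab tuples."""
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))

def similarity(c1, c2, scale=100.0):
    # HYPOTHETICAL mapping of Euclidean Lab distance to [0, 1];
    # the source states only that Euclidean distance is used.
    return max(0.0, 1.0 - math.dist(c1, c2) / scale)

def expand_bottom(lab, building_mean, bottom, left, right, threshold=0.75):
    """Move the bottom side down while the next row still resembles the building.

    lab: 2D list of (l, a, b) tuples; bottom, left, right: inclusive indices
    of the current bounding rectangle of the building patch.
    """
    while bottom + 1 < len(lab):
        row = lab[bottom + 1][left:right + 1]
        if similarity(mean_color(row), building_mean) < threshold:
            break  # the next row looks like surrounding ground: stop here
        bottom += 1
    return bottom
```

In a 5-row image whose first four rows have the roof color and whose last row has the ground color, a bottom side that starts at row 1 is moved down to row 3 and stops before the ground row.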
Building extraction experiments were carried out with high-resolution aerial images and satellite images as data sources. The original images and the extraction results are shown in FIG. 11 and FIG. 12. The results show that the interactive semi-automatic building extraction method provided by the invention extracts building outlines from high-resolution remote sensing images quickly and accurately.
It should be understood that parts of the specification not described in detail belong to the prior art.
It should be understood that the above description of the preferred embodiments is given for clarity and is not intended to be limiting, and that various changes, substitutions and alterations can be made without departing from the spirit and scope of the invention as defined by the appended claims.