CN113723373B - Unmanned aerial vehicle panoramic image-based illegal construction detection method - Google Patents

Unmanned aerial vehicle panoramic image-based illegal construction detection method

Info

Publication number
CN113723373B
CN113723373B (application CN202111285400.8A)
Authority
CN
China
Prior art keywords
image
building
panoramic
camera
unmanned aerial
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111285400.8A
Other languages
Chinese (zh)
Other versions
CN113723373A (en)
Inventor
叶绍泽
卢永华
刘玉贤
罗小飞
王磊
阮明浩
闫臻
耿米兰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Investigation and Research Institute Co ltd
Original Assignee
Shenzhen Investigation and Research Institute Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Investigation and Research Institute Co ltd filed Critical Shenzhen Investigation and Research Institute Co ltd
Priority to CN202111285400.8A priority Critical patent/CN113723373B/en
Publication of CN113723373A publication Critical patent/CN113723373A/en
Application granted granted Critical
Publication of CN113723373B publication Critical patent/CN113723373B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/30Determination of transform parameters for the alignment of images, i.e. image registration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/80Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10004Still image; Photographic image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20021Dividing image into blocks, subimages or windows
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20228Disparity calculation for image-based rendering

Abstract

The invention provides an unmanned aerial vehicle panoramic image-based illegal construction detection method, which comprises the following steps: s10, collecting an original image of the unmanned aerial vehicle, and labeling the original area image; s20, dividing a training set and a test set; s30, splicing the original images; s40, acquiring depth information of the image; s50, obtaining a building segmentation image block; s60, acquiring a building segmentation panoramic image; s70, carrying out image registration on the two-stage building segmentation panoramic image; s80, recording registration parameters of the two stages of registration images, and performing superposition comparison on the two stages of building segmentation panoramic images; the invention has the beneficial effects that: the illegal building inspection cost can be greatly reduced, and the inspection efficiency is improved.

Description

Unmanned aerial vehicle panoramic image-based illegal construction detection method
Technical Field
The invention relates to the technical field of image recognition, in particular to a method for detecting illegal construction based on panoramic images of unmanned aerial vehicles.
Background
With the rapid development of cities, land value keeps rising and the real-estate market is particularly active. To gain illegal profit, a large number of individuals or groups occupy land for unauthorized development or alter the original building design, for example by adding extra floors, creating illegal buildings. Although many specialized departments have been set up to manage illegal construction and have achieved certain results, it is still difficult to curb illegal building behavior because of limited manpower and limited channels for obtaining information.
The existing approach screens remote sensing images captured periodically at fixed points. To improve efficiency, previous work has proposed various automatic matching-and-detection methods that compare the two-phase images through coordinate matching and use the screened abnormal positions as the basis for judging illegal buildings. This approach has the following disadvantages: remote sensing images are difficult to obtain on a timely, periodic basis, so prompt illegal-construction inspection is hard to carry out, and the imagery may be blocked by adverse factors such as cloud cover, which greatly challenges inspection efficiency and accuracy.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides an illegal building detection method based on an unmanned aerial vehicle panoramic image, which can greatly reduce the illegal building detection cost and improve the detection efficiency.
The technical scheme adopted by the invention for solving the technical problems is as follows: the improvement of an unmanned aerial vehicle panoramic image-based illegal construction detection method is that the method comprises the following steps:
s10, collecting an original image of the unmanned aerial vehicle, and labeling the original area image;
s20, dividing a training set and a test set according to a set proportion, wherein the training set is used for building image segmentation model training, and the test set is used for testing the robustness of the model to obtain a building segmentation model;
s30, splicing the original images, recording respective serial numbers of the original images, carrying out panoramic splicing on the original images, and recording homography matrix parameters between the images;
s40, acquiring depth information of the image;
s50, obtaining a building segmentation image block, and segmenting the building and the background of the original image by adopting an image segmentation model to obtain the building segmentation image block;
s60, obtaining a building segmentation panoramic image, splicing the first-stage building segmentation image blocks according to the homography matrix parameters to obtain a building segmentation panoramic image consistent with the original image, and then obtaining the building segmentation panoramic image from the original image at the same position in the second stage by adopting the same method;
s70, carrying out image registration on the two-stage building segmentation panoramic image, finding out key characteristic parameters of the image, correcting the image of the second stage, and mapping the image of the second stage to a position corresponding to the image of the first stage;
and S80, recording registration parameters of the two stages of registration images, and performing superposition comparison on the two stages of building segmentation panoramic images.
Further, in step S10, image annotation is performed by using a label tool, and building positions are framed by polygons, and multi-period city original images are collected and annotated.
Further, in step S20, dividing the training set and the test set according to a 9:1 ratio;
in step S30, the panorama stitching of the original image is performed according to the image stitching method extracted by the ORB feature.
Further, step S40 includes a camera calibration step, a binocular correction step, a binocular matching step, and a depth information calculation step;
wherein the camera calibration step comprises:
s401, determining the radial distortion of the camera by parameters k1, k2 and k3, and determining the tangential distortion of the camera by parameters p1 and p2;
s402, calculating internal parameters, distortion parameters and external parameters of the camera, wherein the internal parameters comprise a focal length f, an imaging origin cx and an imaging origin cy, the distortion parameters comprise k1, k2, k3, p1 and p2, and the external parameters are world coordinates of a calibration object;
and S403, calibrating and measuring the relative position between the two cameras, wherein the relative position comprises a rotation matrix R and a translation vector t of the right camera relative to the left camera.
Further, the binocular correction step includes: and respectively carrying out distortion elimination and line alignment on the left view and the right view of the left camera and the right camera, so that the imaging origin coordinates of the left view and the right view are consistent, the optical axes of the two cameras are parallel, the left imaging plane and the right imaging plane are coplanar, and the epipolar lines are aligned in a line mode.
Further, the binocular matching step includes: and matching corresponding image points on the left view and the right view of the same scene through binocular matching to obtain a disparity map.
Further, the step of calculating the depth information includes:
s404, let $O_L$ and $O_R$ be the optical centers of the two cameras respectively, B the center distance of the two cameras, f the focal length of the cameras, and p and p' the imaging points of a point P on the photoreceptors of the two cameras respectively; Z is the depth information to be obtained and distance is the parallax, with the calculation formula:

$distance = X_L - X_R$

wherein $X_L$ is the imaging abscissa of the left view and $X_R$ is the imaging abscissa of the right view;

according to the similar triangle principle:

$\frac{B - distance}{B} = \frac{Z - f}{Z}$

the following can be obtained:

$Z = \frac{f \cdot B}{distance}$

in the formula, the focal length f and the camera center distance B are obtained by calibrating the cameras;

taking the height of the unmanned aerial vehicle as the reference, the depth information Z is transformed in equal proportion into an approximate actual scene distance:

$Z_{actual} = Z \cdot \frac{h}{H}$

wherein $Z_{actual}$ is the actual scene distance, h is the vertical height of the unmanned aerial vehicle above the ground, and H is the depth value of the vertical (nadir) ground point; the ground-surface distance of the image is then obtained with the Pythagorean theorem:

$C = \sqrt{Z_{actual}^{2} - h^{2}}$

where C is the horizontal distance.
Further, in step S70, image registration is performed on the two-phase segmented panorama by using an image registration method of the ORB feature operator.
Further, in step S80, the superposition calculation is performed on the two-stage building segmentation panoramic images:
Snew = S2 - S1
wherein S1 is the building segmentation area of the first stage and S2 is the building segmentation area of the second stage; the areas that fail to match are retained and recorded as changed parts.
The invention has the beneficial effects that: it provides a technical solution for illegal-construction detection that greatly reduces the cost of detection and substantially improves the efficiency of locating illegal construction; an unmanned aerial vehicle carrying a binocular panoramic camera can carry out the data acquisition. UAV data acquisition avoids the influence of cloud cover and vegetation and effectively improves data quality.
Drawings
Fig. 1 is a schematic flow chart of an unmanned aerial vehicle panoramic image-based illegal construction detection method of the present invention.
FIG. 2 is a schematic structural diagram of the U-Net model in the unmanned aerial vehicle panoramic image-based illegal construction detection method of the present invention.
FIG. 3 is a schematic diagram of a maximum pooling process.
Fig. 4 shows the registration parameters retained after the Sobel operator is selected to perform edge detection on the images of the two phases.
Fig. 5 to 8 are diagrams illustrating an embodiment of panoramic original image registration according to the present invention.
Fig. 9-12 are diagrams of an embodiment of edge detection registration in the present invention.
Detailed Description
The invention is further illustrated with reference to the following figures and examples.
The conception, the specific structure, and the technical effects produced by the present invention will be clearly and completely described below in conjunction with the embodiments and the accompanying drawings to fully understand the objects, the features, and the effects of the present invention. It is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments, and those skilled in the art can obtain other embodiments without inventive effort based on the embodiments of the present invention, and all embodiments are within the protection scope of the present invention. In addition, all the connection/connection relations referred to in the patent do not mean that the components are directly connected, but mean that a better connection structure can be formed by adding or reducing connection auxiliary components according to specific implementation conditions. All technical characteristics in the invention can be interactively combined on the premise of not conflicting with each other.
Referring to fig. 1, the invention discloses an unmanned aerial vehicle panoramic image-based illegal construction detection method. The main data source of the method is a binocular panoramic camera carried on an unmanned aerial vehicle, which provides the panoramic image data and the necessary parameters acquired during the UAV's shooting process. The first picture is acquired facing due north, and the shooting height of the unmanned aerial vehicle is taken as the distance reference. Stitching of the panoramic image mainly uses ORB image feature matching, and a homography matrix is estimated from the feature matches selected by RANSAC (random sample consensus) to complete the image stitching.
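A minimal sketch of this stitching step is given below (assuming OpenCV is available; the function name, feature count and RANSAC threshold are illustrative choices, not values from the patent). ORB keypoints are matched between two adjacent grayscale images, and the homography mapping the second image onto the first is estimated with RANSAC:

import cv2
import numpy as np

def estimate_homography(img_a, img_b, max_features=2000):
    # img_a, img_b: 8-bit grayscale images of adjacent views.
    orb = cv2.ORB_create(max_features)
    kp_a, des_a = orb.detectAndCompute(img_a, None)
    kp_b, des_b = orb.detectAndCompute(img_b, None)

    # Hamming distance suits ORB's binary descriptors; cross-check filters matches.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_b, des_a), key=lambda m: m.distance)

    src = np.float32([kp_b[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_a[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

    # RANSAC rejects outlier correspondences while estimating the homography.
    H, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H

The homography recorded for each adjacent pair in step S30 is what step S60 reuses to stitch the building segmentation blocks with exactly the same parameters.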
The building detection is obtained by adopting a U-Net algorithm, wherein the U-Net is an image segmentation method based on deep learning and is divided into a coding region and a decoding region, the coding region is used for extracting image features, and the decoding region is used for mapping abstract features into segmentation images.
U-Net mainly comprises two parts, a feature-representation part and an up-sampling recovery part, and the whole network presents a "U" shape. The feature-representation part extracts features at different levels through repeated convolution and pooling operations. The convolution operation has the characteristics of local connection and weight sharing: a convolution kernel moves over the input image with a certain step length and the feature layer is obtained by calculation, which greatly reduces the number of parameters, and the extracted features are not easily influenced by changes such as image rotation and translation. At the i-th level, for the j-th convolution kernel at position (x, y) of an input with depth N, the convolution and activation operations can be expressed as:

$a_{i,j}(x,y) = \varphi\left(\sum_{n=1}^{N}\sum_{p=1}^{P}\sum_{q=1}^{Q} w_{i,j,n}(p,q)\, a_{i-1,n}(x+p,\, y+q) + b_{i,j}\right)$

where φ is the activation function, P and Q are the height and width of the convolution kernel, $w_{i,j,n}$ and $b_{i,j}$ denote the kernel weights and bias, and $a_{i-1,n}$ is the n-th channel of the input. The feature map generated after convolution and activation gives the response of the convolution kernel at each spatial position. Intuitively, the network lets a convolution kernel learn to activate when it sees some type of visual feature, which may be a boundary in some orientation or a blob of some color in the first layer, or even a honeycomb- or wheel-like pattern in higher layers of the network. Each convolution kernel produces a different two-dimensional feature map.
The different feature maps generated by the convolution kernels are stacked in the depth direction to form the output data. The convolution operation has two modes, "VALID" and "SAME": "VALID" means that no zero values are filled in at the edges during convolution, so the feature map size is reduced by each convolution; "SAME" means that the edges of the image or feature map are zero-filled during convolution, so the size of the output feature map is unchanged from the input.
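As a small arithmetic illustration of the two modes (a hedged sketch using the standard output-size rules, not text from the patent):

def conv_output_size(n, kernel, stride=1, mode="VALID"):
    # 1-D output size for an n-wide input: 'VALID' pads nothing, 'SAME' pads
    # the edges so that (with stride 1) the output keeps the input size.
    if mode == "VALID":
        return (n - kernel) // stride + 1
    return -(-n // stride)  # ceil(n / stride)

print(conv_output_size(572, 3))               # VALID 3x3: 570, shrinks by 2
print(conv_output_size(572, 3, mode="SAME"))  # SAME 3x3: 572, unchanged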
A pooling layer is periodically inserted between successive convolutional layers. The method has the effect of gradually reducing the space size of the data volume, so that the number of parameters in the network can be reduced, the consumption of computing resources is reduced, and overfitting can be effectively controlled.
There are various pooling modes, such as max pooling, average pooling, norm pooling and log pooling. The common pooling mode is max pooling, whose output is the maximum value within the pooling range. The max-pooling range is usually a 2×2 region with a step size of 2, so there is no overlapping region in the pooling result. Average pooling, which takes the average of the values within the pooling range as the output, is also used. FIG. 3 is a schematic diagram of the max pooling process.
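The 2×2 max pooling of Fig. 3 can be sketched in a few lines of NumPy (illustrative only; it assumes the feature-map height and width are even):

import numpy as np

def max_pool_2x2(x):
    # x: feature map of shape (H, W) with even H and W; stride-2 windows do not overlap.
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

x = np.array([[1, 3, 2, 4],
              [5, 6, 1, 0],
              [7, 2, 9, 8],
              [3, 1, 4, 6]])
print(max_pool_2x2(x))   # [[6 4]
                         #  [7 9]]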
In the up-sampling part, each time up-sampling is performed, the channels of the feature-representation part and of the up-sampling part are fused at the same scale; if the scales are inconsistent, cropping is required before the fusion, and fusion here means feature concatenation. As can be seen from fig. 2, the input scale is not consistent with the output scale: the scales change as the image information passes through the network, so the output result does not have a size that corresponds exactly to the original image.
In fig. 2, the improvement made to the U-Net model is referred to in the invention as the Zone-Unet model. The Zone-Unet diagram uses arrows 1 to 6 with different meanings. Arrow 1 means conv_5x5, a 5×5 hole (dilated) convolution. Arrow 2 means copy and crop, the copying and cutting operation: within the same layer the left (encoder) part is larger than the right (decoder) part, so when shallow features are reused they must be cut to the same size before splicing. Arrow 3 means a 3×3 convolution with step size 1 and "Valid" padding; since no boundary padding is performed, each such convolution reduces the feature-map size by 2. Arrow 4 means max_pool_2x2, 2×2 max pooling of the feature map; if the pooling is also "Valid" with no boundary padding and the size is odd, some information is lost, so the input size needs to be chosen appropriately. Arrow 5 means avg_pool_2x2, 2×2 average pooling. Arrow 6 means up_conv_2x2, a 2×2 deconvolution, i.e. the up-sampling process, in which the feature-map scale increases layer by layer as the network goes up. The classification part adopts a 1×1 convolution to perform the classification operation and finally outputs two parts, the foreground and the background, where the foreground refers to the object of interest and the background refers to the unrelated scene.
In this embodiment, the Zone-UNet network model makes the following modifications according to the actual scenario:
(1) performing convolution on the coding region characteristics by adopting 5x5 hole convolution, wherein the step number is 1;
(2) after every two convolutions in the coding region, down-sampling is performed once using max pooling; when the coding-region feature map has an odd length or width, the two sides are padded and then pooled, so that the length and width remain even;
(3) the last layer of the encoder adopts average pooling to synthesize information;
(4) deconvolution is carried out on the decoding area layer by layer upwards, and convolution is carried out on the same layer by adopting 5-by-5 hole convolution;
(5) the output layer of the decoding area adopts 1x1 convolution to change the characteristic number of the output layer, so that the content of the output layer is presented according to the segmentation content;
(6) the coding-region features are cut and copied, and the same-layer features of the decoding region are fused with the coding-region features to obtain verification information;
(7) performing back propagation error according to the labeling information by adopting an Adam back propagation algorithm to enable the output to be as close as possible to the label, thereby completing the training stage;
(8) the final output comprises two parts, namely target built-up area segmentation and background, and image sequence and segmentation area are recorded;
(9) the model adopts the Parametric Rectified Linear Unit (PReLU), which has more biologically plausible characteristics and more flexible variation than the original model.
Here the PReLU coefficient $a_i$ is the quantity updated according to the Adam algorithm and error back-propagation, using momentum:

$\Delta a_i := m\,\Delta a_i + l\,\frac{\partial \varepsilon}{\partial a_i}$

where m is the momentum term and l is the learning rate. When updating $a_i$, no weight-decay term is added, which avoids $a_i$ being pushed to 0. The model effectively improves the accuracy of image segmentation, reduces the computational requirements of the original model, and improves the training speed; it makes comprehensive use of the overall feature information, the hole convolution effectively enlarges the receptive field and captures multi-scale context information, and the model can effectively distinguish built-up-area content from the background and reduce misjudgment of ambiguous content.
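To make modifications (1), (2) and (9) concrete, the sketch below shows one encoder stage in PyTorch: two 5×5 hole (dilated) convolutions with PReLU activations, followed by padding to an even size and 2×2 max pooling. The layer names, channel counts and dilation rate are illustrative assumptions, not the patent's exact Zone-Unet definition.

import torch
import torch.nn as nn
import torch.nn.functional as F

class EncoderStage(nn.Module):
    # Two 5x5 dilated convolutions with PReLU, then 2x2 max pooling.
    def __init__(self, in_ch, out_ch, dilation=2):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, kernel_size=5, dilation=dilation)
        self.conv2 = nn.Conv2d(out_ch, out_ch, kernel_size=5, dilation=dilation)
        self.act1 = nn.PReLU(out_ch)  # learnable slopes, trained like the other weights
        self.act2 = nn.PReLU(out_ch)

    def forward(self, x):
        x = self.act2(self.conv2(self.act1(self.conv1(x))))
        skip = x  # kept for the copy-and-crop fusion with the decoder (modification 6)
        # Pad to an even size so that 2x2 pooling loses no border information.
        pad_h, pad_w = x.shape[-2] % 2, x.shape[-1] % 2
        x = F.pad(x, (0, pad_w, 0, pad_h))
        return F.max_pool2d(x, kernel_size=2, stride=2), skip

pooled, skip = EncoderStage(3, 64)(torch.randn(1, 3, 128, 128))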
In the loss function part, U-Net adopts a pixel-wise softmax: each pixel has its own softmax output, so if the image is w × h, the softmax output is also of size w × h. Here x denotes a pixel, c(x) denotes the label value corresponding to pixel x, and $p_k(x)$ denotes the softmax probability of pixel x for output class k. The loss is the weighted pixel-wise cross entropy:

$E = -\sum_{x} w(x)\,\log\bigl(p_{c(x)}(x)\bigr)$

where w(x) is given by the following formula; $d_1$ and $d_2$ are the distances from the pixel to the nearest and the second-nearest object, respectively, and this weight adjusts the importance of different regions in the image:

$w(x) = w_c(x) + w_0 \cdot \exp\left(-\frac{\bigl(d_1(x) + d_2(x)\bigr)^{2}}{2\sigma^{2}}\right)$

where $w_c(x)$ is a class-balancing weight and $w_0$, $\sigma$ are constants.
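A hedged NumPy sketch of this weighted pixel-wise loss follows. The distance maps d1, d2 and the class-balance weights are assumed to be precomputed from the annotation masks; w0 = 10 and sigma = 5 are illustrative values, not values stated in the patent.

import numpy as np

def pixel_weights(d1, d2, w_class, w0=10.0, sigma=5.0):
    # w(x) = class-balance term + boundary-emphasis term built from the
    # distances d1, d2 to the nearest and second-nearest building instances.
    return w_class + w0 * np.exp(-((d1 + d2) ** 2) / (2.0 * sigma ** 2))

def weighted_pixel_loss(p, labels, w):
    # p: softmax probabilities of shape (K, H, W); labels: c(x) of shape (H, W);
    # w: weight map of shape (H, W). Returns the weighted cross entropy.
    rows, cols = np.indices(labels.shape)
    p_true = p[labels, rows, cols]          # p_{c(x)}(x) at every pixel
    return -np.sum(w * np.log(p_true + 1e-12))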
The invention provides an illegal construction detection method based on an unmanned aerial vehicle panoramic image, which comprises the following steps:
s10, collecting an original image of the unmanned aerial vehicle, and labeling the original area image;
in the step S10, image annotation is carried out by adopting a Labelme annotation tool, the position of a building is framed and selected by adopting a polygon, and the original images of the multi-period city are collected and annotated;
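As an illustration of how these polygon annotations can be turned into training masks, the sketch below assumes the standard Labelme JSON layout with a "shapes" list of polygon "points"; the file layout and the 1 = building convention are assumptions for illustration, not requirements of the patent:

import json
import numpy as np
from PIL import Image, ImageDraw

def labelme_to_mask(json_path):
    # Rasterize Labelme polygon annotations into a binary building mask.
    with open(json_path, "r", encoding="utf-8") as f:
        ann = json.load(f)
    mask = Image.new("L", (ann["imageWidth"], ann["imageHeight"]), 0)
    draw = ImageDraw.Draw(mask)
    for shape in ann["shapes"]:
        if shape.get("shape_type", "polygon") == "polygon":
            draw.polygon([tuple(p) for p in shape["points"]], outline=1, fill=1)
    return np.array(mask, dtype=np.uint8)  # 1 = building, 0 = background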
s20, dividing a training set and a test set according to a set proportion, wherein the training set is used for building image segmentation model training, and the test set is used for testing the robustness of the model to obtain a building segmentation model; in the embodiment, a training set and a test set are divided according to a ratio of 9: 1;
s30, splicing the original images, recording respective serial numbers of the original images, carrying out panoramic splicing on the original images, and recording homography matrix parameters between the images; in the embodiment, the panoramic stitching of the original image is performed according to an image stitching method extracted by the ORB characteristics;
s40, acquiring depth information of the image; in the embodiment, the method comprises a camera calibration step, a binocular correction step, a binocular matching step and a depth information calculation step;
wherein the camera calibration step comprises:
s401, determining the radial distortion of the camera by parameters k1, k2 and k3, wherein k1, k2 and k3 are radial distortion coefficients; the tangential distortion of the camera is determined by parameters p1 and p2, and p1 and p2 are tangential distortion coefficients;
Radial distortion is caused by light rays bending more at positions far from the center of the lens than at positions close to it; it mainly includes barrel distortion and pincushion distortion. Tangential distortion is caused by the lens not being perfectly parallel to the image plane, a phenomenon that occurs when the imager is mounted in the camera.
S402, calculating internal parameters, distortion parameters and external parameters of the camera, wherein the internal parameters comprise a focal length f, an imaging origin cx and an imaging origin cy, the distortion parameters comprise k1, k2, k3, p1 and p2, and the external parameters are world coordinates of a calibration object;
s403, calibrating and measuring the relative position between the two cameras, wherein the relative position comprises a rotation matrix R and a translation vector t of the right camera relative to the left camera;
the binocular correction step includes: respectively carrying out distortion elimination and line alignment on left and right views of a left camera and a right camera, so that the imaging origin coordinates of the left and right views are consistent, the optical axes of the two cameras are parallel, the left and right imaging planes are coplanar, and epipolar lines are aligned in a line manner;
the binocular matching step includes: and matching corresponding image points on the left view and the right view of the same scene through binocular matching to obtain a disparity map.
The step of calculating depth information comprises:
s404, let $O_L$ and $O_R$ be the optical centers of the two cameras respectively, B the center distance of the two cameras, f the focal length of the cameras, and p and p' the imaging points of a point P on the photoreceptors of the two cameras respectively; Z is the depth information to be obtained and distance is the parallax, with the calculation formula:

$distance = X_L - X_R$

wherein $X_L$ is the imaging abscissa of the left view and $X_R$ is the imaging abscissa of the right view;

according to the similar triangle principle:

$\frac{B - distance}{B} = \frac{Z - f}{Z}$

the following can be obtained:

$Z = \frac{f \cdot B}{distance}$

in the formula, the focal length f and the camera center distance B are obtained by calibrating the cameras;

taking the height of the unmanned aerial vehicle as the reference, the depth information Z is transformed in equal proportion into an approximate actual scene distance:

$Z_{actual} = Z \cdot \frac{h}{H}$

wherein $Z_{actual}$ is the actual scene distance, h is the vertical height of the unmanned aerial vehicle above the ground, and H is the depth value of the vertical (nadir) ground point; the ground-surface distance of the image is then obtained with the Pythagorean theorem:

$C = \sqrt{Z_{actual}^{2} - h^{2}}$

where C is the horizontal distance.
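A direct sketch of these formulas (variable names follow the text; this is an illustration of the arithmetic, not additional method content):

import math

def ground_distance(f, B, disparity, h, H):
    # f, B: focal length and camera center distance from calibration;
    # h: true vertical height of the UAV above the ground;
    # H: depth value measured at the vertical (nadir) ground point.
    Z = f * B / disparity            # depth from the binocular model
    Z_actual = Z * h / H             # equal-proportion scaling to the scene distance
    return math.sqrt(max(Z_actual ** 2 - h ** 2, 0.0))  # horizontal distance C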
S50, obtaining a building segmentation image block, and segmenting the building and the background of the original image by adopting an image segmentation model to obtain the building segmentation image block;
s60, obtaining a building segmentation panoramic image, splicing the first-stage building segmentation image blocks according to the homography matrix parameters to obtain a building segmentation panoramic image consistent with the original image, and then obtaining the building segmentation panoramic image from the original image at the same position in the second stage by adopting the same method;
s70, carrying out image registration on the two-stage building segmentation panoramic image, finding out key characteristic parameters of the image, correcting the image of the second stage, and mapping the image of the second stage to a position corresponding to the image of the first stage; in the embodiment, an image registration method of an ORB feature operator is adopted to perform image registration on the two-stage segmentation panoramic image;
In addition, because it is difficult to guarantee the shooting angle of the unmanned aerial vehicle and the registration effect may otherwise be poor, edge detection is first carried out on the panoramic original images to ensure, to the greatest extent, that the main features of the scene correspond. As shown in fig. 4, in the invention, after comparison, the Sobel operator is selected to perform edge detection on the images of the two stages, and the registration parameters are then retained. Fig. 5 to 8 are diagrams illustrating an embodiment of panoramic original image registration according to the present invention. Fig. 9-12 are diagrams of an embodiment of edge detection registration in the present invention.
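A small OpenCV sketch of this pre-processing (kernel size and normalization are illustrative): the Sobel gradient magnitude of each panorama is used in place of the raw image as the input to the ORB registration.

import cv2
import numpy as np

def sobel_edges(gray):
    # gray: 8-bit grayscale panorama; returns an 8-bit edge-strength image.
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)
    mag = cv2.magnitude(gx, gy)
    return cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)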
S80, recording the registration parameters of the two stages of registered images, and performing superposition comparison on the two-stage building segmentation panoramic images; in step S80, the superposition calculation is performed on the two-stage building segmentation panoramic images:
Snew = S2 - S1
wherein S1 is the building segmentation area of the first stage and S2 is the building segmentation area of the second stage; the areas that fail to match are retained and recorded as changed parts.
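Once the two panoramas are registered, the superposition comparison reduces to a mask difference; a minimal sketch (masks assumed binary, 1 = building, already registered to each other):

import numpy as np

def change_mask(s1, s2):
    # S_new = S_2 - S_1: keep pixels that are building in the second stage but not
    # in the first; these are recorded as the changed (suspected illegal) parts.
    return ((s2.astype(np.int16) - s1.astype(np.int16)) > 0).astype(np.uint8)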
Combining the above steps, the abnormal parts are checked manually and the positions of the corresponding areas are recorded, so as to judge whether they constitute illegal construction.
The invention mainly adopts the binocular principle to obtain depth information; relying on the building differences and depth information of the multi-period stitched panoramas, the abnormal positions of the differences can be estimated, and combined with a general survey by personnel this can greatly improve the efficiency of illegal-construction investigation and reduce labor cost. The invention therefore provides a technical solution for illegal-construction detection that greatly reduces the cost of detection and substantially improves the efficiency of locating illegal construction; data acquisition can be carried out with a small unmanned aerial vehicle carrying a binocular panoramic camera. UAV data acquisition avoids the influence of cloud cover and vegetation and effectively improves data quality. In addition, the U-Net-based image segmentation method greatly improves the building segmentation success rate and effectively separates buildings from the image background.
While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (5)

1. An unmanned aerial vehicle panoramic image-based illegal construction detection method is characterized by comprising the following steps:
s10, collecting an original image of the unmanned aerial vehicle, and labeling the original area image;
s20, dividing a training set and a test set according to a set proportion, wherein the training set is used for building image segmentation model training, and the test set is used for testing the robustness of the model to obtain a building segmentation model;
s30, splicing the original images, recording respective serial numbers of the original images, carrying out panoramic splicing on the original images, and recording homography matrix parameters between the images;
s40, acquiring depth information of the image;
step S40, including a camera calibration step, a binocular correction step, a binocular matching step, and a depth information calculation step;
wherein the camera calibration step comprises:
s401, determining the radial distortion of the camera by parameters k1, k2 and k3, wherein k1, k2 and k3 are radial distortion coefficients; the tangential distortion of the camera is determined by parameters p1 and p2, and p1 and p2 are tangential distortion coefficients;
s402, calculating internal parameters, distortion parameters and external parameters of the camera, wherein the internal parameters comprise a focal length f, an imaging origin cx and an imaging origin cy, the distortion parameters comprise k1, k2, k3, p1 and p2, and the external parameters are world coordinates of a calibration object;
s403, calibrating and measuring the relative position between the two cameras, wherein the relative position comprises a rotation matrix R and a translation vector t of the right camera relative to the left camera;
the binocular correction step includes: respectively carrying out distortion elimination and line alignment on left and right views of a left camera and a right camera, so that the imaging origin coordinates of the left and right views are consistent, the optical axes of the two cameras are parallel, the left and right imaging planes are coplanar, and epipolar lines are aligned in a line manner;
the binocular matching step includes: matching corresponding image points on left and right views of the same scene is realized through binocular matching to obtain a disparity map;
the step of calculating depth information comprises:
s404, let $O_L$ and $O_R$ be the optical centers of the two cameras respectively, B the center distance of the two cameras, f the focal length of the cameras, and p and p' the imaging points of a point P on the photoreceptors of the two cameras respectively; Z is the depth information to be obtained and distance is the parallax, with the calculation formula:

$distance = X_L - X_R$

wherein $X_L$ is the imaging abscissa of the left view and $X_R$ is the imaging abscissa of the right view;

according to the similar triangle principle:

$\frac{B - distance}{B} = \frac{Z - f}{Z}$

the following can be obtained:

$Z = \frac{f \cdot B}{distance}$

in the formula, the focal length f and the camera center distance B are obtained by calibrating the cameras;

taking the height of the unmanned aerial vehicle as the reference, the depth information Z is transformed in equal proportion into an approximate actual scene distance:

$Z_{actual} = Z \cdot \frac{h}{H}$

wherein $Z_{actual}$ is the actual scene distance, h is the vertical height of the unmanned aerial vehicle above the ground, and H is the depth value of the vertical (nadir) ground point; the ground-surface distance of the image is then obtained with the Pythagorean theorem:

$C = \sqrt{Z_{actual}^{2} - h^{2}}$

wherein C is the horizontal distance;
s50, obtaining a building segmentation image block, and segmenting the building and the background of the original image by adopting an image segmentation model to obtain the building segmentation image block;
s60, obtaining a building segmentation panoramic image, splicing the first-stage building segmentation image blocks according to the homography matrix parameters to obtain a building segmentation panoramic image consistent with the original image, and then obtaining the building segmentation panoramic image from the original image at the same position in the second stage by adopting the same method;
s70, carrying out image registration on the two-stage building segmentation panoramic image, finding out key characteristic parameters of the image, correcting the image of the second stage, and mapping the image of the second stage to a position corresponding to the image of the first stage;
and S80, recording registration parameters of the two stages of registration images, and performing superposition comparison on the two stages of building segmentation panoramic images.
2. The method for detecting illegal construction based on the panoramic image of the unmanned aerial vehicle as claimed in claim 1, wherein in step S10, the Labelme labeling tool is used for image labeling, polygons are used to frame and select the building positions, and multi-period city original images are collected and labeled.
3. The unmanned aerial vehicle panoramic image-based illegal construction detection method according to claim 1, characterized in that in step S20, a training set and a test set are divided according to a 9:1 ratio;
in step S30, the panorama stitching of the original image is performed according to the image stitching method extracted by the ORB feature.
4. The method for detecting illegal construction based on the panoramic image of the unmanned aerial vehicle as claimed in claim 1, wherein in step S70, an image registration method based on the ORB feature operator is used to perform image registration on the two-phase segmentation panoramic images.
5. The method for detecting illegal construction based on the panoramic image of the unmanned aerial vehicle as claimed in claim 1, wherein in step S80, the superposition calculation is performed on the two-stage building segmentation panoramic images:
Snew = S2 - S1
wherein S1 is the building segmentation area of the first stage and S2 is the building segmentation area of the second stage; the areas that fail to match are retained and recorded as changed parts.
CN202111285400.8A 2021-11-02 2021-11-02 Unmanned aerial vehicle panoramic image-based illegal construction detection method Active CN113723373B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111285400.8A CN113723373B (en) 2021-11-02 2021-11-02 Unmanned aerial vehicle panoramic image-based illegal construction detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111285400.8A CN113723373B (en) 2021-11-02 2021-11-02 Unmanned aerial vehicle panoramic image-based illegal construction detection method

Publications (2)

Publication Number Publication Date
CN113723373A CN113723373A (en) 2021-11-30
CN113723373B true CN113723373B (en) 2022-01-18

Family

ID=78686359

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111285400.8A Active CN113723373B (en) 2021-11-02 2021-11-02 Unmanned aerial vehicle panoramic image-based illegal construction detection method

Country Status (1)

Country Link
CN (1) CN113723373B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108629812A (en) * 2018-04-11 2018-10-09 深圳市逗映科技有限公司 A kind of distance measuring method based on binocular camera
CN109238288A (en) * 2018-09-10 2019-01-18 电子科技大学 Autonomous navigation method in a kind of unmanned plane room
CN109410207A (en) * 2018-11-12 2019-03-01 贵州电网有限责任公司 A kind of unmanned plane line walking image transmission line faultlocating method based on NCC feature
CN110599537A (en) * 2019-07-25 2019-12-20 中国地质大学(武汉) Mask R-CNN-based unmanned aerial vehicle image building area calculation method and system
CN113361496A (en) * 2021-08-09 2021-09-07 深圳市勘察研究院有限公司 City built-up area statistical method based on U-Net
CN113536836A (en) * 2020-04-15 2021-10-22 宁波弘泰水利信息科技有限公司 Method for monitoring river and lake water area encroachment based on unmanned aerial vehicle remote sensing technology

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105825475B (en) * 2016-04-01 2019-01-29 西安电子科技大学 360 degree of full-view image generation methods based on single camera
WO2019144300A1 (en) * 2018-01-23 2019-08-01 深圳市大疆创新科技有限公司 Target detection method and apparatus, and movable platform
CN112509124B (en) * 2020-12-14 2023-09-22 成都数之联科技股份有限公司 Depth map obtaining method and system, unmanned aerial vehicle orthogram generating method and medium
CN113096016A (en) * 2021-04-12 2021-07-09 广东省智能机器人研究院 Low-altitude aerial image splicing method and system
CN113313006B (en) * 2021-05-25 2023-08-22 哈工智慧(武汉)科技有限公司 Urban illegal building supervision method, system and storage medium based on unmanned aerial vehicle

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108629812A (en) * 2018-04-11 2018-10-09 深圳市逗映科技有限公司 A kind of distance measuring method based on binocular camera
CN109238288A (en) * 2018-09-10 2019-01-18 电子科技大学 Autonomous navigation method in a kind of unmanned plane room
CN109410207A (en) * 2018-11-12 2019-03-01 贵州电网有限责任公司 A kind of unmanned plane line walking image transmission line faultlocating method based on NCC feature
CN110599537A (en) * 2019-07-25 2019-12-20 中国地质大学(武汉) Mask R-CNN-based unmanned aerial vehicle image building area calculation method and system
CN113536836A (en) * 2020-04-15 2021-10-22 宁波弘泰水利信息科技有限公司 Method for monitoring river and lake water area encroachment based on unmanned aerial vehicle remote sensing technology
CN113361496A (en) * 2021-08-09 2021-09-07 深圳市勘察研究院有限公司 City built-up area statistical method based on U-Net

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on Binocular Vision Technology in Obstacle Recognition of Quadrotor UAVs; 樊建昌 (Fan Jianchang); China Master's Theses Full-text Database (Engineering Science and Technology II); 20210415 (No. 04); C031-115 *

Also Published As

Publication number Publication date
CN113723373A (en) 2021-11-30

Similar Documents

Publication Publication Date Title
CN111462135B (en) Semantic mapping method based on visual SLAM and two-dimensional semantic segmentation
CN110569704B (en) Multi-strategy self-adaptive lane line detection method based on stereoscopic vision
US11080911B2 (en) Mosaic oblique images and systems and methods of making and using same
CN111563415B (en) Binocular vision-based three-dimensional target detection system and method
CN112793564B (en) Autonomous parking auxiliary system based on panoramic aerial view and deep learning
US8326025B2 (en) Method for determining a depth map from images, device for determining a depth map
US6671399B1 (en) Fast epipolar line adjustment of stereo pairs
CN108648194B (en) Three-dimensional target identification segmentation and pose measurement method and device based on CAD model
Gao et al. Ground and aerial meta-data integration for localization and reconstruction: A review
CN112348775B (en) Vehicle-mounted looking-around-based pavement pit detection system and method
CN117036641A (en) Road scene three-dimensional reconstruction and defect detection method based on binocular vision
CN111553845A (en) Rapid image splicing method based on optimized three-dimensional reconstruction
CN110443228B (en) Pedestrian matching method and device, electronic equipment and storage medium
CN115222884A (en) Space object analysis and modeling optimization method based on artificial intelligence
Tang et al. Content-based 3-D mosaics for representing videos of dynamic urban scenes
CN113096016A (en) Low-altitude aerial image splicing method and system
Lhuillier Toward flexible 3d modeling using a catadioptric camera
CN113723373B (en) Unmanned aerial vehicle panoramic image-based illegal construction detection method
CN115619623A (en) Parallel fisheye camera image splicing method based on moving least square transformation
Bazin et al. An original approach for automatic plane extraction by omnidirectional vision
van de Wouw et al. Hierarchical 2.5-d scene alignment for change detection with large viewpoint differences
CN113298871B (en) Map generation method, positioning method, system thereof, and computer-readable storage medium
CN115456870A (en) Multi-image splicing method based on external parameter estimation
CN113850293A (en) Positioning method based on multi-source data and direction prior joint optimization
Onmek et al. Evaluation of underwater 3D reconstruction methods for Archaeological Objects: Case study of Anchor at Mediterranean Sea

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant