CN110852164A - YOLOv3-based method and system for automatically detecting illegal buildings - Google Patents
- Publication number
- CN110852164A CN110852164A CN201910956709.1A CN201910956709A CN110852164A CN 110852164 A CN110852164 A CN 110852164A CN 201910956709 A CN201910956709 A CN 201910956709A CN 110852164 A CN110852164 A CN 110852164A
- Authority
- CN
- China
- Prior art keywords
- image
- grid
- kinect
- error
- value
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
- G06V20/176—Urban or other man-made structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
Abstract
The invention discloses a method and a system for automatically detecting illegal buildings based on YOLOv3. An image acquisition device photographs the building area with a Kinect device, producing four types of Kinect images, and the detection system performs image preprocessing, labeling and discrimination with a YOLOv3 neural network. The result produced by the detection system is sent to a system terminal, where the recognition results of the four types of Kinect images are fused into an auxiliary diagnosis result and delivered to the user for final diagnosis, thereby identifying illegal buildings in the imaged area. The invention realizes an automatic illegal-building detection system based on intelligent image processing technology and a neural network, reduces the workload of manual identification to a certain extent, and has economic and social significance.
Description
Technical Field
The invention relates to the technical field of illegal-building image identification, and in particular to a method and a system for automatically detecting illegal buildings based on YOLOv3.
Background
Building change detection is an important part of geographic national-condition monitoring and is significant for illegal-building identification, dynamic urban monitoring, geographic information updating and the like. Taking urban illegal-building detection as an example: with the continuing development of China's economy and society, urbanization keeps accelerating, urban buildings keep increasing, and the number and scale of illegal buildings grow with them. This phenomenon not only damages urban planning and the urban landscape, but also affects the city's image and residents' lives; it is a hot issue of public concern, a difficult problem for city management, and one of the negative factors affecting social harmony. At present, the fact that the cost of violating the law is low while the cost of enforcing it is high is one of the main reasons illegal buildings persist despite repeated prohibition. Apart from gaps in the relevant legal framework, the detection of illegal buildings is itself weak: lacking automated monitoring means, manual inspection has many disadvantages, namely a long discovery cycle and a high cost for large-scale monitoring. In recent years, cities such as Beijing have attempted illegal-building detection with satellite image data, but automatic analysis of image information is still not mature enough, and manual identification and verification carry a large share of the work. Land law enforcement, city management and related agencies nationwide invest enormous manpower and material resources in this task every year. The market urgently needs a highly automated, robust and reliable method to detect urban illegal buildings and thereby advance their remediation.
Disclosure of Invention
The invention aims to provide a method and a system for automatically detecting illegal buildings based on YOLOv3, so as to solve the problems raised in the background art.
To this end, the invention provides the following technical scheme:
An automatic illegal-building detection method based on YOLOv3 comprises the following steps:
Step 1: first obtain pictures of the building area with an image acquisition device, then perform image sharpness preprocessing on them with an image scanning device, and make a training set: select x sample groups from the data set, each containing y samples, where each sample consists of one RGB image picture and one depth image picture, giving 2×x×y sample pictures in total;
Step 2: copy each sample picture and rescale it to 300×225, 400×300, 500×375 and 600×450 respectively, obtaining four times the number of sample pictures;
Step 3: pre-train on the four-fold amplified sample pictures with Darknet-53, transfer the network parameters obtained from pre-training to the base network, and initialize it to obtain a transferred Darknet-53 model;
Step 4: cluster the manually labeled building-area boxes in the training set with the K-means clustering algorithm, set different k values, and record the corresponding values of the sum of squared errors (SSE);
Step 5: plot the SSE value against the k value; from this plot find the optimal k value with the elbow method, obtain the corresponding k cluster centers, and write them into the configuration file as the initial candidate-box parameters of YOLOv3;
Step 6: train on the training set obtained in Step 1 with the improved YOLOv3 to obtain the trained parameter model, and realize the identification of illegal buildings by fusing the recognition results of the four types of Kinect images.
Preferably, the training set in step 1 is prepared as follows:
1.1: the image acquisition device shoots four types of Kinect images with a Kinect device, namely one IR image, one Registration-of-RGB image, one RGB image and one Depth image; each image is cropped to a picture with a fixed position, a consistent angle and a known field of view, at a resolution of 640×480;
1.2: copy each picture obtained by shooting and rescale it to 300×225, 400×300, 500×375 and 600×450 respectively, obtaining a four-fold amplified Kinect image data set;
1.3: manually mark a building-area box on each picture in the four-fold amplified Kinect image data set to generate a label file;
1.4: combine the Kinect image data set and the label files to form the training set.
Preferably, after the trained parametric model is obtained in step 6, the method further includes: calling a Kinect camera to simultaneously output four types of Kinect images, and identifying by adopting a parameter model to obtain identification results of the four types of Kinect images; the four types of Kinect images refer to: IR images, Registration of RGB images, RGB images and Depth images.
Preferably, the value of the sum of squared errors SSE in step 4 is obtained as follows: during training, YOLOv3 divides the image into S×S grids, and each grid predicts B detection boxes and their confidence Conf(Object) according to formulas (1), (2) and (3):
Conf(Object) = Pr(Object) × IOU (1),
Pr(Object) = 1 if an object falls into the grid corresponding to the candidate box, and 0 otherwise (2),
IOU = area(box(Pred) ∩ box(Truth)) / area(box(Pred) ∪ box(Truth)) (3),
where box(Pred) is the prediction box, box(Truth) is the real (ground-truth) box, and area(·) denotes area; the confidence Conf(Object) represents the confidence level of the detected object.
Each detection box contains 5 parameters: x, y, w, h and Conf(Object), where (x, y) is the offset of the box center from the grid position and (w, h) are the width and height of the box.
Each grid also predicts C class probabilities Pr(Class_i|Object), the probability that the object falling into grid i belongs to a given class; the final output is an S×S×[B×(4+1+C)]-dimensional tensor. The loss function of YOLOv3 is characterized by formula (4):
loss = Err_coord + Err_iou + Err_cls (4),
where Err_coord is the coordinate error, Err_iou is the IOU error and Err_cls is the classification error:
Err_coord = λ_coord Σ_{i=0}^{S²} Σ_{j=0}^{B} 1_{ij}^{obj} [(x_i − x̂_i)² + (y_i − ŷ_i)²] + λ_coord Σ_{i=0}^{S²} Σ_{j=0}^{B} 1_{ij}^{obj} [(√w_i − √ŵ_i)² + (√h_i − √ĥ_i)²] (5),
Err_iou = Σ_{i=0}^{S²} Σ_{j=0}^{B} 1_{ij}^{obj} (C_i − Ĉ_i)² + λ_noobj Σ_{i=0}^{S²} Σ_{j=0}^{B} 1_{ij}^{noobj} (C_i − Ĉ_i)² (6),
Err_cls = Σ_{i=0}^{S²} 1_i^{obj} Σ_{c∈classes} (p_i(c) − p̂_i(c))² (7),
where:
λ_coord is the weight parameter of Err_coord, λ_coord = 5; λ_noobj is the correction parameter of Err_iou, λ_noobj = 0.5;
x̂_i, ŷ_i, ŵ_i and ĥ_i are the x, y, w and h parameters of the real box corresponding to grid i, and (x_i − x̂_i)², (y_i − ŷ_i)², (√w_i − √ŵ_i)² and (√h_i − √ĥ_i)² are the corresponding parameter errors of grid i;
C_i is the predicted confidence Conf(Object) of grid i, Ĉ_i is its true confidence, and (C_i − Ĉ_i)² is the confidence error of grid i;
p_i(c) is the predicted probability Pr(Class_i|Object) that the target falling into grid i belongs to class c, and p̂_i(c) is the corresponding true probability;
1_i^{obj} indicates whether a target falls into grid i: 1 if so, 0 otherwise;
1_{ij}^{obj} indicates whether a target falls into the j-th prediction box of grid i: 1 if so, 0 otherwise.
Preferably, in step 4, a group of initial candidate boxes with fixed sizes and aspect ratios is introduced into YOLOv3 for the target detection process; the K-means clustering algorithm performs cluster analysis on the manually labeled target boxes in the training set obtained in step 1, so as to find the optimal k value (the number of initial candidate boxes) and the width-height dimensions of the k cluster centers, which serve as the candidate-box parameters in the network configuration file.
The k value is determined from the sum of squared errors SSE of formula (8) together with the elbow method:
SSE = Σ_{i=1}^{k} Σ_{p∈Cl_i} |p − m_i|² (8),
where Cl_i is the i-th cluster, p is a sample point in Cl_i, m_i is the centroid of Cl_i (the mean of all samples in Cl_i), and SSE is the clustering error of all samples, which characterizes how good the clustering is. The core idea of the elbow method is: as k increases, the samples are divided more finely and SSE gradually decreases; once k reaches the optimal cluster number, further increases in k bring rapidly diminishing returns in clustering quality, shown by a sudden drop in the rate at which SSE decreases, so the plot of SSE against k takes the shape of an elbow, and the k value at the elbow is the required optimal cluster number.
Preferably, in the K-means clustering of step 4, the Euclidean distance would ordinarily express the error between a sample point and the cluster mean; here the sample point is a prediction box and the cluster mean a real box, so the IOU is used instead to reflect the error between the prediction box and the real box: the larger the IOU, the smaller the error. The clustering error of the samples is calculated with formula (9):
SSE = Σ_{i=1}^{k} Σ_{p∈Cl_i} (1 − IOU_p)² (9),
where IOU_p is the IOU of sample point p and 1 − IOU_p represents the error of sample point p; from this, the SSE and k values are obtained.
Preferably, in step 6, the recognition result is sent to the system terminal and delivered to the user as an auxiliary diagnosis result for final identification.
The invention also provides an automatic illegal building detection system based on YOLOv3, which comprises:
the image acquisition equipment is used for shooting four types of Kinect images and uploading the images to the detection system;
the image scanning device is used for performing image sharpness preprocessing on the images shot by the image acquisition device before sending them to the detection system;
the detection system is used for acquiring the Kinect images, preprocessing and labeling them, and judging with a YOLOv3 neural network whether they show an illegal building;
and the system terminal receives the judgment produced by the detection system and displays it as an auxiliary judgment result for the user.
Preferably, the image acquisition equipment adopts a Kinect device, and the four types of Kinect images comprise an IR image, a Registration-of-RGB image, an RGB image and a Depth image; the picture resolution is 640×480. The image preprocessing copies each picture obtained by shooting and rescales it to 300×225, 400×300, 500×375 and 600×450 respectively, obtaining a four-fold amplified Kinect image data set; the labeling manually marks a building-area box on each picture in the four-fold amplified Kinect image data set to generate a label file.
Preferably, the image scanning device is based on a PC loaded with Matlab710 implementing the Retinex image enhancement algorithm.
Compared with the prior art, the invention has the following beneficial effects:
the invention provides effective auxiliary diagnosis information; it realizes an automatic illegal-building detection system based on intelligent image processing technology and a neural network, reduces the workload of manual identification to a certain extent, and has economic and social significance.
Drawings
FIG. 1 is a schematic diagram of the system of the present invention;
fig. 2 is a schematic diagram of the network structure of YOLOv3 in the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1-2, the present invention provides a technical solution:
referring to fig. 1, an automatic illegal building detection method based on YOLOv3 includes an image acquisition device, an image scanning device, a detection system and a system terminal. The image acquisition of the building area is to use a Kinect device to shoot four types of Kinect images, which are respectively as follows: one each of the IR image, Registration of RGB image, RGB image and Depth image; the resolution of the picture is 640 × 480. The detection system comprises image preprocessing, labeling and discrimination by using a YOLOv3 neural network.
In the invention, an unmanned aerial vehicle carrying the Kinect can serve as the photographing equipment; low-altitude photography by the unmanned aerial vehicle reduces the difficulty and workload of manual photographing. The unmanned aerial vehicle photographs the buildings (including illegal buildings) of a city or other region along a preset route, and the captured images are input into the image scanning device for preprocessing. The preprocessing runs on a computer (PC) under Windows XP or a later operating system, in which Matlab710 with a Retinex-based image enhancement algorithm is installed, so that the pictures or photos shot by the unmanned aerial vehicle can be sharpened with the Retinex algorithm. During aerial photographic evidence collection, weather and other factors, for example haze or disturbance by gusts of wind, can greatly reduce the overall contrast and brightness of the image and distort its colors. Retinex post-processing of the restored image enhances the image as a whole: detail information is enhanced, edges become clear, color information is further strengthened so that colors are better recovered, and highlight areas are enhanced. Matlab710 with the Retinex image enhancement algorithm is prior art; see, for example, "Image enhancement algorithm based on the Retinex principle", article number 1009-3044(2018)11-0185-02, and "A method for improving the definition of foggy images", article number 100325060(2011)0120083204.
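The patent relies on an existing Matlab Retinex implementation. Purely to illustrate the idea (the reflectance estimate is the log of the image minus the log of a smoothed illumination estimate), here is a one-dimensional toy sketch in Python; the box blur standing in for the usual Gaussian surround is our simplification:

```python
import math

def box_blur(row, radius):
    """Crude 1-D box blur, used here as the illumination estimate."""
    out = []
    for i in range(len(row)):
        lo, hi = max(0, i - radius), min(len(row), i + radius + 1)
        out.append(sum(row[lo:hi]) / (hi - lo))
    return out

def single_scale_retinex(row, radius=2):
    """R(x) = log I(x) - log (I * G)(x), per pixel.

    A +1 offset keeps the logs defined for zero-valued pixels."""
    illum = box_blur(row, radius)
    return [math.log(p + 1.0) - math.log(s + 1.0)
            for p, s in zip(row, illum)]
```

On a flat (haze-like, low-contrast) row the response is zero everywhere, while a pixel brighter than its surround gets a positive response, which is the sense in which Retinex recovers detail and edges independent of overall illumination.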
FIG. 2 shows a schematic diagram of the YOLOv3 structure, covering training-set preparation, generation of the transferred Darknet-53 model, improvement of the candidate-box parameters, and illegal-building identification. The method comprises the following steps:
step 1, making a training set according to the following process
1.1: use a Kinect device to shoot four types of Kinect images, namely one each of the IR image, the Registration-of-RGB image, the RGB image and the Depth image; the resolution of the pictures obtained by shooting is 640×480.
1.2: copy each picture obtained by shooting and rescale it to 300×225, 400×300, 500×375 and 600×450 respectively, obtaining a four-fold amplified Kinect image data set.
1.3: manually mark a building-area box on each picture in the four-fold amplified Kinect image data set and generate a label file.
1.4: combine the Kinect image data set and the label files to form the training set.
Step 2, generating the transferred Darknet-53 model according to the following process
2.1: select x sample groups from the data set, each containing y samples; each sample consists of one RGB image picture and one depth image picture, giving 2×x×y sample pictures.
2.2: copy each sample picture and rescale it to 300×225, 400×300, 500×375 and 600×450 respectively, obtaining four times the number of sample pictures.
2.3: pre-train on the four-fold amplified sample pictures with Darknet-53, transfer the network parameters obtained from pre-training to the base network, and initialize it to obtain the transferred Darknet-53 model.
Step 3, setting initial candidate frame parameters of YOLOv3 according to the following process
3.1: cluster the manually labeled building-area boxes in the training set with the K-means clustering algorithm, set different k values, and record the corresponding values of the sum of squared errors (SSE).
3.2: find the optimal k value with the elbow method, obtain the corresponding k cluster centers, and write them into the configuration file as the initial candidate-box parameters of YOLOv3.
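Writing the k cluster centres into the configuration file (step 3.2) might look like the following for a darknet-style .cfg; only the `anchors=` line syntax mirrors darknet's convention, and the rest of the file handling is assumed:

```python
def anchors_line(centers):
    """Format k cluster centres (w, h) as a darknet-style anchors entry,
    sorted by area as is conventional for YOLOv3 anchor lists."""
    ordered = sorted(centers, key=lambda wh: wh[0] * wh[1])
    return "anchors = " + ", ".join(
        f"{round(w)},{round(h)}" for w, h in ordered)
```

For example, cluster centres (30, 60) and (10.4, 13.6) would be written as `anchors = 10,14, 30,60`; the line then replaces the default anchors in each [yolo] section of the configuration file.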
Step 4, identifying the illegal buildings according to the following process
4.1: train on the training set obtained in step 1 with the improved YOLOv3 to obtain the trained parameter model;
4.2: call the Kinect camera to output the four types of Kinect images simultaneously, and run the parameter model obtained in step 4.1 to obtain recognition results for the four types of Kinect images, namely the IR image, the Registration-of-RGB image, the RGB image and the Depth image;
4.3: identify the building area (illegal building) by fusing the recognition results of the four types of Kinect images.
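The patent leaves the fusion rule for the four per-modality results (step 4.3) unspecified; one simple possibility is a vote across the IR, Registration-of-RGB, RGB and Depth results. The threshold below is purely an assumption for illustration:

```python
def fuse_detections(flags, threshold=3):
    """Fuse per-modality booleans (IR, Registration-of-RGB, RGB, Depth):
    report a violation only when at least `threshold` of the four
    modalities agree. The threshold of 3 is our assumption; the patent
    leaves the fusion rule open."""
    return sum(bool(f) for f in flags) >= threshold
```

Requiring agreement from a majority of modalities trades a little recall for robustness: a spurious detection in a single noisy channel (for example the IR image at dusk) cannot trigger a violation report on its own.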
In a specific implementation, step 3.1 obtains the value of the sum of squared errors SSE as follows:
During training, YOLOv3 divides the image into S×S grids, and each grid predicts B detection boxes and their confidence Conf(Object) according to formulas (1), (2) and (3):
Conf(Object) = Pr(Object) × IOU (1),
Pr(Object) = 1 if an object falls into the grid corresponding to the candidate box, and 0 otherwise (2),
IOU = area(box(Pred) ∩ box(Truth)) / area(box(Pred) ∪ box(Truth)) (3),
where box(Pred) is the prediction box, box(Truth) is the real (ground-truth) box, and area(·) denotes area; the confidence Conf(Object) represents the confidence level of the detected object.
Each detection box contains 5 parameters: x, y, w, h and Conf(Object), where (x, y) is the offset of the box center from the grid position and (w, h) are the width and height of the box.
Each grid also predicts C class probabilities Pr(Class_i|Object), the probability that the object falling into grid i belongs to a given class; the final output is an S×S×[B×(4+1+C)]-dimensional tensor. The loss function of YOLOv3 is characterized by formula (4):
loss = Err_coord + Err_iou + Err_cls (4),
where Err_coord is the coordinate error, Err_iou is the IOU error and Err_cls is the classification error:
Err_coord = λ_coord Σ_{i=0}^{S²} Σ_{j=0}^{B} 1_{ij}^{obj} [(x_i − x̂_i)² + (y_i − ŷ_i)²] + λ_coord Σ_{i=0}^{S²} Σ_{j=0}^{B} 1_{ij}^{obj} [(√w_i − √ŵ_i)² + (√h_i − √ĥ_i)²] (5),
Err_iou = Σ_{i=0}^{S²} Σ_{j=0}^{B} 1_{ij}^{obj} (C_i − Ĉ_i)² + λ_noobj Σ_{i=0}^{S²} Σ_{j=0}^{B} 1_{ij}^{noobj} (C_i − Ĉ_i)² (6),
Err_cls = Σ_{i=0}^{S²} 1_i^{obj} Σ_{c∈classes} (p_i(c) − p̂_i(c))² (7),
where:
λ_coord is the weight parameter of Err_coord, λ_coord = 5; λ_noobj is the correction parameter of Err_iou, λ_noobj = 0.5;
x̂_i, ŷ_i, ŵ_i and ĥ_i are the x, y, w and h parameters of the real box corresponding to grid i, and (x_i − x̂_i)², (y_i − ŷ_i)², (√w_i − √ŵ_i)² and (√h_i − √ĥ_i)² are the corresponding parameter errors of grid i;
C_i is the predicted confidence Conf(Object) of grid i, Ĉ_i is its true confidence, and (C_i − Ĉ_i)² is the confidence error of grid i;
p_i(c) is the predicted probability Pr(Class_i|Object) that the target falling into grid i belongs to class c, and p̂_i(c) is the corresponding true probability;
1_i^{obj} indicates whether a target falls into grid i: 1 if so, 0 otherwise;
1_{ij}^{obj} indicates whether a target falls into the j-th prediction box of grid i: 1 if so, 0 otherwise.
A group of initial candidate boxes with fixed sizes and aspect ratios is introduced into YOLOv3 for the target detection process; the K-means clustering algorithm performs cluster analysis on the manually labeled target boxes in the training set obtained in step 1, so as to find the optimal k value (the number of initial candidate boxes) and the width-height dimensions of the k cluster centers, which serve as the candidate-box parameters in the network configuration file.
The k value is determined from the sum of squared errors SSE of formula (8) together with the elbow method:
SSE = Σ_{i=1}^{k} Σ_{p∈Cl_i} |p − m_i|² (8),
where Cl_i is the i-th cluster, p is a sample point in Cl_i, m_i is the centroid of Cl_i (the mean of all samples in Cl_i), and SSE is the clustering error of all samples, which characterizes how good the clustering is. The core idea of the elbow method is: as k increases, the samples are divided more finely and SSE gradually decreases; once k reaches the optimal cluster number, further increases in k bring rapidly diminishing returns in clustering quality, shown by a sudden drop in the rate at which SSE decreases, so the plot of SSE against k takes the shape of an elbow, and the k value at the elbow is the required optimal cluster number.
In the K-means clustering, the Euclidean distance would ordinarily express the error between a sample point and the cluster mean; here the sample point is a prediction box and the cluster mean a real box, so the IOU is used instead to reflect the error between the prediction box and the real box: the larger the IOU, the smaller the error. The clustering error of the samples is calculated with formula (9):
SSE = Σ_{i=1}^{k} Σ_{p∈Cl_i} (1 − IOU_p)² (9),
where IOU_p is the IOU of sample point p and 1 − IOU_p represents the error of sample point p; from this, the SSE and k values are obtained.
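Formulas (8) and (9) can be sketched together: anchor clustering measures distance as 1 − IOU, which is computable from widths and heights alone when boxes are imagined to share a corner. This is a minimal illustration of the SSE computation, not the patent's code:

```python
def wh_iou(wh_a, wh_b):
    """IOU of two boxes aligned at a common corner, given only (w, h) —
    the standard trick for clustering anchor shapes."""
    inter = min(wh_a[0], wh_b[0]) * min(wh_a[1], wh_b[1])
    union = wh_a[0] * wh_a[1] + wh_b[0] * wh_b[1] - inter
    return inter / union

def sse_for_assignment(boxes, centers):
    """SSE with the (1 - IOU) distance of formula (9): each labeled box
    is charged the squared distance to its nearest cluster centre."""
    return sum(min((1 - wh_iou(b, c)) ** 2 for c in centers) for b in boxes)
```

Evaluating `sse_for_assignment` for each candidate k (after running K-means with that k) yields the SSE-versus-k curve whose elbow picks the anchor count; when every box coincides with a centre the SSE is exactly 0.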
Finally, the recognition result is sent to the system terminal and delivered to the user as an auxiliary diagnosis result, so as to identify the illegal building.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.
Claims (10)
1. An automatic illegal-building detection method based on YOLOv3, characterized by comprising the following steps:
Step 1: first obtain pictures of the building area with an image acquisition device, then perform image sharpness preprocessing on them with an image scanning device, and make a training set: select x sample groups from the data set, each containing y samples, where each sample consists of one RGB image picture and one depth image picture, giving 2×x×y sample pictures in total;
Step 2: copy each sample picture and rescale it to 300×225, 400×300, 500×375 and 600×450 respectively, obtaining four times the number of sample pictures;
Step 3: pre-train on the four-fold amplified sample pictures with Darknet-53, transfer the network parameters obtained from pre-training to the base network, and initialize it to obtain a transferred Darknet-53 model;
Step 4: cluster the manually labeled building-area boxes in the training set with the K-means clustering algorithm, set different k values, and record the corresponding values of the sum of squared errors (SSE);
Step 5: plot the SSE value against the k value; from this plot find the optimal k value with the elbow method, obtain the corresponding k cluster centers, and write them into the configuration file as the initial candidate-box parameters of YOLOv3;
Step 6: train on the training set obtained in Step 1 with the improved YOLOv3 to obtain the trained parameter model, and realize the identification of illegal buildings by fusing the recognition results of the four types of Kinect images.
2. The method for automatically detecting illegal building based on YOLOv3 as claimed in claim 1, wherein the training set in step 1 is made as follows:
1.1: a Kinect device is used as the image acquisition device to shoot four types of Kinect images: an IR image, a Registration of RGB image, an RGB image and a Depth image; each image is cut into a picture with a fixed position, a consistent angle and a known field of view; the resolution of the pictures is 640 × 480;
1.2: each shot picture is copied and proportionally rescaled to resolutions of 300 × 225, 400 × 300, 500 × 375 and 600 × 450, giving a four-fold augmented Kinect image data set;
1.3: a building area frame is manually marked on each picture of the four-fold augmented Kinect image data set to generate a label file;
1.4: the Kinect image data set and the label files are combined to form the training set.
3. The method for automatically detecting an illegal building based on YOLOv3 as claimed in claim 1, wherein step 6 further comprises, after the trained parameter model is obtained: calling the Kinect camera to simultaneously output four types of Kinect images, and identifying them with the parameter model to obtain the identification results of the four types of Kinect images; the four types of Kinect images are: the IR image, the Registration of RGB image, the RGB image and the Depth image.
4. The method for automatically detecting an illegal building based on YOLOv3 as claimed in claim 1, wherein the value of the sum of squared errors SSE in step 4 is obtained as follows: during training, YOLOv3 divides the image into S × S grids and, for each grid, predicts B detection frames and their confidence Conf(Object) according to formulas (1), (2) and (3);
Conf(Object) = Pr(Object) × IOU (1),
wherein:
Pr(Object) indicates whether an object falls into the grid corresponding to the candidate box, as shown in formula (2):
Pr(Object) = 1 if an object falls into the grid, and 0 otherwise (2);
IOU represents the ratio of the intersection area to the union area of the prediction frame and the real frame, as shown in formula (3):
IOU = area(box(Pred) ∩ box(Truth)) / area(box(Pred) ∪ box(Truth)) (3),
wherein box(Pred) denotes the prediction box, box(Truth) denotes the real box, and area(·) denotes an area;
the confidence Conf(Object) represents the confidence level of the detected object;
each detection box contains 5 parameters: x, y, w, h and Conf(Object); wherein (x, y) represents the offset of the center of the detection box from its grid cell, and (w, h) represents the width and height of the detection box;
each grid also predicts C class probabilities Pr(Class_i|Object), where Pr(Class_i|Object) represents the probability that the object falling into the grid belongs to class i; the final output is a tensor of dimension S × S × [B × (4 + 1 + C)]; the loss function of YOLOv3 is characterized by formula (4):
loss = Error_coord + Error_IOU + Error_cls (4),
wherein Error_coord is the coordinate error, Error_IOU is the IOU error and Error_cls is the classification error, given by formulas (5), (6) and (7):
Error_coord = λ_coord Σ_{i=0}^{S²} Σ_{j=0}^{B} 1_{ij}^{obj} [(x_i − x̂_i)² + (y_i − ŷ_i)²] + λ_coord Σ_{i=0}^{S²} Σ_{j=0}^{B} 1_{ij}^{obj} [(√w_i − √ŵ_i)² + (√h_i − √ĥ_i)²] (5),
Error_IOU = Σ_{i=0}^{S²} Σ_{j=0}^{B} 1_{ij}^{obj} (C_i − Ĉ_i)² + λ_noobj Σ_{i=0}^{S²} Σ_{j=0}^{B} 1_{ij}^{noobj} (C_i − Ĉ_i)² (6),
Error_cls = Σ_{i=0}^{S²} 1_i^{obj} Σ_{c∈classes} (p_i(c) − p̂_i(c))² (7),
wherein:
λ_coord is the weight parameter of Error_coord, λ_coord = 5; λ_noobj is the correction parameter of Error_IOU, λ_noobj = 0.5;
x̂_i represents the value of the x parameter of the real box corresponding to grid i, and (x_i − x̂_i)² the error of the x parameter of grid i;
ŷ_i represents the value of the y parameter of the real box corresponding to grid i, and (y_i − ŷ_i)² the error of the y parameter of grid i;
ŵ_i represents the value of the w parameter of the real box corresponding to grid i, and (√w_i − √ŵ_i)² the error of the w parameter of grid i;
ĥ_i represents the value of the h parameter of the real box corresponding to grid i, and (√h_i − √ĥ_i)² the error of the h parameter of grid i;
C_i represents the predicted confidence Conf(Object) of grid i; Ĉ_i represents the true confidence Conf(Object) of grid i, and (C_i − Ĉ_i)² the confidence error of grid i;
p_i(c) represents the predicted probability Pr(Class_i|Object) that the object falling into grid i belongs to class c; p̂_i(c) represents the corresponding true probability;
1_i^{obj} indicates whether a target falls into grid i: it is 1 if a target falls into grid i and 0 otherwise; 1_{ij}^{obj} and 1_{ij}^{noobj} likewise indicate whether the j-th detection box of grid i is, or is not, responsible for a target.
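Formulas (1)–(3) can be sketched directly. The corner-coordinate box format (x1, y1, x2, y2) below is an assumption for illustration; the claims describe boxes by center offset and width/height:

```python
# Illustrative implementation of formulas (1)-(3). Boxes are given as
# (x1, y1, x2, y2) corner coordinates, an assumed format for the example.
def iou(box_a, box_b):
    """Intersection-over-union of two boxes, formula (3)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def confidence(object_in_grid, pred_box, truth_box):
    """Conf(Object) = Pr(Object) x IOU, formulas (1) and (2)."""
    pr_object = 1.0 if object_in_grid else 0.0
    return pr_object * iou(pred_box, truth_box)
```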
5. The method for automatically detecting an illegal building based on YOLOv3 as claimed in claim 1, wherein in step 4 YOLOv3 introduces a group of initial candidate frames of fixed size and aspect ratio into the target detection process; a K-means clustering algorithm performs cluster analysis on the manually marked target frames of the training set obtained in step 1 to find the optimal k value, which represents the number of initial candidate frames, and the width-height dimensions of the k cluster centers are used as the candidate frame parameters in the network configuration file;
the k value is determined from the sum of squared errors SSE and the elbow method according to formula (8):
SSE = Σ_{i=1}^{k} Σ_{p∈Cl_i} |p − m_i|² (8),
wherein Cl_i is the i-th cluster, p is a sample point of Cl_i, m_i is the center of gravity of Cl_i (the mean value of all samples in Cl_i), and SSE is the clustering error of all samples, which characterizes how good the clustering effect is; the core idea of the elbow method is: as the k value increases, the sample division becomes finer and SSE gradually decreases; once k reaches the optimal number of clusters, further increasing k yields rapidly diminishing returns in clustering quality, reflected in a sudden drop of the rate at which SSE decreases; the graph of SSE against k therefore shows an elbow shape, and the k value at the elbow is the optimal number of clusters.
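A minimal pure-Python sketch (not from the patent) of computing the SSE of formula (8) for a range of k values, as the elbow method requires. The fixed iteration count and random initialization are simplifying assumptions:

```python
# Illustrative k-means returning the SSE of formula (8); scanning several k
# values produces the SSE-vs-k curve used by the elbow method. Fixed-iteration
# Lloyd's algorithm with random initialization is an assumption for brevity.
import random

def _nearest(p, centers):
    """Index of the center closest to point p (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])))

def kmeans_sse(points, k, iters=50, seed=0):
    """Run plain k-means and return the final SSE of formula (8)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[_nearest(p, centers)].append(p)
        for i, cl in enumerate(clusters):
            if cl:  # empty clusters keep their previous center
                centers[i] = [sum(v) / len(cl) for v in zip(*cl)]
    return sum(sum((a - b) ** 2 for a, b in zip(p, centers[_nearest(p, centers)]))
               for p in points)

def sse_curve(points, ks):
    """SSE for each candidate k; the elbow of this curve gives the optimal k."""
    return [kmeans_sse(points, k) for k in ks]
```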
6. The YOLOv3-based automatic construction-violation detection method according to claim 1, wherein in the K-means clustering of step 4, instead of the Euclidean distance between a sample point and the sample mean, the IOU is used to reflect the error between the prediction box (the sample point) and the real box (the sample mean); the larger the IOU, the smaller the error; the clustering error of the samples is then calculated with formula (9):
SSE = Σ_{i=1}^{k} Σ_{p∈Cl_i} (1 − IOU_p)² (9),
wherein IOU_p is the IOU of sample point p and 1 − IOU_p represents the error of sample point p; from this the SSE and k values are obtained.
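The 1 − IOU error of formula (9) can be sketched for width-height boxes. Aligning both boxes at a common top-left corner is the usual convention for anchor clustering and is assumed here, as the claims do not spell it out:

```python
# Illustrative 1 - IOU clustering error of formula (9) for (w, h) boxes.
# Both boxes are assumed to share a common top-left corner, the usual
# convention when clustering anchor widths and heights.
def wh_iou(box, centroid):
    """IOU of two (w, h) boxes anchored at the same corner."""
    w1, h1 = box
    w2, h2 = centroid
    inter = min(w1, w2) * min(h1, h2)
    union = w1 * h1 + w2 * h2 - inter
    return inter / union

def anchor_distance(box, centroid):
    """d = 1 - IOU, the per-sample clustering error of formula (9)."""
    return 1.0 - wh_iou(box, centroid)
```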
7. The method for automatically detecting an illegal building according to claim 1, wherein in step 6 the identification result is sent to the system terminal and presented to the user as an auxiliary diagnosis result for final confirmation.
8. An automatic illegal-building detection system based on YOLOv3, characterized by comprising:
an image acquisition device for shooting the four types of Kinect images and uploading them to the detection system;
an image scanning device for performing image-sharpness preprocessing on the images shot by the image acquisition device and sending them to the detection system;
the detection system, which acquires the Kinect images, preprocesses and labels them, and judges with a YOLOv3 neural network whether they show a violation building;
and a system terminal, which receives the judgment produced by the detection system and displays it to the user as an auxiliary judgment result.
9. The YOLOv3-based automatic illegal-building detection system as recited in claim 8, wherein the image capturing device is a Kinect device, and the four types of Kinect images comprise an IR image, a Registration of RGB image, an RGB image and a Depth image; the resolution of the pictures is 640 × 480; the image preprocessing copies each shot picture and proportionally rescales it to resolutions of 300 × 225, 400 × 300, 500 × 375 and 600 × 450, giving a four-fold augmented Kinect image data set; and a building area frame is manually marked on each picture of the four-fold augmented Kinect image data set to generate a label file.
10. The YOLOv3-based automatic illegal-building detection system according to claim 8, wherein the image scanning device is based on a PC on which a Retinex image enhancement algorithm implemented in Matlab710 is installed.
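Claim 10 names a Retinex image enhancement algorithm but gives no detail. Below is a hedged single-scale-Retinex sketch, not the patent's Matlab implementation: the box-blur surround is a simplification of the Gaussian surround used in classic SSR, and the +1 offsets merely avoid log(0):

```python
# Hypothetical single-scale Retinex sketch: R(x, y) = log I(x, y) - log S(x, y),
# where S is a blurred "surround" of the image. A naive box blur stands in for
# the Gaussian surround of classic SSR; this is an assumption for illustration.
import math

def box_blur(img, r=1):
    """Average each pixel over a (2r+1)x(2r+1) window, clipped at the edges."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            vals = [img[y][x]
                    for y in range(max(0, i - r), min(h, i + r + 1))
                    for x in range(max(0, j - r), min(w, j + r + 1))]
            out[i][j] = sum(vals) / len(vals)
    return out

def single_scale_retinex(img):
    """Log-difference of the image and its blurred surround (+1 avoids log 0)."""
    surround = box_blur(img)
    return [[math.log(p + 1.0) - math.log(s + 1.0)
             for p, s in zip(row, srow)]
            for row, srow in zip(img, surround)]
```

On a uniform image the surround equals the image itself, so the enhancement output is zero everywhere, which is a quick sanity check for the implementation.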
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910956709.1A CN110852164A (en) | 2019-10-10 | 2019-10-10 | YOLOv 3-based method and system for automatically detecting illegal building |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110852164A true CN110852164A (en) | 2020-02-28 |
Family
ID=69596513
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910956709.1A Pending CN110852164A (en) | 2019-10-10 | 2019-10-10 | YOLOv 3-based method and system for automatically detecting illegal building |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110852164A (en) |
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109325454A (en) * | 2018-09-28 | 2019-02-12 | 合肥工业大学 | A kind of static gesture real-time identification method based on YOLOv3 |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111507296A (en) * | 2020-04-23 | 2020-08-07 | 嘉兴河图遥感技术有限公司 | Intelligent illegal building extraction method based on unmanned aerial vehicle remote sensing and deep learning |
CN111709310A (en) * | 2020-05-26 | 2020-09-25 | 重庆大学 | Gesture tracking and recognition method based on deep learning |
CN111709310B (en) * | 2020-05-26 | 2024-02-02 | 重庆大学 | Gesture tracking and recognition method based on deep learning |
CN112215190A (en) * | 2020-10-21 | 2021-01-12 | 南京智慧航空研究院有限公司 | Illegal building detection method based on YOLOV4 model |
CN112215189A (en) * | 2020-10-21 | 2021-01-12 | 南京智慧航空研究院有限公司 | Accurate detecting system for illegal building |
CN113011405A (en) * | 2021-05-25 | 2021-06-22 | 南京柠瑛智能科技有限公司 | Method for solving multi-frame overlapping error of ground object target identification of unmanned aerial vehicle |
CN113420716A (en) * | 2021-07-16 | 2021-09-21 | 南威软件股份有限公司 | Improved Yolov3 algorithm-based violation behavior recognition and early warning method |
CN113420716B (en) * | 2021-07-16 | 2023-07-28 | 南威软件股份有限公司 | Illegal behavior identification and early warning method based on improved Yolov3 algorithm |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110852164A (en) | YOLOv 3-based method and system for automatically detecting illegal building | |
CN111222574B (en) | Ship and civil ship target detection and classification method based on multi-model decision-level fusion | |
CN113822247B (en) | Method and system for identifying illegal building based on aerial image | |
CN104978567B (en) | Vehicle checking method based on scene classification | |
CN108804992B (en) | Crowd counting method based on deep learning | |
CN113449632B (en) | Vision and radar perception algorithm optimization method and system based on fusion perception and automobile | |
CN112801227B (en) | Typhoon identification model generation method, device, equipment and storage medium | |
CN115512247A (en) | Regional building damage grade assessment method based on image multi-parameter extraction | |
CN115272876A (en) | Remote sensing image ship target detection method based on deep learning | |
CN113313107A (en) | Intelligent detection and identification method for multiple types of diseases on cable surface of cable-stayed bridge | |
CN115880260A (en) | Method, device and equipment for detecting base station construction and computer readable storage medium | |
CN115409814A (en) | Photovoltaic module hot spot detection method and system based on fusion image | |
CN110826364B (en) | Library position identification method and device | |
Li et al. | Hybrid cloud detection algorithm based on intelligent scene recognition | |
CN110765900B (en) | Automatic detection illegal building method and system based on DSSD | |
CN111881833B (en) | Vehicle detection method, device, equipment and storage medium | |
CN117437470A (en) | Fire hazard level assessment method and system based on artificial intelligence | |
CN114463628A (en) | Deep learning remote sensing image ship target identification method based on threshold value constraint | |
CN114494850A (en) | Village unmanned courtyard intelligent identification method and system | |
CN113963230A (en) | Parking space detection method based on deep learning | |
CN113793069A (en) | Urban waterlogging intelligent identification method of deep residual error network | |
CN113239962A (en) | Traffic participant identification method based on single fixed camera | |
CN115359346B (en) | Small micro-space identification method and device based on street view picture and electronic equipment | |
CN112767469B (en) | Highly intelligent acquisition method for urban mass buildings | |
CN115880644B (en) | Method and system for identifying coal quantity based on artificial intelligence |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20200228 ||