CN112215189A - Accurate detecting system for illegal building - Google Patents

Accurate detecting system for illegal building

Info

Publication number
CN112215189A
CN112215189A (application CN202011133537.7A)
Authority
CN
China
Prior art keywords
picture
module
training
frame
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011133537.7A
Other languages
Chinese (zh)
Inventor
王也
周龙
汤淼
葛家明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Smart Aviation Research Institute Co ltd
Original Assignee
Nanjing Smart Aviation Research Institute Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Smart Aviation Research Institute Co ltd filed Critical Nanjing Smart Aviation Research Institute Co ltd
Priority to CN202011133537.7A priority Critical patent/CN112215189A/en
Publication of CN112215189A publication Critical patent/CN112215189A/en
Pending legal-status Critical Current

Classifications

    • G06V20/176 Terrestrial scenes: urban or other man-made structures
    • G06F18/214 Pattern recognition: generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F18/22 Pattern recognition: matching criteria, e.g. proximity measures
    • G06F18/23213 Pattern recognition: non-hierarchical clustering with a fixed number of clusters, e.g. K-means clustering
    • G06N3/045 Neural networks: combinations of networks
    • G06N3/084 Neural network learning methods: backpropagation, e.g. using gradient descent
    • G06Q50/08 ICT adapted to specific business sectors: construction
    • G06Q50/26 ICT adapted to specific business sectors: government or public services
    • G06V20/41 Video scenes: higher-level, semantic clustering, classification or understanding
    • G06V2201/07 Indexing scheme: target detection


Abstract

The invention belongs to the technical field of image-recognition target detection, and particularly relates to an accurate detection system for illegal buildings, comprising: a picture acquisition module for acquiring pictures of illegal buildings from the video of a preselected area; a marking module that marks the pictures and obtains prior frames from the marked pictures; a dividing module for dividing the data set composed of the pictures into a training set and a test set; a preprocessing module for preprocessing the training set; a training module for training a YOLOV4 model on the preprocessed training set; an adjusting module for adjusting the trained YOLOV4 model according to the test set; and a detection module that inputs video into the adjusted YOLOV4 model to detect illegal buildings in the video, so that illegal buildings can be identified quickly and accurately.

Description

Accurate detecting system for illegal building
Technical Field
The invention belongs to the technical field of image recognition target detection, and particularly relates to an accurate detection system for illegal buildings.
Background
With the continuous advance of urbanization, housing values have risen accordingly, and illegal buildings have appeared in large numbers, whether newly added on occupied land or created by altering an original building design. Because additions on building roofs are well concealed, such illegal buildings are hard to discover, which has long made demolition work difficult; limits on manpower and information channels further prevent illegal buildings from being found and demolished quickly.
At present, the excellent performance of deep learning in target detection and the spread of unmanned aerial vehicles offer a powerful way to find and remove illegal buildings on rooftops. However, most existing algorithms are optimized on public data sets such as ImageNet, COCO and VOC, and no standardized method has yet formed for identifying specific target objects in specific scenes, such as illegal buildings in video shot by low-altitude unmanned aerial vehicles. Because the UAV flies at a relatively high altitude, each image has a large pixel size, which complicates the recognition of illegal buildings, and samples of illegal buildings shot by UAVs are difficult to obtain.
In illegal-building identification, two methods are mainly used: one based on image comparison and one based on image recognition, as follows:
The image-recognition-based method relies mainly on the strong performance of deep-learning models: a model is trained on a labeled data set, and information about illegal buildings is obtained by feeding it new aerial data.
With the image-comparison-based method, human judgment may still be needed after a difference is identified, and some current image-recognition methods are too inefficient to be suitable for video. Therefore, an efficient and feasible scheme for detecting target objects in video shot by low-altitude UAVs is urgently needed.
Therefore, a new illegal building accurate detection system needs to be designed based on the technical problems.
Disclosure of Invention
The invention aims to provide an accurate detection system for illegal buildings.
In order to solve the technical problem, the invention provides a system for detecting illegal buildings, which comprises:
the picture acquisition module is used for acquiring a picture of the illegal building according to the video of the preselected area;
the marking module marks the picture and acquires a prior frame according to the marked picture;
the dividing module is used for dividing a data set consisting of pictures into a training set and a test set;
the preprocessing module is used for preprocessing the training set;
the training module is used for training the Yolov4 model according to the preprocessed training set;
the adjusting module is used for adjusting the trained Yolov4 model according to the test set; and
and the detection module inputs the video into the adjusted YOLOV4 model to detect the illegal buildings in the video.
Further, the picture acquisition module is adapted to acquire pictures of the illegal building according to the video of the preselected area, that is:
a video of the preselected area is shot, video segments containing illegal buildings are selected to obtain pictures of the illegal buildings, and the resolution of the pictures is adjusted to a preset resolution.
Further, the labeling module is adapted to label the picture and obtain the prior frames according to the labeled picture, that is:
Marking the position of the illegal building in the picture after resolution adjustment, acquiring a target frame, and acquiring the length, width and position of the target frame in the marking data;
taking the marked object type, the length and width of the target frame and the position as labels of the picture, and normalizing the length and width of the target frame:
w_r = w / W;
h_r = h / H;
wherein w_r is the width of the target frame after normalization; h_r is the height of the target frame after normalization; w is the width of the target frame; h is the height of the target frame; W is the width of the picture; H is the height of the picture;
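As a minimal sketch (the function name is illustrative, not from the patent), the normalization of a target frame's width and height by the picture's dimensions can be written as:

```python
def normalize_box(w, h, img_w, img_h):
    """Normalize a target frame's width/height by the picture's
    width W and height H: w_r = w / W, h_r = h / H."""
    return w / img_w, h / img_h
```

For example, a 256 x 128 target frame in a 1024 x 1024 picture normalizes to (0.25, 0.125).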
initializing the number of categories and the cluster centers of the prior frames, and calculating the IOU-based distance d between each target frame and all the cluster centers:
d = 1 - IOU, with IOU = in / (un - in);
in = min(h_1, h_2) · min(w_1, w_2);
un = h_1·w_1 + h_2·w_2;
wherein in is the intersection of the two target frames; un is the sum of the areas of the two target frames (so that un - in is their union); h_1 and w_1 are the height and width of one target frame; h_2 and w_2 are the height and width of the other target frame;
selecting the nearest clustering center as the category of the target frame;
the mean value of each cluster is then taken as the cluster center for the next iteration, until the change of each category's center position between two successive iterations falls within the error tolerance, and the final cluster centers are taken as the prior frames.
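A hedged sketch of the IOU computed from the in/un quantities defined above (the intersection formula assumes the two frames share a common origin, as is conventional in anchor clustering; the function name is illustrative):

```python
def iou_wh(w1, h1, w2, h2):
    """IOU of two target frames given only widths and heights,
    following in = min(h1, h2) * min(w1, w2) and un = h1*w1 + h2*w2,
    so that IOU = in / (un - in)."""
    inter = min(h1, h2) * min(w1, w2)   # overlap of frames aligned at a common origin
    un = h1 * w1 + h2 * w2              # sum of the two frame areas
    return inter / (un - inter)         # un - inter is the union of the frames
```

The clustering distance is then 1 - IOU, so identical frames sit at distance 0 and dissimilar shapes sit farther apart.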
Further, the partitioning module is adapted to partition the data set consisting of the pictures into a training set and a test set, that is:
the data set consisting of the resolution-adjusted pictures is divided into a training set and a test set at a preset ratio.
Further, the preprocessing module is adapted to preprocess the training set, that is, to perform data enhancement on it:
a preset number of pictures are selected from the training set each time; the pictures are flipped, scaled and color-gamut-shifted, and placed at preset positions so that the pictures and their frames are combined;
the class numbers of the pictures are encoded as one-hot codes, and the labels are processed accordingly.
Further, the training module is adapted to train the YOLOV4 model according to the preprocessed training set, that is:
parameters of a YOLOV4 model pre-trained on the COCO data set are obtained, and the YOLOV4 network is initialized with those parameters;
inputting data and prior frames in a training set, carrying out forward propagation of a YOLOV4 network, and calculating a loss value between a prediction result and a real label according to a loss function of a YOLOV4 model, namely
Obtaining a regression optimization loss value CIOU of a YOLOV4 model:
CIOU = IOU - ρ²(b, b^gt) / c² - α·v;
v = (4 / π²) · (arctan(w^gt / h^gt) - arctan(w^p / h^p))²;
α = v / ((1 - IOU) + v);
where ρ²(b, b^gt) is the squared Euclidean distance between the center points of the prediction frame and the real frame; c is the diagonal length of the smallest enclosing region containing both the prediction frame and the real frame; w^p and h^p are the width and height of the prediction frame; w^gt and h^gt are the width and height of the real frame;
the loss value Loss_CIOU is:
Loss_CIOU = 1 - CIOU;
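A self-contained sketch of the CIOU regression loss for boxes given as (center x, center y, width, height) tuples. This is pure Python for illustration only; a real YOLOV4 implementation computes the same quantity over tensors, and the epsilon guard is an assumption added for numerical safety:

```python
import math

def ciou_loss(pred, gt):
    """CIOU regression loss, Loss = 1 - CIOU, for boxes (cx, cy, w, h)."""
    (px, py, pw, ph), (gx, gy, gw, gh) = pred, gt
    # Corner coordinates of prediction and ground-truth boxes.
    p_x1, p_y1, p_x2, p_y2 = px - pw/2, py - ph/2, px + pw/2, py + ph/2
    g_x1, g_y1, g_x2, g_y2 = gx - gw/2, gy - gh/2, gx + gw/2, gy + gh/2
    # Intersection and union areas.
    iw = max(0.0, min(p_x2, g_x2) - max(p_x1, g_x1))
    ih = max(0.0, min(p_y2, g_y2) - max(p_y1, g_y1))
    inter = iw * ih
    union = pw * ph + gw * gh - inter
    iou = inter / union
    # Squared center distance and squared diagonal of the smallest enclosing box.
    rho2 = (px - gx) ** 2 + (py - gy) ** 2
    cw = max(p_x2, g_x2) - min(p_x1, g_x1)
    ch = max(p_y2, g_y2) - min(p_y1, g_y1)
    c2 = cw ** 2 + ch ** 2
    # Aspect-ratio consistency term v and its trade-off weight alpha.
    v = (4 / math.pi ** 2) * (math.atan(gw / gh) - math.atan(pw / ph)) ** 2
    alpha = v / ((1 - iou) + v + 1e-9)  # epsilon avoids division by zero
    ciou = iou - rho2 / c2 - alpha * v
    return 1 - ciou
```

For a perfect prediction the loss is zero, and any center offset or shape mismatch raises it, which is what the network's weight and bias updates minimize.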
and adjusting the weight and the bias of the YOLOV4 network according to the loss value, completing one iteration of the YOLOV4 network, and circulating until an early stop condition or the maximum iteration number is reached so as to complete the training of the YOLOV4 model.
Further, the adjusting module is adapted to adjust the trained YOLOV4 model according to the test set, that is:
the data in the test set are input into the trained YOLOV4 model to obtain test results, and the trained YOLOV4 model is adjusted according to those results.
Further, the detection module is adapted to input the video into the adjusted YOLOV4 model to detect illegal buildings in the video, that is:
the video is input into the adjusted YOLOV4 model, the picture of each frame is analyzed predictively, the illegal buildings in the picture are detected, and the picture is marked.
The invention has the advantages that the picture acquisition module obtains pictures of illegal buildings from the video of a preselected area; the marking module marks the pictures and obtains prior frames from the marked pictures; the dividing module divides the data set composed of the pictures into a training set and a test set; the preprocessing module preprocesses the training set; the training module trains the YOLOV4 model on the preprocessed training set; the adjusting module adjusts the trained YOLOV4 model according to the test set; and the detection module inputs video into the adjusted YOLOV4 model to detect illegal buildings in the video, so that illegal buildings are identified quickly and accurately.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and drawings.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a flow chart of an accurate detection system for illegal buildings according to the present invention;
fig. 2 is a schematic diagram of the results of data enhancement in accordance with the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a flow chart of the accurate detection system for illegal buildings according to the present invention.
As shown in fig. 1, the present embodiment provides a violation building detection system, including: the picture acquisition module is used for acquiring a picture of the illegal building according to the video of the preselected area; the marking module marks the picture and acquires a prior frame according to the marked picture; the dividing module is used for dividing a data set consisting of pictures into a training set and a test set; the preprocessing module is used for preprocessing the training set; the training module is used for training the Yolov4 model according to the preprocessed training set; the adjusting module is used for adjusting the trained Yolov4 model according to the test set; and the detection module inputs the video into the adjusted YOLOV4 model to detect the illegal buildings in the video, so that the illegal buildings are quickly and accurately identified.
In this embodiment, the picture acquisition module is adapted to acquire pictures of illegal buildings according to the video of the preselected area: an unmanned aerial vehicle shoots the selected (preselected) area, and video segments containing illegal buildings are selected for processing to obtain pictures of the illegal buildings, whose resolution is then adjusted to a preset value. Because the high-definition pictures shot by the UAV have too many pixels for efficient subsequent processing, the picture resolution is reduced to a preset resolution (for example, 1024 × 1024).
In this embodiment, the labeling module is adapted to label the picture and obtain the prior frames according to the labeled picture; that is, LabelImg is used to
mark the position of the illegal building in the resolution-adjusted picture, a target frame is acquired, and the length, width and position of the target frame are recorded in the labeling data;
the labeled object category and the target frame's length, width and position are taken as the picture's label, and the target frame's length and width are normalized (since target frames differ in size across scenes of different scales, they are normalized by the picture's height and width according to the following formulas):
w_r = w / W;
h_r = h / H;
wherein w_r is the width of the target frame after normalization; h_r is the height of the target frame after normalization; w is the width of the target frame; h is the height of the target frame; W is the width of the picture; H is the height of the picture;
initializing the number of categories and the cluster centers of the prior frames, and calculating the IOU-based distance d between each target frame and all the cluster centers:
d = 1 - IOU, with IOU = in / (un - in);
in = min(h_1, h_2) · min(w_1, w_2);
un = h_1·w_1 + h_2·w_2;
wherein in is the intersection of the two target frames; un is the sum of the areas of the two target frames (so that un - in is their union); h_1 and w_1 are the height and width of one target frame; h_2 and w_2 are the height and width of the other target frame;
selecting the nearest clustering center as the category of the target frame;
taking the mean value of each cluster as a clustering center of the next iteration;
the distance between each target frame and all the cluster centers is computed repeatedly, and the mean of each cluster is taken as the cluster center for the next iteration, until the change of each category's center position between two successive iterations is within ε; the final cluster centers are the prior frames, and 9 prior frames are obtained through K-Means clustering.
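The K-Means anchor clustering described above can be sketched as follows. This is a pure-Python illustration under stated assumptions: the function and parameter names are not from the patent, boxes are normalized (w, h) pairs, and a production pipeline would typically operate on arrays rather than Python lists:

```python
import random

def kmeans_anchors(boxes, k=9, eps=1e-6, seed=0, max_iter=100):
    """K-Means over (w, h) target frames with 1 - IOU as the distance,
    iterating until every center moves less than eps."""
    def iou(a, b):
        # Intersection assumes frames aligned at a common origin.
        inter = min(a[0], b[0]) * min(a[1], b[1])
        return inter / (a[0] * a[1] + b[0] * b[1] - inter)

    rng = random.Random(seed)
    centers = rng.sample(boxes, k)  # initialize cluster centers from the data
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            # Assign each target frame to the nearest center (max IOU = min distance).
            i = max(range(k), key=lambda j: iou(b, centers[j]))
            clusters[i].append(b)
        # The mean of each cluster becomes the next iteration's center;
        # an empty cluster keeps its previous center.
        new = [
            (sum(b[0] for b in c) / len(c), sum(b[1] for b in c) / len(c)) if c
            else centers[i]
            for i, c in enumerate(clusters)
        ]
        if all(abs(n[0] - o[0]) < eps and abs(n[1] - o[1]) < eps
               for n, o in zip(new, centers)):
            break
        centers = new
    return centers
```

With k = 9 this yields the nine prior frames mentioned above.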
In this embodiment, the dividing module is adapted to divide the data set formed by the resolution-adjusted pictures into a training set and a test set at a preset ratio; according to the object categories in the labels of the original data, the data set is divided into a training set and a test set at a ratio of 8:2.
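A minimal sketch of the 8:2 split (the function name and the fixed shuffle seed are illustrative assumptions, not specified by the patent):

```python
import random

def split_dataset(samples, train_ratio=0.8, seed=42):
    """Shuffle the picture data set and split it into a training set
    and a test set at the preset ratio (8:2 by default)."""
    items = list(samples)
    random.Random(seed).shuffle(items)  # deterministic shuffle for reproducibility
    cut = int(len(items) * train_ratio)
    return items[:cut], items[cut:]
```

In practice the split would be stratified per object category, as the embodiment describes, so that each class appears in both sets.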
Fig. 2 is a schematic diagram of the results of data enhancement in accordance with the present invention.
In this embodiment, the preprocessing module is adapted to preprocess the training set, that is, to perform Mosaic data enhancement on it: a preset number of pictures (for example, four) are read from the training set each time; each picture is flipped, scaled and color-gamut-shifted, and the pictures are placed at preset positions (for example, the four quadrants) so that the pictures and their frames are combined (as shown in fig. 2); the class numbers of the pictures are encoded as one-hot codes, and the labels are smoothed with the Label-smoothing method, thereby improving the generalization ability of the YOLOV4 model.
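The one-hot encoding and Label-smoothing steps can be sketched in a few lines (the helper names are illustrative, and eps = 0.1 is a common smoothing value, not one stated in the patent):

```python
def one_hot(index, num_classes):
    """Encode a class number as a one-hot vector."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]

def label_smooth(onehot, eps=0.1):
    """Label smoothing: soften hard 0/1 targets toward a uniform
    distribution, which helps the model generalize."""
    n = len(onehot)
    return [v * (1 - eps) + eps / n for v in onehot]
```

The smoothed vector still sums to 1 and keeps its maximum at the true class, but no longer asks the network for absolute certainty.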
In this embodiment, the training module is adapted to train the YOLOV4 model according to a pre-processed training set, i.e. the training module is adapted to train the YOLOV4 model according to a pre-processed training set
Obtaining parameters of a YOLOV4 model pre-trained on a COCO data set (e.g., downloaded from a related website), and initializing a YOLOV4 network according to the parameters;
inputting data and prior frames in a training set, carrying out forward propagation of a YOLOV4 network, and calculating a loss value between a prediction result and a real label according to a loss function of a YOLOV4 model, namely
Obtaining a regression optimization loss value CIOU of a YOLOV4 model:
CIOU = IOU - ρ²(b, b^gt) / c² - α·v;
v = (4 / π²) · (arctan(w^gt / h^gt) - arctan(w^p / h^p))²;
α = v / ((1 - IOU) + v);
where ρ²(b, b^gt) is the squared Euclidean distance between the center points of the prediction frame (the predicted output of the YOLOV4 model) and the real frame (the true target frame); c is the diagonal length of the smallest enclosing region containing both frames; w^p and h^p are the width and height of the prediction frame; w^gt and h^gt are the width and height of the real frame;
the loss value Loss_CIOU is:
Loss_CIOU = 1 - CIOU;
and adjusting the weight and the bias of the YOLOV4 network by using a back propagation algorithm according to the loss value, completing one iteration of the YOLOV4 network, and circulating until an early stop condition or the maximum iteration number is reached so as to complete the training of the YOLOV4 model.
In this embodiment, the adjusting module is adapted to adjust the trained YOLOV4 model according to the test set, that is, the data in the test set are input into the trained YOLOV4 model (a trained mixed model) to obtain test results, and the trained YOLOV4 model is adjusted according to those results.
In this embodiment, the detection module is adapted to input the video into the adjusted YOLOV4 model to detect illegal buildings in the video; that is, the video is input into the adjusted YOLOV4 model, the picture of each frame is analyzed predictively, the illegal buildings in the picture are detected and marked, and the marked frames are then assembled into a video for display.
In this embodiment, a set of targeted training data is acquired by UAV aerial photography; the data-processing techniques described above improve the recognition accuracy of the YOLOV4 model, and the YOLOV4 target recognition model performs well in both speed and accuracy.
In conclusion, the picture of the illegal building is obtained through the picture obtaining module according to the video of the preselected area; the marking module marks the picture and acquires a prior frame according to the marked picture; the dividing module is used for dividing a data set consisting of pictures into a training set and a test set; the preprocessing module is used for preprocessing the training set; the training module is used for training the Yolov4 model according to the preprocessed training set; the adjusting module is used for adjusting the trained Yolov4 model according to the test set; and the detection module inputs the video into the adjusted YOLOV4 model to detect the illegal buildings in the video, so that the illegal buildings are quickly and accurately identified.
In the embodiments provided in the present application, it should be understood that the disclosed system may be implemented in other ways. The apparatus embodiments described above are merely illustrative and, for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, the functional modules in the embodiments of the present invention may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to perform all or part of the steps of the functions described in the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In light of the foregoing description of the preferred embodiment of the present invention, many modifications and variations will be apparent to those skilled in the art without departing from the spirit and scope of the invention. The technical scope of the present invention is not limited to the content of the specification, and must be determined according to the scope of the claims.

Claims (8)

1. A violation building detection system, comprising:
the picture acquisition module is used for acquiring a picture of the illegal building according to the video of the preselected area;
the marking module marks the picture and acquires a prior frame according to the marked picture;
the dividing module is used for dividing a data set consisting of pictures into a training set and a test set;
the preprocessing module is used for preprocessing the training set;
the training module is used for training the Yolov4 model according to the preprocessed training set;
the adjusting module is used for adjusting the trained Yolov4 model according to the test set; and
and the detection module inputs the video into the adjusted YOLOV4 model to detect the illegal buildings in the video.
2. The illegal building detection system of claim 1, wherein
the picture acquisition module acquires pictures of illegal buildings from video of the preselected area by:
capturing video of the preselected area, selecting the video segments in which illegal buildings appear to obtain pictures containing the illegal buildings, and adjusting the resolution of the pictures to a preset resolution.
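The resolution adjustment in claim 2 can be illustrated with a minimal nearest-neighbour resize; this sketch is not from the patent, and the 608x608 target size is a hypothetical choice since the claim only specifies "a preset resolution":

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    """Resize an image array to a preset resolution via nearest-neighbour sampling."""
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h  # source row index for each output row
    cols = np.arange(out_w) * w // out_w  # source column index for each output column
    return img[rows][:, cols]

# e.g. bringing an extracted 1080p video frame to a hypothetical preset 608x608
frame = np.zeros((1080, 1920, 3), dtype=np.uint8)
resized = resize_nearest(frame, 608, 608)
```

In practice a library resampler with interpolation would be used; the point is only that every extracted frame is mapped to one fixed resolution before labeling.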
3. The illegal building detection system of claim 2, wherein
the labeling module labels the pictures and obtains prior frames from the labeled pictures by:
marking the positions of the illegal buildings in the resolution-adjusted pictures to obtain target frames, and recording the width, height and position of each target frame in the labeling data;
taking the labeled object class and the width, height and position of the target frame as the label of the picture, and normalizing the width and height of the target frame:
w_r = w / W;
h_r = h / H;
where w_r is the normalized target-frame width; h_r is the normalized target-frame height; w is the target-frame width; h is the target-frame height; W is the picture width; H is the picture height;
initializing the number of categories and the cluster centers of the prior frames, and computing the IOU-based distance between each target frame and every cluster center:
d = 1 - IOU, where IOU = in / un;
in = min(h1, h2) × min(w1, w2);
un = h1 × w1 + h2 × w2 - in;
where in is the intersection of the two target frames; un is their union; h1 and w1 are the height and width of one target frame; h2 and w2 are the height and width of the other;
assigning each target frame to the nearest cluster center as its category; and
taking the mean of each cluster as the cluster center for the next iteration, until the center position of each category changes by less than a preset error between two successive iterations, and taking the final cluster centers as the prior frames.
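The prior-frame clustering in claim 3 is the familiar k-means-on-boxes procedure with 1 - IOU as the distance. A minimal sketch under that reading (function names and the k=9 default are illustrative assumptions, not from the patent):

```python
import numpy as np

def iou_wh(box, centers):
    """IOU between one (w, h) box and k cluster centers, all anchored at the origin."""
    inter = np.minimum(box[0], centers[:, 0]) * np.minimum(box[1], centers[:, 1])
    union = box[0] * box[1] + centers[:, 0] * centers[:, 1] - inter
    return inter / union

def kmeans_priors(boxes, k=9, tol=1e-6, seed=0):
    """Cluster normalized (w, h) target frames into k prior frames."""
    rng = np.random.default_rng(seed)
    centers = boxes[rng.choice(len(boxes), size=k, replace=False)]
    while True:
        # assign each box to the center with the smallest distance d = 1 - IOU,
        # i.e. the largest IOU
        assign = np.array([np.argmax(iou_wh(b, centers)) for b in boxes])
        new_centers = np.array([
            boxes[assign == i].mean(axis=0) if np.any(assign == i) else centers[i]
            for i in range(k)
        ])
        if np.max(np.abs(new_centers - centers)) < tol:  # centers within tolerance
            return new_centers
        centers = new_centers
```

Because only widths and heights are clustered, both boxes are treated as sharing a corner, which is why the intersection reduces to min(h1, h2) × min(w1, w2) as in the claim.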
4. The illegal building detection system of claim 3, wherein
the dividing module divides the data set composed of the pictures into a training set and a test set by:
dividing the data set composed of the resolution-adjusted pictures into a training set and a test set at a preset ratio.
5. The illegal building detection system of claim 4, wherein
the preprocessing module preprocesses the training set by:
performing data enhancement on the training set, namely
selecting a preset number of pictures from the training set each time, flipping, scaling and shifting the color gamut of the pictures, and placing them at preset positions to combine the pictures and their frames; and
encoding the class numbers of the pictures as one-hot codes to process the labels.
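The one-hot label encoding in claim 5, together with a horizontal flip as one concrete instance of the enhancement step, might look as follows; the [xc, yc, w, h] box format and function names are illustrative assumptions:

```python
import numpy as np

def hflip(image, boxes):
    """Flip an image left-right and mirror its normalized [xc, yc, w, h] boxes."""
    flipped = image[:, ::-1].copy()
    boxes = boxes.copy()
    boxes[:, 0] = 1.0 - boxes[:, 0]  # mirror the x-center; widths and heights unchanged
    return flipped, boxes

def one_hot(class_ids, num_classes):
    """Encode integer class numbers as one-hot label vectors."""
    out = np.zeros((len(class_ids), num_classes), dtype=np.float32)
    out[np.arange(len(class_ids)), class_ids] = 1.0
    return out
```

Scaling, color-gamut shifts and the four-picture placement would transform the frame coordinates in the same label-preserving way.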
6. The illegal building detection system of claim 5, wherein
the training module trains the YOLOv4 model on the preprocessed training set by:
obtaining the parameters of a YOLOv4 model pre-trained on the COCO data set, and initializing the YOLOv4 network with these parameters;
inputting the data in the training set together with the prior frames, performing forward propagation through the YOLOv4 network, and computing the loss value between the prediction result and the true label according to the loss function of the YOLOv4 model, namely
obtaining the regression optimization loss value CIOU of the YOLOv4 model:
CIOU = IOU - ρ²(b, b_gt) / c² - α·v;
v = (4 / π²) × (arctan(w_gt / h_gt) - arctan(w_p / h_p))²;
α = v / (1 - IOU + v);
where ρ(b, b_gt) is the Euclidean distance between the center points of the prediction frame and the real frame; c is the diagonal length of the smallest enclosing region containing both the prediction frame and the real frame; w_p is the width of the prediction frame; h_p is the height of the prediction frame; w_gt is the width of the real frame; h_gt is the height of the real frame;
the loss value Loss_CIOU is:
Loss_CIOU = 1 - CIOU; and
adjusting the weights and biases of the YOLOv4 network according to the loss value to complete one iteration of the YOLOv4 network, repeating until an early-stopping condition or the maximum number of iterations is reached, so as to complete the training of the YOLOv4 model.
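The CIOU loss of claim 6 can be computed per box pair as below; the (x1, y1, x2, y2) corner-coordinate box format is an assumption for illustration:

```python
import math

def ciou_loss(pred, truth):
    """CIOU loss for one prediction/ground-truth pair, boxes as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = truth
    # plain IOU
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    area_p = (px2 - px1) * (py2 - py1)
    area_t = (tx2 - tx1) * (ty2 - ty1)
    iou = inter / (area_p + area_t - inter)
    # squared center distance rho^2 and enclosing-box diagonal c^2
    rho2 = ((px1 + px2) / 2 - (tx1 + tx2) / 2) ** 2 \
         + ((py1 + py2) / 2 - (ty1 + ty2) / 2) ** 2
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)
    c2 = cw ** 2 + ch ** 2
    # aspect-ratio consistency term v and its weight alpha
    v = (4 / math.pi ** 2) * (math.atan((tx2 - tx1) / (ty2 - ty1))
                              - math.atan((px2 - px1) / (py2 - py1))) ** 2
    alpha = v / (1 - iou + v) if (1 - iou + v) > 0 else 0.0
    return 1 - (iou - rho2 / c2 - alpha * v)
```

For identical boxes the loss is 0; for disjoint boxes IOU vanishes and only the center-distance and aspect-ratio penalties drive the gradient, which is the motivation for CIOU over plain IOU loss.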
7. The illegal building detection system of claim 6, wherein
the adjusting module adjusts the trained YOLOv4 model according to the test set by:
inputting the data in the test set into the trained YOLOv4 model to obtain test results, and adjusting the trained YOLOv4 model according to the test results.
8. The illegal building detection system of claim 7, wherein
the detection module inputs the video into the adjusted YOLOv4 model to detect illegal buildings in the video by:
inputting the video into the adjusted YOLOv4 model, performing predictive analysis on each frame, detecting the illegal buildings in the frame, and marking them on the picture.
CN202011133537.7A 2020-10-21 2020-10-21 Accurate detecting system for illegal building Pending CN112215189A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011133537.7A CN112215189A (en) 2020-10-21 2020-10-21 Accurate detecting system for illegal building


Publications (1)

Publication Number Publication Date
CN112215189A true CN112215189A (en) 2021-01-12

Family

ID=74056344

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011133537.7A Pending CN112215189A (en) 2020-10-21 2020-10-21 Accurate detecting system for illegal building

Country Status (1)

Country Link
CN (1) CN112215189A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115049935A (en) * 2022-08-12 2022-09-13 松立控股集团股份有限公司 Urban illegal building division detection method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110070074A (en) * 2019-05-07 2019-07-30 安徽工业大学 A method of building pedestrian detection model
CN110782708A (en) * 2019-11-01 2020-02-11 南京智慧航空研究院有限公司 Unmanned aerial vehicle flight network modeling method based on low-altitude airspace limiting conditions
CN110852164A (en) * 2019-10-10 2020-02-28 安徽磐众信息科技有限公司 YOLOv 3-based method and system for automatically detecting illegal building
CN111507296A (en) * 2020-04-23 2020-08-07 嘉兴河图遥感技术有限公司 Intelligent illegal building extraction method based on unmanned aerial vehicle remote sensing and deep learning
CN111783665A (en) * 2020-06-30 2020-10-16 创新奇智(西安)科技有限公司 Action recognition method and device, storage medium and electronic equipment


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
徐子睿 (Xu Zirui): "Research on vehicle detection and traffic flow statistics based on YOLOv4", 《现代信息科技》 (Modern Information Technology), vol. 04, no. 15, pages 98-100 *


Similar Documents

Publication Publication Date Title
CN107729801B (en) Vehicle color recognition system based on multitask deep convolution neural network
CN112215190A (en) Illegal building detection method based on YOLOV4 model
Chen et al. Vehicle detection in high-resolution aerial images via sparse representation and superpixels
CN111814623A (en) Vehicle lane departure visual detection method based on deep neural network
CN112016605B (en) Target detection method based on corner alignment and boundary matching of bounding box
CN112287941B (en) License plate recognition method based on automatic character region perception
US20230358533A1 (en) Instance segmentation imaging system
CN113052170B (en) Small target license plate recognition method under unconstrained scene
Fan et al. Improving robustness of license plates automatic recognition in natural scenes
CN111160205A (en) Embedded multi-class target end-to-end unified detection method for traffic scene
CN114782770A (en) License plate detection and recognition method and system based on deep learning
CN113361467A (en) License plate recognition method based on field adaptation
CN110889388A (en) Violation identification method, device, equipment and storage medium
Bu et al. A UAV photography–based detection method for defective road marking
US20220335572A1 (en) Semantically accurate super-resolution generative adversarial networks
CN114882204A (en) Automatic ship name recognition method
CN112215189A (en) Accurate detecting system for illegal building
CN112348011B (en) Vehicle damage assessment method and device and storage medium
CN114550016B (en) Unmanned aerial vehicle positioning method and system based on context information perception
CN115731477A (en) Image recognition method, illicit detection method, terminal device, and storage medium
CN113065559B (en) Image comparison method and device, electronic equipment and storage medium
CN115690770A (en) License plate recognition method based on space attention characteristics in non-limited scene
Chang et al. Semi-supervised learning for YOLOv4 object detection in license plate recognition system
CN114937248A (en) Vehicle tracking method and device for cross-camera, electronic equipment and storage medium
Zhou et al. LEDet: localization estimation detector with data augmentation for ship detection based on unmanned surface vehicle

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination