CN111178206B - Building embedded part detection method and system based on improved YOLO - Google Patents

Building embedded part detection method and system based on improved YOLO

Info

Publication number
CN111178206B
CN111178206B (application number CN201911328091.0A)
Authority
CN
China
Prior art keywords
building
picture
yolo
embedded part
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911328091.0A
Other languages
Chinese (zh)
Other versions
CN111178206A (en)
Inventor
姜向远
邢金昊
于敦政
陈菲雨
贾磊
马思乐
陈纪旸
栾义忠
杜延丽
岳文斌
马晓静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University
Original Assignee
Shandong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University
Priority to CN201911328091.0A priority Critical patent/CN111178206B/en
Publication of CN111178206A publication Critical patent/CN111178206A/en
Application granted granted Critical
Publication of CN111178206B publication Critical patent/CN111178206B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/13Satellite images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/02Affine transformations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/60Rotation of whole images or parts thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462Salient features, e.g. scale invariant feature transforms [SIFT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Astronomy & Astrophysics (AREA)
  • Remote Sensing (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The present disclosure provides a building embedded part detection method and system based on improved YOLO. Pictures of building embedded parts are acquired, the embedded parts in the pictures are calibrated (annotated) to form a data set, and the data set is divided into a training set and a test set. A MobileNet network replaces the Darknet53 network in the YOLO detection algorithm as the feature extraction network to construct an improved YOLO detection model, which is trained with the training set until the test requirements on the test set are met, yielding the final detection model. An aerial picture of a building site under construction is then acquired, flipped, subjected to affine transformations of different scales and Gaussian blur, and fed to the final detection model as the input picture to obtain the embedded part detection result.

Description

Building embedded part detection method and system based on improved YOLO
Technical Field
The disclosure belongs to the technical field of detection of building engineering embedded parts, and relates to a building embedded part detection method and system based on improved YOLO.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
With the rapid development of the economy, the construction industry, as a basic industry, has developed quickly. The prosperity of the construction market brings challenges to many construction enterprises and places higher requirements on their construction quality and efficiency. Embedded parts are widely used in modern construction engineering and include structural members such as steel plates, bolts and junction boxes, as well as embedded pipes such as wiring conduits and drain pipes. The installation quality of embedded parts directly affects the construction progress and the structural safety of the project, so it must be strictly controlled. At present, the positions of the various embedded parts are inspected on site by workers before concrete is poured. However, for projects with a large construction area, the embedded parts are numerous and their positions are scattered, and the wiring of junction boxes, pipes and other embedded parts is complex and hard to inspect, so manual inspection is time-consuming, labor-intensive and inefficient; in high-rise construction, working at height also puts workers at risk. Using an unmanned aerial vehicle instead of manual inspection of embedded parts in high-rise, large-area construction projects is therefore a new idea.
In recent years, unmanned aerial vehicles have spread rapidly in the civil and commercial fields thanks to their small size, light weight and ability to carry various task payloads, and they have also been adopted in the construction field, for example for foundation construction surveying and construction site management. By flying the unmanned aerial vehicle above the site, images captured by its gimbal camera can be transmitted back to a computer in real time through the image transmission module for processing, so that embedded part positions can be quickly located from an elevated viewpoint and compared with the building design drawings; this saves manpower, speeds up embedded part inspection, and improves engineering efficiency and quality. Rapid detection of embedded parts from the image information returned by the unmanned aerial vehicle requires an efficient and accurate target detection algorithm. Traditional target detection algorithms, such as those based on Histogram of Oriented Gradient (HOG) feature extraction or on support vector machines, do not select a suitable sliding window for the target, so their computational time complexity is high, their window redundancy is high, and their detection efficiency is low.
To the inventors' knowledge, target detection algorithms based on deep learning are receiving more and more attention. These algorithms train convolutional neural networks (Convolutional Neural Networks, CNNs) with the data and labels in a data set and can be divided into two types. The first type comprises region-based target detection algorithms, such as RCNN (Regions with CNN) and Faster R-CNN, which extract candidate regions in advance according to the target object position; their detection accuracy is high, but their detection speed is low. The second type comprises end-to-end target detection algorithms, such as YOLO (You Only Look Once) and SSD (Single Shot MultiBox Detector), which omit the candidate region generation step and perform feature extraction, target classification and bounding box regression in the same convolutional neural network, greatly increasing detection speed. However, the neural networks used by deep-learning-based detection algorithms are large and complex in structure, which still limits detection speed.
Disclosure of Invention
In order to solve the problems, the disclosure provides a building embedded part detection method and system based on improved YOLO.
According to some embodiments, the present disclosure employs the following technical solutions:
a building embedded part detection method based on improved YOLO comprises the following steps:
acquiring a picture of the building embedded part, calibrating the building embedded part in the picture to form a data set, and dividing the data set into a training set and a testing set;
utilizing a MobileNet network to replace a Darknet53 network in a YOLO detection algorithm as a feature extraction network, constructing an improved YOLO detection model, and training the improved YOLO detection model by utilizing a training set until the testing requirement of a testing set is met, so as to obtain a final detection model;
and acquiring an aerial picture of a building site under construction, flipping the picture, applying affine transformations of different scales and Gaussian blur, and identifying the input picture with the final detection model to obtain the embedded part detection result.
As a further limitation, during detection, target detection is treated as a regression problem: the input picture is divided into an S × S grid, and if the center of a detected target falls within a certain cell, that cell is responsible for predicting the target; each cell produces B bounding boxes, each containing the offset of the target's center position relative to the cell position, the width and height of the bounding box, and the confidence of the target.
As a further limitation, a convolutional neural network is used to extract features of the target object and make predictions, and each cell gives C class probability values, representing the probability that the target in the bounding box the cell is responsible for predicting belongs to each class. The conditional probability that an object in a cell belongs to class i is Pr(class_i | object), and the probability that the identified object belongs to class i is Pr(class_i):

Pr(class_i | object) × Pr(object) × IOU_pred^truth = Pr(class_i) × IOU_pred^truth

The intersection over union of the predicted bounding box and the real area of the object is expressed as:

IOU_pred^truth = area(box_pred ∩ box_truth) / area(box_pred ∪ box_truth)
as a further limitation, the mean square error is used as a loss function to optimize the model parameters, i.e. the mean square error of the output vector of the network structure and the corresponding vector of the real image.
As a further limitation, the MobileNet network uses depthwise separable convolutions instead of standard convolutions, each decomposed into a depthwise convolution and a 1×1 pointwise convolution.
As a further limitation, aerial pictures of the building site under construction are acquired by a visible light camera carried by the unmanned aerial vehicle.
An improved YOLO-based building embedment detection system, comprising:
the data acquisition module is configured to acquire pictures of the building embedded parts, calibrate the building embedded parts in the pictures to form a data set, and divide the data set into a training set and a testing set;
the model construction and training module is configured to utilize a mobile Net network to replace a Darknet53 network in a YOLO detection algorithm as a feature extraction network, construct an improved YOLO detection model, and train the improved YOLO detection model by utilizing a training set until the testing requirements of a testing set are met, so as to obtain a final detection model;
the detection and identification module is configured to acquire an aerial picture of a building site under construction, flip the picture, apply affine transformations of different scales and Gaussian blur, take the result as the input picture, and identify the input picture with the final detection model to obtain the embedded part detection result.
A computer readable storage medium having stored therein a plurality of instructions adapted to be loaded by a processor of a terminal device and to perform the steps of the improved YOLO-based building embedment detection method.
A terminal device comprising a processor and a computer readable storage medium, the processor configured to implement instructions; the computer readable storage medium is for storing a plurality of instructions adapted to be loaded by a processor and to perform the steps of the improved YOLO-based building embedment detection method.
Compared with the prior art, the beneficial effects of the present disclosure are:
the method and system can effectively reduce the size of the network model and the number of network parameters, improve detection performance, and effectively detect and identify items such as embedded junction boxes and embedded pipes with high detection precision.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure, illustrate and explain the exemplary embodiments of the disclosure and together with the description serve to explain the disclosure, and do not constitute an undue limitation on the disclosure.
FIG. 1 is a schematic diagram of a depth separable convolution structure;
FIG. 2 is a schematic diagram of a modified YOLO neural network structure;
FIG. 3 is a schematic diagram of the SE-Block basic structure;
FIG. 4 shows the basic structure of SEMobileNet;
FIGS. 5 (a) - (b) are data set sample pictures;
FIGS. 6 (a) - (d) are comparative illustrations of the detection effect of four methods.
Detailed description of embodiments:
the disclosure is further described below with reference to the drawings and examples.
It should be noted that the following detailed description is illustrative and is intended to provide further explanation of the present disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments in accordance with the present disclosure. As used herein, the singular is also intended to include the plural unless the context clearly indicates otherwise, and furthermore, it is to be understood that the terms "comprises" and/or "comprising" when used in this specification are taken to specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
A building site under construction was photographed with a visible light camera mounted on an unmanned aerial vehicle (for example, a DJI Matrice 100), and data set samples are shown in FIGS. 5 (a) - (b). To ensure data diversity, different flight attitudes of the unmanned aerial vehicle, such as hovering, ascending and descending, and steady flight, were used during shooting. A manual calibration (annotation) method was adopted: junction boxes and drain pipes, as representative embedded parts, were selected as detection targets, and the calibrated pictures were saved in the PASCAL VOC format. Because the construction site background is complex, embedded parts are often occluded by reinforcing bars, construction elements and other embedded parts during shooting; to ensure data validity, samples with more than 50% of their area occluded were not annotated. In addition, to make the training data more effective and improve the generalization ability of the network, the following expansion operations were performed on the data set pictures: flipping each picture left-right and up-down with a flip matrix; affine transformations of different scales; and Gaussian blur. The augmented data set, 3590 pictures in total, was divided into a training set and a test set at a ratio of 4:1.
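As an illustration of these expansion operations, the following is a minimal sketch using OpenCV; the rotation angle, scale factor and blur kernel size are illustrative assumptions rather than values fixed by this disclosure, and in practice the annotation boxes would have to be transformed together with each picture.

```python
import cv2

def expand_picture(image):
    """Produce the expanded samples described above for one data set picture:
    left-right and up-down flips, an affine transformation at a different scale,
    and Gaussian blur. Parameter values are illustrative assumptions."""
    h, w = image.shape[:2]
    samples = [image]
    samples.append(cv2.flip(image, 1))   # left-right flip
    samples.append(cv2.flip(image, 0))   # up-down flip
    # affine transformation (rotation plus scaling about the picture centre)
    M = cv2.getRotationMatrix2D((w / 2.0, h / 2.0), 10, 0.8)
    samples.append(cv2.warpAffine(image, M, (w, h)))
    samples.append(cv2.GaussianBlur(image, (5, 5), 0))  # Gaussian blur
    return samples
```

The expanded pictures would then be split 4:1 into training and test sets as described above.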
YOLO v3 adopts the new Darknet-53 feature extraction network. Darknet-53 borrows the idea of the ResNet structure and has 53 convolutional layers; the deeper network gives YOLO v3 stronger feature extraction capability. Darknet-53 also removes the pooling layers and fully connected layers; each base layer is followed by a BN (Batch Normalization) layer and a Leaky ReLU layer, and residual modules are added to the network, which alleviates the vanishing or exploding gradient problem of deep networks. However, because the Darknet-53 structure is complex, the network has a large number of weight parameters, so the algorithm places high demands on the image processing hardware and the picture detection speed suffers. Tiny-YOLO is a simplified version of YOLO v3 with a much simpler feature extraction network: the residual network is removed and only 7 convolutional layers and 6 pooling layers remain, which reduces the number of parameters and the hardware requirements and effectively improves detection speed, but at the cost of detection accuracy. In this disclosure, the feature extraction network is replaced with the lighter MobileNet network, which reduces network complexity and speeds up image detection by using a more efficient convolution scheme.
When performing object detection, YOLO treats detection as a regression problem and divides the input picture into an S × S grid; if the center of a detected object falls within a certain cell, that cell is responsible for predicting the object. Each cell generates B bounding boxes, and each bounding box contains 5 parameters (x, y, w, h, Confidence), where (x, y, w, h) are the offset of the object's center position relative to the cell position and the width and height of the bounding box, and Confidence reflects whether the bounding box contains a target object and how accurate the object's position is. It is calculated as follows:

Confidence = Pr(object) × IOU_pred^truth

where Pr(object) indicates whether the bounding box contains the object, taking 1 if it does and 0 otherwise, and IOU_pred^truth is the intersection over union of the predicted bounding box and the real area of the object.
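For illustration, the intersection-over-union term above can be computed as in the following sketch; the corner-coordinate box format (x1, y1, x2, y2) is an assumption made for this example.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)
    corner coordinates (the box format is an assumption for illustration)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```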
A convolutional neural network is used to extract features of the target object and make predictions. Each cell must give C class probability values, representing the probability that the target in the bounding box the cell is responsible for predicting belongs to each class. The conditional probability that an object in a cell belongs to class i is Pr(class_i | object), and the probability that the identified object belongs to class i is:

Pr(class_i | object) × Pr(object) × IOU_pred^truth = Pr(class_i) × IOU_pred^truth
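At test time the two factors are combined per box and per class. A small numpy sketch follows; the array shapes, (S, S, C) class probabilities and (S, S, B) box confidences, are assumptions for illustration.

```python
import numpy as np

def class_specific_confidence(class_probs, box_confidence):
    """Combine per-cell class probabilities Pr(class_i | object) with per-box
    confidences Pr(object) * IOU to obtain class-specific scores per box.
    Assumed shapes: (S, S, C) and (S, S, B); the result has shape (S, S, B, C)."""
    return class_probs[..., None, :] * box_confidence[..., :, None]
```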
YOLO uses the sum of squared errors as the loss function to optimize the model parameters, i.e. the sum of squared errors between the output vector of the network structure and the corresponding vector of the real image. The formula is:

loss = coordError + iouError + classError

where coordError is the error between the predicted data and the calibration data, iouError is the intersection-over-union (confidence) error, and classError is the classification error. The specific formula is:

loss = λ_coord Σ_{i=0}^{S²} Σ_{j=0}^{B} 1_{ij}^{obj} [ (x_i − x̂_i)² + (y_i − ŷ_i)² ]
     + λ_coord Σ_{i=0}^{S²} Σ_{j=0}^{B} 1_{ij}^{obj} [ (√w_i − √ŵ_i)² + (√h_i − √ĥ_i)² ]
     + Σ_{i=0}^{S²} Σ_{j=0}^{B} 1_{ij}^{obj} (C_i − Ĉ_i)²
     + λ_noobj Σ_{i=0}^{S²} Σ_{j=0}^{B} 1_{ij}^{noobj} (C_i − Ĉ_i)²
     + Σ_{i=0}^{S²} 1_i^{obj} Σ_{c ∈ classes} (p_i(c) − p̂_i(c))²

where the parameter λ_coord weights the coordinate loss and strengthens the importance of the bounding box in the loss calculation; λ_noobj weights the confidence loss for boxes containing no object, weakening the influence of non-target regions on the confidence of target regions; 1_{ij}^{obj} indicates whether the target object falls into the j-th bounding box of the i-th cell, taking 1 if it does and 0 otherwise; and Ĉ_i is the corresponding predicted confidence value.
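The following numpy sketch illustrates the structure of this sum-squared-error loss; the (S, S, B, 5 + C) tensor layout, the per-box class term and the "responsible box" assignment are simplifying assumptions for illustration, not the exact training implementation.

```python
import numpy as np

def yolo_like_loss(pred, truth, lambda_coord=5.0, lambda_noobj=0.5):
    """Simplified sum-squared-error loss over tensors of shape (S, S, B, 5 + C)
    laid out as (x, y, w, h, confidence, class probabilities...)."""
    obj = truth[..., 4:5]        # 1 where a target falls into this box, else 0
    noobj = 1.0 - obj

    coord_err = lambda_coord * np.sum(
        obj * ((pred[..., 0:2] - truth[..., 0:2]) ** 2
               + (np.sqrt(np.maximum(pred[..., 2:4], 0.0))
                  - np.sqrt(truth[..., 2:4])) ** 2))
    conf_err = (np.sum(obj * (pred[..., 4:5] - truth[..., 4:5]) ** 2)
                + lambda_noobj * np.sum(noobj * (pred[..., 4:5] - truth[..., 4:5]) ** 2))
    class_err = np.sum(obj * (pred[..., 5:] - truth[..., 5:]) ** 2)
    return coord_err + conf_err + class_err
```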
The MobileNet network is a lightweight and efficient convolutional neural network proposed by Google. It replaces standard convolution with depthwise separable convolution, which can be decomposed into a depthwise convolution (Depthwise Convolution) and a 1×1 pointwise convolution (Pointwise Convolution), as shown in FIG. 1. This is not only more efficient in theory; the large number of 1×1 convolutions can also be carried out directly with highly optimized matrix multiplication, and roughly 95% of the multiply-add operations in MobileNet come from 1×1 convolutions, so the operation efficiency can be greatly improved.
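A minimal Keras sketch of the depthwise separable unit of FIG. 1 follows, assuming TensorFlow 2.x; the batch-normalization-plus-ReLU6 ordering after each convolution follows the usual MobileNet recipe and is an assumption here.

```python
import tensorflow as tf
from tensorflow.keras import layers

def depthwise_separable_block(x, pointwise_filters, strides=1):
    """3x3 depthwise convolution followed by a 1x1 pointwise convolution,
    each followed by batch normalization and ReLU6."""
    x = layers.DepthwiseConv2D(3, strides=strides, padding='same', use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU(max_value=6.0)(x)
    x = layers.Conv2D(pointwise_filters, 1, padding='same', use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU(max_value=6.0)(x)
    return x
```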
The improved YOLO network architecture is shown in FIG. 2. A 416×416 picture is used as input; a 3×3 standard convolution is followed by depthwise separable convolutions. X1 is output after the 5th depthwise separable convolution; the depthwise separable convolutions then continue, with X2 and X3 output at the 11th and 13th depthwise separable convolutions respectively, and X1, X2, X3 are used as inputs to the YOLO v3 network.
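Building on the sketch above, the backbone could tap the three feature maps as in the following sketch; only the tap points (after the 5th, 11th and 13th depthwise separable blocks) follow the description, while the filter counts and stride placement are assumptions taken from the standard MobileNet v1 configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers

def mobilenet_yolo_backbone(input_shape=(416, 416, 3)):
    """MobileNet-style feature extractor returning X1, X2, X3 for the YOLO v3 head."""
    inputs = tf.keras.Input(shape=input_shape)
    x = layers.Conv2D(32, 3, strides=2, padding='same', use_bias=False)(inputs)  # 3x3 standard conv
    x = layers.BatchNormalization()(x)
    x = layers.ReLU(max_value=6.0)(x)

    filters = [64, 128, 128, 256, 256, 512, 512, 512, 512, 512, 512, 1024, 1024]
    strides = [1, 2, 1, 2, 1, 2, 1, 1, 1, 1, 1, 2, 1]
    taps = {}
    for i, (f, s) in enumerate(zip(filters, strides), start=1):
        x = depthwise_separable_block(x, f, strides=s)
        if i in (5, 11, 13):
            taps[i] = x      # tap points X1, X2, X3 described above
    return tf.keras.Model(inputs, [taps[5], taps[11], taps[13]])
```

With a 416×416 input and these assumed strides, the three outputs have spatial sizes 52×52, 26×26 and 13×13, matching the three detection scales of YOLO v3.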
The depthwise convolution of MobileNet greatly reduces the scale of the network parameters and improves the computation speed of the network, but it has an obvious drawback: the depthwise convolution emphasizes the feature parameters of local receptive fields and improves the expressive power of the network to a certain extent, yet it ignores the correlation between the feature information of the individual channels; the subsequent 1×1 convolution reconnects the features across channels but cannot fully compensate for the loss of accuracy. Considering that the features describing an object are correlated and can be divided into main and non-main features, this embodiment adds a Squeeze-and-Excitation module to the MobileNet network.
The Squeeze-and-Excitation Block (SE-Block) is a network module, proposed by the team of Hu Jie, that models the importance of object features. Its core idea is to learn feature weights, increasing the weights of effective features and reducing the weights of ineffective or weakly effective ones, thereby strengthening the feature extraction capability of the network and bringing a significant performance improvement to the neural network structure at little additional computational cost. The basic structure of the module is shown in FIG. 3.
The SE-Block first converts an arbitrary input X ∈ R^{H'×W'×C'} into a feature map U ∈ R^{H×W×C} through a standard convolution operator. The subsequent operations are completed in three steps: compression (Squeeze), excitation (Excitation) and weight assignment (Scale). The compression operation compresses the global spatial information to obtain a single-channel descriptor; the statistics of each channel are generated by global average pooling (Global Average Pooling), and the statistic z_c is obtained by compressing the feature map (Feature map) u_c over its spatial dimensions H × W, using the following formula:

z_c = F_sq(u_c) = (1 / (H × W)) Σ_{i=1}^{H} Σ_{j=1}^{W} u_c(i, j)

The excitation operation is completed mainly by two fully connected layers. The first fully connected layer compresses the feature mapping and reduces the amount of computation, and a ReLU layer between the two fully connected layers allows nonlinear relationships between channels to be learned. The second fully connected layer restores the feature mapping to the original number of channels, and the final feature-importance descriptor s is obtained through Sigmoid normalization, with the following formula:

s = F_ex(z, W) = σ(g(z, W)) = σ(W_2 δ(W_1 z))

where δ denotes the ReLU activation function. The weight assignment multiplies the feature-importance descriptor s obtained by the excitation operation into the features channel by channel, completing the reweighting of the original features, with the following formula:

x̃_c = F_scale(u_c, s_c) = s_c · u_c
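A minimal Keras sketch of the three SE-Block steps above follows; the reduction ratio of the first fully connected layer is an assumption.

```python
from tensorflow.keras import layers

def se_block(x, ratio=16):
    """Squeeze (global average pooling), excitation (two fully connected layers
    with ReLU then Sigmoid), and scale (channel-wise reweighting of the input)."""
    channels = x.shape[-1]
    s = layers.GlobalAveragePooling2D()(x)                     # squeeze: z_c
    s = layers.Dense(channels // ratio, activation='relu')(s)  # excitation: delta(W1 z)
    s = layers.Dense(channels, activation='sigmoid')(s)        # excitation: sigma(W2 ...)
    s = layers.Reshape((1, 1, channels))(s)
    return layers.Multiply()([x, s])                           # scale: s_c * u_c
```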
in the embodiment, the SE-Block module is embedded into the MobileNet feature extraction network, and the SEMobileNet-YOLO detection network is provided in combination with the YOLO detection network. The SEMobileNet-YOLO network structure is similar to the MobileNet-YOLO network structure, and an SE-Block module is added after each pair of deep convolution sum and 1×1 convolution, importance weight distribution is carried out on the input feature map, then the next layer is input, and the SEMobileNet basic structure is shown in FIG. 4.
This embodiment was implemented under the Ubuntu 16.04 LTS system with an i9-9900K CPU, 16 GB of memory, an Nvidia Titan XP graphics card, and Keras under TensorFlow as the development framework. The training parameters were set as follows: the initial weight is set to 0.001; the weight decay factor is 0.0005; a momentum gradient descent algorithm with momentum 0.9 is used; the batch size (Batch Size) is set to 16. With an initial learning rate of 10^-3, the network is trained for 300 passes over the full data set; the network is then initialized with the trained parameters, the learning rate is adjusted to 10^-4, and another 300 full passes are trained. The learning rate adjustment strategy reduces the learning rate to 10% of its value whenever the test set loss has not decreased for 3 consecutive full passes (Epochs); when the test set loss has not decreased for 10 consecutive full passes, training is stopped early.
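A sketch of this training schedule with Keras callbacks follows; model, yolo_loss (a TensorFlow implementation of the loss above) and the training/validation arrays are placeholders, and mapping the 0.0005 weight decay to an L2 kernel regularizer on the convolution layers is an assumption.

```python
import tensorflow as tf

callbacks = [
    # reduce the learning rate to 10% of its value when the test-set loss
    # has not decreased for 3 consecutive epochs
    tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=3),
    # stop training early when it has not decreased for 10 consecutive epochs
    tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=10),
]

model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=1e-3, momentum=0.9),
              loss=yolo_loss)
model.fit(x_train, y_train, validation_data=(x_val, y_val),
          batch_size=16, epochs=300, callbacks=callbacks)

# second stage: keep the trained weights and fine-tune at a learning rate of 1e-4
tf.keras.backend.set_value(model.optimizer.learning_rate, 1e-4)
model.fit(x_train, y_train, validation_data=(x_val, y_val),
          batch_size=16, epochs=300, callbacks=callbacks)
```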
After network training is completed, the pictures of the test set are input into the network for detection. The improved SEMobileNet-YOLO detection method is compared with MobileNet-YOLO, YOLO v3 and tiny-YOLO; tiny-YOLO is a simplified version of YOLO v3 that greatly improves detection speed at the cost of detection accuracy. A detection is considered successful when the intersection over union (IoU) between the target bounding box predicted by the model and the manually annotated bounding box is at least 0.5 and the target class is identified correctly; otherwise the target is considered missed. The final detection results are shown in FIGS. 6 (a) - (d); the gray dashed boxes are the annotation boxes drawn during calibration, and the white solid boxes are the boxes detected by the algorithm.
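The success criterion above can be expressed with the hypothetical iou helper sketched earlier; the box and class representations are assumptions.

```python
def is_successful_detection(pred_box, pred_cls, gt_box, gt_cls):
    """A prediction counts as successful when its class matches the annotation
    and its IoU with the annotated box is at least 0.5; otherwise the target
    is counted as missed."""
    return pred_cls == gt_cls and iou(pred_box, gt_box) >= 0.5
```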
It will be apparent to those skilled in the art that embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing description of the preferred embodiments of the present disclosure is provided for illustration only and is not intended to limit the disclosure; various modifications and changes may be made to the present disclosure by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present disclosure shall be included in the protection scope of the present disclosure.
While the specific embodiments of the present disclosure have been described above with reference to the drawings, it should be understood that the present disclosure is not limited to the embodiments, and that various modifications and changes can be made by one skilled in the art without inventive effort on the basis of the technical solutions of the present disclosure while remaining within the scope of the present disclosure.

Claims (9)

1. A building embedded part detection method based on improved YOLO, characterized by comprising the following steps:
acquiring a picture of the building embedded part, calibrating the building embedded part in the picture to form a data set, and dividing the data set into a training set and a testing set;
utilizing a MobileNet network to replace a Darknet53 network in a YOLO detection algorithm as a feature extraction network, constructing an improved YOLO detection model, and training the improved YOLO detection model by utilizing a training set until the testing requirement of a testing set is met, so as to obtain a final detection model;
and acquiring an aerial picture of a building site under construction, flipping the picture, applying affine transformations of different scales and Gaussian blur, and identifying the input picture with the final detection model to obtain the embedded part detection result.
2. The method for detecting a building embedded part based on improved YOLO as claimed in claim 1, characterized in that: during detection, target detection is treated as a regression problem, the input picture is divided into an S × S grid, and if the center of a detected target falls within a certain cell, that cell is responsible for predicting the target; each cell produces B bounding boxes, each containing the offset of the target's center position relative to the cell position, the width and height of the bounding box, and the confidence of the target.
3. The method for detecting a building embedded part based on improved YOLO as claimed in claim 1, characterized in that: the MobileNet network is used to extract features of the target object and make predictions; each cell gives C class probability values, representing the probability that the target in the bounding box the cell is responsible for predicting belongs to each class; the conditional probability that an object in a cell belongs to class i is Pr(class_i | object), Pr(object) indicates whether the bounding box contains the target, and the probability that the identified object belongs to class i is Pr(class_i):

Pr(class_i | object) × Pr(object) × IOU_pred^truth = Pr(class_i) × IOU_pred^truth

The intersection over union of the predicted bounding box and the real area of the object is expressed as:

IOU_pred^truth = area(box_pred ∩ box_truth) / area(box_pred ∪ box_truth)
4. The method for detecting a building embedded part based on improved YOLO as claimed in claim 1, characterized in that: the mean square error is used as the loss function to optimize the model parameters, i.e. the mean square error between the output vector of the network structure and the corresponding vector of the real image.
5. The method for detecting a building embedded part based on improved YOLO as claimed in claim 1, characterized in that: the MobileNet network uses depthwise separable convolutions instead of standard convolutions, each decomposed into a depthwise convolution and a 1×1 pointwise convolution.
6. The method for detecting a building embedded part based on improved YOLO as claimed in claim 1, characterized in that: the aerial picture of the building site under construction is acquired by a visible light camera carried by an unmanned aerial vehicle.
7. A building embedded part detection system based on improved YOLO, characterized by comprising:
the data acquisition module is configured to acquire pictures of the building embedded parts, calibrate the building embedded parts in the pictures to form a data set, and divide the data set into a training set and a testing set;
the model construction and training module is configured to utilize a mobile Net network to replace a Darknet53 network in a YOLO detection algorithm as a feature extraction network, construct an improved YOLO detection model, and train the improved YOLO detection model by utilizing a training set until the testing requirements of a testing set are met, so as to obtain a final detection model;
the detection and identification module is configured to acquire an aerial picture of a building site under construction, flip the picture, apply affine transformations of different scales and Gaussian blur, take the result as the input picture, and identify the input picture with the final detection model to obtain the embedded part detection result.
8. A computer-readable storage medium, characterized by: in which instructions are stored which are adapted to be loaded by a processor of a terminal device and to carry out the steps of a building embedment detection method based on improved YOLO according to any one of claims 1-6.
9. A terminal device, characterized by: comprising a processor and a computer-readable storage medium, the processor configured to implement instructions; a computer readable storage medium for storing a plurality of instructions adapted to be loaded by a processor and to perform the steps of a method of improved YOLO-based building embedment detection of any one of claims 1-6.
CN201911328091.0A 2019-12-20 2019-12-20 Building embedded part detection method and system based on improved YOLO Active CN111178206B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911328091.0A CN111178206B (en) 2019-12-20 2019-12-20 Building embedded part detection method and system based on improved YOLO

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911328091.0A CN111178206B (en) 2019-12-20 2019-12-20 Building embedded part detection method and system based on improved YOLO

Publications (2)

Publication Number Publication Date
CN111178206A CN111178206A (en) 2020-05-19
CN111178206B true CN111178206B (en) 2023-05-16

Family

ID=70647376

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911328091.0A Active CN111178206B (en) 2019-12-20 2019-12-20 Building embedded part detection method and system based on improved YOLO

Country Status (1)

Country Link
CN (1) CN111178206B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111191696B (en) * 2019-12-20 2023-04-07 山东大学 Deep learning algorithm-based steel bar layering method and system
CN111709346A (en) * 2020-06-10 2020-09-25 嘉应学院 Historical building identification and detection method based on deep learning and high-resolution images
CN111986187A (en) * 2020-08-26 2020-11-24 华中科技大学 Aerospace electronic welding spot defect detection method based on improved Tiny-YOLOv3 network
CN112149761B (en) * 2020-11-24 2021-06-22 江苏电力信息技术有限公司 Electric power intelligent construction site violation detection method based on YOLOv4 improved algorithm
CN112487915B (en) * 2020-11-25 2024-04-23 江苏科技大学 Pedestrian detection method based on Embedded YOLO algorithm
CN112699762A (en) * 2020-12-24 2021-04-23 广东工业大学 Food material identification method suitable for embedded equipment
CN113327240A (en) * 2021-06-11 2021-08-31 国网上海市电力公司 Visual guidance-based wire lapping method and system and storage medium
CN113408394A (en) * 2021-06-11 2021-09-17 通号智慧城市研究设计院有限公司 Safety helmet wearing detection method and system based on deep learning model
CN113468992B (en) * 2021-06-21 2022-11-04 四川轻化工大学 Construction site safety helmet wearing detection method based on lightweight convolutional neural network
CN114299366A (en) * 2022-03-10 2022-04-08 青岛海尔工业智能研究院有限公司 Image detection method and device, electronic equipment and storage medium
CN115439436B (en) * 2022-08-31 2023-07-28 成都建工第七建筑工程有限公司 Multi-type quality defect mobile sensing system for building structure

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109493609A (en) * 2018-12-11 2019-03-19 杭州炬视科技有限公司 A kind of portable device and method for not giving precedence to the candid photograph of pedestrian's automatic identification
CN110084817A (en) * 2019-03-21 2019-08-02 西安电子科技大学 Digital elevation model production method based on deep learning
CN110443208A (en) * 2019-08-08 2019-11-12 南京工业大学 A kind of vehicle target detection method, system and equipment based on YOLOv2

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190378013A1 (en) * 2018-06-06 2019-12-12 Kneron Inc. Self-tuning model compression methodology for reconfiguring deep neural network and electronic device

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109493609A (en) * 2018-12-11 2019-03-19 杭州炬视科技有限公司 A kind of portable device and method for not giving precedence to the candid photograph of pedestrian's automatic identification
CN110084817A (en) * 2019-03-21 2019-08-02 西安电子科技大学 Digital elevation model production method based on deep learning
CN110443208A (en) * 2019-08-08 2019-11-12 南京工业大学 A kind of vehicle target detection method, system and equipment based on YOLOv2

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Ernin Niswatul Ukhwah et al., "Asphalt Pavement Pothole Detection using Deep Learning Method Based on YOLO Neural Network", 2019 International Seminar on Intelligent Technology and Its Applications (ISITIA), 2019, entire document. *

Also Published As

Publication number Publication date
CN111178206A (en) 2020-05-19

Similar Documents

Publication Publication Date Title
CN111178206B (en) Building embedded part detection method and system based on improved YOLO
CN113705478B (en) Mangrove single wood target detection method based on improved YOLOv5
CN110378222B (en) Method and device for detecting vibration damper target and identifying defect of power transmission line
CN111402227B (en) Bridge crack detection method
CN112183414A (en) Weak supervision remote sensing target detection method based on mixed hole convolution
Li et al. Automatic bridge crack identification from concrete surface using ResNeXt with postprocessing
CN112581443A (en) Light-weight identification method for surface damage of wind driven generator blade
CN111079604A (en) Method for quickly detecting tiny target facing large-scale remote sensing image
CN113408423A (en) Aquatic product target real-time detection method suitable for TX2 embedded platform
CN114596266B (en) Concrete crack detection method based on ConcreteCrackSegNet model
CN110992307A (en) Insulator positioning and identifying method and device based on YOLO
CN115564950A (en) Small sample rocket projectile bonding defect detection method based on deep learning
CN116206185A (en) Lightweight small target detection method based on improved YOLOv7
CN111985325A (en) Aerial small target rapid identification method in extra-high voltage environment evaluation
CN114565842A (en) Unmanned aerial vehicle real-time target detection method and system based on Nvidia Jetson embedded hardware
CN114078209A (en) Lightweight target detection method for improving small target detection precision
CN110334775B (en) Unmanned aerial vehicle line fault identification method and device based on width learning
CN115115924A (en) Concrete image crack type rapid intelligent identification method based on IR7-EC network
CN116152254A (en) Industrial leakage target gas detection model training method, detection method and electronic equipment
CN116385911A (en) Lightweight target detection method for unmanned aerial vehicle inspection insulator
CN109934151B (en) Face detection method based on movidius computing chip and Yolo face
CN113269717A (en) Building detection method and device based on remote sensing image
CN117541534A (en) Power transmission line inspection method based on unmanned plane and CNN-BiLSTM model
CN116363532A (en) Unmanned aerial vehicle image traffic target detection method based on attention mechanism and re-parameterization
CN116206169A (en) Intelligent gangue target detection method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant