CN111178206A - Building embedded part detection method and system based on improved YOLO - Google Patents

Building embedded part detection method and system based on improved YOLO

Info

Publication number
CN111178206A
CN111178206A (application CN201911328091.0A)
Authority
CN
China
Prior art keywords: building, detection, yolo, picture, network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911328091.0A
Other languages
Chinese (zh)
Other versions
CN111178206B (en)
Inventor
姜向远
邢金昊
于敦政
陈菲雨
贾磊
马思乐
陈纪旸
栾义忠
杜延丽
岳文斌
马晓静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University
Original Assignee
Shandong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University filed Critical Shandong University
Priority to CN201911328091.0A priority Critical patent/CN111178206B/en
Publication of CN111178206A publication Critical patent/CN111178206A/en
Application granted granted Critical
Publication of CN111178206B publication Critical patent/CN111178206B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/13Satellite images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/02Affine transformations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/60Rotation of whole images or parts thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462Salient features, e.g. scale invariant feature transforms [SIFT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems


Abstract

The invention provides a building embedded part detection method and system based on improved YOLO. Pictures of building embedded parts are acquired, the embedded parts in the pictures are calibrated to form a data set, and the data set is divided into a training set and a test set. A MobileNet network replaces the Darknet53 network in the YOLO detection algorithm as the feature extraction network to construct an improved YOLO detection model, which is trained with the training set until the test requirements of the test set are met, yielding the final detection model. An aerial picture of the building site to be constructed is then acquired; the picture is flipped and subjected to affine transformations of different scales and Gaussian blur processing to serve as input pictures, and the input pictures are identified with the final detection model to obtain the embedded part detection result.

Description

Building embedded part detection method and system based on improved YOLO
Technical Field
The disclosure belongs to the technical field of embedded part detection in construction engineering, and relates to a method and system for detecting building embedded parts based on improved YOLO.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
With the rapid development of the economy, the construction industry, as a basic industry, has grown quickly; the prosperity of the construction market brings both opportunities and challenges to construction enterprises and places higher requirements on construction quality and efficiency. Embedded parts are a very widely applied technology in modern construction engineering, and include structural parts such as steel plates, bolts and junction boxes, as well as embedded pipes such as wiring pipes and drain pipes. The construction quality of embedded parts directly influences the construction progress and structural safety of a building project, and must therefore be strictly controlled. In the prior art, workers perform site inspection before cement is poured. However, for projects with large construction areas, embedded parts are numerous and widely distributed, and parts such as junction boxes and pipes have complex wiring that is difficult to inspect; manual inspection is therefore time-consuming, labor-intensive and inefficient, and in high-rise construction projects it is also dangerous for workers to climb up for inspection. Using unmanned aerial vehicles instead of manual labor to detect embedded parts in high-rise and large-area construction projects is a new approach.
In recent years, unmanned aerial vehicles have been rapidly popularized in the civil and commercial fields thanks to their small size, light weight and ability to carry various task loads, and have also entered the building construction field, for example for basic construction surveying and construction site management. By transmitting the images captured by the UAV gimbal camera above the engineering site back to a computer in real time through an image transmission module for processing, embedded part positions can be quickly located from a high vantage point and conveniently compared with the architectural design drawings, saving labor, speeding up embedded part inspection, and improving engineering efficiency and quality. Rapid detection of embedded parts from the image information returned by the UAV requires an efficient and accurate target detection algorithm. Traditional target detection algorithms, such as those based on Histogram of Oriented Gradients (HOG) feature extraction or on support vector machines, do not select a suitable sliding window for the target, so their computational time complexity is high, window redundancy is large, and detection efficiency is low.
To the inventors' knowledge, target detection algorithms based on deep learning are receiving increasing attention. Such algorithms mostly use the data and labels in data sets to train Convolutional Neural Networks (CNNs), and can be divided into two types. The first type comprises region-based target detection algorithms such as R-CNN (Regions with CNN features) and Faster R-CNN, which extract candidate regions for the position of the target object in advance; they achieve high detection precision but low detection speed. The second type comprises end-to-end target detection algorithms such as YOLO (You Only Look Once) and SSD (Single Shot MultiBox Detector), which omit the candidate-region generation step and carry out feature extraction, target classification and bounding-box regression in the same convolutional neural network, greatly increasing detection speed. However, the neural network structures of deep-learning-based target detection algorithms are large and complex, which still limits detection speed.
Disclosure of Invention
To solve the above problems, the invention provides a building embedded part detection method and system based on improved YOLO.
According to some embodiments, the following technical scheme is adopted in the disclosure:
a building embedded part detection method based on improved YOLO comprises the following steps:
acquiring pictures of the building embedded parts, calibrating the building embedded parts in the pictures to form a data set, and dividing the data set into a training set and a testing set;
replacing a Darknet53 network in a YOLO detection algorithm with a MobileNet network as a feature extraction network, constructing an improved YOLO detection model, and training the improved YOLO detection model by using a training set until the test requirements of the test set are met to obtain a final detection model;
obtaining an aerial picture of the building site to be constructed, flipping the picture, and applying affine transformations of different scales and Gaussian blur processing to obtain input pictures; and identifying the input pictures with the final detection model to obtain the embedded part detection result.
As a further limitation, when performing detection, target detection is regarded as a regression problem: the input picture is divided into an S × S grid, and if the center of a detected target falls within a certain cell, that cell is responsible for predicting the target; each cell generates B bounding boxes, each containing the offset of the object's center position from the cell position, the width and height of the bounding box, and the confidence of the target.
By way of further limitation, a convolutional neural network is used to extract the features of the target object and make predictions, and each cell is given C class probability values representing the probability that the target in the bounding box the cell is responsible for predicting belongs to each class. With the conditional probability of an object being in the cell denoted Pr(Class_i|Object), the probability that the recognized object belongs to a certain class is

Pr(Class_i|Object) × Pr(Object) × IOU_pred^truth = Pr(Class_i) × IOU_pred^truth

where the intersection ratio of the predicted frame and the real area of the object is

IOU_pred^truth = area(box_pred ∩ box_truth) / area(box_pred ∪ box_truth)
As a further limitation, the sum-squared error between the output vector of the network structure and the corresponding vector of the real image is used as the loss function to optimize the model parameters.
As a further limitation, the MobileNet network uses depthwise separable convolutions instead of standard convolutions, each decomposed into a depthwise convolution and a 1 × 1 pointwise convolution.
As a further limitation, an unmanned aerial vehicle carrying a visible light camera acquires the aerial pictures of the building site to be constructed.
A building embedded part detection system based on improved YOLO, comprising:
the data acquisition module is configured to acquire pictures of the building embedded parts, calibrate the building embedded parts in the pictures to form a data set, and divide the data set into a training set and a test set;
the model building and training module is configured to utilize a MobileNet network to replace a Darknet53 network in a YOLO detection algorithm as a feature extraction network, build an improved YOLO detection model, and train the improved YOLO detection model by utilizing a training set until the test requirements of the test set are met to obtain a final detection model;
and the detection and identification module is configured to acquire an aerial picture of the building site to be constructed, flip the picture, apply affine transformations of different scales and Gaussian blur processing to obtain input pictures, and identify the input pictures with the final detection model to obtain the embedded part detection result.
A computer readable storage medium having stored therein a plurality of instructions adapted to be loaded by a processor of a terminal device and to perform the steps of the improved-YOLO-based building embedded part detection method.
A terminal device comprising a processor and a computer readable storage medium, the processor being configured to implement instructions; the computer readable storage medium is configured to store instructions adapted to be loaded by the processor and to perform the steps of the improved-YOLO-based building embedded part detection method.
Compared with the prior art, the beneficial effects of the present disclosure are as follows:
the size of the network model is effectively reduced and the number of network parameters decreased while detection performance is improved; objects such as embedded junction boxes and embedded pipes are effectively detected and identified with high detection precision.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure and are not to limit the disclosure.
FIG. 1 is a schematic diagram of a depth separable convolution structure;
FIG. 2 is a schematic diagram of an improved YOLO neural network structure;
FIG. 3 is a schematic diagram of the basic structure of SE-Block;
FIG. 4 is a schematic diagram of the basic SEMobileNet structure;
FIGS. 5(a) - (b) are sample graphs of datasets;
FIGS. 6(a) - (d) are schematic diagrams comparing the detection effects of the four methods.
Detailed Description
the present disclosure is further described with reference to the following drawings and examples.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present disclosure. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
An unmanned aerial vehicle (such as a DJI Matrice 100) carrying a visible light camera was used to photograph a construction site under construction; data set sample pictures are shown in fig. 5(a) - (b). To guarantee the diversity of the data, different flight attitudes of the UAV, such as hovering, ascending/descending and level flight, were considered during shooting. A manual calibration method was adopted, selecting the relatively representative junction boxes and drain pipes among the embedded parts as detection targets; the calibrated pictures use the PASCAL VOC format. Because the construction site background is complex, various rebars, building elements and the like cover the embedded parts in the photographs; to guarantee the validity of the data, samples in which more than 50% of the area is occluded were therefore not labeled. In addition, considering that the UAV may assume attitudes such as inclined flight, and to make the training data more effective and improve the network's generalization ability, the data set pictures were expanded with the following operations: flipping each picture left-right and up-down with a flip matrix; affine transformations of different scales; and Gaussian blur processing. The expanded data set consists of 3590 pictures, divided into training and test sets at a 4:1 ratio.
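To make the expansion concrete, a minimal sketch of these three operations with OpenCV follows; the function name, the chosen scales and the Gaussian kernel size are illustrative assumptions, not values from the patent.

```python
import cv2

def expand_picture(img):
    """Sketch of the data set expansion: flips, scaled affine transforms, Gaussian blur."""
    out = []
    out.append(cv2.flip(img, 1))                     # left-right flip
    out.append(cv2.flip(img, 0))                     # up-down flip
    h, w = img.shape[:2]
    for scale in (0.8, 1.2):                         # affine transforms of different scales
        M = cv2.getRotationMatrix2D((w / 2.0, h / 2.0), 0, scale)
        out.append(cv2.warpAffine(img, M, (w, h)))
    out.append(cv2.GaussianBlur(img, (5, 5), 1.5))   # Gaussian blur processing
    return out
```

In practice the calibrated bounding boxes would be flipped and affine-transformed together with each picture so that the PASCAL VOC annotations remain valid.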
YOLO v3 adopts the new Darknet-53 feature extraction network. Darknet-53 draws on the idea of the ResNet structure and has 53 convolutional layers; the increased network depth gives YOLO v3 a stronger feature extraction capability. Darknet-53 also removes the pooling and fully-connected layers; a BN (Batch Normalization) layer and a Leaky ReLU layer are added to each basic layer, and residual modules are added to the network, which alleviates the gradient vanishing or explosion problems that appear in deep networks. However, because of the complexity of the Darknet-53 structure, the network has a large number of weight parameters, so the algorithm places higher demands on image processing equipment and picture detection speed suffers. Tiny-YOLO is a simplified version of YOLO v3: its feature extraction network is simple, the residual network is removed, and there are only 7 convolutional layers and 6 pooling layers, which reduces the number of parameters and the demands on equipment and effectively improves detection speed, at the cost of detection precision. In this disclosure, the feature extraction network is replaced with the lighter neural network MobileNet, which uses a more efficient convolution computation scheme to reduce network complexity and increase image detection speed.
When performing target detection, YOLO regards target detection as a regression problem: the input picture is divided into an S × S grid, and if the center of a detected target falls within a certain cell, that cell is responsible for predicting the target. Each cell generates B bounding boxes, each containing 5 parameters (x, y, w, h, Confidence), where (x, y) is the offset of the object's center position from the cell position, (w, h) are the width and height of the bounding box, and Confidence is the confidence of the target, reflecting whether a target object is present in the bounding box and how accurate the predicted position is. It is calculated as

Confidence = Pr(Object) × IOU_pred^truth

where Pr(Object) indicates whether the bounding box contains an object (1 if it does, 0 if not), and IOU_pred^truth represents the intersection ratio of the predicted frame and the real area of the object.
A convolutional neural network is used to extract the features of the target object and make predictions, and each cell is given C class probability values representing the probability that the target in the bounding box the cell is responsible for predicting belongs to each class. With the conditional probability of an object being in the cell denoted Pr(Class_i|Object), the probability that the identified object belongs to a certain class is

Pr(Class_i|Object) × Pr(Object) × IOU_pred^truth = Pr(Class_i) × IOU_pred^truth
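As a small numeric sketch of these quantities (the corner-format box layout and the function names are illustrative assumptions):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def class_confidence(class_prob, objectness, iou_val):
    """Pr(Class_i|Object) * Pr(Object) * IOU = Pr(Class_i) * IOU."""
    return class_prob * objectness * iou_val
```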
YOLO uses the sum-squared error between the output vector of the network structure and the corresponding vector of the real image as the loss function to optimize the model parameters. The formula is as follows:

loss = coordError + iouError + classError

where coordError represents the error between the predicted data and the calibration data, iouError the intersection-ratio error, and classError the classification error. The specific formula is:

loss = λ_coord · Σ_{i=0}^{S²} Σ_{j=0}^{B} 1_{ij}^{obj} [(x_i − x̂_i)² + (y_i − ŷ_i)²]
     + λ_coord · Σ_{i=0}^{S²} Σ_{j=0}^{B} 1_{ij}^{obj} [(√w_i − √ŵ_i)² + (√h_i − √ĥ_i)²]
     + Σ_{i=0}^{S²} Σ_{j=0}^{B} 1_{ij}^{obj} (C_i − Ĉ_i)²
     + λ_noobj · Σ_{i=0}^{S²} Σ_{j=0}^{B} 1_{ij}^{noobj} (C_i − Ĉ_i)²
     + Σ_{i=0}^{S²} 1_{i}^{obj} Σ_{c∈classes} (p_i(c) − p̂_i(c))²

where the parameter λ_coord is the weight of the coordinate loss, enhancing the importance of the bounding box in the loss calculation; λ_noobj is the weight of the confidence loss for non-target regions, weakening their influence on the confidence calculation of target regions; 1_{ij}^{obj} indicates whether the target object falls into the j-th bounding box of the i-th cell (1 if it does, 0 otherwise); and the hatted symbols (x̂_i, ŷ_i, ŵ_i, ĥ_i, Ĉ_i, p̂_i) denote the corresponding predicted values.
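A simplified, illustrative NumPy rendering of these loss terms is sketched below; the (S, S, B, 5 + C) tensor layout, the weights λ_coord = 5 and λ_noobj = 0.5, and the per-box class term are assumptions for illustration rather than the patent's exact implementation.

```python
import numpy as np

def yolo_loss(pred, truth, obj_mask, lambda_coord=5.0, lambda_noobj=0.5):
    """pred, truth: (S, S, B, 5 + C) arrays laid out as (x, y, w, h, conf, classes);
    obj_mask: (S, S, B) indicator that box j of cell i is responsible for a target."""
    noobj_mask = 1.0 - obj_mask
    m = obj_mask[..., None]
    # coordError: only responsible boxes contribute; sqrt(w), sqrt(h) soften box-size effects.
    xy_err = np.sum(m * (pred[..., 0:2] - truth[..., 0:2]) ** 2)
    wh_err = np.sum(m * (np.sqrt(pred[..., 2:4]) - np.sqrt(truth[..., 2:4])) ** 2)
    # iouError: confidence error, with non-object boxes down-weighted by lambda_noobj.
    conf_sq = (pred[..., 4] - truth[..., 4]) ** 2
    iou_err = np.sum(obj_mask * conf_sq) + lambda_noobj * np.sum(noobj_mask * conf_sq)
    # classError: squared error over the C class probabilities.
    class_err = np.sum(m * (pred[..., 5:] - truth[..., 5:]) ** 2)
    return lambda_coord * (xy_err + wh_err) + iou_err + class_err
```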
The MobileNet network is a lightweight, efficient convolutional neural network proposed by Google. It uses depthwise separable convolutions instead of standard convolutions; as shown in fig. 1, each is decomposed into two operations, a depthwise convolution and a 1 × 1 pointwise convolution. This is not only more efficient in theory, it can also be implemented directly with highly optimized matrix multiplications. About 95% of the multiply-add operations in MobileNet come from 1 × 1 convolutions, so computational efficiency is greatly improved.
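A minimal Keras sketch of the unit in fig. 1, under the assumption of a standard BN + ReLU arrangement around each convolution:

```python
from tensorflow.keras import layers

def depthwise_separable_block(x, pointwise_filters, stride=1):
    """Depthwise 3x3 convolution followed by a 1x1 pointwise convolution."""
    x = layers.DepthwiseConv2D(3, strides=stride, padding="same", use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    x = layers.Conv2D(pointwise_filters, 1, padding="same", use_bias=False)(x)  # pointwise
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    return x
```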
The improved YOLO network architecture is shown in fig. 2. A picture of size 416 × 416 is used as input; after a 3 × 3 standard convolution, a stack of depthwise separable convolutions follows. X1 is output after the 5th depthwise separable convolution; the stack then continues, outputting X2 and X3 at the 11th and 13th depthwise separable convolutions respectively, and X1, X2 and X3 are connected as inputs into the YOLO v3 network.
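Under assumptions about the per-block filter counts and strides (the description only fixes the input size and the tap points), the backbone of fig. 2 could be wired up as follows, reusing the `depthwise_separable_block` helper sketched above:

```python
from tensorflow.keras import Input, Model, layers

def build_backbone():
    inp = Input(shape=(416, 416, 3))                 # 416 x 416 picture input
    x = layers.Conv2D(32, 3, strides=2, padding="same", use_bias=False)(inp)  # 3x3 standard conv
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    taps = []
    filters = [64, 128, 128, 256, 256, 512, 512, 512, 512, 512, 512, 1024, 1024]
    strides = [1, 2, 1, 2, 1, 2, 1, 1, 1, 1, 1, 2, 1]  # MobileNet-v1-style, assumed
    for i, (f, s) in enumerate(zip(filters, strides), start=1):
        x = depthwise_separable_block(x, f, stride=s)
        if i in (5, 11, 13):                          # X1, X2, X3 tap points
            taps.append(x)
    return Model(inp, taps)                           # outputs feed the YOLO v3 heads
```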
Although MobileNet's depthwise convolution operation greatly reduces the scale of the network parameters and increases the network's computation speed, its drawback is also obvious: the depthwise convolution emphasizes the feature parameters of the local receptive field and improves the expressive power of the network to a certain extent, but it ignores the correlation of the feature information across channels; the subsequent 1 × 1 convolution connects the features on each channel but cannot fully compensate for the loss of precision. Considering the correlation among the features describing the object and the distinction between primary and secondary features, this embodiment adds a Squeeze-and-Excitation module to the MobileNet network.
The core idea of the Squeeze-and-Excitation block (SE-Block) is the learning of feature weights: it increases effective feature weights and reduces ineffective or weakly effective ones, thereby enhancing the feature extraction capability of the network, and it brings a significant performance improvement to the neural network structure at little extra computational cost. The basic structure of the module is shown in fig. 3.
SE-Block first converts any input X ∈ ℝ^(H′ × W′ × C′) into a feature map U ∈ ℝ^(H × W × C) through a standard convolution operator. The subsequent operation is completed in three steps: compression (Squeeze), excitation (Excitation) and weight distribution (Scale). The compression operation compresses the global spatial information to obtain a single channel descriptor; statistical information for each channel is generated through Global Average Pooling, the statistic z_c being obtained by compressing the feature map u_c over the spatial dimensions H × W:

z_c = F_squeeze(u_c) = (1 / (H × W)) Σ_{i=1}^{H} Σ_{j=1}^{W} u_c(i, j)
the excitation operation is mainly completed through two Fully-connected layers (Fully-connected layers), the first Fully-connected layer realizes the compression of feature mapping, the calculated amount is reduced, and a ReLU layer is added into the two Fully-connected layers, so that the nonlinear relation between channels can be learned. The second full-connection layer restores the feature mapping to the original channel number, and the final feature mapping importance description factor s is obtained through the Sigmoid function normalization, wherein the calculation formula is as follows:
s=FExcitation(z,W)+σ(g(z,W))=σ(W2δ(W1z))
where δ denotes the ReLU activation function. Weight distribution weights the feature-importance description factor s obtained by the excitation operation into the features channel by channel through multiplication, completing the weight redistribution of the original features:

x̃_c = F_scale(u_c, s_c) = s_c · u_c
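A compact Keras sketch of SE-Block as described (squeeze by global average pooling, excitation by two fully-connected layers, then channel-wise scaling); the reduction ratio of 16 in the first fully-connected layer is an assumption:

```python
from tensorflow.keras import layers

def se_block(u, reduction=16):
    """Squeeze-and-Excitation: reweight the channels of feature map u."""
    c = u.shape[-1]
    z = layers.GlobalAveragePooling2D()(u)                  # squeeze: (H, W, C) -> (C,)
    s = layers.Dense(c // reduction, activation="relu")(z)  # first FC + ReLU (delta)
    s = layers.Dense(c, activation="sigmoid")(s)            # second FC + Sigmoid (sigma)
    s = layers.Reshape((1, 1, c))(s)
    return layers.Multiply()([u, s])                        # scale: s_c * u_c, channel-wise
```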
In this embodiment, the SE-Block module is embedded into the MobileNet feature extraction network and combined with the YOLO detection network, yielding the proposed SEMobileNet-YOLO detection network. The SEMobileNet-YOLO network structure is similar to that of the MobileNet-YOLO network: an SE-Block module is added after each pair of depthwise convolution and 1 × 1 convolution, the input feature mapping is given an importance-based weight distribution, and the result is input into the next layer. The basic SEMobileNet structure is shown in fig. 4.
The specific implementation of this embodiment runs on an Ubuntu 16.04 LTS system with an i9-9900K CPU, 16 GB of memory and an Nvidia Titan XP graphics card; the development framework is Keras under TensorFlow. The training parameters are set as follows: the initial weight is set to 0.001; the weight attenuation coefficient is 0.0005; a momentum gradient descent algorithm with momentum 0.9 is adopted; the batch size is set to 16. With an initial learning rate of 10⁻³, 300 full data sets (epochs) are trained; the trained parameters are then used to initialize the network, the learning rate is adjusted to 10⁻⁴, and another 300 epochs are trained. The learning-rate adjustment strategy is as follows: when the test-set loss has not decreased after 3 consecutive epochs, the learning rate is reduced to 10% of its value; when the test-set loss has not decreased after 10 consecutive epochs, training is stopped early.
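In Keras terms, this schedule maps naturally onto the standard SGD optimizer and the ReduceLROnPlateau / EarlyStopping callbacks; the exact wiring below is an assumption, and the 0.0005 weight attenuation would be applied according to the framework version in use:

```python
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.callbacks import ReduceLROnPlateau, EarlyStopping

optimizer = SGD(learning_rate=1e-3, momentum=0.9)  # momentum gradient descent, lr = 10^-3
callbacks = [
    # Reduce the learning rate to 10% when the test-set loss stalls for 3 epochs ...
    ReduceLROnPlateau(monitor="val_loss", factor=0.1, patience=3),
    # ... and stop training early after 10 epochs without improvement.
    EarlyStopping(monitor="val_loss", patience=10),
]
# model.compile(optimizer=optimizer, loss=...)
# model.fit(train_data, batch_size=16, epochs=300, validation_data=test_data,
#           callbacks=callbacks)
```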
After network training is complete, the test-set pictures are input into the network for detection. The improved SEMobileNet-YOLO detection method is compared with MobileNet-YOLO, YOLO v3 and tiny-YOLO; tiny-YOLO is a simplified version of YOLO v3 that greatly improves detection speed at the cost of detection precision. Detection is considered successful when the intersection-over-union (IoU) of the model-predicted target bounding box and the manually labeled bounding box is at least 0.5 and the target is recognized correctly; otherwise the target is considered missed. The resulting detection effects are shown in fig. 6(a) - (d), where the gray dashed frames are the calibration (ground-truth) boxes drawn during labeling and the white solid frames are the bounding boxes detected by the algorithm.
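The success criterion above can be stated as a short matching routine; `iou` is the helper sketched earlier, and the structure of the prediction and ground-truth lists is an assumption for illustration:

```python
def count_detections(predictions, ground_truths, threshold=0.5):
    """predictions / ground_truths: lists of (box, class_id) pairs for one picture."""
    hits = 0
    for gt_box, gt_cls in ground_truths:
        if any(iou(p_box, gt_box) >= threshold and p_cls == gt_cls
               for p_box, p_cls in predictions):
            hits += 1                        # detected: IoU >= 0.5 and correct class
    return hits, len(ground_truths) - hits   # (successes, misses)
```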
As will be appreciated by one skilled in the art, embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.
The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present disclosure and is not intended to limit the present disclosure, and various modifications and changes may be made to the present disclosure by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present disclosure should be included in the protection scope of the present disclosure.
Although the present disclosure has been described with reference to specific embodiments, it should be understood that the scope of the present disclosure is not limited thereto, and those skilled in the art will appreciate that various modifications and changes can be made without departing from the spirit and scope of the present disclosure.

Claims (9)

1. A building embedded part detection method based on improved YOLO, characterized by comprising the following steps:
acquiring pictures of the building embedded parts, calibrating the building embedded parts in the pictures to form a data set, and dividing the data set into a training set and a testing set;
replacing a Darknet53 network in a YOLO detection algorithm with a MobileNet network as a feature extraction network, constructing an improved YOLO detection model, and training the improved YOLO detection model by using a training set until the test requirements of the test set are met to obtain a final detection model;
obtaining an aerial picture of the building site to be constructed, flipping the picture, and applying affine transformations of different scales and Gaussian blur processing to obtain input pictures; and identifying the input pictures with the final detection model to obtain the embedded part detection result.
2. The method for detecting the building embedded part based on the improved YOLO as claimed in claim 1, wherein: when performing detection, target detection is regarded as a regression problem; the input picture is divided into an S × S grid, and if the center of a detected target falls within a certain cell, that cell is responsible for predicting the target; each cell generates B bounding boxes, each containing the offset of the object's center position from the cell position, the width and height of the bounding box, and the confidence of the target.
3. The method for detecting the building embedded part based on the improved YOLO as claimed in claim 1, wherein: a convolutional neural network is used to extract the features of the target object and make predictions, and each cell is given C class probability values representing the probability that the target in the bounding box the cell is responsible for predicting belongs to each class; with the conditional probability of an object being in the cell denoted Pr(Class_i|Object), the probability that the identified object belongs to a certain class is

Pr(Class_i|Object) × Pr(Object) × IOU_pred^truth = Pr(Class_i) × IOU_pred^truth

where the intersection ratio of the predicted frame and the real area of the object is

IOU_pred^truth = area(box_pred ∩ box_truth) / area(box_pred ∪ box_truth)
4. The method for detecting the building embedded part based on the improved YOLO as claimed in claim 1, wherein: the sum-squared error between the output vector of the network structure and the corresponding vector of the real image is used as the loss function to optimize the model parameters.
5. The method for detecting the building embedded part based on the improved YOLO as claimed in claim 1, wherein: the MobileNet network uses depthwise separable convolutions instead of standard convolutions, each decomposed into a depthwise convolution and a 1 × 1 pointwise convolution.
6. The method for detecting the building embedded part based on the improved YOLO as claimed in claim 1, wherein: the aerial picture of the building site to be constructed is acquired by an unmanned aerial vehicle carrying a visible light camera.
7. A building embedded part detection system based on improved YOLO, characterized by comprising:
the data acquisition module is configured to acquire pictures of the building embedded parts, calibrate the building embedded parts in the pictures to form a data set, and divide the data set into a training set and a test set;
the model building and training module is configured to utilize a MobileNet network to replace a Darknet53 network in a YOLO detection algorithm as a feature extraction network, build an improved YOLO detection model, and train the improved YOLO detection model by utilizing a training set until the test requirements of the test set are met to obtain a final detection model;
and the detection and identification module is configured to acquire an aerial picture of the building site to be constructed, flip the picture, apply affine transformations of different scales and Gaussian blur processing to obtain input pictures, and identify the input pictures with the final detection model to obtain the embedded part detection result.
8. A computer-readable storage medium, characterized in that: it stores a plurality of instructions adapted to be loaded by a processor of a terminal device and to perform the steps of the improved-YOLO-based building embedded part detection method as claimed in any one of claims 1 to 6.
9. A terminal device, characterized in that: it comprises a processor and a computer-readable storage medium, the processor being configured to implement instructions, and the computer-readable storage medium storing instructions adapted to be loaded by the processor and to perform the steps of the improved-YOLO-based building embedded part detection method as claimed in any one of claims 1 to 6.
CN201911328091.0A 2019-12-20 2019-12-20 Building embedded part detection method and system based on improved YOLO Active CN111178206B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911328091.0A CN111178206B (en) 2019-12-20 2019-12-20 Building embedded part detection method and system based on improved YOLO

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911328091.0A CN111178206B (en) 2019-12-20 2019-12-20 Building embedded part detection method and system based on improved YOLO

Publications (2)

Publication Number Publication Date
CN111178206A true CN111178206A (en) 2020-05-19
CN111178206B CN111178206B (en) 2023-05-16

Family

ID=70647376

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911328091.0A Active CN111178206B (en) 2019-12-20 2019-12-20 Building embedded part detection method and system based on improved YOLO

Country Status (1)

Country Link
CN (1) CN111178206B (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111191696A (en) * 2019-12-20 2020-05-22 山东大学 Deep learning algorithm-based steel bar layering method and system
CN111709346A (en) * 2020-06-10 2020-09-25 嘉应学院 Historical building identification and detection method based on deep learning and high-resolution images
CN111986187A (en) * 2020-08-26 2020-11-24 华中科技大学 Aerospace electronic welding spot defect detection method based on improved Tiny-YOLOv3 network
CN112149761A (en) * 2020-11-24 2020-12-29 江苏电力信息技术有限公司 Electric power intelligent construction site violation detection method based on YOLOv4 improved algorithm
CN112487915A (en) * 2020-11-25 2021-03-12 江苏科技大学 Pedestrian detection method based on Embedded YOLO algorithm
CN112699762A (en) * 2020-12-24 2021-04-23 广东工业大学 Food material identification method suitable for embedded equipment
CN113327240A (en) * 2021-06-11 2021-08-31 国网上海市电力公司 Visual guidance-based wire lapping method and system and storage medium
CN113408394A (en) * 2021-06-11 2021-09-17 通号智慧城市研究设计院有限公司 Safety helmet wearing detection method and system based on deep learning model
CN113468992A (en) * 2021-06-21 2021-10-01 四川轻化工大学 Construction site safety helmet wearing detection method based on lightweight convolutional neural network
CN114299366A (en) * 2022-03-10 2022-04-08 青岛海尔工业智能研究院有限公司 Image detection method and device, electronic equipment and storage medium
CN115439436A (en) * 2022-08-31 2022-12-06 成都建工第七建筑工程有限公司 Mobile sensing system for multiple types of quality defects of building structure

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109493609A (en) * 2018-12-11 2019-03-19 杭州炬视科技有限公司 A kind of portable device and method for not giving precedence to the candid photograph of pedestrian's automatic identification
CN110084817A (en) * 2019-03-21 2019-08-02 西安电子科技大学 Digital elevation model production method based on deep learning
CN110443208A (en) * 2019-08-08 2019-11-12 南京工业大学 YOLOv 2-based vehicle target detection method, system and equipment
US20190378013A1 (en) * 2018-06-06 2019-12-12 Kneron Inc. Self-tuning model compression methodology for reconfiguring deep neural network and electronic device

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190378013A1 (en) * 2018-06-06 2019-12-12 Kneron Inc. Self-tuning model compression methodology for reconfiguring deep neural network and electronic device
CN109493609A (en) * 2018-12-11 2019-03-19 杭州炬视科技有限公司 A kind of portable device and method for not giving precedence to the candid photograph of pedestrian's automatic identification
CN110084817A (en) * 2019-03-21 2019-08-02 西安电子科技大学 Digital elevation model production method based on deep learning
CN110443208A (en) * 2019-08-08 2019-11-12 南京工业大学 YOLOv 2-based vehicle target detection method, system and equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ERNIN NISWATUL UKHWAH et al.: "Asphalt Pavement Pothole Detection using Deep Learning Method based on YOLO Neural Network"

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111191696A (en) * 2019-12-20 2020-05-22 山东大学 Deep learning algorithm-based steel bar layering method and system
CN111709346A (en) * 2020-06-10 2020-09-25 嘉应学院 Historical building identification and detection method based on deep learning and high-resolution images
CN111709346B (en) * 2020-06-10 2024-07-12 嘉应学院 Historical building identification and detection method based on deep learning and high-resolution images
CN111986187A (en) * 2020-08-26 2020-11-24 华中科技大学 Aerospace electronic welding spot defect detection method based on improved Tiny-YOLOv3 network
CN112149761A (en) * 2020-11-24 2020-12-29 江苏电力信息技术有限公司 Electric power intelligent construction site violation detection method based on YOLOv4 improved algorithm
CN112149761B (en) * 2020-11-24 2021-06-22 江苏电力信息技术有限公司 Electric power intelligent construction site violation detection method based on YOLOv4 improved algorithm
CN112487915B (en) * 2020-11-25 2024-04-23 江苏科技大学 Pedestrian detection method based on Embedded YOLO algorithm
CN112487915A (en) * 2020-11-25 2021-03-12 江苏科技大学 Pedestrian detection method based on Embedded YOLO algorithm
CN112699762A (en) * 2020-12-24 2021-04-23 广东工业大学 Food material identification method suitable for embedded equipment
CN113408394A (en) * 2021-06-11 2021-09-17 通号智慧城市研究设计院有限公司 Safety helmet wearing detection method and system based on deep learning model
CN113327240A (en) * 2021-06-11 2021-08-31 国网上海市电力公司 Visual guidance-based wire lapping method and system and storage medium
CN113468992B (en) * 2021-06-21 2022-11-04 四川轻化工大学 Construction site safety helmet wearing detection method based on lightweight convolutional neural network
CN113468992A (en) * 2021-06-21 2021-10-01 四川轻化工大学 Construction site safety helmet wearing detection method based on lightweight convolutional neural network
CN114299366A (en) * 2022-03-10 2022-04-08 青岛海尔工业智能研究院有限公司 Image detection method and device, electronic equipment and storage medium
CN115439436A (en) * 2022-08-31 2022-12-06 成都建工第七建筑工程有限公司 Mobile sensing system for multiple types of quality defects of building structure
CN115439436B (en) * 2022-08-31 2023-07-28 成都建工第七建筑工程有限公司 Multi-type quality defect mobile sensing system for building structure

Also Published As

Publication number Publication date
CN111178206B (en) 2023-05-16

Similar Documents

Publication Publication Date Title
CN111178206B (en) Building embedded part detection method and system based on improved YOLO
Ukhwah et al. Asphalt pavement pothole detection using deep learning method based on YOLO neural network
CN113705478B (en) Mangrove single wood target detection method based on improved YOLOv5
CN110569901B (en) Channel selection-based countermeasure elimination weak supervision target detection method
CN110443969B (en) Fire detection method and device, electronic equipment and storage medium
CN110378222B (en) Method and device for detecting vibration damper target and identifying defect of power transmission line
CN112183414A (en) Weak supervision remote sensing target detection method based on mixed hole convolution
CN113408423B (en) Aquatic product target real-time detection method suitable for TX2 embedded platform
Li et al. Automatic bridge crack identification from concrete surface using ResNeXt with postprocessing
CN111461291A Long-distance pipeline inspection method based on YOLOv3 pruning network and deep learning defogging model
CN113033520B (en) Tree nematode disease wood identification method and system based on deep learning
CN115115924A (en) Concrete image crack type rapid intelligent identification method based on IR7-EC network
CN112446870B (en) Pipeline damage detection method, device, equipment and storage medium
CN108764456B (en) Airborne target identification model construction platform, airborne target identification method and equipment
Cepni et al. Vehicle detection using different deep learning algorithms from image sequence
CN110910440B (en) Power transmission line length determination method and system based on power image data
CN114022812B (en) DeepSort water surface floater multi-target tracking method based on lightweight SSD
Xu et al. Vision-based multi-level synthetical evaluation of seismic damage for RC structural components: a multi-task learning approach
CN111985325A (en) Aerial small target rapid identification method in extra-high voltage environment evaluation
CN114973116A (en) Method and system for detecting foreign matters embedded into airport runway at night by self-attention feature
CN110992307A (en) Insulator positioning and identifying method and device based on YOLO
CN114565842A (en) Unmanned aerial vehicle real-time target detection method and system based on Nvidia Jetson embedded hardware
CN115512247A (en) Regional building damage grade assessment method based on image multi-parameter extraction
CN116363532A (en) Unmanned aerial vehicle image traffic target detection method based on attention mechanism and re-parameterization
CN109934151B (en) Face detection method based on movidius computing chip and Yolo face

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant