CN116012296B - Prefabricated part detection method based on super-resolution and semi-supervised learning - Google Patents

Prefabricated part detection method based on super-resolution and semi-supervised learning

Info

Publication number
CN116012296B
Authority
CN
China
Prior art keywords
network
resolution
super
pictures
training
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211532025.7A
Other languages
Chinese (zh)
Other versions
CN116012296A (en)
Inventor
万华平
张文杰
胡鹏华
葛荟斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2022-12-01
Filing date
2022-12-01
Publication date
2023-10-24
Application filed by Zhejiang University ZJU
Priority to CN202211532025.7A
Publication of CN116012296A
Application granted
Publication of CN116012296B

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30: Computing systems specially adapted for manufacturing

Landscapes

  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The method for detecting prefabricated parts based on super-resolution and semi-supervised learning uses a super-resolution algorithm to improve the quality of prefabricated part pictures and a semi-supervised learning algorithm to reduce the high cost of data labeling. The specific implementation steps are as follows: (1) building and training the super-resolution network Real-ESRGAN; (2) collecting pictures of prefabricated components in various construction environments and inputting them into the generation model of the Real-ESRGAN network to improve picture quality; (3) introducing the semi-supervised learning algorithm mean-teacher network to train the target detector Yolov5; (4) detecting field data collected in real time with the trained Yolov5 model, and locating and classifying the prefabricated components on the construction site. The invention can improve image quality, achieve excellent detection performance with limited labeled data, and provide technical support for the management of prefabricated construction sites.

Description

Prefabricated part detection method based on super-resolution and semi-supervised learning
Technical Field
The invention relates to a method for detecting the prefabricated components of prefabricated buildings, in particular to a technique for detecting prefabricated components on a construction site based on super-resolution and semi-supervised learning algorithms, and belongs to the field of structural engineering.
Background
With the development of the prefabricated building industry in China, demand for prefabricated components has grown rapidly. Large numbers of prefabricated components of different types are stacked on construction sites, so detecting them in real time to guide construction is of considerable research significance. Existing inspection of prefabricated components is mainly manual, which is time-consuming and labor-intensive and cannot meet engineering requirements.
In recent years, computer vision has enabled automated remote dynamic monitoring of construction sites and raised the level of construction management. As computing power has improved, deep-learning-based object detection has developed rapidly. Many object detection models (such as YOLO and Fast R-CNN) have been widely used to detect prefabricated building components because of their high accuracy and non-contact nature.
However, training a deep-learning-based object detection model requires a large, high-quality and well-labeled dataset, and existing datasets still have the following problems: (1) the resolution of prefabricated part pictures is too low and the detection targets in them are small, so they carry little information and have weak feature expression, which degrades the detection performance of the deep learning model; (2) the dataset pictures must be labeled by professionals with the relevant knowledge, which is costly and prone to mislabeling and omissions. A new technique is therefore needed to increase image resolution and overcome the limitations of manual labeling.
Disclosure of Invention
To address the shortcomings of the prior art, the invention provides a prefabricated part detection method based on super-resolution and semi-supervised learning, so as to improve the feasibility of vision-based detection of precast concrete components in practical applications. The specific contents are as follows:
the prefabricated part detection method based on super-resolution and semi-supervised learning comprises the following steps:
A. training a super-resolution network Real-ESRGAN;
A1. collecting a prefabricated part picture, and preprocessing data to obtain a training data set of the Real-ESRGAN of the super-resolution network;
A2. a Real-ESRGAN network is built, consisting of a generation model G and a discrimination model D. The generation model G generates a corresponding super-resolution picture from the input low-resolution picture, and the discrimination model D determines whether a picture is a super-resolution picture generated by G or an original high-resolution picture. The quality of the generated pictures is improved through the continual adversarial game between G and D;
A3. training Real-ESRGAN network: firstly, fixing parameters of G, and training D to accurately distinguish real images from generated images; then, the parameters of D are fixed and G is trained to generate super-resolution pictures that can confuse D. Repeating the above processes to finally obtain a required generation model;
B. collecting pictures of assembled prefabricated components under various construction environments, inputting the pictures into a generation model of a Real-ESRGAN network, amplifying the resolution of the pictures to be twice as high as the original resolution, and improving the quality of the pictures;
C. training a target detection model Yolov5 by using a semi-supervised learning algorithm;
C1. dividing the pictures with improved quality into a training set and a testing set, and labeling 50% of the pictures to obtain labeled pictures $x_l$ with labels $y_l$ and unlabeled pictures $x_u$; random noise is added separately to the original dataset $(x_l, x_u)$ to form two training datasets, denoted $(x'_l, x'_u)$ and $(x''_l, x''_u)$;
C2. constructing a semi-supervised learning framework, the mean-teacher network, which consists of two networks with the same structure: a student network and a teacher network;
C3. using the Yolov5 model as the target detector of both the student network and the teacher network, the noise-augmented data $(x'_l, x'_u)$ and $(x''_l, x''_u)$ are input into the student network and the teacher network, respectively, to obtain output results. The loss value is calculated according to the consistency regularization criterion, and the parameters of the student network and the teacher network are iteratively updated. Finally, the performance of the trained student network is verified on the test dataset;
D. and detecting image or video data acquired at the construction site by using the trained Yolov5 model, and positioning and classifying the prefabricated components assembled at the construction site.
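The noise step in C1 can be illustrated with a minimal Python sketch. The source says only that random noise is added; the Gaussian noise model, the `sigma` value, the clipping to [0, 1] and the helper names `add_gaussian_noise` and `make_noisy_views` are assumptions made for illustration.

```python
import random

def add_gaussian_noise(pixels, sigma, rng):
    """Return a copy of `pixels` (floats in [0, 1]) with clipped Gaussian noise added."""
    return [min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in pixels]

def make_noisy_views(pixels, sigma=0.05, seed=0):
    """Build two independently perturbed copies of one picture.

    The first copy plays the role of x' (student input) and the second
    the role of x'' (teacher input). Noise type and strength are assumed.
    """
    rng = random.Random(seed)
    student_view = add_gaussian_noise(pixels, sigma, rng)
    teacher_view = add_gaussian_noise(pixels, sigma, rng)
    return student_view, teacher_view

picture = [0.0, 0.25, 0.5, 0.75, 1.0]  # toy flattened picture
x_prime, x_double_prime = make_noisy_views(picture)
```

Because the two views come from the same picture but different noise draws, a consistency loss can penalize disagreement between the student and teacher outputs without needing a label.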
Further, the data preprocessing in the step A1 includes: resizing the pictures to a common scale; to avoid distortion, the long side of each picture is scaled to 640 pixels and the short side is scaled by the same factor so that the original aspect ratio is preserved; the resulting short-side length is denoted s, so the high-resolution picture size is (640, s); the high-resolution picture is then downscaled to (320, s/2) to produce the low-resolution picture used as input to the generation model; the high-resolution and low-resolution pictures together constitute the training dataset of Real-ESRGAN.
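As a rough illustration of the size arithmetic in step A1, the sketch below computes the high-resolution and low-resolution sizes for one picture. The function name `training_pair_sizes`, the rounding of s and the integer halving are assumptions; the source does not say how non-integer sizes are handled.

```python
def training_pair_sizes(width, height, long_side=640):
    """Return ((hr_long, hr_short), (lr_long, lr_short)) for one picture.

    The long side is scaled to `long_side` pixels and the short side by the
    same factor, giving (640, s); the low-resolution size is (320, s/2).
    """
    longer, shorter = max(width, height), min(width, height)
    s = round(shorter * long_side / longer)  # rounding is an assumption
    hr_size = (long_side, s)
    lr_size = (long_side // 2, s // 2)       # integer halving is an assumption
    return hr_size, lr_size

hr, lr = training_pair_sizes(1920, 1080)
```

For a 1920 x 1080 picture this gives a high-resolution size of (640, 360) and a low-resolution size of (320, 180), so the generation model learns a 2x enlargement.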
Further, the training loss function of the step A3 includes the L1 distance loss $L_1$, the adversarial loss $L_G$ and the perceptual loss $L_{percep}$, which can be expressed as
$$L_{percep} = \lVert \phi(z) - \phi(G(z)) \rVert_1 \tag{3}$$
where z, y and x represent the input low-resolution picture, the generated super-resolution picture and the high-resolution picture, respectively; $\lVert \cdot \rVert_1$, $\mathbb{E}$ and $\phi(\cdot)$ represent the L1 norm, the expectation function and the VGG penalty function, respectively; the final loss value is obtained by the following formula
$$L = \eta L_1 + \lambda L_G + L_{percep} \tag{4}$$
where $\eta$ and $\lambda$ represent weight coefficients, taken as 1 and 0.1, respectively.
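The weighting in equation (4) can be sketched in a few lines of Python. This is only an illustration: the toy feature vectors stand in for the outputs of φ(·), the individual loss values are made-up inputs, and the function names are hypothetical.

```python
def l1_norm_of_difference(a, b):
    """Sum of absolute element-wise differences, i.e. ||a - b||_1."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum(abs(u - v) for u, v in zip(a, b))

def total_generator_loss(l1_loss, adversarial_loss, perceptual_loss,
                         eta=1.0, lam=0.1):
    """Equation (4): L = eta * L_1 + lam * L_G + L_percep."""
    return eta * l1_loss + lam * adversarial_loss + perceptual_loss

# Toy feature vectors standing in for the two phi(.) outputs in equation (3).
features_a = [0.2, 0.8, 0.5]
features_b = [0.1, 0.9, 0.5]
l_percep = l1_norm_of_difference(features_a, features_b)

# Illustrative loss values only; they are not taken from the source.
loss = total_generator_loss(l1_loss=0.30, adversarial_loss=0.70,
                            perceptual_loss=l_percep)
```

With η = 1 and λ = 0.1, the adversarial term contributes one tenth of its raw value, so the pixel-level and perceptual terms dominate the total.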
Further, the mean-teacher network in the step C2 is composed of a student network and a teacher network, wherein the student network optimizes its parameters by stochastic gradient descent, and the teacher network is updated from the parameters of the student network.
Further, the loss function in the step C3 includes a loss function of the student network and an optimization function of the teacher network:
the loss function of the student network comprises a supervised learning loss $L_{sl}$ and a semi-supervised learning loss $L_{ssl}$ based on the consistency regularization criterion, which are expressed respectively as
$$L_{sl} = L_{OD}[f_s(x'_l), y_l] \tag{5}$$
$$L_{ssl} = L_{OD}[f_s(x'_l, x'_u), f_t(x''_l, x''_u)] \tag{6}$$
where $L_{OD}$ represents the loss function of the target detector, $f_s$ represents the student network and $f_t$ represents the teacher network. The total loss is recorded as $L_T = L_{sl} + L_{ssl}$.
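The structure of equations (5) and (6) can be sketched as follows. The real detector loss is not specified beyond the symbol L_OD, so a mean squared error is used here purely as a stand-in, and the prediction vectors are invented toy values; all function names are hypothetical.

```python
def toy_detection_loss(predictions, targets):
    """Stand-in for L_OD: mean squared error between equal-length vectors."""
    if len(predictions) != len(targets):
        raise ValueError("inputs must have the same length")
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(predictions)

def student_total_loss(student_labeled, labels, student_all, teacher_all):
    """Return (L_sl, L_ssl, L_T) with L_T = L_sl + L_ssl."""
    l_sl = toy_detection_loss(student_labeled, labels)    # equation (5)
    l_ssl = toy_detection_loss(student_all, teacher_all)  # equation (6)
    return l_sl, l_ssl, l_sl + l_ssl

# Toy outputs: two labeled pictures followed by two unlabeled ones.
labels = [1.0, 0.0]
student_out = [0.9, 0.2, 0.6, 0.4]   # f_s on (x'_l, x'_u)
teacher_out = [0.8, 0.2, 0.5, 0.5]   # f_t on (x''_l, x''_u)
l_sl, l_ssl, l_total = student_total_loss(student_out[:2], labels,
                                          student_out, teacher_out)
```

The supervised term uses only the labeled pictures, while the consistency term compares student and teacher outputs on every picture, which is how the unlabeled half of the training set contributes to training.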
The parameters $\theta'_t$ of the teacher network at training round t are given by
$$\theta'_t = \alpha \theta'_{t-1} + (1 - \alpha)\theta_t \tag{7}$$
where $\theta_t$ represents the parameters of the student network and $\alpha$ represents a smoothing parameter that increases as the number of training rounds increases.
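Equation (7) is an exponential moving average of the student parameters, which the sketch below applies to a toy parameter dictionary. The source states only that α increases with the training round; the schedule in `smoothing_parameter` and the cap `alpha_max` are assumptions, as are the function names.

```python
def smoothing_parameter(step, alpha_max=0.999):
    """Assumed schedule: alpha rises from 0 toward alpha_max as training proceeds."""
    return min(1.0 - 1.0 / (step + 1), alpha_max)

def update_teacher(teacher_params, student_params, alpha):
    """Equation (7): theta'_t = alpha * theta'_{t-1} + (1 - alpha) * theta_t."""
    return {
        name: alpha * teacher_params[name] + (1.0 - alpha) * student_params[name]
        for name in teacher_params
    }

teacher = {"w": 0.0, "b": 1.0}   # theta'_{t-1}
student = {"w": 1.0, "b": 0.0}   # theta_t
teacher = update_teacher(teacher, student, alpha=0.9)
```

A small α early in training lets the teacher track the fast-changing student closely; a larger α later makes the teacher a smoother long-run average of the student.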
Compared with the prior art, the technology has the following advantages:
(1) The detection model trained with this technique has better detection performance, can overcome challenges common in construction-site detection such as occlusion, blur and small targets, and is more feasible for practical vision-based detection of precast concrete components.
(2) Compared with pictures enlarged by linear interpolation, the super-resolution prefabricated part pictures obtained with the Real-ESRGAN network are clearer and provide more feature information, which can effectively improve the accuracy of the target detection model.
(3) A model trained with the proposed technique using 50% labeled data matches the detection performance achieved by supervised learning using 100% labeled data, which can greatly reduce the high cost of manual labeling.
Drawings
FIG. 1 is a flow chart of the present technique;
FIG. 2 is a diagram of a super resolution network Real-ESRGAN of the present invention;
fig. 3 (a) to 3 (d) are super-resolution pictures according to the present invention, wherein fig. 3 (a) and 3 (c) are original pictures, and fig. 3 (b) and 3 (d) are quality-improved super-resolution pictures;
FIG. 4 is a semi-supervised learning mean-teacher framework diagram of the present invention.
Detailed Description
The method for detecting the prefabricated part based on super-resolution and semi-supervised learning is further described in detail below with reference to the accompanying drawings. The implementation technology of the invention is shown in fig. 1, and specifically comprises the following steps:
A. training a super-resolution network Real-ESRGAN;
A1. collecting prefabricated part pictures, 2000 pictures in this example; preprocessing data: the pictures are resized to a common scale; to avoid distortion, the long side of each picture is scaled to 640 pixels and the short side is scaled by the same factor so that the original aspect ratio is preserved; the resulting short-side length is denoted s, so the high-resolution picture size is (640, s); the high-resolution picture is then downscaled to (320, s/2) to produce the low-resolution picture used as input to the generation model; the high-resolution and low-resolution pictures together constitute the training dataset of Real-ESRGAN.
A2. building a Real-ESRGAN network, wherein the network consists of a generation model G and a discrimination model D, as shown in fig. 2;
A3. training Real-ESRGAN network using data set: firstly, fixing parameters of G, and training D to accurately distinguish real images from generated images; then, the parameters of D are fixed and G is trained to generate super-resolution pictures that can confuse D. Repeating the above processes to finally obtain a required generation model;
B. collecting prefabricated component pictures under various construction environments, inputting the prefabricated component pictures into a generation model of a Real-ESRGAN network, and improving the picture quality, wherein an original picture and a corresponding super-resolution picture are shown in FIG. 3;
C. training a target detection model Yolov5 by using a semi-supervised learning algorithm;
C1. dividing the quality-improved super-resolution prefabricated part pictures into a training set of 5000 pictures and a test set of 900 pictures, and labeling 2500 of the training pictures and all 900 test pictures;
C2. constructing a semi-supervised learning framework, the mean-teacher network, which consists of two networks with the same structure, a student network and a teacher network, as shown in fig. 4;
C3. taking a Yolov5 model as a target detector of a student network and a teacher network, inputting pictures of all training data sets into a semi-supervised learning framework for training, and verifying the performance of the trained student network on a test data set;
C4. the test results show that the model trained with the proposed technique performs far better than both the supervised learning model and the supervised learning model combined with super-resolution; the improvement is most pronounced when the proportion of labeled data is low, which demonstrates the effectiveness of the proposed technique;
D. the trained Yolov5 model is used to process image or video data acquired on the construction site and to locate and classify the prefabricated components there; the detection results show that the technique is feasible for vision-based detection of precast concrete components.
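The labeled/unlabeled partition used in this example (2500 of 5000 training pictures labeled) can be sketched as below. The source does not say how the labeled pictures are chosen; the seeded random selection and the function name `split_labeled_unlabeled` are assumptions.

```python
import random

def split_labeled_unlabeled(picture_ids, labeled_fraction=0.5, seed=0):
    """Randomly pick a fraction of the training pictures for manual labeling."""
    ids = list(picture_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    n_labeled = int(len(ids) * labeled_fraction)
    return ids[:n_labeled], ids[n_labeled:]

# 5000 training pictures, identified here by hypothetical integer IDs.
labeled, unlabeled = split_labeled_unlabeled(range(5000))
```

Fixing the seed keeps the labeled subset stable across runs, which matters when comparing models trained with different proportions of labeled data.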
The embodiments described here merely enumerate possible implementations of the inventive concept. The scope of protection of the invention should not be construed as limited to the specific forms set forth in the embodiments; it also covers equivalent technical solutions that those skilled in the art can conceive according to the inventive concept.

Claims (5)

1. The method for detecting the prefabricated part based on super-resolution and semi-supervised learning is characterized by comprising the following steps of:
A. training a super-resolution network Real-ESRGAN;
A1. collecting a prefabricated part picture, and preprocessing data to obtain a training data set of the Real-ESRGAN of the super-resolution network;
A2. building a Real-ESRGAN network, wherein the network consists of a generation model G and a discrimination model D; the generation model G generates a corresponding super-resolution picture from the input low-resolution picture, and the discrimination model D determines whether a picture is a super-resolution picture generated by G or an original high-resolution picture; the quality of the generated pictures is improved through the continual adversarial game between G and D;
A3. training Real-ESRGAN network: firstly, fixing parameters of G, and training D to accurately distinguish real images from generated images; then, fixing the parameters of D, and training G to generate a super-resolution picture capable of confusing D; repeating the above processes to finally obtain a required generation model;
B. collecting pictures of assembled prefabricated components under various construction environments, inputting the pictures into a generation model of a Real-ESRGAN network, amplifying the resolution of the pictures to be twice as high as the original resolution, and improving the quality of the pictures;
C. training a target detection model Yolov5 by using a semi-supervised learning algorithm;
C1. dividing the pictures with improved quality into a training set and a testing set, and labeling 50% of the pictures to obtain labeled pictures $x_l$ with labels $y_l$ and unlabeled pictures $x_u$; random noise is added separately to the original dataset $(x_l, x_u)$ to form two training datasets, denoted $(x'_l, x'_u)$ and $(x''_l, x''_u)$;
C2. constructing a semi-supervised learning framework, the mean-teacher network, which consists of two networks with the same structure: a student network and a teacher network;
C3. using the Yolov5 model as the target detector of both the student network and the teacher network, the noise-augmented data $(x'_l, x'_u)$ and $(x''_l, x''_u)$ are input into the student network and the teacher network, respectively, to obtain output results; the loss value is calculated according to the consistency regularization criterion, and the parameters of the student network and the teacher network are iteratively updated; finally, the performance of the trained student network is verified on the test dataset;
D. and detecting image or video data acquired at the construction site by using the trained Yolov5 model, and positioning and classifying the prefabricated components assembled at the construction site.
2. The method for detecting the prefabricated part based on super-resolution and semi-supervised learning according to claim 1, characterized in that the data preprocessing in the step A1 comprises: resizing the pictures to a common scale; to avoid distortion, the long side of each picture is scaled to 640 pixels and the short side is scaled by the same factor so that the original aspect ratio is preserved; the resulting short-side length is denoted s, so the high-resolution picture size is (640, s); the high-resolution picture is then downscaled to (320, s/2) to produce the low-resolution picture used as input to the generation model; the high-resolution and low-resolution pictures together constitute the training dataset of Real-ESRGAN.
3. The method for detecting the prefabricated part based on super-resolution and semi-supervised learning according to claim 1, characterized in that the training loss function of the step A3 includes the L1 distance loss $L_1$, the adversarial loss $L_G$ and the perceptual loss $L_{percep}$, which can be expressed as
$$L_{percep} = \lVert \phi(z) - \phi(G(z)) \rVert_1 \tag{3}$$
where z, y and x represent the input low-resolution picture, the generated super-resolution picture and the high-resolution picture, respectively; $\lVert \cdot \rVert_1$, $\mathbb{E}$ and $\phi(\cdot)$ represent the L1 norm, the expectation function and the VGG penalty function, respectively; the final loss value is obtained by the following formula
$$L = \eta L_1 + \lambda L_G + L_{percep} \tag{4}$$
where $\eta$ and $\lambda$ represent weight coefficients, taken as 1 and 0.1, respectively.
4. The method for detecting the prefabricated part based on super-resolution and semi-supervised learning according to claim 1, characterized in that the mean-teacher network in the step C2 consists of a student network and a teacher network, wherein the student network optimizes its parameters by stochastic gradient descent, and the teacher network is updated from the parameters of the student network.
5. The method for detecting the prefabricated part based on super-resolution and semi-supervised learning according to claim 1, characterized in that the training loss function in the step C3 includes a student network loss function and a teacher network optimization function:
the student network loss function comprises a supervised learning loss $L_{sl}$ and a semi-supervised learning loss $L_{ssl}$ based on the consistency regularization criterion, which are expressed respectively as
$$L_{sl} = L_{OD}[f_s(x'_l), y_l] \tag{5}$$
$$L_{ssl} = L_{OD}[f_s(x'_l, x'_u), f_t(x''_l, x''_u)] \tag{6}$$
where $L_{OD}$ represents the loss function of the target detector, $f_s$ represents the student network and $f_t$ represents the teacher network;
the total loss is recorded as $L_T = L_{sl} + L_{ssl}$;
the parameters $\theta'_t$ of the teacher network at training round t are given by
$$\theta'_t = \alpha \theta'_{t-1} + (1 - \alpha)\theta_t \tag{7}$$
where $\theta_t$ represents the parameters of the student network and $\alpha$ represents a smoothing parameter that increases as the number of training rounds increases.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211532025.7A CN116012296B (en) 2022-12-01 2022-12-01 Prefabricated part detection method based on super-resolution and semi-supervised learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211532025.7A CN116012296B (en) 2022-12-01 2022-12-01 Prefabricated part detection method based on super-resolution and semi-supervised learning

Publications (2)

Publication Number Publication Date
CN116012296A CN116012296A (en) 2023-04-25
CN116012296B CN116012296B (en) 2023-10-24

Family

ID=86018208

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211532025.7A Active CN116012296B (en) 2022-12-01 2022-12-01 Prefabricated part detection method based on super-resolution and semi-supervised learning

Country Status (1)

Country Link
CN (1) CN116012296B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020056274A1 (en) * 2018-09-14 2020-03-19 The Johns Hopkins University Machine learning processing of contiguous slice image data
CN113096015A (en) * 2021-04-09 2021-07-09 西安电子科技大学 Image super-resolution reconstruction method based on progressive sensing and ultra-lightweight network
CN113240580A (en) * 2021-04-09 2021-08-10 暨南大学 Lightweight image super-resolution reconstruction method based on multi-dimensional knowledge distillation
CN113743514A (en) * 2021-09-08 2021-12-03 庆阳瑞华能源有限公司 Knowledge distillation-based target detection method and target detection terminal
CN113920013A (en) * 2021-10-14 2022-01-11 中国科学院深圳先进技术研究院 Small image multi-target detection method based on super-resolution
CN115018852A (en) * 2022-08-10 2022-09-06 四川大学 Abdominal lymph node detection method and device based on semi-supervised learning
CN115205122A (en) * 2022-09-06 2022-10-18 深圳大学 Method, system, apparatus and medium for generating hyper-resolution image maintaining structure and texture

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB201604672D0 (en) * 2016-03-18 2016-05-04 Magic Pony Technology Ltd Generative methods of super resolution
US11232541B2 (en) * 2018-10-08 2022-01-25 Rensselaer Polytechnic Institute CT super-resolution GAN constrained by the identical, residual and cycle learning ensemble (GAN-circle)
US12118692B2 (en) * 2020-12-17 2024-10-15 PicsArt, Inc. Image super-resolution
CN112884064B (en) * 2021-03-12 2022-07-29 迪比(重庆)智能科技研究院有限公司 Target detection and identification method based on neural network

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Real-World Person Re-Identification via Super-Resolution and Semi-Supervised Methods;Limin Xia等;《IEEE Access》;35834-35845 *
Semi-supervised student-teacher learning for single image super-resolution;Lin Wang等;《Pattern Recognition》;1-11 *
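双判别器生成对抗网络及其在接触网鸟巢检测与半监督学习中的应用 [Dual-discriminator generative adversarial network and its application to catenary bird-nest detection and semi-supervised learning];金炜东;杨沛;唐鹏;中国科学:信息科学 [Scientia Sinica Informationis](07);150-164 *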

Also Published As

Publication number Publication date
CN116012296A (en) 2023-04-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant