CN117197789A - Curtain wall frame identification method and system based on multi-scale boundary feature fusion

Info

Publication number
CN117197789A
Authority
CN
China
Prior art keywords
curtain wall
wall frame
module
pixel
convolution
Prior art date
Legal status
Pending
Application number
CN202311222984.3A
Other languages
Chinese (zh)
Inventor
吴德成
刘星辰
程隆奇
李锐
杨平安
刘声
徐潇宇
李建珍
唐菁
杨丽
Current Assignee
Chongqing University of Posts and Telecommunications
Original Assignee
Chongqing University of Posts and Telecommunications
Priority date
Filing date
Publication date
Application filed by Chongqing University of Posts and Telecommunications
Priority to CN202311222984.3A
Publication of CN117197789A


Landscapes

  • Image Analysis (AREA)

Abstract

The invention relates to a curtain wall frame identification method and system based on multi-scale boundary feature fusion, and belongs to the technical field of building curtain wall construction. The method comprises the following steps: S1: collecting curtain wall frame color image data; S2: preprocessing the data and labeling the curtain wall frame region pixel by pixel with data annotation software; S3: constructing a curtain wall frame recognition network model comprising a multi-scale boundary feature fusion module and a channel guide pyramid convolution module, and training and optimizing the model with a cosine annealing algorithm; S4: calibrating the front-end camera and laser radar of the curtain wall mounting robot, registering the laser point cloud with the image, and combining the laser radar hardware measurement system with the trained curtain wall frame recognition network model to measure the three-dimensional spatial position and attitude of the target curtain wall frame. The invention enables efficient and accurate identification of the curtain wall frame, real-time monitoring at the terminal, and fast, accurate installation parameters for curtain wall construction.

Description

Curtain wall frame identification method and system based on multi-scale boundary feature fusion
Technical Field
The invention belongs to the technical field of building curtain wall construction, and relates to a curtain wall frame identification method and system based on multi-scale boundary feature fusion.
Background
In the field of prefabricated building, building curtain walls are widely used as the outer envelope and decoration of buildings. However, curtain walls are mostly installed manually, which is labor-intensive and extremely dangerous. A mechanized, intelligent curtain wall installation robot is therefore indispensable in curtain wall installation engineering: it can improve construction efficiency and construction quality while improving worker safety.
Currently, most building curtain wall installation still relies on manual work. A glass curtain wall panel weighs several hundred jin (one jin is about 0.5 kg), and a single installation requires several people to cooperate: some carry the panel while others observe from multiple viewing angles alongside to guide alignment, so the whole installation process is time-consuming and labor-intensive and carries certain safety hazards. Glass carriers on the market merely replace the manual carrying step; they lack automatic target identification and positioning and have no information perception. For curtain wall installation tasks, fast and accurate identification of the curtain wall frame is a key prerequisite for automating a curtain wall robot, and the requirement for a lightweight algorithm must also be considered to facilitate deployment on the robot's hardware. The invention therefore achieves accurate identification of the curtain wall frame; the network adopts a lightweight design that is convenient to deploy on embedded devices, can improve the engineering efficiency of curtain wall installation in construction projects, and is of great significance for intelligent curtain wall installation.
Disclosure of Invention
In view of the above, the invention aims to provide a curtain wall frame identification method and a curtain wall frame identification system based on multi-scale boundary feature fusion, which can quickly and accurately identify a curtain wall frame by adopting a deep learning method, and improve the intelligence of curtain wall plate installation.
In order to achieve the above purpose, the present invention provides the following technical solutions:
scheme 1:
a curtain wall frame identification method based on multi-scale boundary feature fusion specifically comprises the following steps:
s1: collecting color image data of a curtain wall frame by using a high-definition camera;
s2: preprocessing the acquired image data, constructing a data set comprising a training set, a validation set, and a test set, and labeling the curtain wall frame region pixel by pixel with data annotation software;
s3: constructing a curtain wall frame identification network model based on multi-scale boundary feature fusion, wherein the network model comprises a multi-scale boundary feature fusion module (MBA) and a channel guide pyramid convolution module (CGPC), with efficient curtain wall frame identification realized by a lightweight network architecture, and the output of the multi-scale boundary feature fusion module is the input of the channel guide pyramid convolution module; training the curtain wall frame identification network model based on multi-scale boundary feature fusion with a cosine annealing algorithm to obtain optimal parameters;
s4: calibrating the front-end camera and laser radar of the curtain wall mounting robot, and registering the laser point cloud with the image; the hardware measurement system of the laser radar is combined with the trained curtain wall frame recognition network model to measure the three-dimensional spatial position and attitude of the target curtain wall frame.
Further, in step S1, the collected curtain wall frame color image data includes: multiple curtain wall frames, low-contrast curtain wall frames, and curtain wall frames under noise and clutter, where the clutter interference comes from distant buildings, railings, framing posts, and the like.
Further, in step S2, preprocessing the acquired image data specifically includes: screening out blurred images, expanding the data volume by translation transformation, flip transformation, and random cropping, and constructing a curtain wall frame image data set; dividing the data set into a training set, a validation set, and a test set; then manually delineating the curtain wall frame region with the data annotation software labelme; and converting the labels into binary images, where white is the target curtain wall frame and black is the background.
Further, in step S3, the multi-scale boundary feature fusion module (MBA) is configured to fuse the boundary features of curtain wall frames at different scales to improve the segmentation performance of the network, and comprises: a boundary awareness module (BA), an inverted residual module (IRB), and an average pooling layer;
the multi-scale boundary feature fusion module (MBA) specifically comprises: the feature F0 extracted by convolution is combined with the features F1, F2, F3, and F4 respectively and input to the boundary awareness module (BA); the four results are channel-concatenated and input to the inverted residual module (IRB); the output is then split into five paths: the first path is multiplied pixel by pixel with feature F0 and then added to F0 to output B0; the second path applies the average pooling layer, multiplies pixel by pixel with feature F1, and adds F1 to output B1; the third path applies the average pooling layer, multiplies pixel by pixel with feature F2, and adds F2 to output B2; the fourth path applies the average pooling layer, multiplies pixel by pixel with feature F3, and adds F3 to output B3; the fifth path applies the average pooling layer, multiplies pixel by pixel with feature F4, and adds F4 to output B4; the features F0, F1, F2, F3, and F4 are the low-level detail information and high-level semantic information extracted from the data preprocessed in step S2.
Further, in step S3, the boundary awareness module (BA) is configured to output a weight map, and specifically comprises: the feature F0 extracted by convolution is combined with the features F1, F2, F3, and F4 respectively; F0 and Fi are each input to a 1×1 three-dimensional convolution, where i = 1, 2, 3, 4; after the convolution of Fi, two paths are taken: the first path applies a 3×7 three-dimensional convolution, maps the result with a Sigmoid activation function, upsamples the feature two-fold, and multiplies it pixel by pixel with the result of the F0 convolution; the second path upsamples the feature two-fold and subtracts the result of the first path pixel by pixel; an absolute-value operation is applied to the pixel-wise difference, followed by a 3×7 three-dimensional convolution and a Sigmoid mapping, and finally the boundary awareness weight map is output;
the inverted residual module (IRB) is used to reduce the parameter count while preserving more feature information; the overall network has only 2.6M parameters. It specifically comprises: channel expansion using a 1×1 convolution with a ReLU activation function, feature extraction using a 3×3 depth-wise convolution with a ReLU activation function, and finally compression of the channel count back to the original number using a 1×1 convolution with a linear activation function.
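As an illustrative sketch only, the inverted residual module described above can be written in PyTorch as follows; the batch normalization layers and the expansion ratio of 6 are assumptions not fixed by this description:

```python
import torch
import torch.nn as nn

class InvertedResidual(nn.Module):
    """Inverted residual block (IRB): 1x1 expand -> 3x3 depth-wise -> 1x1 project.
    Expansion ratio 6 and BatchNorm placement are assumptions."""
    def __init__(self, channels: int, expansion: int = 6):
        super().__init__()
        hidden = channels * expansion
        self.block = nn.Sequential(
            # 1x1 convolution for channel expansion, with ReLU activation
            nn.Conv2d(channels, hidden, kernel_size=1, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU(inplace=True),
            # 3x3 depth-wise convolution for feature extraction, with ReLU
            nn.Conv2d(hidden, hidden, kernel_size=3, padding=1,
                      groups=hidden, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU(inplace=True),
            # 1x1 convolution compresses channels back; linear (no) activation
            nn.Conv2d(hidden, channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # residual connection preserves feature information
        return x + self.block(x)
```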
Further, in step S3, the channel guide pyramid convolution module (CGPC) is configured to fuse curtain wall frame features more fully, and comprises: an inverted residual module (IRB), a channel attention module, convolution, channel concatenation, pixel-by-pixel addition, and other operations; the channel attention module adopts an efficient channel attention mechanism (ECA) to locate the curtain wall frame region and suppress background information;
the channel guide pyramid convolution module specifically comprises: the input first undergoes a 1×1 convolution and is then split into two paths; the first path is divided into four branches guided by the channel attention mechanism (ECA); the branches apply convolutions with different dilated (hole) convolution rates and are combined by pixel-by-pixel addition; the four branches are then channel-concatenated, passed through a 1×1 convolution, and added pixel by pixel to the first path; each subsequent layer then takes all preceding features as input, with convolution and the inverted residual module used to change the feature map size and reduce the parameter count.
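The efficient channel attention mechanism referenced above follows the published ECA-Net design; a minimal PyTorch sketch is given below, with the channel-adaptive kernel-size rule taken from the ECA paper rather than from this description:

```python
import math
import torch
import torch.nn as nn

class ECA(nn.Module):
    """Efficient Channel Attention: channel reweighting via a 1-D convolution
    over globally pooled channel descriptors (Wang et al., ECA-Net)."""
    def __init__(self, channels: int, gamma: int = 2, b: int = 1):
        super().__init__()
        k = int(abs((math.log2(channels) + b) / gamma))
        k = k if k % 2 else k + 1  # kernel size must be odd
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.pool(x)                                       # (N, C, 1, 1)
        w = self.conv(w.squeeze(-1).transpose(-1, -2))         # (N, 1, C)
        w = self.sigmoid(w.transpose(-1, -2).unsqueeze(-1))    # (N, C, 1, 1)
        # reweight channels: emphasize the frame region, suppress background
        return x * w
```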
Further, in step S3, a cosine annealing algorithm is used to train the curtain wall frame identification network model based on multi-scale boundary feature fusion, specifically: the initial learning rate is set to 0.002, training uses a stochastic gradient descent (SGD) optimizer, and the learning rate is adjusted with the cosine annealing algorithm, making the training process more stable and efficient. Meanwhile, because curtain wall frame images suffer from sample imbalance, with negative samples (background) far outnumbering positive samples (curtain wall frame), the focal loss function is used as the loss function during training.
Wherein the cosine annealing principle is expressed as:

$$\alpha_t = \alpha_{\min}^{i} + \frac{1}{2}\left(\alpha_{\max}^{i} - \alpha_{\min}^{i}\right)\left(1 + \cos\left(\frac{T_t}{Epoch}\,\pi\right)\right)$$

where $\alpha_t$ is the current learning rate; $i$ is the index of the current annealing cycle; $\alpha_{\min}^{i}$ and $\alpha_{\max}^{i}$ are the minimum and maximum learning rates; $T_t$ is the current number of training cycles; $Epoch$ is the total number of training cycles; and $\pi$ is the circle constant.
The focal loss function is:

$$\mathrm{Loss} = -\frac{1}{N}\sum_{i=1}^{N}\left[\alpha\,y_i\left(1-y_i'\right)^{\gamma}\log y_i' + (1-\alpha)\left(1-y_i\right)\left(y_i'\right)^{\gamma}\log\left(1-y_i'\right)\right]$$

where Loss is the loss value of the focal loss function, $N$ is the total number of pixels in a single curtain wall frame image, $y_i$ is the binary class label (0 or 1), $y_i'$ is the predicted probability that the output belongs to label $y_i$, the hyperparameter $\alpha$ controls the class imbalance, and the hyperparameter $\gamma$ controls the weighting of hard and easy samples.
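A minimal PyTorch implementation of the per-pixel binary focal loss above might look as follows; the default values of alpha and gamma are common choices, not values fixed by this description:

```python
import torch

def focal_loss(pred: torch.Tensor, target: torch.Tensor,
               alpha: float = 0.25, gamma: float = 2.0) -> torch.Tensor:
    """Pixel-wise binary focal loss matching the formula above.
    pred:   predicted foreground probabilities in (0, 1)
    target: binary labels (0 = background, 1 = curtain wall frame)"""
    eps = 1e-7
    pred = pred.clamp(eps, 1.0 - eps)
    # positive (frame) pixels, down-weighted when already well classified
    pos = alpha * target * (1.0 - pred) ** gamma * torch.log(pred)
    # negative (background) pixels
    neg = (1.0 - alpha) * (1.0 - target) * pred ** gamma * torch.log(1.0 - pred)
    return -(pos + neg).mean()
```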
Further, in step S4, calibrating the front-end camera and the laser radar of the curtain wall mounting robot specifically includes: calibrating the camera intrinsics with Zhang Zhengyou's calibration method, then jointly calibrating the camera and the laser radar with the P3P (Perspective-3-Point) algorithm, and converting the point cloud data from the laser radar coordinate system to the camera coordinate system. The images acquired by the camera are collated and locally adaptively thresholded to overcome uneven brightness across different regions of the image, and corners are detected with the Harris corner detection algorithm, improving the calibration accuracy of the camera intrinsics.
Combining the laser radar hardware measurement system with the trained curtain wall frame recognition network model specifically includes: mapping the identification result to the laser point cloud; detecting the curtain wall frame corner points in both the identification result and the laser radar data with a high-quality corner detection algorithm and outputting the coordinates of the four corner points; establishing the corresponding geometric relationships from the imaging principle and the collinearity between the curtain wall frame positions in the image and world coordinate systems and the camera center point, and solving the projection between the two, so that the laser point cloud and the camera image are registered and the curtain wall frame identification result is fused with spatial information. When the curtain wall frame identification result is output, the spatial position information of the frame is output as well, yielding the distances from the camera and laser radar to the frame center and the angles between the camera and the four frame corners; the three-dimensional spatial position and attitude of the target curtain wall frame are thus measured through pose estimation, providing fast and accurate installation parameters for curtain wall construction.
Scheme 2:
the curtain wall frame identification system based on multi-scale boundary feature fusion is characterized by comprising an image acquisition module, a laser ranging module, an AI embedded development board, a wireless communication module, a power supply module, a display module and a cloud server;
the image acquisition module is used for acquiring curtain wall frame images, inputting the acquired curtain wall frame images into a cloud server for network training, inputting the acquired curtain wall frame images into an AI embedded development board for real-time segmentation, and inputting the acquired curtain wall frame images into display equipment for visual display;
the laser ranging module adopts a laser radar and is used for acquiring the spatial position information of the curtain wall frame, and inputting the acquired spatial position information of the curtain wall frame into the AI embedded development board;
the AI embedded development board deploys the curtain wall frame recognition network model which is constructed and trained in the scheme 1 and is based on multi-scale boundary feature fusion to the AI embedded development board, is used for real-time curtain wall frame segmentation, and matches laser radar data to estimate the distance and the gesture of the curtain wall frame;
the wireless communication module is used for realizing communication between the AI embedded development board and the cloud server;
the power module is used for supplying power to the AI embedded development board, the image acquisition module, the laser ranging module and the display module;
the display module is communicated with the AI embedded development board, monitors the front-end curtain wall frame identification in real time, combines the identification result with the laser point cloud, and outputs the space information of the curtain wall frame;
the cloud server is used for receiving the acquired curtain wall frame images, training a curtain wall frame identification network model based on multi-scale boundary feature fusion, and deploying the optimal model on the AI embedded development board after obtaining optimal parameters.
The invention has the following beneficial effects: the invention designs an efficient curtain wall frame identification method based on multi-scale boundary feature fusion, which adopts a multi-scale boundary feature fusion module and a channel guide pyramid convolution module to fully fuse the boundary features of the curtain wall frame and accurately identify it; meanwhile, the invention designs an efficient curtain wall frame identification system based on multi-scale boundary feature fusion, which deploys the network model on an AI embedded development board for real-time, efficient identification of the curtain wall frame, with the identification effect monitored in real time on the display module.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objects and other advantages of the invention may be realized and obtained by means of the instrumentalities and combinations particularly pointed out in the specification.
Drawings
For the purpose of making the objects, technical solutions, and advantages of the present invention more apparent, the present invention is described in detail below with reference to the accompanying drawings, in which:
FIG. 1 is a schematic diagram of an implementation flow of a curtain wall frame efficient identification method based on multi-scale boundary feature fusion;
FIG. 2 is a diagram of a curtain wall frame efficient identification network structure based on multi-scale boundary feature fusion;
FIG. 3 is a block diagram of a multi-scale boundary feature fusion module of the present invention;
FIG. 4 is a block diagram of a channel guide pyramid convolution module of the present invention;
FIG. 5 is a structural diagram of the inverted residual module of the present invention;
FIG. 6 is a frame diagram of the curtain wall frame efficient identification system based on multi-scale boundary feature fusion of the invention.
Detailed Description
Other advantages and effects of the present invention will become apparent to those skilled in the art from the disclosure of this specification, which describes embodiments of the present invention with reference to specific examples. The invention may also be practiced or applied through other, different embodiments, and the details of this specification may be modified or varied without departing from the spirit and scope of the present invention. It should be noted that the illustrations provided in the following embodiments merely illustrate the basic idea of the invention schematically, and the following embodiments and their features may be combined with one another in the absence of conflict.
Example 1:
referring to fig. 1 to 6, the present embodiment provides a curtain wall frame identification method based on multi-scale boundary feature fusion, as shown in fig. 1, for accurately positioning a curtain wall frame, and implementing real-time and efficient segmentation of the curtain wall frame, and the method specifically includes the following steps:
step 1: and collecting color image data of the curtain wall frame by using a high-definition camera. Curtain wall frame types include: under different weather conditions, a plurality of curtain wall frames, low contrast curtain wall frames, curtain wall frames under noise and clutter, wherein the clutter interference is from remote buildings, railings, framing posts and the like.
Step 2: preprocessing the collected curtain wall frame image data. The preprocessing includes: screening out blurred images and expanding the data volume by three methods: translation transformation, flip transformation, and random cropping.
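A minimal sketch of the three augmentation methods, applied identically to the image and its label mask, is given below; the translation range and crop size are illustrative assumptions:

```python
import random
from PIL import Image
import torchvision.transforms.functional as TF

def augment(image: Image.Image, mask: Image.Image):
    """Expand the data set by flip, translation, and random crop.
    Geometric transforms must be applied identically to image and mask."""
    # random horizontal flip
    if random.random() < 0.5:
        image, mask = TF.hflip(image), TF.hflip(mask)
    # random translation (assumed +/- 10% of width/height)
    dx = int(random.uniform(-0.1, 0.1) * image.width)
    dy = int(random.uniform(-0.1, 0.1) * image.height)
    image = TF.affine(image, angle=0, translate=(dx, dy), scale=1.0, shear=0)
    mask = TF.affine(mask, angle=0, translate=(dx, dy), scale=1.0, shear=0)
    # random crop to a fixed size (assumed 512x512)
    h, w = 512, 512
    if image.height > h and image.width > w:
        i = random.randint(0, image.height - h)
        j = random.randint(0, image.width - w)
        image, mask = TF.crop(image, i, j, h, w), TF.crop(mask, i, j, h, w)
    return image, mask
```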
Step 3: constructing a curtain wall frame image data set and dividing it into a training set, a validation set, and a test set;
step 4: and marking the curtain wall frame area pixel by using data marking software. The data labeling software manually determines the curtain wall frame area by using labelme, and then converts the label into a binary image, wherein white is a target curtain wall frame and black is a background.
Step 5: constructing the efficient curtain wall frame identification network architecture based on multi-scale boundary feature fusion, as shown in fig. 2. The network comprises a boundary awareness module (BA), a multi-scale boundary feature fusion module (MBA), an inverted residual module (IRB), a channel guide pyramid convolution module (CGPC), and an average pooling layer. The boundary awareness module uses three-dimensional convolution to obtain a wider receptive field and extract curtain wall frame features. The multi-scale boundary feature fusion module uses the boundary awareness module, the inverted residual module, and the average pooling layer to fuse multi-scale curtain wall frame boundary features and improve recognition accuracy. The channel guide pyramid convolution module uses an efficient channel attention module (ECA) to improve the representation of curtain wall frame features, suppress background features, and improve the identification effect; it also reduces the network's parameter count and sets different dilation rates to extract features at different levels: smaller dilation rates capture the detail information of the curtain wall frame, while larger rates capture global features. Skip connections are added to strengthen the fusion of curtain wall frame features.
The multi-scale boundary feature fusion module (MBA) is shown in fig. 3. It fuses the curtain wall frame boundary features at different scales, specifically: the feature F0 extracted by convolution is combined with F1, F2, F3, and F4 respectively and input to the boundary awareness module (BA); the four results are channel-concatenated and input to the inverted residual module (IRB); the output is then split into five paths: the first path is multiplied pixel by pixel with feature F0 and then added to F0 to output B0; the second path applies the average pooling layer, multiplies pixel by pixel with feature F1, and adds F1 to output B1; the third path applies the average pooling layer, multiplies pixel by pixel with feature F2, and adds F2 to output B2; the fourth path applies the average pooling layer, multiplies pixel by pixel with feature F3, and adds F3 to output B3; the fifth path applies the average pooling layer, multiplies pixel by pixel with feature F4, and adds F4 to output B4. The boundary awareness module outputs a weight map, specifically: the feature F0 extracted by convolution is combined with F1, F2, F3, and F4 respectively; F0 and Fi are each input to a 1×1 convolution; after the convolution of Fi, two paths are taken: the first path applies a 3×7 convolution, maps the result with a Sigmoid activation function, and multiplies it pixel by pixel with the result of the F0 convolution; the result of the second path is subtracted from that of the first path pixel by pixel; an absolute-value operation is applied to the pixel-wise difference, followed by a 3×7 convolution and a Sigmoid mapping, and finally the boundary awareness weight map is output.
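A data-flow sketch of the MBA module is given below, assuming equal channel counts across scales and a factor-of-two resolution step from each Fi to Fi+1; the BA and IRB submodules are passed in as parameters, with BA's 3-D convolution internals omitted here:

```python
import torch
import torch.nn as nn

class MBA(nn.Module):
    """Sketch of the multi-scale boundary feature fusion data flow.
    `ba(f0, fi)` is assumed to return a weight map at F0's resolution;
    channel counts are assumed equal across all scales."""
    def __init__(self, ba: nn.Module, irb: nn.Module, channels: int):
        super().__init__()
        self.ba, self.irb = ba, irb
        # fuse the four channel-concatenated BA outputs back to `channels`
        self.fuse = nn.Conv2d(4 * channels, channels, kernel_size=1)
        self.pool = nn.AvgPool2d(kernel_size=2)

    def forward(self, f0, f1, f2, f3, f4):
        # BA weight maps for each (F0, Fi) pair, channel-concatenated
        w = torch.cat([self.ba(f0, fi) for fi in (f1, f2, f3, f4)], dim=1)
        w = self.irb(self.fuse(w))
        # path 1: multiply with F0 pixel by pixel, then add F0
        outs = [w * f0 + f0]
        # paths 2-5: average-pool the weights down to each scale, multiply, add
        for fi in (f1, f2, f3, f4):
            w = self.pool(w)
            outs.append(w * fi + fi)
        return outs  # B0 .. B4
```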
The channel guide pyramid convolution module (CGPC) is shown in fig. 4. It fuses the curtain wall frame features more thoroughly, comprising convolution, an attention mechanism, channel concatenation, pixel-by-pixel addition, and the inverted residual module (IRB): the input first undergoes a 1×1 convolution and is split into two paths; the first path is divided into four branches guided by the channel attention mechanism; the branches apply convolutions with different dilated (hole) convolution rates and are combined by pixel-by-pixel addition; the four branches are then channel-concatenated, passed through a 1×1 convolution, and added pixel by pixel to the first path; each subsequent layer then takes all preceding features as input, with convolution and the inverted residual module used to change the feature map size and reduce the parameter count.
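A sketch of the CGPC's attention-guided pyramid path is given below, reusing the ECA sketch from earlier; the dilation rates (1, 2, 4, 8) are assumptions, since the description only states that smaller rates capture detail and larger rates capture global context:

```python
import torch
import torch.nn as nn

class CGPC(nn.Module):
    """Sketch of the channel guide pyramid convolution: a 1x1 entry
    convolution, an ECA-guided split into four dilated branches with
    pixel-wise addition between branches, channel concatenation, 1x1
    fusion, and a residual add with the guide path."""
    def __init__(self, channels: int):
        super().__init__()
        assert channels % 4 == 0
        self.entry = nn.Conv2d(channels, channels, kernel_size=1)
        self.eca = ECA(channels)  # ECA as defined in the earlier sketch
        c = channels // 4
        self.branches = nn.ModuleList([
            nn.Conv2d(c, c, kernel_size=3, padding=d, dilation=d)
            for d in (1, 2, 4, 8)  # assumed dilation (hole) rates
        ])
        self.fuse = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.entry(x)
        chunks = torch.chunk(self.eca(x), 4, dim=1)  # four channel groups
        feats, prev = [], 0
        for chunk, conv in zip(chunks, self.branches):
            prev = conv(chunk + prev)  # pixel-wise add of previous branch
            feats.append(prev)
        out = self.fuse(torch.cat(feats, dim=1))
        return out + x  # pixel-by-pixel addition with the guide path
```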
As shown in fig. 5, the inverted residual module (IRB) is used to reduce the parameter count while preserving more feature information; the overall network has only 2.6M parameters. It specifically comprises: channel expansion using a 1×1 convolution with a ReLU activation function, feature extraction using a 3×3 depth-wise convolution with a ReLU activation function, and finally compression of the channel count back to the original number using a 1×1 convolution with a linear activation function.
Step 6: the focal loss function is chosen as the loss function in the training process, expressed as:

$$\mathrm{Loss} = -\frac{1}{N}\sum_{i=1}^{N}\left[\alpha\,y_i\left(1-y_i'\right)^{\gamma}\log y_i' + (1-\alpha)\left(1-y_i\right)\left(y_i'\right)^{\gamma}\log\left(1-y_i'\right)\right]$$

where Loss is the loss value of the focal loss function, $N$ is the total number of pixels in a single curtain wall frame image, $y_i$ is the binary class label (0 or 1), $y_i'$ is the predicted probability that the output belongs to label $y_i$, the hyperparameter $\alpha$ controls the class imbalance, and the hyperparameter $\gamma$ controls the weighting of hard and easy samples.
Step 7: the efficient curtain wall frame identification network model based on multi-scale boundary feature fusion is trained with a cosine annealing algorithm, specifically: the initial learning rate is set to 0.002, training uses the SGD optimizer, and the learning rate is adjusted with the cosine annealing algorithm, making the training process more stable and efficient. The cosine annealing principle is expressed as:

$$\alpha_t = \alpha_{\min}^{i} + \frac{1}{2}\left(\alpha_{\max}^{i} - \alpha_{\min}^{i}\right)\left(1 + \cos\left(\frac{T_t}{Epoch}\,\pi\right)\right)$$

where $\alpha_t$ is the current learning rate; $i$ is the index of the current annealing cycle; $\alpha_{\min}^{i}$ and $\alpha_{\max}^{i}$ are the minimum and maximum learning rates; $T_t$ is the current number of training cycles; $Epoch$ is the total number of training cycles, set to a fixed value of 200; and $\pi$ is the circle constant.
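Under these settings, the training loop can be sketched as follows; the momentum value and eta_min are assumptions not given in the text, and model, train_loader, and focal_loss are taken to be defined elsewhere (e.g., the sketches above):

```python
import torch
from torch.optim import SGD
from torch.optim.lr_scheduler import CosineAnnealingLR

def train(model, train_loader, focal_loss, epochs: int = 200):
    """SGD with initial lr 0.002 (per the description) and a cosine-annealed
    learning rate over 200 epochs; momentum=0.9 and eta_min=0.0 are assumed."""
    optimizer = SGD(model.parameters(), lr=0.002, momentum=0.9)
    scheduler = CosineAnnealingLR(optimizer, T_max=epochs, eta_min=0.0)
    for _ in range(epochs):
        for images, masks in train_loader:
            optimizer.zero_grad()
            preds = torch.sigmoid(model(images))  # per-pixel probabilities
            loss = focal_loss(preds, masks)       # focal loss from Step 6
            loss.backward()
            optimizer.step()
        scheduler.step()  # one cosine-annealing step per epoch
```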
Step 8: the camera intrinsics are calibrated with Zhang Zhengyou's calibration method; the camera and the laser radar are then jointly calibrated with the P3P (Perspective-3-Point) algorithm, and the point cloud data are converted from the laser radar coordinate system to the camera coordinate system. The images acquired by the camera are collated and locally adaptively thresholded to overcome uneven brightness across different regions of the image, and corners are detected with the Harris corner detection algorithm, improving the calibration accuracy of the camera intrinsics. The resulting camera intrinsic matrix can be expressed as:

$$K = \begin{bmatrix} f & 0 & u_0 \\ 0 & f & v_0 \\ 0 & 0 & 1 \end{bmatrix}$$

where $f$ is the focal length of the camera and $(u_0, v_0)$ is the origin of imaging (principal point), enabling the conversion from the world coordinate system to the pixel coordinate system:

$$Z_c \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \begin{bmatrix} R & T \end{bmatrix} \begin{bmatrix} X_W \\ Y_W \\ Z_W \\ 1 \end{bmatrix}$$

where $R$ is the rotation matrix of the camera, $T$ is the translation vector of the camera, $(u, v)$ are pixel coordinates, $(X_c, Y_c, Z_c)$ is the camera coordinate system, and $(X_W, Y_W, Z_W)$ is the world coordinate system.
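A sketch of this calibration pipeline using OpenCV is given below; the checkerboard grid size and thresholding parameters are illustrative assumptions, and note that cv2.solvePnP with the P3P flag requires exactly four point correspondences:

```python
import cv2
import numpy as np

def calibrate_and_register(checker_images, lidar_pts_3d, img_pts_2d):
    """Zhang's method for intrinsics (cv2.calibrateCamera over checkerboard
    views), then P3P (cv2.solvePnP, 4 point pairs) for the extrinsics that
    map lidar coordinates into the camera frame. A sketch, not the exact
    procedure of the invention."""
    obj_points, img_points = [], []
    pattern = (9, 6)  # assumed inner-corner grid of the checkerboard
    grid = np.zeros((pattern[0] * pattern[1], 3), np.float32)
    grid[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)
    for img in checker_images:
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        # local adaptive thresholding counters uneven illumination
        binary = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                       cv2.THRESH_BINARY, 31, 5)
        found, corners = cv2.findChessboardCorners(binary, pattern)
        if found:
            obj_points.append(grid)
            img_points.append(corners)
    h, w = checker_images[0].shape[:2]
    _, K, dist, _, _ = cv2.calibrateCamera(obj_points, img_points, (w, h),
                                           None, None)
    # joint camera-lidar calibration: P3P needs exactly 4 correspondences
    _, rvec, tvec = cv2.solvePnP(lidar_pts_3d, img_pts_2d, K, dist,
                                 flags=cv2.SOLVEPNP_P3P)
    R, _ = cv2.Rodrigues(rvec)  # rotation: lidar frame -> camera frame
    return K, dist, R, tvec
```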
Step 9: the CloudCompare software is used to select the laser point cloud region of the curtain wall frame, yielding the point cloud of the designated frame region and facilitating identification of the frame corners. The Harris 3D corner detection algorithm extracts the curtain wall frame corner points from the laser point cloud data and outputs the XYZ spatial coordinates of the four corners; the Harris 2D corner detection algorithm extracts the frame corner points from the curtain wall frame identification result and outputs their image coordinates. Corresponding geometric relationships are established from the imaging principle and the collinearity between the curtain wall frame positions in the image and world coordinate systems and the camera center point, and the projection between them is solved, mapping the identification result to the laser point cloud; the laser point cloud and the identification image are thereby registered and the spatial information is fused with the curtain wall frame identification result. The projection can be formulated as:

$$x = -f\,\frac{r_{11}(X_{FW}-X_{CW}) + r_{21}(Y_{FW}-Y_{CW}) + r_{31}(Z_{FW}-Z_{CW})}{r_{13}(X_{FW}-X_{CW}) + r_{23}(Y_{FW}-Y_{CW}) + r_{33}(Z_{FW}-Z_{CW})}, \qquad y = -f\,\frac{r_{12}(X_{FW}-X_{CW}) + r_{22}(Y_{FW}-Y_{CW}) + r_{32}(Z_{FW}-Z_{CW})}{r_{13}(X_{FW}-X_{CW}) + r_{23}(Y_{FW}-Y_{CW}) + r_{33}(Z_{FW}-Z_{CW})}$$

where $(x, y)$ are the image coordinates of the curtain wall frame identification result, $(X_{FW}, Y_{FW}, Z_{FW})$ are the world coordinates of a curtain wall frame corner point, $(X_{CW}, Y_{CW}, Z_{CW})$ are the world coordinates of the camera center, and $r_{ij}$ ($i = 1, 2, 3$; $j = 1, 2, 3$) are the rotation parameters between the image and world coordinates, composed from the attitude angles $\varphi$, $\omega$, and $\theta$ in the world coordinate system. With the projection relationship from laser point cloud to image established, the spatial position information of the curtain wall frame is output together with the identification result, yielding the distances from the camera and laser radar to the frame center and the angles between the camera and the four frame corners; the three-dimensional spatial position and attitude of the target curtain wall frame are thus measured through pose estimation, providing fast and accurate installation parameters for curtain wall construction.
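Once R, T, and K are known, projecting lidar points into the image for registration reduces to the pinhole relation above; a sketch (ignoring lens distortion) follows:

```python
import numpy as np

def project_lidar_to_image(points_xyz: np.ndarray, K: np.ndarray,
                           R: np.ndarray, T: np.ndarray) -> np.ndarray:
    """Project lidar points into pixel coordinates using the calibrated
    extrinsics (R, T) and intrinsics K, per Z_c [u, v, 1]^T = K (R X + T)."""
    cam = R @ points_xyz.T + T.reshape(3, 1)  # lidar/world -> camera frame
    uv = K @ cam                              # camera frame -> image plane
    uv = uv[:2] / uv[2]                       # perspective divide by Z_c
    return uv.T                               # (N, 2) pixel coordinates

# Usage: the distance to a registered frame corner follows from its 3-D
# coordinates, e.g. np.linalg.norm(corner_xyz - sensor_origin).
```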
Example 2:
referring to fig. 6, the invention further provides a curtain wall frame identification system based on multi-scale boundary feature fusion, which is used for accurately positioning a curtain wall frame and realizing real-time and efficient segmentation of the curtain wall frame, and comprises an image acquisition module, a laser ranging module, an AI embedded development board, a wireless communication module, a power module, a display module and a cloud server;
the image acquisition module is used for acquiring curtain wall frame images, inputting the acquired curtain wall frame images into a cloud server for network training, inputting the acquired curtain wall frame images into an AI embedded development board for real-time segmentation, and inputting the acquired curtain wall frame images into display equipment for visual display;
the laser ranging module adopts a laser radar and is used for acquiring the spatial position information of the curtain wall frame, and inputting the acquired spatial position information of the curtain wall frame into the AI embedded development board;
the AI embedded development board deploys the curtain wall frame identification network model based on multi-scale boundary feature fusion constructed in the embodiment 1 to the AI embedded development board for real-time curtain wall frame segmentation and matching with laser radar data to estimate the distance and the gesture of the curtain wall frame;
the wireless communication module is used for realizing communication between the AI embedded development board and the cloud server;
the power module is used for supplying power to the AI embedded development board, the image acquisition module, the laser ranging module and the display module;
the display module is communicated with the AI embedded development board, monitors the front-end curtain wall frame identification in real time, combines the identification result with the laser point cloud, and outputs the space information of the curtain wall frame;
the cloud server is used for receiving the acquired curtain wall frame images, training a curtain wall frame identification network model based on multi-scale boundary feature fusion, and deploying the optimal model on the AI embedded development board after obtaining optimal parameters.
The identification method of the system specifically comprises the following steps:
step 1: training a curtain wall frame efficient identification network model based on multi-scale boundary feature fusion at a cloud server, and deploying the trained curtain wall frame efficient identification network model based on multi-scale boundary feature fusion to an NIVDIA Jetson series development board for real-time curtain wall frame segmentation. Installing CUDA and cuDNN on the NIVDIA Jetson series development board by using a deep learning tool and configuring a PyTorch deep learning framework, improving the reasoning performance of the model, and running an application program by connecting to the NIVDIA Jetson series development board; the power module is used for supplying power to NIVDIA Jetson series development boards, cameras, laser radars and display screens.
Step 2: an optical filter is added to the industrial camera connected to the NVIDIA Jetson series development board. Because the curtain wall frame photographed by the camera is easily affected by light, which blurs the captured image and hinders frame identification, an optical filter is placed in front of the camera to filter out visible light of specific wavelengths and incoherent directions, reducing light interference and improving image quality.
Step 3: and an industrial camera is adopted to collect curtain wall frame images in real time.
Step 4: and collecting laser point cloud data of the curtain wall frame by adopting a laser radar.
Step 5: the wireless communication module enables communication between the NVIDIA Jetson series development board and the cloud server. The trained optimal model is downloaded from the cloud server and deployed to the development board.
Step 6: the trained optimal model is run on the NVIDIA Jetson series development board for real-time, efficient identification of curtain wall frames. The result is registered with the curtain wall frame laser point cloud data so that the identification result is fused with spatial information; the distances from the camera and laser radar to the frame center and the angles between the camera and the four frame corners are output, and the three-dimensional spatial position and attitude of the target curtain wall frame are measured through pose estimation.
Step 7: and displaying the curtain wall frame identification result on the display module in real time.
The invention achieves efficient and accurate identification of the curtain wall frame with a lightweight network architecture, and enables real-time monitoring at the terminal.
Finally, it is noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made thereto without departing from the spirit and scope of the present invention, which is intended to be covered by the claims of the present invention.

Claims (10)

1. The curtain wall frame identification method based on multi-scale boundary feature fusion is characterized by comprising the following steps of:
s1: collecting color image data of a curtain wall frame by using a high-definition camera;
s2: preprocessing the acquired image data, constructing a data set comprising a training set, a validation set, and a test set, and labeling the curtain wall frame region pixel by pixel with data annotation software;
s3: constructing a curtain wall frame identification network model based on multi-scale boundary feature fusion, wherein the network model comprises a multi-scale boundary feature fusion module and a channel guide pyramid convolution module, and the output of the multi-scale boundary feature fusion module is the input of the channel guide pyramid convolution module; training a curtain wall frame identification network model based on multi-scale boundary feature fusion by adopting a cosine annealing algorithm to obtain optimal parameters;
s4: calibrating a front-end camera and a laser radar of the curtain wall mounting robot, and registering the laser point cloud with the image; the hardware measurement system of the laser radar is combined with the trained curtain wall frame recognition network model to measure the three-dimensional spatial position and attitude of the target curtain wall frame.
2. The curtain wall frame identification method according to claim 1, wherein in step S1, the collected curtain wall frame color image data includes: a plurality of curtain wall frames, a low contrast curtain wall frame, and a curtain wall frame under noise and clutter.
3. The curtain wall frame identification method according to claim 1, wherein in step S2, preprocessing the acquired image data specifically comprises: screening out blurred images, expanding the data volume by translation transformation, flip transformation, and random cropping, and constructing a curtain wall frame image data set; dividing the data set into a training set, a validation set, and a test set; manually delineating the curtain wall frame region with data annotation software; and converting the labels into binary images, where white is the target curtain wall frame and black is the background.
4. The curtain wall frame identification method according to claim 1, wherein in step S3, the multi-scale boundary feature fusion module is configured to fuse the boundary features of curtain wall frames at different scales to improve the segmentation performance of the network, and comprises: a boundary awareness module, an inverted residual module, and an average pooling layer;
the multi-scale boundary feature fusion module specifically comprises: the feature F0 extracted by convolution is combined with the features F1, F2, F3, and F4 respectively and input to the boundary awareness module; the four results are channel-concatenated and input to the inverted residual module; the output is then split into five paths: the first path is multiplied pixel by pixel with feature F0 and then added to F0 to output B0; the second path applies the average pooling layer, multiplies pixel by pixel with feature F1, and adds F1 to output B1; the third path applies the average pooling layer, multiplies pixel by pixel with feature F2, and adds F2 to output B2; the fourth path applies the average pooling layer, multiplies pixel by pixel with feature F3, and adds F3 to output B3; the fifth path applies the average pooling layer, multiplies pixel by pixel with feature F4, and adds F4 to output B4; the features F0, F1, F2, F3, and F4 are the low-level detail information and high-level semantic information extracted from the data preprocessed in step S2.
5. The curtain wall frame identification method according to claim 1 or 4, wherein in step S3, the boundary awareness module is configured to output a weight map, and specifically comprises: the feature F0 extracted by convolution is combined with the features F1, F2, F3, and F4 respectively; F0 and Fi are each input to a 1×1 three-dimensional convolution, where i = 1, 2, 3, 4; after the convolution of Fi, two paths are taken: the first path applies a 3×7 three-dimensional convolution, maps the result with a Sigmoid activation function, upsamples the feature two-fold, and multiplies it pixel by pixel with the result of the F0 convolution; the second path upsamples the feature two-fold and subtracts the result of the first path pixel by pixel; an absolute-value operation is applied to the pixel-wise difference, followed by a 3×7 three-dimensional convolution and a Sigmoid mapping, and finally the boundary awareness weight map is output;
the inverted residual module is used to reduce the parameter count, and specifically comprises: channel expansion using a 1×1 convolution with a ReLU activation function, feature extraction using a 3×3 depth-wise convolution with a ReLU activation function, and finally compression of the channel count back to the original number using a 1×1 convolution with a linear activation function.
6. The curtain wall frame identification method according to claim 1, wherein in step S3, the channel guide pyramid convolution module is configured to fuse curtain wall frame features more fully, and comprises: an inverted residual module, a channel attention module, convolution, channel concatenation, and pixel-by-pixel addition; the channel attention module adopts an efficient channel attention mechanism to locate the curtain wall frame region and suppress background information;
the channel guide pyramid convolution module specifically comprises: the input first undergoes a 1×1 convolution and is then split into two paths; the first path is divided into four branches guided by the channel attention mechanism; the branches apply convolutions with different dilated (hole) convolution rates and are combined by pixel-by-pixel addition; the four branches are then channel-concatenated, passed through a 1×1 convolution, and added pixel by pixel to the first path; each subsequent layer then takes all preceding features as input, with convolution and the inverted residual module used to change the feature map size and reduce the parameter count.
7. The curtain wall frame identification method according to claim 1, wherein in step S3, a cosine annealing algorithm is adopted to train the curtain wall frame identification network model based on multi-scale boundary feature fusion, specifically comprising: setting an initial learning rate, training with a stochastic gradient descent optimizer, and adjusting the learning rate with the cosine annealing algorithm, making the training process more stable and efficient; meanwhile, because curtain wall frame images suffer from sample imbalance, with negative samples, namely the background, far outnumbering positive samples, namely the curtain wall frame, the focal loss function is used as the loss function during training;
wherein the cosine annealing principle is expressed as:

$$\alpha_t = \alpha_{\min}^{i} + \frac{1}{2}\left(\alpha_{\max}^{i} - \alpha_{\min}^{i}\right)\left(1 + \cos\left(\frac{T_t}{Epoch}\,\pi\right)\right)$$

where $\alpha_t$ is the current learning rate; $i$ is the index of the current annealing cycle; $\alpha_{\min}^{i}$ and $\alpha_{\max}^{i}$ are the minimum and maximum learning rates; $T_t$ is the current number of training cycles; $Epoch$ is the total number of training cycles; and $\pi$ is the circle constant;
the focal loss function is:

$$\mathrm{Loss} = -\frac{1}{N}\sum_{i=1}^{N}\left[\alpha\,y_i\left(1-y_i'\right)^{\gamma}\log y_i' + (1-\alpha)\left(1-y_i\right)\left(y_i'\right)^{\gamma}\log\left(1-y_i'\right)\right]$$

where Loss is the loss value of the focal loss function, $N$ is the total number of pixels in a single curtain wall frame image, $y_i$ is the binary class label (0 or 1), $y_i'$ is the predicted probability that the output belongs to label $y_i$, the hyperparameter $\alpha$ controls the class imbalance, and the hyperparameter $\gamma$ controls the weighting of hard and easy samples.
8. The curtain wall frame identification method according to claim 1, wherein in step S4, calibrating the front-end camera and the laser radar of the curtain wall mounting robot specifically comprises: calibrating the camera intrinsics with Zhang Zhengyou's calibration method, then jointly calibrating the camera and the laser radar with the P3P algorithm, and converting the point cloud data from the laser radar coordinate system to the camera coordinate system; collating the images acquired by the camera and applying local adaptive thresholding to overcome uneven brightness across different regions of the image, and detecting corners with the Harris corner detection algorithm to improve the calibration accuracy of the camera intrinsics; after the calibration of the camera and the laser radar is completed, acquiring images in real time and identifying the curtain wall frame with the trained curtain wall frame recognition network model to obtain the curtain wall frame identification result.
9. The curtain wall frame identification method according to claim 1, wherein in step S4, combining the laser radar hardware measurement system with the trained curtain wall frame recognition network model specifically comprises: mapping the identification result to the laser point cloud; detecting the curtain wall frame corner points in both the identification result and the laser radar data with a high-quality corner detection algorithm and outputting the coordinates of the four corner points; establishing the corresponding geometric relationships from the imaging principle and the collinearity between the curtain wall frame positions in the image and world coordinate systems and the camera center point, and solving the projection between the two, so that the laser point cloud and the camera image are registered and the curtain wall frame identification result is fused with spatial information; when the curtain wall frame identification result is output, outputting the spatial position information of the frame as well, yielding the distances from the camera and laser radar to the frame center and the angles between the camera and the four frame corners, so that the three-dimensional spatial position and attitude of the target curtain wall frame are measured through pose estimation, providing fast and accurate installation parameters for curtain wall construction.
10. The curtain wall frame identification system based on multi-scale boundary feature fusion is characterized by comprising an image acquisition module, a laser ranging module, an AI embedded development board, a wireless communication module, a power supply module, a display module and a cloud server;
the image acquisition module is used for acquiring curtain wall frame images, inputting the acquired curtain wall frame images into a cloud server for network training, inputting the acquired curtain wall frame images into an AI embedded development board for real-time segmentation, and inputting the acquired curtain wall frame images into display equipment for visual display;
the laser ranging module adopts a laser radar and is used for acquiring the spatial position information of the curtain wall frame, and inputting the acquired spatial position information of the curtain wall frame into the AI embedded development board;
the AI embedded development board hosts the curtain wall frame identification network model based on multi-scale boundary feature fusion according to any one of claims 1-9, and is used for real-time curtain wall frame segmentation and for matching the laser radar data to estimate the distance and attitude of the curtain wall frame;
the wireless communication module is used for realizing communication between the AI embedded development board and the cloud server;
the power module is used for supplying power to the AI embedded development board, the image acquisition module, the laser ranging module and the display module;
the display module is communicated with the AI embedded development board, monitors the front-end curtain wall frame identification in real time, combines the identification result with the laser point cloud, and outputs the space information of the curtain wall frame;
the cloud server is used for receiving the acquired curtain wall frame images, training a curtain wall frame identification network model based on multi-scale boundary feature fusion, and deploying the optimal model on the AI embedded development board after obtaining optimal parameters.
CN202311222984.3A 2023-09-20 2023-09-20 Curtain wall frame identification method and system based on multi-scale boundary feature fusion Pending CN117197789A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311222984.3A CN117197789A (en) 2023-09-20 2023-09-20 Curtain wall frame identification method and system based on multi-scale boundary feature fusion


Publications (1)

Publication Number Publication Date
CN117197789A true CN117197789A (en) 2023-12-08

Family ID

88995987

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311222984.3A Pending CN117197789A (en) 2023-09-20 2023-09-20 Curtain wall frame identification method and system based on multi-scale boundary feature fusion

Country Status (1)

Country Link
CN (1) CN117197789A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination