CN116805365A

CN116805365A - High-precision construction site falling stone detection method and device

Info

Publication number: CN116805365A
Application number: CN202310625498.XA
Authority: CN
Inventors: 祝东明; 林遵虎; 黄礼春; 王泽国; 王敏帅; 杨雁彬; 梁伟森; 孟繁志; 余荣曹; 郭正; 韦显坚
Original assignee: China Railway Construction Engineering Group No5 Construction Co ltd; China Railway Construction Engineering Group Co Ltd
Current assignee: China Railway Construction Engineering Group No5 Construction Co ltd; China Railway Construction Engineering Group Co Ltd
Priority date: 2023-05-30
Filing date: 2023-05-30
Publication date: 2023-09-26

Abstract

The invention provides a high-precision method and a device for detecting and identifying falling rocks of a construction site in a natural complex environment. The method comprises the steps of obtaining a first mountain rock photograph of a construction site; transmitting the acquired first mountain rock photograph of the construction site to an image target detection neural network model to obtain the position of a rock anchor frame, outputting a rock detection result diagram with the rock anchor frame, and setting the result diagram as a reference image; acquiring a second mountain rock photograph of the construction site, which is positioned at the same position as the first mountain rock photograph of the construction site; transmitting the acquired second mountain rock photograph of the construction site to an image target detection neural network model to obtain the position of a rock anchor frame, outputting a rock detection result diagram with the rock anchor frame, and setting the result diagram as a comparison image; and comparing the comparison image with the reference image, and sending out an alarm when the position deviation of the anchor frame of the rock exceeds a threshold value. The invention can effectively improve the site falling stone detection efficiency and reduce unnecessary casualties.

Description

High-precision construction site falling stone detection method and device

Technical Field

The invention relates to the technical field of computer vision, in particular to a high-precision method and device for detecting falling rocks of a construction site.

Background

Object detection is a computer vision technology that is now widely used in smart cities and smart construction sites; it lets a computer mark objects of different classes in a picture with anchor boxes. After Redmon et al. advanced image target detection with the YOLOv1 network structure, the positions of objects of different classes in a picture could be delimited with anchor frames. From YOLOv1, through the later YOLO networks designed by Bochkovskiy, Ultralytics and others up to YOLOv5, and on to the YOLOv8 target detection network designed by Ultralytics, the YOLO series has performed well in intelligent target detection systems.

Building construction is carried out in varied natural environments and under complex, changeable climate conditions, and natural disasters such as rock collapse and landslides readily occur in construction areas in mountain environments. Mountain rocks falling into the construction area seriously endanger the lives and property of personnel and the construction infrastructure, and such events occur suddenly; yet the traditional approach of manually inspecting for falling rocks involves a heavy workload and a high miss rate. There are currently few methods for detecting falling rocks at mountain construction sites, and they fall mainly into two types, contact and non-contact. Contact methods analyze and detect the signal change produced when a device such as a tension fence senses contact with a foreign object; non-contact methods capture signals with radar, visible light equipment or infrared equipment to identify falling rocks at the construction site.

In recent years, the speed and accuracy of target detection have advanced in fields such as road detection, and researchers have begun to introduce target detection technology into falling rock detection. Hu Xia et al. realized railway falling rock detection using a mixed attention mechanism and an improved YOLOX, obtaining high recognition precision. Liu Linya et al. used the YOLOv3 algorithm to construct a deep learning model for detecting falling rocks on mountain railway slopes: the method uses a smartphone and a visible light image acquisition device to collect rock samples from various mountain railways and build a rock sample data set, then exploits the compactness of the YOLOv3 algorithm to port the model to a phone, so that falling rocks are detected through a phone app; the method is more sensitive to, and better at capturing, falling rock targets of small volume.

Target detection technology based on deep learning not only effectively improves the recognition efficiency of site falling stone detection systems but also effectively improves falling stone detection performance. However, in construction areas with complex natural environments, for example where vegetation occludes rocks or the site is poorly lit, the recognition accuracy and recognition speed of such models are greatly degraded.

Disclosure of Invention

Aiming at complex construction site environments, the invention provides a high-precision construction site falling stone detection method and device for effectively improving the construction site falling stone detection efficiency and reducing unnecessary casualties.

In order to achieve the above purpose, the technical scheme of the invention is as follows:

in a first aspect, the present invention provides a high-precision method for detecting falling rocks in a worksite, including:

acquiring a first mountain rock photograph of a construction site;

transmitting the acquired first mountain rock photograph of the construction site to an image target detection neural network model to obtain the position of a rock anchor frame, outputting a rock detection result diagram with the rock anchor frame, and setting the result diagram as a reference image;

acquiring a second mountain rock photograph of the construction site, which is positioned at the same position as the first mountain rock photograph of the construction site;

transmitting the acquired second mountain rock photograph of the construction site to an image target detection neural network model to obtain the position of a rock anchor frame, outputting a rock detection result diagram with the rock anchor frame, and setting the result diagram as a comparison image;

and comparing the comparison image with the reference image, and sending out an alarm when the position deviation of the anchor frame of the rock exceeds a threshold value.

Further, the image target detection neural network model is improved with a network structure of YOLO series as a reference, and comprises:

using an adaptive activation function for the YOLOv8 backbone network;

constructing a self-adaptive residual attention module by using a residual network method;

the target detection head using YOLO outputs a rock detection result map with anchor frame.

Further, the adaptive activation function is:

$$ y = (p_1 - p_2)\,x \cdot \sigma\big(\beta\,(p_1 - p_2)\,x\big) + p_2\,x \tag{1} $$

the adaptive activation function uses the parameter β to adaptively select whether to activate the neuron output: when β tends to infinity the function output is nonlinear, and when β tends to 0 the function output is linear; in addition, p₁ and p₂ are introduced to control the upper and lower bounds of the adaptive activation function; in the formula, σ denotes the sigmoid function, and p₁ and p₂ are learnable adaptive tuning parameters.
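Formula (1) can be evaluated directly. The following is a minimal scalar sketch in plain Python, not the patent's implementation: the function names are illustrative, and the parameter values used in the demo (p₁ = 1, p₂ = 0, β = 1) are assumptions chosen only to show that the function then coincides with SiLU.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def adaptive_activation(x: float, p1: float, p2: float, beta: float) -> float:
    """Formula (1): y = (p1 - p2) * x * sigmoid(beta * (p1 - p2) * x) + p2 * x."""
    d = (p1 - p2) * x
    return d * sigmoid(beta * d) + p2 * x


def silu(x: float) -> float:
    """SiLU, the activation the text says YOLOv8 uses: x * sigmoid(x)."""
    return x * sigmoid(x)


if __name__ == "__main__":
    # Assumed demo values: with p1=1, p2=0, beta=1 the two columns agree.
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(x, adaptive_activation(x, p1=1.0, p2=0.0, beta=1.0), silu(x))
```

With β = 0 the sigmoid factor is fixed at 1/2, so the output is a straight line in x; with a very large β the sigmoid acts as a hard switch between the two linear pieces p₁x and p₂x.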

Further, p₁ and p₂ are initialized to 16 and are learned adaptively using a momentum-based parameter update, the calculation formula of which is shown in formula (2):

in the formula, μ is the momentum of the model, ε is the learning rate of the model, and the index i of pᵢ is 1 or 2; at the same time, regularization should not be used when updating the value of pᵢ, otherwise the value of pᵢ will drop to 0 during training.
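The remark about regularization can be illustrated numerically. Formula (2) is not reproduced in this text, so the sketch below assumes the classical momentum rule (velocity = μ·velocity − ε·gradient, then p = p + velocity); the function names, the default values of μ and ε, and the zero loss gradient are all hypothetical choices made only to isolate the effect of an L2 penalty on pᵢ.

```python
def momentum_step(p, velocity, grad, mu=0.9, eps=0.01):
    """One classical momentum update (assumed form, not the patent's formula (2))."""
    velocity = mu * velocity - eps * grad
    return p + velocity, velocity


def train_p(steps, weight_decay, loss_grad=0.0, p0=16.0):
    """Update a single p_i for `steps` iterations, starting from the stated value 16."""
    p, v = p0, 0.0
    for _ in range(steps):
        # L2 regularization contributes an extra gradient term weight_decay * p.
        grad = loss_grad + weight_decay * p
        p, v = momentum_step(p, v, grad)
    return p


if __name__ == "__main__":
    print("without regularization:", train_p(1000, weight_decay=0.0))
    print("with L2 regularization:", train_p(1000, weight_decay=0.5))
```

Under these assumptions the unregularized parameter keeps its value, while the regularized one is pulled to 0 even though the loss itself exerts no gradient, which matches the behaviour the text warns about.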

Further, the operation formula of the parameter beta is shown in formula (3):

in the formula, H and W are the dimensions of the input feature map, W₁ and W₂ are two convolution operations, a convolution scaling parameter of size 16 is added to the convolution operations, and σ is a sigmoid function.
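Formula (3) is not reproduced in this text, so the following is only a sketch of one reading of the description: each channel is summed over H and W, the result passes through two 1x1 convolutions (written here as small matrices, with the channel count reduced by the factor 16 between them), and a sigmoid yields one β per channel. The function name, the argument names and the weight values are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def channel_beta(x, w2_reduce, w1_expand):
    """Return one beta per channel for a feature map stored as x[c][h][w].

    w2_reduce: (C // 16) rows of C weights, the first 1x1 convolution.
    w1_expand: C rows of (C // 16) weights, the second 1x1 convolution.
    """
    pooled = [sum(sum(row) for row in channel) for channel in x]  # sum over H, W
    hidden = [sum(w * s for w, s in zip(row, pooled)) for row in w2_reduce]
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in w1_expand]
    return [sigmoid(z) for z in logits]


if __name__ == "__main__":
    channels = 16  # with a scaling factor of 16 the hidden layer has 1 channel
    x = [[[1.0, 2.0], [3.0, 4.0]] for _ in range(channels)]
    w2 = [[0.01] * channels]
    w1 = [[1.0] for _ in range(channels)]
    print(channel_beta(x, w2, w1))
```

Because β is computed from the feature map itself, each channel can settle on its own degree of activation instead of sharing one fixed constant.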

Further, the first mountain rock photograph of the construction site and the second mountain rock photograph of the construction site are obtained by:

monitoring equipment is arranged around a construction area, visible light video monitoring equipment is respectively arranged at four corners of the construction area, and visible light cameras are arranged on rectangular sides of the construction area at intervals to monitor the construction operation area;

and frames are extracted at a fixed frequency from the video information acquired by the video shooting equipment arranged in the construction area, so as to obtain the first mountain rock photograph of the construction site and the second mountain rock photograph of the construction site.
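Fixed-frequency frame extraction reduces to choosing which frame indices to keep. The sketch below assumes a known frame rate and a sampling interval in seconds; the function name and parameters are illustrative, and reading the actual frames from a camera stream is left out.

```python
from typing import List


def frame_indices(total_frames: int, fps: float, interval_s: float = 1.0) -> List[int]:
    """Indices of the frames kept when sampling one frame every `interval_s` seconds."""
    if total_frames <= 0 or fps <= 0 or interval_s <= 0:
        return []
    step = max(1, round(fps * interval_s))  # number of frames between kept frames
    return list(range(0, total_frames, step))


if __name__ == "__main__":
    kept = frame_indices(total_frames=100, fps=25.0, interval_s=1.0)
    # The first kept frame can serve as the first photograph (reference),
    # later ones as second photographs (comparison).
    reference_frame, comparison_frames = kept[0], kept[1:]
    print(reference_frame, comparison_frames)
```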

Further, a photograph input to the image target detection neural network model first undergoes a preprocessing operation to improve the visual effect of the image;

the preprocessing operation comprises image graying, data normalization and image enhancement.
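The three preprocessing operations can be sketched on a tiny image held as nested lists. The text does not say which graying or enhancement method is used, so this sketch assumes the common luminance weights 0.299/0.587/0.114 for graying, division by 255 for normalization, and a simple min-max contrast stretch as the enhancement; all function names are illustrative.

```python
def to_gray(rgb):
    """Convert rows of (r, g, b) pixels to gray values with assumed luminance weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb]


def normalize(gray):
    """Scale 8-bit gray values into the range [0, 1]."""
    return [[v / 255.0 for v in row] for row in gray]


def stretch_contrast(img):
    """Assumed enhancement: stretch values so they span the full [0, 1] range."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:  # flat image, nothing to stretch
        return [list(row) for row in img]
    return [[(v - lo) / (hi - lo) for v in row] for row in img]


def preprocess(rgb):
    """Graying, then normalization, then enhancement, in the order the text lists them."""
    return stretch_contrast(normalize(to_gray(rgb)))


if __name__ == "__main__":
    image = [[(0, 0, 0), (255, 255, 255)], [(100, 100, 100), (200, 200, 200)]]
    for row in preprocess(image):
        print(row)
```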

Further, the first mountain rock photograph of the construction site or the second mountain rock photograph of the construction site is input into an image target detection neural network model, and convolution operation with a convolution kernel of 3*3 is performed first to obtain abstract representation information of the photograph;

after the characterization information of the image is obtained, the adaptive activation function determines whether the convolution output is activated; the characteristic information is output when the convolution operation is activated, and the output is set to 0 when it is not activated;

the output image characteristic information continues through convolutions with a residual structure to acquire high-level semantic features of the image, wherein the residual network structure consists of a 1x1 convolution and a 3x3 convolution, the 1x1 convolution being used to extract low-dimensional features and the 3x3 convolution being used to extract high-dimensional features;

after the pixel characteristic diagram is obtained, it passes through the adaptive residual attention module, which processes it in two branches: the left branch uses a Bottom-up top-down structure to downsample the image to obtain high-level features and then restores the image size through upsampling; the right branch acquires the overall characteristic information of the image by convolution; finally, the outputs of the left branch and the right branch are integrated in a fully connected manner to obtain a rock detection result diagram with the anchor frame.

In a second aspect, the present invention provides a high precision worksite drop detection device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of any one of the methods described above when executing the computer program.

In a third aspect, the present invention provides a computer-readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the method according to any one of claims 1 to 8.

Compared with the prior art, the invention has the beneficial effects that:

the invention provides a high-precision construction site falling stone detection method for detecting and identifying construction site falling stones in complex natural environments. The method is realized by a target detection algorithm based on a deep learning neural network model; specifically, it combines a target detection neural network with an attention network, improving the attention network on the basis of the YOLOv8 target detection neural network to achieve high-precision falling stone detection. First, unlike the conventional way of constructing a target detection neural network, the method uses an activation function proposed by neural network search technology: the YOLOv8 backbone network is reconstructed by replacing its original activation function with the new one, which effectively improves the accuracy of the model, reduces its false detection rate, and improves the robustness of the features it extracts. Second, the method adopts an attention network to improve the feature extraction capability of the target detection network; adding the residual attention mechanism module effectively improves the model's ability to extract image semantic features, thereby improving the performance of the system on falling rock detection.

Drawings

Fig. 1 is a flowchart of a high-precision method for detecting falling rocks in a construction site provided in embodiment 1 of the present invention;

FIG. 2 is a schematic diagram of the composition of an image object detection neural network model;

FIG. 3 is a flow chart of a backbone network design of an image target detection neural network model;

FIG. 4 is a flow chart of an adaptive residual attention module design;

Detailed Description

The technical scheme of the invention is further described below with reference to the accompanying drawings and examples.

Example 1:

in order to prevent the occurrence of potential safety hazard events of mountain falling rocks in a construction site and reduce unnecessary casualties, as shown in fig. 1, the embodiment provides a high-precision method for detecting the falling rocks in the construction site, which specifically comprises the following steps:

101. acquiring a first mountain rock photograph of a construction site;

102. transmitting the acquired first mountain rock photograph of the construction site to an image target detection neural network model to obtain the position of a rock anchor frame, outputting a detection result diagram with the anchor frame, and setting the result diagram as a reference image;

103. acquiring a second mountain rock photograph of the construction site, taken at the same position as the first mountain rock photograph of the construction site;

104. transmitting the acquired second mountain rock photograph of the construction site to an image target detection neural network model to obtain the position of a rock anchor frame, outputting a detection result diagram with the anchor frame, and setting the result diagram as a comparison image;

105. and comparing the comparison image with the reference image, and sending out an alarm when the position deviation of the anchor frame of the rock exceeds a threshold value.
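Step 105 can be sketched as a comparison of two lists of anchor frames. The text does not specify how the position deviation is measured or how rocks are matched between images, so this sketch assumes boxes given as (x1, y1, x2, y2) in pixels, matches each reference box to the comparison box with the nearest center, and uses the center distance as the deviation. The function names and the threshold value are hypothetical.

```python
import math


def center(box):
    """Center point of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def max_deviation(reference, comparison):
    """Largest distance from a reference box center to its nearest comparison center."""
    if not reference:
        return 0.0
    if not comparison:
        return math.inf  # every reference rock has disappeared from view
    worst = 0.0
    for ref in reference:
        rx, ry = center(ref)
        nearest = min(math.hypot(rx - cx, ry - cy)
                      for cx, cy in map(center, comparison))
        worst = max(worst, nearest)
    return worst


def should_alarm(reference, comparison, threshold_px):
    """True when the anchor-frame deviation exceeds the threshold."""
    return max_deviation(reference, comparison) > threshold_px


if __name__ == "__main__":
    reference = [(10, 10, 30, 30), (100, 100, 140, 140)]
    comparison = [(11, 10, 31, 30), (100, 160, 140, 200)]  # second rock moved down
    print(should_alarm(reference, comparison, threshold_px=20.0))
```

Nearest-center matching is the simplest choice and can pair a displaced rock with a neighbouring one when rocks are densely packed; an overlap-based or one-to-one assignment would be a natural refinement.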

The image target detection neural network model is based on a YOLOv8 target detection network, an original activation function is replaced by a self-adaptive activation function in a YOLOv8 feature extraction main network to reconstruct a network structure, and neurons in the YOLOv8 main network are controlled through the self-adaptive activation function, so that erroneous output of the neurons is reduced, the generalization capability of the model is effectively improved, and the feature learning capability of the model is improved; secondly, compared with a target detection algorithm of YOLOv8, the invention provides a residual attention feature module, the processing capability of the model on effective feature information of the image is effectively improved by means of residual and attention ideas, the acquisition of the model on the edge feature information of the image is enhanced by utilizing a feature channel multiplication method, and the recognition capability of the model on small rocks is enhanced, so that the problems of low efficiency and low accuracy of a target detection network on a site falling rock detection task are reduced.

Specifically, the overall structure of the image target detection neural network model is shown in fig. 2:

the model is an improvement on the YOLO series network structure: first, an adaptive activation function is used in the YOLOv8 backbone network to improve the feature extraction capability of the neural network model; second, a residual attention network (namely the residual attention feature module) is constructed using the residual network idea to improve the model's ability to recognize small target rocks; finally, the YOLO target detection head outputs the final target detection result diagram. The backbone network architecture flow chart is shown in fig. 3.

The method adaptively selects whether to activate the convolution output by controlling the values of parameter variables. Most conventional target detection networks use the ReLU function as the activation function of the convolution module, but ReLU is too simple and suffers from the dying-neuron problem. YOLOv8 uses the SiLU function, which multiplies its input by the Sigmoid of that input and so acts as a smooth counterpart of ReLU. Because SiLU is non-monotonic and does not force its output to 0 the way ReLU does, the network can keep learning a large number of weights; however, retaining negative output values in the image domain reduces the recognition accuracy of the network model. The invention therefore replaces the activation function used by YOLOv8 with an adaptive activation function, which uses a parameter β to adaptively select whether to activate the neuron output: the function output is nonlinear when β tends to infinity and linear when β tends to 0. In addition, the method introduces p₁ and p₂ to control the upper and lower bounds of the adaptive activation function, reducing NaN outputs of the neural network caused by the unboundedness of SiLU. The adaptive activation function is shown in formula (1):

$$ y = (p_1 - p_2)\,x \cdot \sigma\big(\beta\,(p_1 - p_2)\,x\big) + p_2\,x \tag{1} $$

In the formula, σ denotes the sigmoid function, and p₁ and p₂ are learnable adaptive tuning parameters. p₁ and p₂ are initialized to 16 and are learned adaptively using a momentum-based parameter update, the calculation formula of which is shown in formula (2):

In the formula, μ is the momentum of the model, ε is the learning rate of the model, and the index i of pᵢ is 1 or 2. Regularization should not be used when updating the value of pᵢ; otherwise the value of pᵢ will drop to 0 during training.

In the adaptive activation function, β is a dynamic parameter: when β approaches infinity the function output approaches a nonlinear function, and when β=1 the adaptive activation function is equivalent to the SiLU function. On its own, however, the adaptive activation function differs little from the SiLU function, so the parameter β is generated in a manner similar to a channel-multiplication attention network, allowing the adaptive activation function to learn the degree of activation explicitly. The operation formula of the parameter β is shown in formula (3):
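The limiting behaviour described for β follows directly from formula (1). In the short derivation below, the SiLU case additionally assumes p₁ = 1 and p₂ = 0, values the text does not state explicitly:

$$
\begin{aligned}
\beta \to 0:&\quad \sigma(0) = \tfrac{1}{2} \;\Rightarrow\; y = \tfrac{1}{2}(p_1 - p_2)\,x + p_2\,x = \tfrac{p_1 + p_2}{2}\,x \quad \text{(linear in } x\text{)} \\
\beta \to \infty:&\quad \sigma\big(\beta (p_1 - p_2) x\big) \to
\begin{cases} 1, & (p_1 - p_2)\,x > 0 \\ 0, & (p_1 - p_2)\,x < 0 \end{cases}
\;\Rightarrow\; y \to \max(p_1 x,\; p_2 x) \quad \text{(piecewise linear, hence nonlinear)} \\
\beta = 1,\; p_1 = 1,\; p_2 = 0:&\quad y = x\,\sigma(x) = \mathrm{SiLU}(x)
\end{aligned}
$$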

In the formula, H and W are the dimensions of the input feature map, W₁ and W₂ are two convolution operations, a convolution scaling parameter of size 16 is added to the convolution operations, and σ is a sigmoid function.

YOLOv8 does not use an attention network to improve the image feature extraction capability of the neural network model; instead it uses a spatial feature pyramid method to enlarge the receptive field of the neural network and so obtain more information. The adaptive residual attention module used in the invention is shown in fig. 4. The module is built by stacking convolution structures, which makes the model easier to optimize and learn, and a Bottom-up top-down structure is designed on the branch of the attention module so that the model can extract image features during forward feature extraction.

The Bottom-up top-down structure extracts high-dimensional features through a series of convolution and pooling layer modules, enlarging the receptive field of the model, and uses the pixel information in the high-level features to extract effective features. Upsampling then restores the image size, so that the input feature map is output as a feature map of the original size carrying pixel characteristic information, which enhances the feature extraction capability of the model.
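The downsample-then-upsample idea can be sketched on a single-channel feature map with even height and width. This is a simplified illustration rather than the module of fig. 4: convolutions are omitted, one 2x2 max-pooling step stands in for the bottom-up path, nearest-neighbour upsampling for the top-down path, and the way the mask is combined with the other branch, (1 + mask) × trunk, is an assumption borrowed from common residual attention designs. All names are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def max_pool_2x2(img):
    """Bottom-up step: halve height and width by 2x2 max pooling (even sizes assumed)."""
    h, w = len(img), len(img[0])
    return [[max(img[i][j], img[i][j + 1], img[i + 1][j], img[i + 1][j + 1])
             for j in range(0, w - 1, 2)]
            for i in range(0, h - 1, 2)]


def upsample_nearest_2x(img):
    """Top-down step: restore the size by repeating every value in a 2x2 block."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def mask_branch(feature):
    """Downsample, upsample back to the input size, then squash to (0, 1)."""
    restored = upsample_nearest_2x(max_pool_2x2(feature))
    return [[sigmoid(v) for v in row] for row in restored]


def residual_attention(feature, trunk):
    """Assumed fusion rule: (1 + mask) * trunk, element by element."""
    mask = mask_branch(feature)
    return [[(1.0 + m) * t for m, t in zip(mrow, trow)]
            for mrow, trow in zip(mask, trunk)]


if __name__ == "__main__":
    feature = [[float(4 * r + c) for c in range(4)] for r in range(4)]
    trunk = [[1.0] * 4 for _ in range(4)]  # stand-in for the convolution branch output
    for row in residual_attention(feature, trunk):
        print(row)
```

The residual form keeps the trunk features intact where the mask is near 0 and amplifies them where the mask is near 1, so the mask can only emphasise features, not erase them.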

The method is further described below in connection with a specific application scenario example:

step 1: taking a construction site mountain rock photograph (namely a first construction site mountain rock photograph) taken by a camera installed according to a construction site scene as an input of an image target detection neural network model;

step 2: firstly, preprocessing operations such as image graying, data normalization, image enhancement and the like are carried out on a photo of an input model, so that the visual effect of an image is improved;

step 3: the picture is preprocessed and then input into an image target detection neural network model, convolution operation with a convolution kernel of 3*3 is performed first, and abstract representation information of the picture is obtained;

step 4: after the characteristic information of the image is obtained, judging whether the convolution output is activated or not through the self-adaptive activation function, outputting the characteristic information when the convolution operation is activated, and setting 0 when the convolution operation is not activated.

Step 5: the output image feature information continuously acquires semantic features of the high-level image through convolution of a residual structure, wherein the residual network structure is formed by 1x1 convolution and 3x3 convolution as shown in fig. 3, the 1x1 convolution is used for extracting low-dimensional features, and the 3x3 convolution is used for extracting high-dimensional features;

step 6: the image characteristic information is subjected to four convolution output selections and residual structure convolution and then is required to pass through a pyramid pooling network, so that pictures with different sizes are generated, different characteristics are generated by each picture, and finally, the pictures with all sizes are integrated to obtain a pixel characteristic diagram;

step 7: after obtaining the pixel characteristic diagram, the characteristic diagram passes through a self-adaptive residual error attention module, the pixel characteristic diagram is processed through two branches, the left branch of the self-adaptive residual error attention module utilizes a Bottom-up top-down structure to downsample an image to obtain a high layer, then the image size is restored through upsampling, the right branch of the self-adaptive residual error attention module obtains the integral characteristic information of the image through convolution, and finally the outputs of the left branch and the right branch are integrated together through a full connection mode to obtain a rock image of a construction area with an anchor frame, and the image is set as a reference image;

step 8: the visible light equipment in the construction area continuously monitors the rock area; one image per second (namely, a second mountain rock photograph of the construction site) is extracted from the monitoring video by frame extraction, a rock image with anchor frames is obtained from it in the manner of steps 2 to 7, and this image is called the comparison image;

step 9: comparing the comparison image with the reference image, and sending out an alarm when the position deviation of the rock anchor frame exceeds a threshold value.
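Step 6 above passes the features through a pyramid pooling network that produces maps of several sizes and then integrates them. The following is a minimal sketch of that idea, assuming the classic form of spatial pyramid pooling: a single-channel feature map is max-pooled into 1x1, 2x2 and 4x4 grids and the results are concatenated. The grid sizes and function names are assumptions; the text does not state the pooling sizes actually used.

```python
def pool_to_grid(feature, n):
    """Max-pool a 2D feature map into an n x n grid, returned row by row as a flat list."""
    h, w = len(feature), len(feature[0])
    out = []
    for i in range(n):
        r0 = (i * h) // n
        r1 = max(((i + 1) * h) // n, r0 + 1)  # every bin covers at least one row
        for j in range(n):
            c0 = (j * w) // n
            c1 = max(((j + 1) * w) // n, c0 + 1)  # and at least one column
            out.append(max(feature[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
    return out


def spatial_pyramid_pool(feature, levels=(1, 2, 4)):
    """Pool at each pyramid level and concatenate the results into one vector."""
    pooled = []
    for n in levels:
        pooled.extend(pool_to_grid(feature, n))
    return pooled


if __name__ == "__main__":
    feature = [[4 * r + c for c in range(4)] for r in range(4)]
    vector = spatial_pyramid_pool(feature)
    print(len(vector), vector[:5])
```

Because the output length depends only on the chosen grid sizes, coarse and fine views of the same feature map can be combined regardless of the input resolution.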

In summary, compared with the prior art, the invention has the following technical advantages:

in order to effectively improve the recognition performance of intelligent falling rock detection in a construction site, the scheme provides a main network of a self-adaptive activation function and a self-adaptive residual error attention network based on a deep learning target detection neural network technology, and has the advantages of high efficiency and strong portability.

In the deep learning neural network model designed by the scheme, firstly, unlike the traditional target detection neural network (a backbone network is constructed by using a traditional activation function), the backbone network designed by the scheme uses an adaptive activation function to realize the linear/nonlinear of the model adaptive control convolution so as to control whether to output image characteristic information. The method effectively reduces the output of invalid characteristic information, ensures the effectiveness of convolution output of the model to the maximum extent, and is beneficial to the model to acquire the effective characteristic pixel information.

In the feature fusion output stage, compared with the traditional feature output mode, feature fusion and output are directly carried out in a full-connection mode, the scheme considers that part of features can be lost in the full-connection process, and designs a self-adaptive residual attention mechanism module, the module adopts a two-path branch merging mode to extract semantic features of high-level images, can further acquire features of small target rocks, and is beneficial to further improving the performance of a system.

Example 2:

the high-precision construction site falling stone detection device provided by the embodiment comprises a processor, a memory and a computer program which is stored in the memory and can run on the processor, for example, the high-precision construction site falling stone detection program. The processor, when executing the computer program, implements the steps of embodiment 1 described above, such as the steps shown in fig. 1.

For example, the computer program may be divided into one or more modules/units, which are stored in the memory and executed by the processor to implement the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, the instruction segments describing the execution of the computer program in the high-precision worksite rock fall detection device.

The high-precision construction site falling stone detection device can be a desktop computer, a notebook computer, a palm computer, a cloud server and other computing equipment. The high-precision construction site falling stone detection device can comprise, but is not limited to, a processor and a memory.

The processor may be a central processing unit (Central Processing Unit, CPU), another general purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.

The memory may be an internal memory element of the high-precision work site fall detection device, such as a hard disk or a memory of the high-precision work site fall detection device. The memory may be an external storage device of the high-precision site falling stone detection apparatus, for example, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash Card (Flash Card) or the like provided on the high-precision site falling stone detection apparatus. Further, the memory may also include both an internal memory unit and an external memory device of the high-precision worksite falling rock detection apparatus. The memory is used for storing the computer program and other programs and data required by the high-precision construction site falling stone detection device. The memory may also be used to temporarily store data that has been output or is to be output.

Example 3:

the present embodiment provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method described in embodiment 1.

The computer readable medium can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer readable medium may even be paper or another suitable medium upon which the program is printed, such as by optically scanning the paper or other medium, then editing, interpreting, or otherwise processing as necessary, and electronically obtaining the program, which is then stored in a computer memory.

The above embodiments are only for illustrating the technical concept and features of the present invention, and are intended to enable those skilled in the art to understand the content of the present invention and implement the same, and are not intended to limit the scope of the present invention. All equivalent changes or modifications made in accordance with the essence of the present invention are intended to be included within the scope of the present invention.

Claims

1. A high-precision construction site falling stone detection method, characterized by comprising the following steps:

acquiring a first mountain rock photograph of a construction site;

2. The high-precision construction site falling stone detection method according to claim 1, wherein the image target detection neural network model is improved based on a network structure of YOLO series, comprising:

using an adaptive activation function for the YOLOv8 backbone network;

3. The high-precision worksite rockfall detection method according to claim 2, wherein the adaptive activation function is:

$$ y = (p_1 - p_2)\,x \cdot \sigma\big(\beta\,(p_1 - p_2)\,x\big) + p_2\,x \tag{1} $$

4. The high-precision construction site falling stone detection method according to claim 3, wherein p₁ and p₂ are initialized to 16 and are learned adaptively using a momentum-based parameter update, the calculation formula of which is shown in formula (2):

5. A high-precision construction site falling stone detection method as claimed in claim 3, wherein the operation formula of the parameter beta is as shown in formula (3):

6. The high-precision worksite rockfall detection method according to claim 1, wherein the first mountain rock photograph of the construction site and the second mountain rock photograph of the construction site are obtained by:

7. The high-precision construction site falling stone detection method according to claim 1, wherein a photograph input to the image target detection neural network model first undergoes a preprocessing operation to improve the visual effect of the image;

8. The high-precision construction site falling stone detection method as claimed in claim 5, wherein,

the first mountain rock photograph of the construction site or the second mountain rock photograph of the construction site is input into an image target detection neural network model, and convolution operation with a convolution kernel of 3*3 is performed first to obtain abstract representation information of the photograph;

9. A high precision worksite drop detection device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1 to 8 when executing the computer program.

10. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the method according to any one of claims 1 to 8.