CN116805365A - High-precision construction site falling stone detection method and device - Google Patents
High-precision construction site falling stone detection method and device
- Publication number
- CN116805365A (application CN202310625498.XA)
- Authority
- CN
- China
- Prior art keywords
- rock
- construction site
- image
- photograph
- convolution
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/28—Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Image Analysis (AREA)
Abstract
The invention provides a high-precision method and device for detecting and identifying falling rocks at a construction site in a complex natural environment. The method comprises: acquiring a first mountain rock photograph of the construction site; passing the first photograph to an image target detection neural network model to obtain the positions of the rock anchor boxes, outputting a rock detection result image with the anchor boxes, and setting this result image as the reference image; acquiring a second mountain rock photograph of the construction site taken from the same position as the first; passing the second photograph to the image target detection neural network model to obtain the positions of the rock anchor boxes, outputting a rock detection result image with the anchor boxes, and setting this result image as the comparison image; and comparing the comparison image with the reference image and raising an alarm when the positional deviation of a rock's anchor box exceeds a threshold. The invention can effectively improve the efficiency of falling rock detection at construction sites and reduce unnecessary casualties.
Description
Technical Field
The invention relates to the technical field of computer vision, and in particular to a high-precision method and device for detecting falling rocks at a construction site.
Background
Object (target) detection is a computer vision technology that is now widely used in smart cities and smart construction sites; it uses a computer to mark objects of different classes in a picture with anchor boxes. The YOLOv1 network structure designed by Redmon et al. advanced image target detection, making it possible to delimit the positions of objects of different classes in a picture with anchor boxes. From YOLOv1, through the YOLOv5 network structure designed by Bochkovskiy, Ultralytics, and others, to the YOLOv8 target detection network designed by Ultralytics, this family of networks has performed well in intelligent target detection systems.
Building construction is carried out in varied natural environments and under complex, changeable climate conditions, and construction areas in mountainous terrain are prone to natural disasters such as rock collapse and landslides. Rocks falling from the mountainside into the construction area seriously endanger the lives and property of personnel and the construction infrastructure, and such events occur suddenly; the traditional approach of manually inspecting for falling rocks involves a heavy workload and a high miss rate. There are currently few methods for detecting falling rocks at mountain construction sites, and they fall into two types, contact and non-contact. Contact methods analyze the signal change produced when devices such as tension fences sense contact with a foreign object; non-contact methods capture signals with radar, visible light equipment, or infrared equipment to identify falling rocks at the construction site.
In recent years, the speed and accuracy of target detection have advanced in fields such as road inspection, and researchers have begun to introduce target detection into falling rock detection. Hu Xia et al. detected railway falling rocks using a mixed attention mechanism with an improved YOLOX and obtained high recognition accuracy. Liu Linya et al. used the YOLOv3 algorithm to build a deep learning model for detecting falling rocks on mountain railway slopes: they collected rock samples from various mountain railways with smartphones and visible light image acquisition devices to build a rock sample dataset, then exploited the compactness of the YOLOv3 algorithm to port the model to a phone, so that falling rocks can be detected through a phone app; the method is also more sensitive to, and better at capturing, small falling rock targets.
Deep-learning-based target detection not only improves the recognition efficiency of site falling rock detection systems but also improves detection performance. However, in construction areas with complex natural environments, for example where vegetation occludes rocks or the site is poorly lit, the recognition accuracy and speed of such models degrade considerably.
Disclosure of Invention
To address complex construction site environments, the invention provides a high-precision construction site falling rock detection method and device that effectively improve detection efficiency and reduce unnecessary casualties.
To achieve the above purpose, the technical solution of the invention is as follows:
In a first aspect, the present invention provides a high-precision method for detecting falling rocks at a construction site, including:
acquiring a first mountain rock photograph of the construction site;
passing the first mountain rock photograph to an image target detection neural network model to obtain the positions of the rock anchor boxes, outputting a rock detection result image with the anchor boxes, and setting this result image as the reference image;
acquiring a second mountain rock photograph of the construction site, taken from the same position as the first mountain rock photograph;
passing the second mountain rock photograph to the image target detection neural network model to obtain the positions of the rock anchor boxes, outputting a rock detection result image with the anchor boxes, and setting this result image as the comparison image;
and comparing the comparison image with the reference image, and raising an alarm when the positional deviation of a rock's anchor box exceeds a threshold.
Further, the image target detection neural network model is an improvement based on the YOLO-series network structure, comprising:
using an adaptive activation function in the YOLOv8 backbone network;
constructing an adaptive residual attention module using the residual network approach;
using the YOLO target detection head to output a rock detection result image with anchor boxes.
Further, the adaptive activation function is:
$$y = (p_1 - p_2)\,x \cdot \sigma\big(\beta\,(p_1 - p_2)\,x\big) + p_2\,x \tag{1}$$
The adaptive activation function uses the parameter β to adaptively select whether to activate the neuron output: as β tends to infinity the function output is nonlinear, and as β tends to 0 the function output is linear. p_1 and p_2 are also introduced to control the upper and lower bounds of the adaptive activation function's output. In the formula, σ denotes the sigmoid function, and p_1 and p_2 are learnable adaptive parameters.
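The two limits stated above can be checked directly from formula (1). The short derivation below is a sketch that assumes p_1 ≠ p_2 and a fixed input x; this case analysis is not spelled out in the source.

$$
\begin{aligned}
\beta \to 0:&\quad \sigma\big(\beta (p_1 - p_2)x\big) \to \tfrac{1}{2}
\;\Rightarrow\; y \to \tfrac{1}{2}(p_1 - p_2)\,x + p_2\,x = \tfrac{1}{2}(p_1 + p_2)\,x \\
\beta \to \infty:&\quad \sigma\big(\beta (p_1 - p_2)x\big) \to \mathbb{1}\big[(p_1 - p_2)x > 0\big]
\;\Rightarrow\; y \to \max(p_1 x,\; p_2 x)
\end{aligned}
$$

Under these assumptions a small β gives a linear map with slope (p_1 + p_2)/2, while a large β gives a piecewise-linear maximum whose two slopes are p_1 and p_2, which is one way to read the statement that p_1 and p_2 bound the output.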
Further, p_1 and p_2 are initialized to 16 and learned adaptively using a momentum-based update; the calculation is given by formula (2):
In the formula, μ is the momentum of the model, ε is the learning rate of the model, and i in p_i takes the value 1 or 2. Regularization is not applied when updating p_i; otherwise the value of p_i would fall toward 0 during training.
Further, the parameter β is computed as shown in formula (3):
In the formula, H and W are the height and width of the input feature map, W_1 and W_2 are two convolution operations, to which a scaling parameter of 16 is added, and σ is the sigmoid function.
Further, the first and second mountain rock photographs of the construction site are obtained as follows:
monitoring equipment is arranged around the construction area: visible light video monitoring equipment is installed at each of the four corners of the construction area, and visible light cameras are installed at intervals along the sides of the rectangular construction area to monitor the construction operation area;
frames are captured at a fixed frequency from the video acquired by the video equipment installed in the construction area, yielding the first and second mountain rock photographs of the construction site.
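Fixed-frequency frame capture can be illustrated with a small index calculation. This is a minimal sketch, not the patented implementation: the function name `sampled_frame_indices`, the 25 frames-per-second rate, and the one-second interval are assumptions chosen for illustration, and the actual decoding of video frames is left out.

```python
def sampled_frame_indices(total_frames, fps, interval_s=1.0):
    """Return the indices of the frames to grab when sampling every `interval_s` seconds."""
    if total_frames < 0 or fps <= 0 or interval_s <= 0:
        raise ValueError("total_frames must be >= 0; fps and interval_s must be > 0")
    step = max(1, round(fps * interval_s))  # number of frames between two grabs
    return list(range(0, total_frames, step))


# Hypothetical 4-second clip at 25 frames per second, sampled once per second.
indices = sampled_frame_indices(total_frames=100, fps=25, interval_s=1.0)

# The first grabbed frame would serve as the first (reference) photograph,
# the later ones as second (comparison) photographs.
reference_index, comparison_indices = indices[0], indices[1:]
print(reference_index, comparison_indices)
```

Working with indices keeps the sampling rule independent of whichever video library performs the decoding.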
Further, a photograph input to the image target detection neural network model first undergoes preprocessing to improve the visual quality of the image;
the preprocessing comprises image graying, data normalization, and image enhancement.
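The three preprocessing operations can be sketched on a tiny image held as nested lists. The source does not say which graying weights or enhancement method are used, so the BT.601 luma weights and the min-max contrast stretch below are assumptions, as are all function names.

```python
def to_gray(rgb_image):
    """Image graying: weighted sum of R, G, B (ITU-R BT.601 luma weights, an assumption)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb_image]


def normalize(gray_image, max_value=255.0):
    """Data normalization: scale 8-bit intensities into the range [0, 1]."""
    return [[v / max_value for v in row] for row in gray_image]


def stretch_contrast(image):
    """Image enhancement stand-in: min-max contrast stretch (assumed, not from the source)."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:  # flat image: nothing to stretch
        return [list(row) for row in image]
    return [[(v - lo) / (hi - lo) for v in row] for row in image]


# Hypothetical 2x2 RGB image with a narrow range of intensities.
photo = [[(60, 60, 60), (120, 120, 120)],
         [(90, 90, 90), (100, 100, 100)]]
preprocessed = stretch_contrast(normalize(to_gray(photo)))
print(preprocessed)
```

After the stretch the darkest pixel maps to 0 and the brightest to 1, which is one simple way to improve visibility in poorly lit scenes.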
Further, the first or second mountain rock photograph of the construction site is input into the image target detection neural network model, and a convolution with a 3x3 kernel is first applied to obtain an abstract representation of the photograph;
after the representation information of the image is obtained, the adaptive activation function determines whether the convolution output is activated: the feature information is output when the convolution output is activated, and the output is set to 0 when it is not;
the output image feature information then passes through residual-structure convolutions to obtain the high-level semantic features of the image, where the residual network structure consists of a 1x1 convolution and a 3x3 convolution, the 1x1 convolution extracting low-dimensional features and the 3x3 convolution extracting high-dimensional features;
after the pixel feature map is obtained, it passes through the adaptive residual attention module, which processes it in two branches: the left branch uses a Bottom-up top-down structure to downsample the image to obtain high-level features and then restores the image size by upsampling; the right branch obtains the overall feature information of the image by convolution; finally, the outputs of the left and right branches are merged through a fully connected layer to obtain the rock detection result image with anchor boxes.
In a second aspect, the present invention provides a high-precision construction site falling rock detection device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, the processor implementing the steps of any one of the methods described above when executing the computer program.
In a third aspect, the present invention provides a computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 8.
Compared with the prior art, the invention has the beneficial effects that:
The invention provides a high-precision construction site falling rock detection method for detecting and identifying falling rocks at construction sites in complex natural environments. The method is implemented with a target detection algorithm based on a deep learning neural network model: it combines a target detection neural network with an attention network, improving the attention network on the basis of the YOLOv8 target detection network, to achieve high-precision falling rock detection. First, unlike the conventional way of constructing a target detection network, the method rebuilds the YOLOv8 backbone network with an activation function proposed through neural network search, replacing the original activation function with the new one; this effectively improves the accuracy of the model, reduces its false detection rate, and makes the features it extracts more robust. Second, the method adopts an attention network to improve the feature extraction capability of the target detection network: adding the residual attention module effectively improves the model's ability to extract image semantic features, and thereby the system's falling rock detection performance.
Drawings
Fig. 1 is a flowchart of a high-precision method for detecting falling rocks in a construction site provided in embodiment 1 of the present invention;
FIG. 2 is a schematic diagram of the composition of an image object detection neural network model;
FIG. 3 is a flow chart of a backbone network design of an image target detection neural network model;
FIG. 4 is a flow chart of an adaptive residual attention module design.
Detailed Description
The technical scheme of the invention is further described below with reference to the accompanying drawings and examples.
Example 1:
In order to prevent safety incidents caused by rocks falling from the mountainside at a construction site and to reduce unnecessary casualties, this embodiment provides, as shown in Fig. 1, a high-precision method for detecting falling rocks at a construction site, which comprises the following steps:
101. acquiring a first mountain rock photograph of the construction site;
102. passing the first mountain rock photograph to an image target detection neural network model to obtain the positions of the rock anchor boxes, outputting a detection result image with the anchor boxes, and setting this result image as the reference image;
103. acquiring a second mountain rock photograph of the construction site, taken from the same position as the first mountain rock photograph;
104. passing the second mountain rock photograph to the image target detection neural network model to obtain the positions of the rock anchor boxes, outputting a detection result image with the anchor boxes, and setting this result image as the comparison image;
105. comparing the comparison image with the reference image, and raising an alarm when the positional deviation of a rock's anchor box exceeds a threshold.
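The comparison in step 105 can be sketched as follows. The source does not define how positional deviation is measured or how boxes are paired between the two images, so this sketch assumes boxes given as (x1, y1, x2, y2) pixel corners, pairs each reference box with the nearest comparison box by center distance, and uses an arbitrary 20-pixel threshold; all names are hypothetical.

```python
from math import hypot


def center(box):
    """Center point of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def max_deviation(reference_boxes, comparison_boxes):
    """Largest distance from a reference box center to its nearest comparison box center."""
    if not reference_boxes:
        return 0.0
    if not comparison_boxes:
        return float("inf")  # every reference rock has disappeared from view
    comparison_centers = [center(b) for b in comparison_boxes]
    worst = 0.0
    for ref in reference_boxes:
        rx, ry = center(ref)
        nearest = min(hypot(rx - cx, ry - cy) for cx, cy in comparison_centers)
        worst = max(worst, nearest)
    return worst


def should_alarm(reference_boxes, comparison_boxes, threshold_px):
    """Raise the alarm when any rock's anchor box has moved more than the threshold."""
    return max_deviation(reference_boxes, comparison_boxes) > threshold_px


# Hypothetical detections: the second rock has moved 60 pixels downward.
reference = [(10, 10, 30, 30), (100, 100, 140, 140)]
comparison = [(10, 10, 30, 30), (100, 160, 140, 200)]
print(max_deviation(reference, comparison), should_alarm(reference, comparison, 20.0))
```

Nearest-center pairing is only one possible choice; an overlap-based measure such as intersection over union would fit the same interface.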
The image target detection neural network model is based on the YOLOv8 target detection network. First, in the YOLOv8 feature extraction backbone network the original activation function is replaced by the adaptive activation function to rebuild the network structure; the neurons in the YOLOv8 backbone are controlled through the adaptive activation function, which reduces erroneous neuron outputs and effectively improves the generalization and feature learning capability of the model. Second, compared with the YOLOv8 target detection algorithm, the invention adds a residual attention feature module: drawing on the ideas of residual connections and attention, it effectively improves the model's handling of useful image feature information, uses feature channel multiplication to strengthen the acquisition of image edge features, and strengthens the recognition of small rocks, thereby mitigating the low efficiency and low accuracy of target detection networks on the site falling rock detection task.
Specifically, the overall structure of the image target detection neural network model is shown in Fig. 2:
The model is an improvement based on the YOLO-series network structure. First, an adaptive activation function is used in the YOLOv8 backbone network, which improves the feature extraction capability of the neural network model. Second, a residual attention network (that is, the residual attention feature module) is constructed using the residual network idea, which improves the model's recognition of small target rocks. Finally, the YOLO target detection head outputs the final target detection result image. The backbone network architecture flow chart is shown in Fig. 3.
The method adaptively selects whether to activate the convolution output by controlling the value of a parameter. Most conventional target detection networks use the ReLU function as the activation function of the convolution module, but ReLU is overly simple and suffers from the dying-neuron problem. YOLOv8 uses the SiLU function, which multiplies the input by its sigmoid and so behaves like a smooth blend of the sigmoid and ReLU functions. Because SiLU is non-monotonic and does not force the output to 0 for negative inputs, its weights can continue to be learned; however, retaining negative outputs reduces the recognition accuracy of the network model on images. The invention therefore replaces the activation function used by YOLOv8 with an adaptive activation function. This function uses a parameter β to adaptively select whether to activate the neuron output: as β tends to infinity the function output is nonlinear, and as β tends to 0 it is linear. In addition, p_1 and p_2 are introduced to control the upper and lower bounds of the adaptive activation function's output, which reduces NaN outputs caused by the unboundedness of SiLU. The adaptive activation function is given by formula (1):
$$y = (p_1 - p_2)\,x \cdot \sigma\big(\beta\,(p_1 - p_2)\,x\big) + p_2\,x \tag{1}$$
Here σ denotes the sigmoid function, and p_1 and p_2 are learnable adaptive parameters. p_1 and p_2 are initialized to 16 and learned adaptively using a momentum-based update; the calculation is given by formula (2):
In the formula, μ is the momentum of the model, ε is the learning rate of the model, and i in p_i takes the value 1 or 2. Regularization is not applied when updating p_i; otherwise the value of p_i would fall toward 0 during training.
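Formula (2) itself is not reproduced in this text, so the following is only a sketch of a classical momentum update that is consistent with the symbols defined above (momentum μ, learning rate ε), not a transcription of the patent's formula. The toy flat loss, the weight-decay coefficient of 0.1, and all function names are assumptions; the point is to show why regularization on p_i would pull it toward 0.

```python
def momentum_step(p, velocity, grad, lr, momentum):
    """One classical momentum update: v <- mu * v - eps * grad, then p <- p + v."""
    velocity = momentum * velocity - lr * grad
    return p + velocity, velocity


def train_p(grad_fn, p_init=16.0, lr=0.1, momentum=0.9, weight_decay=0.0, steps=500):
    """Update a single parameter p_i; weight decay adds weight_decay * p to the gradient."""
    p, velocity = p_init, 0.0
    for _ in range(steps):
        grad = grad_fn(p) + weight_decay * p  # L2 regularization term, if enabled
        p, velocity = momentum_step(p, velocity, grad, lr, momentum)
    return p


def flat_loss_grad(p):
    """Toy loss that does not depend on p at all, so only regularization can move p."""
    return 0.0


p_plain = train_p(flat_loss_grad)                      # no regularization
p_decayed = train_p(flat_loss_grad, weight_decay=0.1)  # with regularization
print(p_plain, p_decayed)
```

With no regularization the parameter keeps its initial value of 16 on this toy loss, whereas with weight decay it shrinks to approximately 0, which matches the behaviour the text warns about.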
In the adaptive activation function, β is a dynamic parameter: as β approaches infinity the function output approaches a nonlinear function, and when β = 1 (with p_1 = 1 and p_2 = 0) the adaptive activation function is equivalent to the SiLU function. In that form, however, the adaptive activation function differs little from SiLU, so β is made a learned parameter, generated in a manner similar to a channel-wise multiplicative attention network, which lets the adaptive activation function explicitly learn the degree of activation. The parameter β is computed as shown in formula (3):
In the formula, H and W are the height and width of the input feature map, W_1 and W_2 are two convolution operations, to which a convolution scaling parameter of 16 is added, and σ is the sigmoid function.
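The activation in formula (1) and a channel-wise β in the spirit of the description above can be sketched in plain Python. Formula (3) is not reproduced in this text, so `channel_beta` is an assumption: it averages each channel over H x W, applies two 1x1 convolutions (written as matrix products, W_2 reducing the channel count by the factor 16 and W_1 restoring it), and passes the result through a sigmoid. The weights are made-up placeholders and all function names are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def adaptive_activation(x, p1, p2, beta):
    """Formula (1): y = (p1 - p2) * x * sigmoid(beta * (p1 - p2) * x) + p2 * x."""
    d = (p1 - p2) * x
    return d * sigmoid(beta * d) + p2 * x


def matvec(weights, vector):
    """A 1x1 convolution on a pooled feature vector is a matrix-vector product."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def channel_beta(feature_map, w1, w2):
    """Assumed form of beta: sigmoid(W1 @ (W2 @ per-channel mean over H x W))."""
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_map]
    hidden = matvec(w2, pooled)  # C -> C / 16
    return [sigmoid(z) for z in matvec(w1, hidden)]  # C / 16 -> C


# Hypothetical feature map with C = 16 channels of size 2 x 2, and reduction factor 16.
channels = 16
feature_map = [[[float(c)] * 2 for _ in range(2)] for c in range(channels)]
w2 = [[1.0 / channels] * channels]                         # shape (1, 16), placeholder
w1 = [[(c - 8) / 8.0] for c in range(channels)]            # shape (16, 1), placeholder
betas = channel_beta(feature_map, w1, w2)

# Apply the activation to one value per channel, each with its own beta.
outputs = [adaptive_activation(1.0, 1.0, 0.0, b) for b in betas]
print(betas[0], betas[-1], outputs[0])
```

Setting p_1 = 1, p_2 = 0, and β = 1 reduces `adaptive_activation` to x·σ(x), which is the SiLU case mentioned in the text.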
YOLOv8 does not use an attention network to improve the image feature extraction capability of the neural network model; instead it uses a spatial feature pyramid method to enlarge the receptive field of the network and thereby obtain more information. The adaptive residual attention module used in the invention is shown in Fig. 4. The module is built by stacking convolution structures, which makes the model easier to optimize and train, and a Bottom-up top-down structure is designed on one branch of the attention module so that the model can extract image features during the forward pass.
The Bottom-up top-down structure extracts high-dimensional features through a series of convolution and pooling layers, which enlarges the receptive field of the model, and uses the pixel information in the high-level features to extract effective features. The image size is then restored by upsampling, so that the output is a feature map the same size as the original input that carries pixel-level feature information, which strengthens the feature extraction capability of the model.
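The size bookkeeping of the Bottom-up top-down branch can be illustrated on a single-channel map. This sketch keeps only the pooling and upsampling steps and omits the convolutions; the 2x2 max pooling, nearest-neighbour upsampling, even input dimensions, and function names are all assumptions rather than details taken from the source.

```python
def max_pool_2x2(fmap):
    """Bottom-up step: halve height and width by taking the max of each 2x2 block."""
    h, w = len(fmap), len(fmap[0])
    return [[max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, w - 1, 2)]
            for i in range(0, h - 1, 2)]


def upsample_nearest_2x(fmap):
    """Top-down step: double height and width by repeating every value in a 2x2 block."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


# Hypothetical 4x4 single-channel feature map.
feature_map = [[1, 2, 3, 4],
               [5, 6, 7, 8],
               [9, 10, 11, 12],
               [13, 14, 15, 16]]

pooled = max_pool_2x2(feature_map)       # 2x2 map with a larger receptive field per cell
restored = upsample_nearest_2x(pooled)   # back to the original 4x4 size
print(pooled)
print(restored)
```

Each cell of the restored map summarizes a 2x2 neighbourhood of the input while the output keeps the input's spatial size, which is the property the paragraph above relies on.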
The method is further described below in connection with a specific application scenario example:
Step 1: a mountain rock photograph of the construction site (the first mountain rock photograph), taken by a camera installed to suit the construction site scene, is used as the input of the image target detection neural network model;
Step 2: the photograph input to the model first undergoes preprocessing operations such as image graying, data normalization, and image enhancement to improve the visual quality of the image;
Step 3: the preprocessed photograph is input into the image target detection neural network model, and a convolution with a 3x3 kernel is first applied to obtain an abstract representation of the photograph;
Step 4: after the feature information of the image is obtained, the adaptive activation function determines whether the convolution output is activated: the feature information is output when the convolution output is activated, and the output is set to 0 when it is not;
Step 5: the output image feature information then passes through residual-structure convolutions to obtain the high-level semantic features of the image, where the residual network structure, as shown in Fig. 3, consists of a 1x1 convolution and a 3x3 convolution, the 1x1 convolution extracting low-dimensional features and the 3x3 convolution extracting high-dimensional features;
Step 6: after four rounds of convolution, activation selection, and residual-structure convolution, the image feature information passes through a pyramid pooling network, which produces maps of different sizes, each yielding different features; the maps of all sizes are finally integrated to obtain the pixel feature map;
Step 7: after the pixel feature map is obtained, it passes through the adaptive residual attention module, which processes it in two branches: the left branch uses a Bottom-up top-down structure to downsample the image to obtain high-level features and then restores the image size by upsampling, and the right branch obtains the overall feature information of the image by convolution; finally, the outputs of the left and right branches are merged through a fully connected layer to obtain a rock image of the construction area with anchor boxes, and this image is set as the reference image;
Step 8: the visible light equipment in the construction area continuously monitors the rock area; one frame per second is extracted from the monitoring video (giving the second mountain rock photograph of the construction site), and each frame is processed as in steps 2 to 7 to obtain a rock image with anchor boxes, which is called the comparison image;
Step 9: the comparison image is compared with the reference image, and an alarm is raised when the positional deviation of a rock's anchor box exceeds a threshold.
In summary, compared with the prior art, the invention has the following technical advantages:
To improve the recognition performance of intelligent falling stone detection on construction sites, the scheme builds on deep learning target detection neural network technology and provides a backbone network with an adaptive activation function together with an adaptive residual attention network; it has the advantages of high efficiency and strong portability.
In the deep learning neural network model designed by the scheme, unlike a traditional target detection neural network, whose backbone network is built with a conventional activation function, the backbone network uses an adaptive activation function so that the model adaptively controls whether each convolution output is linear or nonlinear, and thereby whether image feature information is output. This reduces the output of invalid feature information, keeps the convolution outputs of the model as effective as possible, and helps the model acquire effective pixel feature information.
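The behaviour of the adaptive activation function given as formula (1) in claim 3 can be checked with a small scalar sketch. This is an illustration only, not the patent's implementation: the function names are hypothetical, and the values `p1 = 1`, `p2 = 0` used in the usage note are chosen for readability rather than taken from the text.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def adaptive_activation(x, p1, p2, beta):
    """Formula (1): y = (p1 - p2) * x * sigmoid(beta * (p1 - p2) * x) + p2 * x."""
    d = (p1 - p2) * x
    return d * sigmoid(beta * d) + p2 * x

# beta = 0 gives sigmoid(0) = 0.5, so y = (p1 + p2) / 2 * x, a linear map.
linear_outputs = [adaptive_activation(x, 1.0, 0.0, 0.0) for x in (-2.0, 2.0)]

# A large beta makes the sigmoid act like a step, so y approaches
# max(p1 * x, p2 * x); with p1 = 1 and p2 = 0 that is a ReLU-like curve.
gated_outputs = [adaptive_activation(x, 1.0, 0.0, 50.0) for x in (-2.0, 2.0)]
```

With these illustrative parameters, `linear_outputs` is `[-1.0, 1.0]` (a straight line of slope 0.5), while `gated_outputs` is approximately `[0.0, 2.0]`: the negative input is suppressed and the positive input passes through, which is the "activate or not" switch the paragraph above describes.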
In the feature fusion output stage, the traditional approach fuses and outputs features directly through full connection. Considering that some features may be lost in the full-connection process, the scheme designs an adaptive residual attention module. The module merges two branches to extract high-level image semantic features and can further capture the features of small rock targets, which helps to further improve the performance of the system.
Example 2:
The high-precision construction site falling stone detection device provided by this embodiment comprises a processor, a memory, and a computer program that is stored in the memory and can run on the processor, for example a high-precision construction site falling stone detection program. When executing the computer program, the processor implements the steps of embodiment 1 described above, such as the steps shown in fig. 1.
For example, the computer program may be divided into one or more modules/units, which are stored in the memory and executed by the processor to carry out the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions; the instruction segments describe the execution of the computer program in the high-precision construction site falling stone detection device.
The high-precision construction site falling stone detection device can be a desktop computer, a notebook computer, a palm computer, a cloud server and other computing equipment. The high-precision construction site falling stone detection device can comprise, but is not limited to, a processor and a memory.
The processor may be a central processing unit (Central Processing Unit, CPU), another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, discrete hardware components, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory may be an internal storage unit of the high-precision construction site falling stone detection device, such as a hard disk or internal memory of the device. The memory may also be an external storage device of the high-precision construction site falling stone detection device, for example a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card provided on the device. Further, the memory may include both an internal storage unit and an external storage device of the device. The memory is used for storing the computer program and other programs and data required by the high-precision construction site falling stone detection device. The memory may also be used to temporarily store data that has been output or is to be output.
Example 3:
This embodiment provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method described in embodiment 1.
The computer readable medium can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer readable medium may even be paper or another suitable medium upon which the program is printed, such as by optically scanning the paper or other medium, then editing, interpreting, or otherwise processing as necessary, and electronically obtaining the program, which is then stored in a computer memory.
The above embodiments are only for illustrating the technical concept and features of the present invention, and are intended to enable those skilled in the art to understand the content of the present invention and implement the same, and are not intended to limit the scope of the present invention. All equivalent changes or modifications made in accordance with the essence of the present invention are intended to be included within the scope of the present invention.
Claims (10)
1. A high-precision construction site falling stone detection method, characterized by comprising the following steps:
acquiring a first mountain rock photograph of a construction site;
transmitting the acquired first mountain rock photograph of the construction site to an image target detection neural network model to obtain the position of a rock anchor frame, outputting a rock detection result diagram with the rock anchor frame, and setting the result diagram as a reference image;
acquiring a second mountain rock photograph of the construction site, which is positioned at the same position as the first mountain rock photograph of the construction site;
transmitting the acquired second mountain rock photograph of the construction site to an image target detection neural network model to obtain the position of a rock anchor frame, outputting a rock detection result diagram with the rock anchor frame, and setting the result diagram as a comparison image;
and comparing the comparison image with the reference image, and sending out an alarm when the position deviation of the anchor frame of the rock exceeds a threshold value.
2. The high-precision construction site falling stone detection method according to claim 1, wherein the image target detection neural network model is improved based on a network structure of YOLO series, comprising:
using an adaptive activation function for the YOLOv8 backbone network;
constructing a self-adaptive residual attention module by using a residual network method;
using the target detection head of YOLO to output a rock detection result diagram with the rock anchor frame.
3. The high-precision construction site falling stone detection method according to claim 2, wherein the adaptive activation function is:
$$y = (p_1 - p_2)\,x \cdot \sigma\big(\beta\,(p_1 - p_2)\,x\big) + p_2\,x \qquad (1)$$
the adaptive activation function uses the parameter $\beta$ to adaptively select whether to activate the neuron output: when $\beta$ tends to infinity the function output is nonlinear, and when $\beta$ tends to 0 the function output is linear; $p_1$ and $p_2$ are also introduced to control the upper bound and the lower bound of the output of the adaptive activation function; in the formula, $\sigma$ denotes the sigmoid function, and $p_1$ and $p_2$ are learnable adaptive tuning parameters.
4. The high-precision construction site falling stone detection method according to claim 3, wherein $p_1$ and $p_2$ are initialized to a set value of 16 and are adaptively optimized using a momentum-based update method, the calculation formula being shown in formula (2):
in the formula, $\mu$ is the momentum of the model, $\epsilon$ is the learning rate of the model, and the index $i$ of $p_i$ takes the value 1 or 2; in addition, regularization should not be used when updating the value of $p_i$, otherwise the value of $p_i$ will drop to 0 during training.
5. The high-precision construction site falling stone detection method according to claim 3, wherein the calculation formula of the parameter $\beta$ is shown in formula (3):
in the formula, $H$ and $W$ are the dimensions of the input feature map, $W_1$ and $W_2$ are two convolution operations, a convolution scaling parameter of size 16 is added in the convolution operations, and $\sigma$ is the sigmoid function.
6. The high-precision construction site falling stone detection method according to claim 1, wherein the first mountain rock photograph of the construction site and the second mountain rock photograph of the construction site are obtained by:
arranging monitoring equipment around the construction area, wherein visible light video monitoring equipment is arranged at each of the four corners of the construction area, and visible light cameras are arranged at intervals along the rectangular sides of the construction area to monitor the construction operation area;
and extracting frames at a fixed frequency from the video acquired by the video equipment arranged in the construction area, so as to obtain the first mountain rock photograph of the construction site and the second mountain rock photograph of the construction site.
7. The high-precision construction site falling stone detection method according to claim 1, wherein a photograph input to the image target detection neural network model first undergoes a preprocessing operation to improve the visual effect of the image;
the preprocessing operation comprises image graying, data normalization and image enhancement.
8. The high-precision construction site falling stone detection method as claimed in claim 5, wherein,
the first mountain rock photograph of the construction site or the second mountain rock photograph of the construction site is input into the image target detection neural network model, and a convolution operation with a 3x3 convolution kernel is performed first to obtain abstract characterization information of the photograph;
after the characterization information of the image is obtained, the adaptive activation function judges whether the convolution output is activated; the feature information is output when the convolution output is activated, and the output is set to 0 when it is not activated;
the output image feature information then passes through residual-structure convolution to obtain high-level image semantic features, wherein the residual network structure consists of a 1x1 convolution and a 3x3 convolution, the 1x1 convolution being used for extracting low-dimensional features and the 3x3 convolution being used for extracting high-dimensional features;
after the pixel feature map is obtained, it passes through the adaptive residual attention module, which processes the pixel feature map in two branches: the left branch uses a bottom-up top-down structure to downsample the image to obtain high-level features and then restores the image size by upsampling; the right branch obtains the overall feature information of the image by convolution; finally, the outputs of the left and right branches are merged by full connection to obtain the rock detection result diagram with the rock anchor frame.
9. A high-precision construction site falling stone detection device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1 to 8 when executing the computer program.
10. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the method according to any one of claims 1 to 8.
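Formula (2) of claim 4 is not reproduced in this text. As an illustration of the kind of update the claim describes, the following Python sketch uses the classical momentum rule $v \leftarrow \mu v - \epsilon g$, $p_i \leftarrow p_i + v$, which is an assumption consistent with the symbols $\mu$ and $\epsilon$ defined in the claim and not necessarily the patent's exact formula. The function names, the zero-gradient scenario, and the `weight_decay` term (an L2 regularization strength added only to show the effect the claim warns about) are all hypothetical.

```python
def momentum_step(p, v, grad, mu=0.9, eps=0.01, weight_decay=0.0):
    """One classical momentum update (assumed form of formula (2)):
    v <- mu * v - eps * (grad + weight_decay * p);  p <- p + v."""
    v = mu * v - eps * (grad + weight_decay * p)
    return p + v, v

def run(steps, weight_decay):
    """Update one parameter with a zero task gradient, so that any change
    in p comes from the regularization term alone."""
    p, v = 16.0, 0.0  # 16 is the initial value stated in claim 4
    for _ in range(steps):
        p, v = momentum_step(p, v, grad=0.0, weight_decay=weight_decay)
    return p

p_without_regularization = run(2000, weight_decay=0.0)
p_with_regularization = run(2000, weight_decay=0.1)
```

In this artificial setting the parameter stays at 16 without regularization, whereas with the L2 term it decays toward 0, which matches the claim's remark that regularization should be left off when updating $p_i$.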
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310625498.XA CN116805365A (en) | 2023-05-30 | 2023-05-30 | High-precision construction site falling stone detection method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116805365A (en) | 2023-09-26 |
Family
ID=88080173
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310625498.XA Pending CN116805365A (en) | 2023-05-30 | 2023-05-30 | High-precision construction site falling stone detection method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116805365A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117087023A (en) * | 2023-10-17 | 2023-11-21 | 杭州泓芯微半导体有限公司 | Double-station linear cutting machine and control method thereof |
CN117876848A (en) * | 2024-03-13 | 2024-04-12 | 成都理工大学 | Complex environment falling stone detection method based on improved yolov5 |
CN117876848B (en) * | 2024-03-13 | 2024-05-07 | 成都理工大学 | Complex environment falling stone detection method based on improvement yolov5 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||