CN116797814A - Intelligent building site safety management system - Google Patents


Info

Publication number
CN116797814A
Authority
CN
China
Prior art keywords
interest, feature map, region, training, image
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211693226.5A
Other languages
Chinese (zh)
Inventor
车海宝
董永瑞
姜太平
刘媛
吴曦
苏子卿
刘戈
费双
张春茹
路晨升
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Xinjiang Construction Engineering Group Third Construction Engineering Co Ltd
Original Assignee
China Construction Xinjiang Construction Engineering Group Third Construction Engineering Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by China Construction Xinjiang Construction Engineering Group Third Construction Engineering Co Ltd filed Critical China Construction Xinjiang Construction Engineering Group Third Construction Engineering Co Ltd
Priority to CN202211693226.5A priority Critical patent/CN116797814A/en
Publication of CN116797814A publication Critical patent/CN116797814A/en
Pending legal-status Critical Current

Landscapes

  • Image Analysis (AREA)

Abstract

The application relates to the field of intelligent management, and specifically discloses an intelligent construction site safety management system. The system adopts a machine-vision-based artificial intelligence monitoring technique: the monitoring image is divided into blocks to capture small-size objects in the image more accurately; target detection is then performed on each image block to frame regions of interest; and after the image pixels are enhanced, globally implicit associated feature information of the regions of interest is extracted to detect and judge whether a personnel intrusion exists. In this way, personnel intrusion can be accurately detected in real time while large-scale equipment is operating, and an early-warning signal is issued when an intruder is detected, realizing safety management of the intelligent construction site and avoiding accidents while ensuring normal operation of the equipment.

Description

Intelligent building site safety management system
Technical Field
The application relates to the field of intelligent management, and more particularly relates to an intelligent building site safety management system.
Background
A smart construction site uses informatization means to accurately design and simulate the construction of engineering projects on a three-dimensional design platform. Centered on construction-process management, it builds an informatized ecosystem for the project that supports interconnected collaboration, intelligent production, and scientific management; it performs data-mining analysis on data and engineering information collected by the Internet of Things in a virtual-reality environment, providing process trend prediction and expert planning. This realizes visualized, intelligent management of engineering construction and improves the informatization level of engineering management, gradually achieving green and ecological construction.
When equipment on a smart construction site is operating, particularly large-scale equipment, accidents caused by the accidental intrusion of personnel can interrupt the normal operation of the equipment and seriously injure the intruding personnel.
Thus, an intelligent worksite safety management scheme is desired.
Disclosure of Invention
The present application has been made to solve the above technical problems. Embodiments of the application provide an intelligent construction site safety management system that adopts a machine-vision-based artificial intelligence monitoring technique: the monitoring image is divided into blocks to capture small-size objects more accurately, target detection is performed on each image block to frame regions of interest, and after the image pixels are enhanced, globally implicit associated feature information of the regions of interest is extracted to detect and judge whether a personnel intrusion exists. In this way, personnel intrusion can be accurately detected in real time while large-scale equipment is operating, and an early-warning signal is issued when an intruder is detected, realizing safety management of the intelligent construction site and avoiding accidents while ensuring normal operation of the equipment.
According to one aspect of the present application, there is provided an intelligent worksite security management system comprising:
the monitoring unit is used for acquiring monitoring images acquired by cameras of large-scale equipment deployed on the intelligent construction site;
the blocking unit is used for carrying out image blocking processing on the monitoring image to obtain a plurality of image blocks;
the image block target detection unit is used for respectively passing the plurality of image blocks through a target detection network to obtain at least one region of interest;
a region-of-interest pixel enhancement unit for passing the region of interest through an image enhancer based on a generative adversarial network (GAN) to obtain an enhanced region of interest;
the feature extraction unit is used for enabling the enhanced region of interest to pass through a convolutional neural network model serving as a feature extractor to obtain a region of interest feature map;
the feature enhancement unit is used for enabling the region-of-interest feature map to pass through a non-local neural network model to obtain an enhanced region-of-interest feature map; and
the safety management result generation unit, used for passing the enhanced region-of-interest feature map through a classifier to obtain a classification result, wherein the classification result indicates whether a personnel intrusion exists while the large-scale equipment operates.
In the above intelligent worksite safety management system, the blocking unit is further configured to perform uniform image blocking processing on the monitoring image to obtain the plurality of image blocks, wherein each of the plurality of image blocks has the same size.
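The uniform blocking step can be sketched as follows. This is a minimal NumPy illustration only; the 4 x 4 grid and the 480 x 640 frame size are assumed for the example and are not specified in the source.

```python
import numpy as np

def split_into_blocks(image, rows, cols):
    """Uniformly partition an H x W x C image into rows * cols equal-size
    blocks; assumes H and W are divisible by rows and cols respectively."""
    bh, bw = image.shape[0] // rows, image.shape[1] // cols
    return [image[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw]
            for r in range(rows) for c in range(cols)]

frame = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in monitoring image
blocks = split_into_blocks(frame, 4, 4)          # 16 blocks of 120 x 160 x 3
```

Because each block is smaller than the full frame, a person who occupied only a few pixels of the original image occupies a proportionally larger share of a block, which is the rationale the description gives for blocking before detection.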
In the above intelligent worksite safety management system, the image block target detection unit is further configured so that the target detection network is an anchor-window-based target detection network, such as Fast R-CNN, Faster R-CNN, or RetinaNet.
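As a toy illustration of how an anchor window slides over an image block to frame candidate regions of interest: a real system would use an anchor-based detector such as Faster R-CNN or RetinaNet, so the mean-brightness "objectness" score below is purely a placeholder assumption, as are the anchor size, stride, and threshold.

```python
import numpy as np

def frame_regions_of_interest(block, anchor=(64, 64), stride=32,
                              score_fn=None, thresh=0.5):
    """Slide a fixed anchor window over one image block and keep the windows
    whose (placeholder) objectness score exceeds a threshold. Returns ROIs
    as (x, y, width, height) tuples in block coordinates."""
    ah, aw = anchor
    h, w = block.shape[:2]
    # Toy objectness proxy: mean brightness; a learned network scores anchors
    # in a real detector.
    score_fn = score_fn or (lambda win: win.mean() / 255.0)
    rois = []
    for y in range(0, h - ah + 1, stride):
        for x in range(0, w - aw + 1, stride):
            if score_fn(block[y:y + ah, x:x + aw]) > thresh:
                rois.append((x, y, aw, ah))
    return rois

block = np.zeros((128, 128))
block[:64, :64] = 255.0          # a bright "object" in the top-left corner
rois = frame_regions_of_interest(block)
```

Only the window fully covering the bright corner passes the threshold here, mimicking how anchor scoring suppresses background windows.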
In the above intelligent worksite safety management system, the GAN-based image enhancer comprises a generator and a discriminator, and the region-of-interest pixel enhancement unit is further configured to input the region of interest into the generator of the GAN-based image enhancer, so that the generator deconvolves the region of interest to obtain the enhanced region of interest.
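The generator's core upsampling operation, transposed convolution ("deconvolution"), can be sketched on a single-channel map. This is a didactic stamp-and-add NumPy implementation of the operation itself, not the patent's actual generator; the kernel and stride are assumed for the example.

```python
import numpy as np

def deconv2d(x, kernel, stride=2):
    """Minimal transposed convolution on a single-channel map: each input
    pixel 'stamps' a scaled copy of the kernel into the output, the inverse
    pattern of a strided convolution."""
    h, w = x.shape
    kh, kw = kernel.shape
    out = np.zeros((h * stride + kh - stride, w * stride + kw - stride))
    for i in range(h):
        for j in range(w):
            out[i * stride:i * stride + kh, j * stride:j * stride + kw] += x[i, j] * kernel
    return out

x = np.ones((4, 4))              # coarse feature map
k = np.full((2, 2), 0.25)        # toy upsampling kernel
y = deconv2d(x, k)               # 8 x 8 output: spatial resolution doubled
```

With stride 2 and a 2 x 2 kernel the stamps tile without overlap, so the example simply spreads each input value over a 2 x 2 patch; larger kernels would overlap and blend, which is what lets a trained generator synthesize sharper detail.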
In the above intelligent worksite safety management system, the feature extraction unit is further configured so that each layer of the convolutional neural network model serving as the feature extractor performs the following on its input data in the forward pass: convolving the input data to obtain a convolution feature map; pooling the convolution feature map based on local feature matrices to obtain a pooled feature map; and applying a nonlinear activation to the pooled feature map to obtain an activated feature map. The output of the last layer of the convolutional neural network serving as the feature extractor is the region-of-interest feature map, and the input of its first layer is the enhanced region of interest.
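The per-layer forward pass described above (convolution, then pooling, then nonlinear activation) can be sketched for a single channel. The 'valid' convolution, 2 x 2 max pooling, and ReLU below are conventional choices assumed for illustration; the patent does not fix kernel sizes or the activation function.

```python
import numpy as np

def conv2d_valid(x, k):
    """'Valid' 2-D cross-correlation of a single-channel map with kernel k."""
    h, w = x.shape[0] - k.shape[0] + 1, x.shape[1] - k.shape[1] + 1
    return np.array([[np.sum(x[i:i + k.shape[0], j:j + k.shape[1]] * k)
                      for j in range(w)] for i in range(h)])

def max_pool2(x):
    """2 x 2 max pooling with stride 2; assumes even spatial dimensions."""
    h, w = x.shape[0] // 2, x.shape[1] // 2
    return x[:h * 2, :w * 2].reshape(h, 2, w, 2).max(axis=(1, 3))

def cnn_layer(x, k):
    """One feature-extractor layer: convolution -> pooling -> ReLU,
    mirroring the per-layer forward pass described in the text."""
    return np.maximum(max_pool2(conv2d_valid(x, k)), 0.0)

out = cnn_layer(np.ones((6, 6)), np.ones((3, 3)))   # 2 x 2 activated map
```

Stacking such layers shrinks the spatial size while deepening the features; the last layer's output is what the text calls the region-of-interest feature map.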
In the above intelligent worksite safety management system, the feature enhancement unit is further configured to: inputting the region of interest feature map into a first point convolution layer, a second point convolution layer and a third point convolution layer of the non-local neural network respectively to obtain a first feature map, a second feature map and a third feature map; calculating a weighted sum of the first feature map and the second feature map according to positions to obtain an intermediate fusion feature map; inputting the intermediate fusion feature map into a Softmax function to normalize feature values of each position in the intermediate fusion feature map so as to obtain a normalized intermediate fusion feature map; calculating a weighted sum of the normalized intermediate fusion feature map and the third feature map by position to obtain a re-fusion feature map; embedding a Gaussian similarity function into the re-fusion feature map to calculate the similarity between feature values of each position in the re-fusion feature map so as to obtain a global similarity feature map; the global similar feature map passes through a fourth point convolution layer of the non-local neural network to adjust the channel number of the global similar feature map so as to obtain a channel-adjusted global similar feature map; and calculating a position weighted sum of the channel-adjustment global similarity feature map and the region-of-interest feature map to obtain the enhanced region-of-interest feature map.
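The point-convolution and fusion steps above can be condensed into a sketch of the standard embedded-Gaussian non-local block, which is the formulation the description approximates. The 1 x 1 "point convolutions" become channel-mixing matrices here, and the exact position-wise fusion order in the claim is replaced by the usual attention product; both simplifications are assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def non_local_block(feat, w_theta, w_phi, w_g, w_out):
    """Sketch of a non-local (self-attention) block over a C x H x W feature
    map. w_theta, w_phi, w_g project C -> C' (the first three point
    convolutions); w_out projects C' -> C (the fourth, channel-adjusting one).
    The softmax-normalized pairwise similarity plays the role of the
    Gaussian-similarity map in the text."""
    c, h, w = feat.shape
    x = feat.reshape(c, h * w)                       # flatten spatial positions
    theta, phi, g = w_theta @ x, w_phi @ x, w_g @ x  # three point convolutions
    attn = softmax(theta.T @ phi, axis=-1)           # position-to-position similarity
    y = w_out @ (g @ attn.T)                         # aggregate, restore channel count
    return feat + y.reshape(c, h, w)                 # residual fusion with the input

rng = np.random.default_rng(0)
feat = rng.standard_normal((4, 3, 3))
w = rng.standard_normal((2, 4))
enhanced = non_local_block(feat, w, w, w, rng.standard_normal((4, 2)))
```

Because every output position aggregates over all positions, the block gives each region-of-interest feature a global receptive field, which is the "expanded feature receptive field" the text attributes to the non-local model.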
In the above intelligent worksite safety management system, the safety management result generating unit is further configured to: processing the enhanced region of interest feature map using the classifier to obtain a classification result with the following formula:
O = softmax{(W_n, B_n) : ... : (W_1, B_1) | Project(F)}, where Project(F) denotes projecting the enhanced region-of-interest feature map into a vector, W_1 to W_n are the weight matrices of the fully connected layers, and B_1 to B_n are the bias vectors of the fully connected layers.
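In NumPy, the classifier's pipeline of projection, stacked fully connected layers (W_i, B_i), and softmax reads as follows. The layer sizes are illustrative, and the ReLU between hidden layers is an assumption, since the source names only the softmax.

```python
import numpy as np

def classify(feature_map, layers):
    """Apply softmax over stacked fully connected layers to the flattened
    (projected) enhanced region-of-interest feature map.
    `layers` is [(W_1, B_1), ..., (W_n, B_n)]."""
    v = feature_map.reshape(-1)                 # Project(F): flatten to a vector
    for i, (w_mat, b) in enumerate(layers):
        v = w_mat @ v + b                       # fully connected layer
        if i < len(layers) - 1:
            v = np.maximum(v, 0.0)              # assumed ReLU between hidden layers
    e = np.exp(v - v.max())
    return e / e.sum()                          # class probabilities O

rng = np.random.default_rng(1)
fmap = rng.standard_normal((8, 4, 4))           # toy enhanced ROI feature map
layers = [(rng.standard_normal((16, 128)), np.zeros(16)),
          (rng.standard_normal((2, 16)), np.zeros(2))]  # 2 classes: intrusion / none
probs = classify(fmap, layers)
```

The two output probabilities correspond to the binary decision of whether a personnel intrusion exists while the equipment operates.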
The intelligent site safety management system further comprises a training module for training the target detection network, the GAN-based image enhancer, the convolutional neural network model serving as the feature extractor, the non-local neural network model, and the classifier. The training module includes:
the training data acquisition unit, used for acquiring training data comprising training monitoring images and a ground-truth label of whether a personnel intrusion exists while the large-scale equipment operates;
the training blocking unit, used for performing image blocking processing on the training monitoring image to obtain a plurality of training image blocks;
the training image block target detection unit, used for passing the plurality of training image blocks through the target detection network respectively to obtain at least one training region of interest;
the training region-of-interest pixel enhancement unit, used for passing the training region of interest through the GAN-based image enhancer to obtain a training enhanced region of interest;
the training feature extraction unit, used for passing the training enhanced region of interest through the convolutional neural network model serving as the feature extractor to obtain a training region-of-interest feature map;
the training feature enhancement unit, used for passing the training region-of-interest feature map through the non-local neural network model to obtain a training enhanced region-of-interest feature map;
the classification loss unit, used for passing the training enhanced region-of-interest feature map through the classifier to obtain a classification loss function value;
the intrinsic learning loss unit, used for calculating a sequence-to-sequence response-rule internalization learning loss function value based on the distance between the feature vectors obtained after projecting the enhanced region-of-interest feature map and the region-of-interest feature map; and
the training unit, used for computing a weighted sum of the classification loss function value and the sequence-to-sequence response-rule internalization learning loss function value as the loss function value, so as to train the target detection network, the GAN-based image enhancer, the convolutional neural network model serving as the feature extractor, the non-local neural network model, and the classifier.
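The training unit's weighted-sum objective can be sketched as follows. The patent's exact response-rule loss formula appears only as an image in the source, so a plain Euclidean-distance term between the two projected feature vectors stands in for it here, and the mixing weight alpha is an assumed hyperparameter; none of these specifics come from the text.

```python
import numpy as np

def cross_entropy(probs, label):
    """Classification loss for one sample, given classifier probabilities."""
    return -np.log(probs[label] + 1e-12)

def response_rule_loss(v1, v2):
    """Placeholder for the sequence-to-sequence response-rule internalization
    learning loss: Euclidean distance between the projected feature vectors
    (the source's actual formula is not reproduced in the text)."""
    return np.linalg.norm(v1 - v2)

def total_loss(probs, label, v1, v2, alpha=0.7):
    """Weighted sum of the classification loss and the response-rule loss,
    as described for the training unit; alpha is an assumed weight."""
    return alpha * cross_entropy(probs, label) + (1 - alpha) * response_rule_loss(v1, v2)

probs = np.array([0.9, 0.1])
loss = total_loss(probs, 0, np.ones(4), np.zeros(4))
```

All five trainable components receive gradients from this single scalar in the described scheme, so lowering either term tightens both the classification boundary and the coupling between the two feature maps.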
In the above intelligent worksite safety management system, the intrinsic learning loss unit is further configured to calculate the sequence-to-sequence response-rule internalization learning loss function value based on the distance between the feature vectors obtained after projecting the enhanced region-of-interest feature map and the region-of-interest feature map, according to the following formula:
where V_1 and V_2 are the feature vectors obtained after projecting the enhanced region-of-interest feature map and the region-of-interest feature map, respectively; W_1 and W_2 are the classifier's weight matrices for V_1 and V_2; ReLU(·) denotes the ReLU activation function; Sigmoid(·) denotes the Sigmoid activation function; ⊗ denotes matrix multiplication; d(·,·) denotes the Euclidean distance between two vectors; and ℒ denotes the sequence-to-sequence response-rule internalization learning loss function value.
According to another aspect of the present application, there is provided an intelligent worksite security management method, comprising:
acquiring a monitoring image acquired by a camera of large-scale equipment deployed at an intelligent building site;
performing image blocking processing on the monitoring image to obtain a plurality of image blocks;
respectively passing the plurality of image blocks through a target detection network to obtain at least one region of interest;
passing the region of interest through an image enhancer based on a generative adversarial network to obtain an enhanced region of interest;
the enhanced region of interest is passed through a convolutional neural network model as a feature extractor to obtain a region of interest feature map;
the region of interest feature map is passed through a non-local neural network model to obtain an enhanced region of interest feature map; and
passing the enhanced region-of-interest feature map through a classifier to obtain a classification result, wherein the classification result indicates whether a personnel intrusion exists while the large-scale equipment operates.
According to still another aspect of the present application, there is provided an electronic apparatus including: a processor; and a memory having stored therein computer program instructions that, when executed by the processor, cause the processor to perform the smart worksite security management method as described above.
According to yet another aspect of the present application, there is provided a computer readable medium having stored thereon computer program instructions which, when executed by a processor, cause the processor to perform the smart worksite security management method as described above.
Compared with the prior art, the intelligent construction site safety management system provided by the application adopts a machine-vision-based artificial intelligence monitoring technique: the monitored image is divided into blocks to capture small-size objects more accurately, target detection is performed on each image block to frame regions of interest, and after the image pixels are enhanced, globally implicit associated feature information of the regions of interest is extracted to detect and judge whether a personnel intrusion exists. In this way, personnel intrusion can be accurately detected in real time while large-scale equipment is operating, and an early-warning signal is issued when an intruder is detected, realizing safety management of the intelligent construction site and avoiding accidents while ensuring normal operation of the equipment.
Drawings
The above and other objects, features, and advantages of the present application will become more apparent from the following detailed description of embodiments of the application with reference to the accompanying drawings. The drawings provide a further understanding of the embodiments and constitute a part of this specification; they illustrate the application together with its embodiments and do not limit the application. In the drawings, like reference numerals generally refer to like parts or steps.
FIG. 1 is an application scenario diagram of an intelligent worksite security management system according to an embodiment of the present application;
FIG. 2 is a block diagram of an intelligent worksite security management system in accordance with an embodiment of the present application;
FIG. 3 is a block diagram of a training module in the intelligent worksite safety management system according to an embodiment of the present application;
FIG. 4 is a system architecture diagram of an intelligent worksite security management system in accordance with an embodiment of the present application;
FIG. 5 is a flow chart of convolutional neural network coding in an intelligent worksite security management system in accordance with an embodiment of the present application;
FIG. 6 is a flow chart of non-local neural network coding in an intelligent worksite security management system, according to an embodiment of the present application;
FIG. 7 is a system architecture diagram of a training module in an intelligent worksite security management system, in accordance with an embodiment of the present application;
FIG. 8 is a flow chart of a method of intelligent worksite security management according to an embodiment of the present application;
fig. 9 is a block diagram of an electronic device according to an embodiment of the application.
Detailed Description
Hereinafter, exemplary embodiments according to the present application will be described in detail with reference to the accompanying drawings. It should be apparent that the described embodiments are only some embodiments of the present application and not all embodiments of the present application, and it should be understood that the present application is not limited by the example embodiments described herein.
Scene overview
As described above in the Background section, a smart construction site uses informatization means to accurately design and simulate the construction of engineering projects on a three-dimensional design platform. Centered on construction-process management, it builds an informatized ecosystem for the project that supports interconnected collaboration, intelligent production, and scientific management; it performs data-mining analysis on data and engineering information collected by the Internet of Things in a virtual-reality environment, providing process trend prediction and expert planning. This realizes visualized, intelligent management of engineering construction and improves the informatization level of engineering management, gradually achieving green and ecological construction.
When equipment on a smart construction site is operating, particularly large-scale equipment, accidents caused by the accidental intrusion of personnel can interrupt the normal operation of the equipment and seriously injure the intruding personnel. Thus, an intelligent worksite safety management scheme is desired.
At present, deep learning and neural networks have been widely used in the fields of computer vision, natural language processing, speech signal processing, and the like. In addition, deep learning and neural networks have also shown levels approaching and even exceeding humans in the fields of image classification, object detection, semantic segmentation, text translation, and the like.
In recent years, deep learning and development of neural networks have provided new solutions and solutions for security management at intelligent sites.
At present, the existing safety management scheme for a smart construction site collects surrounding monitoring images through a camera installed on the large-scale equipment, so as to avoid accidents caused by the accidental intrusion of personnel. However, to make monitoring more comprehensive, the camera mounted on the large-scale equipment is usually placed at a high position, which causes a person to appear as a small-sized object in the image and makes such objects difficult to recognize. Moreover, since the monitoring image contains a great deal of interfering information, it is difficult to accurately analyze and identify an intruder, and hence difficult to perform safety management of the smart site.
Based on the above, the technical scheme of the application adopts a machine-vision-based artificial intelligence monitoring technique: the monitored image is divided into blocks to capture small-size objects more accurately, target detection is performed on each image block to frame regions of interest, and after the image pixels are enhanced, globally implicit associated feature information of the regions of interest is extracted to detect and judge whether a personnel intrusion exists. In this way, personnel intrusion can be accurately detected in real time while large-scale equipment is operating, and an early-warning signal is issued when an intruder is detected, realizing safety management of the intelligent construction site and avoiding accidents while ensuring normal operation of the equipment.
Specifically, in the technical scheme of the application, a monitoring image is first collected by the camera of the large-scale equipment deployed on the smart construction site. Next, it is considered that, to make monitoring more comprehensive, the camera is generally installed at a high position on the large-scale equipment, so that person objects in the monitoring image it collects are small-sized objects, which makes personnel intrusion detection difficult. Therefore, in the technical scheme of the application, the monitoring image is further subjected to image blocking processing to obtain a plurality of image blocks. In a specific example of the application, the monitoring image may undergo uniform image blocking to obtain the plurality of image blocks, where each image block has the same size. It should be appreciated that after the blocking process, the dimensions of each image block are reduced compared with the original image; a small-sized person object in the monitoring image is therefore no longer a small-sized object within its image block, which facilitates detection of an intruder.
It should be appreciated, however, that each image block of the monitoring image may contain a large number of interfering environmental features, which makes the judgment of personnel intrusion difficult. Therefore, in order to accurately detect an intruder while the large-scale equipment is operating, detection calibration needs to be performed on the person target object in each image block. That is, the image blocks are respectively passed through a target detection network, which detects whether a target object may exist in each image block to obtain at least one region of interest. Specifically, the target anchoring layer of the target detection network slides an anchor window B over each of the plurality of image blocks to frame the regions of interest, thereby obtaining the at least one region of interest. In one specific example of the application, the target detection network is an anchor-window-based target detection network such as Fast R-CNN, Faster R-CNN, or RetinaNet.
Further, it is considered that a large amount of small environmental particles, such as dust, exist on the operation site of the smart worksite, which affects the definition of the at least one region of interest in the monitored image; that is, the region of interest becomes blurred by interference from external environmental factors, reducing the accuracy of intruder identification and detection. Therefore, in the technical scheme of the application, image definition is enhanced by a generative adversarial network (GAN)-based image enhancer before feature extraction. Specifically, the region of interest is input into the generator of the GAN-based image enhancer, and the generator deconvolves the region of interest to obtain the enhanced region of interest. Here, the GAN comprises a generator, which produces the sharpness-enhanced image, and a discriminator, which measures the difference between the enhanced image and a real image; the generator's network parameters are updated by gradient-descent back-propagation to obtain a generator capable of image-definition enhancement.
Then, a convolutional neural network model, which as a feature extractor performs excellently in extracting local implicit features from images, can be used for feature mining of the enhanced region of interest, extracting implicit feature-distribution information about personnel intrusion and yielding a region-of-interest feature map.
Next, it is considered that convolution is a typical local operation: it can only extract local image features and cannot attend to the global context, which affects detection accuracy. The regions of interest of the image blocks are not isolated from one another; correlations between their feature distributions are what form a foreground object. Therefore, in the technical scheme of the application, in order to detect and judge personnel intrusion more accurately, a non-local neural network is used to further extract image features. That is, the region-of-interest feature map is passed through a non-local neural network model, which expands the feature receptive field, to obtain an enhanced region-of-interest feature map. In particular, the non-local neural network captures hidden dependency information by computing the similarity between the region-of-interest features of the image blocks, thereby modeling context features; the network thus attends to the global content across the image blocks' region-of-interest features, which improves the feature-extraction capability of the backbone network in classification and detection tasks.
Further, the enhanced region-of-interest feature map is used as a classification feature map and passed through a classifier to obtain a classification result indicating whether a personnel intrusion exists while the large-scale equipment operates. In this way, personnel intrusion can be accurately detected in real time during the operation of large-scale equipment, and an early-warning signal is issued when an intruder is detected.
In particular, in the technical scheme of the application, the non-local neural network can capture long-range dependency information by computing the similarity between all pixel positions of the image, so that the network focuses on globally associated image-feature semantics. However, the locally associated image-feature semantics within the region of interest, as expressed by the region-of-interest feature map, are also important. It is therefore desirable to improve the intrinsic responsiveness of the enhanced region-of-interest feature map relative to the region-of-interest feature map, so as to improve the accuracy of the classification result obtained by passing the enhanced region-of-interest feature map through the classifier.
Based on this, the applicant calculates a sequence-to-sequence response-rule internalization learning loss function between the enhanced region-of-interest feature map and the region-of-interest feature map, expressed as:
where V_1 and V_2 are the feature vectors obtained after projecting the enhanced region-of-interest feature map and the region-of-interest feature map, respectively, and W_1 and W_2 are the classifier's weight matrices for V_1 and V_2.
Here, the classifier's squeeze-and-excitation channel-attention mechanism over the weight matrices of the different sequences enhances the distinguishing capability between the feature-vector sequences obtained after unfolding the feature maps. Training the network with this loss function recovers causal-relationship features with better distinguishability between the response sequences, so as to internalize the cause-and-effect response rules between the sequences and strengthen the intrinsic responsiveness between them. In this way, the intrinsic responsiveness of the enhanced region-of-interest feature map relative to the region-of-interest feature map is improved, which strengthens its expression of the image's locally associated feature semantics, improves its classification effect, and further improves classification accuracy. Thus, personnel intrusion can be accurately detected in real time while large-scale equipment operates, and an early-warning signal is issued when an intruder is detected, realizing safety management of the intelligent construction site and avoiding accidents while ensuring normal operation of the equipment.
Based on this, the application proposes an intelligent worksite safety management system comprising: a monitoring unit for acquiring a monitoring image collected by a camera of large-scale equipment deployed on the smart construction site; a blocking unit for performing image blocking processing on the monitoring image to obtain a plurality of image blocks; an image block target detection unit for passing the plurality of image blocks through a target detection network respectively to obtain at least one region of interest; a region-of-interest pixel enhancement unit for passing the region of interest through an image enhancer based on a generative adversarial network to obtain an enhanced region of interest; a feature extraction unit for passing the enhanced region of interest through a convolutional neural network model serving as a feature extractor to obtain a region-of-interest feature map; a feature enhancement unit for passing the region-of-interest feature map through a non-local neural network model to obtain an enhanced region-of-interest feature map; and a safety management result generation unit for passing the enhanced region-of-interest feature map through a classifier to obtain a classification result, the classification result indicating whether a personnel intrusion exists while the large-scale equipment operates.
FIG. 1 is an application scenario diagram of an intelligent worksite security management system according to an embodiment of the present application. As shown in fig. 1, in this application scenario, a monitoring image is acquired by a camera (e.g., C as illustrated in fig. 1) of a large-scale device deployed at an intelligent worksite. The image is then input to a server (e.g., S in fig. 1) deployed with a smart worksite security management algorithm with which the server can process the image to generate a classification result indicating whether or not there is a person intrusion while the large-scale equipment is running.
Having described the basic principles of the present application, various non-limiting embodiments of the present application will now be described in detail with reference to the accompanying drawings.
Exemplary System
FIG. 2 is a block diagram of an intelligent worksite security management system in accordance with an embodiment of the present application. As shown in fig. 2, the smart worksite safety management system 300 according to an embodiment of the present application includes an inference module, wherein the inference module includes: a monitoring unit 310; a blocking unit 320; an image block target detection unit 330; a region of interest pixel enhancement unit 340; a feature extraction unit 350; a feature enhancing unit 360; and a security management result generation unit 370.
The monitoring unit 310 is configured to acquire a monitoring image collected by a camera of large-scale equipment deployed at the smart site; the blocking unit 320 is configured to perform image blocking processing on the monitoring image to obtain a plurality of image blocks; the image block object detection unit 330 is configured to pass the plurality of image blocks respectively through an object detection network to obtain at least one region of interest; the region-of-interest pixel enhancement unit 340 is configured to pass the region of interest through an image enhancer based on a generative adversarial network to obtain an enhanced region of interest; the feature extraction unit 350 is configured to pass the enhanced region of interest through a convolutional neural network model serving as a feature extractor to obtain a region-of-interest feature map; the feature enhancement unit 360 is configured to pass the region-of-interest feature map through a non-local neural network model to obtain an enhanced region-of-interest feature map; and the security management result generating unit 370 is configured to pass the enhanced region-of-interest feature map through a classifier to obtain a classification result, where the classification result is used to indicate whether personnel intrusion exists while the large-scale equipment is running.
Fig. 4 is a system architecture diagram of an intelligent worksite safety management system according to an embodiment of the present application. As shown in fig. 4, in the system architecture of the intelligent site safety management system 300, during inference, a monitoring image collected by a camera of large-scale equipment deployed at the intelligent site is first obtained by the monitoring unit 310; the blocking unit 320 performs image blocking processing on the monitoring image acquired by the monitoring unit 310 to obtain a plurality of image blocks; next, the image block object detection unit 330 passes the plurality of image blocks obtained by the blocking unit 320 respectively through an object detection network to obtain at least one region of interest; the region-of-interest pixel enhancement unit 340 passes the region of interest through an image enhancer based on a generative adversarial network to obtain an enhanced region of interest; then, the feature extraction unit 350 passes the enhanced region of interest obtained by the region-of-interest pixel enhancement unit 340 through a convolutional neural network model serving as a feature extractor to obtain a region-of-interest feature map; the feature enhancement unit 360 passes the region-of-interest feature map obtained by the feature extraction unit 350 through a non-local neural network model to obtain an enhanced region-of-interest feature map; further, the security management result generation unit 370 passes the enhanced region-of-interest feature map through a classifier to obtain a classification result indicating whether personnel intrusion exists while the large-scale equipment is operating.
Specifically, during operation of the intelligent site safety management system 300, the monitoring unit 310 and the blocking unit 320 are configured to obtain a monitoring image collected by a camera of large-scale equipment deployed on the smart site, and then to perform image blocking processing on the monitoring image to obtain a plurality of image blocks. It should be understood that during the operation of equipment on the smart site, especially large-scale equipment, accidents caused by accidental intrusion of personnel not only disrupt the normal operation of the equipment but can also cause serious casualties; monitoring images are therefore collected through the cameras of the large-scale equipment deployed on the smart site. Next, to make monitoring more comprehensive, it is considered that when intruders are actually being detected for the security management of the smart worksite, the camera is generally installed at a high position on the large-scale equipment, so that person objects in the monitoring image collected by the camera are small-sized objects, which makes personnel intrusion detection difficult. Therefore, in the technical scheme of the application, the monitoring image is further subjected to image blocking processing to obtain a plurality of image blocks. Accordingly, in a specific example of the present application, the monitoring image may be subjected to uniform image blocking processing to obtain the plurality of image blocks, where each of the plurality of image blocks has the same size.
It should be appreciated that after the image segmentation process, the dimensions of each image block of the plurality of image blocks are reduced compared to the original image, and thus, the person object of small size in the monitoring image is no longer a small-sized object in the image block, so as to facilitate detection of an intruder.
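The uniform blocking step can be sketched as follows; the 1080p frame size and the 4 x 4 grid are illustrative assumptions, not values taken from the application:

```python
import numpy as np

def block_image(image: np.ndarray, rows: int, cols: int) -> list:
    """Uniformly partition an H x W x C monitoring image into rows*cols
    equally sized blocks (assumes H and W are divisible by rows and cols)."""
    h, w = image.shape[:2]
    bh, bw = h // rows, w // cols
    return [image[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw]
            for r in range(rows) for c in range(cols)]

frame = np.zeros((1080, 1920, 3), dtype=np.uint8)  # hypothetical 1080p frame
blocks = block_image(frame, 4, 4)
```

Each resulting block covers one sixteenth of the frame, so a person occupying a small fraction of the full image occupies a proportionally larger fraction of a block.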
Specifically, during operation of the intelligent worksite safety management system 300, the image block target detection unit 330 is configured to pass the plurality of image blocks respectively through a target detection network to obtain at least one region of interest. It should be understood that each image block of the monitoring image may contain a large amount of interfering environmental information, which makes personnel intrusion difficult to determine; therefore, in order to accurately detect intruders while the large-scale equipment is running, the person target object in each image block needs to be detected and calibrated. That is, the image blocks are respectively passed through a target detection network, so that the target detection network detects whether a target object may exist in each image block, and at least one region of interest is obtained. Specifically, a target anchoring layer of the target detection network slides an anchor frame B over each of the plurality of image blocks to frame candidate regions of interest, thereby obtaining the at least one region of interest. In particular, in one specific example of the present application, the target detection network is an anchor-window-based target detection network such as Fast R-CNN, Faster R-CNN, or RetinaNet.
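The anchor-frame sliding performed by the target anchoring layer can be sketched as below; the block size, anchor size, and stride are illustrative assumptions, and a real detector such as Faster R-CNN additionally scores and regresses each candidate window rather than merely enumerating them:

```python
import numpy as np

def slide_anchor(block_h: int, block_w: int, anchor: int, stride: int) -> np.ndarray:
    """Slide a square anchor frame B over an image block, producing the
    candidate windows (x1, y1, x2, y2) that a detection head would score
    to decide which ones become regions of interest."""
    return np.array([(x, y, x + anchor, y + anchor)
                     for y in range(0, block_h - anchor + 1, stride)
                     for x in range(0, block_w - anchor + 1, stride)])

# hypothetical 64 x 64 block, 32-pixel anchor, 16-pixel stride
candidates = slide_anchor(64, 64, anchor=32, stride=16)
```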
Specifically, during operation of the intelligent worksite safety management system 300, the region-of-interest pixel enhancement unit 340 is configured to pass the region of interest through an image enhancer based on a generative adversarial network to obtain an enhanced region of interest. Considering that a large amount of small environmental particles such as dust exists at the operation site of the smart site, the definition of the at least one region of interest in the monitored image is affected; that is, the region of interest becomes blurred due to the interference of external environmental factors, which reduces the accuracy of identifying and detecting intruders. Therefore, in the technical scheme of the application, image definition is enhanced by the image enhancer based on the generative adversarial network before feature extraction. Specifically, the region of interest is input into the generator of the image enhancer, and the generator performs deconvolution processing on the region of interest to obtain the enhanced region of interest. In particular, the generative adversarial network comprises a generator and a discriminator, wherein the generator is used to generate an image with enhanced definition, and the discriminator is used to calculate the difference between the generated image and a real image, so that the network parameters of the generator can be updated by back-propagation with gradient descent to obtain a generator capable of enhancing image definition.
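A minimal PyTorch sketch of a generator of this kind follows, assuming a single downsampling convolution followed by a single transposed convolution (deconvolution); the depth and channel widths of the actual enhancer are not specified by the application:

```python
import torch
import torch.nn as nn

class EnhanceGenerator(nn.Module):
    """Sketch of the GAN generator: a convolutional encoder followed by a
    transposed-convolution (deconvolution) decoder that outputs an image
    of the same size as the blurred region of interest."""
    def __init__(self, ch: int = 3):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Conv2d(ch, 32, 4, stride=2, padding=1), nn.ReLU())
        self.decode = nn.Sequential(
            nn.ConvTranspose2d(32, ch, 4, stride=2, padding=1), nn.Tanh())

    def forward(self, x):
        return self.decode(self.encode(x))

roi = torch.randn(1, 3, 64, 64)       # hypothetical region-of-interest crop
enhanced = EnhanceGenerator()(roi)    # same spatial size as the input
```

In training, the discriminator's score of `enhanced` against a sharp reference image would supply the adversarial loss that drives the generator's updates.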
Specifically, during operation of the intelligent worksite safety management system 300, the feature extraction unit 350 is configured to pass the enhanced region of interest through a convolutional neural network model that is a feature extractor to obtain a region of interest feature map. In the technical scheme of the application, a convolutional neural network model which is used as a feature extractor and has excellent performance in the aspect of local implicit feature extraction of an image is used for carrying out feature mining on the enhanced region of interest, so as to extract implicit feature distribution information about personnel invasion in the enhanced region of interest, thereby obtaining a region of interest feature map. In one particular example, the convolutional neural network includes a plurality of neural network layers that are cascaded with one another, wherein each neural network layer includes a convolutional layer, a pooling layer, and an activation layer. In the coding process of the convolutional neural network, each layer of the convolutional neural network carries out convolutional processing based on a convolutional kernel on input data by using the convolutional layer in the forward transmission process of the layer, carries out pooling processing on a convolutional feature map output by the convolutional layer by using the pooling layer and carries out activation processing on the pooling feature map output by the pooling layer by using the activation layer. 
More specifically, the step of passing the enhanced region of interest through a convolutional neural network model as a feature extractor to obtain a region of interest feature map includes: each layer of the convolutional neural network model using the feature extractor performs, in forward transfer of the layer, input data: carrying out convolution processing on input data to obtain a convolution characteristic diagram; pooling the convolution feature images based on the local feature matrix to obtain pooled feature images; performing nonlinear activation on the pooled feature map to obtain an activated feature map; wherein the output of the last layer of the convolutional neural network serving as the feature extractor is the region of interest feature map, and the input of the first layer of the convolutional neural network serving as the feature extractor is the enhanced region of interest.
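The layer structure above (convolution, then pooling, then nonlinear activation) can be sketched in PyTorch as follows; the number of layers and channel widths are illustrative assumptions:

```python
import torch
import torch.nn as nn

def conv_block(cin: int, cout: int) -> nn.Sequential:
    # one feature-extractor layer = convolution -> pooling -> activation
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, padding=1),  # convolution processing
        nn.MaxPool2d(2),                     # pooling based on local regions
        nn.ReLU())                           # nonlinear activation

# two cascaded layers; the output of the last layer is the feature map
extractor = nn.Sequential(conv_block(3, 16), conv_block(16, 32))
fmap = extractor(torch.randn(1, 3, 64, 64))  # enhanced region of interest in
```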
FIG. 5 is a flow chart of convolutional neural network coding in an intelligent worksite security management system in accordance with an embodiment of the present application. As shown in fig. 5, the convolutional neural network coding process includes: each layer of the convolutional neural network model serving as the feature extractor performs, in the forward pass of that layer, on the input data: S210, carrying out convolution processing on the input data to obtain a convolution feature map; S220, pooling the convolution feature map based on a local feature matrix to obtain a pooled feature map; and S230, performing nonlinear activation on the pooled feature map to obtain an activation feature map; wherein the output of the last layer of the convolutional neural network serving as the feature extractor is the region-of-interest feature map, and the input of the first layer of the convolutional neural network serving as the feature extractor is the enhanced region of interest. Specifically, during operation of the intelligent worksite safety management system 300, the feature enhancement unit 360 is configured to pass the region-of-interest feature map through a non-local neural network model to obtain an enhanced region-of-interest feature map. Considering that convolution is a typically local operation, it can only extract local features of an image and cannot attend to the global context, which affects detection accuracy. Moreover, the regions of interest of the image blocks are not isolated from one another: it is the correlations between their feature distributions that constitute a foreground object. Therefore, in the technical scheme of the application, in order to detect and judge personnel intrusion more accurately, a non-local neural network is used to further extract features of the image.
That is, the region of interest feature map is passed through a non-local neural network model to expand the feature receptive field through the non-local neural network model, thereby obtaining an enhanced region of interest feature map. In particular, here, the non-local neural network captures hidden dependency information by calculating the similarity between the features of the regions of interest of each image block, so as to model the context features, so that the network focuses on the global overall content between the features of the regions of interest of each image block, and further, the feature extraction capability of the backbone network is improved in classification and detection tasks.
FIG. 6 is a flow chart of non-local neural network coding in an intelligent worksite security management system, according to an embodiment of the present application. As shown in fig. 6, in the non-local neural network encoding process, it includes: s310, inputting the region of interest feature map into a first point convolution layer, a second point convolution layer and a third point convolution layer of the non-local neural network respectively to obtain a first feature map, a second feature map and a third feature map; s320, calculating a weighted sum of the first feature map and the second feature map according to positions to obtain an intermediate fusion feature map; s330, inputting the intermediate fusion feature map into a Softmax function to normalize feature values of each position in the intermediate fusion feature map so as to obtain a normalized intermediate fusion feature map; s340, calculating a weighted sum of the normalized intermediate fusion feature map and the third feature map according to positions to obtain a re-fusion feature map; s350, embedding a Gaussian similarity function into the re-fusion feature map to calculate the similarity between feature values of each position in the re-fusion feature map so as to obtain a global similarity feature map; s360, the global similar feature map passes through a fourth point convolution layer of the non-local neural network to adjust the channel number of the global similar feature map so as to obtain a channel-adjusted global similar feature map; and S370, calculating a weighted sum of the channel adjustment global similarity feature map and the region of interest feature map by position to obtain the enhanced region of interest feature map.
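Steps S310 to S370 describe a non-local block. The PyTorch sketch below implements the standard embedded-Gaussian reading of those steps (pairwise similarity via matrix multiplication and a softmax, a fourth point convolution to restore the channel count, and a residual connection); channel sizes are illustrative assumptions:

```python
import torch
import torch.nn as nn

class NonLocalBlock(nn.Module):
    """Sketch of an embedded-Gaussian non-local block corresponding to
    steps S310-S370: three point (1x1) convolutions, a softmax-normalized
    similarity map over all positions, a fourth point convolution to
    adjust the channel number, and a residual sum with the input."""
    def __init__(self, c: int, c_mid: int = None):
        super().__init__()
        c_mid = c_mid or c // 2
        self.theta = nn.Conv2d(c, c_mid, 1)  # first point convolution
        self.phi = nn.Conv2d(c, c_mid, 1)    # second point convolution
        self.g = nn.Conv2d(c, c_mid, 1)      # third point convolution
        self.out = nn.Conv2d(c_mid, c, 1)    # fourth point convolution

    def forward(self, x):
        b, c, h, w = x.shape
        t = self.theta(x).flatten(2).transpose(1, 2)  # B x HW x C'
        p = self.phi(x).flatten(2)                    # B x C' x HW
        g = self.g(x).flatten(2).transpose(1, 2)      # B x HW x C'
        attn = torch.softmax(t @ p, dim=-1)           # position-pair similarity
        y = (attn @ g).transpose(1, 2).reshape(b, -1, h, w)
        return x + self.out(y)                        # residual enhancement

fmap = torch.randn(1, 32, 16, 16)       # hypothetical region-of-interest map
enhanced_map = NonLocalBlock(32)(fmap)  # same shape, globally enhanced
```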
Specifically, during the operation of the intelligent site safety management system 300, the safety management result generating unit 370 is configured to pass the enhanced region of interest feature map through a classifier to obtain a classification result, where the classification result is used to indicate whether a person invades when a large-scale device is operated. In the technical scheme of the application, the enhanced region of interest feature map is used as a classification feature map to be subjected to classification processing in a classifier so as to obtain a classification result used for indicating whether personnel invasion exists when large-scale equipment operates. Therefore, the personnel invasion can be accurately detected in real time when the large-scale equipment operates, so that an early warning signal is sent when the personnel invading in and out are detected. In a specific example of the present application, the step of passing the enhanced region of interest feature map through a classifier to obtain a classification result includes: processing the enhanced region of interest feature map using the classifier to obtain a classification result with the following formula:
O = softmax{(W_n, B_n) : … : (W_1, B_1) | Project(F)}, where Project(F) denotes projecting the enhanced region-of-interest feature map as a vector, W_1 to W_n are the weight matrices of the fully connected layers of each layer, and B_1 to B_n represent the bias vectors of the fully connected layers of each layer. Specifically, the classifier includes a plurality of fully connected layers and a Softmax layer cascaded with the last of the plurality of fully connected layers. In the classification process of the classifier, the enhanced region-of-interest feature map is first projected as a vector; for example, in a specific example, the enhanced region-of-interest feature map is expanded along a row vector or a column vector into a classification feature vector. Then, multiple rounds of full-connection coding are performed on the classification feature vector using the plurality of fully connected layers of the classifier to obtain a coded classification feature vector. Further, the coded classification feature vector is input into the Softmax layer of the classifier, that is, the coded classification feature vector is classified using the Softmax classification function to obtain a classification result for indicating whether personnel intrusion exists while the large-scale equipment is operating.
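A minimal sketch of such a classifier head follows; the input feature-map size, hidden width, and number of fully connected layers are assumptions for illustration:

```python
import torch
import torch.nn as nn

class IntrusionClassifier(nn.Module):
    """Sketch of the classifier: the enhanced region-of-interest feature
    map is flattened (Project(F)), passed through stacked fully connected
    layers (the (W_i, B_i) pairs), and a final Softmax yields the
    two-class result (intrusion / no intrusion)."""
    def __init__(self, in_dim: int, hidden: int = 256, classes: int = 2):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, classes))

    def forward(self, fmap):
        v = fmap.flatten(1)  # Project(F): feature map -> vector
        return torch.softmax(self.fc(v), dim=-1)

probs = IntrusionClassifier(32 * 16 * 16)(torch.randn(1, 32, 16, 16))
```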
It will be appreciated that, before inference is performed with the neural network models described above, the object detection network, the generative adversarial network-based image enhancer, the convolutional neural network model serving as the feature extractor, the non-local neural network model, and the classifier need to be trained. That is, the intelligent worksite safety management system of the present application further includes a training module for training the object detection network, the generative adversarial network-based image enhancer, the convolutional neural network model serving as the feature extractor, the non-local neural network model, and the classifier.
FIG. 3 is a block diagram of a training module in the intelligent worksite safety management system according to an embodiment of the present application. As shown in fig. 3, the intelligent worksite safety management system 300 according to an embodiment of the present application further includes a training module 400 comprising: a training data acquisition unit 410; a training blocking unit 420; a training image block object detection unit 430; a training region-of-interest pixel enhancement unit 440; a training feature extraction unit 450; a training feature enhancement unit 460; a classification loss unit 470; an internalization learning loss unit 480; and a training unit 490.
The training data obtaining unit 410 is configured to obtain training data, where the training data includes a training monitoring image and a true value of whether personnel intrusion exists while the large-scale equipment is running; the training blocking unit 420 is configured to perform image blocking processing on the training monitoring image to obtain a plurality of training image blocks; the training image block target detection unit 430 is configured to pass the plurality of training image blocks respectively through the target detection network to obtain at least one training region of interest; the training region-of-interest pixel enhancement unit 440 is configured to pass the training region of interest through the image enhancer based on the generative adversarial network to obtain a training enhanced region of interest; the training feature extraction unit 450 is configured to pass the training enhanced region of interest through the convolutional neural network model serving as the feature extractor to obtain a training region-of-interest feature map; the training feature enhancement unit 460 is configured to pass the training region-of-interest feature map through the non-local neural network model to obtain a training enhanced region-of-interest feature map; the classification loss unit 470 is configured to pass the training enhanced region-of-interest feature map through the classifier to obtain a classification loss function value; the internalization learning loss unit 480 is configured to calculate a sequence-to-sequence response rule internalization learning loss function value based on the distance between the feature vectors obtained after projection of the enhanced region-of-interest feature map and the region-of-interest feature map; and the training unit 490 is configured to compute a weighted sum of the classification loss function value and the sequence-to-sequence response rule internalization learning loss function value as a loss function value to train the target detection network, the generative adversarial network-based image enhancer, the convolutional neural network model serving as the feature extractor, the non-local neural network model, and the classifier.
Fig. 7 is a system architecture diagram of a training module in the intelligent worksite safety management system according to an embodiment of the present application. As shown in fig. 7, in the system architecture of the intelligent worksite safety management system 300, during the training process, training data is first acquired by the training data acquisition unit 410, where the training data includes a training monitoring image and a true value of whether personnel intrusion exists while the large-scale equipment is running; next, the training blocking unit 420 performs image blocking processing on the training monitoring image acquired by the training data acquisition unit 410 to obtain a plurality of training image blocks; the training image block target detection unit 430 passes the plurality of training image blocks obtained by the training blocking unit 420 respectively through the target detection network to obtain at least one training region of interest; the training region-of-interest pixel enhancement unit 440 passes the training region of interest obtained by the training image block target detection unit 430 through the image enhancer based on the generative adversarial network to obtain a training enhanced region of interest; then, the training feature extraction unit 450 passes the training enhanced region of interest obtained by the training region-of-interest pixel enhancement unit 440 through the convolutional neural network model serving as the feature extractor to obtain a training region-of-interest feature map; the training feature enhancement unit 460 passes the training region-of-interest feature map obtained by the training feature extraction unit 450 through the non-local neural network model to obtain a training enhanced region-of-interest feature map; then, the classification loss unit 470 passes the training enhanced region-of-interest feature map obtained by the training feature enhancement unit 460 through the classifier to obtain a classification loss function value; the internalization learning loss unit 480 calculates a sequence-to-sequence response rule internalization learning loss function value based on the distance between the feature vectors obtained after projection of the enhanced region-of-interest feature map and the region-of-interest feature map; further, the training unit 490 computes a weighted sum of the classification loss function value and the sequence-to-sequence response rule internalization learning loss function value as a loss function value to train the target detection network, the generative adversarial network-based image enhancer, the convolutional neural network model serving as the feature extractor, the non-local neural network model, and the classifier.
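The final combination computed by the training unit 490 can be sketched as follows; the weighting coefficient is an assumed hyperparameter, since the source states only that a weighted sum of the two loss values is used:

```python
def total_loss(classification_loss: float,
               internalization_loss: float,
               alpha: float = 0.1) -> float:
    """Weighted sum of the classification loss and the sequence-to-sequence
    response rule internalization learning loss; alpha is hypothetical."""
    return classification_loss + alpha * internalization_loss

# example values for illustration only
loss = total_loss(0.7, 2.0)
```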
Particularly, in the technical scheme of the application, the non-local neural network captures long-range dependency information by calculating the similarity of all pixel points of the image, so that the network focuses on the feature-associated semantics of the whole image. However, because the locally associated image semantics within the region of interest, as expressed by the region-of-interest feature map, are equally important, it is desirable to improve the intrinsic responsiveness of the enhanced region-of-interest feature map relative to the region-of-interest feature map, thereby improving the accuracy of the classification result obtained by passing the enhanced region-of-interest feature map through the classifier. Based on this, the applicant of the present application calculates a sequence-to-sequence response rule internalization learning loss function between the enhanced region-of-interest feature map and the region-of-interest feature map, expressed as:

L = d( ReLU( Sigmoid(W_1 ⊗ V_1) ⊗ V_1 ), ReLU( Sigmoid(W_2 ⊗ V_2) ⊗ V_2 ) )

wherein V_1 and V_2 are the feature vectors obtained after projection of the enhanced region-of-interest feature map and the region-of-interest feature map, respectively; W_1 and W_2 are the weight matrices of the classifier with respect to V_1 and V_2; ReLU(·) denotes the ReLU activation function; Sigmoid(·) denotes the Sigmoid activation function; ⊗ denotes matrix multiplication; d(·,·) denotes the Euclidean distance between two vectors; and L denotes the sequence-to-sequence response rule internalization learning loss function value. Here, the squeeze-excitation channel attention applied by the classifier, through its weight matrices, to the different sequences enhances the distinguishability between the feature vector sequences obtained after the feature map is unfolded. By training the network with this loss function, causal-relation features that better discriminate between response sequences can be recovered, so that the cause-and-effect response rules between the sequences are internalized and the intrinsic responsiveness between the sequences is strengthened. In this way, the intrinsic responsiveness of the enhanced region-of-interest feature map relative to the region-of-interest feature map is improved, which strengthens the expression of locally associated image semantics in the enhanced region-of-interest feature map and thereby improves the classification effect and the classification accuracy. Personnel intrusion can therefore be detected accurately and in real time while large-scale equipment is operating, an early-warning signal can be issued whenever an intruder is detected, and safety management of the intelligent construction site is achieved, avoiding accidents while ensuring normal operation of the equipment.
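Under the assumption that the loss takes the squeeze-excitation gated form suggested by the symbol definitions (with the Sigmoid-gated product read as element-wise gating, a common squeeze-excitation interpretation), it might be sketched as:

```python
import torch

def internalization_loss(v1, v2, w1, w2):
    """Hypothetical sketch of the sequence-to-sequence response rule
    internalization learning loss: each projected feature vector V_i is
    gated by Sigmoid(W_i V_i), passed through ReLU, and the Euclidean
    distance d(.,.) between the two gated activations is returned."""
    a1 = torch.relu(torch.sigmoid(w1 @ v1) * v1)
    a2 = torch.relu(torch.sigmoid(w2 @ v2) * v2)
    return torch.dist(a1, a2, p=2)

d = 8  # hypothetical projected dimension
v1, v2 = torch.randn(d), torch.randn(d)
w1, w2 = torch.randn(d, d), torch.randn(d, d)
loss_value = internalization_loss(v1, v2, w1, w2)
```

Minimizing this distance pulls the gated response of the enhanced feature map toward that of the original feature map, which matches the stated goal of strengthening intrinsic responsiveness between the two.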
In summary, the intelligent site safety management system 300 according to the embodiment of the present application has been illustrated. It adopts an artificial intelligence monitoring technology based on machine vision: it performs blocking processing on the monitoring image to obtain more accurate information about small-sized objects in the image, then performs target-object detection on each image block to frame regions of interest, and, after pixel enhancement of the image, extracts globally and implicitly associated feature information of the regions of interest so as to detect and judge whether personnel intrusion exists. In this way, personnel intrusion can be detected accurately and in real time while large-scale equipment is operating, an early-warning signal can be issued when an intruder is detected, and safety management of the intelligent construction site is achieved, avoiding accidents while ensuring normal operation of the equipment.
As described above, the intelligent worksite security management system according to the embodiment of the present application may be implemented in various terminal devices. In one example, intelligent worksite security management system 300 according to embodiments of the present application may be integrated into a terminal device as a software module and/or hardware module. For example, the smart worksite security management system 300 may be a software module in the operating system of the terminal device, or may be an application developed for the terminal device; of course, the intelligent worksite security management system 300 may also be one of a plurality of hardware modules of the terminal device.
Alternatively, in another example, the smart worksite security management system 300 and the terminal device may be separate devices, and the smart worksite security management system 300 may be connected to the terminal device through a wired and/or wireless network and exchange interactive information in an agreed data format.
Exemplary method
FIG. 8 is a flow chart of a method for intelligent worksite security management according to an embodiment of the present application. As shown in fig. 8, the intelligent construction site safety management method according to the embodiment of the application comprises the following steps: S110, acquiring a monitoring image collected by a camera of large-scale equipment deployed at an intelligent building site; S120, performing image blocking processing on the monitoring image to obtain a plurality of image blocks; S130, passing the plurality of image blocks respectively through a target detection network to obtain at least one region of interest; S140, passing the region of interest through an image enhancer based on a generative adversarial network to obtain an enhanced region of interest; S150, passing the enhanced region of interest through a convolutional neural network model serving as a feature extractor to obtain a region-of-interest feature map; S160, passing the region-of-interest feature map through a non-local neural network model to obtain an enhanced region-of-interest feature map; and S170, passing the enhanced region-of-interest feature map through a classifier to obtain a classification result, where the classification result is used to indicate whether personnel intrusion exists while the large-scale equipment is operating.
In one example, in the smart worksite security management method, the step S120 includes: and carrying out uniform image blocking processing on the monitoring image to obtain a plurality of image blocks, wherein each image block in the plurality of image blocks has the same size.
In one example, in the smart worksite security management method, the step S130 includes: the target detection network is an anchor-window-based target detection network, and the anchor-window-based target detection network is Fast R-CNN, Faster R-CNN, or RetinaNet.
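The anchor-window mechanism shared by these detectors can be illustrated with a minimal sketch that lays candidate windows over a feature map; the scales, ratios, and stride below are illustrative assumptions, not values from the patent:

```python
import numpy as np

def make_anchors(feat_h, feat_w, stride, scales=(32, 64), ratios=(0.5, 1.0, 2.0)):
    """Generate anchor windows (x1, y1, x2, y2) at every feature-map position."""
    anchors = []
    for y in range(feat_h):
        for x in range(feat_w):
            cx, cy = (x + 0.5) * stride, (y + 0.5) * stride   # window centre
            for s in scales:
                for r in ratios:                               # r = width / height
                    w, h = s * np.sqrt(r), s / np.sqrt(r)
                    anchors.append([cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2])
    return np.array(anchors)

anchors = make_anchors(4, 4, stride=16)   # 4*4 positions * 6 shapes = 96 windows
```

A real detector would then score each window and regress its offsets to frame the regions of interest.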
In one example, in the smart worksite security management method, the step S140 includes: the image enhancer is based on a generative adversarial network (GAN) comprising a discriminator and a generator, and the region of interest is input into the generator of the image enhancer, which performs deconvolution processing on the region of interest to obtain the enhanced region of interest.
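The generator's deconvolution (transposed convolution) can be sketched in numpy as stride-2 zero insertion followed by a full convolution. The fixed averaging kernel here is purely illustrative; a trained GAN generator would stack several such layers with learned kernels:

```python
import numpy as np

def deconv2d(x, kernel, stride=2):
    """Transposed convolution: zero-insertion upsampling + full convolution."""
    h, w = x.shape
    up = np.zeros(((h - 1) * stride + 1, (w - 1) * stride + 1))
    up[::stride, ::stride] = x                     # insert zeros between pixels
    kh, kw = kernel.shape
    padded = np.pad(up, ((kh - 1, kh - 1), (kw - 1, kw - 1)))
    out = np.zeros((up.shape[0] + kh - 1, up.shape[1] + kw - 1))
    flipped = kernel[::-1, ::-1]
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * flipped)
    return out

patch = np.random.rand(8, 8)                       # toy low-resolution region
enhanced = deconv2d(patch, np.ones((3, 3)) / 9.0)  # 17 x 17 upsampled output
```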
In one example, in the smart worksite security management method, the step S150 includes: each layer of the convolutional neural network model serving as the feature extractor performs the following operations on input data in its forward pass: performing convolution processing on the input data to obtain a convolution feature map; pooling the convolution feature map based on local feature matrices to obtain a pooled feature map; and performing nonlinear activation on the pooled feature map to obtain an activated feature map; wherein the output of the last layer of the convolutional neural network serving as the feature extractor is the region-of-interest feature map, and the input of the first layer is the enhanced region of interest.
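One such layer can be sketched in numpy as convolution, 2x2 max pooling, and ReLU activation; the shapes and random kernel are toy assumptions for illustration only:

```python
import numpy as np

def conv2d_valid(x, kernel):
    """Plain 2-D convolution (valid padding): the convolution feature map."""
    kh, kw = kernel.shape
    out = np.zeros((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool_2x2(x):
    """Pool local 2x2 windows: the pooled feature map."""
    h, w = (x.shape[0] // 2) * 2, (x.shape[1] // 2) * 2
    return x[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

roi = np.random.randn(10, 10)                    # toy enhanced region of interest
conv = conv2d_valid(roi, np.random.randn(3, 3))  # 8 x 8 convolution feature map
fmap = np.maximum(max_pool_2x2(conv), 0.0)       # 4 x 4 activated feature map (ReLU)
```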
In one example, in the smart worksite security management method, the step S160 includes: inputting the region-of-interest feature map respectively into a first point convolution layer, a second point convolution layer, and a third point convolution layer of the non-local neural network to obtain a first feature map, a second feature map, and a third feature map; calculating a position-wise weighted sum of the first feature map and the second feature map to obtain an intermediate fusion feature map; inputting the intermediate fusion feature map into a Softmax function to normalize the feature values at each position of the intermediate fusion feature map to obtain a normalized intermediate fusion feature map; calculating a position-wise weighted sum of the normalized intermediate fusion feature map and the third feature map to obtain a re-fusion feature map; embedding a Gaussian similarity function into the re-fusion feature map to calculate the similarity between the feature values at each position of the re-fusion feature map to obtain a global similarity feature map; passing the global similarity feature map through a fourth point convolution layer of the non-local neural network to adjust its channel number to obtain a channel-adjusted global similarity feature map; and calculating a position-wise weighted sum of the channel-adjusted global similarity feature map and the region-of-interest feature map to obtain the enhanced region-of-interest feature map.
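A simplified numpy sketch of this enhancement step is given below. It follows the description literally (point convolutions as channel-mixing matrix multiplies, position-wise weighted sums, a toy stand-in for the Gaussian similarity, and a residual addition back to the input), so it is a structural illustration under stated assumptions rather than a faithful non-local network implementation; all weights and the fusion coefficient are arbitrary:

```python
import numpy as np

def pointwise_conv(x, w):
    """1x1 (point) convolution as a channel matmul; x is (C,H,W), w is (C_out,C)."""
    return np.tensordot(w, x, axes=([1], [0]))

def softmax_all(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def non_local_enhance(f, w1, w2, w3, w4, alpha=0.5):
    f1, f2, f3 = pointwise_conv(f, w1), pointwise_conv(f, w2), pointwise_conv(f, w3)
    fused = alpha * f1 + (1 - alpha) * f2          # position-wise weighted sum
    norm = softmax_all(fused)                      # normalise the feature values
    refused = alpha * norm + (1 - alpha) * f3      # re-fuse with the third map
    sim = np.exp(-(refused - refused.mean()) ** 2) # toy Gaussian similarity
    adjusted = pointwise_conv(sim, w4)             # fourth point conv: channel adjust
    return adjusted + f                            # add back to the input feature map

f = np.random.randn(4, 6, 6)                       # region-of-interest feature map
ws = [0.1 * np.random.randn(4, 4) for _ in range(4)]
out = non_local_enhance(f, *ws)                    # enhanced ROI feature map
```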
In one example, in the smart worksite security management method, the step S170 includes: processing the enhanced region-of-interest feature map using the classifier with the following formula to obtain the classification result:
O = softmax{(Wn, Bn) : … : (W1, B1) | Project(F)}
where Project(F) denotes projecting the enhanced region-of-interest feature map as a vector, W1 to Wn are the weight matrices of the fully connected layers, and B1 to Bn are the bias vectors of the fully connected layers.
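The classifier can be sketched as flattening the feature map (Project(F)) and passing it through fully connected layers with a final softmax; the layer sizes and the two-class output (intrusion / no intrusion) are illustrative assumptions:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def classify(feature_map, layers):
    v = feature_map.reshape(-1)              # Project(F): flatten to a vector
    for i, (W, B) in enumerate(layers):
        v = W @ v + B                        # fully connected layer (W_i, B_i)
        if i < len(layers) - 1:
            v = np.maximum(v, 0.0)           # ReLU between hidden layers
    return softmax(v)                        # class probabilities

fmap = np.random.randn(4, 6, 6)              # toy enhanced ROI feature map (144 values)
layers = [(0.1 * np.random.randn(16, 144), np.zeros(16)),
          (0.1 * np.random.randn(2, 16), np.zeros(2))]   # 2 classes
probs = classify(fmap, layers)
```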
In summary, the intelligent site safety management method according to the embodiment of the application has been explained. It adopts an artificial intelligence monitoring technology based on machine vision: the monitored image is subjected to blocking processing to obtain more accurate small-size object information, target detection is then performed on each image block to frame regions of interest, and after the image pixels are enhanced, the global implicit associated feature information of the region of interest is extracted to detect and judge whether personnel intrusion exists. In this way, personnel intrusion can be accurately detected in real time while large equipment operates, and an early-warning signal is sent out when an intruding person is detected, thereby achieving safety management of the intelligent building site and avoiding accidents while ensuring normal operation of the equipment.
Exemplary electronic device
Next, an electronic device according to an embodiment of the present application is described with reference to fig. 9.
Fig. 9 illustrates a block diagram of an electronic device according to an embodiment of the application.
As shown in fig. 9, the electronic device 10 includes one or more processors 11 and a memory 12.
The processor 11 may be a Central Processing Unit (CPU) or other form of processing unit having data processing and/or instruction execution capabilities, and may control other components in the electronic device 10 to perform desired functions.
Memory 12 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, a random access memory (RAM) and/or a cache memory (cache). The non-volatile memory may include, for example, a read-only memory (ROM), a hard disk, a flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor 11 to implement the functions of the intelligent worksite safety management system of the various embodiments of the present application described above and/or other desired functions. Various content, such as the enhanced region-of-interest feature map, may also be stored in the computer-readable storage medium.
In one example, the electronic device 10 may further include: an input device 13 and an output device 14, which are interconnected by a bus system and/or other forms of connection mechanisms (not shown).
The input means 13 may comprise, for example, a keyboard, a mouse, etc.
The output device 14 may output various information including the classification result and the like to the outside. The output means 14 may include, for example, a display, speakers, a printer, and a communication network and remote output devices connected thereto, etc.
Of course, for simplicity, only some of the components of the electronic device 10 that are relevant to the present application are shown in fig. 9; components such as buses, input/output interfaces, and the like are omitted. In addition, the electronic device 10 may include any other suitable components depending on the particular application.
Exemplary computer program product and computer readable storage Medium
In addition to the methods and apparatus described above, embodiments of the application may also be a computer program product comprising computer program instructions which, when executed by a processor, cause the processor to perform the steps of the intelligent worksite safety management method according to the various embodiments of the application described in the "exemplary systems" section of this specification.
The computer program product may write program code for performing the operations of embodiments of the present application in any combination of one or more programming languages, including object-oriented programming languages such as Java or C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present application may also be a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, cause the processor to perform the steps of the intelligent worksite safety management method according to the various embodiments of the present application described in the "exemplary systems" section above.
The computer-readable storage medium may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may include, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium would include the following: an electrical connection having one or more wires, a portable disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The basic principles of the present application have been described above in connection with specific embodiments, however, it should be noted that the advantages, benefits, effects, etc. mentioned in the present application are merely examples and not intended to be limiting, and these advantages, benefits, effects, etc. are not to be considered as essential to the various embodiments of the present application. Furthermore, the specific details disclosed herein are for purposes of illustration and understanding only, and are not intended to be limiting, as the application is not necessarily limited to practice with the above described specific details.
The block diagrams of the devices, apparatuses, and systems referred to in the present application are only illustrative examples and are not intended to require or imply that the connections, arrangements, and configurations must be made in the manner shown in the block diagrams. As will be appreciated by one of skill in the art, these devices, apparatuses, and systems may be connected, arranged, and configured in any manner. Words such as "including," "comprising," "having," and the like are open-ended words meaning "including but not limited to" and are used interchangeably therewith. The term "or" as used herein refers to, and is used interchangeably with, the term "and/or" unless the context clearly indicates otherwise. The term "such as" as used herein refers to, and is used interchangeably with, the phrase "such as but not limited to."
It is also noted that in the apparatus, devices and methods of the present application, the components or steps may be disassembled and/or assembled. Such decomposition and/or recombination should be considered as equivalent aspects of the present application.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present application. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the application. Thus, the present application is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, this description is not intended to limit embodiments of the application to the form disclosed herein. Although a number of example aspects and embodiments have been discussed above, a person of ordinary skill in the art will recognize certain variations, modifications, alterations, additions, and subcombinations thereof.

Claims (9)

1. An intelligent worksite safety management system, comprising:
a monitoring unit for acquiring a monitoring image collected by a camera of large-scale equipment deployed at the intelligent construction site;
a blocking unit for performing image blocking processing on the monitoring image to obtain a plurality of image blocks;
an image block target detection unit for respectively passing the plurality of image blocks through a target detection network to obtain at least one region of interest;
a region-of-interest pixel enhancement unit for passing the region of interest through an image enhancer based on a generative adversarial network to obtain an enhanced region of interest;
a feature extraction unit for passing the enhanced region of interest through a convolutional neural network model serving as a feature extractor to obtain a region-of-interest feature map;
a feature enhancement unit for passing the region-of-interest feature map through a non-local neural network model to obtain an enhanced region-of-interest feature map; and
a safety management result generation unit for passing the enhanced region-of-interest feature map through a classifier to obtain a classification result, wherein the classification result indicates whether personnel intrusion exists while the large-scale equipment operates.
2. The intelligent worksite safety management system of claim 1, wherein the blocking unit is further configured to: and carrying out uniform image blocking processing on the monitoring image to obtain a plurality of image blocks, wherein each image block in the plurality of image blocks has the same size.
3. The intelligent worksite safety management system of claim 2, wherein the image block target detection unit is further configured to: the target detection network is an anchor-window-based target detection network, and the anchor-window-based target detection network is Fast R-CNN, Faster R-CNN, or RetinaNet.
4. The intelligent worksite safety management system of claim 3, wherein the image enhancer is based on a generative adversarial network comprising a discriminator and a generator, and wherein the region-of-interest pixel enhancement unit is further configured to input the region of interest into the generator of the image enhancer, the generator performing deconvolution processing on the region of interest to obtain the enhanced region of interest.
5. The intelligent worksite safety management system of claim 4, wherein the feature extraction unit is further configured to: each layer of the convolutional neural network model serving as the feature extractor performs the following operations on input data in its forward pass:
carrying out convolution processing on input data to obtain a convolution characteristic diagram;
pooling the convolution feature images based on the local feature matrix to obtain pooled feature images; and
performing nonlinear activation on the pooled feature map to obtain an activated feature map;
wherein the output of the last layer of the convolutional neural network serving as the feature extractor is the region of interest feature map, and the input of the first layer of the convolutional neural network serving as the feature extractor is the enhanced region of interest.
6. The intelligent worksite safety management system of claim 5, wherein the feature enhancement unit is further configured to:
inputting the region of interest feature map into a first point convolution layer, a second point convolution layer and a third point convolution layer of the non-local neural network respectively to obtain a first feature map, a second feature map and a third feature map;
calculating a weighted sum of the first feature map and the second feature map according to positions to obtain an intermediate fusion feature map;
inputting the intermediate fusion feature map into a Softmax function to normalize feature values of each position in the intermediate fusion feature map so as to obtain a normalized intermediate fusion feature map;
calculating a weighted sum of the normalized intermediate fusion feature map and the third feature map by position to obtain a re-fusion feature map;
embedding a Gaussian similarity function into the re-fusion feature map to calculate the similarity between feature values of each position in the re-fusion feature map so as to obtain a global similarity feature map;
The global similar feature map passes through a fourth point convolution layer of the non-local neural network to adjust the channel number of the global similar feature map so as to obtain a channel-adjusted global similar feature map; and
and calculating a weighted sum of the channel adjustment global similar feature map and the region of interest feature map according to positions to obtain the enhanced region of interest feature map.
7. The smart worksite safety management system of claim 6, wherein the safety management result generation unit is further configured to: processing the enhanced region of interest feature map using the classifier to obtain a classification result with the following formula:
O = softmax{(Wn, Bn) : … : (W1, B1) | Project(F)}, where Project(F) denotes projecting the enhanced region-of-interest feature map as a vector, W1 to Wn are the weight matrices of the fully connected layers, and B1 to Bn are the bias vectors of the fully connected layers.
8. The intelligent worksite safety management system of claim 7, further comprising a training module for training the target detection network, the image enhancer based on the generative adversarial network, the convolutional neural network model serving as the feature extractor, the non-local neural network model, and the classifier;
Wherein, training module includes:
the training data acquisition unit is used for acquiring training data, wherein the training data comprises a training monitoring image and a true value of whether personnel intrusion exists while the large-scale equipment operates;
the training blocking unit is used for carrying out image blocking processing on the training monitoring image to obtain a plurality of training image blocks;
the training image block target detection unit is used for respectively passing the plurality of training image blocks through the target detection network to obtain at least one training interested region;
a training region-of-interest pixel enhancement unit for passing the training region of interest through the image enhancer based on the generative adversarial network to obtain a training enhanced region of interest;
the training feature extraction unit is used for enabling the training enhancement region of interest to pass through the convolutional neural network model serving as the feature extractor so as to obtain a training region of interest feature map;
the training feature enhancement unit is used for enabling the training region-of-interest feature map to pass through the non-local neural network model to obtain a training enhancement region-of-interest feature map;
the classification loss unit is used for enabling the training enhancement region-of-interest feature map to pass through the classifier to obtain a classification loss function value;
an intrinsic learning loss unit for calculating a sequence-to-sequence response rule intrinsic learning loss function value based on the distance between the feature vectors obtained by respectively projecting the enhanced region-of-interest feature map and the region-of-interest feature map; and
a training unit for calculating a weighted sum of the classification loss function value and the sequence-to-sequence response rule intrinsic learning loss function value as a final loss function value to train the target detection network, the image enhancer based on the generative adversarial network, the convolutional neural network model serving as the feature extractor, the non-local neural network model, and the classifier.
9. The intelligent worksite safety management system of claim 8, wherein the intrinsic learning loss unit is further configured to: calculate the sequence-to-sequence response rule intrinsic learning loss function value based on the distance between the feature vectors obtained by respectively projecting the enhanced region-of-interest feature map and the region-of-interest feature map;
wherein, the formula is:
wherein V1 and V2 are the feature vectors obtained by respectively projecting the enhanced region-of-interest feature map and the region-of-interest feature map, W1 and W2 are the weight matrices applied by the classifier to V1 and V2, ReLU(·) denotes the ReLU activation function, Sigmoid(·) denotes the Sigmoid activation function, ⊗ denotes matrix multiplication, d(·,·) denotes the Euclidean distance between two vectors, and L denotes the sequence-to-sequence response rule intrinsic learning loss function value.
CN202211693226.5A 2022-12-28 2022-12-28 Intelligent building site safety management system Pending CN116797814A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211693226.5A CN116797814A (en) 2022-12-28 2022-12-28 Intelligent building site safety management system


Publications (1)

Publication Number Publication Date
CN116797814A true CN116797814A (en) 2023-09-22

Family

ID=88045682

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211693226.5A Pending CN116797814A (en) 2022-12-28 2022-12-28 Intelligent building site safety management system

Country Status (1)

Country Link
CN (1) CN116797814A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117601143A (en) * 2023-11-23 2024-02-27 中建新疆建工集团第三建设工程有限公司 Intelligent inspection robot and method based on multi-sensor fusion
CN118230261A (en) * 2024-05-27 2024-06-21 四川省建筑科学研究院有限公司 Intelligent construction site construction safety early warning method and system based on image data

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200012904A1 (en) * 2018-07-03 2020-01-09 General Electric Company Classification based on annotation information
CN111666994A (en) * 2020-05-28 2020-09-15 平安科技(深圳)有限公司 Sample image data enhancement method and device, electronic equipment and storage medium
CN114626477A (en) * 2022-03-22 2022-06-14 中国农业银行股份有限公司 Target detection method, device and equipment
CN115410132A (en) * 2022-09-06 2022-11-29 河南人梯互通科技有限公司 Elevator maintenance supervision data identification method and system
CN115424204A (en) * 2022-08-26 2022-12-02 温州旦光文具有限公司 Pedestrian detection method and system based on information fusion



Similar Documents

Publication Publication Date Title
US11468697B2 (en) Pedestrian re-identification method based on spatio-temporal joint model of residual attention mechanism and device thereof
CN109961444B (en) Image processing method and device and electronic equipment
CN116797814A (en) Intelligent building site safety management system
CN110163188B (en) Video processing and method, device and equipment for embedding target object in video
WO2022104503A1 (en) Method for identifying adversarial sample, and related device
CN112634329B (en) Scene target activity prediction method and device based on space-time and or graph
CN112528974B (en) Distance measuring method and device, electronic equipment and readable storage medium
CN116247824B (en) Control method and system for power equipment
CN115731513B (en) Intelligent park management system based on digital twinning
Jiang et al. A self-attention network for smoke detection
CN115909260A (en) Method and system for early warning of workplace intrusion based on machine vision
CN116311214B (en) License plate recognition method and device
CN115631385A (en) Fire-fighting lane real-time monitoring system
CN116486323A (en) Coal conveying gallery monitoring management system and method based on artificial intelligence
CN112489087A (en) Method for detecting shaking of suspension type operation platform for high-rise building construction
CN116994209A (en) Image data processing system and method based on artificial intelligence
CN117541964A (en) Cloud video processing system and method thereof
CN116467485B (en) Video image retrieval construction system and method thereof
CN116977129A (en) Intelligent property management system and method thereof
CN116977900A (en) Intelligent laboratory monitoring alarm system and method thereof
Moseva et al. Algorithm for Predicting Pedestrian Behavior on Public Roads
CN108960014B (en) Image processing method, device and system and storage medium
CN117456610B (en) Climbing abnormal behavior detection method and system and electronic equipment
CN115100419B (en) Target detection method and device, electronic equipment and storage medium
KR20090031023A (en) Moving object detecting method and system for controlling the same

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination