CN113469150B

CN113469150B - Method and system for identifying risk behaviors

Info

Publication number: CN113469150B
Application number: CN202111029514.6A
Authority: CN
Inventors: 金淼; 张军; 雷民; 周峰; 殷小东; 陈习文; 卢冰; 王斯琪; 陈卓; 周玮; 汪泉; 付济良; 聂高宁; 王旭; 齐聪; 郭子娟; 余雪芹; 刘俊; 朱赤丹; 郭鹏
Original assignee: China Electric Power Research Institute Co Ltd CEPRI
Current assignee: China Electric Power Research Institute Co Ltd CEPRI
Priority date: 2021-09-03
Filing date: 2021-09-03
Publication date: 2021-11-12
Anticipated expiration: 2041-09-03
Also published as: CN113469150A

Abstract

The invention discloses a method and a system for identifying risk behaviors. The method comprises the following steps: selecting acceleration signals of a target object over the continuous time period in which the target object performs a target behavior, and obtaining an acceleration feature vector of those signals through a deep neural network model; selecting a depth image of the target object at the end moment of the target behavior, and obtaining a target object feature vector from the depth image through a deep neural network model; selecting a two-dimensional image of the target object at the end moment of the target behavior, and obtaining background element features from the two-dimensional image through a deep neural network model; inputting the acceleration feature vector and the target object feature vector into a neural network discrimination model to obtain the behavior action of the target object; and judging, based on the background element features, whether the behavior action of the target object meets a preset safety behavior standard.

Description

Method and system for identifying risk behaviors

Technical Field

The invention relates to the technical field of human behavior recognition, in particular to a method and a system for recognizing risk behaviors.

Background

Intelligent identification of safety risks in field operations is a comprehensive technology that combines machine vision, behavior recognition, pattern recognition and related techniques. It is an auxiliary means of managing and controlling personal risk in the security field, is currently applied in public safety, smart cities, medical rehabilitation and similar fields, and is a hot area of development under China's new infrastructure initiative. Its core, human behavior recognition, uses a computer to analyze collected information about a person's behavior in order to judge what the person is doing. Classified by implementation, human behavior recognition is sensor-based, vision-based, or a combination of the two. Vision-based methods recognize abnormal human behavior from visual information through the key steps of video processing, motion detection and tracking, feature extraction, and pattern classification and recognition. Wearable-sensor-based methods acquire inertial motion signals of body movement from several sensor nodes bound to or worn on the body, preprocess the data (for example by denoising), perform feature extraction and feature selection, select and train a suitable classifier, and finally verify the recognition algorithm on motion signals acquired in practice.
Combined multi-sensor monitoring is the inevitable trend in risk and behavior monitoring and analysis. By deploying various visual sensors, wearable motion sensors and similar devices at an electric power site, researchers can simultaneously obtain multi-source data including color video, depth image sequences and three-dimensional human skeleton sequences. Driven by rich information sources and efficient intelligent algorithms, field application of algorithms for identifying abnormal operator behavior becomes more feasible.

In recent years, the power industry has vigorously promoted the integration of information technology and production safety, and has actively adopted advanced technologies such as intelligent safety tools, video monitoring and image analysis to raise the level of on-site safety and technical prevention.

Prior art 1 (publication No. CN 112149761A) provides a violation detection method for smart electric power construction sites based on an improved YOLOv4 algorithm. Prior art 1 discloses: collecting images of an electric power construction site for model training; applying image enhancement to the collected images; marking target areas in the collected and enhanced images with rectangular boxes, and obtaining the coordinates of each box and the category it contains; training an improved YOLOv4-based model on the collected and enhanced images together with the box coordinates and categories; and acquiring images of the construction site in real time, detecting them with the trained model, and outputting violation images. Prior art 1 is applicable to detecting personnel violations and construction vehicle violations at a job site, enables visual and intelligent management of smart electric power construction sites, and can effectively raise the level of on-site engineering management and reduce safety risk.

However, the method disclosed in prior art 1 is too narrow: the detectable categories are limited, and the algorithm can only detect categories with static characteristics, such as helmet wearing and vehicle loading. It cannot judge the dynamic behaviors of workers, so both the range of detectable violations and the detection accuracy are considerably limited.

Prior art 2 (publication No. CN 111814601A) proposes a video analysis method combining target detection and human body posture estimation. The method obtains a live camera video stream through a streaming media server and captures video frames from it to obtain decoded pictures; performs fast target detection on each picture with a deep learning algorithm to find target objects of all identifiable categories; judges whether a person is present, and analyzes the pictures for dangerous behaviors and dangerous scenes in turn; reports an alarm and pushes it to a display platform promptly if a dangerous event is found during either analysis; and continues to analyze the video stream in real time. By combining target detection with human body posture estimation and applying them to intelligent monitoring of infrastructure construction video, the method can identify dangerous behaviors or scenes, frees staff from manually watching screens as in traditional monitoring, and effectively improves the efficiency of infrastructure construction safety management.

However, the method disclosed in prior art 2 raises dangerous-scene alarms from two-dimensional images. Because objects overlap and occlude one another in a two-dimensional image, false alarms are easily generated when a worker and a dangerous area appear in the same image. Moreover, image data alone can only be used to detect simple behaviors; danger analysis of complex behaviors requires more semantic information.

Therefore, a technique is needed to enable intelligent identification of field work risks.

Disclosure of Invention

The technical scheme of the invention provides a method and a system for identifying risk behaviors, which are used for solving the problem of how to intelligently identify the risk of field operation.

In order to solve the above problem, the present invention provides a method for identifying a risk behavior, the method comprising:

selecting acceleration signals of a target object in a continuous time period when the target object performs a target behavior, and acquiring acceleration characteristic vectors of the acceleration signals in the continuous time period through a deep neural network model;

selecting a depth image of a target object at the end moment of a target behavior, and acquiring a target object feature vector of the target object in the depth image through a deep neural network model;

selecting a two-dimensional image of a target object at the end moment of a target behavior, and acquiring background element characteristics of the target object in the two-dimensional image through a deep neural network model;

inputting the acceleration characteristic vector and the target object characteristic vector into a neural network discrimination model to obtain the behavior action of the target object;

and judging whether the behavior action of the target object meets a preset safety behavior standard or not based on the background element characteristics.

Preferably, the method further comprises the following steps: determining a depth and a width of the deep neural network model;

the depth and the width of the deep neural network model are each given by a formula [formulas not reproduced in this text], in which two constants are found by searching for optimal values and a third parameter is adjusted manually;

when a constraint condition [not reproduced in this text] involving a set parameter is satisfied, the parameters of the deep neural network model are obtained adaptively.

Preferably, the method further comprises the following steps: determining attention weights for the deep neural network model:

determining an attention feature pyramid structure, wherein each layer of the attention feature pyramid structure is provided with a different attention weight, and the attention weights respectively correspond to the feature contribution ratios of two-dimensional images with different resolutions;

the attention weight is calculated by a formula [not reproduced in this text] in which each weight is a scalar of a feature or a vector of a two-dimensional image channel, i is the serial number of the current input feature, j is the serial number of the non-output feature, and ϵ = 0.0001.

Preferably, the method further comprises the following steps: determining a position of the target object:

acquiring the time consumed by data transmission between a preset ultra-wideband base station and an ultra-wideband tag of the target object, and measuring the distance between the ultra-wideband base station and the tag by a two-way time-of-flight method based on the time consumed by data transmission;

determining a relative position of the target object and the ultra-wideband base station based on a distance between the ultra-wideband base station and the tag;

positioning the ultra-wideband base station, and determining the absolute position of the ultra-wideband base station;

determining an absolute position of the target object based on the absolute position of the ultra-wideband base station and a relative position of the target object and the ultra-wideband base station;

and judging whether the behavior action of the target object meets a preset safety behavior standard or not based on the absolute position of the target object and the background element characteristics.

Based on another aspect of the invention, the invention provides a system for identifying risk behaviors, the system comprising:

the first acquisition unit is used for selecting acceleration signals of a target object in a continuous time period when the target object performs a target behavior, and acquiring acceleration characteristic vectors of the acceleration signals in the continuous time period through a deep neural network model;

the second acquisition unit is used for selecting a depth image of a target object at the end moment of a target behavior and acquiring a target object feature vector of the target object in the depth image through a deep neural network model;

the third acquisition unit is used for selecting a two-dimensional image of a target object at the end moment of a target behavior and acquiring background element characteristics of the target object in the two-dimensional image through a deep neural network model;

a fourth obtaining unit, configured to input the acceleration feature vector and the target object feature vector into a neural network discrimination model, and obtain a behavior action of the target object;

and the result unit is used for judging whether the behavior action of the target object meets a preset safety behavior standard or not based on the background element characteristics.

Preferably, the system further comprises a fifth obtaining unit, configured to determine a depth and a width of the deep neural network model;

the depth and the width of the deep neural network model are each given by a formula [formulas not reproduced in this text], in which two constants are found by searching for optimal values and a third parameter is adjusted manually;

when a constraint condition [not reproduced in this text] involving a set parameter is satisfied, the parameters of the deep neural network model are obtained adaptively.

Preferably, the fifth obtaining unit is further configured to determine attention weights for the deep neural network model, the attention weight being calculated by a formula [not reproduced in this text].

Preferably, the system further comprises a sixth acquisition unit, configured to determine the position of the target object.

The invention provides a computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program for executing the above-mentioned method for identifying a risk behaviour.

The present invention provides an electronic device, characterized in that the electronic device includes:

a processor;

a memory for storing the processor-executable instructions;

the processor is used for reading the executable instructions from the memory and executing the method for identifying the risk behaviors.

The technical scheme of the invention provides a method and a system for identifying risk behaviors, wherein the method comprises the following steps: selecting acceleration signals of a target object in a continuous time period when the target object performs a target behavior, and extracting an acceleration feature vector of the acceleration signals in the continuous time period through a deep neural network model; selecting a depth image of the target object at the end moment of the target behavior, and extracting a target object feature vector of the target object in the depth image through a deep neural network model; selecting a two-dimensional image of the target object at the end moment of the target behavior, and extracting background element features of the target object in the two-dimensional image through a deep neural network model; inputting the acceleration feature vector and the target object feature vector into a neural network discrimination model to obtain the behavior action of the target object; and judging, based on the background element features, whether the behavior action of the target object meets a preset safety behavior standard. The intelligent identification scheme for field operation risk provided by the invention turns passive manual supervision of field operations into active intelligent monitoring, and makes advance warning of field operation safety risks possible.
With this technical scheme, a computer works around the clock to assist staff in completing safety supervision tasks. This addresses the limited capacity of safety management and control personnel and reduces labor cost. The safety management requirements of the electric power site are written into algorithms that manage and control safety at the operation site, which effectively ensures the objectivity of safety supervision.

Drawings

A more complete understanding of exemplary embodiments of the present invention may be had by reference to the following drawings in which:

FIG. 1 is a flow chart of a method for identifying risk behaviors in accordance with a preferred embodiment of the present invention;

FIG. 2 is a schematic diagram of an embodiment of a cross-modal power job site risk behavior identification system according to a preferred embodiment of the present invention;

FIG. 3 is a schematic diagram of an example of a cross-modal power job site risk behavior identification method according to a preferred embodiment of the present invention;

FIG. 4 is a schematic diagram of one example of a network structure of an image target recognition algorithm in accordance with a preferred embodiment of the present invention;

FIG. 5 is a schematic diagram of one example of a network structure of a cross-modal behavior recognition algorithm in accordance with a preferred embodiment of the present invention;

FIG. 6 is a schematic diagram of an example of an indoor and outdoor integrated precise positioning algorithm according to a preferred embodiment of the present invention;

FIG. 7 is a schematic diagram of an embodiment of a cross-modal power job site risk behavior identification device in accordance with a preferred embodiment of the present invention; and

FIG. 8 is a diagram of a system for identifying risk behaviors in accordance with a preferred embodiment of the present invention.

Detailed Description

Exemplary embodiments of the present invention will now be described with reference to the accompanying drawings. The present invention may, however, be embodied in many different forms and is not limited to the embodiments described herein, which are provided so that this disclosure will be thorough and complete and will fully convey the scope of the invention to those skilled in the art. The terminology used in the exemplary embodiments illustrated in the accompanying drawings is not intended to limit the invention. In the drawings, the same units/elements are denoted by the same reference numerals.

Unless otherwise defined, terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Further, it will be understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense.

FIG. 1 is a flow chart of a method for identifying risk behaviors in accordance with a preferred embodiment of the present invention. To meet the safety management and control requirements of personnel working at electric power sites, the invention provides a cross-modal method for identifying risk behaviors at an electric power operation site. Embodiments of the invention analyze multi-modal information from electric power field operations and use intelligent sensing and judgment to detect whether the behavior of workers meets safety standards.

As shown in FIG. 1, the present invention provides a method for identifying risk behaviors, the method comprising:

step 101: selecting acceleration signals of a target object in a continuous time period when the target object performs a target behavior, and acquiring acceleration characteristic vectors of the acceleration signals in the continuous time period through a deep neural network model;

step 102: selecting a depth image of a target object at the end moment of a target behavior, and acquiring a target object feature vector of the target object in the depth image through a deep neural network model;

step 103: selecting a two-dimensional image of a target object at the end moment of a target behavior, and acquiring background element characteristics of the target object in the two-dimensional image through a deep neural network model;

step 104: inputting the acceleration characteristic vector and the target object characteristic vector into a neural network discrimination model to obtain the behavior action of the target object;

step 105: and judging whether the behavior action of the target object meets a preset safety behavior standard or not based on the background element characteristics.
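Step 105 can be read as a rule lookup that combines the recognized action with the background elements detected in the two-dimensional image. The following is a minimal sketch under that reading. The rule table `SAFETY_RULES`, the `Observation` type and all labels are hypothetical names chosen for illustration; the patent does not specify how the safety behavior standard is encoded. The single rule mirrors the high-voltage electroscope example given later in the description.

```python
from dataclasses import dataclass

# Hypothetical rule table (not from the patent): for each task, the action
# that must be observed and the background elements that must be present.
SAFETY_RULES = {
    "high_voltage_electroscopy": {
        "required_action": "raise_arm",
        "required_elements": {"electroscope_rod"},
    },
}


@dataclass(frozen=True)
class Observation:
    task: str            # task the worker is expected to be performing
    action: str          # behavior action from the discrimination model (step 104)
    elements: frozenset  # background elements found in the 2D image (step 103)


def meets_safety_standard(obs: Observation) -> bool:
    """Return True only if both the action and the required elements match."""
    rule = SAFETY_RULES[obs.task]
    action_ok = obs.action == rule["required_action"]
    elements_ok = rule["required_elements"] <= obs.elements  # subset test
    return action_ok and elements_ok
```

Keeping the rules in a plain table separates the safety standard from the recognition models, so a new task could be covered by adding an entry rather than retraining.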

The invention provides a cross-modal electric power operation site risk behavior identification method, which comprises an image target recognition algorithm and a risk behavior recognition algorithm; the risk behavior recognition algorithm in turn comprises a cross-modal behavior recognition algorithm and an indoor and outdoor integrated precise positioning algorithm.

The input data of the cross-modal behavior recognition algorithm provided by the invention are data of a plurality of different modalities, including a two-dimensional image acquired by a common camera, a depth image acquired by a depth camera and an acceleration signal acquired by an acceleration sensor. The cross-modal behavior recognition algorithm comprises the following steps:

Step one: process the acceleration signals over a continuous period of time and extract features through a neural network model. The data are first processed: the single-dimensional data of the three axial directions [dimensions not reproduced in this text] are stacked to form a two-dimensional matrix [dimensions not reproduced in this text]. The matrix is then input into a convolutional neural network model, and the feature vector of the acceleration data is extracted through a stack of several convolutional and pooling layers, finally yielding a one-dimensional feature vector;

Step two: extract the human body features of the worker from the depth image at the end moment through a deep neural network. The deep neural network first preprocesses the data through a ResNet-50 network and combines the output features of each stage of the ResNet-50 network. Two multilayer convolutional neural networks then extract human body key points and part association vectors, respectively, as the human body feature vector of the worker;

Step three: extract the background element features of the two-dimensional image at the end moment through a deep neural network. The deep neural network uses selective search to extract a number of candidate regions from the image. Each candidate region is then resized to a fixed size so that it matches the input size of the feature extraction model, and a feature vector is obtained through the model. A classifier is trained for each class and used to score the output feature vectors. For each class, an overlap index of the regions (such as intersection-over-union) is calculated and non-maximum suppression is applied, keeping the highest-scoring region and removing the regions that overlap it. Finally, a linear regression model is trained to generate a tighter bounding box for each identified object in the image;

Step four: input all the features into a neural network discrimination model, judge the behavior of the worker from the acceleration signal features and the depth image features, and judge whether that behavior meets the safety standard in combination with the background features of the two-dimensional image. For example, in a high-voltage electroscope operation, the behavior is judged to be a violation when the worker does not raise the high-voltage electroscope rod, or when the worker makes the hand-raising action without holding the high-voltage electroscope rod.
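The overlap-removal stage of the third step can be illustrated with a greedy non-maximum suppression routine. This is a minimal sketch, assuming boxes in `(x1, y1, x2, y2)` form, intersection-over-union as the overlap index, and a threshold of 0.5; none of these specifics are stated in the patent.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed layout


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(
    boxes: Sequence[Box], scores: Sequence[float], iou_threshold: float = 0.5
) -> List[int]:
    """Return indices of kept boxes for one class, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```

The routine would be run once per class, as the text describes, so that a helmet box never suppresses an overlapping person box.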

The image target recognition algorithm identifies and extracts the various static feature elements in an electric power working scene, for example detecting from camera images whether workers are wearing their full protective equipment. The risk behavior recognition algorithm handles risk behaviors that cannot be recognized from a single image, and comprises the cross-modal behavior recognition algorithm and the indoor and outdoor integrated precise positioning algorithm. The cross-modal behavior recognition algorithm recognizes the behavior and motion state of workers and analyzes whether their behavior meets the safety standard. The indoor and outdoor integrated precise positioning algorithm determines and tracks the real-time position of each worker and, according to preset dangerous working areas and preset safe working areas, gives a warning when a worker enters a dangerous working area or leaves a working post.
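The position-based warnings described above amount to a containment test against preset areas. A minimal sketch follows, assuming axis-aligned rectangular zones in a local site coordinate system; the `Zone` type, the zone names and the message wording are hypothetical, and real sites may need polygonal areas.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Zone:
    """Axis-aligned rectangular area in site coordinates (an assumption)."""
    name: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, point: Tuple[float, float]) -> bool:
        x, y = point
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def position_warnings(
    point: Tuple[float, float], danger_zones: Sequence[Zone], work_zone: Zone
) -> List[str]:
    """Warn when the worker is inside a danger zone or outside the work zone."""
    warnings = [
        f"entered dangerous area: {zone.name}"
        for zone in danger_zones
        if zone.contains(point)
    ]
    if not work_zone.contains(point):
        warnings.append(f"left working post: {work_zone.name}")
    return warnings
```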

Preferably, before identifying the risk behavior, the method further comprises: acquiring a two-dimensional image and a depth image of the target behavior, and an acceleration signal during the motion of the target behavior.

Preferably, the method further comprises determining a depth and a width of the deep neural network model. The depth and the width of the deep neural network model are each given by a formula [formulas not reproduced in this text], in which two constants are found by searching for optimal values and a third parameter is adjusted manually.

When a constraint condition [not reproduced in this text] involving a set parameter is satisfied, the parameters of the deep neural network model are obtained adaptively. The parameter is a learning rate, so that the objective function converges to a local minimum.

Preferably, an attention feature pyramid structure is determined, wherein each layer of the attention feature pyramid structure is provided with a different attention weight, and the attention weights respectively correspond to the feature contribution ratios of two-dimensional images with different resolutions.

The attention weight is calculated by a formula [not reproduced in this text] in which each weight is a scalar of a feature or a vector of a two-dimensional image channel, i is the serial number of the current input feature, j is the serial number of the non-output feature, and ϵ = 0.0001. The features are semantic features of the two-dimensional image, including color, texture, shape, spatial relationship, attribute features, and the like.

Preferably, the image target recognition algorithm can automatically search for the optimal depth and width of the whole neural network through a group of compound coefficients. Unlike traditional methods, which rely on manually adjusting the network structure, this method can search out the group of parameters that best fits the network. The depth and the width of the network model are each expressed by a formula [formulas not reproduced in this text], in which two constants can be found by searching for optimal values and a third parameter is manually adjustable. When a constraint [not reproduced in this text] involving a set parameter is satisfied, the parameters of the network can be obtained adaptively. In this constrained optimization, the model is controlled by adjusting the manually adjustable parameter: at one particular value [not reproduced in this text] a minimal optimal base model is obtained, and as the parameter is increased both dimensions of the model are enlarged simultaneously, so the model becomes larger, its performance improves, and it consumes more resources.
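The scaling formulas themselves are missing from this text. The description (two searched constants, one manually adjusted coefficient, a constraint with a set parameter) resembles compound model scaling, in which depth grows as α^φ and width as β^φ subject to a constraint such as α·β² ≈ constant. The sketch below assumes that form. The symbols α, β, φ, the budget value of 2, the tolerance and the grid search are all assumptions for illustration, not values from the patent.

```python
import math
from typing import List, Tuple


def scaled_dims(
    base_depth: int, base_width: int, alpha: float, beta: float, phi: float
) -> Tuple[int, int]:
    """Scale layer count and channel count with one compound coefficient phi.

    Assumed form: depth = base_depth * alpha**phi, width = base_width * beta**phi.
    """
    depth = math.ceil(base_depth * alpha ** phi)
    width = math.ceil(base_width * beta ** phi)
    return depth, width


def candidate_coefficients(
    budget: float = 2.0, tol: float = 0.05, step: float = 0.05
) -> List[Tuple[float, float]]:
    """Grid of (alpha, beta) pairs with alpha * beta**2 close to the assumed budget.

    In practice each candidate would be scored by training a small model;
    that evaluation is outside this sketch.
    """
    pairs = []
    n = int(round(1.0 / step))
    for i in range(n + 1):
        for j in range(n + 1):
            alpha = round(1.0 + i * step, 2)
            beta = round(1.0 + j * step, 2)
            if abs(alpha * beta ** 2 - budget) <= tol:
                pairs.append((alpha, beta))
    return pairs
```

In this sketch φ = 0 reproduces the base dimensions, and larger φ enlarges depth and width together, which matches the behavior the text describes. The value the patent uses for its base model is not recoverable from the text.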

The image target detection algorithm adopts a multi-scale data fusion method, so its input is not limited to a fixed resolution and can be data of any resolution. The multi-scale data fusion method uses an attention feature pyramid structure in which a different attention weight is introduced for each layer of the feature pyramid, so that the feature contribution ratios of the different layers can be adjusted while input data of different scales are fused. The attention weight is calculated by a formula [not reproduced in this text] in which each weight may be a scalar for a feature, a vector for a channel, or a multi-dimensional tensor; i is the serial number of the current input feature, j is the serial number of the non-output feature, and ϵ = 0.0001. To ensure that each weight is non-negative, a ReLU activation function is applied before the weight calculation step.
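The attention weight formula is also missing from this text. The surviving details (non-negative weights via ReLU, indices i and j, ϵ = 0.0001) are consistent with a normalized weighted fusion of the form w_i / (ϵ + Σ_j w_j), where ϵ keeps the denominator away from zero. The sketch below assumes that form with scalar weights and features already resized to a common length; the function names are hypothetical.

```python
from typing import List, Sequence

EPSILON = 1e-4  # the value of epsilon stated in the text


def relu(x: float) -> float:
    return x if x > 0.0 else 0.0


def attention_weights(raw: Sequence[float], eps: float = EPSILON) -> List[float]:
    """Normalize learnable scalars: w_i / (eps + sum_j w_j), after ReLU."""
    positive = [relu(w) for w in raw]
    total = eps + sum(positive)
    return [w / total for w in positive]


def fuse(features: Sequence[Sequence[float]], raw_weights: Sequence[float]) -> List[float]:
    """Weighted sum of equally sized feature vectors from different scales."""
    weights = attention_weights(raw_weights)
    length = len(features[0])
    return [
        sum(w * feature[k] for w, feature in zip(weights, features))
        for k in range(length)
    ]
```

Compared with a softmax over the weights, this normalization avoids exponentials while still keeping every weight between 0 and 1.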

Before judging whether the behavior action of the target object meets the preset safety behavior standard based on the background element characteristics, the method further comprises the following steps: determining the position of the target object:

setting an ultra-wideband base station, determining the time consumed by data transmission between the ultra-wideband base station and an ultra-wideband tag of a target object, and measuring the distance between the ultra-wideband base station and the tag by a two-way time-of-flight method based on the time consumed by data transmission;

determining the relative position of the target object and the ultra-wideband base station based on the distance between the ultra-wideband base station and the tag;

determining the absolute position of the target object based on the absolute position of the ultra-wideband base station and the relative position of the target object and the ultra-wideband base station;

and judging whether the behavior action of the target object meets a preset safety behavior standard or not based on the absolute position and the background element characteristics of the target object.
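The ranging step can be illustrated with single-sided two-way ranging, one common form of the two-way time-of-flight method: one device timestamps a round trip, the other reports how long it took to reply, and half the difference is the one-way flight time. The patent does not say which variant is used, so the sketch below is an assumption, and the function and parameter names are hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def two_way_tof_distance(t_round_s: float, t_reply_s: float) -> float:
    """Distance from a single-sided two-way ranging exchange.

    t_round_s: time between sending the poll and receiving the response,
               measured by the initiator.
    t_reply_s: processing delay between receiving the poll and sending the
               response, reported by the responder.
    """
    if t_round_s < t_reply_s:
        raise ValueError("round-trip time cannot be shorter than the reply delay")
    time_of_flight_s = (t_round_s - t_reply_s) / 2.0
    return SPEED_OF_LIGHT_M_PER_S * time_of_flight_s
```

Because one nanosecond of timing error corresponds to about 0.3 m, practical systems also correct for clock drift between the two devices; that correction is outside this sketch.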

The indoor and outdoor integrated precise positioning algorithm provided by the invention provides a high-accuracy positioning scheme by combining the Beidou satellite and the ultra-wideband technology. The Beidou satellite provides large-range positioning in an outdoor scene, and the ultra-wideband technology provides outdoor small-range or indoor high-accuracy positioning. The two positioning techniques are combined as follows:

Step one: a worker wearing equipment fitted with an ultra-wideband tag enters a working scene in which several ultra-wideband base stations are installed. The distance between each base station and the tag is measured by the two-way time-of-flight method from the time consumed by data transmission between them. The relative position of the worker and the base stations is then determined by a trained neural network model from the distances between the base stations and the tag;

Step two: the absolute position of each ultra-wideband base station is measured by Beidou satellite positioning, using a multi-base-station real-time kinematic algorithm. Each fixed reference station does not send correction information directly to the mobile user; instead, all raw data are sent to a data processing center over a data communication link. The mobile station sends its approximate coordinates to the center; after receiving this position information, the center selects a group of best fixed reference stations and uses their data to simulate a virtual reference station near the user's position, which forms an ultra-short baseline with the mobile station. All the data are then fused and analyzed to determine the absolute position of the worker on the map.
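The two steps can be pictured as "locate the tag relative to the base stations, then shift by the surveyed base station position". The patent determines the relative position with a trained neural network; the sketch below swaps in linearized least-squares trilateration purely as a simpler stand-in, in two dimensions, with base station coordinates expressed relative to the first station. All names and the planar coordinate assumption are illustrative.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def trilaterate(anchors: Sequence[Point], distances: Sequence[float]) -> Point:
    """Least-squares 2D position from three or more anchor distances.

    Subtracting the first range equation from the others gives linear
    equations 2(xi-x0)x + 2(yi-y0)y = d0^2 - di^2 + xi^2 - x0^2 + yi^2 - y0^2,
    solved here through the 2x2 normal equations.
    """
    if len(anchors) < 3 or len(anchors) != len(distances):
        raise ValueError("need at least three anchors with matching distances")
    (x0, y0), d0 = anchors[0], distances[0]
    rows, rhs = [], []
    for (xi, yi), di in zip(anchors[1:], distances[1:]):
        rows.append((2.0 * (xi - x0), 2.0 * (yi - y0)))
        rhs.append(d0 ** 2 - di ** 2 + xi ** 2 - x0 ** 2 + yi ** 2 - y0 ** 2)
    a11 = sum(r[0] * r[0] for r in rows)
    a12 = sum(r[0] * r[1] for r in rows)
    a22 = sum(r[1] * r[1] for r in rows)
    b1 = sum(r[0] * v for r, v in zip(rows, rhs))
    b2 = sum(r[1] * v for r, v in zip(rows, rhs))
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; position is not determined")
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


def to_absolute(relative: Point, origin_absolute: Point) -> Point:
    """Shift a relative position by the surveyed position of the origin station."""
    return (origin_absolute[0] + relative[0], origin_absolute[1] + relative[1])
```

With noisy ranges, more than three base stations make the least-squares estimate more stable, which is one reason to install several stations in a working scene.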

The invention provides a cross-modal electric power operation site risk behavior recognition system, which comprises: a sensing unit, a transmission unit, an edge computing unit, a cloud platform unit and a terminal equipment unit.

The sensing unit is used for acquiring and capturing raw multi-modal data of the electric power operation site, so that risk behavior information on the site is collected comprehensively. Specifically, a two-dimensional image of the working site is collected by an ordinary camera, a depth image of the working site is collected by a depth camera, an acceleration signal is collected by an acceleration sensor, and positioning signals are collected by a Beidou chip and an ultra-wideband chip.

The transmission unit is used for transmitting the data acquired by the sensing unit to the edge computing unit;

the edge computing unit is used for receiving, processing and analyzing the captured data, and for judging, by means of the cross-modal power operation site risk behavior identification method, whether the behavior of the worker involves a risk. A warning is issued when a safety risk is judged to exist.

The cloud platform unit is used for aggregating and storing the risk behaviors of all working sites in the area, after which the aggregated data can be displayed visually and analyzed. The main functions of the cloud platform are remote visualization of operations, violation warning, violation statistics and the like. The remote visualization function allows the real-time video of every power working site connected to the cloud platform to be viewed remotely. The violation warning function receives the warnings sent by the edge computing unit and displays the time, the place and the violation image of each warning. The violation statistics function counts and analyzes all violations, for example by checking the number of violations in each region, so that safety supervision can be strengthened and safety education carried out where violations are frequent.

The terminal equipment unit can be used for recording the construction work ticket flow and the risks arising during the operation. The terminal equipment unit can exchange data with the cloud platform. When a worker commits a violation, the cloud platform can also send an alarm to the terminal equipment, and the business process is suspended; the subsequent steps cannot be performed until the safety supervisor has eliminated the risk and confirmed that construction is safe.
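The violation statistics function of the cloud platform described above can be illustrated with a short sketch. The record fields `region` and `kind` and the function name are assumptions made for this example only; the patent does not specify a data schema.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Optional, Tuple


class Violation(NamedTuple):
    region: str  # hypothetical field: where the violation occurred
    kind: str    # hypothetical field: what rule was violated


def summarize_violations(
    violations: Iterable[Violation],
) -> Tuple[Counter, Optional[str]]:
    """Count violations per region and find the most frequent violation kind."""
    items = list(violations)  # materialize so the data can be scanned twice
    by_region = Counter(v.region for v in items)
    by_kind = Counter(v.kind for v in items)
    most_frequent_kind = by_kind.most_common(1)[0][0] if by_kind else None
    return by_region, most_frequent_kind
```

The per-region counts support the supervision use case, and the most frequent violation kind indicates where safety education would be most useful.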

The multi-modal data in the sensing unit of the invention must be captured by different devices, and the placement of the acquisition devices must meet the data acquisition specification. For example, the camera should not be placed so close that the working site cannot be captured completely, nor so far away that workers and equipment appear too small in the image; the sensors should be placed so as to avoid signal shielding and interference; and the Beidou chip should be placed in a relatively open area where satellite signals can be received.

The transmission modes of the transmission unit comprise a wired transmission mode and a wireless transmission mode, and the optimal transmission mode is selected according to different data transmission requirements. Fixed equipment such as a visual camera, an ultra-wideband base station and the like adopts wired transmission to reduce mutual interference of signals, and equipment needing to be moved such as an acceleration sensor, an ultra-wideband tag and the like adopts a wireless transmission mode to ensure portability.

The wired transmission in the transmission unit of the invention mainly uses the universal serial bus (USB) for data transmission. USB has the advantages of simple connection, hot-plug support, high transmission speed, strong error correction capability, no need for an extra power supply, low cost and the like. The USB interface is uniform and simple to connect, the system can automatically detect and configure devices, and, because hot plugging is supported, the system does not need to be restarted when a device is added; the theoretical rate of the USB interface is 10 Gbps, and because part of the bandwidth is reserved to support other functions, the actual effective bandwidth is about 7.2 Gbps, so the theoretical transmission speed can reach 900 MB/s; the transmission protocol includes transmission error management, error recovery and similar functions, and handles transmission errors according to the transmission type; the bus can supply 5 V power to the devices connected to it, so small devices need no additional power supply; and the interface circuit is simple and easy to implement, at a lower cost than a serial port or a parallel port.
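The bandwidth figures above are related by a simple unit conversion, which the following sketch makes explicit. The helper names are illustrative; only the 10 Gbps and 7.2 Gbps figures come from the text.

```python
def gbps_to_megabytes_per_s(gbps: float) -> float:
    """Convert a bit rate in Gbps to a byte rate in MB/s (decimal units)."""
    bits_per_s = gbps * 1e9
    return bits_per_s / 8 / 1e6  # 8 bits per byte, 1e6 bytes per MB


def reserved_fraction(theoretical_gbps: float, effective_gbps: float) -> float:
    """Share of the theoretical rate that is not available for payload."""
    return 1.0 - effective_gbps / theoretical_gbps
```

With the stated values, 7.2 Gbps corresponds to 900 MB/s, and roughly 28% of the 10 Gbps theoretical rate is reserved for other functions.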

The wireless transmission in the transmission unit of the invention mainly uses ZigBee for data transmission. ZigBee has the advantages of low power consumption, low cost, short latency, high capacity, high security, unlicensed frequency bands and the like. The standby power consumption of ZigBee is lower than that of common wireless transmission modes such as Bluetooth and WiFi; the greatly simplified protocol reduces the requirements on the communication controller, so the cost is low; ZigBee responds quickly, generally needing only 15 ms to wake from the sleep state and only 30 ms for a node to connect and join the network; a three-level security mode is provided, namely no security setting, an access control list to prevent illegal acquisition of data, and symmetric encryption with the Advanced Encryption Standard (AES), so that the security attributes can be determined flexibly; and the common ZigBee frequency bands are 2.4 GHz, 868 MHz and 915 MHz, all of which are unlicensed.

In the edge computing unit of the invention, the data are processed and analyzed with the cross-modal electric power operation site risk behavior identification method to judge whether the operation behavior of the worker involves a safety risk. Detection is repeated in a loop until the operation is finished; when a risk is detected, a warning is issued and the risk record data are sent to the cloud platform.
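The detection loop of the edge computing unit can be sketched as below. This is an illustrative outline only: `detect_risk`, `warn` and `upload` are hypothetical callbacks standing in for the identification method, the alarm device and the cloud platform link, none of which the patent defines at code level.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class RiskRecord:
    sample_index: int  # position of the sample in the captured stream
    behavior: str      # label of the risky behavior that was detected


def run_edge_detection(
    samples: Iterable[object],
    detect_risk: Callable[[object], Optional[str]],
    warn: Callable[[RiskRecord], None],
    upload: Callable[[RiskRecord], None],
) -> List[RiskRecord]:
    """Check every captured sample until the stream (the operation) ends."""
    records: List[RiskRecord] = []
    for index, sample in enumerate(samples):
        behavior = detect_risk(sample)  # None means no risk was found
        if behavior is None:
            continue
        record = RiskRecord(index, behavior)
        warn(record)    # local warning on site
        upload(record)  # send the risk record to the cloud platform
        records.append(record)
    return records
```

Passing the detector and the two sinks as callbacks keeps the loop independent of any particular model or transport.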

The invention provides a cross-modal electric power operation site risk behavior recognition device, which comprises: an equipment box, an intelligent safety helmet and a positioning base station;

the equipment box comprises an algorithm operation platform, a camera, a depth camera, a wireless communication receiving end and alarm equipment. The cameras are connected to the algorithm operation platform by wire through a universal serial bus, and the sensors are connected to the algorithm operation platform wirelessly through ZigBee. Image information is acquired by the ordinary camera and the depth camera, sensor information is acquired by the wireless communication receiving end, the algorithm operation platform analyzes the data to judge whether the worker's operation meets the standard, and the alarm equipment issues a warning when dangerous behavior is detected;

the intelligent safety helmet is provided with an acceleration sensor, a Beidou positioning chip, an ultra-wideband positioning tag chip, a power supply module and a wireless transmission sending end. The wireless transmission sending end sends the acquired sensing information to the equipment box in real time for data analysis;

the positioning base station comprises an ultra-wideband positioning base station module, a Beidou positioning module and a power supply module. The positioning base station needs to be dispersedly arranged in a working site, and the periphery of the positioning base station cannot be shielded by metal or concrete buildings.

Referring to fig. 2, a schematic diagram of an embodiment of a method for identifying risk behaviors of a cross-modal power operation site provided in the present invention is shown. The method comprises the following steps:

step one, the sensing unit acquires and captures raw multi-modal data of the electric power operation site, so that the site is covered comprehensively. Specifically, a two-dimensional image of the working site is collected by an ordinary camera, a depth image of the working site is collected by a depth camera, an acceleration signal is collected by an acceleration sensor, and positioning signals are collected by a Beidou chip and an ultra-wideband chip. The placement of the acquisition devices must meet the data acquisition specification: for example, the camera should not be placed so close that the working site cannot be captured completely, nor so far away that workers and equipment appear too small in the image; the sensors should be placed so that metal and concrete obstructions do not interfere with signal transmission; and the Beidou chip should be placed in a relatively open area to ensure that satellite signals can be received.

And step two, the transmission unit transmits the data acquired by the sensing unit to the edge computing unit. The transmission modes of the transmission unit include wired transmission and wireless transmission, and the most suitable mode is selected according to the data transmission requirements. Fixed equipment such as cameras and ultra-wideband base stations uses wired transmission to reduce mutual interference of signals, and equipment that must move, such as acceleration sensors and ultra-wideband tags, uses wireless transmission to ensure portability.

Wired transmission mainly uses the universal serial bus (USB) for data transmission. USB has the advantages of simple connection, hot-plug support, high transmission speed, strong error correction capability, no need for an extra power supply, low cost and the like. The USB interface is uniform and simple to connect, the system can automatically detect and configure devices, and, because hot plugging is supported, the system does not need to be restarted when a device is added; the theoretical rate of the USB interface is 10 Gbps, and because part of the bandwidth is reserved to support other functions, the actual effective bandwidth is about 7.2 Gbps, so the theoretical transmission speed can reach 900 MB/s; the transmission protocol includes transmission error management, error recovery and similar functions, and handles transmission errors according to the transmission type; the bus can supply 5 V power to the devices connected to it, so small devices need no additional power supply; and the interface circuit is simple and easy to implement, at a lower cost than a serial port or a parallel port.

Wireless transmission mainly uses ZigBee for data transmission. ZigBee has the advantages of low power consumption, low cost, short latency, high capacity, high security, unlicensed frequency bands and the like. In this embodiment, a ZigBee chip operating in the 2.4 GHz band is selected as the wireless transmission chip.

And step three, the edge computing unit receives, processes and analyzes the captured data. Using the cross-modal power operation site risk behavior identification method, it checks the power field operation repeatedly in a loop until the operation is finished; when it judges that a field worker is exposed to a safety risk, it issues a warning and sends the risky behavior and the image data to the cloud platform.

And step four, the cloud platform unit aggregates and stores the risk behaviors of all working sites in the area, after which the aggregated data can be displayed visually and analyzed. The main functions of the cloud platform are remote visualization of operations, violation warning, violation statistics and the like. The remote visualization function allows all real-time video connected to the cloud platform to be viewed remotely. The violation warning function receives the warnings sent by the edge computing unit and displays the time, the place and the violation image. The violation statistics function counts and analyzes all violations; for example, the most frequent violations are identified to support safety education, and the number of violations in each region is checked to support stronger safety supervision.

And step five, the terminal equipment unit records the construction work ticket flow and prevents a work ticket from proceeding while the operation involves a risk. The terminal equipment unit can exchange data with the cloud platform. When a worker commits a violation, the cloud platform can also send an alarm to the terminal equipment and suspend the work ticket flow; the subsequent steps cannot be performed until the safety supervisor has eliminated the risk and confirmed that construction is safe.
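The work ticket behavior of step five amounts to a small state machine, sketched below. The class, state and method names are hypothetical and chosen for illustration; the patent describes only the suspend-and-confirm behavior.

```python
from enum import Enum
from typing import Iterable, List


class TicketState(Enum):
    IN_PROGRESS = "in_progress"
    SUSPENDED = "suspended"


class WorkTicket:
    """Work ticket flow that is blocked while a violation is unresolved."""

    def __init__(self, steps: Iterable[str]) -> None:
        self.steps: List[str] = list(steps)
        self.next_step = 0
        self.state = TicketState.IN_PROGRESS

    def on_violation_alarm(self) -> None:
        # Alarm pushed from the cloud platform: pause the flow.
        self.state = TicketState.SUSPENDED

    def confirm_safe(self) -> None:
        # Safety supervisor has eliminated the risk: resume the flow.
        self.state = TicketState.IN_PROGRESS

    def advance(self) -> str:
        if self.state is TicketState.SUSPENDED:
            raise RuntimeError("work ticket suspended pending safety confirmation")
        if self.next_step >= len(self.steps):
            raise IndexError("work ticket already completed")
        step = self.steps[self.next_step]
        self.next_step += 1
        return step
```

A suspended ticket refuses to advance, so later steps stay blocked until `confirm_safe` is called.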

Referring to fig. 3, a schematic diagram of an embodiment of a method for identifying risk behaviors of a cross-modal power operation site provided in the present invention is shown. The method comprises an image target recognition algorithm and a risk behavior recognition algorithm:

the image target recognition algorithm is used for recognizing and extracting the various static feature elements in the power operation working scene. For example, images taken by a camera are used to detect whether a worker's protective equipment is fully worn.

The risk behavior recognition algorithm is used for recognizing risk behaviors that cannot be identified from a single image. It comprises a cross-modal behavior recognition algorithm and an indoor and outdoor integrated precise positioning algorithm. The cross-modal behavior recognition algorithm recognizes the behavior and motion state of the workers and analyzes whether their behavior meets the safety standard; the indoor and outdoor integrated precise positioning algorithm determines and tracks the real-time position of each worker and analyzes whether that position meets the safety standard. Based on predefined dangerous areas and working areas, a warning is issued when a worker enters a dangerous area or leaves the working post.
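The position check against predefined areas can be sketched as a point-in-polygon test. The sketch assumes that dangerous areas and the working area are given as two-dimensional polygons in map coordinates, which the patent does not specify; the ray-casting test and all names are illustrative.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def point_in_polygon(p: Point, polygon: Sequence[Point]) -> bool:
    """Ray-casting test: count polygon edges crossed by a ray going right."""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge spans the horizontal line through p
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def position_warnings(
    p: Point, work_area: Sequence[Point], danger_areas: Sequence[Sequence[Point]]
) -> List[str]:
    """Return the warnings triggered by a worker standing at position p."""
    warnings: List[str] = []
    if any(point_in_polygon(p, area) for area in danger_areas):
        warnings.append("entered dangerous area")
    if not point_in_polygon(p, work_area):
        warnings.append("left working post")
    return warnings
```

For a 10 m square working area containing a 2 m square dangerous area, a position inside the inner square triggers the first warning, and a position outside the outer square triggers the second.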

Referring to fig. 4, a schematic diagram of an embodiment of a network structure of an image target recognition algorithm provided in the present invention is shown. The method comprises the following steps:

For a convolutional network, the $i$-th convolutional layer can be represented as $Y_i = F_i(X_i)$, where $X_i$ is the input tensor, $Y_i$ is the output tensor, and $i$ is the index of the convolutional layer. The whole network $N$ is composed of $k$ convolutional layers and can be expressed as $N = F_k \odot \cdots \odot F_2 \odot F_1(X_1)$, which can be simplified to $N = \bigodot_{j=1 \ldots k} F_j(X_1)$, where $j$ is the index of the convolutional layer. In practical applications, a plurality of convolutional layers with the same structure are generally grouped into a stage; for example, ResNet has 5 stages, and every convolutional layer within a stage has the same structure. Taking the stage as the unit, a convolutional network can be expressed as:

$$N = \bigodot_{i=1 \ldots s} F_i^{L_i}\left(X_{\langle H_i, W_i, C_i \rangle}\right)$$

where $\langle H_i, W_i, C_i \rangle$ denotes the dimensions of the tensor at layer $i$, and $F_i^{L_i}$ denotes that the $i$-th stage consists of the convolutional layer $F_i$ repeated $L_i$ times. Network design generally focuses on finding the best network layer $F_i$.
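The stage-wise view of a network, in which each stage repeats one layer a fixed number of times, can be illustrated with a toy composition. Scalar functions stand in for convolutional layers here; this is an illustration of the composition pattern only, not an implementation of the patent's network.

```python
from functools import reduce
from typing import Callable, Sequence, Tuple

Layer = Callable[[float], float]


def compose_stages(stages: Sequence[Tuple[Layer, int]]) -> Layer:
    """Build a network from (layer, repeat_count) pairs applied in order."""
    # Unroll every stage into its repeated layers, preserving stage order.
    layers = [layer for layer, repeats in stages for _ in range(repeats)]

    def network(x: float) -> float:
        # Feed the output of each layer into the next one.
        return reduce(lambda value, layer: layer(value), layers, x)

    return network
```

For example, a first stage that adds 1 three times followed by a second stage that doubles twice maps an input of 0 to 12.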

Larger networks have greater model width and depth and can often achieve higher precision, but the precision gain saturates quickly once it reaches a certain value, which shows the limitation of expanding only a single dimension. The dimensions of model expansion are not completely independent, which means that the various expansion dimensions need to be balanced.

The image target recognition algorithm can automatically search for the optimal depth and width of the whole neural network through a set of compound coefficients. Unlike the traditional approach, in which the structure of the network is adjusted manually, this method can find the set of parameters that best fits the network. The depth of the network model can be expressed as $d = \alpha^{\phi}$ and the width of the network model as $w = \beta^{\phi}$, where $\alpha$ and $\beta$ are constants that can be found by an optimum search and $\phi$ is a manually adjustable parameter. When the constraint condition is satisfied, the parameters of the network can be obtained adaptively. Here $\phi$ controls the model: at its base value the smallest optimal base model is obtained; when $\phi$ is increased, both dimensions of the model are enlarged simultaneously, so the model becomes larger, its performance improves, and it consumes more resources.
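The effect of scaling depth and width with a single coefficient can be sketched as follows. The sketch assumes multipliers of the form $\alpha^{\phi}$ for depth and $\beta^{\phi}$ for width, applied to a base model; this form, the rounding rule and the example constants are assumptions made for illustration and are not values stated in the patent.

```python
import math
from typing import Tuple


def compound_scale(
    alpha: float, beta: float, phi: float, base_depth: int, base_width: int
) -> Tuple[int, int]:
    """Scale a base model's layer count and channel count together.

    alpha, beta: constants assumed to come from a prior optimum search.
    phi: the single user-chosen coefficient that enlarges both dimensions.
    """
    depth = math.ceil(base_depth * alpha ** phi)  # number of layers
    width = math.ceil(base_width * beta ** phi)   # number of channels
    return depth, width
```

With illustrative constants `alpha = 1.2` and `beta = 1.1`, a coefficient of 0 leaves a base model of 4 layers and 32 channels unchanged, while larger coefficients enlarge both dimensions at once.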

When fusing different input features, previous research on FPN and on improvements to FPN has mostly just added the features together. In practice these input features have different resolutions, and experimental observation shows that their contributions to the fused output feature are usually unequal. To solve this problem, the image target detection algorithm adopts a multi-scale data fusion method. The input of the algorithm is not limited to a fixed resolution and may be data of any resolution. The multi-scale data fusion method adopts an attention feature pyramid structure in which a different attention weight is introduced for each layer of the feature pyramid, so that the contribution ratios of the features of different layers can be adjusted while input data of different scales are fused. The attention weight of the $i$-th input feature is calculated by the formula

$$\frac{w_i}{\epsilon + \sum_j w_j}$$

where $w_i$ is a scalar of the feature or a vector over the two-dimensional image channels.
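A weighted fusion with normalized attention weights can be sketched as below. The sketch assumes that each weight is divided by a small constant plus the sum of all weights, with a constant of 0.0001 as given later in the description. It further assumes that the features have already been resized to a common length and that the weights are clamped to be non-negative. Function names are illustrative.

```python
from typing import List, Sequence


def normalized_weights(raw_weights: Sequence[float], eps: float = 1e-4) -> List[float]:
    """Normalize learnable weights so that they sum to slightly less than 1."""
    clamped = [max(w, 0.0) for w in raw_weights]  # assumption: non-negative weights
    total = eps + sum(clamped)  # eps avoids division by zero
    return [w / total for w in clamped]


def fuse_features(
    features: Sequence[Sequence[float]],
    raw_weights: Sequence[float],
    eps: float = 1e-4,
) -> List[float]:
    """Element-wise weighted sum of equally sized feature vectors."""
    weights = normalized_weights(raw_weights, eps)
    length = len(features[0])
    return [
        sum(weights[i] * features[i][k] for i in range(len(features)))
        for k in range(length)
    ]
```

Raising one raw weight relative to the others increases the share of the corresponding pyramid level in the fused output.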

Referring to fig. 5, a schematic diagram of an embodiment of a network structure of a cross-modal behavior recognition algorithm provided in the present invention is shown. The input data of the cross-modal behavior recognition algorithm are data of a plurality of different modalities, including a two-dimensional image acquired by a common camera, a depth image acquired by a depth camera, and an acceleration signal acquired by an acceleration sensor. The cross-modal behavior recognition algorithm comprises the following steps:

step one, the acceleration signals over a continuous period of time are processed and their features are extracted by a neural network model. The human body features of the worker are extracted from the depth image at the end time by a deep neural network. The background element features are extracted from the two-dimensional image at the end time by a deep neural network. All of the features are input into a neural network discrimination model, which determines the behavior of the worker from the acceleration signal features and the depth image features, and then judges whether that behavior meets the safety standard in combination with the background features of the two-dimensional image.

For example, in a high-voltage electroscope operation, if the worker does not raise the high-voltage electroscope rod, or the worker makes the hand-raising action without holding the high-voltage electroscope rod, the behavior is determined to be a violation.
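The electroscope example combines an action recognized from the acceleration and depth features with a background element recognized in the two-dimensional image. A minimal rule-based sketch of that final check is given below; the field names and the `"raise_hand"` label are hypothetical, and in the patent this judgment is made by a neural network discrimination model rather than a hand-written rule.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    action: str        # behavior inferred from acceleration and depth features
    rod_in_hand: bool  # background element detected in the 2D image


def electroscope_compliant(obs: Observation) -> bool:
    """Compliant only if the hand is raised and the rod is actually held."""
    return obs.action == "raise_hand" and obs.rod_in_hand
```

Both failure cases from the example are covered: no raising action at all, and a raising action without the rod.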

Fig. 6 is a schematic diagram of an embodiment of an indoor and outdoor integrated precise positioning algorithm provided in the present invention. The method comprises the following steps:

the indoor and outdoor integrated precise positioning algorithm provides a high-accuracy positioning scheme through combination of a Beidou satellite and an ultra-wideband technology. The Beidou satellite provides large-range positioning in an outdoor scene, and the ultra-wideband technology provides outdoor small-range or indoor high-accuracy positioning. The two positioning techniques are combined as follows:

step one, a worker wears a device fitted with an ultra-wideband tag and enters a working scene in which a plurality of ultra-wideband base stations are installed; this embodiment of the invention uses 4 base stations. The distance between each base station and the tag is measured by the two-way time-of-flight method from the time consumed by data transmission between the ultra-wideband base station and the tag. The relative position of the worker with respect to the base stations is then determined by a trained neural network model from the distances between the base stations and the tag;
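To show how four base-station distances constrain a position, the following sketch uses classical least-squares multilateration in two dimensions. This technique is substituted here for illustration only: the patent determines the relative position with a trained neural network model, and the function name and coordinate convention are assumptions.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def locate_tag_2d(anchors: Sequence[Point], distances: Sequence[float]) -> Point:
    """Least-squares tag position from base-station positions and ranges."""
    (x0, y0), d0 = anchors[0], distances[0]
    rows, rhs = [], []
    # Subtracting the first circle equation from the others gives linear equations.
    for (xi, yi), di in zip(anchors[1:], distances[1:]):
        rows.append((2.0 * (xi - x0), 2.0 * (yi - y0)))
        rhs.append(d0**2 - di**2 + xi**2 - x0**2 + yi**2 - y0**2)
    # Solve the 2x2 normal equations (A^T A) p = A^T b.
    a11 = sum(r[0] * r[0] for r in rows)
    a12 = sum(r[0] * r[1] for r in rows)
    a22 = sum(r[1] * r[1] for r in rows)
    b1 = sum(r[0] * b for r, b in zip(rows, rhs))
    b2 = sum(r[1] * b for r, b in zip(rows, rhs))
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("base stations are collinear; position is ambiguous")
    return (a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det
```

With noise-free ranges the solution is exact; with measured ranges it is the least-squares compromise among the base stations.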

Referring to fig. 7, a schematic diagram of an embodiment of the cross-modal power operation site risk behavior identification device provided in the present invention is shown. The device includes an equipment box, an intelligent safety helmet, and a positioning base station:

the positioning base station is composed of an ultra-wideband positioning base station module, a Beidou positioning module and a power supply module. The positioning base stations need to be distributed around the working site, and must not be shielded by surrounding metal or concrete structures. To ensure positioning accuracy, the location of each positioning base station must be sufficiently open.

In power production, the cross-modal power operation site risk behavior identification method can accurately monitor the positions and behavior states of all workers on site in real time, and can raise an alarm promptly when abnormal behaviors occur, such as approaching a dangerous area, leaving a working post, or illegally contacting a live part. Compared with the existing practice of manual observation, efficiency is greatly improved and the safety of power production is enhanced, so the method is highly practical.

Fig. 8 is a diagram of a system for identifying risk behaviors in accordance with a preferred embodiment of the present invention. As shown in fig. 8, the present invention provides a system for identifying risk behaviors, the system comprising:

the first obtaining unit 801 is configured to select acceleration signals of a target object in a continuous time period when the target object performs a target behavior, and obtain an acceleration feature vector of the acceleration signals in the continuous time period through a deep neural network model.

The second obtaining unit 802 is configured to select a depth image of the target object at the target behavior end time, and obtain a target object feature vector of the target object in the depth image through the deep neural network model.

A third obtaining unit 803, configured to select a two-dimensional image of the target object at the end time of the target behavior, and obtain, through the deep neural network model, a background element feature of the target object in the two-dimensional image.

A fourth obtaining unit 804, configured to input the acceleration feature vector and the target object feature vector into the neural network discrimination model, and obtain a behavior action of the target object.

And the result unit 805 is configured to determine whether the behavior action of the target object meets a preset safety behavior standard based on the background element characteristics.

Preferably, the system further comprises a fifth obtaining unit for determining the depth and width of the deep neural network model;

the depth of the deep neural network model is $d = \alpha^{\phi}$;

the width of the deep neural network model is $w = \beta^{\phi}$;

wherein $\alpha$ and $\beta$ are constants found by searching for optimum values, and $\phi$ is a manually adjusted parameter;

when the constraint condition is satisfied, the parameters of the deep neural network model are adaptively obtained, wherein $\phi$ is a set parameter.

Preferably, a fifth obtaining unit is further included, configured to determine attention weights of the deep neural network model:

different attention weights are set for each layer of the attention feature pyramid structure, corresponding respectively to the feature contribution ratios of two-dimensional images with different resolutions;

the attention weight calculation formula is:

$$\frac{w_i}{\epsilon + \sum_j w_j}$$

wherein $w_i$ is a scalar of the feature or a vector over the two-dimensional image channels; $i$ is the index of the current input feature, $j$ is the index of the non-output feature, and $\epsilon = 0.0001$. The feature is a semantic feature of the two-dimensional image, including color, texture, shape, spatial relationship, attribute features and the like.

Preferably, the system further comprises a sixth acquisition unit for determining the position of the target object: acquiring the time consumed by data transmission between a preset ultra-wideband base station and an ultra-wideband tag of the target object, and measuring the distance between the ultra-wideband base station and the tag by the two-way time-of-flight method based on the time consumed by the data transmission;

The system 800 for identifying a risk behavior according to the preferred embodiment of the present invention corresponds to the method 100 for identifying a risk behavior according to the preferred embodiment of the present invention, and will not be described herein again.

The invention further provides an electronic device, comprising: a processor; and a memory for storing the processor-executable instructions;

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

It should be noted that the above-mentioned embodiments are only for illustrating the technical solutions of the present invention and not for limiting the protection scope thereof, and although the present invention has been described in detail with reference to the above-mentioned embodiments, those skilled in the art should understand that after reading the present invention, they can make various changes, modifications or equivalents to the specific embodiments of the present invention, but these changes, modifications or equivalents are within the protection scope of the appended claims.

The invention has been described with reference to a few embodiments. However, other embodiments of the invention than the one disclosed above are equally possible within the scope of the invention, as would be apparent to a person skilled in the art from the appended patent claims.

Generally, all terms used in the claims are to be interpreted according to their ordinary meaning in the technical field, unless explicitly defined otherwise herein. All references to "a/an/the [ device, component, etc ]" are to be interpreted openly as referring to at least one instance of said device, component, etc., unless explicitly stated otherwise. The steps of any method disclosed herein do not have to be performed in the exact order disclosed, unless explicitly stated.

Claims

1. A method for identifying risk behaviors, the method comprising:

determining a location of the target object;

based on the background element characteristics, judging whether the behavior action of the target object meets a preset safety behavior standard or not, including:

2. The method of claim 1, further comprising determining a depth and a width of the deep neural network model:

the depth of the deep neural network model is $d = \alpha^{\phi}$;

the width of the deep neural network model is $w = \beta^{\phi}$;

wherein $\alpha$ and $\beta$ are constants found by searching for optimum values, and $\phi$ is a manually adjusted parameter;

when the constraint condition is satisfied, the parameters of the deep neural network model are adaptively obtained, wherein $\phi$ is a set parameter.

3. The method of claim 1, further comprising: determining attention weights for the deep neural network model:

the attention weight calculation formula is as follows:

$$\frac{w_i}{\epsilon + \sum_j w_j}$$

wherein $w_i$ is a scalar of a feature or a vector over the two-dimensional image channels.

4. A system for identifying risk behaviors, the system comprising:

a sixth acquisition unit for determining a position of the target object:

and the result unit is used for judging whether the behavior action of the target object meets a preset safety behavior standard or not based on the absolute position of the target object and the background element characteristics.

5. The system of claim 4, further comprising a fifth acquisition unit for determining a depth and a width of the deep neural network model;

the depth of the deep neural network model is $d = \alpha^{\phi}$;

the width of the deep neural network model is $w = \beta^{\phi}$;

wherein $\alpha$ and $\beta$ are constants found by searching for optimum values, and $\phi$ is a manually adjusted parameter;

when the constraint condition is satisfied, the parameters of the deep neural network model are adaptively obtained, wherein $\phi$ is a set parameter.

6. The system of claim 4, further comprising a fifth obtaining unit for determining attention weights of the deep neural network model:

the attention weight calculation formula is as follows:

$$\frac{w_i}{\epsilon + \sum_j w_j}$$

wherein $w_i$ is a scalar of a feature or a vector over the two-dimensional image channels.

7. A computer-readable storage medium, characterized in that it stores a computer program for performing the method of any of the preceding claims 1-3.

8. An electronic device, characterized in that the electronic device comprises:

a processor;

a memory for storing the processor-executable instructions;

the processor is configured to read the executable instructions from the memory and execute the instructions to implement the method of any one of claims 1 to 3.