CN113627406A - Abnormal behavior detection method and device, computer equipment and storage medium - Google Patents


Info

Publication number: CN113627406A (application CN202111189600.3A)
Authority: CN (China)
Prior art keywords: target, behavior, detection, result, video stream
Legal status: Granted; Active (the legal status is an assumption, not a legal conclusion)
Other languages: Chinese (zh)
Other versions: CN113627406B (en)
Inventors: 李鹏, 黄文琦, 梁凌宇, 曾群生, 陈佳捷, 吴洋, 刘高
Assignee (original and current): Southern Power Grid Digital Grid Research Institute Co Ltd
Events: application CN202111189600.3A filed by Southern Power Grid Digital Grid Research Institute Co Ltd; publication of CN113627406A; application granted; publication of CN113627406B


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods


Abstract

The application relates to an abnormal behavior detection method, an abnormal behavior detection device, a computer device, and a storage medium. The method comprises the following steps: acquiring target video stream data; inputting the target video stream data into a pre-trained multi-task detection model to obtain a detection result of a target object contained in the target video stream data, the detection result comprising an object identification result and a safety dressing detection result corresponding to each safety dressing type; and determining a first abnormal behavior detection result according to the detection result of the target object. Because the method does not rely on manual inspection, the effectiveness of abnormal behavior detection is improved. In addition, the multi-task detection model checks, while detecting a pedestrian, whether the pedestrian is wearing each type of safety dressing, which further improves detection and effectively prevents safety accidents.

Description

Abnormal behavior detection method and device, computer equipment and storage medium
Technical Field
The present disclosure relates to the field of power operation field control technologies, and in particular, to a method and an apparatus for detecting abnormal behavior, a computer device, and a storage medium.
Background
Electric power production is widely recognized as a high-risk field of operation. Statistics show that most power-safety production accidents are caused by habitual violations committed by field operators. To ensure the safety of power field operations, the behavior of field operators must therefore be monitored.
At present, a safety officer is typically stationed at the power operation site, particularly at live-working sites, to monitor whether workers' behavior complies with regulations during construction, and to warn against and record abnormal behavior.
However, this approach requires training and deploying professional safety personnel, which is time-consuming and labor-intensive. Moreover, a human observer's attention is limited, so negligence and omissions during supervision are hard to avoid; the detection of abnormal behavior is therefore poor, and safety accidents may result.
Disclosure of Invention
In view of the above, it is necessary to provide an abnormal behavior detection method, an abnormal behavior detection apparatus, a computer device, and a storage medium, which can improve the effect of detecting abnormal behavior.
A method of abnormal behavior detection, the method comprising:
acquiring target video stream data;
inputting the target video stream data into a multi-task detection model trained in advance to obtain a detection result of a target object contained in the target video stream data; the detection result comprises an object identification result and a safety dressing detection result corresponding to each safety dressing type;
and determining a first abnormal behavior detection result according to the detection result of the target object.
In one embodiment, the multitask detection model comprises a feature extraction module, an object identification module and a detection module corresponding to each safety dressing type;
the inputting the target video stream data into a multi-task detection model trained in advance to obtain a detection result of a target object contained in the target video stream data includes:
inputting the target video stream data to the feature extraction module to obtain convolution features of the target video stream data;
inputting the convolution characteristics to the object identification module to obtain an object identification result;
and inputting the convolution characteristics to the detection modules corresponding to the safety dressing types to obtain the safety dressing detection results corresponding to the safety dressing types.
In one embodiment, the training process of the multi-tasking detection model comprises:
acquiring a training data set; the training data set comprises a first sample image, a sample identification result of an object contained in the first sample image, a second sample image, and a sample detection result corresponding to each safety dressing type for an object contained in the second sample image;
inputting the training data set into a target neural network to obtain an object identification prediction result and a prediction result corresponding to each safe dressing type;
determining a target loss function according to the object identification prediction result, the sample identification result of the object, the prediction result corresponding to each safe dressing type and the sample detection result corresponding to each safe dressing type, and iteratively updating the parameters of the target neural network according to the target loss function;
and when the target loss function meets a preset condition, stopping the iterative update of the parameters of the target neural network to obtain the multi-task detection model.
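The iterative-update procedure above can be sketched as follows. The function names `compute_target_loss` and `update_params`, and the threshold-based stopping condition, are illustrative assumptions, not names taken from the patent.

```python
def train_multitask(model, dataset, compute_target_loss, update_params,
                    loss_threshold=0.01, max_iters=10_000):
    """Iteratively update the network parameters until the target loss
    satisfies the preset condition (here: falls below a threshold)."""
    for _ in range(max_iters):
        loss = compute_target_loss(model, dataset)
        if loss <= loss_threshold:   # preset condition met: stop updating
            break
        update_params(model, loss)   # e.g. one gradient-descent step
    return model
```

In practice the preset condition could equally be a fixed iteration budget or a convergence check on the loss; the threshold form is used here only because it is the simplest to state.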
In one embodiment, the determining the target loss function according to the object identification prediction result, the sample identification result of the object, the prediction result corresponding to each safety clothing type, and the sample detection result corresponding to each safety clothing type includes:
determining a target position frame marking loss function and a target category prediction loss function according to the object identification prediction result and the sample identification result of the object;
determining a loss function corresponding to each safe dressing type according to the prediction result corresponding to each safe dressing type and the sample detection result corresponding to each safe dressing type;
and adding the target position frame marking loss function, the target category prediction loss function, and the loss functions corresponding to the safety dressing types to obtain the target loss function.
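As a minimal illustration of this summation (function and argument names are hypothetical):

```python
def target_loss(frame_marking_loss, category_loss, per_type_losses):
    """Target loss = position frame marking loss + category prediction loss
    + the loss for each safety dressing type, all added together."""
    return frame_marking_loss + category_loss + sum(per_type_losses)
```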
In one embodiment, the method further comprises:
inputting the target video stream data into a pre-trained behavior prediction model to obtain a behavior prediction result of the target object;
and determining a second abnormal behavior detection result according to the behavior prediction result of the target object.
In one embodiment, the inputting the target video stream data into a pre-trained behavior prediction model to obtain a behavior prediction result of the target object includes:
acquiring a preset number of image frames containing a target object from the target video stream data;
and inputting the preset number of image frames into a pre-trained behavior prediction model to obtain a behavior prediction result of the target object.
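The frame-selection step can be sketched as follows; how a frame is represented and the `contains_target` predicate are assumptions for illustration only.

```python
def sample_frames(video_frames, contains_target, preset_number):
    """Return the first `preset_number` frames in which the target object appears."""
    selected = []
    for frame in video_frames:
        if contains_target(frame):
            selected.append(frame)
            if len(selected) == preset_number:
                break
    return selected
```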
In one embodiment, the method further comprises:
performing signboard recognition according to the target video stream data to obtain a signboard recognition result; the signboard recognition result includes: a prohibited behavior and a region corresponding to the prohibited behavior;
determining a second abnormal behavior detection result according to the behavior prediction result of the target object, wherein the determining comprises:
determining the predicted behavior of the target object according to the behavior prediction result of the target object;
if the predicted behavior is the prohibited behavior and the target object is in the area corresponding to the prohibited behavior, determining that a second abnormal behavior detection result is an abnormal behavior;
and if the predicted behavior is not the prohibited behavior or the target object is not in the area corresponding to the prohibited behavior, determining that a second abnormal behavior detection result is a non-abnormal behavior.
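The decision rule above can be sketched as follows; representing a region by a simple identifier and modelling "in the area" as equality is an illustrative simplification, not the patent's geometry.

```python
def second_abnormal_result(predicted_behavior, object_region,
                           prohibited_behavior, prohibited_region):
    """Abnormal only when the predicted behavior is the prohibited behavior
    AND the target object is inside the region tied to that prohibition."""
    if (predicted_behavior == prohibited_behavior
            and object_region == prohibited_region):
        return "abnormal behavior"
    return "non-abnormal behavior"
```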
An abnormal behavior detection apparatus, the apparatus comprising:
the first acquisition module is used for acquiring target video stream data;
the multitask detection module is used for inputting the target video stream data into a multitask detection model trained in advance to obtain a detection result of a target object contained in the target video stream data; the detection result comprises an object identification result and a safety dressing detection result corresponding to each safety dressing type;
and the first determining module is used for determining a first abnormal behavior detection result according to the detection result of the target object.
In one embodiment, the multitask detection model comprises a feature extraction module, an object identification module and a detection module corresponding to each safety dressing type;
the multitask detection module is specifically configured to:
inputting the target video stream data to the feature extraction module to obtain convolution features of the target video stream data;
inputting the convolution characteristics to the object identification module to obtain an object identification result;
and inputting the convolution characteristics to the detection modules corresponding to the safety dressing types to obtain the safety dressing detection results corresponding to the safety dressing types.
In one embodiment, the apparatus further comprises:
the second acquisition module is used for acquiring a training data set; the training data set comprises a first sample image, a sample identification result of an object contained in the first sample image, a second sample image, and a sample detection result corresponding to each safety dressing type for an object contained in the second sample image;
the prediction module is used for inputting the training data set into a target neural network to obtain an object recognition prediction result and a prediction result corresponding to each safe dressing type;
the updating module is used for determining a target loss function according to the object identification prediction result, the sample identification result of the object, the prediction result corresponding to each safe dressing type and the sample detection result corresponding to each safe dressing type, and iteratively updating the parameters of the target neural network according to the target loss function; and when the target loss function meets a preset condition, stopping the iterative update of the parameters of the target neural network to obtain the multi-task detection model.
In one embodiment, the update module is specifically configured to:
determining a target position frame marking loss function and a target category prediction loss function according to the object identification prediction result and the sample identification result of the object;
determining a loss function corresponding to each safe dressing type according to the prediction result corresponding to each safe dressing type and the sample detection result corresponding to each safe dressing type;
and adding the target position frame marking loss function, the target category prediction loss function, and the loss functions corresponding to the safety dressing types to obtain the target loss function.
In one embodiment, the apparatus further comprises:
the behavior prediction module is used for inputting the target video stream data into a pre-trained behavior prediction model to obtain a behavior prediction result of the target object;
and the second determining module is used for determining a second abnormal behavior detection result according to the behavior prediction result of the target object.
In one embodiment, the behavior prediction module is specifically configured to:
acquiring a preset number of image frames containing a target object from the target video stream data;
and inputting the preset number of image frames into a pre-trained behavior prediction model to obtain a behavior prediction result of the target object.
In one embodiment, the apparatus further comprises:
the signboard identification module is used for identifying a signboard according to the target video stream data to obtain a signboard identification result; the signboard recognition result includes: a prohibited behavior and a region corresponding to the prohibited behavior;
the second determining module is specifically configured to:
determining the predicted behavior of the target object according to the behavior prediction result of the target object;
if the predicted behavior is the prohibited behavior and the target object is in the area corresponding to the prohibited behavior, determining that a second abnormal behavior detection result is an abnormal behavior;
and if the predicted behavior is not the prohibited behavior or the target object is not in the area corresponding to the prohibited behavior, determining that a second abnormal behavior detection result is a non-abnormal behavior.
A computer device, comprising a memory storing a computer program and a processor which, when executing the computer program, implements the steps of the method described above.
A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method described above.
With the abnormal behavior detection method, the abnormal behavior detection device, the computer device, and the storage medium described above, target video stream data is acquired and input into a pre-trained multi-task detection model to obtain a detection result of a target object contained in the target video stream data; the detection result comprises an object identification result and a safety dressing detection result corresponding to each safety dressing type; and a first abnormal behavior detection result is determined according to the detection result of the target object. Manual inspection is thus avoided, improving the effectiveness of abnormal behavior detection. In addition, because the multi-task detection model checks, while detecting a pedestrian, whether the pedestrian is wearing each type of safety dressing, detection is further improved and safety accidents are effectively prevented.
Drawings
FIG. 1 is a schematic flow chart diagram illustrating a method for abnormal behavior detection in one embodiment;
FIG. 2 is a diagram illustrating a detection result of a target object according to an example;
FIG. 3 is a flowchart illustrating steps of obtaining a detection result of a target object according to an embodiment;
FIG. 4 is a schematic diagram of the structure of a multitasking detection model in one example;
FIG. 5 is a flow diagram that illustrates the training process for the multi-tasking detection model, under an embodiment;
FIG. 6 is a schematic flow chart of the step of determining the objective loss function in one embodiment;
FIG. 7 is a flowchart illustrating a method for abnormal behavior detection in another embodiment;
FIG. 8 is a flowchart illustrating the step of obtaining behavior prediction information of a target object in another embodiment;
FIG. 9 is a diagram illustrating an exemplary process for obtaining behavior prediction information of a target object;
FIG. 10 is a flowchart illustrating a method for detecting abnormal behavior in another embodiment;
FIG. 11 is a schematic view of a signboard and a corresponding prohibited action zone;
FIG. 12 is a signboard schematic view of a construction site;
fig. 13 is a block diagram showing the structure of an abnormal behavior detection apparatus according to an embodiment;
FIG. 14 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In an embodiment, as shown in fig. 1, an abnormal behavior detection method is provided. This embodiment is illustrated by applying the method to a terminal; it is to be understood that the method may also be applied to a server, or to a system comprising a terminal and a server and implemented through interaction between the two. The terminal may be, but is not limited to, a personal computer, a notebook computer, a smartphone, a tablet computer, or a portable wearable device, and the server may be implemented as an independent server or as a server cluster composed of multiple servers. In this embodiment, the method includes the following steps:
step 101, obtaining target video stream data.
In the embodiment of the application, the terminal can shoot target video stream data at a job site through the camera equipment. The operation site may be any construction site, such as an electric power operation site, an oil exploration site, and a construction operation site.
Step 102, inputting the target video stream data into a multi-task detection model trained in advance to obtain a detection result of a target object contained in the target video stream data.
The detection result comprises an object identification result and a safety dressing detection result corresponding to each safety dressing type.
In the embodiment of the application, the terminal may store a pre-trained multi-task detection model. The trained multi-task detection model performs multiple recognition tasks simultaneously: it identifies a target object in the video stream data and, at the same time, determines whether the target object is wearing each type of safety dressing. The terminal inputs the target video stream data into the trained multi-task detection model to obtain the object identification result of the target object contained in the target video stream data and the safety dressing detection result corresponding to each safety dressing type, which together constitute the detection result of the target object. The target objects are all persons entering the operation site. Each safety dressing type is one of the kinds of safety dressing required at the job site, for example a safety helmet, work clothes, insulating shoes, insulating gloves, safety glasses, or a safety rope; the embodiment of the present application is not limited in this respect.
And 103, determining a first abnormal behavior detection result according to the detection result of the target object.
In this embodiment, the terminal may determine whether the target object included in the target video stream data wears each type of secure clothing according to the detection result of the target object. Then, the terminal may determine the first abnormal behavior detection result according to whether or not the target object included in the target video stream data wears various types of secure dresses. The first abnormal behavior detection result may indicate whether the target object has an abnormal behavior for each safe dressing type.
In one example, the object identification result may include the probability of the object category, and the safety dressing detection result corresponding to each safety dressing type may include a probability for that type. For the safety dressing detection result of each safety dressing type: if the probability corresponding to the type is greater than a preset first probability threshold, it is determined that the target object is wearing that type of safety dressing; if the probability is less than or equal to the preset first probability threshold, it is determined that the target object is not wearing it.
Optionally, the safety dressing types may include work clothes and a safety helmet, and the corresponding safety dressing detection results may include the probability of wearing work clothes and the probability of wearing a safety helmet. The object identification result may include a target position frame marking result and the probability of target category prediction. For example, as shown in fig. 2, the target position frame marks a box around the target object, and the probability of target category prediction, the probability of wearing work clothes, and the probability of wearing a safety helmet are displayed above the target object. If the probability of wearing work clothes is greater than a preset second probability threshold, it is determined that the target object is wearing work clothes; otherwise, it is determined that the target object is not wearing work clothes. Likewise, if the probability of wearing a safety helmet is greater than a preset third probability threshold, it is determined that the target object is wearing a safety helmet; otherwise, it is determined that it is not. The second and third probability thresholds may be equal (for example, both 0.5) or unequal (for example, 0.4 and 0.3 respectively).
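The per-type thresholding just described reduces to a strict comparison for each safety dressing type; the dictionary keys and threshold values below are illustrative assumptions.

```python
def dressing_decisions(probabilities, thresholds):
    """For each safety dressing type, wearing is decided only when the
    predicted probability is strictly greater than that type's threshold."""
    return {kind: probabilities[kind] > thresholds[kind]
            for kind in probabilities}
```

For example, with a work-clothes threshold of 0.4 and a helmet threshold of 0.3, a helmet probability of exactly 0.3 would be judged as not wearing, since the comparison is strict.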
In the abnormal behavior detection method described above, the terminal acquires target video stream data and inputs it into a pre-trained multi-task detection model to obtain a detection result of a target object contained in the target video stream data; the detection result comprises an object identification result and a safety dressing detection result corresponding to each safety dressing type; the terminal then determines a first abnormal behavior detection result according to the detection result of the target object. Manual inspection is no longer used, so the effectiveness of abnormal behavior detection is improved. In addition, the multi-task detection model checks, while detecting a pedestrian, whether the pedestrian is wearing each type of safety dressing, which further improves detection and effectively prevents safety accidents.
In one embodiment, as shown in FIG. 3, the multi-tasking detection model includes a feature extraction module, an object recognition module, and a detection module corresponding to each type of secure apparel.
In the embodiment of the present application, the feature extraction module (i.e., a backbone network in a neural network) may be used as a shared network portion of the object identification module and the detection module corresponding to each secure clothing type, and is configured to extract features of the input image and perform convolution calculation on the extracted features. The object identification module and the detection module corresponding to each safe dressing type are prediction branches in the neural network. Specifically, the features extracted by the feature extraction module are subjected to convolution calculation, then are predicted by the object identification module and the detection modules corresponding to the safety clothing types, and the object identification module and the detection modules corresponding to the safety clothing types respectively output respective prediction results.
The specific process of inputting target video stream data into a pre-trained multi-task detection model to obtain a detection result of a target object contained in the target video stream data comprises the following steps:
step 301, inputting the target video stream data to a feature extraction module to obtain the convolution feature of the target video stream data.
In this embodiment of the application, the terminal may input the target video stream data to the feature extraction module to obtain a convolution feature of the target video stream data. The feature extraction module extracts features of the target video stream data, and performs convolution calculation on the extracted features of the target video stream data to obtain convolution features of the target video stream data.
Step 302, inputting the convolution characteristics to an object identification module to obtain an object identification result.
In the embodiment of the application, the terminal can input the convolution characteristics to the object identification module to obtain an object identification result. The object recognition module predicts the object recognition according to the input convolution characteristics and outputs an object recognition result.
And step 303, inputting the convolution characteristics to the detection modules corresponding to the safety clothing types to obtain the safety clothing detection results corresponding to the safety clothing types.
In this embodiment of the application, the terminal may input the convolution characteristics to the detection modules corresponding to the respective safety clothing types, respectively, to obtain the safety clothing detection results corresponding to the respective safety clothing types. The detection module corresponding to each safe dressing type can predict each safe dressing type according to the input convolution characteristics and output a safe dressing detection result corresponding to each safe dressing type.
In one example, as shown in fig. 4, the object recognition module may include: a target position frame marking module and a target category prediction module; the detection module corresponding to each safety dressing type can comprise: wear work clothes detection module and wear safety helmet detection module. And the terminal inputs the target video stream data into the feature extraction module to obtain the convolution feature of the target video stream data. And the terminal inputs the convolution characteristics into the target position frame marking module, the target category prediction module, the wearing work clothes detection module and the wearing safety helmet detection module respectively to obtain a target position frame marking result, a target category prediction result, a wearing work clothes detection result and a wearing safety helmet detection result respectively.
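The shared-backbone layout of fig. 4 can be sketched as follows; the callables stand in for real network modules and are purely illustrative, not the patent's implementation.

```python
class MultiTaskDetector:
    """One shared feature extractor feeding an object identification head and
    one detection head per safety dressing type."""

    def __init__(self, backbone, object_head, dressing_heads):
        self.backbone = backbone              # shared convolution-feature extractor
        self.object_head = object_head        # position frame marking + category prediction
        self.dressing_heads = dressing_heads  # e.g. {"work_clothes": ..., "helmet": ...}

    def __call__(self, frames):
        features = self.backbone(frames)      # computed once, reused by every branch
        results = {"object": self.object_head(features)}
        for kind, head in self.dressing_heads.items():
            results[kind] = head(features)
        return results
```

The point of the structure is visible in `__call__`: the backbone runs once per input, and every prediction branch consumes the same features.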
In the method described above, the terminal acquires target video stream data and inputs it into the feature extraction module of the pre-trained multi-task detection model to obtain convolution features of the target video stream data; the terminal inputs the convolution features into the object identification module to obtain an object identification result; the terminal inputs the convolution features into the detection module corresponding to each safety dressing type to obtain the safety dressing detection result corresponding to that type; and the terminal determines a first abnormal behavior detection result according to the object identification result of the target object and the safety dressing detection results. Because the feature extraction module is shared, no separate feature-extraction network needs to be trained for the object identification module or for each per-type detection module; this saves model computation, improves the efficiency of multi-task detection, further improves abnormal behavior detection, and effectively prevents safety accidents.
In one embodiment, as shown in fig. 5, the specific process of the training process of the multitask detection model includes the following steps:
step 501, a training data set is obtained.
The training data set comprises a first sample image, a sample identification result of an object contained in the first sample image, a second sample image and a sample detection result corresponding to each safety dressing type of the object contained in the second sample image.
In the embodiment of the application, the terminal can obtain the training data set of the multi-task detection model. The training data set includes a first sample image, a sample recognition result of an object (which may be referred to as a first sample object) included in the first sample image, a second sample image, and a sample detection result corresponding to each safety dressing type of an object (which may be referred to as a second sample object) included in the second sample image. The first sample image and the second sample image may be the same sample image or different sample images.
In one example, the sample identification result of the first sample object may include: the target position box of the first sample object labeled with the target category of the first sample object. The sample detection result corresponding to each safety dressing type of the second sample object may include: whether the second sample object is wearing each type of safety gear, for example, whether the second sample object is wearing a work garment and whether it is wearing a safety helmet.
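The structure of one training sample described above can be sketched as a simple container; the field names below are hypothetical illustrations of the sample identification result (position box plus category) and the per-dressing-type sample detection results:

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TrainingSample:
    """Illustrative multi-task training sample (field names are hypothetical)."""
    image_path: str
    # Sample identification result: target position boxes (x, y, w, h) and categories.
    boxes: List[Tuple[float, float, float, float]]
    labels: List[str]
    # Sample detection result per safety dressing type, one flag per box.
    wearing_work_garment: List[bool]
    wearing_safety_helmet: List[bool]


sample = TrainingSample(
    image_path="site_001.jpg",
    boxes=[(0.4, 0.5, 0.1, 0.3)],
    labels=["worker"],
    wearing_work_garment=[True],
    wearing_safety_helmet=[False],
)
```

When the first and second sample images are the same image, as the text permits, both the identification and dressing annotations attach to one such record.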
Step 502, inputting the training data set to a target neural network to obtain an object identification prediction result and a prediction result corresponding to each safe dressing type.
In this embodiment of the application, the terminal may input the training data set to a target neural network (for convenience of distinction, may be referred to as a first target neural network), so as to obtain a prediction result of object recognition and a prediction result corresponding to each secure clothing type. The first target neural network performs object recognition on a first sample image of the training data set and outputs an object recognition prediction result of the first sample image; and the first target neural network detects each safe dressing type of the second sample image of the training data set and outputs a prediction result corresponding to each safe dressing type of the second sample image.
In one example, the first target neural network may be based on the YOLO (You Only Look Once) algorithm.
Step 503, determining a target loss function according to the object identification prediction result, the sample identification result of the object, the prediction result corresponding to each safe dressing type and the sample detection result corresponding to each safe dressing type, and iteratively updating the parameters of the target neural network according to the target loss function.
In this embodiment, the terminal may determine the first target loss function by using the object identification prediction result, the object sample identification result, the prediction result corresponding to each secure clothing type, and the sample detection result corresponding to each secure clothing type as parameters of the target loss function (which may be referred to as the first target loss function for convenience of distinction). Then, the terminal iteratively updates parameters of the first target neural network according to the first target loss function.
Wherein the object recognition prediction result may include: a target position frame labeling predicted value and a target category predicted value. In one example, the target position frame labeling predicted value may include: a predicted value of the position coordinates of the target object and a predicted value of the target position frame labeling size.
Step 504, when the target loss function meets the preset condition, stopping the iterative update of the parameters of the target neural network to obtain a multi-task detection model.
In the embodiment of the application, in the process of iteratively updating the parameters of the first target neural network, after each iterative update the terminal may determine whether the first target loss function meets a preset condition. When the first target loss function meets the preset condition, the terminal stops the iterative update of the parameters of the first target neural network, and takes the first target neural network at that point as the multi-task detection model. The preset condition may be that the difference between the prediction result and the sample result of each prediction task corresponding to the object identification and to each dressing type detection is smaller than a preset difference threshold, where each such difference is represented by the corresponding loss function value.
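The stopping rule described above — halting iteration once the loss value of every prediction task falls below the preset difference threshold — can be sketched as follows; the task names, loss values, and threshold are hypothetical illustrations:

```python
def should_stop(task_losses, threshold=0.05):
    """Return True when every per-task loss is below the preset difference threshold."""
    return all(loss < threshold for loss in task_losses.values())


# Hypothetical per-task loss values at some iteration.
losses = {"object_id": 0.03, "work_garment": 0.02, "safety_helmet": 0.04}
converged = should_stop(losses)  # True: every task is under the threshold
still_training = should_stop({**losses, "object_id": 0.20})  # False
```

Checking every task jointly, rather than only the summed loss, matches the text's requirement that each prediction task's difference be under the threshold.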
According to the abnormal behavior detection method and device, the computer equipment and the storage medium, the terminal obtains the training data set; inputting the training data set into a target neural network to obtain an object identification prediction result and a prediction result corresponding to each safe dressing type; determining a target loss function according to the object identification prediction result, the sample identification result of the object, the prediction result corresponding to each safe dressing type and the sample detection result corresponding to each safe dressing type, and iteratively updating the parameters of the target neural network according to the target loss function; and when the target loss function meets the preset condition, stopping the iterative update of the parameters of the target neural network to obtain the multi-task detection model. According to the abnormal behavior detection method, the abnormal behavior detection device, the computer equipment and the storage medium, the multi-task detection model can complete detection corresponding to each clothing type of the target object while identifying the target object, so that the detection efficiency of detection corresponding to each clothing type of the target object can be improved while reducing the model calculation amount, the abnormal behavior detection effect is further improved, and safety accidents are effectively prevented.
In one embodiment, as shown in fig. 6, a specific process of determining the target loss function according to the object identification prediction result, the sample identification result of the object, the prediction result corresponding to each safe clothing type, and the sample detection result corresponding to each safe clothing type includes the following steps:
Step 601, determining a target position frame labeling loss function and a target category prediction loss function according to the object identification prediction result and the sample identification result of the object.
In an embodiment of the present application, the object recognition prediction result may include: a target position frame labeling predicted value and a target category predicted value; the sample recognition result of the object may include: a target position frame labeling real value and a target category real value. The terminal can take the target position frame labeling predicted value and the target position frame labeling real value as parameters of the target position frame labeling loss function, and determine the target position frame labeling loss function according to a preset calculation rule of the target position frame labeling loss function. The terminal takes the target category predicted value and the target category real value as parameters of the target category prediction loss function, and determines the target category prediction loss function according to a preset calculation rule of the target category prediction loss function.
In one example, the target position frame labeling predicted value may include: a predicted value of the position coordinates of the target object and a predicted value of the target position frame labeling size; the target position frame labeling real value may include: a real value of the position coordinates of the target object and a real value of the target position frame labeling size. In one example, the predicted value of the target position frame labeling size may include: a predicted value of the width of the target position frame label and a predicted value of the height of the target position frame label; the real value of the target position frame labeling size may include: a real value of the width of the target position frame label and a real value of the height of the target position frame label.
Step 602, determining a loss function corresponding to each safe clothing type according to the prediction result corresponding to each safe clothing type and the sample detection result corresponding to each safe clothing type.
In this embodiment of the application, the terminal may use the prediction result corresponding to each secure clothing type and the sample detection result corresponding to each secure clothing type as parameters of the loss function corresponding to each secure clothing type, and determine the loss function corresponding to each secure clothing type according to a preset calculation rule of the loss function corresponding to each secure clothing type.
Step 603, adding the target position frame labeling loss function, the target category prediction loss function and the loss functions corresponding to the safety dressing types to obtain the target loss function.
In this embodiment, the terminal may add the target location frame labeling loss function, the target category prediction loss function, and the loss functions corresponding to the respective safety dressing types, and use the obtained sum as the first target loss function.
In one example, the multi-task detection model may include: a target position frame labeling detection module, a target category detection module, a work-clothes wearing detection module and a safety-helmet wearing detection module. Wherein, the work-clothes wearing detection module and the safety-helmet wearing detection module are similar to the target position frame labeling detection module and the target category detection module. The two dressing detection modules each predict every detection frame that includes the target object to obtain a two-dimensional array, and the two-dimensional array is then subjected to softmax mapping to obtain the probability of wearing the work clothes and the probability of wearing the safety helmet, respectively. That is, the safety dressing detection result corresponding to each safety dressing type includes: the probability of wearing the work clothes and the probability of wearing the safety helmet. The first target loss function of the multi-task detection model is the sum of the target position frame labeling loss function, the target category prediction loss function, the work-clothes wearing detection loss function and the safety-helmet wearing detection loss function. For example, the first target loss function of the multi-task detection model may be expressed as:
$$L_{total} = L_{box} + L_{cls} + L_{safehat} + L_{uniform}$$

wherein $L_{total}$ is the first target loss function of the multi-task detection model, $L_{box}$ is the target position frame labeling loss function, $L_{cls}$ is the target category prediction loss function, $L_{safehat}$ is the safety-helmet wearing detection loss function, and $L_{uniform}$ is the work-clothes wearing detection loss function.
Wherein $L_{box}$, $L_{cls}$, $L_{safehat}$ and $L_{uniform}$ may be represented as follows:

$$L_{box} = \lambda_{box} \sum_{i} \sum_{j} \mathbb{1}_{ij}^{obj} \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 + (w_i - \hat{w}_i)^2 + (h_i - \hat{h}_i)^2 \right]$$

$$L_{cls} = -\lambda_{cls} \sum_{i} \sum_{j} \mathbb{1}_{ij}^{obj} \sum_{c} p_i(c) \log \hat{p}_i(c)$$

$$L_{safehat} = -\lambda_{safehat} \sum_{i} \sum_{j} \mathbb{1}_{ij}^{obj} \sum_{c} p_i(c) \log \hat{p}_i(c)$$

$$L_{uniform} = -\lambda_{uniform} \sum_{i} \sum_{j} \mathbb{1}_{ij}^{obj} \sum_{c} p_i(c) \log \hat{p}_i(c)$$

wherein $\lambda_{box}$, $\lambda_{cls}$, $\lambda_{safehat}$ and $\lambda_{uniform}$ are respectively the weight of each loss function; $\mathbb{1}_{ij}^{obj}$ represents a target $j$ at position $i$; $x$ and $y$ are the position coordinates of the marked target object, $w$ is the width of the position frame where the marked target object is located, $h$ is the height of the position frame where the marked target object is located, and $p$ is a real value; $\hat{x}$ and $\hat{y}$ are the predicted position coordinates of the target object, $\hat{w}$ is the predicted width of the position frame where the target object is located, $\hat{h}$ is the predicted height of the position frame where the target object is located, and $\hat{p}$ is the predicted probability value.
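Under the assumption that the loss terms take standard YOLO-style forms (squared error for the position frame term, cross-entropy for the category and dressing terms), the first target loss function could be computed roughly as below; all names and weights are illustrative, not the patent's implementation:

```python
import math


def squared_box_loss(boxes_true, boxes_pred, lam=1.0):
    """L_box sketch: squared error over (x, y, w, h) for frames containing a target."""
    return lam * sum((t - p) ** 2
                     for bt, bp in zip(boxes_true, boxes_pred)
                     for t, p in zip(bt, bp))


def cross_entropy(p_true, p_pred, lam=1.0):
    """Cross-entropy term, as assumed for L_cls, L_safehat and L_uniform."""
    return -lam * sum(t * math.log(q) for t, q in zip(p_true, p_pred) if t > 0)


def total_loss(boxes_t, boxes_p, cls_t, cls_p, hat_t, hat_p, uni_t, uni_p):
    """L_total = L_box + L_cls + L_safehat + L_uniform (all weights set to 1 here)."""
    return (squared_box_loss(boxes_t, boxes_p)
            + cross_entropy(cls_t, cls_p)
            + cross_entropy(hat_t, hat_p)
            + cross_entropy(uni_t, uni_p))


# With a perfect prediction every term vanishes:
loss = total_loss([(0.5, 0.5, 0.2, 0.3)], [(0.5, 0.5, 0.2, 0.3)],
                  [1, 0], [1.0, 0.0], [1, 0], [1.0, 0.0], [1, 0], [1.0, 0.0])
```

Summing the four terms into one scalar is what lets a single backward pass train the shared backbone and all four heads jointly.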
According to the abnormal behavior detection method, the abnormal behavior detection device, the computer equipment and the storage medium, the terminal determines the target position frame marking loss function and the target type prediction loss function according to the target identification prediction result and the sample identification result of the target; determining a loss function corresponding to each safe dressing type according to the prediction result corresponding to each safe dressing type and the sample detection result corresponding to each safe dressing type; adding the target position frame marking loss function, the target category prediction loss function and the loss functions corresponding to the safety dressing types to obtain a first target loss function; and training the first target neural network according to the first target loss function to obtain the multi-task detection model. According to the abnormal behavior detection method, the abnormal behavior detection device, the computer equipment and the storage medium, the multi-task detection model can complete detection corresponding to each clothing type of the target object while identifying the target object, so that the detection efficiency of detection corresponding to each clothing type of the target object can be improved while reducing the model calculation amount, the abnormal behavior detection effect is further improved, and safety accidents are effectively prevented.
In one embodiment, as shown in fig. 7, the method further comprises the steps of:
step 701, inputting target video stream data into a pre-trained behavior prediction model to obtain a behavior prediction result of a target object.
In the embodiment of the present application, a pre-trained behavior prediction model may be stored in the terminal. The terminal can input the target video stream data to the pre-trained behavior prediction model to obtain a behavior prediction result of the target object.
Wherein the behavior prediction model is capable of predicting the next behavior of the target object. The behavior prediction model comprises: the behavior feature extraction module and the behavior prediction module. The behavior prediction model can extract continuous behavior characteristics of the target object in the target video stream data through the behavior characteristic extraction module, and then conduct behavior prediction on the extracted continuous behavior characteristics of the target object through the behavior prediction module, so that a behavior prediction result is output.
In one example, the output of the behavior feature extraction module is a (B+1)-dimensional vector, where B is the number of behaviors that need to be predicted. The behavior prediction module passes the (B+1)-dimensional vector through a Softmax function to obtain the probability of each type of abnormal behavior. That is, the behavior prediction result may include probabilities of various types of abnormal behaviors.
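The Softmax mapping from the (B+1)-dimensional output vector to per-behavior probabilities can be sketched in a few lines (a standard numerically stable variant, not the patent's implementation; the logits are hypothetical):

```python
import math


def softmax(logits):
    """Map a (B+1)-dimensional output vector to per-behavior probabilities."""
    m = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


# Hypothetical raw scores for B = 2 behaviors plus the "no abnormal behavior" slot.
probs = softmax([2.0, 1.0, 0.1])
```

The outputs are non-negative, sum to 1, and preserve the ordering of the raw scores, which is why the largest probability can be read directly as the most likely behavior.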
Step 702, determining a second abnormal behavior detection result according to the behavior prediction result of the target object.
In the embodiment of the application, the terminal can judge whether the target object possibly has a preset abnormal behavior at the next moment according to the behavior prediction result of the target object. If the target object may have a preset abnormal behavior at the next moment, the terminal may determine that the second abnormal behavior detection result is an abnormal behavior; if the target object may not have the preset abnormal behavior at the next moment, the terminal may determine that the second abnormal behavior detection result is the non-abnormal behavior.
Wherein, the preset abnormal behavior may include: crossing, climbing, entering a prohibited area, performing an abnormal action (such as a pivot turn, running, etc.), and the like, which are not limited in the embodiments of the present application.
According to the abnormal behavior detection method, the abnormal behavior detection device, the computer equipment and the storage medium, the terminal inputs target video stream data into a pre-trained behavior prediction model to obtain a behavior prediction result of a target object, and then determines a second abnormal behavior detection result according to the behavior prediction result of the target object. The abnormal behavior detection method, the abnormal behavior detection device, the computer equipment and the storage medium can predict whether the target object possibly has the abnormal behavior at the next moment, so that the function of warning in advance is achieved, the effect of detecting the abnormal behavior is further improved, and safety accidents are effectively prevented.
In one embodiment, as shown in fig. 8, the specific process of inputting the target video stream data into the pre-trained behavior prediction model to obtain the behavior prediction result of the target object includes the following steps:
step 801, acquiring a preset number of image frames containing a target object from target video stream data.
In the embodiment of the application, the terminal can acquire a preset number of image frames containing the target object in the target video stream data acquired in the operation field. Wherein the preset number of image frames containing the target object are consecutive image frames.
Step 802, inputting a preset number of image frames into a pre-trained behavior prediction model to obtain a behavior prediction result of a target object.
In this embodiment of the application, the terminal may input a preset number of image frames into a pre-trained behavior prediction model to predict a behavior of the target object at a next time, so as to obtain a behavior prediction result of the target object.
In one example, as shown in FIG. 9, the terminal may obtain 4 consecutive frames of images including the target object in the target video stream data, corresponding to the images at time T-3, time T-2, time T-1, and time T, respectively, where T is the current time. And the terminal inputs the acquired continuous 4 frames of images into a pre-trained behavior prediction model, so that the behavior of the target object at the T +1 moment can be predicted.
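The collection of consecutive frames at times T-3 through T can be sketched with a fixed-size sliding window; the class below is an illustrative assumption, not part of the patent:

```python
from collections import deque


class FrameWindow:
    """Keep the most recent N consecutive frames containing the target object."""

    def __init__(self, size=4):
        self.frames = deque(maxlen=size)  # old frames are evicted automatically

    def push(self, frame):
        self.frames.append(frame)

    def ready(self):
        """True once a full window of N consecutive frames has been collected."""
        return len(self.frames) == self.frames.maxlen

    def snapshot(self):
        """Frames at times T-N+1 ... T, oldest first, for the prediction model."""
        return list(self.frames)


window = FrameWindow(size=4)
for t in range(6):                        # stream of frames containing the target
    window.push(f"frame_{t}")
```

After six pushes the window holds exactly the last four frames, matching the example of feeding the images at times T-3, T-2, T-1 and T to the model.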
In one embodiment, the behavior prediction model is obtained by training the second target neural network based on a preset number of behavior prediction training samples. The second target neural network may be a terminal-friendly base network pre-trained on the ImageNet data set, for example, a MobileNet network or a SqueezeNet network. The behavior prediction training samples may include: a succession of image frames containing a target object, and a sample predicted behavior of the target object in the succession of image frames.
The training process of the behavior prediction model may include: and respectively inputting the behavior prediction training samples into a second target neural network, and training the second target neural network based on a gradient descent method to obtain a trained behavior prediction model.
In one example, the training process of the behavior prediction model may include: and the terminal respectively inputs the behavior prediction training samples into the second target neural network to obtain a behavior prediction result of the target object in the continuous image frame at the next moment. Then, the terminal determines a second target loss function according to a behavior prediction result of the target object in the continuous image frame at the next moment and a sample prediction result of the target object in the continuous image frame. And then, the terminal iteratively updates the parameters of the second target neural network based on a gradient descent method according to the second target loss function. And when the second target loss function meets a second preset condition, stopping the iterative update of the parameters of the second target neural network to obtain a behavior prediction model.
In one example, the second target loss function is a Cross-Entropy loss function, which may be expressed as:

$$L_{CE} = -\frac{1}{N} \sum_{n=1}^{N} \sum_{c=1}^{B+1} P_c \log \hat{p}_c$$

wherein $B$ is the number of predicted behaviors, $P_c$ is the true value of behavior $c$ occurring, $\hat{p}_c$ is the predicted probability corresponding to behavior $c$, and $N$ is the number of image frames.
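Assuming the standard cross-entropy form described above, the second target loss could be computed as follows (names and values are illustrative):

```python
import math


def behavior_cross_entropy(p_true, p_pred, n_frames=1):
    """Cross-entropy between the true behavior distribution and the predicted
    (B+1)-dimensional probabilities, averaged over the N image frames."""
    return -sum(t * math.log(q)
                for t, q in zip(p_true, p_pred) if t > 0) / n_frames


# One-hot truth (first behavior occurs) against a hypothetical prediction,
# averaged over a 4-frame window as in the example above.
loss = behavior_cross_entropy([1, 0, 0], [0.7, 0.2, 0.1], n_frames=4)
```

The `if t > 0` guard skips zero-truth terms, which both matches the one-hot case and avoids evaluating `log` on predicted probabilities that never contribute to the sum.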
According to the abnormal behavior detection method, the abnormal behavior detection device, the computer equipment and the storage medium, the terminal obtains the preset number of image frames containing the target object from the target video stream data, and then inputs the preset number of image frames into the pre-trained behavior prediction model to obtain the behavior prediction result of the target object. The abnormal behavior detection method, the abnormal behavior detection device, the computer equipment and the storage medium can predict whether the target object has the abnormal behavior next time, so that the function of warning in advance is achieved, the effect of detecting the abnormal behavior is further improved, and safety accidents are effectively prevented.
In one embodiment, as shown in fig. 10, the method further comprises the steps of:
step 1001, identifying a signboard according to target video stream data to obtain a signboard identification result.
Wherein, the signboard recognition result includes: a prohibited behavior and a region corresponding to the prohibited behavior.
In the embodiment of the application, a pre-trained signboard recognition model can be stored in the terminal. The terminal can input the target video stream data into the pre-trained signboard recognition model to perform signboard recognition and determine the text information and the graphic information of the signboard. Then, the terminal can determine the type of the signboard according to the text information and the graphic information of the signboard. Finally, the terminal can determine the prohibited behavior and the region corresponding to the prohibited behavior according to the type of the signboard and a preset prohibited behavior division rule. Wherein the prohibited behavior may include: crossing, climbing, entering a prohibited area, performing an abnormal action (e.g., a pivot turn, running, etc.), and the like, which are not limited in this embodiment.
The preset prohibited behavior division rule includes: a correspondence from each signboard type to each type of prohibited behavior and to the region corresponding to that prohibited behavior. For example, for the signboards and their corresponding prohibited behavior regions as shown in fig. 11, climbing behavior is prohibited in the region around the "no climbing, high voltage danger" signboard, and crossing behavior is prohibited in the region around the "no stepping, high voltage danger" signboard.
In one example, the terminal uses a trained optical character recognition algorithm as the signboard recognition model. For example, the optical character recognition algorithm may be the lightweight Chinese optical character recognition algorithm ChineseOCR-lite. The lightweight Chinese optical character recognition algorithm can recognize the characters on the signboard, determine the image information on the signboard by detecting dark and bright patterns, and then translate the characters and the image information into computer text; that is, the characters or pictures on the signboard are optically converted into an image file of black and white dot matrixes, and the characters or pictures in the image file are then converted into text for output. As shown in fig. 12, the signboards of the construction site provided in this embodiment include text information such as "people working", "stop, high voltage danger", "closing the gate while people are working is prohibited", and "climbing prohibited, high voltage danger", together with the graphic information corresponding to each piece of text information.
Step 1002, determining the predicted behavior of the target object according to the behavior prediction result of the target object.
In the embodiment of the application, the terminal can determine the predicted behavior of the target object according to the behavior prediction result of the target object.
In one example, the behavior prediction result may include a probability for each preset behavior type. For a preset behavior type, if the probability of the preset behavior type is greater than a preset behavior threshold, the terminal may determine that the predicted behavior of the target object is the preset behavior type.
In step 1003, if the predicted behavior is a prohibited behavior and the target object is in the area corresponding to the prohibited behavior, it is determined that the second abnormal behavior detection result is an abnormal behavior.
In the embodiment of the application, the terminal can judge whether the predicted behavior of the target object is the prohibited behavior according to the predicted behavior of the target object and the prohibited behavior of the signboard. And the terminal can judge whether the target object is in the area corresponding to the prohibited behavior according to the target video stream data and the area corresponding to the prohibited behavior of the signboard. And if the predicted behavior is the prohibited behavior and the target object is in the area corresponding to the prohibited behavior, the terminal determines that the second abnormal behavior detection result is the abnormal behavior.
Step 1004, if the predicted behavior is a non-prohibited behavior, or the target object is not in the area corresponding to the prohibited behavior, determining that the second abnormal behavior detection result is a non-abnormal behavior.
In the embodiment of the application, if the predicted behavior is a non-prohibited behavior, or the target object is not located in the area corresponding to the prohibited behavior, the terminal determines that the second abnormal behavior detection result is a non-abnormal behavior.
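The decision rule of steps 1003 and 1004 — abnormal only when the predicted behavior is prohibited and the target object is inside the corresponding region — can be sketched as a small function; the behavior names are hypothetical examples:

```python
def second_abnormal_detection(predicted_behavior, prohibited_behaviors, in_region):
    """Apply steps 1003-1004: abnormal only if the predicted behavior is
    prohibited AND the target object is in the region attached to that
    prohibition; otherwise non-abnormal."""
    if predicted_behavior in prohibited_behaviors and in_region:
        return "abnormal"
    return "non-abnormal"


# Climbing is prohibited near the signboard, and the target is inside the region.
result = second_abnormal_detection("climbing", {"climbing", "crossing"}, True)
```

Note that both conditions must hold: a prohibited behavior predicted outside its region, or a permitted behavior inside it, both yield a non-abnormal result.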
According to the abnormal behavior detection method, the abnormal behavior detection device, the computer equipment and the storage medium, the terminal determines the predicted behavior of the target object according to the behavior prediction result of the target object, and judges whether the second abnormal behavior detection result is the abnormal behavior according to the predicted behavior of the target object, the forbidden behavior of the signboard and the area corresponding to the forbidden behavior of the signboard. According to the abnormal behavior detection method, the abnormal behavior detection device, the computer equipment and the storage medium, the signboard information of the current region of the target object is combined with the behavior predicted to occur of the target object, whether the preset prohibited behavior of the current region of the target object can occur at the next moment is judged, and therefore whether the potential risk of the target object exists in the current region is predicted, potential safety hazards can be timely and effectively eliminated, the effect of abnormal behavior detection is further improved, and safety accidents are effectively prevented.
It should be understood that although the various steps in the flowcharts of fig. 1-10 are shown in order as indicated by the arrows, the steps are not necessarily performed in that order. Unless explicitly stated otherwise herein, the steps are not strictly limited to the order shown, and may be performed in other orders. Moreover, at least some of the steps in fig. 1-10 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed in sequence but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 13, there is provided an abnormal behavior detection apparatus 1300 including: a first obtaining module 1310, a multitask detecting module 1320, and a first determining module 1330, wherein:
a first obtaining module 1310, configured to obtain target video stream data;
a multitask detection module 1320, configured to input the target video stream data into a multitask detection model trained in advance, and obtain a detection result of a target object included in the target video stream data; the detection result comprises an object identification result and a safety dressing detection result corresponding to each safety dressing type;
the first determining module 1330 is configured to determine a first abnormal behavior detection result according to the detection result of the target object.
Optionally, the multitask detection model includes a feature extraction module, an object identification module, and detection modules corresponding to the respective safety dressing types;
the multitask detection module 1320 is specifically configured to:
inputting the target video stream data to the feature extraction module to obtain convolution features of the target video stream data;
inputting the convolution characteristics to the object identification module to obtain an object identification result;
and inputting the convolution characteristics to the detection modules corresponding to the safety dressing types to obtain the safety dressing detection results corresponding to the safety dressing types.
Optionally, the apparatus 1300 further includes:
the second acquisition module is used for acquiring a training data set; the training data set comprises a first sample image, a sample identification result of an object contained in the first sample image, and a sample detection result corresponding to each safety dressing type of an object contained in a second sample image and the second sample image;
the prediction module is used for inputting the training data set into a target neural network to obtain an object recognition prediction result and a prediction result corresponding to each safe dressing type;
the updating module is used for determining a target loss function according to the object identification prediction result, the sample identification result of the object, the prediction result corresponding to each safe dressing type and the sample detection result corresponding to each safe dressing type, and iteratively updating the parameters of the target neural network according to the target loss function; and when the target loss function meets a preset condition, stopping the iterative update of the parameters of the target neural network to obtain the multi-task detection model.
Optionally, the update module is specifically configured to:
determining a target position frame labeling loss function and a target category prediction loss function according to the object identification prediction result and the sample identification result of the object;
determining a loss function corresponding to each safe dressing type according to the prediction result corresponding to each safe dressing type and the sample detection result corresponding to each safe dressing type;
and adding the target position frame marking loss function, the target category prediction loss function and the loss function corresponding to each safety dressing type to obtain the target loss function.
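The target loss described above is a plain sum: box-regression loss, plus class-prediction loss, plus one loss per safety dressing type. The patent does not fix the individual loss forms, so the L1 and cross-entropy choices below are assumptions.

```python
import numpy as np

def box_loss(pred_box, gt_box):
    # L1 loss between predicted and ground-truth box coordinates
    return float(np.abs(np.asarray(pred_box) - np.asarray(gt_box)).sum())

def class_loss(pred_probs, gt_index):
    # cross-entropy for the category prediction head
    return float(-np.log(pred_probs[gt_index]))

def dressing_loss(pred_prob, gt_label):
    # binary cross-entropy for one wearing / not-wearing head
    return float(-(gt_label * np.log(pred_prob)
                   + (1 - gt_label) * np.log(1 - pred_prob)))

def target_loss(pred_box, gt_box, pred_probs, gt_index,
                dressing_preds, dressing_gts):
    per_type = [dressing_loss(p, g)
                for p, g in zip(dressing_preds, dressing_gts)]
    return (box_loss(pred_box, gt_box)
            + class_loss(pred_probs, gt_index)
            + sum(per_type))

loss = target_loss([0.1, 0.1, 0.9, 0.9], [0.0, 0.0, 1.0, 1.0],
                   np.array([0.7, 0.3]), 0,
                   [0.8, 0.6], [1, 1])
```

Because each head contributes its own additive term, all heads are trained jointly against the shared backbone.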
Optionally, the apparatus 1300 further includes:
the behavior prediction module is used for inputting the target video stream data into a pre-trained behavior prediction model to obtain a behavior prediction result of the target object;
and the second determining module is used for determining a second abnormal behavior detection result according to the behavior prediction result of the target object.
Optionally, the behavior prediction module is specifically configured to:
acquiring a preset number of image frames containing a target object from the target video stream data;
and inputting the preset number of image frames into a pre-trained behavior prediction model to obtain a behavior prediction result of the target object.
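Selecting a preset number of image frames containing the target object can be sketched as a simple filtered scan over the stream. The toy "detector" and the use of frame indices in place of real frames are assumptions for illustration.

```python
def sample_frames(stream, contains_target, count):
    """Return the first `count` frames for which the detector fires."""
    picked = []
    for frame in stream:
        if contains_target(frame):
            picked.append(frame)
            if len(picked) == count:
                break
    return picked

stream = range(20)                       # frame indices standing in for frames
contains_target = lambda i: i % 3 == 0   # toy detector: every third frame
frames = sample_frames(stream, contains_target, 4)
```

The resulting fixed-length clip is what would then be fed to the behavior prediction model.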
Optionally, the apparatus 1300 further includes:
the signboard identification module is used for identifying a signboard according to the target video stream data to obtain a signboard identification result; the signboard recognition result includes: a prohibited behavior and a region corresponding to the prohibited behavior;
the second determining module is specifically configured to:
determining the predicted behavior of the target object according to the behavior prediction result of the target object;
if the predicted behavior is the prohibited behavior and the target object is in the area corresponding to the prohibited behavior, determining that a second abnormal behavior detection result is an abnormal behavior;
and if the predicted behavior is not the prohibited behavior or the target object is not in the area corresponding to the prohibited behavior, determining that a second abnormal behavior detection result is a non-abnormal behavior.
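The decision rule above flags abnormal behavior only when the predicted behavior matches the prohibited behavior AND the target object lies inside that behavior's region. Representing the region as an axis-aligned box is an assumption; the patent leaves the region's geometry open.

```python
def in_region(point, region):
    (x, y), (x0, y0, x1, y1) = point, region
    return x0 <= x <= x1 and y0 <= y <= y1

def second_detection(predicted, position, sign):
    """sign: the prohibited behavior and its region, as produced by the
    signboard recognition step."""
    if predicted == sign["prohibited"] and in_region(position, sign["region"]):
        return "abnormal"
    return "non-abnormal"

sign = {"prohibited": "smoking", "region": (0, 0, 10, 10)}
r1 = second_detection("smoking", (5, 5), sign)   # prohibited, inside region
r2 = second_detection("smoking", (20, 5), sign)  # prohibited, outside region
r3 = second_detection("walking", (5, 5), sign)   # not a prohibited behavior
```

Note the conjunction: either a non-prohibited behavior or a position outside the region yields a non-abnormal result.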
For specific limitations of the abnormal behavior detection device, reference may be made to the limitations of the abnormal behavior detection method above, which are not repeated here. Each module in the abnormal behavior detection device may be implemented wholly or partially by software, hardware, or a combination thereof. The modules may be embedded in hardware form in, or be independent of, a processor in the computer device, or may be stored in software form in a memory of the computer device, so that the processor can invoke and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a terminal, and whose internal structure may be as shown in fig. 14. The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The communication interface of the computer device is used for wired or wireless communication with an external terminal; the wireless communication can be realized through WIFI, an operator network, NFC (near field communication), or other technologies. The computer program, when executed by the processor, implements an abnormal behavior detection method. The display screen of the computer device may be a liquid crystal display or an electronic ink display, and the input device of the computer device may be a touch layer covering the display screen, a key, trackball or touch pad arranged on the housing of the computer device, or an external keyboard, touch pad or mouse.
Those skilled in the art will appreciate that the architecture shown in fig. 14 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computer devices to which the disclosed aspects apply; a particular computer device may include more or fewer components than those shown, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, which includes a memory and a processor, the memory stores a computer program, and the processor implements the steps of the above abnormal behavior detection method when executing the computer program.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored; the computer program, when executed by a processor, implements the steps of the above abnormal behavior detection method.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, or optical storage. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM).
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments express only several implementations of the present application, and their description is specific and detailed, but should not therefore be construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (9)

1. A method of abnormal behavior detection, the method comprising:
acquiring target video stream data;
inputting the target video stream data into a multi-task detection model trained in advance to obtain a detection result of a target object contained in the target video stream data; the detection result comprises an object identification result and a safety dressing detection result corresponding to each safety dressing type;
determining a first abnormal behavior detection result according to the detection result of the target object;
the multi-task detection model comprises a feature extraction module, an object identification module and detection modules corresponding to all the safety dressing types;
the inputting the target video stream data into a multi-task detection model trained in advance to obtain a detection result of a target object contained in the target video stream data includes: inputting the target video stream data into the feature extraction module to obtain convolution features of the target video stream data; inputting the convolution features into the object identification module to obtain an object identification result; and inputting the convolution features into the detection module corresponding to each safety dressing type to obtain the safety dressing detection result corresponding to each safety dressing type.
2. The method of claim 1, wherein the training process of the multi-tasking detection model comprises:
acquiring a training data set; the training data set comprises a first sample image, a sample identification result of an object contained in the first sample image, a second sample image, and a sample detection result corresponding to each safety dressing type for an object contained in the second sample image;
inputting the training data set into a target neural network to obtain an object identification prediction result and a prediction result corresponding to each safe dressing type;
determining a target loss function according to the object identification prediction result, the sample identification result of the object, the prediction result corresponding to each safe dressing type and the sample detection result corresponding to each safe dressing type, and iteratively updating the parameters of the target neural network according to the target loss function;
and when the target loss function meets a preset condition, stopping the iterative update of the parameters of the target neural network to obtain the multi-task detection model.
3. The method of claim 2, wherein determining the target loss function according to the object identification prediction result, the sample identification result of the object, the prediction result corresponding to each safety dressing type, and the sample detection result corresponding to each safety dressing type comprises:
determining a target position frame marking loss function and a target category prediction loss function according to the object identification prediction result and the sample identification result of the object;
determining a loss function corresponding to each safe dressing type according to the prediction result corresponding to each safe dressing type and the sample detection result corresponding to each safe dressing type;
and adding the target position frame marking loss function, the target category prediction loss function and the loss function corresponding to each safety dressing type to obtain the target loss function.
4. The method of claim 1, further comprising:
inputting the target video stream data into a pre-trained behavior prediction model to obtain a behavior prediction result of the target object;
and determining a second abnormal behavior detection result according to the behavior prediction result of the target object.
5. The method of claim 4, wherein inputting the target video stream data into a pre-trained behavior prediction model to obtain a behavior prediction result of the target object comprises:
acquiring a preset number of image frames containing a target object from the target video stream data;
and inputting the preset number of image frames into a pre-trained behavior prediction model to obtain a behavior prediction result of the target object.
6. The method of claim 4, further comprising:
according to the target video stream data, identifying a signboard to obtain a signboard identification result; the signboard recognition result includes: a prohibited behavior and a region corresponding to the prohibited behavior;
determining a second abnormal behavior detection result according to the behavior prediction result of the target object, wherein the determining comprises:
determining the predicted behavior of the target object according to the behavior prediction result of the target object;
if the predicted behavior is the prohibited behavior and the target object is in the area corresponding to the prohibited behavior, determining that a second abnormal behavior detection result is an abnormal behavior;
and if the predicted behavior is not the prohibited behavior or the target object is not in the area corresponding to the prohibited behavior, determining that a second abnormal behavior detection result is a non-abnormal behavior.
7. An abnormal behavior detection apparatus, characterized in that the apparatus comprises:
the first acquisition module is used for acquiring target video stream data;
the multitask detection module is used for inputting the target video stream data into a multitask detection model trained in advance to obtain a detection result of a target object contained in the target video stream data; the detection result comprises an object identification result and a safety dressing detection result corresponding to each safety dressing type;
the first determining module is used for determining a first abnormal behavior detection result according to the detection result of the target object;
wherein the multitask detection module is specifically configured to: inputting the target video stream data into the feature extraction module to obtain convolution features of the target video stream data; inputting the convolution features into the object identification module to obtain an object identification result; and inputting the convolution features into the detection module corresponding to each safety dressing type to obtain the safety dressing detection result corresponding to each safety dressing type.
8. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 6.
9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 6.
CN202111189600.3A 2021-10-12 2021-10-12 Abnormal behavior detection method and device, computer equipment and storage medium Active CN113627406B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111189600.3A CN113627406B (en) 2021-10-12 2021-10-12 Abnormal behavior detection method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111189600.3A CN113627406B (en) 2021-10-12 2021-10-12 Abnormal behavior detection method and device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113627406A true CN113627406A (en) 2021-11-09
CN113627406B CN113627406B (en) 2022-03-08

Family

ID=78391238

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111189600.3A Active CN113627406B (en) 2021-10-12 2021-10-12 Abnormal behavior detection method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113627406B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115394026A (en) * 2022-07-15 2022-11-25 安徽电信规划设计有限责任公司 Intelligent monitoring method and system based on 5G technology
CN115410116A (en) * 2022-08-09 2022-11-29 佳源科技股份有限公司 Multitask video anomaly detection method, device, equipment and medium
CN116167010A (en) * 2023-04-25 2023-05-26 南方电网数字电网研究院有限公司 Rapid identification method for abnormal events of power system with intelligent transfer learning capability
CN116310943A (en) * 2023-01-04 2023-06-23 三峡高科信息技术有限责任公司 Method for sensing safety condition of workers
CN116503815A (en) * 2023-06-21 2023-07-28 宝德计算机系统股份有限公司 Big data-based computer vision processing system

Citations (8)

Publication number Priority date Publication date Assignee Title
CN108319934A (en) * 2018-03-20 2018-07-24 武汉倍特威视系统有限公司 Safety cap wear condition detection method based on video stream data
CN108416289A (en) * 2018-03-06 2018-08-17 陕西中联电科电子有限公司 A kind of working at height personnel safety band wears detection device and detection method for early warning
CN109711275A (en) * 2018-12-05 2019-05-03 湖北凯瑞知行智能装备有限公司 A kind of power plant's operation process security risk acquisition identification system and method
CN110135476A (en) * 2019-04-28 2019-08-16 深圳市中电数通智慧安全科技股份有限公司 A kind of detection method of personal safety equipment, device, equipment and system
CN111241959A (en) * 2020-01-06 2020-06-05 重庆大学 Method for detecting person without wearing safety helmet through construction site video stream
AU2020100705A4 (en) * 2020-05-05 2020-06-18 Chang, Jiaying Miss A helmet detection method with lightweight backbone based on yolov3 network
CN113076683A (en) * 2020-12-08 2021-07-06 国网辽宁省电力有限公司锦州供电公司 Modeling method of convolutional neural network model for substation behavior monitoring
US20210224526A1 (en) * 2020-09-25 2021-07-22 Beijing Baidu Netcom Science Technology Co., Ltd. Method and apparatus for detecting wearing of safety helmet, device and storage medium

Patent Citations (8)

Publication number Priority date Publication date Assignee Title
CN108416289A (en) * 2018-03-06 2018-08-17 陕西中联电科电子有限公司 A kind of working at height personnel safety band wears detection device and detection method for early warning
CN108319934A (en) * 2018-03-20 2018-07-24 武汉倍特威视系统有限公司 Safety cap wear condition detection method based on video stream data
CN109711275A (en) * 2018-12-05 2019-05-03 湖北凯瑞知行智能装备有限公司 A kind of power plant's operation process security risk acquisition identification system and method
CN110135476A (en) * 2019-04-28 2019-08-16 深圳市中电数通智慧安全科技股份有限公司 A kind of detection method of personal safety equipment, device, equipment and system
CN111241959A (en) * 2020-01-06 2020-06-05 重庆大学 Method for detecting person without wearing safety helmet through construction site video stream
AU2020100705A4 (en) * 2020-05-05 2020-06-18 Chang, Jiaying Miss A helmet detection method with lightweight backbone based on yolov3 network
US20210224526A1 (en) * 2020-09-25 2021-07-22 Beijing Baidu Netcom Science Technology Co., Ltd. Method and apparatus for detecting wearing of safety helmet, device and storage medium
CN113076683A (en) * 2020-12-08 2021-07-06 国网辽宁省电力有限公司锦州供电公司 Modeling method of convolutional neural network model for substation behavior monitoring

Non-Patent Citations (2)

Title
张明媛 et al.: "Research on identification of construction workers' safety helmet wearing based on deep learning", Journal of Safety and Environment *
金肖莹: "Safety helmet detection in factory surveillance video", China Master's Theses Full-text Database, Information Science and Technology *

Cited By (8)

Publication number Priority date Publication date Assignee Title
CN115394026A (en) * 2022-07-15 2022-11-25 安徽电信规划设计有限责任公司 Intelligent monitoring method and system based on 5G technology
CN115410116A (en) * 2022-08-09 2022-11-29 佳源科技股份有限公司 Multitask video anomaly detection method, device, equipment and medium
CN116310943A (en) * 2023-01-04 2023-06-23 三峡高科信息技术有限责任公司 Method for sensing safety condition of workers
CN116310943B (en) * 2023-01-04 2023-09-19 三峡高科信息技术有限责任公司 Method for sensing safety condition of workers
CN116167010A (en) * 2023-04-25 2023-05-26 南方电网数字电网研究院有限公司 Rapid identification method for abnormal events of power system with intelligent transfer learning capability
CN116167010B (en) * 2023-04-25 2023-12-08 南方电网数字电网研究院有限公司 Rapid identification method for abnormal events of power system with intelligent transfer learning capability
CN116503815A (en) * 2023-06-21 2023-07-28 宝德计算机系统股份有限公司 Big data-based computer vision processing system
CN116503815B (en) * 2023-06-21 2024-01-30 宝德计算机系统股份有限公司 Big data-based computer vision processing system

Also Published As

Publication number Publication date
CN113627406B (en) 2022-03-08

Similar Documents

Publication Publication Date Title
CN113627406B (en) Abnormal behavior detection method and device, computer equipment and storage medium
CN111144263B (en) Construction worker high-falling accident early warning method and device
Fang et al. Detecting non-hardhat-use by a deep learning method from far-field surveillance videos
Mneymneh et al. Vision-based framework for intelligent monitoring of hardhat wearing on construction sites
US11275932B2 (en) Human body attribute recognition method, apparatus, and device and medium
CN108229353B (en) Human body image classification method and apparatus, electronic device, storage medium, and program
WO2015025704A1 (en) Video processing device, video processing method, and video processing program
CN110414400B (en) Automatic detection method and system for wearing of safety helmet on construction site
Xiong et al. Pose guided anchoring for detecting proper use of personal protective equipment
Iannizzotto et al. Personal Protection Equipment detection system for embedded devices based on DNN and Fuzzy Logic
Bang et al. Proactive proximity monitoring with instance segmentation and unmanned aerial vehicle‐acquired video‐frame prediction
CN113537180B (en) Tree obstacle identification method and device, computer equipment and storage medium
WO2023104557A1 (en) Machine-learning for safety rule violation determination
CN113569682A (en) Video monitoring method and device for intelligently capturing mine identification elements
Siddula et al. Classifying construction site photos for roof detection: A machine-learning method towards automated measurement of safety performance on roof sites
CN117115743A (en) Mining safety production monitoring system and method thereof
Ke et al. 100+ FPS detector of personal protective equipment for worker safety: A deep learning approach for green edge computing
US20160253581A1 (en) Processing system, processing method, and recording medium
Ngoc-Thoan et al. Improved detection network model based on YOLOv5 for warning safety in construction sites
Ma et al. YOLO-FL: A target detection algorithm for reflective clothing wearing inspection
CN113128760A (en) Double-control management method, device, equipment and storage medium for safety production
CN112651315A (en) Information extraction method and device of line graph, computer equipment and storage medium
CN113963311B (en) Safe production risk video monitoring method and system
CN116071784A (en) Personnel illegal behavior recognition method, device, equipment and storage medium
Wan et al. Improved Vision-Based Method for Detection of Unauthorized Intrusion by Construction Sites Workers

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant