CN113287120A - Vehicle driving environment abnormality monitoring method and device, electronic device, and storage medium


Info

Publication number: CN113287120A
Application number: CN202180000757.9A
Authority: CN (China)
Prior art keywords: module, behavior, feature, detection model, vehicle
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 高毅鹏, 刘力铭, 黄凯明
Current assignee: Streamax Technology Co., Ltd.
Original assignee: Streamax Technology Co., Ltd.
Application filed by Streamax Technology Co., Ltd.

Classifications

    • G06V 20/59: Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
    • G06F 18/253: Pattern recognition; fusion techniques of extracted features
    • G06N 3/045: Neural networks; combinations of networks
    • G06N 3/047: Neural networks; probabilistic or stochastic networks
    • G06N 3/082: Neural network learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G06V 40/161: Human faces; detection, localisation, normalisation
    • G06V 40/168: Human faces; feature extraction, face representation

Abstract

The present application is applicable to the technical field of safety monitoring and provides a vehicle driving environment abnormality monitoring method and device, an electronic device, and a storage medium. The method includes: acquiring an in-vehicle image in real time; and inputting the in-vehicle image into a trained behavior detection model for processing to obtain an anomaly detection result output by the behavior detection model. The processing of the in-vehicle image by the behavior detection model includes: acquiring keypoint features of persons in the in-vehicle image and their associated embedding values; determining the behavior states of the persons according to the keypoint features and their associated embedding values; and determining, based on the behavior states, whether abnormal behavior exists in the vehicle. If it is determined from the anomaly detection result that abnormal behavior exists in the vehicle, the abnormal behavior is reported to a designated terminal. The present application enables supervision of the behavior of the driver and passengers in the vehicle during driving, thereby effectively safeguarding their safety.

Description

Vehicle driving environment abnormality monitoring method and device, electronic device, and storage medium
Technical Field
The present application relates to the technical field of safety monitoring, and in particular to a vehicle driving environment abnormality monitoring method and device, an electronic device, and a storage medium.
Background
With the rapid pace of modernization in China, urban populations have grown steadily and a wide variety of transportation services have entered people's daily lives. On the one hand, this growth has greatly facilitated travel; on the other hand, it has brought numerous safety problems, especially for the public transportation services that play a large role in daily life, such as ride-hailing vehicles, taxis, and buses. In recent years, incidents in which a bus driver is assaulted by a passenger, causing a major traffic accident, have occurred repeatedly; passengers are sometimes harassed by taxi drivers; and incidents endangering passenger safety in ride-hailing vehicles occur from time to time. How to safeguard the safety of passengers and drivers has therefore become a primary research problem.
At present, public transport operation platforms usually provide channels for complaints or alarms in order to supervise driver behavior. However, such complaints or alarms are usually filed only after abnormal behavior has occurred, so their timeliness is poor: abnormal driver behavior such as harassing or harming passengers is difficult to report to the operation platform or the police in time, and the safety of the driver and passengers cannot be guaranteed during driving.
In summary, the prior art lacks effective monitoring of abnormal behavior in the vehicle driving environment and cannot effectively guarantee the safety of the driver and passengers during driving.
Disclosure of Invention
Embodiments of the present application provide a vehicle driving environment abnormality monitoring method and device, an electronic device, and a storage medium, which can solve the problem that the prior art lacks effective monitoring of abnormal behavior in the vehicle driving environment and cannot effectively guarantee the safety of the driver and passengers during driving.
In a first aspect, an embodiment of the present application provides a vehicle driving environment abnormality monitoring method, including:
acquiring an in-vehicle image in real time;
inputting the in-vehicle image into a trained behavior detection model for processing to obtain an anomaly detection result output by the behavior detection model;
wherein the processing of the in-vehicle image by the behavior detection model includes: acquiring keypoint features of persons in the in-vehicle image and associated embedding values of the keypoint features, where the associated embedding values identify the degree of association between the keypoint features; determining the behavior states of the persons according to the keypoint features and their associated embedding values; and determining, based on the behavior states, whether abnormal behavior exists in the vehicle;
and if it is determined from the anomaly detection result that abnormal behavior exists in the vehicle, reporting the abnormal behavior to a designated terminal.
In a possible implementation of the first aspect, the behavior detection model includes a feature extraction module, an attention module, and a feature fusion module, and further includes a dual-head decoupled structure;
the step of acquiring the keypoint features of the persons in the in-vehicle image and their associated embedding values includes:
inputting the in-vehicle image into the feature extraction module, and outputting the human body features of the persons through the feature extraction module;
inputting the human body features into the attention module, and adaptively weighting the human body features through the attention module;
inputting the output of the attention module into the feature fusion module, and performing feature fusion on the output of the attention module through the feature fusion module;
and performing task regression on the output of the feature fusion module using the dual-head decoupled structure to obtain the keypoint features of the persons and their associated embedding values.
In a possible implementation of the first aspect, the feature extraction module includes a first feature extraction submodule, a second feature extraction submodule, and a third feature extraction submodule, and the attention module includes a channel attention submodule and a spatial attention submodule;
the step of outputting the human body features of the persons through the feature extraction module includes:
outputting a first human body feature of a first level through the first feature extraction submodule;
outputting a second human body feature of a second level through the second feature extraction submodule;
outputting a third human body feature of a third level through the third feature extraction submodule, where the first level, the second level, and the third level form a progressive relation;
the step of inputting the human body features into the attention module and adaptively weighting them through the attention module includes:
performing channel-adaptive weighting on the first, second, and third human body features through the channel attention submodule;
performing spatially adaptive weighting on the first, second, and third human body features through the spatial attention submodule;
the step of performing feature fusion on the output of the attention module through the feature fusion module includes:
performing feature fusion on the output of the channel attention submodule and the output of the spatial attention submodule through the feature fusion module.
In a possible implementation of the first aspect, the behavior detection model further includes a spatio-temporal graph convolution module, and the step of determining the behavior states of the persons according to the keypoint features and their associated embedding values includes:
obtaining target features according to the keypoint features and the associated embedding values;
inputting the target features into the spatio-temporal graph convolution module for processing, and acquiring the behavior states of the persons output by the spatio-temporal graph convolution module, where the spatio-temporal graph convolution module performs a first graph convolution in the spatial dimension and a second graph convolution in the temporal dimension on the target features, and determines the behavior states of the persons according to the convolution results of the first and second graph convolutions.
In a possible implementation of the first aspect, the vehicle driving environment abnormality monitoring method further includes:
performing model optimization on the trained behavior detection model according to a preset algorithm to obtain a target behavior detection model;
processing the in-vehicle image using the target behavior detection model to obtain an anomaly detection result output by the target behavior detection model;
and if it is determined, from the anomaly detection result output by the target behavior detection model, that abnormal behavior exists in the vehicle, reporting the abnormal behavior to the designated terminal.
In a possible implementation of the first aspect, the step of performing model optimization on the trained behavior detection model according to a preset algorithm to obtain a target behavior detection model includes:
acquiring a target training sample set;
acquiring a teacher network model and a student network model, where the teacher network model is the trained behavior detection model, and the student network model is the trained behavior detection model pruned according to preset parameters;
and performing model distillation on the teacher network model and the student network model based on a generative adversarial network and the training sample set to obtain the target behavior detection model.
In a second aspect, an embodiment of the present application provides a vehicle driving environment abnormality monitoring device, including:
an image acquisition unit, configured to acquire an in-vehicle image in real time;
an anomaly detection unit, configured to input the in-vehicle image into a trained behavior detection model for processing to obtain an anomaly detection result output by the behavior detection model, where the processing of the in-vehicle image by the behavior detection model includes: acquiring keypoint features of persons in the in-vehicle image and associated embedding values of the keypoint features, where the associated embedding values identify the degree of association between the keypoint features; determining the behavior states of the persons according to the keypoint features and their associated embedding values; and determining, based on the behavior states, whether abnormal behavior exists in the vehicle;
and an anomaly reporting unit, configured to report the abnormal behavior to a designated terminal if it is determined from the anomaly detection result that abnormal behavior exists in the vehicle.
In a third aspect, an embodiment of the present application provides an electronic device, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, where the processor, when executing the computer-readable instructions, implements the vehicle driving environment abnormality monitoring method according to the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing computer-readable instructions that, when executed by a processor, implement the vehicle driving environment abnormality monitoring method according to the first aspect.
In a fifth aspect, an embodiment of the present application provides a computer-readable instruction product that, when run on an electronic device, causes the electronic device to execute the vehicle driving environment abnormality monitoring method according to the first aspect.
In the embodiments of the present application, an in-vehicle image is acquired in real time and input into a trained behavior detection model for processing, and an anomaly detection result output by the behavior detection model is obtained, where the processing of the in-vehicle image by the behavior detection model includes: acquiring keypoint features of persons in the in-vehicle image and associated embedding values of the keypoint features, where the associated embedding values identify the degree of association between the keypoint features; and determining the behavior states of the persons according to the keypoint features and their associated embedding values. If it is determined from the anomaly detection result that abnormal behavior exists in the vehicle, the abnormal behavior is reported to a designated terminal. The behavior of the driver and passengers in the vehicle is thus supervised during driving, so that the operation platform or a relevant law-enforcement unit can take timely measures to avoid accidents, thereby effectively safeguarding the safety of the driver and passengers.
Drawings
To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings based on them without inventive effort.
FIG. 1 is a flowchart of an implementation of the vehicle driving environment abnormality monitoring method provided by an embodiment of the present application;
FIG. 2 is a flowchart of a specific implementation of acquiring keypoint features and their associated embedding values in the vehicle driving environment abnormality monitoring method provided by an embodiment of the present application;
FIG. 3 is a flowchart of a specific implementation of extracting human body features through the feature extraction module in the vehicle driving environment abnormality monitoring method provided by an embodiment of the present application;
FIG. 4 is a flowchart of a specific implementation of adaptive weighting by the attention module in the vehicle driving environment abnormality monitoring method provided by an embodiment of the present application;
FIG. 5 is a flowchart of a specific implementation of determining the behavior state in the vehicle driving environment abnormality monitoring method provided by an embodiment of the present application;
FIG. 6 is a schematic diagram of the allocation strategy for neighborhood subsets in the vehicle driving environment abnormality monitoring method provided by an embodiment of the present application;
FIG. 7 is a flowchart of a specific implementation of model optimization in the vehicle driving environment abnormality monitoring method provided by an embodiment of the present application;
FIG. 8 is a structural block diagram of the vehicle driving environment abnormality monitoring device provided by an embodiment of the present application;
FIG. 9 is a schematic diagram of an electronic device provided by an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon", "in response to determining", or "in response to detecting". Similarly, the phrase "if it is determined" or "if [a described condition or event] is detected" may be interpreted contextually to mean "upon determining", "in response to determining", "upon detecting [the described condition or event]", or "in response to detecting [the described condition or event]".
Furthermore, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used for distinguishing between descriptions and not necessarily for describing or implying relative importance.
Reference throughout this specification to "one embodiment" or "some embodiments," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," or the like, in various places throughout this specification are not necessarily all referring to the same embodiment, but rather "one or more but not all embodiments" unless specifically stated otherwise. The terms "comprising," "including," "having," and variations thereof mean "including, but not limited to," unless expressly specified otherwise.
The vehicle driving environment abnormality monitoring method provided by the embodiments of the present application can be applied to a vehicle-mounted intelligent terminal. The method is applicable not only to commercial vehicles such as taxis and buses, but also to scenarios such as ride-hailing and carpool (hitch-ride) vehicles, private cars of relatives and friends, and school buses. The embodiments of the present application place no limitation on the specific type of the terminal device.
Fig. 1 shows the implementation flow of the vehicle driving environment abnormality monitoring method provided by an embodiment of the present application. The executing end in this embodiment is a vehicle-mounted intelligent terminal, and the method includes steps S101 to S103. The specific implementation principle of each step is as follows:
s101: and acquiring an in-vehicle image in real time.
In the embodiment of the application, the vehicle-mounted intelligent terminal is provided with the camera, and the camera of the vehicle-mounted intelligent terminal is utilized to shoot images in the vehicle in real time. The in-vehicle image is an image to be processed.
In some embodiments, an in-vehicle video is obtained in real time, the in-vehicle video is composed of a series of frame video images, and a plurality of frames of in-vehicle images are extracted from the in-vehicle video.
In a possible implementation mode, video images with specified frame numbers are extracted from in-vehicle videos shot by a camera of the vehicle-mounted intelligent terminal, then a plurality of video images with specified frame numbers are selected from the extracted video images with specified frame numbers according to a preset image selection algorithm to serve as in-vehicle images to be processed, and the frame numbers can be customized by a user.
Illustratively, 18 frames of video images are extracted from an in-vehicle video shot by a camera of the in-vehicle intelligent terminal, and then 10 frames of in-vehicle images are selected from the extracted multi-frame video images according to a preset image extraction algorithm to serve as the in-vehicle images to be processed.
In fact, the process of extracting the images in the vehicle with the specified frame number from the videos in the vehicle shot by the camera of the vehicle-mounted intelligent terminal is also a process of primarily screening the images in the vehicle, and the screening standard can be determined according to the definition of the images, whether the images contain human body features and the like.
In some embodiments, facial feature point detection is performed on the multi-frame video images captured by the camera of the vehicle-mounted intelligent terminal; video images containing no human face are filtered out to obtain video images containing a human face; a specified number of frames are extracted from these according to a preset extraction algorithm; quality selection is then performed on them using a preset image selection algorithm, and the frames of relatively better quality are selected as the in-vehicle images to be processed.
Quality selection in this embodiment means evaluating image quality with a quality algorithm, outputting a corresponding score, and determining the image with the best quality as the in-vehicle image to be processed by comparing the scores of the images.
Since the camera of the vehicle-mounted intelligent terminal captures video, processing every video frame would increase the computation load and introduce a large amount of redundancy. In this embodiment, quality selection via a preset image selection algorithm reduces the computation load, avoids redundancy, and selects the best-quality frames as the in-vehicle images to be processed, which benefits the accuracy of subsequent processing and improves the effectiveness of face detection.
The preset extraction algorithm may be a random extraction algorithm that randomly extracts a set number of video frames from the preliminarily screened video images.
In some embodiments, the video images may also be screened against a preset image standard, and a specified number of frames meeting the standard are extracted as the in-vehicle images to be processed, where the preset image standard includes image sharpness, face integrity, the number of key human body parts, or face angle.
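For illustration, the frame-sampling and quality-selection steps above can be sketched as follows; this is a minimal sketch under assumptions (OpenCV is available on the terminal, and a Laplacian-variance sharpness score stands in for the unspecified quality algorithm), not the exact procedure of this application:
```python
import cv2  # OpenCV; assumed available on the vehicle-mounted terminal

def sample_frames(video_path, num_extract=18):
    """Evenly sample `num_extract` frames from a video (preliminary screening)."""
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    frames = []
    for idx in range(0, total, max(total // num_extract, 1)):
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)
        ok, frame = cap.read()
        if ok:
            frames.append(frame)
        if len(frames) == num_extract:
            break
    cap.release()
    return frames

def select_best(frames, num_keep=10):
    """Keep the `num_keep` sharpest frames; Laplacian variance stands in
    for the quality algorithm, which the text leaves unspecified."""
    def sharpness(img):
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        return cv2.Laplacian(gray, cv2.CV_64F).var()
    return sorted(frames, key=sharpness, reverse=True)[:num_keep]

# Usage matching the example above: 18 frames sampled, the 10 best kept.
# best = select_best(sample_frames("in_vehicle.mp4", 18), 10)
```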
S102: input the in-vehicle image into the trained behavior detection model for processing to obtain the anomaly detection result output by the behavior detection model.
The behavior detection model is a deep-learning neural network model that can be obtained by training on sample images of various known behavior categories as a training sample set; for example, sample images from the Kinetics behavior dataset can be used.
To improve the accuracy of abnormal behavior detection, the processing of the in-vehicle image by the behavior detection model includes: acquiring keypoint features of the persons in the in-vehicle image and their associated embedding values, where the associated embedding values identify the degree of association between keypoint features; determining the behavior states of the persons according to the keypoint features and their associated embedding values; and determining, based on the behavior states, whether abnormal behavior exists in the vehicle.
In this embodiment of the present application, the input to the trained behavior detection model is the several frames of in-vehicle images. The behavior detection model acquires the keypoint features of the persons in each frame and the associated embedding values corresponding to the keypoint features; the degree of association between keypoint features can be determined from the associated embedding values.
In some embodiments, the closer the associated embedding values, the higher the degree of association between the keypoint features; conversely, the farther apart the associated embedding values, the lower the degree of association.
In a possible implementation, the number of persons in the in-vehicle image is determined from the keypoint features of the persons and the associated embedding values of the keypoint features.
In this embodiment of the present application, when different keypoint features belong to the same person, the distances between their corresponding associated embedding values are small; when they belong to different persons, the distances between their corresponding associated embedding values are large.
In some embodiments, the differences between the associated embedding values of the keypoint features are computed, similar keypoint features are determined from these differences and grouped into the same feature group, and the number of persons is determined from the number of feature groups.
In one embodiment, keypoint features whose associated embedding values differ by no more than a preset difference threshold are determined to be similar keypoint features.
For example, the Hungarian algorithm is used to group keypoint features with similar associated embedding values into the same feature group. For an introduction to the Hungarian algorithm and its use, reference may be made to the prior art; it is not described further here.
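As an illustrative sketch of this grouping step (assumptions: SciPy's Hungarian-algorithm implementation is used, and each associated embedding value is treated as a scalar; the exact procedure of this application may differ), keypoints can be assigned to person groups by solving an assignment problem over the differences between embedding values:
```python
import numpy as np
from scipy.optimize import linear_sum_assignment  # Hungarian algorithm

def group_keypoints(embeddings_per_type, diff_threshold=1.0):
    """embeddings_per_type: list over keypoint types; each entry is a 1-D
    array of associated embedding values, one per detected keypoint.
    Returns a list of groups (dict type -> embedding), one per person."""
    groups = []  # each group holds the embeddings assigned to one person
    for ktype, embs in enumerate(embeddings_per_type):
        if not groups:
            groups = [{ktype: e} for e in embs]
            continue
        means = np.array([np.mean(list(g.values())) for g in groups])
        cost = np.abs(means[:, None] - np.asarray(embs)[None, :])
        rows, cols = linear_sum_assignment(cost)  # optimal assignment
        assigned = set()
        for r, c in zip(rows, cols):
            if cost[r, c] <= diff_threshold:  # similar embeddings: same person
                groups[r][ktype] = embs[c]
                assigned.add(c)
        for c, e in enumerate(embs):
            if c not in assigned:  # distant embedding: start a new person
                groups.append({ktype: e})
    return groups  # number of persons = len(groups)
```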
As a possible implementation of the present application, the behavior detection model includes a feature extraction module, an attention module, and a feature fusion module. Fig. 2 shows the specific implementation flow of acquiring the keypoint features of the persons in the in-vehicle image and their associated embedding values:
A1: input the in-vehicle image into the feature extraction module, and output the human body features of the persons through the feature extraction module.
In this embodiment of the present application, the feature extraction module is configured to extract and output the human body features of the persons in the in-vehicle image. The human body features include human skeleton features.
In some embodiments, the feature extraction module includes a MobileNetV3 network, and the human skeleton features of the driver and passengers are obtained from its output.
As a possible implementation of the present application, the feature extraction module includes a first feature extraction submodule, a second feature extraction submodule, and a third feature extraction submodule. As shown in Fig. 3, the step of outputting the human body features of the persons through the feature extraction module includes:
A11: output a first human body feature of a first level through the first feature extraction submodule, which is configured to extract features of the first level.
A12: output a second human body feature of a second level through the second feature extraction submodule, which is configured to extract features of the second level.
A13: output a third human body feature of a third level through the third feature extraction submodule, which is configured to extract features of the third level, where the first level, the second level, and the third level form a progressive relation.
In this embodiment of the present application, the first, second, and third feature extraction submodules extract features of different levels from the in-vehicle image. Features of different levels enrich the feature information and thereby improve the detection accuracy of the behavior detection model.
In one embodiment, the first level represents shallow semantics, the second level represents middle semantics, and the third level represents deep semantics.
In another embodiment, the first, second, and third levels have a decreasing relation: the first level represents deep semantics, the second level middle semantics, and the third level shallow semantics.
In an embodiment, the first level represents a first resolution, and the first feature extraction submodule is configured to extract human body features at the first resolution; the second level represents a second resolution, and the second feature extraction submodule is configured to extract human body features at the second resolution; the third level represents a third resolution, and the third feature extraction submodule is configured to extract human body features at the third resolution, where the first resolution is lower than the second resolution, and the second resolution is lower than the third resolution.
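A minimal sketch of tapping three such levels from a shared MobileNetV3 backbone using forward hooks (the stage indices and the use of torchvision are assumptions for illustration; the submodule boundaries used in this application are not specified):
```python
import torch
from torchvision.models import mobilenet_v3_large  # torchvision >= 0.13 assumed

backbone = mobilenet_v3_large(weights=None).features  # an nn.Sequential
taps, feats = [3, 6, 12], {}  # stage indices chosen for illustration only

def make_hook(name):
    def hook(module, inputs, output):
        feats[name] = output  # capture the feature map at this stage
    return hook

for level, idx in zip(("shallow", "middle", "deep"), taps):
    backbone[idx].register_forward_hook(make_hook(level))

x = torch.randn(1, 3, 224, 224)  # a dummy in-vehicle image
_ = backbone(x)
for name, f in feats.items():
    print(name, tuple(f.shape))  # deeper levels have lower spatial resolution
```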
A2: input the human body features into the attention module, and adaptively weight the human body features through the attention module.
In this embodiment of the present application, the attention module uses an attention mechanism to control weighting coefficients and adaptively weights the human body features extracted by the feature extraction module according to these coefficients. The output of the attention module includes the adaptively weighted human body features.
As a possible implementation of the present application, the attention module includes a channel attention submodule and a spatial attention submodule. As shown in Fig. 4, the step of inputting the human body features into the attention module and adaptively weighting them through the attention module includes:
A21: perform channel-adaptive weighting on the first, second, and third human body features through the channel attention submodule.
A22: perform spatially adaptive weighting on the first, second, and third human body features through the spatial attention submodule.
In this embodiment of the present application, the channel attention submodule and the spatial attention submodule perform squeeze and excitation operations on the first, second, and third human body features, thereby achieving channel-wise and spatial adaptive weighting of these features.
In a possible embodiment, the first, second, and third human body features are concatenated before being input into the attention module.
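The following minimal sketch illustrates the two attention submodules under common assumptions (a squeeze-and-excitation channel attention and a convolutional spatial attention over pooled channel statistics); the exact structures used in this application are not disclosed:
```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel-adaptive weighting."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                       # squeeze
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),                                  # excitation weights
        )
    def forward(self, x):
        return x * self.fc(x)

class SpatialAttention(nn.Module):
    """Spatially adaptive weighting from pooled channel statistics."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)
    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)
        mx, _ = x.max(dim=1, keepdim=True)
        weight = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * weight

# As noted above, the three human body features (resized to a common shape)
# can be concatenated along the channel dimension before these modules.
```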
A3: input the output of the attention module into the feature fusion module, and perform feature fusion on it through the feature fusion module.
The feature fusion module is configured to fuse the adaptively weighted human body features so as to facilitate further analysis and processing.
In some embodiments, the output of the channel attention submodule and the output of the spatial attention submodule are fused by the feature fusion module.
A4: perform task regression on the output of the feature fusion module using the dual-head decoupled structure to obtain the keypoint features of the persons and their associated embedding values.
In an embodiment of the present application, the behavior detection model further includes a dual-head decoupled structure. Coupling refers to the phenomenon in which two or more systems, or two forms of motion, influence each other through interaction and thereby become joined; decoupling is the mathematical separation of the two. Dependency relationships between modules necessarily result in coupling.
In a possible implementation, the output of the feature fusion module includes a keypoint heatmap and associated embedding values, and task regression is performed using the dual-head decoupled structure. Task regression refers to determining the person label to which each keypoint feature belongs. The keypoint heatmap is regressed with a focal loss, and the associated embedding values are regressed with a push loss paired with a pull loss.
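A minimal sketch of the pull and push regression terms under the usual associative-embedding formulation (the exact loss formulas of this application are not given, so the definitions below are assumptions):
```python
import torch

def pull_push_loss(person_embeddings, sigma=1.0):
    """person_embeddings: list of 1-D tensors, one tensor of associated
    embedding values per person. Returns (pull, push) regression terms."""
    means = torch.stack([e.mean() for e in person_embeddings])
    # pull: embeddings of the same person's keypoints stay near their mean
    pull = torch.stack(
        [((e - m) ** 2).mean() for e, m in zip(person_embeddings, means)]
    ).mean()
    # push: means of different persons are driven apart
    diff = means[:, None] - means[None, :]
    n = means.numel()
    push = (torch.exp(-diff ** 2 / (2 * sigma ** 2)).sum() - n) / max(n * (n - 1), 1)
    return pull, push

# The keypoint heatmap head is regressed with a focal loss instead;
# its formula is omitted here.
```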
In this embodiment of the present application, the dual-head decoupled structure reduces the degree of coupling between the modules and avoids severe coupling of the information passed back during training.
For example, three feature extraction submodules of a MobileNetV3 network, namely C2, C3, and C4, are used to extract shallow, middle, and deep human body features respectively; the channel attention submodule and the spatial attention submodule then perform channel-wise and spatial adaptive weighting on the features extracted by C2, C3, and C4. The feature fusion module integrates the adaptively weighted human body features, and the dual-head decoupled structure performs task regression on the output of the feature fusion module to obtain the keypoint features of the persons in the in-vehicle image and their associated embedding values.
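Putting these pieces together, a sketch of the fusion step and the dual-head decoupled regression (channel sizes, keypoint count, and layer choices are illustrative assumptions): the fused feature map feeds two separate heads, one regressing keypoint heatmaps and one regressing associated embedding values, so the two tasks do not share final-layer parameters or gradients:
```python
import torch
import torch.nn as nn

class FusionDualHead(nn.Module):
    """Feature fusion followed by a dual-head decoupled output."""
    def __init__(self, in_channels, num_keypoints=17):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(in_channels, 128, 3, padding=1),
            nn.BatchNorm2d(128),
            nn.ReLU(inplace=True),
        )
        # decoupled heads: separate parameters, separate gradient paths
        self.heatmap_head = nn.Conv2d(128, num_keypoints, 1)    # focal loss
        self.embedding_head = nn.Conv2d(128, num_keypoints, 1)  # pull/push loss
    def forward(self, x):
        f = self.fuse(x)
        return self.heatmap_head(f).sigmoid(), self.embedding_head(f)
```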
As a possible implementation of the present application, the behavior detection model further includes a spatio-temporal graph convolution module. Fig. 5 shows the specific implementation flow of determining the behavior states of the persons according to the keypoint features and their associated embedding values:
B1: obtain target features according to the keypoint features and the associated embedding values.
In this embodiment of the present application, the target features are obtained by concatenating the feature vectors of the keypoint features with the associated embedding values.
B2: input the target features into the spatio-temporal graph convolution module for processing, and acquire the behavior states of the persons output by the spatio-temporal graph convolution module. The spatio-temporal graph convolution module performs a first graph convolution in the spatial dimension and a second graph convolution in the temporal dimension on the target features, and determines the behavior states of the persons according to the convolution results of the first and second graph convolutions.
In this embodiment of the present application, the spatio-temporal graph convolution module includes three layers of spatio-temporal graph convolution. First, an allocation strategy for the neighborhood subsets is defined. As shown in Fig. 6, the neighborhood of each joint is divided into 3 subsets according to the distance from the neighborhood nodes to the body center: a first subset of nodes closer to the body center, a second subset of nodes farther from the body center, and a third subset consisting of the neighborhood's root node itself. Given this neighborhood-subset definition, the spatio-temporal graph convolution can then be treated like an ordinary convolution: the first graph convolution is performed in the spatial dimension, in which different subsets are multiplied by different keypoint weight vectors while the same weight vector is shared within a subset; the second graph convolution is then performed in the temporal dimension. After stacking three such spatio-temporal graph convolution layers, the behavior states of the persons in the vehicle are regressed with a softmax function. In this way, both the structural prior of the human skeleton keypoints and the propagation of information along the temporal dimension are exploited effectively, so that the behavior states of the persons in the vehicle can be judged accurately, improving the accuracy of in-vehicle driving environment abnormality monitoring.
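A minimal sketch of this spatio-temporal graph convolution under the three-subset partition (joint count, channel sizes, and the normalized skeleton adjacency are illustrative assumptions): each subset has its own weights in the spatial graph convolution, a temporal convolution follows, three layers are stacked, and a softmax regresses the behavior state:
```python
import torch
import torch.nn as nn

class STGCNLayer(nn.Module):
    """One spatial graph convolution (3 partition subsets) + temporal conv.
    Input shape: (batch, channels, frames, joints)."""
    def __init__(self, c_in, c_out, adj_subsets, t_kernel=9):
        super().__init__()
        # adj_subsets: (3, V, V) normalized adjacency, one matrix per subset
        # (root itself / closer to body center / farther from body center)
        self.register_buffer("A", adj_subsets)
        self.spatial = nn.Conv2d(c_in, c_out * 3, kernel_size=1)  # per-subset weights
        self.temporal = nn.Conv2d(
            c_out, c_out, (t_kernel, 1), padding=(t_kernel // 2, 0)
        )
        self.relu = nn.ReLU(inplace=True)
    def forward(self, x):
        n, _, t, v = x.shape
        y = self.spatial(x).view(n, 3, -1, t, v)
        # first graph convolution (spatial): aggregate over each subset
        y = torch.einsum("nkctv,kvw->nctw", y, self.A)
        # second graph convolution (temporal): convolve along the frame axis
        return self.relu(self.temporal(y))

class BehaviorSTGCN(nn.Module):
    """Three stacked layers + softmax regression of the behavior state."""
    def __init__(self, adj_subsets, c_in=3, num_states=2):
        super().__init__()
        self.layers = nn.Sequential(
            STGCNLayer(c_in, 32, adj_subsets),
            STGCNLayer(32, 64, adj_subsets),
            STGCNLayer(64, 64, adj_subsets),
        )
        self.classify = nn.Linear(64, num_states)
    def forward(self, x):
        f = self.layers(x).mean(dim=(2, 3))  # pool over time and joints
        return self.classify(f).softmax(dim=-1)
```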
S103: if it is determined from the anomaly detection result that abnormal behavior exists in the vehicle, report the abnormal behavior to a designated terminal.
In this embodiment of the present application, the persons include the driver and passengers. The behavior states are classified into a normal behavior state and an abnormal behavior state. If a behavior state is abnormal, the anomaly detection result further includes the abnormal behavior. Abnormal behavior includes harassment of a passenger, harassment of the driver, and assault of a passenger.
In this embodiment of the present application, when it is determined that abnormal behavior exists in the vehicle, a report event is generated immediately according to the abnormal behavior and uploaded to the designated terminal. The vehicle-mounted intelligent terminal is in communication connection with the designated terminal. The designated terminal includes, but is not limited to, an intelligent terminal of the operation platform, an intelligent terminal of a supervision department, and a mobile terminal of a designated user.
In some possible embodiments, the vehicle driving environment abnormality monitoring method further includes: if it is determined from the anomaly detection result that abnormal behavior exists in the vehicle, issuing an audible alarm to prompt the driver and passengers in the vehicle to stop the abnormal behavior.
As a possible embodiment of the present application, the vehicle driving environment abnormality monitoring method further includes:
C1: perform model optimization on the trained behavior detection model according to a preset algorithm to obtain a target behavior detection model.
In this embodiment of the present application, model optimization is performed on the trained behavior detection model to further meet the computing-power constraints of edge computing devices.
In a possible embodiment, the preset algorithm includes model distillation. As shown in Fig. 7, step C1 includes:
C11: acquire a target training sample set. The number and types of samples in the target training sample set may be determined according to actual needs and are not specifically limited here. The training sample set of the behavior detection model may also be used directly.
C12: acquire a teacher network model and a student network model, where the teacher network model is the trained behavior detection model, and the student network model is the trained behavior detection model pruned according to preset parameters. The preset parameter may be model sensitivity.
Pruning here refers to reducing the number of channels of the convolutional layers in the model. For example, in this embodiment of the present application the pruning rate is set to 0.4, so 40% of the channels of each convolutional layer in the trained behavior detection model are pruned.
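A sketch of structured channel pruning at the 0.4 pruning rate (illustrative: the sensitivity analysis mentioned below is not shown, and a simple L1-norm ranking of output channels stands in as the selection criterion):
```python
import torch
import torch.nn as nn

def prune_conv_channels(conv: nn.Conv2d, prune_rate=0.4):
    """Return a new Conv2d keeping the (1 - prune_rate) fraction of output
    channels with the largest L1 weight norm."""
    n_keep = max(int(conv.out_channels * (1 - prune_rate)), 1)
    l1 = conv.weight.detach().abs().sum(dim=(1, 2, 3))  # per-output-channel norm
    keep = torch.argsort(l1, descending=True)[:n_keep]
    pruned = nn.Conv2d(
        conv.in_channels, n_keep, conv.kernel_size,
        stride=conv.stride, padding=conv.padding, bias=conv.bias is not None,
    )
    with torch.no_grad():
        pruned.weight.copy_(conv.weight[keep])
        if conv.bias is not None:
            pruned.bias.copy_(conv.bias[keep])
    return pruned  # downstream layers must be shrunk accordingly (omitted)
```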
C13: perform model distillation on the teacher network model and the student network model based on a generative adversarial network and the training sample set to obtain the target behavior detection model.
In this embodiment of the present application, the idea of a generative adversarial network is incorporated into the model distillation process, which effectively preserves the precision of the model.
C2: process the in-vehicle image using the target behavior detection model to obtain the anomaly detection result output by the target behavior detection model.
In this embodiment of the present application, the trained behavior detection model is taken as the original model and used as the teacher network, and the trained behavior detection model automatically pruned based on sensitivity analysis is used as the student network for model distillation. The GAN idea is added to the distillation process: during model distillation, the teacher network serves as the discriminative model and the student network as the generative model, and the discriminative network effectively supervises the generative network for iterative model optimization, finally yielding a network model with high algorithmic precision and low computation cost.
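A compressed sketch of one adversarial distillation step. Note the assumptions: the description above casts the teacher network itself as the discriminative model, whereas the common implementation sketched here adds a small auxiliary discriminator that tries to tell teacher outputs from student outputs; the soft-label temperature, loss weighting, and optimizer settings are placeholders:
```python
import torch
import torch.nn.functional as F

def distill_step(teacher, student, disc, batch, opt_s, opt_d, temp=4.0):
    """One iteration: the discriminator learns to tell teacher from student
    outputs; the student learns to match the teacher and fool the discriminator."""
    with torch.no_grad():
        t_out = teacher(batch)           # fixed teacher (original model)
    s_out = student(batch)               # pruned student (generative model)

    # update discriminator: teacher outputs labeled real, student fake
    d_loss = F.binary_cross_entropy_with_logits(
        disc(t_out), torch.ones_like(disc(t_out))
    ) + F.binary_cross_entropy_with_logits(
        disc(s_out.detach()), torch.zeros_like(disc(s_out.detach()))
    )
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # update student: soft-label distillation + adversarial term
    kd = F.kl_div(
        F.log_softmax(s_out / temp, dim=1),
        F.softmax(t_out / temp, dim=1),
        reduction="batchmean",
    ) * temp ** 2
    adv = F.binary_cross_entropy_with_logits(
        disc(s_out), torch.ones_like(disc(s_out))
    )
    s_loss = kd + 0.1 * adv  # 0.1 is an arbitrary placeholder weight
    opt_s.zero_grad(); s_loss.backward(); opt_s.step()
    return d_loss.item(), s_loss.item()
```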
C3: if it is determined, from the anomaly detection result output by the target behavior detection model, that abnormal behavior exists in the vehicle, report the abnormal behavior to the designated terminal.
In this embodiment of the present application, structured pruning combined with model distillation further compresses the model, yielding the optimized target behavior detection model; the in-vehicle image is then processed with the target behavior detection model to obtain its anomaly detection result. This greatly reduces the computation cost of the model while avoiding loss of model precision, and can enhance the real-time performance of abnormal behavior monitoring.
As can be seen from the above, in this embodiment of the present application, an in-vehicle image is acquired in real time and input into the trained behavior detection model for processing, and the anomaly detection result output by the behavior detection model is obtained, where the processing of the in-vehicle image by the behavior detection model includes: acquiring keypoint features of persons in the in-vehicle image and associated embedding values of the keypoint features, where the associated embedding values identify the degree of association between the keypoint features; and determining the behavior states of the persons according to the keypoint features and their associated embedding values. If it is determined from the anomaly detection result that abnormal behavior exists in the vehicle, the abnormal behavior is reported to a designated terminal. The behavior of the driver and passengers in the vehicle is thus supervised during driving, so that the operation platform or a relevant law-enforcement unit can take timely measures to avoid accidents, thereby effectively safeguarding the safety of the driver and passengers.
It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an execution order; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
Corresponding to the vehicle driving environment abnormality monitoring method described in the above embodiments, Fig. 8 shows a structural block diagram of the vehicle driving environment abnormality monitoring device provided by an embodiment of the present application. For convenience of description, only the parts relevant to the embodiments of the present application are shown.
Referring to Fig. 8, the vehicle driving environment abnormality monitoring device includes an image acquisition unit 81, an anomaly detection unit 82, and an anomaly reporting unit 83, where:
the image acquisition unit 81 is configured to acquire an in-vehicle image in real time;
the anomaly detection unit 82 is configured to input the in-vehicle image into a trained behavior detection model for processing to obtain an anomaly detection result output by the behavior detection model, where the processing of the in-vehicle image by the behavior detection model includes: acquiring keypoint features of persons in the in-vehicle image and associated embedding values of the keypoint features, where the associated embedding values identify the degree of association between the keypoint features; determining the behavior states of the persons according to the keypoint features and their associated embedding values; and determining, based on the behavior states, whether abnormal behavior exists in the vehicle;
and the anomaly reporting unit 83 is configured to report the abnormal behavior to a designated terminal if it is determined from the anomaly detection result that abnormal behavior exists in the vehicle.
As a possible implementation of the present application, the behavior detection model includes a feature extraction module, an attention module, and a feature fusion module, and further includes a dual-head decoupled structure; the anomaly detection unit 82 is specifically configured to:
input the in-vehicle image into the feature extraction module, and output the human body features of the persons through the feature extraction module;
input the human body features into the attention module, and adaptively weight the human body features through the attention module;
input the output of the attention module into the feature fusion module, and perform feature fusion on it through the feature fusion module;
and perform task regression on the output of the feature fusion module using the dual-head decoupled structure to obtain the keypoint features of the persons and their associated embedding values.
As a possible implementation of the present application, the feature extraction module includes a first feature extraction submodule, a second feature extraction submodule, and a third feature extraction submodule, and the attention module includes a channel attention submodule and a spatial attention submodule;
the step of outputting the human body features of the persons through the feature extraction module includes:
outputting a first human body feature of a first level through the first feature extraction submodule;
outputting a second human body feature of a second level through the second feature extraction submodule;
outputting a third human body feature of a third level through the third feature extraction submodule, where the first level, the second level, and the third level form a progressive relation;
the step of inputting the human body features into the attention module and adaptively weighting them through the attention module includes:
performing channel-adaptive weighting on the first, second, and third human body features through the channel attention submodule;
performing spatially adaptive weighting on the first, second, and third human body features through the spatial attention submodule;
the step of performing feature fusion on the output of the attention module through the feature fusion module includes:
performing feature fusion on the output of the channel attention submodule and the output of the spatial attention submodule through the feature fusion module.
As a possible implementation of the present application, the behavior detection model further includes a spatio-temporal graph convolution module, and the anomaly detection unit 82 is further configured to:
obtain target features according to the keypoint features and the associated embedding values;
and input the target features into the spatio-temporal graph convolution module for processing, and acquire the behavior states of the persons output by the spatio-temporal graph convolution module, where the spatio-temporal graph convolution module performs a first graph convolution in the spatial dimension and a second graph convolution in the temporal dimension on the target features, and determines the behavior states of the persons according to the convolution results of the first and second graph convolutions.
As a possible embodiment of the present application, the vehicle driving environment abnormality monitoring device further includes:
a model optimization unit, configured to perform model optimization on the trained behavior detection model according to a preset algorithm to obtain a target behavior detection model;
the anomaly detection unit 82 is further configured to process the in-vehicle image using the target behavior detection model to obtain an anomaly detection result output by the target behavior detection model;
and the anomaly reporting unit 83 is further configured to report the abnormal behavior to the designated terminal if it is determined, from the anomaly detection result output by the target behavior detection model, that abnormal behavior exists in the vehicle.
As a possible implementation of the present application, the model optimization unit includes:
a sample acquisition module, configured to acquire a target training sample set;
an optimization training module, configured to acquire a teacher network model and a student network model, where the teacher network model is the trained behavior detection model, and the student network model is the trained behavior detection model pruned according to preset parameters;
and a target model generation module, configured to perform model distillation on the teacher network model and the student network model based on a generative adversarial network and the training sample set to obtain the target behavior detection model.
As can be seen from the above, in this embodiment of the present application, an in-vehicle image is acquired in real time and input into the trained behavior detection model for processing, and the anomaly detection result output by the behavior detection model is obtained, where the processing of the in-vehicle image by the behavior detection model includes: acquiring keypoint features of persons in the in-vehicle image and associated embedding values of the keypoint features, where the associated embedding values identify the degree of association between the keypoint features; and determining the behavior states of the persons according to the keypoint features and their associated embedding values. If it is determined from the anomaly detection result that abnormal behavior exists in the vehicle, the abnormal behavior is reported to a designated terminal. The behavior of the driver and passengers in the vehicle is thus supervised during driving, so that the operation platform or a relevant law-enforcement unit can take timely measures to avoid accidents, thereby effectively safeguarding the safety of the driver and passengers.
It should be noted that the information interaction and execution processes between the above devices/units are based on the same concept as the method embodiments of the present application; for their specific functions and technical effects, reference may be made to the method embodiments, which are not repeated here.
Embodiments of the present application further provide a computer-readable storage medium storing computer-readable instructions that, when executed by a processor, implement the steps of any of the vehicle driving environment abnormality monitoring methods shown in Figs. 1 to 7.
Embodiments of the present application further provide an electronic device, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, where the processor, when executing the computer-readable instructions, implements the steps of any of the vehicle driving environment abnormality monitoring methods shown in Figs. 1 to 7.
Embodiments of the present application further provide a computer-readable instruction product that, when run on an electronic device, causes the electronic device to execute the steps of any of the vehicle driving environment abnormality monitoring methods shown in Figs. 1 to 7.
Fig. 9 is a schematic diagram of an electronic device according to an embodiment of the present application. As shown in fig. 9, the electronic device 9 of this embodiment includes: a processor 90, a memory 91, and computer readable instructions 92 stored in the memory 91 and executable on the processor 90. When executing the computer readable instructions 92, the processor 90 implements the steps in the vehicle driving environment abnormality monitoring method embodiments described above, such as steps S101 to S103 shown in fig. 1. Alternatively, when executing the computer readable instructions 92, the processor 90 implements the functions of the modules/units in the above device embodiments, such as the functions of the units 81 to 83 shown in fig. 8.
Illustratively, the computer readable instructions 92 may be partitioned into one or more modules/units, which are stored in the memory 91 and executed by the processor 90 to implement the present application. The one or more modules/units may be a series of computer-readable instruction segments capable of performing specific functions, and the instruction segments are used to describe the execution process of the computer readable instructions 92 in the electronic device 9.
The electronic device 9 may be a vehicle-mounted intelligent terminal. The electronic device 9 may include, but is not limited to, the processor 90 and the memory 91. Those skilled in the art will appreciate that fig. 9 is only an example of the electronic device 9 and does not constitute a limitation on it; the electronic device 9 may include more or fewer components than those shown, combine certain components, or use different components. For example, the electronic device 9 may further include input/output devices, a network access device, a bus, and the like.
The processor 90 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 91 may be an internal storage unit of the electronic device 9, such as a hard disk or a memory of the electronic device 9. The memory 91 may also be an external storage device of the electronic device 9, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card equipped on the electronic device 9. Further, the memory 91 may include both an internal storage unit and an external storage device of the electronic device 9. The memory 91 is used to store the computer readable instructions and other programs and data required by the electronic device, and may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that the above division of functional units and modules is illustrated only for convenience and brevity of description. In practical applications, the above functions may be distributed to different functional units and modules as needed; that is, the internal structure of the apparatus may be divided into different functional units or modules to perform all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only for convenience of distinguishing them from each other and are not used to limit the protection scope of the present application. For the specific working processes of the units and modules in the above system, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. With this understanding, all or part of the flow of the methods of the present application may be implemented by computer readable instructions controlling the relevant hardware; the computer readable instructions may be stored in a computer readable storage medium and, when executed by a processor, implement the steps of the method embodiments. The computer readable instructions comprise computer readable instruction code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer readable medium may include at least: any entity or device capable of carrying the computer readable instruction code to an apparatus/terminal device, a recording medium, a computer memory, a read-only memory (ROM), a random-access memory (RAM), an electrical carrier signal, a telecommunications signal, and a software distribution medium, for example a USB flash drive, a removable hard disk, a magnetic disk, or an optical disk. In certain jurisdictions, in accordance with legislation and patent practice, computer readable media may not include electrical carrier signals or telecommunications signals.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (20)

1. A vehicle driving environment abnormality monitoring method characterized by comprising:
acquiring an in-vehicle image in real time;
inputting the in-vehicle image into a trained behavior detection model for processing, to obtain an abnormality detection result output by the behavior detection model;
wherein the processing of the in-vehicle image by the behavior detection model comprises: acquiring keypoint features of personnel in the in-vehicle image and associative embedding values of the keypoint features, wherein the associative embedding values identify the degree of association between the keypoint features; determining the behavior state of the personnel according to the keypoint features and the associative embedding values thereof; and determining, based on the behavior state, whether abnormal behavior exists in the vehicle;
and if it is determined according to the abnormality detection result that abnormal behavior exists in the vehicle, reporting the abnormal behavior to a designated terminal.
2. The vehicle driving environment abnormality monitoring method according to claim 1, wherein the behavior detection model comprises a feature extraction module, an attention module, and a feature fusion module, and further comprises a dual-head decoupled structure;
the step of acquiring the keypoint features of the personnel in the in-vehicle image and the associative embedding values thereof comprises:
inputting the in-vehicle image into the feature extraction module, and outputting the human body features of the personnel through the feature extraction module;
inputting the human body features into the attention module, and carrying out adaptive weighting on the human body features through the attention module;
inputting the output result of the attention module into the feature fusion module, and performing feature fusion on the output result of the attention module through the feature fusion module;
and performing task regression on the output result of the feature fusion module by using the dual-head decoupled structure, to obtain the keypoint features of the personnel and the associative embedding values thereof.
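To make the dual-head decoupled structure of claim 2 concrete, the following is a minimal PyTorch-style sketch in which one head regresses keypoint heatmaps and the other regresses per-keypoint embedding (tag) maps; channel counts and the number of keypoints are illustrative assumptions, not disclosed values.

import torch.nn as nn

class DualHeadDecoupled(nn.Module):
    """Hedged sketch: two decoupled regression heads over the fused feature
    map, one for keypoint heatmaps and one for associative embedding tags."""
    def __init__(self, in_ch=256, num_keypoints=17):
        super().__init__()
        self.kpt_head = nn.Sequential(           # keypoint heatmap branch
            nn.Conv2d(in_ch, in_ch, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_ch, num_keypoints, 1))
        self.tag_head = nn.Sequential(           # embedding-value branch
            nn.Conv2d(in_ch, in_ch, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_ch, num_keypoints, 1))

    def forward(self, fused_features):
        # Keypoints whose tag values are close are grouped as one person.
        return self.kpt_head(fused_features), self.tag_head(fused_features)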
3. The vehicle driving environment abnormality monitoring method according to claim 2, wherein the feature extraction module includes a first feature extraction submodule, a second feature extraction submodule, and a third feature extraction submodule, and the attention module includes a channel attention submodule and a spatial attention submodule;
the step of outputting the human body features of the person through the feature extraction module includes:
outputting a first human body feature of a first level through the first feature extraction submodule;
outputting a second human body feature of a second level through the second feature extraction submodule;
outputting a third human body feature of a third level through the third feature extraction submodule, wherein the first level, the second level, and the third level have a progressive relationship;
the step of inputting the human body features into the attention module, and the step of adaptively weighting the human body features through the attention module includes:
performing channel adaptive weighting on the first human body feature, the second human body feature and the third human body feature by the channel attention submodule;
spatially adaptively weighting, by the spatial attention submodule, the first human feature, the second human feature, and the third human feature;
the step of performing feature fusion on the output result of the attention module through the feature fusion module includes:
and performing feature fusion on the output result of the channel attention submodule and the output result of the spatial attention submodule through the feature fusion module.
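One plausible reading of the channel and spatial attention submodules of claim 3 is a CBAM-style reweighting followed by fusion; the sketch below is one such interpretation, with the concatenation-plus-1x1-convolution fusion being an assumption rather than a disclosed design.

import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Hedged sketch: channel-adaptive weighting via global pooling + MLP."""
    def __init__(self, ch, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(ch, ch // reduction), nn.ReLU(inplace=True),
            nn.Linear(ch // reduction, ch), nn.Sigmoid())

    def forward(self, x):
        w = self.fc(x.mean(dim=(2, 3)))          # per-channel weights
        return x * w[:, :, None, None]

class SpatialAttention(nn.Module):
    """Hedged sketch: spatial-adaptive weighting from pooled channel maps."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        pooled = torch.cat([x.mean(1, keepdim=True),
                            x.max(1, keepdim=True)[0]], dim=1)
        return x * torch.sigmoid(self.conv(pooled))

class AttentionFusion(nn.Module):
    """Fuses the two attention outputs; concat + 1x1 conv is an assumption."""
    def __init__(self, ch):
        super().__init__()
        self.ca, self.sa = ChannelAttention(ch), SpatialAttention()
        self.fuse = nn.Conv2d(2 * ch, ch, kernel_size=1)

    def forward(self, x):
        return self.fuse(torch.cat([self.ca(x), self.sa(x)], dim=1))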
4. The vehicle driving environment abnormality monitoring method according to claim 1, wherein the behavior detection model further includes a space-time graph convolution module, and the step of determining the behavior state of the personnel according to the keypoint features and the associative embedding values thereof includes:
obtaining target features according to the keypoint features and the associative embedding values;
inputting the target features into the space-time graph convolution module for processing, and acquiring the behavior state of the personnel output by the space-time graph convolution module; wherein the space-time graph convolution module is used for performing a first graph convolution in the spatial dimension and a second graph convolution in the temporal dimension on the target features, and for determining the behavior state of the personnel according to the convolution result of the first graph convolution and the convolution result of the second graph convolution.
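The space-time graph convolution of claim 4 can be sketched as an ST-GCN-style block: a graph convolution over the skeleton adjacency in the spatial dimension, followed by a convolution along the frame axis in the temporal dimension. The adjacency matrix, kernel sizes, and activation below are illustrative assumptions; stacking such blocks and adding pooling plus a classifier would yield the behavior state.

import torch
import torch.nn as nn

class STGraphConvBlock(nn.Module):
    """Hedged sketch: spatial graph conv over joints, then temporal conv.
    Input x has shape (batch, channels, frames, joints)."""
    def __init__(self, in_ch, out_ch, num_joints, t_kernel=9):
        super().__init__()
        # Placeholder adjacency; in practice it encodes the skeleton graph.
        self.register_buffer("A", torch.eye(num_joints))
        self.spatial = nn.Conv2d(in_ch, out_ch, kernel_size=1)
        self.temporal = nn.Conv2d(out_ch, out_ch,
                                  kernel_size=(t_kernel, 1),
                                  padding=(t_kernel // 2, 0))

    def forward(self, x):
        x = self.spatial(x)                               # feature transform
        x = torch.einsum("nctv,vw->nctw", x, self.A)      # first graph conv (space)
        return torch.relu(self.temporal(x))               # second conv (time)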
5. The vehicle driving environment abnormality monitoring method according to any one of claims 1 to 4, characterized by further comprising:
performing model optimization on the trained behavior detection model according to a preset algorithm to obtain a target behavior detection model;
processing the in-vehicle image by using the target behavior detection model to obtain an abnormality detection result output by the target behavior detection model;
and if it is determined, according to the abnormality detection result output by the target behavior detection model, that abnormal behavior exists in the vehicle, reporting the abnormal behavior to the designated terminal.
6. The vehicle driving environment abnormality monitoring method according to claim 5, wherein the step of performing model optimization on the trained behavior detection model according to a preset algorithm to obtain a target behavior detection model includes:
acquiring a target training sample set;
acquiring a teacher network model and a student network model, wherein the teacher network model is the trained behavior detection model, and the student network model is obtained by pruning the trained behavior detection model according to preset parameters;
and performing model distillation on the teacher network model and the student network model based on a generative adversarial network and the training sample set to obtain the target behavior detection model.
7. A vehicle driving environment abnormality monitoring device characterized by comprising:
the image acquisition unit is used for acquiring an in-vehicle image in real time;
the abnormality detection unit is used for inputting the in-vehicle image into the trained behavior detection model for processing, to obtain an abnormality detection result output by the behavior detection model; wherein the processing of the in-vehicle image by the behavior detection model comprises: acquiring keypoint features of personnel in the in-vehicle image and associative embedding values of the keypoint features, wherein the associative embedding values identify the degree of association between the keypoint features; determining the behavior state of the personnel according to the keypoint features and the associative embedding values thereof; and determining, based on the behavior state, whether abnormal behavior exists in the vehicle;
and the abnormality reporting unit is used for reporting the abnormal behavior to a designated terminal if it is determined according to the abnormality detection result that abnormal behavior exists in the vehicle.
8. The vehicle driving environment abnormality monitoring device according to claim 7, wherein the behavior detection model includes a feature extraction module, an attention module, and a feature fusion module, and further includes a dual-head decoupled structure; the abnormality detection unit is specifically configured to:
inputting the in-vehicle image into the feature extraction module, and outputting the human body features of the personnel through the feature extraction module;
inputting the human body features into the attention module, and carrying out adaptive weighting on the human body features through the attention module;
inputting the output result of the attention module into the feature fusion module, and performing feature fusion on the output result of the attention module through the feature fusion module;
and performing task regression on the output result of the feature fusion module by using the dual-head decoupled structure, to obtain the keypoint features of the personnel and the associative embedding values thereof.
9. The vehicle driving environment abnormality monitoring device according to claim 8, wherein the feature extraction module includes a first feature extraction submodule, a second feature extraction submodule, and a third feature extraction submodule, and the attention module includes a channel attention submodule and a spatial attention submodule;
the step of acquiring the keypoint features of the personnel in the in-vehicle image and the associative embedding values thereof includes the step of outputting the human body features of the personnel through the feature extraction module, which includes:
outputting a first human body feature of a first level through the first feature extraction submodule;
outputting a second human body feature of a second level through the second feature extraction submodule;
outputting a third human body feature of a third level through the third feature extraction submodule, wherein the first level, the second level, and the third level have a progressive relationship;
the step of inputting the human body features into the attention module, and the step of adaptively weighting the human body features through the attention module includes:
performing channel adaptive weighting on the first human body feature, the second human body feature and the third human body feature by the channel attention submodule;
spatially adaptively weighting, by the spatial attention submodule, the first human feature, the second human feature, and the third human feature;
the step of performing feature fusion on the output result of the attention module through the feature fusion module includes:
and performing feature fusion on the output result of the channel attention submodule and the output result of the spatial attention submodule through the feature fusion module.
10. The vehicle driving environment abnormality monitoring device according to claim 7, characterized in that the abnormality detection unit is further configured to:
obtaining target features according to the keypoint features and the associative embedding values;
inputting the target features into the space-time graph convolution module for processing, and acquiring the behavior state of the personnel output by the space-time graph convolution module; wherein the space-time graph convolution module is used for performing a first graph convolution in the spatial dimension and a second graph convolution in the temporal dimension on the target features, and for determining the behavior state of the personnel according to the convolution result of the first graph convolution and the convolution result of the second graph convolution.
11. The vehicle driving environment abnormality monitoring device according to any one of claims 7 to 10, characterized by further comprising:
the model optimization unit is used for carrying out model optimization on the trained behavior detection model according to a preset algorithm to obtain a target behavior detection model;
the abnormality detection unit is further configured to process the in-vehicle image by using the target behavior detection model to obtain an abnormality detection result output by the target behavior detection model;
and the abnormality reporting unit is further configured to report the abnormal behavior to the designated terminal if it is determined, according to the abnormality detection result output by the target behavior detection model, that abnormal behavior exists in the vehicle.
12. The vehicle driving environment abnormality monitoring device according to claim 11, characterized in that the model optimization unit includes:
the sample acquisition module is used for acquiring a target training sample set;
the optimization training module is used for acquiring a teacher network model and a student network model, wherein the teacher network model is the trained behavior detection model, and the student network model is obtained by pruning the trained behavior detection model according to preset parameters;
and the target model generation module is used for performing model distillation on the teacher network model and the student network model based on a generative adversarial network and the training sample set to obtain the target behavior detection model.
13. An electronic device comprising a memory, a processor, and computer readable instructions stored in the memory and executable on the processor, wherein the processor when executing the computer readable instructions performs the steps of:
acquiring an in-vehicle image in real time;
inputting the in-vehicle image into a trained behavior detection model for processing, to obtain an abnormality detection result output by the behavior detection model;
wherein the processing of the in-vehicle image by the behavior detection model comprises: acquiring keypoint features of personnel in the in-vehicle image and associative embedding values of the keypoint features, wherein the associative embedding values identify the degree of association between the keypoint features; determining the behavior state of the personnel according to the keypoint features and the associative embedding values thereof; and determining, based on the behavior state, whether abnormal behavior exists in the vehicle;
and if it is determined according to the abnormality detection result that abnormal behavior exists in the vehicle, reporting the abnormal behavior to a designated terminal.
14. The electronic device of claim 13, wherein the behavior detection model comprises a feature extraction module, an attention module, and a feature fusion module, the behavior detection model further comprising a dual-head decoupled structure;
the step of acquiring the keypoint features of the personnel in the in-vehicle image and the associative embedding values thereof comprises:
inputting the in-vehicle image into the feature extraction module, and outputting the human body features of the personnel through the feature extraction module;
inputting the human body features into the attention module, and carrying out adaptive weighting on the human body features through the attention module;
inputting the output result of the attention module into the feature fusion module, and performing feature fusion on the output result of the attention module through the feature fusion module;
and performing task regression on the output result of the feature fusion module by using the dual-head decoupled structure, to obtain the keypoint features of the personnel and the associative embedding values thereof.
15. The electronic device of claim 13, wherein the behavior detection model further comprises a space-time graph convolution module, and the step of determining the behavior state of the personnel according to the keypoint features and the associative embedding values thereof comprises:
obtaining target features according to the keypoint features and the associative embedding values;
inputting the target features into the space-time graph convolution module for processing, and acquiring the behavior state of the personnel output by the space-time graph convolution module; wherein the space-time graph convolution module is used for performing a first graph convolution in the spatial dimension and a second graph convolution in the temporal dimension on the target features, and for determining the behavior state of the personnel according to the convolution result of the first graph convolution and the convolution result of the second graph convolution.
16. The electronic device of any of claims 13-15, wherein the processor, when executing the computer readable instructions, further performs the steps of:
performing model optimization on the trained behavior detection model according to a preset algorithm to obtain a target behavior detection model;
processing the in-vehicle image by using the target behavior detection model to obtain an abnormality detection result output by the target behavior detection model;
and if it is determined, according to the abnormality detection result output by the target behavior detection model, that abnormal behavior exists in the vehicle, reporting the abnormal behavior to the designated terminal.
17. A computer readable storage medium having computer readable instructions stored thereon which, when executed by a processor, perform the steps of:
acquiring an in-vehicle image in real time;
inputting the in-vehicle image into a trained behavior detection model for processing, to obtain an abnormality detection result output by the behavior detection model;
wherein the processing of the in-vehicle image by the behavior detection model comprises: acquiring keypoint features of personnel in the in-vehicle image and associative embedding values of the keypoint features, wherein the associative embedding values identify the degree of association between the keypoint features; determining the behavior state of the personnel according to the keypoint features and the associative embedding values thereof; and determining, based on the behavior state, whether abnormal behavior exists in the vehicle;
and if it is determined according to the abnormality detection result that abnormal behavior exists in the vehicle, reporting the abnormal behavior to a designated terminal.
18. The computer-readable storage medium of claim 17, wherein the behavior detection model comprises a feature extraction module, an attention module, and a feature fusion module, the behavior detection model further comprising a dual-head decoupled structure;
the step of acquiring the keypoint features of the personnel in the in-vehicle image and the associative embedding values thereof comprises:
inputting the in-vehicle image into the feature extraction module, and outputting the human body features of the personnel through the feature extraction module;
inputting the human body features into the attention module, and carrying out adaptive weighting on the human body features through the attention module;
inputting the output result of the attention module into the feature fusion module, and performing feature fusion on the output result of the attention module through the feature fusion module;
and performing task regression on the output result of the feature fusion module by using the dual-head decoupled structure, to obtain the keypoint features of the personnel and the associative embedding values thereof.
19. The computer-readable storage medium of claim 17, wherein the behavior detection model further comprises a space-time graph convolution module, and the step of determining the behavior state of the personnel according to the keypoint features and the associative embedding values thereof comprises:
obtaining target features according to the keypoint features and the associative embedding values;
inputting the target features into the space-time graph convolution module for processing, and acquiring the behavior state of the personnel output by the space-time graph convolution module; wherein the space-time graph convolution module is used for performing a first graph convolution in the spatial dimension and a second graph convolution in the temporal dimension on the target features, and for determining the behavior state of the personnel according to the convolution result of the first graph convolution and the convolution result of the second graph convolution.
20. The computer readable storage medium of any of claims 17 to 19, wherein the computer readable instructions, when executed by the processor, further perform the steps of:
performing model optimization on the trained behavior detection model according to a preset algorithm to obtain a target behavior detection model;
processing the in-vehicle image by using the target behavior detection model to obtain an abnormality detection result output by the target behavior detection model;
and if it is determined, according to the abnormality detection result output by the target behavior detection model, that abnormal behavior exists in the vehicle, reporting the abnormal behavior to the designated terminal.
CN202180000757.9A 2021-04-09 2021-04-09 Vehicle driving environment abnormity monitoring method and device, electronic equipment and storage medium Pending CN113287120A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/086050 WO2022213336A1 (en) 2021-04-09 2021-04-09 Vehicle driving environment abnormality monitoring method and apparatus, electronic device, and storage medium

Publications (1)

Publication Number Publication Date
CN113287120A 2021-08-20

Family

ID=77281369

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202180000757.9A Pending CN113287120A (en) 2021-04-09 2021-04-09 Vehicle driving environment abnormity monitoring method and device, electronic equipment and storage medium

Country Status (2)

Country Link
CN (1) CN113287120A (en)
WO (1) WO2022213336A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114005105A (en) * 2021-12-30 2022-02-01 青岛以萨数据技术有限公司 Driving behavior detection method and device and electronic equipment
CN116704666A (en) * 2023-06-21 2023-09-05 合肥中科类脑智能技术有限公司 Vending method, computer readable storage medium, and vending machine

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115829584A (en) * 2022-12-02 2023-03-21 首约科技(北京)有限公司 Method and device for determining floating point, electronic equipment and storage medium
CN117152964B (en) * 2023-11-01 2024-02-02 宁波宁工交通工程设计咨询有限公司 Urban road information intelligent acquisition method based on traveling vehicles

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109522793A (en) * 2018-10-10 2019-03-26 华南理工大学 More people's unusual checkings and recognition methods based on machine vision
CN111027478A (en) * 2019-12-10 2020-04-17 青岛农业大学 Driver and passenger behavior analysis and early warning system based on deep learning
CN111681454A (en) * 2020-06-03 2020-09-18 重庆邮电大学 Vehicle-vehicle cooperative anti-collision early warning method based on driving behaviors
CN111860254A (en) * 2020-07-10 2020-10-30 东莞正扬电子机械有限公司 Driver abnormal behavior detection method and device, storage medium and equipment
CN112613441A (en) * 2020-12-29 2021-04-06 新疆爱华盈通信息技术有限公司 Abnormal driving behavior recognition and early warning method and electronic equipment


Also Published As

Publication number Publication date
WO2022213336A1 (en) 2022-10-13


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination