CN111860316A - Driving behavior recognition method and device and storage medium - Google Patents

Info

Publication number: CN111860316A (application CN202010698251.7A; granted as CN111860316B)
Authority: CN (China)
Prior art keywords: image, model, training, sub, classification
Legal status: Granted; active (the listed status is an assumption, not a legal conclusion)
Inventors: 潘兵, 金忠孝
Original and current assignee: SAIC Motor Corp Ltd
Other languages: Chinese (zh)
Application filed by SAIC Motor Corp Ltd; priority to CN202010698251.7A

Classifications

    • G06V 20/597 — Recognising the driver's state or behaviour, e.g. attention or drowsiness (G06V 20/59: context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions)
    • G06F 18/214 — Generating training patterns; bootstrap methods, e.g. bagging or boosting (G06F 18/21: design or setup of recognition systems or techniques)
    • G06F 18/24 — Classification techniques (G06F 18/20: analysing; pattern recognition)

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a driving behavior recognition method, device, and storage medium. The key part of the image to be analyzed carries the image's important information, so the method analyzes that information directly and achieves higher recognition accuracy. In addition, the image cropping sub-model cuts away non-key information, avoiding the interference it would otherwise cause in driving behavior recognition and further improving its accuracy.

Description

Driving behavior recognition method and device and storage medium
Technical Field
The present invention relates to the field of image processing, and more particularly, to a method and an apparatus for identifying driving behavior, and a storage medium.
Background
While driving, a driver may engage in dangerous behaviors such as smoking, making phone calls, or drinking. These behaviors divert the driver's attention and can easily cause traffic accidents. It is therefore necessary to analyze the driver's behavior during driving and issue a timely warning when dangerous behavior occurs, so as to improve vehicle driving safety.
In the prior art, driving behavior is typically analyzed by collecting a picture of the driver and inputting it into a pre-constructed classification model that outputs the driver's behavior. However, this way of determining driving behavior has low accuracy, and warning operations performed based on the determined behavior are correspondingly unreliable.
Disclosure of Invention
In view of the above, the present invention provides a driving behavior recognition method, apparatus, and storage medium to address the prior art's low accuracy in determining driving behavior and, consequently, in the warning operations performed based on that determination.
In order to solve the technical problems, the invention adopts the following technical scheme:
a method of identifying driving behavior, comprising:
acquiring an image to be analyzed;
calling a pre-generated data processing model to process the image to be analyzed; the data processing model comprises an image cutting sub-model and an image classification sub-model; the image cutting sub-model is used for identifying key information in the image to be analyzed and cutting the image to be analyzed to obtain a target image comprising the key information in the image to be analyzed; the image classification submodel is used for carrying out driving behavior identification operation on the target image; the data processing model is obtained by training in a preset model training mode based on training samples; the training samples comprise image samples marked with driving behavior categories in advance;
and obtaining the image processing result of the image to be analyzed produced by the data processing model.
Optionally, the training process of the data processing model includes:
acquiring a training sample, an image cropping sub-initial model and an image classification sub-initial model; the training samples comprise image samples marked with driving behavior categories in advance;
respectively training the image cropping sub-initial model and the image classification sub-initial model by using the training samples in a preset model training mode to obtain an image cropping sub-model and an image classification sub-model;
determining the set of image cropping submodels and image classification submodels as the data processing model.
Optionally, a preset model training mode is adopted, the training samples are used to respectively train the image cropping sub-initial model and the image classification sub-initial model, and an image cropping sub-model and an image classification sub-model are obtained, including:
training a pre-acquired image classification reference model by using the training sample, and stopping when the trained image classification reference model meets corresponding preset training stopping conditions;
calling the image cropping sub-initial model to identify key information in each image sample in the training samples, and cropping each image sample to obtain an image to be processed that includes the key information of that image sample;
combining the images to be processed corresponding to each image sample to obtain an image set to be processed, and training the image classification sub-initial model by using the image set to be processed until the trained image classification sub-initial model meets the corresponding preset training stopping condition;
based on the difference between the classification precision of the trained image classification reference model and the classification precision of the image classification sub-initial model, training the image cutting sub-initial model by using the training sample, and stopping when the trained image cutting sub-initial model meets the corresponding preset training stopping condition;
returning to the step of calling the image cutting sub-initial model to identify the key information in each image sample in the training samples, and sequentially executing until the change rates of the respective loss functions of the image cutting sub-initial model and the image classification sub-initial model obtained by training are smaller than a preset threshold value;
and determining the trained image cropping sub-initial model as the image cropping sub-model, and the trained image classification sub-initial model as the image classification sub-model.
Optionally, acquiring an image to be analyzed includes:
acquiring an infrared image obtained by acquiring an image of a target area of a driver;
carrying out data enhancement and standardization processing on the infrared image to obtain an image to be analyzed; the data enhancement includes image cropping.
Optionally, after the obtaining of the image processing result of the image to be analyzed obtained by the data processing model, the method further includes:
and outputting preset warning information to the terminal equipment of the user under the condition that the image processing result is any one of preset dangerous behaviors.
A driving behavior recognition apparatus comprising:
the image acquisition module is used for acquiring an image to be analyzed;
the image processing module is used for calling a pre-generated data processing model to process the image to be analyzed; the data processing model comprises an image cutting sub-model and an image classification sub-model; the image cutting sub-model is used for identifying key information in the image to be analyzed and cutting the image to be analyzed to obtain a target image comprising the key information in the image to be analyzed; the image classification submodel is used for carrying out driving behavior identification operation on the target image; the data processing model is obtained by training in a preset model training mode based on training samples; the training samples comprise image samples marked with driving behavior categories in advance;
and the processing result acquisition module is used for acquiring the image processing result of the image to be analyzed obtained by the data processing model.
Optionally, a model training module is further included, and the model training module includes:
the data acquisition submodule is used for acquiring a training sample, an image cutting sub-initial model and an image classification sub-initial model; the training samples comprise image samples marked with driving behavior categories in advance;
the training submodule is used for respectively training the image cutting sub-initial model and the image classification sub-initial model by using the training sample in a preset model training mode to obtain an image cutting sub-model and an image classification sub-model;
a combination sub-module for determining the set of image cropping sub-models and image classification sub-models as the data processing model.
Optionally, the training submodule includes:
the first training unit is used for training a pre-acquired image classification reference model by using the training sample and stopping when the trained image classification reference model meets the corresponding preset training stopping condition;
the cutting processing unit is used for calling the image cutting sub-initial model to identify key information in each image sample in the training sample, and cutting each image sample to obtain an image to be processed including the key information in the image sample;
the second training unit is used for combining the images to be processed corresponding to each image sample to obtain an image set to be processed, and training the image classification sub-initial model with that set until the trained image classification sub-initial model meets the corresponding preset training stopping condition;
the third training unit is used for training the image cropping sub-initial model by using the training sample based on the difference between the classification precision of the trained image classification reference model and the classification precision of the image classification sub-initial model, and stopping when the trained image cropping sub-initial model meets the corresponding preset training stopping condition;
the judging unit is used for judging whether the change rates of respective loss functions of the image cutting sub-initial model and the image classification sub-initial model obtained through training are smaller than a preset threshold value or not;
the cutting processing unit is further configured to call the image cutting sub-initial model to identify key information in each image sample in the training samples under the condition that the judging unit judges that the change rates of the respective loss functions of the image cutting sub-initial model and the image classification sub-initial model obtained through training are not both smaller than a preset threshold;
and the model determining unit is used for determining the trained image cropping sub-initial model as the image cropping sub-model and the trained image classification sub-initial model as the image classification sub-model, when the judging unit judges that the change rates of the respective loss functions of both trained models are smaller than the preset threshold.
Optionally, the image acquisition module is specifically configured to:
acquiring an infrared image obtained by acquiring an image of a target area of a driver, and performing data enhancement and standardization processing on the infrared image to obtain an image to be analyzed; the data enhancement includes image cropping.
A storage medium comprising a stored program, wherein the program, when executed, controls a device on which the storage medium is located to perform the above-described method of identifying driving behavior.
Compared with the prior art, the invention has the following beneficial effects:
the invention provides a driving behavior recognition method, device, and storage medium. The key part of the image to be analyzed is extracted and analyzed on its own; because it carries the image's important information, the invention analyzes that information directly and achieves higher recognition accuracy. In addition, the image cropping sub-model cuts away non-key information, avoiding the interference it would otherwise cause and improving the accuracy of driving behavior recognition.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 is a flowchart of a method for identifying driving behaviors according to an embodiment of the present invention;
FIG. 2 is a flowchart of a method for training a data processing model according to an embodiment of the present invention;
FIG. 3 is a scene diagram illustrating a training process of a model for recognizing driving behavior according to an embodiment of the present invention;
FIG. 4 is a flowchart of a method for training a data processing model according to another embodiment of the present invention;
FIG. 5 is a scene diagram illustrating a training process of another driving behavior recognition model according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of a driving behavior recognition apparatus according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
When analyzing driving behavior, a picture of the driver is usually collected and input into a pre-constructed classification model to obtain the driver's behavior.
At present, the mainstream solution uses computer vision: the driver image captured by a camera is input into a trained model for judgment. These schemes fall into two categories. The first is the classification scheme: the image is input into a classification model, which directly outputs the driver's behavior category. A light, fast classification model processes images quickly, but its small parameter count and shallow depth make it hard to distinguish small differences in the image. For example, if the driver makes the same gesture as smoking but holds no cigarette, only a few pixels distinguish that image from a real smoking image, and it is very easily misrecognized as smoking. A complex model with many parameters recognizes accurately but processes images slowly, making real-time detection on in-vehicle devices difficult.
The second is the detection scheme: the image is input into a detection model that locates the driver and then classifies the local image containing the driver, or directly detects whether objects such as cigarettes and mobile phones are present. Models used in this scheme process images more slowly than classification models of comparable size, and training depends on a dataset annotated with detection-target positions, which is costly.
Existing computer-vision schemes thus struggle to balance recognition speed and recognition accuracy, and their judgment results are easily affected by illumination conditions and by changes in the distance between the camera and the driver. That is, the prior-art way of determining driving behavior has low accuracy, and warning operations based on it are correspondingly unreliable.
To address the problems above of slow recognition, low accuracy, and sensitivity to illumination and to camera-to-driver distance, the inventors found that a lightweight classification model can be used if, instead of analyzing the whole image directly, an important feature region is cropped out and enlarged to the original size before being input to the model. Features that are important but small relative to the whole image (e.g., a cigarette during smoking) then become easier to recognize, so accuracy is high, and skipping the non-important regions speeds up recognition. Moreover, because the important feature region (generally the area around the driver's head) is cropped and enlarged to the original image size, changes in the distance between the camera and the driver have little effect on the model: the varying size of the driver in the original picture no longer degrades accuracy. Finally, since the scheme classifies the picture using only the important feature region, surrounding environment information in the rest of the picture does not influence the model, so environmental changes such as illumination conditions have little effect.
On the basis of the above, an embodiment of the present invention provides a method for identifying a driving behavior, which may be applied to a vehicle controller or an external controller in communication with the vehicle controller, and referring to fig. 1, the method for identifying a driving behavior may include:
and S11, acquiring an image to be analyzed.
In practical applications, the camera on the vehicle may be an infrared camera used to photograph a target area of the driver, such as the upper half of the body, producing an infrared image. Using an infrared camera keeps images collected under various illumination conditions clear; with an ordinary camera, images collected in poorly lit scenes such as at night are very blurry, which degrades model recognition accuracy.
After the infrared image is obtained, data enhancement and normalization are applied to it, yielding the image to be analyzed.
The data enhancement includes translation, horizontal flipping, chromaticity adjustment, contrast adjustment, histogram equalization, and a random crop-and-scale method, where random crop-and-scale cuts a region of 0.8-1.0 times the original picture's area from a random position in the picture. The normalization method scales the picture to a fixed size of 240x240, then subtracts from each channel's pixel values the mean of the pixel values over all channels, and divides by the variance of the pixel values over all channels.
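As an illustrative sketch (the patent publishes no code), the normalization step can be written in plain Python. Resizing to 240x240 is assumed to happen beforehand; the function below only performs the mean subtraction and variance division described above:

```python
def normalize(pixels):
    """Normalize a flat list of pixel values taken across all channels:
    subtract the mean of all values, then divide by their variance,
    as the text specifies (note: variance, not standard deviation)."""
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((p - mean) ** 2 for p in pixels) / n
    if var == 0:
        return [0.0] * n  # constant image: nothing to scale
    return [(p - mean) / var for p in pixels]
```

Dividing by the variance rather than the standard deviation follows the text literally; many pipelines divide by the standard deviation instead.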
And S12, calling a pre-generated data processing model to process the image to be analyzed.
Specifically, in practical application, a data processing model (i.e., a pre-trained data processing model) is obtained through pre-training, and the data processing model comprises an image cropping sub-model and an image classification sub-model; the image cutting sub-model is used for identifying key information in the image to be analyzed and cutting the image to be analyzed to obtain a target image comprising the key information in the image to be analyzed; the image classification submodel is used for carrying out driving behavior identification operation on the target image.
The data processing model is obtained by training in a preset model training mode based on training samples; the training samples include image samples pre-labeled with driving behavior categories.
The image to be analyzed is processed by the data processing model to obtain, for each driving behavior category, the probability that the image belongs to it; the category with the highest probability is output. The driving behavior categories include normal driving, drinking water while driving, smoking while driving, and making a call while driving.
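The decision rule here — output the category with the highest probability — can be sketched as follows. The class list order and function names are illustrative, and the softmax converting raw network outputs into probabilities is an assumption about how the probability values are produced:

```python
import math

# Illustrative class list in the order given above; the actual index
# mapping used by the patent's model is not published.
DRIVING_CLASSES = ["normal driving", "drinking water while driving",
                   "smoking while driving", "making a call while driving"]

def softmax(logits):
    """Convert raw network outputs into probabilities (assumed mechanism)."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(logits):
    """Return the driving category with the highest probability, and that probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=lambda i: probs[i])
    return DRIVING_CLASSES[best], probs[best]
```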
And S13, obtaining an image processing result of the image to be analyzed obtained by the data processing model.
After the pre-generated data processing model is called to process the image to be analyzed, it produces an image processing result, which the controller in this embodiment then acquires. If the result is any of the preset dangerous behaviors (drinking water while driving, smoking while driving, making a call while driving), warning information is output to the driver's terminal equipment, e.g. voice or text such as "please drive safely" sent to the driver's mobile phone, computer, or similar device.
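The dangerous-behavior check and warning output described above can be sketched as follows; the behavior names and message text are illustrative, since the patent only says preset warning information is sent to the user's terminal equipment:

```python
# Hypothetical names: the set of dangerous behaviors as listed in the text.
DANGEROUS_BEHAVIORS = {"drinking water while driving",
                       "smoking while driving",
                       "making a call while driving"}

def warning_message(image_processing_result):
    """Return a warning string when the recognized behavior is one of the
    preset dangerous behaviors, otherwise None (no warning is issued)."""
    if image_processing_result in DANGEROUS_BEHAVIORS:
        return f"Warning: detected '{image_processing_result}', please drive safely."
    return None
```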
In this embodiment, to recognize the driver's behavior, the image to be analyzed is first obtained; the image cropping sub-model in the data processing model then crops it to obtain the target image, and the image classification sub-model analyzes the target image to produce the driving behavior result. The key part of the image to be analyzed is extracted and analyzed on its own; because it carries the image's important information, the invention analyzes that information directly and achieves higher recognition accuracy. In addition, the image cropping sub-model cuts away non-key information, avoiding the interference it would otherwise cause and improving the accuracy of driving behavior recognition.
The above-described driving behavior recognition is performed using a data processing model, and a training process of the data processing model is now described. Specifically, referring to fig. 2, the training process of the data processing model includes:
and S21, acquiring a training sample, an image cropping sub-initial model and an image classification sub-initial model.
The training samples in this embodiment are divided into a training set and a validation set, both of which contain image samples pre-labeled with driving behavior categories. The image samples include pictures of drivers in different vehicle types driving normally, smoking, drinking, and making calls at different times (different times correspond to different illumination), with a similar number of pictures per category, taken from two different angles. Training pictures must undergo data enhancement and normalization before being input to the model; both processes are as described above. This embodiment uses pictures captured under varying conditions (different illumination, camera angles, and distances) together with multiple data enhancement methods to simulate the variation encountered in practice. The training samples may also be images collected with an infrared camera, so that images remain clear under various illumination conditions, further improving model recognition accuracy.
The image cropping sub-initial model and the image classification sub-initial model are generated in advance. The image cropping sub-initial model comprises a crop identification module and an image cropping module. The crop identification module outputs the key region of an image sample, i.e., the region containing the key information, represented by four crop parameters tx, ty, tw, and th, which are respectively the x pixel coordinate and y pixel coordinate of the region's center point, its width, and its height. The image cropping module crops from the image sample a target image centered at (tx, ty) with width tw and height th; the cropped image is then enlarged to 240x240 and normalized. The normalized image is input to the image classification sub-initial model, which performs the driving behavior recognition operation on the target image to obtain the driving behavior.
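Following the parameter definition above (tx, ty as the key region's center pixel coordinates, tw its width, th its height), converting predicted crop parameters into pixel bounds can be sketched as below. This is a hypothetical helper, not the patent's code; clamping to the image bounds is an added practical safeguard:

```python
def crop_box(tx, ty, tw, th, img_w, img_h):
    """Convert crop parameters (center x, center y, width, height) into
    integer pixel bounds (left, top, right, bottom), clamped to the image."""
    left = max(0, int(round(tx - tw / 2)))
    top = max(0, int(round(ty - th / 2)))
    right = min(img_w, int(round(tx + tw / 2)))
    bottom = min(img_h, int(round(ty + th / 2)))
    return left, top, right, bottom
```

The resulting box would then be cut from the image sample and the crop enlarged to 240x240 before classification.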
And S22, training the image cropping sub-initial model and the image classification sub-initial model respectively by using the training samples in a preset model training mode to obtain an image cropping sub-model and an image classification sub-model.
In this embodiment, the parameter values of the model parameters in the image cropping sub-initial model are fixed first, the training sample is used to train the image classification sub-initial model, after the training is completed, the parameter values of the model parameters in the image classification sub-initial model are fixed, the training sample is used to train the image cropping sub-initial model, and the above operations are repeated until the corresponding training stopping conditions are met.
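A minimal sketch of this alternating schedule, using the relative loss-change stopping rule described elsewhere in the document. The two step callables stand in for full training phases (each run with the other model's parameters fixed), and all names and the threshold default are illustrative:

```python
def alternating_train(train_crop_step, train_cls_step, threshold=0.01, max_rounds=100):
    """Alternately train the cropping and classification sub-models.

    train_crop_step / train_cls_step each run one training phase with the
    other model's parameters frozen and return that phase's final loss.
    Training stops once the relative change of BOTH losses between
    successive rounds falls below `threshold`.
    """
    prev_crop = prev_cls = None
    crop_loss = cls_loss = float("inf")
    for _ in range(max_rounds):
        cls_loss = train_cls_step()    # crop model parameters fixed
        crop_loss = train_crop_step()  # classification model parameters fixed
        if prev_crop is not None:
            crop_rate = abs(crop_loss - prev_crop) / max(abs(prev_crop), 1e-12)
            cls_rate = abs(cls_loss - prev_cls) / max(abs(prev_cls), 1e-12)
            if crop_rate < threshold and cls_rate < threshold:
                break  # both loss change rates below the preset threshold
        prev_crop, prev_cls = crop_loss, cls_loss
    return crop_loss, cls_loss
```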
Referring to fig. 3, fig. 3 shows a connection relationship between models, a training sample is input into an image cropping sub initial model to perform a cropping operation, a cropped image is input into an image classification sub initial model, and the cropped image is used to perform a training operation on the image classification sub initial model. Meanwhile, the training samples are directly input into the image classification reference model, and the training operation is carried out on the image classification sub-initial model. After the training of the image classification reference model and the image classification sub-initial model is finished, the classification precision and the training sample of the image classification reference model and the image classification sub-initial model are used for carrying out training operation on the image cutting sub-initial model until the training of the image cutting sub-initial model is finished. And repeating the steps until the training is finished.
The structure of the initial model of the image clipper in the above embodiment is as follows:
one 3 x 3 convolution layer, step 2, input channel 3, output channel 16.
The block convolution module comprises a 3 x 3 block convolution layer, a step size 2, an input channel 16, an output channel 16, a block number 16, a BN layer, a Relu activation function layer, a 1 x 1 convolution layer, a step size 1, an input channel 16 and an output channel 24.
The block convolution module comprises a 3 x 3 block convolution layer, a step 2, an input channel 24, an output channel 24, a block number 24, a BN layer, a Relu activation function layer, a 1 x 1 convolution layer, a step 1, an input channel 24 and an output channel 40.
The block convolution module comprises a 3 x 3 block convolution layer, a step size of 2, an input channel 40, an output channel 40, a block number of 40, a BN layer, a Relu activation function layer, a 1 x 1 convolution layer, a step size of 1, an input channel 40 and an output channel 64.
The block convolution module comprises a 3 x 3 block convolution layer, a step size of 2, an input channel 64, an output channel 64, a block number of 64, a BN layer, a Relu activation function layer, a 1 x 1 convolution layer, a step size of 1, an input channel 64 and an output channel 80.
The fully-connected module comprises a fully-connected layer, a Relu activation function layer, a fully-connected layer and 4 output neurons which respectively represent tx, ty, tw and th.
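The channel progression of this sub-model (3, 16, 24, 40, 64, 80) and its spatial down-sampling can be traced numerically. The padding of the 3x3 layers is not stated in the text; padding 1 is assumed below, under which a 240x240 input shrinks by roughly half at each of the five stride-2 3x3 layers (the 1x1 layers, stride 1, keep the size unchanged):

```python
def conv_out(n, k=3, s=2, p=1):
    """Spatial output size of a convolution over an n x n input
    (padding p=1 is an assumption; the text does not state it)."""
    return (n + 2 * p - k) // s + 1

def trace_crop_model(size=240):
    """Trace the spatial size through the initial stride-2 3x3 convolution
    plus the four stride-2 3x3 layers of the convolution modules."""
    sizes = [size]
    for _ in range(5):
        sizes.append(conv_out(sizes[-1]))
    return sizes

# Channel progression taken from the layer list above.
CHANNELS = [3, 16, 24, 40, 64, 80]
```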
In order to implement the foregoing solution, in another implementation manner of the present invention, a specific implementation procedure of step S22 is provided, and referring to fig. 4, step S22 may include:
and S31, training the pre-acquired image classification reference model by using the training sample, and stopping when the trained image classification reference model meets the corresponding preset training stopping condition.
In practical applications, an image classification reference model is set up in the invention to compare identification accuracies: it demonstrates that the driving-behavior recognition accuracy on images cropped from the image samples by the image cropping sub-initial model is higher than the accuracy on the complete image samples without any cropping operation.
The image classification reference model may have the same internal architecture as the image classification sub-initial model in the present invention; for example, a lightweight and fast classification network such as MobileNet_v3 or EfficientNet_b0 may be selected, and in this embodiment EfficientNet_b0 is selected. EfficientNet_b0 can be built on the PyTorch framework; this embodiment selects a lightweight and fast classification network to improve classification speed.
After the classification network types of the image classification reference model and the image classification sub-initial model are determined, training the image classification reference model by using a training sample, wherein the loss function of the training in the step is as follows:
p_i = exp(x_i[class_i]) / Σ_{j=1}^{n} exp(x_i[j])
L_1 = -(1/m) Σ_{i=1}^{m} log p_i
where m is the number of training samples, x_i is the output of the network on the i-th training sample (a 1 × n vector), x_i[j] is the j-th entry of that vector, and class_i is the true class index of the i-th training sample. Classification accuracy is improved by minimizing this loss function, and training is stopped when the loss function falls below a preset threshold.
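The formulas above are the standard softmax cross-entropy loss; a minimal dependency-free sketch:

```python
import math

def cross_entropy_loss(outputs, classes):
    """L1 = -(1/m) * sum_i log(softmax(x_i)[class_i]).

    outputs: list of m score vectors (each of length n);
    classes: list of m true class indices.
    """
    m = len(outputs)
    total = 0.0
    for x, c in zip(outputs, classes):
        denom = sum(math.exp(v) for v in x)        # softmax denominator
        total += math.log(math.exp(x[c]) / denom)  # log-probability of true class
    return -total / m
```

For two equally scored classes the loss is log 2 ≈ 0.693; a confident correct prediction drives it toward zero.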
And S32, calling the image cropping sub-initial model to identify the key information in each image sample in the training samples, and cropping from each image sample the image to be processed that includes the key information in that sample.
In the embodiment of the invention, when the image classification sub-initial model is trained, the training samples are not used directly; instead, the image cropping sub-initial model first performs a cropping operation on the image samples. Before cropping, the key information in each image sample is identified, and then an image to be processed that includes that key information is cropped from each image sample.
When the image cropping sub-initial model is first used, the parameter values of its cropping parameters are preset: tx and ty are each 0.5, and tw and th are each half of 240, i.e. 120. That is, a region half the size of the original image is cropped, with the center of the image sample as the cropping center point. After the cropping is completed, the cropped image is enlarged to 240 × 240.
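Assuming a 240 × 240 input, (tx, ty) as the normalized crop center and (tw, th) as the crop window size in pixels, the crop-and-enlarge step might look like this (nearest-neighbour rescaling keeps the sketch dependency-free; a real pipeline would likely use bilinear interpolation):

```python
import numpy as np

def crop_and_enlarge(img, tx=0.5, ty=0.5, tw=120, th=120, out=240):
    """Crop a tw x th window centred at (tx*W, ty*H), then rescale to out x out."""
    H, W = img.shape[:2]
    cx, cy = int(tx * W), int(ty * H)
    x0 = max(cx - tw // 2, 0); x1 = min(x0 + tw, W)
    y0 = max(cy - th // 2, 0); y1 = min(y0 + th, H)
    patch = img[y0:y1, x0:x1]
    # nearest-neighbour enlargement via integer index maps
    ys = np.arange(out) * patch.shape[0] // out
    xs = np.arange(out) * patch.shape[1] // out
    return patch[ys][:, xs]
```

With the default parameters this cuts the central 120 × 120 quarter of a 240 × 240 sample and blows it back up to 240 × 240, matching the preset values described above.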
And S33, combining the images to be processed corresponding to each image sample to obtain an image set to be processed, and training the image classification sub-initial model by using the image set to be processed until the trained image classification sub-initial model meets the corresponding preset training stopping condition.
The images to be processed obtained by cropping the image samples are combined into an image set to be processed, and the image classification sub-initial model is then trained with this set. The loss function used here is the same as that of the image classification reference model, and training stops when the loss function falls below the preset threshold.
It should be noted that, while the image classification sub-initial model is trained, the parameter values of the model parameters (including the cropping parameters) in the image cropping sub-initial model remain unchanged.
In addition, the image classification sub-initial model and the image classification reference model in this embodiment may be obtained by training using an ImageNet data set, that is, performing initialization processing on the model parameters of the two models using the ImageNet data set.
And S34, training the image cropping sub-initial model by using the training sample based on the difference between the classification precision of the trained image classification reference model and the classification precision of the image classification sub-initial model, and stopping when the trained image cropping sub-initial model meets the corresponding preset training stopping condition.
After the training of the image classification sub-initial model is finished, the image cropping sub-initial model is trained. The corresponding loss function is L2 = p1 − p2 + q‖w‖₂², where p1 is the classification accuracy of the image classification reference model, p2 is the classification accuracy of the image classification sub-initial model, q is a manually set regularization parameter which may be 0.001–0.005, and w is the vector formed by all the weight parameters to be trained in the image cropping sub-initial model; ‖w‖₂² is a regularization term equal to the sum of the squares of those weight parameters.
When the loss function is minimized, since q‖w‖₂² is positive, p1 − p2 must be made as small as possible, i.e. p2 > p1 must be guaranteed: the part of the original image cropped out by the image cropping sub-initial model, after standardization, yields a higher classification accuracy when input into the image classification sub-initial model than the standardized original image yields when input directly into the image classification reference model. In addition, the introduced regularization term avoids model overfitting.
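The cropper's loss, as reconstructed above, is just the accuracy gap plus an L2 penalty; a one-function sketch:

```python
def cropper_loss(p1, p2, weights, q=0.001):
    """L2 = p1 - p2 + q * ||w||_2^2.

    p1: accuracy of the reference model on uncropped input;
    p2: accuracy of the classification sub-model on cropped input;
    weights: trainable weight parameters of the cropping sub-model.
    """
    reg = sum(w * w for w in weights)  # squared L2 norm
    return p1 - p2 + q * reg
```

The loss is negative exactly when the cropped-input accuracy beats the uncropped-input accuracy by more than the regularization penalty, which is what minimizing it pushes toward.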
When the image cropping sub-initial model is trained, the model parameters of the image classification sub-initial model are kept unchanged.
In model training, if data is input in batches, for example 64 training samples at a time, the classification accuracy of the trained image classification reference model or of the image classification sub-initial model may be taken as the average classification accuracy over the 64 samples, and the difference between the two models is then computed from these averages.
S35, judging whether the change rates of respective loss functions of the trained image cutting sub-initial model and the trained image classification sub-initial model are smaller than a preset threshold value; if yes, go to step S36; if not, returning to execute the step S32 until the change rates of the loss functions of the trained image cropping sub-initial model and the trained image classification sub-initial model are both smaller than the preset threshold value.
In practical application, the above steps are repeated until the fluctuation of the loss functions of the image cropping sub-initial model and the image classification sub-initial model is less than 0.01 percent; the two loss functions are then considered to have converged, the training is finished, and the models are saved.
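The 0.01 percent fluctuation test might be implemented as a relative-change check on consecutive loss values; the exact bookkeeping is not specified in the text, so this is one plausible sketch:

```python
def losses_converged(prev_losses, curr_losses, tol=1e-4):
    """True when every loss changed by less than tol (0.01%) relative to its previous value."""
    for prev, curr in zip(prev_losses, curr_losses):
        if prev == 0.0:
            if curr != 0.0:
                return False
        elif abs(curr - prev) / abs(prev) >= tol:
            return False
    return True
```

Passing the cropper and classifier losses from consecutive rounds yields the stopping test of step S35.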
And S36, determining the trained image cropping sub-initial model as an image cropping sub-model, and determining the trained image classification sub-initial model as an image classification sub-model.
And S23, determining the set of the image cropping sub-model and the image classification sub-model as the data processing model.
Specifically, the image clipping sub-model and the image classification sub-model are connected in sequence to obtain a data processing model.
In the above embodiment, the data processing model and the image classification reference model are independent of each other. In practical applications, to reduce the size of the saved network model, the image cropping sub-model in the data processing model may reuse part of the network of the image classification reference model. Specifically, referring to fig. 5, the entire model architecture includes two sub-networks; the first sub-network includes the image classification reference model and the image cropping sub-model. The image classification reference model classifies the input image; a lightweight and fast classification network such as MobileNet_v3 or EfficientNet_b0 is selected, and in this embodiment EfficientNet_b0 is used. The image classification reference model comprises a first classification module containing convolutional layers, BN (Batch Normalization) layers and activation layers, and a second classification module containing fully-connected layers; the classification result is obtained through these two modules. The image cropping sub-model reuses the first classification module of the image classification reference model, so its own internal structure can be simplified, for example by reducing the number of convolutional layers. The image classification sub-model, like the image classification reference model, is divided into two classification modules. In this embodiment, the first classification module of the image classification reference model, the image cropping sub-model and the image classification sub-model constitute the data processing model of this embodiment.
The input of the image classification reference model is a picture after standardization and image-enhancement processing, and its output is the driver behavior probability (this probability is used only for training; the output of the image classification sub-model is the final recognition result). The image cropping sub-model outputs the coordinates of the attention region, i.e. the local image region the model is interested in; this region is cropped out and standardized as the input of the second sub-network. The image cropping sub-model can be realized by two fully-connected layers: the input is the output of the last convolutional layer of the image classification reference model, and the second fully-connected layer has 4 neurons, tx, ty, tw and th, which respectively represent the x pixel coordinate and y pixel coordinate of the attention region's center point, and the width and height of the attention region. The second sub-network contains only the image classification sub-model, which classifies the input image; a lightweight and fast classification network such as MobileNet_v3 or EfficientNet_b0 is selected, and in this embodiment EfficientNet_b0 is used. The image classification reference model of the first sub-network is used only for network model training and is not needed when recognizing driver behavior; the result output by the model is the classification result of the classification module of the second sub-network, which is one of normal driving, smoking, making a call and drinking water.
When the driving behavior is identified, the identification result is displayed, and if the model identifies that the driver is smoking, making a call or drinking water, an early warning is issued.
It should be noted that, with the architecture of fig. 3, the image classification reference model need be trained only once; with the architecture of fig. 5, since the image cropping sub-model reuses a classification module of the image classification reference model, the reference model must also be trained repeatedly, like the image classification sub-model. Specifically, when the image classification sub-model is trained, the image classification reference model is trained at the same time.
In this embodiment, because images shot at night by an ordinary in-vehicle camera are blurred, which affects the model's recognition of driver behavior, an infrared camera is used so that clear images can be obtained under any illumination condition. The training set contains driver pictures under various lighting conditions such as daytime and night, taken from two different angles, so the training-set pictures are diverse. Various data enhancements are applied before the images are input for model training, including random cropping and scaling, so that the size and position of the driver in the image are no longer fixed; this simulates the changes caused by, for example, the driver adjusting the seat, and further enriches the diversity of the training images. The model therefore has better robustness when processing images with different illumination conditions and small driver movements, and can accurately identify driver behavior.
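The random crop-and-scale augmentation described above can be sketched as follows; the minimum window fraction and output size are assumptions, since the text does not fix them:

```python
import random
import numpy as np

def random_crop_scale(img, min_frac=0.7, out=240, rng=random):
    """Crop a random sub-window covering at least min_frac of each side, rescale to out x out.

    Varies the driver's apparent size and position, simulating seat adjustment.
    """
    H, W = img.shape[:2]
    h = rng.randint(int(H * min_frac), H)   # random window height
    w = rng.randint(int(W * min_frac), W)   # random window width
    y0 = rng.randint(0, H - h)              # random window position
    x0 = rng.randint(0, W - w)
    patch = img[y0:y0 + h, x0:x0 + w]
    ys = np.arange(out) * h // out          # nearest-neighbour rescale
    xs = np.arange(out) * w // out
    return patch[ys][:, xs]
```

Passing a seeded `random.Random` as `rng` makes the augmentation reproducible during debugging.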
The image cropping sub-model in the data processing model can crop out and enlarge the region of the whole image in which the driver's behavior characteristics are evident, so the network model can more easily distinguish details that are difficult for an ordinary network model to distinguish, and the overall recognition accuracy is higher.
The classification modules in the two classification models adopt lightweight and fast classification networks, so the models process images quickly and can be applied in vehicle-mounted scenarios with limited computing power. The method is a classification scheme, so no position annotation information is needed during model training.
Alternatively, on the basis of the above embodiment of the method for identifying driving behavior, another embodiment of the present invention provides an apparatus for identifying driving behavior, and with reference to fig. 6, the apparatus may include:
the image acquisition module 11 is used for acquiring an image to be analyzed;
the image processing module 12 is configured to invoke a pre-generated data processing model to process the image to be analyzed; the data processing model comprises an image cutting sub-model and an image classification sub-model; the image cutting sub-model is used for identifying key information in the image to be analyzed and cutting the image to be analyzed to obtain a target image comprising the key information in the image to be analyzed; the image classification submodel is used for carrying out driving behavior identification operation on the target image; the data processing model is obtained by training in a preset model training mode based on training samples; the training samples comprise image samples marked with driving behavior categories in advance;
And the processing result obtaining module 13 is configured to obtain an image processing result of the image to be analyzed, which is obtained by the data processing model.
Further, the apparatus also includes a model training module, and the model training module includes:
the data acquisition submodule is used for acquiring a training sample, an image cutting sub-initial model and an image classification sub-initial model; the training samples comprise image samples marked with driving behavior categories in advance;
the training submodule is used for respectively training the image cutting sub-initial model and the image classification sub-initial model by using the training sample in a preset model training mode to obtain an image cutting sub-model and an image classification sub-model;
a combination sub-module for determining the set of image cropping sub-models and image classification sub-models as the data processing model.
Further, the training submodule includes:
the first training unit is used for training a pre-acquired image classification reference model by using the training sample and stopping when the trained image classification reference model meets the corresponding preset training stopping condition;
the cutting processing unit is used for calling the image cutting sub-initial model to identify key information in each image sample in the training sample, and cutting each image sample to obtain an image to be processed including the key information in the image sample;
The second training unit is used for combining the images to be processed corresponding to each image sample to obtain an image set to be processed, and training the image classification sub-initial model by using the image set to be processed until the trained image classification sub-initial model meets the corresponding preset training stopping condition;
the third training unit is used for training the image cropping sub-initial model by using the training sample based on the difference between the classification precision of the trained image classification reference model and the classification precision of the image classification sub-initial model, and stopping when the trained image cropping sub-initial model meets the corresponding preset training stopping condition;
the judging unit is used for judging whether the change rates of respective loss functions of the image cutting sub-initial model and the image classification sub-initial model obtained through training are smaller than a preset threshold value or not;
the cutting processing unit is further configured to call the image cutting sub-initial model to identify key information in each image sample in the training samples under the condition that the judging unit judges that the change rates of the respective loss functions of the image cutting sub-initial model and the image classification sub-initial model obtained through training are not both smaller than a preset threshold;
And the model determining unit is used for determining the image cropping sub-initial model obtained by training as the image cropping sub-model and determining the image classification sub-initial model obtained by training as the image classification sub-model under the condition that the judging unit judges that the change rates of the respective loss functions of the image cropping sub-initial model and the image classification sub-initial model obtained by training are smaller than a preset threshold value.
Further, the image acquisition module is specifically configured to:
acquiring an infrared image obtained by acquiring an image of a target area of a driver, and performing data enhancement and standardization processing on the infrared image to obtain an image to be analyzed; the data enhancement includes image cropping.
Further, the apparatus also includes:
and the early warning module is used for outputting preset warning information to the terminal equipment of the user under the condition that the image processing result is any one of preset dangerous behaviors.
In this embodiment, when the driving behavior of the driver is recognized, the image to be analyzed is first obtained; the image to be analyzed is then cropped by the image cropping sub-model in the data processing model to obtain the target image, and the target image is analyzed by the image classification sub-model to obtain the driving-behavior analysis result. In the invention, the key part of the image to be analyzed, which carries its important information, is extracted and analyzed on its own; that is, the invention directly analyzes the important information of the image to be analyzed, and the recognition accuracy obtained is higher. In addition, the image cropping sub-model crops away the non-key information, which avoids the interference that non-key information would cause in driving behavior recognition and improves the accuracy of driving behavior recognition.
It should be noted that, for the working processes of each module, sub-module, and unit in this embodiment, please refer to the corresponding description in the above embodiments, which is not described herein again.
Alternatively, on the basis of the above-mentioned embodiments of the method and the device for identifying driving behavior, another embodiment of the present invention provides a storage medium on which a program is stored, which when executed by a processor implements the above-mentioned method for identifying driving behavior.
Optionally, on the basis of the above embodiment of the method and apparatus for identifying driving behavior, another embodiment of the present invention provides an electronic device, including: a memory and a processor;
wherein the memory is used for storing programs;
the processor calls a program and is used to:
acquiring an image to be analyzed;
calling a pre-generated data processing model to process the image to be analyzed; the data processing model comprises an image cutting sub-model and an image classification sub-model; the image cutting sub-model is used for identifying key information in the image to be analyzed and cutting the image to be analyzed to obtain a target image comprising the key information in the image to be analyzed; the image classification submodel is used for carrying out driving behavior identification operation on the target image; the data processing model is obtained by training in a preset model training mode based on training samples; the training samples comprise image samples marked with driving behavior categories in advance;
And obtaining an image processing result of the image to be analyzed, which is obtained by the data processing model.
Further, the training process of the data processing model comprises:
acquiring a training sample, an image cropping sub-initial model and an image classification sub-initial model; the training samples comprise image samples marked with driving behavior categories in advance;
respectively training the image cropping sub-initial model and the image classification sub-initial model by using the training samples in a preset model training mode to obtain an image cropping sub-model and an image classification sub-model;
determining the set of image cropping submodels and image classification submodels as the data processing model.
Further, a preset model training mode is adopted, the training sample is used for respectively training the image cutting sub-initial model and the image classification sub-initial model, and an image cutting sub-model and an image classification sub-model are obtained, and the method comprises the following steps:
training a pre-acquired image classification reference model by using the training sample, and stopping when the trained image classification reference model meets corresponding preset training stopping conditions;
Calling the image cropping sub-initial model to identify key information in each image sample in the training sample, and cropping each image sample to obtain an image to be processed including the key information in the image sample;
combining the images to be processed corresponding to each image sample to obtain an image set to be processed, and training the image classification sub-initial model by using the image set to be processed until the trained image classification sub-initial model meets the corresponding preset training stopping condition;
based on the difference between the classification precision of the trained image classification reference model and the classification precision of the image classification sub-initial model, training the image cutting sub-initial model by using the training sample, and stopping when the trained image cutting sub-initial model meets the corresponding preset training stopping condition;
returning to the step of calling the image cutting sub-initial model to identify the key information in each image sample in the training samples, and sequentially executing until the change rates of the respective loss functions of the image cutting sub-initial model and the image classification sub-initial model obtained by training are smaller than a preset threshold value;
And determining the trained image clipping sub-initial model as an image clipping sub-model, and determining the trained image classification sub-initial model as an image classification sub-model.
Further, acquiring an image to be analyzed, comprising:
acquiring an infrared image obtained by acquiring an image of a target area of a driver;
carrying out data enhancement and standardization processing on the infrared image to obtain an image to be analyzed; the data enhancement includes image cropping.
Further, after the obtaining of the image processing result of the image to be analyzed obtained by the data processing model, the method further includes:
and outputting preset warning information to the terminal equipment of the user under the condition that the image processing result is any one of preset dangerous behaviors.
In this embodiment, when the driving behavior of the driver is recognized, the image to be analyzed is first obtained; the image to be analyzed is then cropped by the image cropping sub-model in the data processing model to obtain the target image, and the target image is analyzed by the image classification sub-model to obtain the driving-behavior analysis result. In the invention, the key part of the image to be analyzed, which carries its important information, is extracted and analyzed on its own; that is, the invention directly analyzes the important information of the image to be analyzed, and the recognition accuracy obtained is higher. In addition, the image cropping sub-model crops away the non-key information, which avoids the interference that non-key information would cause in driving behavior recognition and improves the accuracy of driving behavior recognition.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method for identifying driving behavior, comprising:
acquiring an image to be analyzed;
calling a pre-generated data processing model to process the image to be analyzed; the data processing model comprises an image cutting sub-model and an image classification sub-model; the image cutting sub-model is used for identifying key information in the image to be analyzed and cutting the image to be analyzed to obtain a target image comprising the key information in the image to be analyzed; the image classification submodel is used for carrying out driving behavior identification operation on the target image; the data processing model is obtained by training in a preset model training mode based on training samples; the training samples comprise image samples marked with driving behavior categories in advance;
And obtaining an image processing result of the image to be analyzed, which is obtained by the data processing model.
2. The recognition method of claim 1, wherein the training process of the data processing model comprises:
acquiring a training sample, an image cropping sub-initial model and an image classification sub-initial model; the training samples comprise image samples marked with driving behavior categories in advance;
respectively training the image cropping sub-initial model and the image classification sub-initial model by using the training samples in a preset model training mode to obtain an image cropping sub-model and an image classification sub-model;
determining the set of image cropping submodels and image classification submodels as the data processing model.
3. The recognition method of claim 2, wherein the training samples are used to train the image cropping sub-initial model and the image classification sub-initial model respectively by using a preset model training mode to obtain an image cropping sub-model and an image classification sub-model, and the method comprises the following steps:
training a pre-acquired image classification reference model by using the training sample, and stopping when the trained image classification reference model meets corresponding preset training stopping conditions;
Calling the image cropping sub-initial model to identify key information in each image sample in the training sample, and cropping each image sample to obtain an image to be processed including the key information in the image sample;
combining the images to be processed corresponding to each image sample to obtain an image set to be processed, and training the image classification sub-initial model by using the image set to be processed until the trained image classification sub-initial model meets the corresponding preset training stopping condition;
based on the difference between the classification precision of the trained image classification reference model and the classification precision of the image classification sub-initial model, training the image cutting sub-initial model by using the training sample, and stopping when the trained image cutting sub-initial model meets the corresponding preset training stopping condition;
returning to the step of calling the image cutting sub-initial model to identify the key information in each image sample in the training samples, and sequentially executing until the change rates of the respective loss functions of the image cutting sub-initial model and the image classification sub-initial model obtained by training are smaller than a preset threshold value;
And determining the trained image clipping sub-initial model as an image clipping sub-model, and determining the trained image classification sub-initial model as an image classification sub-model.
4. The recognition method according to claim 1, wherein obtaining an image to be analyzed comprises:
acquiring an infrared image obtained by acquiring an image of a target area of a driver;
carrying out data enhancement and standardization processing on the infrared image to obtain an image to be analyzed; the data enhancement includes image cropping.
5. The identification method according to claim 1, further comprising, after the obtaining of the image processing result of the image to be analyzed obtained by the data processing model:
and outputting preset warning information to the terminal equipment of the user under the condition that the image processing result is any one of preset dangerous behaviors.
6. A driving behavior recognition apparatus, comprising:
an image acquisition module configured to obtain an image to be analyzed;
an image processing module configured to invoke a pre-generated data processing model to process the image to be analyzed, wherein the data processing model comprises an image cropping sub-model and an image classification sub-model; the image cropping sub-model is configured to identify key information in the image to be analyzed and crop the image to be analyzed to obtain a target image containing the key information; the image classification sub-model is configured to perform a driving behavior recognition operation on the target image; the data processing model is obtained by training in a preset model training mode on training samples; and the training samples comprise image samples pre-labeled with driving behavior categories; and
a processing result acquisition module configured to obtain the image processing result of the image to be analyzed produced by the data processing model.
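The two-stage pipeline of claim 6 — locate and crop the key-information region, then classify the crop — can be sketched as follows. The stand-in sub-models, class labels, and the `DataProcessingModel` name are assumptions; the claims do not fix a network architecture.

```python
from typing import Callable, Tuple
import numpy as np

class DataProcessingModel:
    """Chains an image-cropping sub-model and an image-classification sub-model."""

    def __init__(self,
                 crop_model: Callable[[np.ndarray], Tuple[int, int, int, int]],
                 classify_model: Callable[[np.ndarray], str]):
        self.crop_model = crop_model          # locates key information
        self.classify_model = classify_model  # recognizes driving behavior

    def process(self, image: np.ndarray) -> str:
        top, left, h, w = self.crop_model(image)    # key-information region
        target = image[top:top + h, left:left + w]  # target image
        return self.classify_model(target)          # behavior category

# Stand-in sub-models for illustration only.
model = DataProcessingModel(
    crop_model=lambda img: (10, 10, 64, 64),
    classify_model=lambda crop: "normal_driving" if crop.mean() < 0.5 else "phone_use",
)
print(model.process(np.zeros((128, 128))))  # → normal_driving
```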
7. The recognition device according to claim 6, further comprising a model training module, the model training module comprising:
a data acquisition sub-module configured to obtain the training samples, an image cropping sub-initial model, and an image classification sub-initial model, wherein the training samples comprise image samples pre-labeled with driving behavior categories;
a training sub-module configured to train the image cropping sub-initial model and the image classification sub-initial model separately with the training samples in a preset model training mode, to obtain the image cropping sub-model and the image classification sub-model; and
a combination sub-module configured to determine the combination of the image cropping sub-model and the image classification sub-model as the data processing model.
8. The recognition device according to claim 7, wherein the training sub-module comprises:
a first training unit configured to train a pre-acquired image classification reference model with the training samples, stopping when the trained image classification reference model satisfies its corresponding preset training stop condition;
a cropping unit configured to invoke the image cropping sub-initial model to identify the key information in each image sample of the training samples, and to crop each image sample to obtain an image to be processed containing that sample's key information;
a second training unit configured to combine the images to be processed corresponding to the image samples into a set of images to be processed, and to train the image classification sub-initial model with that set until the trained image classification sub-initial model satisfies its corresponding preset training stop condition;
a third training unit configured to train the image cropping sub-initial model with the training samples based on the difference between the classification accuracy of the trained image classification reference model and that of the image classification sub-initial model, stopping when the trained image cropping sub-initial model satisfies its corresponding preset training stop condition;
a judging unit configured to judge whether the rates of change of the respective loss functions of the trained image cropping sub-initial model and the trained image classification sub-initial model are both smaller than a preset threshold;
wherein the cropping unit is further configured to invoke the image cropping sub-initial model again to identify the key information in each image sample of the training samples when the judging unit determines that the rates of change are not both smaller than the preset threshold; and
a model determining unit configured to determine the trained image cropping sub-initial model as the image cropping sub-model, and the trained image classification sub-initial model as the image classification sub-model, when the judging unit determines that the rates of change are both smaller than the preset threshold.
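The alternating training scheme of claim 8 — train the classifier on cropped images, train the cropper against the accuracy gap to a reference classifier, and repeat until the rates of change of both losses fall below a threshold — has the control flow sketched below. The step functions are placeholders for the real training routines; only the loop and the stop rule follow the claim.

```python
def train_alternating(crop_step, classify_step, threshold=1e-3, max_rounds=100):
    """Alternate training of the cropping and classification sub-models until the
    rate of change of both losses drops below `threshold` (claim 8's stop rule).

    `crop_step` / `classify_step` run one training pass and return the new loss;
    they stand in for the real sub-model training routines.
    """
    prev_cls_loss, prev_crop_loss = float("inf"), float("inf")
    for _ in range(max_rounds):
        cls_loss = classify_step()  # train classifier on cropped images
        crop_loss = crop_step()     # train cropper on the reference-accuracy gap
        # Change rate of each loss between consecutive rounds.
        if (abs(prev_cls_loss - cls_loss) < threshold
                and abs(prev_crop_loss - crop_loss) < threshold):
            break                   # both converged: sub-models are final
        prev_cls_loss, prev_crop_loss = cls_loss, crop_loss
    return crop_loss, cls_loss

# Illustrative run with scripted loss sequences.
cls_losses = iter([1.0, 0.5, 0.4999, 0.4999])
crop_losses = iter([2.0, 1.0, 0.9999, 0.9999])
print(train_alternating(lambda: next(crop_losses), lambda: next(cls_losses)))
# → (0.9999, 0.4999): loop stops once both losses change by less than 1e-3
```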
9. The recognition device according to claim 6, wherein the image acquisition module is specifically configured to:
acquire an infrared image captured of a target area of a driver, and perform data enhancement and standardization on the infrared image to obtain the image to be analyzed, wherein the data enhancement includes image cropping.
10. A storage medium, characterized in that the storage medium comprises a stored program, wherein, when the program runs, a device on which the storage medium resides is controlled to execute the driving behavior recognition method according to any one of claims 1 to 5.
CN202010698251.7A 2020-07-20 2020-07-20 Driving behavior recognition method, device and storage medium Active CN111860316B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010698251.7A CN111860316B (en) 2020-07-20 2020-07-20 Driving behavior recognition method, device and storage medium

Publications (2)

Publication Number Publication Date
CN111860316A true CN111860316A (en) 2020-10-30
CN111860316B CN111860316B (en) 2024-03-19

Family

ID=73002115

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010698251.7A Active CN111860316B (en) 2020-07-20 2020-07-20 Driving behavior recognition method, device and storage medium

Country Status (1)

Country Link
CN (1) CN111860316B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102982316A (en) * 2012-11-05 2013-03-20 安维思电子科技(广州)有限公司 Driver abnormal driving behavior recognition device and method thereof
CN103065121A (en) * 2012-12-13 2013-04-24 李秋华 Engine driver state monitoring method and device based on video face analysis
CN106548145A (en) * 2016-10-31 2017-03-29 北京小米移动软件有限公司 Image-recognizing method and device
CN108764034A (en) * 2018-04-18 2018-11-06 浙江零跑科技有限公司 A kind of driving behavior method for early warning of diverting attention based on driver's cabin near infrared camera
CN109002790A (en) * 2018-07-11 2018-12-14 广州视源电子科技股份有限公司 A kind of method, apparatus of recognition of face, equipment and storage medium
CN109447149A (en) * 2018-10-25 2019-03-08 腾讯科技(深圳)有限公司 A kind of training method of detection model, device and terminal device
CN110532878A (en) * 2019-07-26 2019-12-03 中山大学 A kind of driving behavior recognition methods based on lightweight convolutional neural networks
CN111027378A (en) * 2019-11-01 2020-04-17 深圳大学 Pedestrian re-identification method, device, terminal and storage medium
CN111091132A (en) * 2020-03-19 2020-05-01 腾讯科技(深圳)有限公司 Image recognition method and device based on artificial intelligence, computer equipment and medium

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112597823A (en) * 2020-12-07 2021-04-02 深延科技(北京)有限公司 Attention recognition method and device, electronic equipment and storage medium
CN112633086A (en) * 2020-12-09 2021-04-09 西安电子科技大学 Near-infrared pedestrian monitoring method, system, medium and equipment based on multitask EfficientDet
CN112633086B (en) * 2020-12-09 2024-01-26 西安电子科技大学 Near-infrared pedestrian monitoring method, system, medium and equipment based on multitasking EfficientDet
CN114092456A (en) * 2021-11-26 2022-02-25 上海申挚医疗科技有限公司 Cell fluorescence image distinguishing method and system
CN114821549A (en) * 2022-04-22 2022-07-29 电子科技大学 Driving behavior recognition method for positioning driving area by using steering wheel

Also Published As

Publication number Publication date
CN111860316B (en) 2024-03-19

Similar Documents

Publication Publication Date Title
CN111860316B (en) Driving behavior recognition method, device and storage medium
CN109902806B (en) Method for determining target bounding box of noise image based on convolutional neural network
CN109697416B (en) Video data processing method and related device
WO2021016873A1 (en) Cascaded neural network-based attention detection method, computer device, and computer-readable storage medium
CN108345892B (en) Method, device and equipment for detecting significance of stereo image and storage medium
CN109583483B (en) Target detection method and system based on convolutional neural network
CN108717524B (en) Gesture recognition system based on double-camera mobile phone and artificial intelligence system
CN110956060A (en) Motion recognition method, driving motion analysis method, device and electronic equipment
CN110569808A (en) Living body detection method and device and computer equipment
CN108038466B (en) Multi-channel human eye closure recognition method based on convolutional neural network
CN112183482A (en) Dangerous driving behavior recognition method, device and system and readable storage medium
CN111967319B (en) Living body detection method, device, equipment and storage medium based on infrared and visible light
CN113657528B (en) Image feature point extraction method and device, computer terminal and storage medium
CN113283338A (en) Method, device and equipment for identifying driving behavior of driver and readable storage medium
CN110543848A (en) Driver action recognition method and device based on three-dimensional convolutional neural network
CN111898454A (en) Weight binarization neural network and transfer learning human eye state detection method and device
CN114219757B (en) Intelligent damage assessment method for vehicle based on improved Mask R-CNN
KR20210048271A (en) Apparatus and method for performing automatic audio focusing to multiple objects
CN115359562A (en) Sign language letter spelling recognition method based on convolutional neural network
CN112101185B (en) Method for training wrinkle detection model, electronic equipment and storage medium
CN111898473A (en) Driver state real-time monitoring method based on deep learning
CN110956190A (en) Image recognition method and device, computer device and computer readable storage medium
CN117894083B (en) Image recognition method and system based on deep learning
CN113808055B (en) Plant identification method, device and storage medium based on mixed expansion convolution
CN111832439B (en) Multi-face rapid identification method and processing terminal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant