CN110807352A - In-vehicle and out-vehicle scene visual analysis method for dangerous driving behavior early warning

Info

Publication number: CN110807352A (application CN201910808682.1A)
Authority: CN (China)
Prior art keywords: loss function, scene, vehicle, cab, road
Legal status: Granted
Application number: CN201910808682.1A
Other languages: Chinese (zh)
Other versions: CN110807352B (en)
Inventors: 缪其恒, 苏志杰, 孙焱标, 王江明, 许炜
Current Assignee: Zhejiang Zero Run Technology Co Ltd
Original Assignee: Zhejiang Zero Run Technology Co Ltd
Priority date / filing date: 2019-08-29
Application filed by Zhejiang Zero Run Technology Co Ltd
Priority to CN201910808682.1A
Publication of CN110807352A: 2020-02-18
Application granted; publication of CN110807352B: 2023-08-25
Current legal status: Active

Classifications

    • G06V 20/56: Context or environment of the image exterior to a vehicle, by using sensors mounted on the vehicle (G: Physics; G06: Computing; G06V: Image or video recognition or understanding; G06V 20/00: Scenes, scene-specific elements; G06V 20/50: Context or environment of the image)
    • G06F 18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches (G06F: Electric digital data processing; G06F 18/00: Pattern recognition; G06F 18/24: Classification techniques)
    • G06V 20/597: Recognising the driver's state or behaviour, e.g. attention or drowsiness (G06V 20/59: Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions)
    • Y02T 10/40: Engine management systems (Y02T: Climate change mitigation technologies related to transportation; Y02T 10/10: Internal combustion engine based vehicles)

Abstract

The invention discloses a visual analysis method for in-vehicle and out-of-vehicle scenes for dangerous driving behavior early warning, comprising the following steps: S1, data acquisition, synchronization and preprocessing; S2, semantic coding of the road scene; S3, semantic coding of the cab scene; S4, classification of time-series dangerous driving behaviors; and S5, forward-operation model deployment and output post-processing. In this technical scheme, synchronized forward-view and cab scene images are taken as input; the road scene and cab scene are feature-coded by convolutional neural networks, the coded features are concatenated and fed into a recurrent-neural-network-based time-series behavior classifier, and the dangerous driving behavior category is output for use by the corresponding downstream early-warning algorithm, yielding accurate judgment of dangerous driving behaviors.

Description

In-vehicle and out-vehicle scene visual analysis method for dangerous driving behavior early warning
Technical Field
The invention relates to the field of driver assistance systems, and in particular to a visual analysis method for in-vehicle and out-of-vehicle scenes for dangerous driving behavior early warning.
Background
Road traffic accident statistics show that more than half of traffic accidents are caused by dangerous driver states or erroneous vehicle operation, and most of these human-caused accidents trace back to driver fatigue or distraction. Active safety systems and driver behavior analysis systems therefore have important application value. Conventional driving assistance systems for passenger and commercial vehicles issue dangerous-driving warnings either from key road scene parameters, such as time to collision (TTC) and time to line crossing (TLC), or from key cab scene parameters, such as eye opening and face orientation.
However, both types of system have their own advantages and disadvantages: a system based on road scene analysis cannot accurately reflect the driver's concentration on, and fatigue during, vehicle operation, while a system based on cab scene analysis is susceptible to the camera's installation angle and field of view, may falsely trigger warnings during some normal driving behaviors, and may fail to issue warnings for some untrained dangerous behaviors. The two systems are therefore potentially complementary: jointly analyzing the scenes inside and outside the vehicle can effectively improve the accuracy and reliability of dangerous-driving early warning. At present, however, no method based on joint visual analysis of in-vehicle and out-of-vehicle scenes has been applied in a driving-assistance (early-warning) system.
At present, no driver assistance system deployed in mass-production vehicles jointly analyzes the scenes inside and outside the vehicle before issuing warnings for the corresponding dangerous driving behaviors (e.g., insufficient headway, lane departure, fatigued driving, distraction). Existing driver assistance systems mainly issue driving-behavior warnings based on: i) vehicle dynamics parameters and steering signals; ii) vision system perception results; or iii) millimeter-wave radar perception results. The vision systems used can be divided into two types according to the scene they analyze: i) cab vision systems, which recognize some fatigue and distraction states, such as dozing and yawning, mainly through analysis of the driver's facial image features; ii) forward-view systems, which identify specific vehicle driving states, such as lane departure and pre-crash, mainly through road scene image feature analysis.
The potential problems of using either vision system alone for dangerous-driving early warning are: i) a system based on road scene analysis needs specific signal inputs to recognize driver intent (e.g., a lane change signaled by the turn indicator), easily triggers false alarms, and cannot accurately reflect the driver's concentration and fatigue during vehicle operation; ii) a system based on cab scene analysis is susceptible to the camera's installation angle and field of view, may falsely trigger warnings during some normal driving behaviors, and may fail to issue warnings for some untrained dangerous behaviors.
Disclosure of Invention
The invention aims to solve the problem of inaccurate judgment of dangerous driving conditions caused by the low degree of fusion between the cab vision system and the forward-view system, and provides an in-vehicle and out-of-vehicle scene visual analysis method for dangerous driving behavior early warning.
To achieve this technical purpose, the invention provides the following technical scheme. The in-vehicle and out-of-vehicle scene visual analysis method for dangerous driving behavior early warning comprises the following steps:
S1, data acquisition, synchronization and preprocessing;
S2, semantic coding of the road scene;
S3, semantic coding of the cab scene;
S4, classification of time-series dangerous driving behaviors;
S5, forward-operation model deployment and output post-processing.
In this scheme, the road scene and cab scene image inputs are first acquired and synchronized, and after preprocessing operations such as format conversion, ROI selection and scaling, the images are fed into the subsequent deep convolutional neural network analysis modules. Next, from the forward-view traffic scene image input, a road scene deep convolutional neural network obtained by offline training produces a semantic feature description of the forward scene, whose activated regions are image areas such as road boundaries, vehicles and pedestrians; the coded road scene semantic features are output, concatenated with the cab scene semantic features, and fed into the time-series analysis module. Then, from the cab scene image input, a cab scene deep convolutional neural network obtained by offline training produces a semantic feature description of the cab scene, whose activated regions are image areas such as the driver's face and upper body; the coded cab scene semantic features are output, concatenated with the road scene semantic features, and fed into the time-series analysis module. Finally, based on the concatenated in-vehicle and out-of-vehicle scene features, time-series behaviors are classified with a recurrent neural network model or a support vector machine, according to the requirements of the different early-warning applications.
The step S1 includes the following steps:
S11, road scene image preprocessing: road scene image data is acquired by the forward-view camera and stored in an image cache pool; after convolutional neural network feature description, it is concatenated and fed into the recurrent-neural-network-based time-series behavior classifier;
S12, cab scene image preprocessing: cab scene image data is acquired by the cab camera and stored in an image cache pool; after convolutional neural network feature description, it is concatenated and fed into the recurrent-neural-network-based time-series behavior classifier.
The step S2 includes the following steps:
S21, road scene neural network topology: the input is a 320×180×3 road scene RGB image, and the backbone network comprises convolution, pooling, normalization, activation and deconvolution basic operations;
S22, road scene training data set: a traffic scene data set is collected and manually annotated to generate multi-task training labels;
S23, off-line training of the road scene model: comprehensively considering the application of the road scene neural network in a driving assistance system and the compatibility and portability of the network features, a road scene feature loss function L_traffic is designed.

The loss function L_traffic is computed as follows:

L_traffic = k_1·L_obj + k_2·L_road

L_obj = Σ_i [ α·L_ce(cls_i, g_cls,i) + β·L_1s(loc_i, g_loc,i) + Σ_k λ_k·L_ce(att_k,i, g_att,k,i) ]

L_road = Σ_p L_ce(y_p, g_p)  (summed over image pixels p)

L_ce(y, g) = −[ g·log y + (1 − g)·log(1 − y) ]

L_1s(y, g) = 0.5·(y − g)² if |y − g| < 1, |y − g| − 0.5 otherwise

where L_obj is the target loss function, L_road the road-surface semantic loss function, and k_1, k_2 their respective weight coefficients; L_ce(·,·) is the cross-entropy loss and L_1s(·,·) the smooth-L1 loss. The target loss L_obj comprises, per target, a classification loss with weight α, a position regression loss with weight β, and attribute classification losses with weights λ_k; the road semantic loss L_road is the sum of pixel-level cross-entropies over the image.
The step S3 includes the following steps:
S31, cab scene neural network topology: the input is a 320×180×1 cab scene infrared image, and the backbone network comprises convolution, pooling, normalization, activation and deconvolution basic operations;
S32, cab scene training data set: a cab scene data set is collected and manually annotated to generate multi-task training labels;
S33, off-line training of the cab scene model: comprehensively considering the application of the cab scene neural network in a driving assistance system and the compatibility and portability of the network features, a cab scene feature loss function L_driver is designed.

The loss function L_driver is computed as follows:

L_driver = μ_1·L_fd + μ_2·L_gd + μ_3·L_hp

L_fd = α_1·L_fd,cls + α_2·L_fd,box + α_3·L_fd,kp

L_hp = β_1·L_hp,cls + β_2·L_hp,ang + β_3·L_hp,cons

L_gd = γ_1·L_gd,cls + γ_2·L_gd,ang + γ_3·L_gd,cons

where L_fd is the face detection loss, L_gd the eyeball orientation loss and L_hp the face orientation loss, weighted by μ_1, μ_2 and μ_3 respectively. The face detection loss L_fd comprises a face classification loss L_fd,cls, a face region regression loss L_fd,box and a key-point regression loss L_fd,kp, weighted by α_1, α_2 and α_3. The face orientation loss L_hp comprises a face orientation classification loss L_hp,cls, a face orientation angle regression loss L_hp,ang, and an orientation-classification/angle consistency loss L_hp,cons, weighted by β_1, β_2 and β_3. The eyeball orientation loss L_gd comprises an eyeball orientation classification loss L_gd,cls, an eyeball orientation angle regression loss L_gd,ang, and an eyeball orientation-classification/angle consistency loss L_gd,cons, weighted by γ_1, γ_2 and γ_3.
The step S4 includes the following steps:
S41, topology of the long short-term memory (LSTM) network;
S42, LSTM network training data set;
S43, off-line training of the LSTM network: the convolutional feature layer network parameters are solidified, a driving behavior classification loss function L_behavior is constructed, and L_behavior is optimized by mini-batch stochastic gradient descent.
The loss function L_behavior is computed as follows:

L_behavior = −(1/(N·T))·Σ_{i=1..N} Σ_{j=1..T} g_b,ij·log B_i,j

where B_i,j is the predicted behavior class score, g_b,ij is the ground-truth behavior class, N is the number of independent clips, and T is the number of frames per clip.
In step S5, the models include a road scene model, a cab scene model, and a dangerous driving behavior classification model.
The invention has the beneficial effects that:
1. Compared with a visual analysis system based on a single scene input, jointly analyzing the driver's state and the vehicle's driving state makes dangerous-behavior early warning more reliable;
2. Using deep learning, the behavior categories produced by end-to-end neural networks trained on massive driving data generalize across driver groups and driving habits, so dangerous-behavior warning is more robust than in early-warning systems based on specific rules and numerical criteria;
3. The scene activation regions coincide with the regions of interest of each system when applied independently, so the method can be integrated into a neural-network-based vision system architecture without introducing extra feature computation, giving good portability and extensibility;
4. Potentially dangerous vehicle motion and potential fatigued-driving states are identified simultaneously, and vehicle driving-state warnings are associated with the corresponding driver state, so that the driver's state in the corresponding dangerous driving situation is recognized and unnecessary false alarms during normal driving are reduced.
Drawings
Fig. 1 is a flowchart of a method for visually analyzing scenes inside and outside a vehicle for early warning of dangerous driving behaviors according to the present invention.
Fig. 2 is a specific method flow of the in-vehicle and out-vehicle scene visual analysis method for early warning of dangerous driving behaviors in the present invention.
FIG. 3 is a schematic diagram of a deep neural network architecture suitable for use in the present invention.
Detailed Description
For a better understanding of the objects, technical solutions and advantages of the present invention, the invention is described in detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiment described here is only a preferred embodiment, intended to explain the invention rather than limit its scope; all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the scope of the present invention.
Embodiment: as shown in fig. 1, the flow of the in-vehicle and out-of-vehicle scene visual analysis method for dangerous driving behavior early warning comprises the following steps:
S1, data acquisition, synchronization and preprocessing;
S2, semantic coding of the road scene;
S3, semantic coding of the cab scene;
S4, classification of time-series dangerous driving behaviors;
S5, forward-operation model deployment and output post-processing.
In this embodiment, the road scene and cab scene image inputs are first acquired and synchronized, and after preprocessing operations such as format conversion, ROI selection and scaling, the images are fed into the subsequent deep convolutional neural network analysis modules. Next, from the forward-view traffic scene image input, a road scene deep convolutional neural network obtained by offline training produces a semantic feature description of the forward scene, whose activated regions are image areas such as road boundaries, vehicles and pedestrians; the coded road scene semantic features are output, concatenated with the cab scene semantic features, and fed into the time-series analysis module. Then, from the cab scene image input, a cab scene deep convolutional neural network obtained by offline training produces a semantic feature description of the cab scene, whose activated regions are image areas such as the driver's face and upper body; the coded cab scene semantic features are output, concatenated with the road scene semantic features, and fed into the time-series analysis module. Finally, based on the concatenated in-vehicle and out-of-vehicle scene features, time-series behaviors are classified with a recurrent neural network model or a support vector machine, according to the requirements of the different early-warning applications.
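The cascade just described can be summarized, for reference, in the following minimal PyTorch-style sketch; the stand-in encoders, feature sizes and the five behavior classes are illustrative assumptions rather than the patent's reference implementation.

import torch
import torch.nn as nn

def make_encoder(in_ch: int, out_dim: int) -> nn.Module:
    # Tiny stand-in for the conv/pool/BN/ReLU scene backbones of steps S2/S3.
    return nn.Sequential(
        nn.Conv2d(in_ch, 16, 3, stride=2, padding=1), nn.BatchNorm2d(16), nn.ReLU(),
        nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, out_dim))

class DualSceneBehaviorNet(nn.Module):
    def __init__(self, feat: int = 256, hidden: int = 128, n_classes: int = 5):
        super().__init__()
        self.road_encoder = make_encoder(3, feat)   # RGB road scene encoder
        self.cab_encoder = make_encoder(1, feat)    # infrared cab scene encoder
        self.temporal = nn.LSTM(2 * feat, hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, n_classes)

    def forward(self, road_seq: torch.Tensor, cab_seq: torch.Tensor) -> torch.Tensor:
        # road_seq: (B, T, 3, 180, 320); cab_seq: (B, T, 1, 180, 320)
        B, T = road_seq.shape[:2]
        road_f = self.road_encoder(road_seq.flatten(0, 1)).view(B, T, -1)
        cab_f = self.cab_encoder(cab_seq.flatten(0, 1)).view(B, T, -1)
        fused = torch.cat([road_f, cab_f], dim=-1)  # per-frame concatenation of scene features
        h, _ = self.temporal(fused)                 # recurrent time-series analysis
        return self.classifier(h[:, -1])            # dangerous-driving-behavior logits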
Fig. 2 shows the specific flow of the in-vehicle and out-of-vehicle scene visual analysis method for dangerous driving behavior early warning.
the invention provides a vehicle interior and exterior scene vision joint analysis method for early warning of dangerous driving behaviors, which inputs road scene images (colors) acquired by a forward-looking camera and cab scene images (infrared) acquired by a cab camera, and after CNN (convolutional neural network) feature description, the road scene images and the cab scene images are cascaded and sent to a time sequence behavior classifier based on a recurrent neural network, and dangerous driving behavior categories are output. The horizontal field angle of the front-view camera is 50 degrees, the left and right centers of the front-view camera are arranged at the position of the front windshield with the height of 1-1.2 meters, and the direction of the front-view camera is horizontal and forward; the horizontal field of view angle of driver' S cabin camera is 50, adopts infrared light filling, installs in A post department towards driver region, S1, data acquisition, synchronization and preliminary treatment: and (3) offline adjusting and configuring acquisition parameters such as exposure, gain and the like, marking a system time stamp after acquiring image original data, sending the image original data into respective preprocessing cache queues according to the same sequence number after matching the time stamp, and inputting the image original data into a neural network operation unit after the following preprocessing operations.
S11, road scene image preprocessing: the frame at the head of the road scene image cache pool (YUV format, 1280×720) is read, converted to RGB, cropped to the predefined ROI, and scaled to the road scene network input size (320×180×3 by default, per the corresponding network input interface).
S12, cab scene image preprocessing: the frame at the head of the cab scene image cache pool (YUV format) is read, the Y-channel data is extracted, and the predefined ROI is cropped and scaled to the cab scene network input size (320×180×1 by default, per the corresponding network input interface).
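An OpenCV rendering of S11/S12 might look as follows; the ROI rectangles and the NV12 frame layout are placeholder assumptions, while the 320×180 target sizes follow the network inputs above.

import cv2
import numpy as np

def preprocess_road(yuv: np.ndarray, roi=(0, 120, 1280, 480)) -> np.ndarray:
    # yuv: raw 1280x720 frame, assumed NV12-style layout of shape (1080, 1280)
    rgb = cv2.cvtColor(yuv, cv2.COLOR_YUV2RGB_NV12)
    x, y, w, h = roi                                       # placeholder ROI
    return cv2.resize(rgb[y:y + h, x:x + w], (320, 180))   # 320x180x3 network input

def preprocess_cab(yuv: np.ndarray, roi=(0, 0, 1280, 720)) -> np.ndarray:
    y_plane = yuv[:720, :]                                 # Y channel = first H rows of a planar frame
    x, y, w, h = roi
    return cv2.resize(y_plane[y:y + h, x:x + w], (320, 180))  # 320x180x1 network input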
S2, road scene semantic coding: the road scene semantic coding neural network is shown in fig. 2; it takes the forward-view traffic scene image as input and outputs activated road scene semantic features, via offline data acquisition, neural network model training and online model inference.
S21, road scene neural network topology: the input is a 320×180×3 road scene RGB image; the backbone consists mainly of basic operations such as convolution (conv), pooling, normalization (BN), activation (ReLU) and deconvolution (deconv). The scene features comprise feature descriptions at the 1/4, 1/16 and 1/64 scales of the network input; the branches over which offline training computes the loss comprise a target detection branch, a road semantic segmentation branch and a target attribute classification branch.
S22, road scene training data set: a traffic scene data set is collected, with time-series discrete samples covering different times (day, night, etc.), weather (sunny, cloudy, rainy, etc.) and driving scenes (urban, highway, tunnel, etc.), and manually annotated to generate multi-task training labels, chiefly target detection labels (target-box form), lane boundary labels (semantic-layer form) and drivable area labels (semantic-layer form). The target-box form comprises the target category (0-other, 1-small vehicle, 2-large vehicle, 3-pedestrian, 4-non-motorized vehicle, 5-signal light, 6-signboard), position (X, Y, W, H) and other custom attributes (e.g., signboard category, vehicle 3D attributes).
S23, off-line training of the road scene model: comprehensively considering the application of the road scene neural network in a driving assistance system (including recognition of other traffic participants, traffic signal signage, and lane and road boundaries) and the compatibility and portability of the network features, a road scene feature loss function L_traffic is designed as follows:

L_traffic = k_1·L_obj + k_2·L_road

L_obj = Σ_i [ α·L_ce(cls_i, g_cls,i) + β·L_1s(loc_i, g_loc,i) + Σ_k λ_k·L_ce(att_k,i, g_att,k,i) ]

L_road = Σ_p L_ce(y_p, g_p)  (summed over image pixels p)

L_ce(y, g) = −[ g·log y + (1 − g)·log(1 − y) ]

L_1s(y, g) = 0.5·(y − g)² if |y − g| < 1, |y − g| − 0.5 otherwise

where L_obj is the target loss function, L_road the road-surface semantic loss function, and k_1, k_2 their respective weight coefficients; L_ce(·,·) is the cross-entropy loss and L_1s(·,·) the smooth-L1 loss. The target loss L_obj comprises, per target, a classification loss with weight α, a position regression loss with weight β, and attribute classification losses with weights λ_k; the road semantic loss L_road is the sum of pixel-level cross-entropies over the image.
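Assembled in code, the multi-task loss might read as below; head outputs are assumed to be probabilities (post-sigmoid/softmax), as the form of L_ce implies, and every weight is a free hyperparameter.

import torch
import torch.nn.functional as F

def bce(y: torch.Tensor, g: torch.Tensor, eps: float = 1e-7) -> torch.Tensor:
    # L_ce(y, g) = -[g*log(y) + (1 - g)*log(1 - y)], summed
    y = y.clamp(eps, 1 - eps)
    return -(g * y.log() + (1 - g) * (1 - y).log()).sum()

def road_loss(cls, g_cls, loc, g_loc, atts, g_atts, seg, g_seg,
              k1=1.0, k2=1.0, alpha=1.0, beta=1.0, lambdas=None):
    # L_obj: per-target classification + smooth-L1 localization + attribute terms
    l_obj = alpha * bce(cls, g_cls) + beta * F.smooth_l1_loss(loc, g_loc, reduction='sum')
    for lam, a, ga in zip(lambdas or [1.0] * len(atts), atts, g_atts):
        l_obj = l_obj + lam * bce(a, ga)
    l_road = bce(seg, g_seg)          # pixel-level cross-entropy sum over the image
    return k1 * l_obj + k2 * l_road   # L_traffic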
The training data set of step S22 is augmented with color and geometric transformations, etc.; the loss gradient is back-propagated using mini-batch gradient descent, and the corresponding neural network weight parameters are updated.
S3, cab scene semantic coding: fig. 3 shows a schematic of the applicable deep neural network architecture; the cab scene semantic coding network takes the cab scene image as input and outputs activated cab scene semantic features, via offline data acquisition, neural network model training and online model inference.
S31, cab scene neural network topology: the input is a 320×180×1 cab scene infrared image; the backbone, similar to the road scene network, consists of basic operations such as convolution (conv), pooling, normalization (BN), activation (ReLU) and deconvolution (deconv). The cab scene features mainly comprise feature descriptions at the 1/4, 1/8 and 1/16 scales of the network input.
S32, cab scene training data set: a cab scene data set is collected, with time-series discrete samples covering different times (day, night, etc.), weather (sunny, cloudy, rainy, etc.), cab camera installation positions (center, A-pillar, etc.), cab interior layouts (sedans, SUVs, etc.) and driver identities (drivers of different appearance, gender, etc.), and manually annotated to generate multi-task training labels, chiefly face region and key-point labels, face orientation labels and eyeball orientation labels. The face region label uses the same format as an ordinary target-box label; the face key-point label comprises 13 key points (image coordinates of 8 eye points, 1 nose-tip point and 4 mouth points); the face orientation label is the three-degree-of-freedom head rotation angle in the camera coordinate system; and the eyeball orientation label is a two-degree-of-freedom angle (i.e., the up-down and left-right rotation of the eyeball relative to the face plane).
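For illustration, one annotation record could be held in a structure like the following; the field names are hypothetical, while the contents follow the label definitions above.

from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class CabFrameLabel:
    face_box: Tuple[float, float, float, float]  # X, Y, W, H in image coordinates
    keypoints: List[Tuple[float, float]]         # 13 points: 8 eye, 1 nose tip, 4 mouth
    head_pose: Tuple[float, float, float]        # 3-DOF head rotation in the camera frame
    gaze: Tuple[float, float]                    # up-down and left-right eyeball angles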
S33, off-line training of the cab scene model: comprehensively considering the application of the cab scene neural network in a driving assistance system (including detection of fatigued and distracted driving behaviors) and the compatibility and portability of the network features, a cab scene feature loss function L_driver is designed as follows:

L_driver = μ_1·L_fd + μ_2·L_gd + μ_3·L_hp

L_fd = α_1·L_fd,cls + α_2·L_fd,box + α_3·L_fd,kp

L_hp = β_1·L_hp,cls + β_2·L_hp,ang + β_3·L_hp,cons

L_gd = γ_1·L_gd,cls + γ_2·L_gd,ang + γ_3·L_gd,cons

where L_fd is the face detection loss, L_gd the eyeball orientation loss and L_hp the face orientation loss, weighted by μ_1, μ_2 and μ_3 respectively. L_fd comprises a face classification loss L_fd,cls, a face region regression loss L_fd,box and a key-point regression loss L_fd,kp, weighted by α_1, α_2 and α_3; L_hp comprises a face orientation classification loss L_hp,cls, a face orientation angle regression loss L_hp,ang and an orientation-classification/angle consistency loss L_hp,cons, weighted by β_1, β_2 and β_3; L_gd comprises an eyeball orientation classification loss L_gd,cls, an eyeball orientation angle regression loss L_gd,ang and an eyeball orientation-classification/angle consistency loss L_gd,cons, weighted by γ_1, γ_2 and γ_3.
The training data set of step S32 is augmented with color and geometric transformations, etc.; the loss gradient is back-propagated using mini-batch gradient descent, and the corresponding neural network weight parameters are updated.
S4, time-series dangerous driving behavior classification: the coded in-vehicle and out-of-vehicle scene features are concatenated as the behavior feature description for a single time step; behavior features over a predefined clip length are classified with a long short-term memory network (LSTM, shown in fig. 3), outputting the predefined driving behavior categories (0-normal driving, 1-lane departure, 2-potential collision with lead vehicle, 3-fatigued driving, 4-inattentive driving).
S41, LSTM network topology: the number of time series recursion units is 12 (behavior corresponding to time series data of approximately 1 second at a processing speed of 12.5 frames/second), and the formula used is as follows:
ft=sigmoid(σf(xt,ht-1))
it=sigmoid(σi(xt,ht-1))
ot=sigmoid(σo(xt,ht-1))
ct=ft·ct-1+it·tanh(σc(xt,ht-1))
ht=ot·tanh(ct)
in the formula: x is the number oftAs an input vector, ftTo forget the gate vector, itTo update the gate vector, htIs a hidden layer vector, otTo output the gate vector, ctIs a tuple state vector.
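These gate equations transcribe directly into code; the cell below treats σ_f, σ_i, σ_o and σ_c as learned affine maps on the concatenated (x_t, h_{t−1}), with illustrative sizes.

import torch
import torch.nn as nn

class LSTMCellFromEquations(nn.Module):
    def __init__(self, x_dim: int, h_dim: int):
        super().__init__()
        self.sigma = nn.ModuleDict({g: nn.Linear(x_dim + h_dim, h_dim)
                                    for g in ('f', 'i', 'o', 'c')})

    def forward(self, x_t, h_prev, c_prev):
        z = torch.cat([x_t, h_prev], dim=-1)
        f_t = torch.sigmoid(self.sigma['f'](z))                     # forget gate
        i_t = torch.sigmoid(self.sigma['i'](z))                     # update gate
        o_t = torch.sigmoid(self.sigma['o'](z))                     # output gate
        c_t = f_t * c_prev + i_t * torch.tanh(self.sigma['c'](z))   # cell state
        h_t = o_t * torch.tanh(c_t)                                 # hidden state
        return h_t, c_t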
S42, LSTM network training data set: synchronized in-vehicle and out-of-vehicle image data exhibiting the predefined driving behaviors of step S4 is acquired using the acquisition and synchronization scheme of step S1, covering the scenes and conditions described for the data sets of steps S2 and S3; video clips are cut at a 12.5 fps frame rate (2 seconds per event), each clip corresponding to one behavior label.
S43, LSTM network offline training: the convolutional feature layer network parameters are solidified (i.e., no gradient is back-propagated into them), and a driving behavior classification loss function L_behavior is constructed and optimized by mini-batch stochastic gradient descent:

L_behavior = −(1/(N·T))·Σ_{i=1..N} Σ_{j=1..T} g_b,ij·log B_i,j

where B_i,j is the predicted behavior class score, g_b,ij is the ground-truth behavior class, N is the number of independent clips, and T is the number of frames per clip.
S5, forward-operation model deployment and model output post-processing: as described in steps S2, S3 and S4, the model comprises three branches, namely the road scene model, the cab scene model and the dangerous driving behavior classification model. The training-only branches of steps S2 and S3 serve solely to compute the loss function and back-propagate gradients; in forward operation only the corresponding scene feature layers are retained. After compression operations such as data quantization and sparsification are applied to the model parameters according to the characteristics of the front-end computing platform, the scene features are concatenated in the preset feature-channel order and passed to the long short-term memory module for online time-series driving behavior classification.
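Forward-only deployment can be sketched as below; the function signatures are assumptions, and quantization/sparsification are taken to have been applied to the encoders beforehand.

import torch

@torch.no_grad()
def classify_online(road_encoder, cab_encoder, lstm, head, road_img, cab_img, state=None):
    road_f = road_encoder(road_img)              # scene feature layers only, training branches stripped
    cab_f = cab_encoder(cab_img)
    feat = torch.cat([road_f, cab_f], dim=-1)    # preset feature-channel order
    out, state = lstm(feat.unsqueeze(1), state)  # one time step of the LSTM module
    return head(out[:, -1]), state               # behavior logits plus carried recurrent state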
The embodiments above do not limit the scope of the present invention; all equivalent changes made according to the shape, structure and principle of the present invention fall within its protection scope.

Claims (9)

1. An in-vehicle and out-of-vehicle scene visual analysis method for dangerous driving behavior early warning, characterized in that it comprises the following steps:
S1, data acquisition, synchronization and preprocessing;
S2, semantic coding of the road scene;
S3, semantic coding of the cab scene;
S4, classification of time-series dangerous driving behaviors;
S5, forward-operation model deployment and output post-processing.
2. The in-vehicle and out-vehicle scene visual analysis method for dangerous driving behavior early warning as claimed in claim 1, wherein:
the step S1 includes the following steps:
S11, road scene image preprocessing: road scene image data is acquired by the forward-view camera and stored in an image cache pool; after convolutional neural network feature description, it is concatenated and fed into the recurrent-neural-network-based time-series behavior classifier;
S12, cab scene image preprocessing: cab scene image data is acquired by the cab camera and stored in an image cache pool; after convolutional neural network feature description, it is concatenated and fed into the recurrent-neural-network-based time-series behavior classifier.
3. The in-vehicle and out-vehicle scene visual analysis method for dangerous driving behavior early warning as claimed in claim 1, wherein:
the step S2 includes the following steps:
S21, road scene neural network topology: the input is a 320×180×3 road scene RGB image, and the backbone network comprises convolution, pooling, normalization, activation and deconvolution basic operations;
S22, road scene training data set: a traffic scene data set is collected and manually annotated to generate multi-task training labels;
S23, off-line training of the road scene model: comprehensively considering the application of the road scene neural network in a driving assistance system and the compatibility and portability of the network features, a road scene feature loss function L_traffic is designed.
4. The in-vehicle and out-vehicle scene visual analysis method for dangerous driving behavior early warning as claimed in claim 3, wherein:
the step loss function LtrafficThe following calculation formula is used:
Ltraffic=k1Lobj+k2Lroad
Figure FDA0002184413460000011
Figure FDA0002184413460000021
Lce(y,g)=glogy+(1-g)log(1-y)
Figure FDA0002184413460000022
in the formula, LobjAs a function of the target loss, LroadFor the road surface semantic loss function, k1 is the target loss function LobjK2 is the road surface semantic loss function LroadWeight coefficient of (1), L1s(loci,gloc,i) And Lce(attki,gatt,ki) As a cross-entropy loss function, L1s(y, g) is smoothL1 loss function, target loss function LobjIncluding classification loss function, position regression loss function and attribute classification loss function of each target, α is weight coefficient of classification loss function, β is weight coefficient of position regression loss function, lambdajFor weighting coefficients of attribute classification loss function, road semantic loss function LroadResulting from the image pixel level cross entropy summation.
5. The in-vehicle and out-vehicle scene visual analysis method for dangerous driving behavior early warning as claimed in claim 1, wherein:
the step S3 includes the following steps:
S31, cab scene neural network topology: the input is a 320×180×1 cab scene infrared image, and the backbone network comprises convolution, pooling, normalization, activation and deconvolution basic operations;
S32, cab scene training data set: a cab scene data set is collected and manually annotated to generate multi-task training labels;
S33, off-line training of the cab scene model: comprehensively considering the application of the cab scene neural network in a driving assistance system and the compatibility and portability of the network features, a cab scene feature loss function L_driver is designed.
6. The in-vehicle and out-vehicle scene visual analysis method for dangerous driving behavior early warning as claimed in claim 5, wherein:
the loss function LdriverThe following calculation formula is used:
Ldriver=μ1Lfd2Lgd3Lhp
Figure FDA0002184413460000032
Figure FDA0002184413460000033
in the formula: l isfdDetecting a loss function for a face, LgdAs a function of eye orientation loss, LhpAs a function of facial orientation loss, μ1Detecting a loss function L for a facefdWeight coefficient of (d), mu2As a function of eye orientation loss LgdWeight coefficient of (d), mu3As a function of facial orientation loss LhpThe face detection loss function LfdIncluding face classification loss function, face region regression loss function and key point regression loss function, α1Weighting coefficients for the face classification loss function, α2Loss function which is a regression loss function for the face region, α3Face orientation regression loss function L as weight coefficient of the key point regression loss functionhpIncluding face orientation classification loss function, face orientation angle regression loss function, and orientation classification and angle consistency loss function, β1Weighting coefficients for face orientation classification loss functions, β2Weighting coefficients for the face orientation angle regression loss function, β3Eye orientation regression loss function L as weight coefficient of orientation classification and angle consistency loss functiongdComprises an eyeball orientation classification loss function, an eyeball orientation angle regression loss function, an eyeball orientation classification and angle consistency loss function, gamma1Weight coefficient, gamma, of a function of eye orientation classification loss2Weight coefficient, gamma, of the regression loss function for the eyeball orientation angle3Weighting coefficients of the eyeball orientation classification and angle consistency loss function.
7. The in-vehicle and out-vehicle scene visual analysis method for dangerous driving behavior early warning as claimed in claim 1, wherein:
the step S4 includes the following steps:
S41, topology of the long short-term memory (LSTM) network;
S42, LSTM network training data set;
S43, off-line training of the LSTM network: the convolutional feature layer network parameters are solidified, a driving behavior classification loss function L_behavior is constructed, and L_behavior is optimized by mini-batch stochastic gradient descent.
8. The in-vehicle and out-vehicle scene visual analysis method for dangerous driving behavior early warning as claimed in claim 7, wherein:
the loss function LbehaviorThe following calculation formula is used:
Figure FDA0002184413460000041
in the formula: b isi,jTo predict behavior classes, gb,ijThe behavior category true value is shown, N is the number of independent fragments, and T is the number of independent fragment frames.
9. The in-vehicle and out-vehicle scene visual analysis method for dangerous driving behavior early warning as claimed in claim 1, wherein:
in step S5, the models include a road scene model, a cab scene model, and a dangerous driving behavior classification model.
CN201910808682.1A 2019-08-29 2019-08-29 In-vehicle scene visual analysis method for dangerous driving behavior early warning Active CN110807352B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201910808682.1A | 2019-08-29 | 2019-08-29 | In-vehicle scene visual analysis method for dangerous driving behavior early warning

Publications (2)

Publication Number | Publication Date
CN110807352A | 2020-02-18
CN110807352B (en) | 2023-08-25

Family

ID=69487468

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201910808682.1A | In-vehicle scene visual analysis method for dangerous driving behavior early warning | 2019-08-29 | 2019-08-29

Country Status (1)

Country | Link
CN | CN110807352B (en)


Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103818256A (en) * 2012-11-16 2014-05-28 西安众智惠泽光电科技有限公司 Automobile fatigue-driving real-time alert system
US20160150070A1 (en) * 2013-07-18 2016-05-26 Secure4Drive Communication Ltd. Method and device for assisting in safe driving of a vehicle
CN103770733A (en) * 2014-01-15 2014-05-07 中国人民解放军国防科学技术大学 Method and device for detecting safety driving states of driver
WO2018039646A1 (en) * 2016-08-26 2018-03-01 Netradyne Inc. Recording video of an operator and a surrounding visual field
CN110290945A (en) * 2016-08-26 2019-09-27 奈特雷代恩股份有限公司 Record the video of operator and around visual field
US20190213429A1 (en) * 2016-11-21 2019-07-11 Roberto Sicconi Method to analyze attention margin and to prevent inattentive and unsafe driving
CN108319909A (en) * 2018-01-29 2018-07-24 清华大学 A kind of driving behavior analysis method and system
CN108764034A (en) * 2018-04-18 2018-11-06 浙江零跑科技有限公司 A kind of driving behavior method for early warning of diverting attention based on driver's cabin near infrared camera

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112396235A (en) * 2020-11-23 2021-02-23 浙江天行健智能科技有限公司 Traffic accident occurrence time prediction modeling method based on eyeball motion tracking
CN112396235B (en) * 2020-11-23 2022-05-03 浙江天行健智能科技有限公司 Traffic accident occurrence time prediction modeling method based on eyeball motion tracking
CN113221613A (en) * 2020-12-14 2021-08-06 国网浙江宁海县供电有限公司 Power scene early warning method for generating scene graph auxiliary modeling context information
CN113221613B (en) * 2020-12-14 2022-06-28 国网浙江宁海县供电有限公司 Power scene early warning method for generating scene graph auxiliary modeling context information
CN113179389A (en) * 2021-04-15 2021-07-27 江苏濠汉信息技术有限公司 System and method for identifying crane jib of power transmission line dangerous vehicle
CN113255519A (en) * 2021-05-25 2021-08-13 江苏濠汉信息技术有限公司 Crane lifting arm identification system and multi-target tracking method for power transmission line dangerous vehicle
CN113537115A (en) * 2021-07-26 2021-10-22 东软睿驰汽车技术(沈阳)有限公司 Method and device for acquiring driving state of driver and electronic equipment

Also Published As

Publication number Publication date
CN110807352B (en) 2023-08-25


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
CB02: Change of applicant information
    Address after: 310051 1st and 6th floors, no.451 Internet of things street, Binjiang District, Hangzhou City, Zhejiang Province
    Applicant after: Zhejiang Zero run Technology Co.,Ltd.
    Address before: 310051 1st and 6th floors, no.451 Internet of things street, Binjiang District, Hangzhou City, Zhejiang Province
    Applicant before: ZHEJIANG LEAPMOTOR TECHNOLOGY Co.,Ltd.
GR01: Patent grant