CN111242015B - Method for predicting driving dangerous scene based on motion profile semantic graph - Google Patents

Method for predicting driving dangerous scene based on motion profile semantic graph

Info

Publication number
CN111242015B
Authority
CN
China
Prior art keywords
driving
layer
motion profile
event
matrix
Prior art date
Legal status
Active
Application number
CN202010026768.1A
Other languages
Chinese (zh)
Other versions
CN111242015A (en)
Inventor
高珍
欧明锋
余荣杰
许靖宁
冯巾松
Current Assignee
Tongji University
Original Assignee
Tongji University
Priority date
Filing date
Publication date
Application filed by Tongji University
Priority to CN202010026768.1A
Publication of CN111242015A
Application granted
Publication of CN111242015B
Legal status: Active
Anticipated expiration

Classifications

    • G06V 20/40 - Scenes; scene-specific elements in video content
    • G06F 18/214 - Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F 18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N 3/006 - Computing arrangements based on biological models; artificial life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G06N 3/08 - Neural networks; learning methods
    • G06V 10/25 - Image preprocessing; determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis
    • Y02T 10/40 - Climate change mitigation technologies related to transportation; internal combustion engine [ICE] based vehicles; engine management systems

Abstract

The invention relates to a method for predicting a driving dangerous scene based on a motion profile semantic graph, which comprises the following steps. Step S1: acquire a driving video and segment a region of interest. Step S2: detect traffic objects with a target detection algorithm and generate a motion profile semantic graph. Step S3: collect statistics of the motion data, set an acceleration threshold, and divide the motion profile semantic graphs into high-risk events and normal events. Step S4: input the high-risk and normal events into a random forest classifier and rank the features by importance to obtain the important kinematic features. Step S5: construct a multi-modal deep neural network model. Step S6: obtain the motion profile semantic graph and important kinematic features of the driving video to be detected, input them into the multi-modal deep neural network model, predict whether the driving is at risk, and alarm the driver if it is. Compared with the prior art, the method improves the prediction accuracy of driving dangerous scenes and reduces the fluctuation of prediction precision.

Description

Method for predicting driving dangerous scene based on motion profile semantic graph
Technical Field
The invention relates to the field of automobile driver assistance, and in particular to a method for predicting a driving dangerous scene based on a motion profile semantic graph.
Background
Data fusion models based on deep learning are a new trend in traffic safety prediction, because video data and kinematic data each have their own limitations, and fusing the two types of data in a reasonable way to improve the accuracy of scene risk prediction is a hotspot of current research. There have been studies on high-risk driving scene recognition, but several problems remain. Some studies detect dangerous situations from abrupt changes in vehicle speed and direction combined with video frame differences; comparing frame differences with an autoencoder is better suited to corner cases, and the accuracy in general cases is only 71%, which is not ideal. Other work applies classical machine learning classifiers to kinematic data, including kNN, random forest, SVM, decision tree, Gaussian classifiers and AdaBoost, but the accuracy of the test results fluctuates and is strongly affected by the prediction horizon. It has also been proposed to create motion images from forward driving videos and to predict risk by computing time-to-collision (TTC) or capturing other information from the tracks.
Disclosure of Invention
The invention aims to overcome the defects of low accuracy and large fluctuation of prediction precision in the prior art, and provides a method for predicting a driving dangerous scene based on a motion profile semantic graph.
The aim of the invention can be achieved by the following technical scheme:
a method for predicting a driving danger scene based on a motion profile semantic graph comprises the following steps:
step S1: acquiring a driving video of a vehicle, and segmenting a region of interest (ROI) of the driving video;
step S2: detecting traffic objects with a target detection algorithm in the region of interest of the driving video and generating a motion profile semantic graph containing semantics;
step S3: counting motion data of the vehicle, setting an acceleration threshold according to a counting result, and dividing the motion profile semantic graph into high-risk events or normal events;
step S4: inputting the high-risk event or the normal event into a random forest classifier, and sequencing classification results according to feature importance to obtain important kinematic features;
step S5: constructing a multi-mode deep neural network model according to the motion profile semantic graph and the important kinematic features;
step S6: executing steps S1-S4 on the driving video to be detected to obtain its motion profile semantic graph and important kinematic features, inputting them into the multi-modal deep neural network model, predicting whether the driving is at risk, and giving an alarm to the driver if it is.
In step S1, the specific process of segmenting the region of interest of the driving video is as follows:
step S101: filtering irrelevant image textures in the driving video through a Gaussian filter, and extracting the outline of the road in the driving video through an edge detection algorithm, wherein the method specifically comprises the following steps:
step S1011: the color video frames of the driving video are converted into grayscale images as follows:
f = 0.299*R + 0.587*G + 0.114*B
where R, G and B are the matrices of the three RGB channels, respectively;
step S1012: the grayscale image is filtered with a Gaussian filter:
g(m,n) = Σ_(i,j) h(i,j)·f(m-i, n-j),  h(i,j) = (1/(2πσ²))·exp(-(i²+j²)/(2σ²))
where f(m,n) is the original grayscale value at position (m,n), g(m,n) is the Gaussian-filtered grayscale value, and h is the Gaussian kernel with standard deviation σ;
step S1013: the gradient strength and gradient direction of each pixel in the Gaussian-filtered grayscale image are calculated with the Sobel operator as follows:
S_x = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]],  S_y = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]
G_x(m,n) = S_x * g(m,n),  G_y(m,n) = S_y * g(m,n)
G(m,n) = sqrt(G_x(m,n)² + G_y(m,n)²)
θ(m,n) = arctan(G_y(m,n)/G_x(m,n))
where G_x(m,n) is the transverse gradient strength, G_y(m,n) is the longitudinal gradient strength, S_x is the transverse Sobel operator, S_y is the longitudinal Sobel operator, * denotes two-dimensional convolution, G(m,n) is the gradient strength and θ(m,n) is the gradient direction;
step S1014: the gradient strength of the current pixel is compared with that of the two neighboring pixels along the positive and negative gradient directions; if the current pixel has the largest gradient strength of the three, it is kept as an edge point, otherwise it is suppressed, i.e. set to 0;
step S1015: a lower threshold v_min and an upper threshold v_max are set; pixels whose gradient strength is greater than v_max are detected as edges, and pixels below v_min are detected as non-edges. A pixel between the two thresholds is judged to be an edge if it is adjacent to a pixel already determined to be an edge, and a non-edge otherwise, so that a corresponding binary image is obtained (edge points have gray value 1, non-edge points 0).
Step S102: straight lines in the road contour are detected by the Hough line transform: an accumulator counts the number of points mapped to each line in Hough space, and a straight line is detected if it receives enough mapped points;
step S103: after the Hough line transform, more than two lines may be detected in the image. Since only two lines are needed to calculate the position of the vanishing point, the lines are divided into a left group and a right group, the average parameters of each group are calculated, two intersecting lines are obtained from the average parameters, and the coordinates (x_d, y_d) of their intersection, i.e. the vanishing point, are calculated;
step S104: the ordinate y_d of the vanishing point is taken as the upper boundary y_u of the ROI, the largest ordinate among the starting points of the two detected groups of lines is taken as the lower boundary y_l of the ROI, and the width of the ROI is the width of the driving video.
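As an illustration of steps S101-S104, the following Python sketch uses OpenCV to estimate the ROI boundaries from a single frame. It is a minimal sketch only: the Gaussian kernel size, Canny thresholds, Hough parameters and the slope-based left/right grouping are illustrative assumptions, and cv2.Canny internally performs the Sobel, non-maximum-suppression and double-threshold steps S1013-S1015.

    import cv2
    import numpy as np

    def estimate_roi_bounds(frame_bgr, v_min=50, v_max=150):
        """Return (y_u, y_l): upper/lower ROI boundaries estimated from the lane-line geometry."""
        gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)      # S1011: grayscale conversion
        blurred = cv2.GaussianBlur(gray, (5, 5), 1.4)           # S1012: Gaussian filtering
        edges = cv2.Canny(blurred, v_min, v_max)                # S1013-S1015: binary edge image
        lines = cv2.HoughLinesP(edges, 1, np.pi / 180, 50,      # S102: Hough line transform
                                minLineLength=60, maxLineGap=30)
        if lines is None:
            return None
        left, right = [], []
        for x1, y1, x2, y2 in lines.reshape(-1, 4):
            if x1 == x2:
                continue                                        # ignore vertical lines
            k = (y2 - y1) / (x2 - x1)
            b = y1 - k * x1
            (left if k < 0 else right).append((k, b, max(y1, y2)))
        if not left or not right:
            return None
        kl, bl, _ = np.mean(left, axis=0)                       # S103: average line per group
        kr, br, _ = np.mean(right, axis=0)
        x_d = (br - bl) / (kl - kr)                             # intersection = vanishing point
        y_u = int(kl * x_d + bl)                                # S104: upper ROI boundary y_u
        y_l = int(max(p[2] for p in left + right))              # lower ROI boundary y_l
        return y_u, y_l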
The specific process of generating the motion profile semantic graph in step S2 is as follows:
step S201: the region of interest of each frame of the driving video is averaged and converted into one row of pixels, specifically:
step S2011: the RGB pixel values within the rectangle spanning [y_l, y_u] vertically and [0, w] horizontally are obtained for each frame of the driving video, i.e. a (y_u - y_l, w, 3) three-dimensional integer matrix, where w is the video width;
step S2012: for each RGB channel within this rectangle, the average of the vertical pixels is taken as the pixel value of one point, i.e. the (y_u - y_l, w, 3) three-dimensional matrix is averaged over its first dimension and arranged into a 1×w row of pixels, i.e. a (1, w, 3) matrix;
step S202: the row of pixels obtained for each frame is spliced in time order to form an (fps×(t_b - t_a), w, 3) matrix, where fps is the number of video frames per second and [t_a, t_b] is the time interval considered; a color motion profile is generated from this pixel matrix;
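A minimal Python sketch of steps S201-S202 with OpenCV and NumPy is given below; it assumes the ROI boundaries y_u and y_l (y_u < y_l in pixel coordinates) have already been obtained in step S1, and that the time interval [t_a, t_b] is given in seconds. The file and frame-rate handling are illustrative, not prescribed by the method.

    import cv2
    import numpy as np

    def build_motion_profile(video_path, y_u, y_l, t_a=0.0, t_b=7.0):
        """Stack the column-averaged ROI of each frame into an (n_frames, w, 3) motion profile."""
        cap = cv2.VideoCapture(video_path)
        fps = cap.get(cv2.CAP_PROP_FPS)
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(t_a * fps))
        rows = []
        for _ in range(int((t_b - t_a) * fps)):
            ok, frame = cap.read()                  # frame shape: (h, w, 3)
            if not ok:
                break
            roi = frame[y_u:y_l, :, :]              # S2011: (y_l - y_u, w, 3) rectangle
            rows.append(roi.mean(axis=0))           # S2012: vertical average -> (w, 3) row
        cap.release()
        return np.stack(rows).astype(np.uint8)      # S202: (fps*(t_b - t_a), w, 3) profile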
step S203: the motion profile is identified with a real-time object detection framework; for each identified traffic object in the traffic environment it is judged whether the object lies in the region of interest, and if so, the traffic object is marked, in the form of a colored pixel line segment, at the corresponding row of the motion profile according to its transverse position in the frame picture of the driving video, forming the motion profile semantic graph; the specific process is as follows:
step S2031: for the video frame picture at time t_f, the YOLO real-time object detection framework is used to identify all traffic objects in the picture and obtain four pieces of information for each: position, size, class and confidence;
step S2032: the traffic objects whose confidence is greater than c_t and whose center coordinates lie in the ROI are screened out, where traffic objects include pedestrians and vehicles;
step S2033: the pixel line segment position corresponding to each traffic object in the video frame of the driving video is calculated, specifically:
x_1 = x_c - w_o/2
x_2 = x_c + w_o/2
where [x_1, x_2] is the pixel line segment position corresponding to the traffic object, x_c and w_o are the center coordinate and width of the object detected by YOLO, and w is the width of the video picture;
step S2034: in the pixel row of the motion profile corresponding to time t_f (i.e. the t_f-th row), the pixel line segments [x_1, x_2] of objects of different classes are given different colors: if the object is a vehicle, the pixels within [x_1, x_2] are set to red; if the object is a pedestrian, they are set to green;
finally, a motion profile semantic graph containing the semantic features of the moving objects is formed; the line segments of an object line up over time into a continuous track, and the width of the track reflects the relative longitudinal position of the traffic object with respect to the ego vehicle: the wider the track, the closer the traffic object and the higher the corresponding risk.
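The marking of step S203 can be sketched as follows; the detector interface is an assumption (detections are taken as tuples (x_c, w_o, cls, conf) in pixel units, standing in for the output of a YOLO-style model already restricted to objects whose center lies in the ROI), and the color table follows step S2034 in the BGR channel order used by OpenCV.

    import numpy as np

    # BGR colors per class, following step S2034 (red for vehicles, green for pedestrians).
    CLASS_COLOURS = {"car": (0, 0, 255), "person": (0, 255, 0)}

    def mark_detections(profile, t_f, detections, fps, w, c_t=0.5):
        """Paint colored pixel segments for the detections of frame t_f into the motion profile."""
        row = int(t_f * fps)                        # profile row corresponding to time t_f
        for x_c, w_o, cls, conf in detections:
            if conf <= c_t or cls not in CLASS_COLOURS:
                continue                            # S2032: confidence / class screening
            x1 = max(0, int(x_c - w_o / 2))         # S2033: segment endpoints [x_1, x_2]
            x2 = min(w, int(x_c + w_o / 2))
            profile[row, x1:x2] = CLASS_COLOURS[cls]   # S2034: color the segment
        return profile

In practice the same loop is run for every frame of the clip, so that the colored segments accumulate into the object tracks described above.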
The specific process of step S3 is as follows:
step S301: most vehicle kinematic characteristic variables follow a normal distribution, so outliers in the vehicle motion data are detected and filtered out with the 3σ rule of the normal distribution: each non-empty kinematic characteristic variable of a driving record is checked, and a value is treated as an outlier if it satisfies:
|x-μ|>3σ
wherein x is a kinematic parameter, μ is an average value of x, and σ is a standard deviation of x;
filling the missing value by a linear interpolation method, specifically:
d̂_i = d_(i-1) + (d_(i+1) - d_(i-1))·(t_i - t_(i-1))/(t_(i+1) - t_(i-1)), 1 ≤ i ≤ n
where d̂_i is the missing value, d_(i-1) is the last non-empty nearest-neighbor value before the missing value, d_(i+1) is the next non-empty nearest-neighbor value after it, n is the total number of records, and t_(i-1), t_i, t_(i+1) are the times corresponding to d_(i-1), d̂_i and d_(i+1);
step S302: the vehicle acceleration data a in the natural driving data are extracted, their distribution curve is plotted and inspected, and an acceleration threshold for obvious deceleration behavior is determined and recorded as TH_d;
step S303: the driving time-series data are scanned and the emergency-braking moments t_d satisfying the acceleration condition a ≤ TH_d are collected; for each moment t_d, the time slice from d_1 to d_2 seconds before it is taken to form a potential high-risk event slice e_c; combined with video checking, false alarms caused by data-acquisition errors are eliminated, and the n_conflict_candidate remaining high-risk event slices form the high-risk event preparation set E_conflict_candidate. To avoid event overlap, adjacent emergency-braking moments are required to satisfy t_d[i+1] - t_d[i] ≥ |d_1 - d_2|.
step S304: from the remaining driving time-series data, |d_1 - d_2| seconds is used as the time window and n_normal_candidate normal, non-conflict events are randomly sampled to form the normal event preparation set E_normal_candidate.
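The event slicing of steps S301-S304 can be sketched in Python with pandas; the column name 'accel', the sampling rate hz and the default parameter values are assumptions for illustration, and the exclusion of normal windows that overlap risky slices is omitted for brevity.

    import numpy as np
    import pandas as pd

    def extract_events(df, th_d=-0.3, d1=8.0, d2=1.0, n_normal=1000, hz=10):
        """Split a driving time series into high-risk and normal candidate event slices.

        `df` is assumed to have an integer RangeIndex sampled at `hz` Hz and an 'accel' column;
        th_d, d1 and d2 play the roles of TH_d and the [d_1, d_2]-second window of step S303.
        """
        # Step S301: 3-sigma outlier filtering, then linear interpolation of the gaps.
        num_cols = df.select_dtypes("number").columns
        df[num_cols] = df[num_cols].astype(float)
        for col in num_cols:
            mu, sigma = df[col].mean(), df[col].std()
            df.loc[(df[col] - mu).abs() > 3 * sigma, col] = np.nan
        df[num_cols] = df[num_cols].interpolate(method="linear")

        # Step S303: emergency-braking moments a <= TH_d; slice [t_d - d1, t_d - d2] seconds,
        # keeping adjacent braking moments at least |d1 - d2| seconds apart to avoid overlap.
        window = int(abs(d1 - d2) * hz)
        risky, last = [], -window
        for i in np.flatnonzero(df["accel"].to_numpy() <= th_d):
            if i - last >= window and i - int(d1 * hz) >= 0:
                risky.append(df.iloc[i - int(d1 * hz): i - int(d2 * hz)])
                last = i

        # Step S304: sample normal (non-conflict) windows of the same length.
        starts = np.random.choice(len(df) - window, size=n_normal, replace=False)
        normal = [df.iloc[s: s + window] for s in starts]
        return risky, normal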
The specific process of step S4 is as follows:
step S401: for each event, which contains m_l records with multiple kinematic features, n kinematic features {m_1, …, m_n} are extracted as the features of a sample, and the event class is used as the classification label value of the sample, generating a sample set;
step S402: n_s samples are selected from the sample set by sampling with replacement to form a training set, and this is repeated q times to generate q training sets {S_1, …, S_q};
step S403: each training set is used as the input of one decision tree to construct a random forest {T_1, …, T_q} containing q CART decision trees, where for each node of T_i, m_node features are randomly selected without repetition and used to split S_i, with the minimum Gini index as the criterion for the optimal split, so that q CART decision trees are trained;
step S404: sorting classification results according to feature importance to obtain important kinematic features, specifically:
step S4041: for each kinematic feature m_j in {m_1, …, m_n}, the average change I_j of node-splitting impurity over all decision trees, i.e. its importance, is calculated; the impurity of a node o is measured with the Gini index as follows:
GI_o = Σ_k Σ_(k′≠k) p_ok·p_ok′ = 1 - Σ_k p_ok²
where GI_o is the Gini impurity of node o, k denotes the class (high risk or normal), p_ok is the proportion of class k in node o, and p_ok′ is the proportion of the classes other than k;
step S4042: the importance I_ji of m_j in the i-th tree is calculated as:
I_ji = Σ_(o∈O) I_jio = Σ_(o∈O) (GI_jio - G_jiol - G_jior)
where O is the set of nodes of the i-th tree that split on kinematic feature m_j, GI_jio is the Gini index of node o of the i-th tree, and G_jiol, G_jior are the Gini indices of the new left and right nodes after node o branches;
step S4043: the importance I_j of m_j over all trees is calculated as:
I_j = (1/q)·Σ_(i=1,…,q) I_ji
where q is the number of CART decision trees;
step S4044: after the importance set {I_1, …, I_n} of all kinematic features is obtained, the importances are normalized as follows:
I_j′ = I_j / (I_1 + I_2 + … + I_n)
the normalized importances are ranked from large to small, and the n_important top-ranked kinematic features are obtained; these constitute the kinematic feature vector f_kinematic of an event;
step S405: each event in the normal event preparation set E_normal_candidate and the high-risk event preparation set E_conflict_candidate is represented by its n_important selected kinematic features, i.e. each event is represented as {id, m_1, …, m_n_important, label}, where id is the event number and label is the event type, forming the normal event set E_normal and the high-risk event set E_conflict.
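The Gini-based importance of steps S401-S4044 corresponds to the mean-decrease-in-impurity importance exposed by scikit-learn's RandomForestClassifier, so the feature selection can be sketched as follows; the parameter values and the DataFrame interface are illustrative assumptions rather than values fixed by the method.

    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier

    def rank_kinematic_features(samples, labels, q=1000, m_node=2, n_important=5):
        """Rank kinematic features by normalized Gini importance and return the top n_important.

        `samples` is a DataFrame of per-event kinematic features and `labels` the event classes
        (high risk / normal); q and m_node mirror the parameters of steps S402-S403.
        """
        forest = RandomForestClassifier(
            n_estimators=q,          # q CART trees grown on bootstrap samples (S402-S403)
            max_features=m_node,     # m_node features tried at each split
            criterion="gini",        # minimum Gini index as the splitting criterion
            bootstrap=True,
            random_state=0,
        )
        forest.fit(samples, labels)
        # feature_importances_ already holds the normalized mean decrease in Gini impurity,
        # i.e. the quantity computed in steps S4041-S4044.
        importance = pd.Series(forest.feature_importances_, index=samples.columns)
        return importance.sort_values(ascending=False).head(n_important)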
The multi-modal deep neural network model specifically comprises:
an input layer, which converts the motion profile semantic graph into a matrix m_1;
a Conv1 layer, which sets the parameters of the convolution layer, including the number, size, stride and activation function of the filters, and takes m_1 as input to obtain a matrix m_2;
a Pool1 layer, which sets the parameters of the pooling layer, including the filter size, type and stride, and max-pools m_2 to obtain a matrix m_3;
a Conv2 layer, which sets the parameters of the convolution layer and passes m_3 through a ReLU activation function to obtain a matrix m_4;
a Pool2 layer, which sets the parameters of the pooling layer and max-pools m_4 to obtain a matrix m_5;
a Conv3 layer, which sets the parameters of the convolution layer and passes m_5 through a ReLU activation function to obtain a matrix m_6;
a Conv4 layer, which sets the parameters of the convolution layer and passes m_6 through a ReLU activation function to obtain a matrix m_7;
a Conv5 layer, which sets the parameters of the convolution layer and passes m_7 through a ReLU activation function to obtain a matrix m_8;
a Pool5 layer, which sets the parameters of the pooling layer and max-pools m_8 to obtain a matrix m_9;
an FC6 flattening layer, which flattens the input matrix m_9 into a one-dimensional matrix m_10;
a Drop6 layer, which discards a proportion of the neurons of the input matrix m_10 with a certain Dropout probability to prevent overfitting, obtaining a matrix m_11;
an FC7 fully connected layer, which takes the matrix m_11 as input and outputs an r×1 one-dimensional matrix m_12;
an FC8 fully connected layer: m_12 and f_kinematic are merged, i.e. [f_kinematic, m_12] is taken as the input of the FC8 fully connected layer, which outputs a 2×1 matrix whose two values correspond to the predicted probabilities of the risky and risk-free classes; the predicted values are then processed with Softmax so that the probabilities of the two classes sum to 1.
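A compact PyTorch sketch of the two-branch fusion described above is given below. Only the layer sequence and the fusion of f_kinematic at FC8 follow the text; the channel counts, kernel sizes, Dropout rate and the use of LazyLinear for FC7 are illustrative assumptions.

    import torch
    import torch.nn as nn

    class MultiModalDCNN(nn.Module):
        """CNN over the motion profile semantic graph, fused with kinematic features at FC8."""

        def __init__(self, n_kinematic=5, r=5):
            super().__init__()
            self.features = nn.Sequential(                    # Conv1 .. Pool5
                nn.Conv2d(3, 96, kernel_size=11, stride=4), nn.ReLU(),
                nn.MaxPool2d(kernel_size=3, stride=2),        # Pool1
                nn.Conv2d(96, 256, kernel_size=5, padding=2), nn.ReLU(),
                nn.MaxPool2d(kernel_size=3, stride=2),        # Pool2
                nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(),   # Conv3
                nn.Conv2d(384, 384, kernel_size=3, padding=1), nn.ReLU(),   # Conv4
                nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(),   # Conv5
                nn.MaxPool2d(kernel_size=3, stride=2),        # Pool5
            )
            self.fc = nn.Sequential(
                nn.Flatten(),                                 # FC6: flatten m_9 into m_10
                nn.Dropout(p=0.5),                            # Drop6
                nn.LazyLinear(r),                             # FC7: r x 1 image code m_12
            )
            self.fc8 = nn.Linear(r + n_kinematic, 2)          # FC8 on [f_kinematic, m_12]

        def forward(self, profile, f_kinematic):
            m12 = self.fc(self.features(profile))
            logits = self.fc8(torch.cat([f_kinematic, m12], dim=1))
            return torch.softmax(logits, dim=1)               # class probabilities summing to 1

For training, one would usually return the raw logits and apply nn.CrossEntropyLoss instead of the explicit Softmax.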
The specific process of step S5 is as follows:
step S501: the normal event set E_normal and the high-risk event set E_conflict obtained in step S4 are each divided into a training set Θ_train and a test set Θ_test in a ratio of 2:1;
step S502: the multi-modal deep neural network model is trained; after n_epoch epochs, when the loss value of the model has converged to a small value, training is stopped and the final multi-modal deep neural network model M_DCNN is saved;
step S503: for the test set Θ_test (containing normal events e_n and high-risk events e_c), the trained M_DCNN model is invoked for each event in the set to obtain its predicted classification value; the normal events and conflict events predicted by the model are counted, and a confusion matrix as shown in Table 1 is generated from the prediction results of the test set:
TABLE 1 Confusion matrix
                     Predicted high risk    Predicted normal
Actual high risk     TP                     FN
Actual normal        FP                     TN
The sensitivity I_sensitivity and specificity I_specificity of the model are calculated from the confusion matrix as follows:
I_sensitivity = TP/(TP+FN)
I_specificity = TN/(FP+TN)
and an ROC curve is generated from I_sensitivity and I_specificity to evaluate the prediction performance of the model.
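The sensitivity, specificity and ROC evaluation of step S503 can be computed with scikit-learn as in the sketch below; labels are assumed to be encoded as 1 for high-risk (conflict) events and 0 for normal events, and the 0.5 decision threshold is illustrative.

    import numpy as np
    from sklearn.metrics import confusion_matrix, roc_auc_score

    def evaluate(y_true, y_prob, threshold=0.5):
        """Sensitivity, specificity and AUC for the binary risk predictions of the test set."""
        y_pred = (np.asarray(y_prob) >= threshold).astype(int)
        tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
        sensitivity = tp / (tp + fn)              # I_sensitivity = TP / (TP + FN)
        specificity = tn / (fp + tn)              # I_specificity = TN / (FP + TN)
        auc = roc_auc_score(y_true, y_prob)       # area under the ROC curve
        return sensitivity, specificity, auc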
Compared with the prior art, the invention has the following beneficial effects:
1. the invention combines video data and kinematic data to perform risk prediction, and the model accuracy reaches 91.6 percent, which is far superior to other single-source data models.
2. According to the invention, a real-time object detection framework is used to detect moving objects in the frame pictures of the driving video, semantic information of the traffic object tracks is added to the motion profile generated from the video, the tracks of potential conflict objects such as motor vehicles, non-motorized vehicles and pedestrians are highlighted as colored line segments, and the interference of the tracks of static elements in the traffic environment on the prediction result is greatly reduced.
3. According to the invention, the random forest is used for screening important kinematic feature variables, so that the accuracy of the multi-mode deep neural network model is improved.
Drawings
FIG. 1 is a schematic flow chart of the present invention;
FIG. 2 is a schematic illustration of a road profile extracted by edge detection according to the present invention;
FIG. 3 is a schematic illustration of a forward driving video based region of interest of the present invention;
FIG. 4 is a schematic diagram of the conversion of a region of interest of a forward driving video to a motion profile;
FIG. 5 (a) is a motion profile semantic graph of a normal event after YOLO target recognition based on the present invention;
FIG. 5 (b) is a motion profile semantic graph of the present invention after YOLO-based object recognition and noise filtering.
Detailed Description
The invention will now be described in detail with reference to the drawings and specific examples. The present embodiment is implemented on the premise of the technical scheme of the present invention, and a detailed implementation manner and a specific operation process are given, but the protection scope of the present invention is not limited to the following examples.
As shown in fig. 1, a method for predicting a driving dangerous scene based on a motion profile semantic graph includes the following steps:
step S1: acquiring a driving video of a vehicle, and segmenting a region of interest (ROI) of the driving video;
step S2: detecting traffic objects with a target detection algorithm in the region of interest of the driving video and generating a motion profile semantic graph containing semantics;
step S3: counting motion data of the vehicle, and setting an acceleration threshold according to a counting result to divide a motion profile semantic graph into high risk events or normal events;
step S4: inputting the high-risk event or the normal event into a random forest classifier, and sequencing classification results according to the feature importance to obtain important kinematic features;
step S5: constructing a multi-mode deep neural network model according to the motion profile semantic graph and the important kinematic features;
step S6: steps S1-S4 are executed on the driving video to be detected to obtain its motion profile semantic graph and important kinematic features, which are input into the multi-modal deep neural network model to predict whether the driving is at risk; if the driving is at risk, an alarm is given to the driver.
The specific process of segmenting the region of interest of the driving video in step S1 is as follows:
step S101: as shown in fig. 2, filtering irrelevant image textures in the driving video by a gaussian filter, and extracting the outline of the road in the driving video by an edge detection algorithm specifically includes:
step S1011: the color video frames of the driving video are converted into grayscale images as follows:
f = 0.299*R + 0.587*G + 0.114*B
where R, G and B are the matrices of the three RGB channels, respectively;
step S1012: the grayscale image is filtered with a Gaussian filter:
g(m,n) = Σ_(i,j) h(i,j)·f(m-i, n-j),  h(i,j) = (1/(2πσ²))·exp(-(i²+j²)/(2σ²))
where f(m,n) is the original grayscale value at position (m,n), g(m,n) is the Gaussian-filtered grayscale value, and h is the Gaussian kernel with standard deviation σ;
step S1013: the gradient strength and gradient direction of each pixel in the Gaussian-filtered grayscale image are calculated with the Sobel operator as follows:
S_x = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]],  S_y = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]
G_x(m,n) = S_x * g(m,n),  G_y(m,n) = S_y * g(m,n)
G(m,n) = sqrt(G_x(m,n)² + G_y(m,n)²)
θ(m,n) = arctan(G_y(m,n)/G_x(m,n))
where G_x(m,n) is the transverse gradient strength, G_y(m,n) is the longitudinal gradient strength, S_x is the transverse Sobel operator, S_y is the longitudinal Sobel operator, * denotes two-dimensional convolution, G(m,n) is the gradient strength and θ(m,n) is the gradient direction;
step S1014: the gradient strength of the current pixel is compared with that of the two neighboring pixels along the positive and negative gradient directions; if the current pixel has the largest gradient strength of the three, it is kept as an edge point, otherwise it is suppressed, i.e. set to 0;
step S1015: a lower threshold v_min and an upper threshold v_max are set; pixels whose gradient strength is greater than v_max are detected as edges, and pixels below v_min are detected as non-edges. A pixel between the two thresholds is judged to be an edge if it is adjacent to a pixel already determined to be an edge, and a non-edge otherwise, so that a corresponding binary image is obtained (edge points have gray value 1, non-edge points 0).
Step S102: straight lines in the road contour are detected by the Hough line transform: an accumulator counts the number of points mapped to each line in Hough space, and a straight line is detected if it receives enough mapped points;
step S103: after the Hough line transform, more than two lines may be detected in the image. Since only two lines are needed to calculate the position of the vanishing point, the lines are divided into a left group and a right group, the average parameters of each group are calculated, two intersecting lines are obtained from the average parameters, and the coordinates (x_d, y_d) of their intersection, i.e. the vanishing point, are calculated;
step S104: the ordinate y_d of the vanishing point is taken as the upper boundary y_u of the ROI, the largest ordinate among the starting points of the two detected groups of lines is taken as the lower boundary y_l of the ROI, and the width of the ROI is the width of the driving video.
The specific process of generating the motion profile semantic graph in step S2 is as follows:
step S201: the region of interest of each frame of the driving video is averaged and converted into one row of pixels, specifically:
step S2011: the RGB pixel values within the rectangle spanning [y_l, y_u] vertically and [0, w] horizontally are obtained for each frame of the driving video, i.e. a (y_u - y_l, w, 3) three-dimensional integer matrix, where w is the video width;
step S2012: for each RGB channel within this rectangle, the average of the vertical pixels is taken as the pixel value of one point, i.e. the (y_u - y_l, w, 3) three-dimensional matrix is averaged over its first dimension and arranged into a 1×w row of pixels, i.e. a (1, w, 3) matrix;
step S202: the row of pixels obtained for each frame is spliced in time order to form an (fps×(t_b - t_a), w, 3) matrix, where fps is the number of video frames per second and [t_a, t_b] is the time interval considered; a color motion profile is generated from this pixel matrix;
step S203: as shown in FIG. 3, the motion profile is identified with a real-time object detection framework; for each identified traffic object in the traffic environment it is judged whether the object lies in the region of interest, and if so, the traffic object is marked, in the form of a colored pixel line segment, at the corresponding row of the motion profile according to its transverse position in the frame picture of the driving video, forming the motion profile semantic graph; the specific process is as follows:
step S2031: for the video frame picture at time t_f, the YOLO real-time object detection framework is used to identify all traffic objects in the picture and obtain four pieces of information for each: position, size, class and confidence;
step S2032: screening traffic objects with confidence coefficient greater than 0.5 and central coordinates in the ROI area, wherein the traffic objects comprise pedestrians and vehicles;
step S2033: the pixel line segment position corresponding to each traffic object in the video frame of the driving video is calculated, specifically:
x_1 = x_c - w_o/2
x_2 = x_c + w_o/2
where [x_1, x_2] is the pixel line segment position corresponding to the traffic object, x_c and w_o are the center coordinate and width of the object detected by YOLO, and w is the width of the video picture;
step S2034: as shown in FIG. 4, in the pixel row of the motion profile corresponding to time t_f (i.e. the t_f-th row), the pixel line segments [x_1, x_2] of objects of different classes are given different colors: if the object is a vehicle, the pixels within [x_1, x_2] are set to red; if the object is a pedestrian, they are set to green;
as shown in FIG. 5 (a) and FIG. 5 (b), a motion profile semantic graph containing the semantic features of the moving objects is finally formed; the line segments of an object line up over time into a continuous track, and the width of the track reflects the relative longitudinal position of the traffic object with respect to the ego vehicle: the wider the track, the closer the traffic object and the higher the corresponding risk.
The specific process of step S3 is as follows:
step S301: most vehicle kinematic characteristic variables follow a normal distribution, so outliers in the vehicle motion data are detected and filtered out with the 3σ rule of the normal distribution: each non-empty kinematic characteristic variable of a driving record is checked, and a value is treated as an outlier if it satisfies:
|x-μ|>3σ
wherein x is a kinematic parameter, μ is an average value of x, and σ is a standard deviation of x;
filling the missing value by a linear interpolation method, specifically:
d̂_i = d_(i-1) + (d_(i+1) - d_(i-1))·(t_i - t_(i-1))/(t_(i+1) - t_(i-1)), 1 ≤ i ≤ n
where d̂_i is the missing value, d_(i-1) is the last non-empty nearest-neighbor value before the missing value, d_(i+1) is the next non-empty nearest-neighbor value after it, n is the total number of records, and t_(i-1), t_i, t_(i+1) are the times corresponding to d_(i-1), d̂_i and d_(i+1);
step S302: the vehicle acceleration data a in the natural driving data are extracted, their distribution curve is plotted and inspected, and an acceleration threshold of -0.3 for obvious deceleration behavior is determined;
step S303: the driving time-series data are scanned and the emergency-braking moments t_d satisfying the acceleration condition a ≤ -0.3 are collected; for each moment t_d, the time slice from 8 seconds to 1 second before it is taken to form a potential high-risk event slice e_c; combined with video checking, false alarms caused by data-acquisition errors are eliminated, and the 179 remaining event slices form the high-risk event preparation set E_conflict_candidate. To avoid event overlap, adjacent emergency-braking moments are required to satisfy the condition t_d[i+1] - t_d[i] ≥ 7.
step S304: from the remaining driving time-series data, 7 seconds is used as the time window and 1055 normal, non-conflict events are randomly sampled to form the normal event preparation set E_normal_candidate.
The specific process of step S4 is as follows:
step S401: an event contains m_l records, each with multiple kinematic features; the 26 kinematic features listed in Table 2 are extracted as the features of the sample.
Table 2 Event sample field specification table (the 26 kinematic feature fields; the full table is provided as an image in the original publication)
The event class is taken as the classification label value of the sample to generate a sample set;
step S402: 616 samples (89 high-risk event samples and 527 normal event samples) are selected as a training set by sampling with replacement, and this is repeated 1000 times to generate 1000 training sets {S_1, …, S_1000};
step S403: each training set is used as the input of one decision tree to construct a random forest {T_1, …, T_1000} containing 1000 CART decision trees, where for each node of T_i, m_node = 2 features are randomly selected without repetition and used to split S_i, with the minimum Gini index as the criterion for the optimal split, so that 1000 CART decision trees are trained;
step S404: sorting classification results according to feature importance to obtain important kinematic features, specifically:
step S4041: for each of the 26 kinematic features m_j, the average change I_j of node-splitting impurity over all decision trees, i.e. its importance, is calculated; the impurity of a node o is measured with the Gini index as follows:
GI_o = Σ_k Σ_(k′≠k) p_ok·p_ok′ = 1 - Σ_k p_ok²
where GI_o is the Gini impurity of node o, k denotes the class (high risk or normal), p_ok is the proportion of class k in node o, and p_ok′ is the proportion of the classes other than k;
step S4042: the importance I_ji of m_j in the i-th tree is calculated as:
I_ji = Σ_(o∈O) I_jio = Σ_(o∈O) (GI_jio - G_jiol - G_jior)
where O is the set of nodes of the i-th tree that split on kinematic feature m_j, GI_jio is the Gini index of node o of the i-th tree, and G_jiol, G_jior are the Gini indices of the new left and right nodes after node o branches;
step S4043: the importance I_j of m_j over all trees is calculated as:
I_j = (1/q)·Σ_(i=1,…,q) I_ji
where q is the number of CART decision trees;
step S4044: after the importance set {I_1, …, I_n} of all kinematic features is obtained, the importances are normalized as follows:
I_j′ = I_j / (I_1 + I_2 + … + I_n)
the normalized importances are ranked from large to small, and the 5 features with the highest importance are shown in Table 3:
TABLE 3 Feature importance ranking table
ACCEL_MEAN: average acceleration from 8 seconds to 2 seconds before the risk moment
ACCEL_MAX: maximum acceleration from 8 seconds to 2 seconds before the risk moment
ACCEL_MIN: minimum acceleration from 8 seconds to 2 seconds before the risk moment
ACCEL_5S: acceleration 5 seconds before the risk moment
ACCEL_6S: acceleration 6 seconds before the risk moment
step S405: each event in the normal event preparation set E_normal_candidate and the high-risk event preparation set E_conflict_candidate is represented by the above 5 features, i.e. each event is represented as {id, ACCEL_MEAN, ACCEL_MAX, ACCEL_MIN, ACCEL_5S, ACCEL_6S, label}, where id is the event number and label is the event type, forming the normal event set E_normal and the high-risk event set E_conflict.
The multi-modal deep neural network model specifically comprises:
an input layer, which converts the motion profile semantic graph into a matrix m_1;
a Conv1 layer, which sets the parameters of the convolution layer, including the number, size, stride and activation function of the filters, and takes m_1 as input to obtain a matrix m_2;
a Pool1 layer, which sets the parameters of the pooling layer, including the filter size, type and stride, and max-pools m_2 to obtain a matrix m_3;
a Conv2 layer, which sets the parameters of the convolution layer and passes m_3 through a ReLU activation function to obtain a matrix m_4;
a Pool2 layer, which sets the parameters of the pooling layer and max-pools m_4 to obtain a matrix m_5;
a Conv3 layer, which sets the parameters of the convolution layer and passes m_5 through a ReLU activation function to obtain a matrix m_6;
a Conv4 layer, which sets the parameters of the convolution layer and passes m_6 through a ReLU activation function to obtain a matrix m_7;
a Conv5 layer, which sets the parameters of the convolution layer and passes m_7 through a ReLU activation function to obtain a matrix m_8;
a Pool5 layer, which sets the parameters of the pooling layer and max-pools m_8 to obtain a matrix m_9;
an FC6 flattening layer, which flattens the input matrix m_9 into a one-dimensional matrix m_10;
a Drop6 layer, which discards a proportion of the neurons of the input matrix m_10 with a certain Dropout probability to prevent overfitting, obtaining a matrix m_11;
an FC7 fully connected layer, which takes the matrix m_11 as input and outputs an r×1 one-dimensional matrix m_12;
an FC8 fully connected layer: m_12 and f_kinematic are merged, i.e. [f_kinematic, m_12] is taken as the input of the FC8 fully connected layer, which outputs a 2×1 matrix whose two values correspond to the predicted probabilities of the risky and risk-free classes; the predicted values are then processed with Softmax so that the probabilities of the two classes sum to 1. The matrix transformations in the multi-modal deep neural network model are specifically shown in Table 4:
Table 4 Multi-modal network architecture
Layer    Input              Output
Conv1    224×224×3          54×54×96
Pool1    54×54×96           28×28×96
Conv2    28×28×96           28×28×256
Pool2    28×28×256          13×13×256
Conv3    13×13×256          13×13×384
Conv4    13×13×384          13×13×384
Conv5    13×13×384          13×13×256
Pool5    13×13×256          6×6×256
FC6      6×6×256            4096×1
Drop6    4096×1             2048×1
FC7      2048×1             5×1
FC8      10×1 (5×1 + 5×1)   2×1
The specific process of step S5 is as follows:
step S501: the normal event set E_normal and the high-risk event set E_conflict obtained in step S4 are each divided into a training set Θ_train and a test set Θ_test in a ratio of 2:1;
step S502: the multi-modal deep neural network model is trained; after n_epoch epochs, when the loss value of the model has converged to a small value, training is stopped and the final multi-modal deep neural network model M_DCNN is saved;
step S503: for the test set Θ_test (containing normal events e_n and high-risk events e_c), the trained M_DCNN model is invoked for each event in the set to obtain its predicted classification value; the normal events and conflict events predicted by the model are counted, and a confusion matrix as shown in Table 5 is generated from the prediction results of the test set:
Table 5 Confusion matrix
                     Predicted high risk    Predicted normal
Actual high risk     TP                     FN
Actual normal        FP                     TN
The sensitivity I_sensitivity and specificity I_specificity of the model are calculated from the confusion matrix as follows:
I_sensitivity = TP/(TP+FN)
I_specificity = TN/(FP+TN)
and an ROC curve is generated from I_sensitivity and I_specificity to evaluate the prediction performance of the model. The AUC of the ROC curve for the multi-modal deep neural network model is 0.9, compared with 0.56 for the decision tree model, 0.75 for the random forest model, 0.69 for the Bayesian network model and 0.69 for the logistic regression model; the multi-modal deep neural network model therefore outperforms the other models in both accuracy and reliability.
Furthermore, the particular embodiments described herein may vary from one embodiment to another, and the above description is merely illustrative of the structure of the present invention. All such small variations and simple variations in construction, features and principles of the inventive concept are intended to be included within the scope of the present invention. Various modifications or additions to the described embodiments or similar methods may be made by those skilled in the art without departing from the structure of the invention or exceeding the scope of the invention as defined in the accompanying claims.

Claims (6)

1. The method for predicting the driving dangerous scene based on the motion profile semantic graph is characterized by comprising the following steps of:
step S1: acquiring a driving video of a vehicle, and dividing an interested region of the driving video;
step S2: detecting traffic objects with a target detection algorithm in the region of interest of the driving video and generating a motion profile semantic graph containing semantics;
step S3: counting motion data of the vehicle, setting an acceleration threshold according to a counting result, and dividing the motion profile semantic graph into high-risk events or normal events;
step S4: inputting the high-risk event or the normal event into a random forest classifier, and sequencing classification results according to feature importance to obtain important kinematic features;
step S5: constructing a multi-mode deep neural network model according to the motion profile semantic graph and the important kinematic features;
step S6: steps S1-S4 are executed on the driving video to be detected to obtain its motion profile semantic graph and important kinematic features, which are input into the multi-modal deep neural network model to predict whether the driving is at risk; if the driving is at risk, an alarm is given to the driver;
the specific process of generating the motion profile semantic graph in the step S2 is as follows:
step S201: averaging the region of interest of each frame of image of the driving video and converting it into one row of pixels;
step S202: splicing the rows of pixels of all frames together in time order to form a motion profile;
step S203: identifying the motion profile with a real-time object detection framework, judging whether each identified traffic object in the traffic environment is located in the region of interest, and if so, marking the traffic object, in the form of a colored pixel line segment, at the position of the corresponding row in the motion profile according to the transverse position of the traffic object in the frame picture of the driving video, forming the motion profile semantic graph;
the feature importance is determined by the Gini index of the corresponding features of the high-risk events or the normal events;
the specific process of step S4 is as follows:
step S401: for each event, which contains m_l records with multiple kinematic features, n kinematic features {m_1, …, m_n} are extracted as the features of a sample, and the event class is used as the classification label value of the sample, generating a sample set;
step S402: n_s samples are selected from the sample set by sampling with replacement to form a training set, and this is repeated q times to generate q training sets {S_1, …, S_q};
step S403: each training set is used as the input of one decision tree to construct a random forest {T_1, …, T_q} containing q CART decision trees, where for each node of T_i, m_node features are randomly selected without repetition and used to split S_i, with the minimum Gini index as the criterion for the optimal split, so that q CART decision trees are trained;
step S404: sorting classification results according to feature importance to obtain important kinematic features, specifically:
step S4041: for each kinematic feature m_j in {m_1, …, m_n}, the average change I_j of node-splitting impurity over all decision trees, i.e. its importance, is calculated; the impurity of a node o is measured with the Gini index as follows:
GI_o = Σ_k Σ_(k′≠k) p_ok·p_ok′ = 1 - Σ_k p_ok²
where GI_o is the Gini impurity of node o, k denotes the class (high risk or normal), p_ok is the proportion of class k in node o, and p_ok′ is the proportion of the classes other than k;
step S4042: the importance I_ji of m_j in the i-th tree is calculated as:
I_ji = Σ_(o∈O) I_jio = Σ_(o∈O) (GI_jio - G_jiol - G_jior)
where O is the set of nodes of the i-th tree that split on kinematic feature m_j, GI_jio is the Gini index of node o of the i-th tree, and G_jiol, G_jior are the Gini indices of the new left and right nodes after node o branches;
step S4043: the importance I_j of m_j over all trees is calculated as:
I_j = (1/q)·Σ_(i=1,…,q) I_ji
where q is the number of CART decision trees;
step S4044: after the importance set {I_1, …, I_n} of all kinematic features is obtained, the importances are normalized as follows:
I_j′ = I_j / (I_1 + I_2 + … + I_n)
the normalized importances are ranked from large to small, and the n_important top-ranked kinematic features are obtained; these constitute the kinematic feature vector f_kinematic of an event;
step S405: each event in the normal event preparation set E_normal_candidate and the high-risk event preparation set E_conflict_candidate is represented by its n_important selected kinematic features, i.e. each event is represented as {id, m_1, …, m_n_important, label}, where id is the event number and label is the event type, forming the normal event set E_normal and the high-risk event set E_conflict;
The multi-modal deep neural network model specifically comprises:
an input layer, which converts the motion profile semantic graph into a matrix m_1;
a Conv1 layer, which sets the parameters of the convolution layer, including the number, size, stride and activation function of the filters, and takes m_1 as input to obtain a matrix m_2;
a Pool1 layer, which sets the parameters of the pooling layer, including the filter size, type and stride, and max-pools m_2 to obtain a matrix m_3;
a Conv2 layer, which sets the parameters of the convolution layer and passes m_3 through a ReLU activation function to obtain a matrix m_4;
a Pool2 layer, which sets the parameters of the pooling layer and max-pools m_4 to obtain a matrix m_5;
a Conv3 layer, which sets the parameters of the convolution layer and passes m_5 through a ReLU activation function to obtain a matrix m_6;
a Conv4 layer, which sets the parameters of the convolution layer and passes m_6 through a ReLU activation function to obtain a matrix m_7;
a Conv5 layer, which sets the parameters of the convolution layer and passes m_7 through a ReLU activation function to obtain a matrix m_8;
a Pool5 layer, which sets the parameters of the pooling layer and max-pools m_8 to obtain a matrix m_9;
an FC6 flattening layer, which flattens the input matrix m_9 into a one-dimensional matrix m_10;
a Drop6 layer, which discards a proportion of the neurons of the input matrix m_10 with a certain Dropout probability to prevent overfitting, obtaining a matrix m_11;
an FC7 fully connected layer, which takes the matrix m_11 as input and outputs an r×1 one-dimensional matrix m_12;
an FC8 fully connected layer: m_12 and f_kinematic are merged, i.e. [f_kinematic, m_12] is taken as the input of the FC8 fully connected layer, which outputs a 2×1 matrix whose two values correspond to the predicted probabilities of the risky and risk-free classes; the predicted values are then processed with Softmax so that the probabilities of the two classes sum to 1.
2. The method for predicting a driving hazard scene based on motion profile semantic graphs according to claim 1, wherein the region of interest comprises an upper boundary and a lower boundary.
3. The method for predicting a driving hazard scene based on a motion profile semantic graph according to claim 2, wherein the specific process of segmenting the region of interest of the driving video in step S1 is as follows:
step S101: filtering irrelevant image textures in the driving video through a Gaussian filter, and extracting the outline of a road in the driving video through an edge detection algorithm;
step S102: detecting straight lines in the contour of the road by the Hough line transform;
step S103: and calculating cross lines for the detected groups of straight lines, obtaining cross points according to the cross lines to determine the upper boundary of the region of interest, and determining the lower boundary of the region of interest according to the starting points of the groups of straight lines.
4. The method for predicting driving risk scenes based on motion profile semantic graphs according to claim 1, wherein the traffic objects include pedestrians and vehicles.
5. The method for predicting driving hazard scenes based on the motion profile semantic graph according to claim 1, wherein the specific process of the step S3 is as follows:
step S301: detecting and filtering abnormal values of vehicle motion data through a normal distributed 3 sigma principle, and filling the missing values through a linear interpolation method;
step S302: acquiring corresponding acceleration distribution according to the filtered and filled vehicle motion data, judging vehicle avoidance behavior, and setting an acceleration threshold value of a dangerous driving event according to a judgment result;
step S303: extracting a potential dangerous driving event according to the acceleration threshold value;
step S304: and calibrating a normal event set and a conflict event set on the potential dangerous driving event according to the checking result of the driving video.
6. The method for predicting a driving hazard scene based on a motion profile semantic graph according to claim 1, wherein the multi-modal deep neural network model comprises a visual data processing layer, a kinematic data processing layer, a data fusion layer and a prediction layer.
CN202010026768.1A 2020-01-10 2020-01-10 Method for predicting driving dangerous scene based on motion profile semantic graph Active CN111242015B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010026768.1A CN111242015B (en) 2020-01-10 2020-01-10 Method for predicting driving dangerous scene based on motion profile semantic graph

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010026768.1A CN111242015B (en) 2020-01-10 2020-01-10 Method for predicting driving dangerous scene based on motion profile semantic graph

Publications (2)

Publication Number Publication Date
CN111242015A CN111242015A (en) 2020-06-05
CN111242015B true CN111242015B (en) 2023-05-02

Family

ID=70872597

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010026768.1A Active CN111242015B (en) 2020-01-10 2020-01-10 Method for predicting driving dangerous scene based on motion profile semantic graph

Country Status (1)

Country Link
CN (1) CN111242015B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767850A (en) * 2020-06-29 2020-10-13 北京百度网讯科技有限公司 Method and device for monitoring emergency, electronic equipment and medium
CN111767851A (en) * 2020-06-29 2020-10-13 北京百度网讯科技有限公司 Method and device for monitoring emergency, electronic equipment and medium
CN111860425B (en) * 2020-07-30 2021-04-09 清华大学 Deep multi-mode cross-layer cross fusion method, terminal device and storage medium
CN111950478B (en) * 2020-08-17 2021-07-23 浙江东鼎电子股份有限公司 Method for detecting S-shaped driving behavior of automobile in weighing area of dynamic flat-plate scale
CN112115819B (en) * 2020-09-03 2022-09-20 同济大学 Driving danger scene identification method based on target detection and TET (transient enhanced test) expansion index
CN112084968B (en) * 2020-09-11 2023-05-26 清华大学 Semantic characterization method and system based on air monitoring video and electronic equipment
CN112396093B (en) * 2020-10-29 2022-10-14 中国汽车技术研究中心有限公司 Driving scene classification method, device and equipment and readable storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013013487A1 (en) * 2011-07-26 2013-01-31 华南理工大学 Device and method for monitoring driving behaviors of driver based on video detection
CN106611169A (en) * 2016-12-31 2017-05-03 中国科学技术大学 Dangerous driving behavior real-time detection method based on deep learning
CN108764034A (en) * 2018-04-18 2018-11-06 浙江零跑科技有限公司 A kind of driving behavior method for early warning of diverting attention based on driver's cabin near infrared camera
CN109543601A (en) * 2018-11-21 2019-03-29 电子科技大学 A kind of unmanned vehicle object detection method based on multi-modal deep learning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10296796B2 (en) * 2016-04-06 2019-05-21 Nec Corporation Video capturing device for predicting special driving situations

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013013487A1 (en) * 2011-07-26 2013-01-31 华南理工大学 Device and method for monitoring driving behaviors of driver based on video detection
CN106611169A (en) * 2016-12-31 2017-05-03 中国科学技术大学 Dangerous driving behavior real-time detection method based on deep learning
CN108764034A (en) * 2018-04-18 2018-11-06 浙江零跑科技有限公司 A kind of driving behavior method for early warning of diverting attention based on driver's cabin near infrared camera
CN109543601A (en) * 2018-11-21 2019-03-29 电子科技大学 A kind of unmanned vehicle object detection method based on multi-modal deep learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Rongjie Yu等.Driving Style Analyses for Car-sharing Users Utilizing Low-frequency Trajectory Data.2019 5th International Conference on Transportation Information and Safety (ICTIS).2019,927-933. *
Gao Zhen et al. Road traffic accident risk prediction model in a continuous data environment. China Journal of Highway and Transport. 2018, (04), 284-291. *

Also Published As

Publication number Publication date
CN111242015A (en) 2020-06-05

Similar Documents

Publication Publication Date Title
CN111242015B (en) Method for predicting driving dangerous scene based on motion profile semantic graph
CN109977812B (en) Vehicle-mounted video target detection method based on deep learning
CN109829403B (en) Vehicle anti-collision early warning method and system based on deep learning
CN110210475B (en) License plate character image segmentation method based on non-binarization and edge detection
CN111860274B (en) Traffic police command gesture recognition method based on head orientation and upper half skeleton characteristics
CN113421269A (en) Real-time semantic segmentation method based on double-branch deep convolutional neural network
CN111626170B (en) Image recognition method for railway side slope falling stone intrusion detection
CN106845458B (en) Rapid traffic sign detection method based on nuclear overrun learning machine
CN104915642A (en) Method and apparatus for measurement of distance to vehicle ahead
CN116383685A (en) Vehicle lane change detection method based on space-time interaction diagram attention network
CN114596316A (en) Road image detail capturing method based on semantic segmentation
CN113807298B (en) Pedestrian crossing intention prediction method and device, electronic equipment and readable storage medium
CN113724286A (en) Method and device for detecting saliency target and computer-readable storage medium
Muthalagu et al. Object and Lane Detection Technique for Autonomous Car Using Machine Learning Approach
Sumi et al. Frame level difference (FLD) features to detect partially occluded pedestrian for ADAS
CN116630702A (en) Pavement adhesion coefficient prediction method based on semantic segmentation network
CN116434203A (en) Anger driving state identification method considering language factors of driver
CN115761697A (en) Auxiliary intelligent driving target detection method based on improved YOLOV4
Pargi et al. Classification of different vehicles in traffic using RGB and Depth images: A Fast RCNN Approach
CN114758326A (en) Real-time traffic post working behavior state detection system
CN111310607B (en) Highway safety risk identification method and system based on computer vision and artificial intelligence
Dorrani Traffic Scene Analysis and Classification using Deep Learning
CN113701642A (en) Method and system for calculating appearance size of vehicle body
CN110738113A (en) object detection method based on adjacent scale feature filtering and transferring
JP7280426B1 (en) Object detection device and object detection method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant