CN110969130A - Driver dangerous action identification method and system based on YOLOV3


Info

Publication number
CN110969130A
CN110969130A (application CN201911220885.5A)
Authority
CN
China
Prior art keywords
face
roi
driver
dangerous
yolov3
Prior art date
Legal status
Granted
Application number
CN201911220885.5A
Other languages
Chinese (zh)
Other versions
CN110969130B (en)
Inventor
袁嘉言
Current Assignee
Xiamen Ruiwei Information Technology Co., Ltd.
Original Assignee
Xiamen Ruiwei Information Technology Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Xiamen Ruiwei Information Technology Co., Ltd.
Priority to CN201911220885.5A
Publication of CN110969130A
Application granted
Publication of CN110969130B
Legal status: Active

Classifications

    • G06V 20/597: Recognising the driver's state or behaviour, e.g. attention or drowsiness (context of the image inside a vehicle)
    • G06F 18/23213: Non-hierarchical clustering using statistics or function optimisation, with a fixed number of clusters, e.g. K-means clustering
    • G06F 18/24: Classification techniques
    • G06N 3/045: Neural network architectures; combinations of networks
    • G06N 3/08: Neural network learning methods
    • G06V 40/166: Human face detection, localisation or normalisation using acquisition arrangements
    • Y02T 10/40: Engine management systems (climate change mitigation in road transport)


Abstract

The invention provides a YOLOV3-based method for recognizing dangerous driver actions. An infrared image of the driver is acquired, the face position is located by a face detection algorithm, and the region to be analyzed for dangerous driver actions is selected according to the face position. The YOLOV3 algorithm then rapidly detects whether a dangerous driver action occurs within this region. If YOLOV3 detects a dangerous action, the detected action region is extracted and classified by a deep learning network to determine which dangerous action the driver is performing. Detections are accumulated over a period of time; if the driver repeatedly performs a dangerous behavior, the driver is reminded to drive safely and the behavior is uploaded to the cloud. The invention also provides a YOLOV3-based driver dangerous action recognition system. The two-stage design makes the prediction result more accurate and greatly reduces false recognition when alarming on dangerous behaviors.

Description

Driver dangerous action identification method and system based on YOLOV3
Technical Field
The invention relates to a driver dangerous action identification method and system based on YOLOV3.
Background
As domestic road infrastructure continues to improve, the number of commercial vehicles on the road keeps growing, and the state pays increasing attention to road safety. Meanwhile, deep learning has developed rapidly in recent years, producing many effective new algorithms. Advances in technology and the demands of application scenarios are driving many high-tech solutions into production, and driver-assistance safety is one direction in which deep learning vision is being deployed. Most traffic accidents are caused by human factors during driving: fatigue, drunk driving, or improper operation can lead to serious accidents and heavy economic losses. Detecting a driver's dangerous actions is therefore very important for effectively reducing traffic accidents. Several representative methods for detecting dangerous driver behavior are described below:
(1) Phone-call recognition based on classification with traditional image algorithms: after a detection algorithm locates the face, a large region around the face is cropped and analyzed directly to decide whether the driver is making a call. A typical traditional machine learning pipeline is: detect the face with the AdaBoost face detection algorithm (AdaBoost, short for Adaptive Boosting, is a machine learning method commonly used for fast face detection); crop a large region around the detected face to build positive and negative phone-call samples; and train an SVM (Support Vector Machine, a traditional machine learning algorithm) on the constructed sample library. At prediction time, the trained SVM model is loaded, the cropped region to be tested is fed in, and the probability of a phone call is output. Advantage of the algorithm: it is fast. Drawback of the algorithm: the learning capacity of traditional machine learning classification is insufficient, so the final accuracy is not high.
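To make this traditional pipeline concrete, a minimal Python sketch follows, using OpenCV's Haar cascade in place of the AdaBoost face detector and scikit-learn's SVC as the SVM; this is an illustration only (the crop factor, feature flattening, and function names are assumptions, not from the patent):

```python
import cv2
import numpy as np
from sklearn.svm import SVC

# Haar cascade as a stand-in for the AdaBoost face detector described above.
face_detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def crop_call_region(gray, face, scale=2):
    """Crop an enlarged region around the detected face (the factor is assumed)."""
    x, y, w, h = face
    x0 = max(0, x - w // 2)
    region = gray[y:y + scale * h, x0:x0 + scale * w]
    return cv2.resize(region, (64, 64))

def predict_call_probability(gray, svm: SVC) -> float:
    """Return the SVM's probability that the cropped region shows a phone call.
    The SVM must have been trained with probability=True on flattened crops."""
    faces = face_detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return 0.0
    features = crop_call_region(gray, faces[0]).flatten().reshape(1, -1) / 255.0
    return svm.predict_proba(features)[0, 1]
```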
(2) Phone-call recognition based on deep learning classification: given the insufficient learning capacity of the traditional SVM method, which often falls short in complex real-world application scenarios, the classifier is replaced with a CNN (convolutional neural network). When the network is deep enough, a CNN has much stronger feature-extraction capability and can learn phone-call behavior under the varied, complex lighting of actual driving, fitting the data well. Advantages of the algorithm: deep learning can fit large amounts of complex data, learns data features more strongly, and predicts better than an SVM in a big-data setting. Drawback: because classification is performed on a large face-based region, much irrelevant background information is introduced, and unexplained false recognitions easily occur on backgrounds the algorithm was not trained on.
(3) Phone-call recognition based on deep learning detection: classifying an entire input image can misidentify a pure background as a driver making a call even when no hand is near the face, which makes the false recognition hard to understand. Engineers therefore proposed detection-based methods to identify the driver's call action, for example a modified SSD (Single Shot MultiBox Detector, an end-to-end object detection algorithm) that detects the hand-holding-phone action. With a detection method, not only the classification result but also the specific position where the call action occurs is known, so the method learns more concrete action features in the image. Advantage of the algorithm: false detections on pure background are effectively eliminated. Drawback of the algorithm: when a hand is held up, the hand's features are more salient than the phone's, so a hand raised to the ear is easily misrecognized as a call.
With advances in algorithms and hardware, the above methods have been updated and iterated, each generation performing better than the last. However, serious problems remain in real, complex scenes. First, the detection algorithm in (3) can detect the call action, but a hand raised near the ear without a phone is still likely to be misidentified as a call, so the algorithm design is clearly imperfect. Second, drivers perform many dangerous hand actions while driving; detecting only the single behavior of making a call provides insufficient safety.
Disclosure of Invention
The invention aims to provide a method for predicting dangerous behaviors that makes the prediction result more accurate and greatly reduces false recognition of dangerous behaviors.
A first aspect of the invention is realized as follows: a YOLOV3-based driver dangerous action recognition method, comprising the following steps:
step 1, acquiring an infrared image of the driver, detecting the driver's face position with a face detection algorithm, and selecting the region to be analyzed for dangerous driver behavior according to the face position;
step 2, using a YOLOV3 model to detect whether a dangerous driver action occurs in the region to be analyzed; if not, the driver is considered to be driving normally and the process ends; if so, obtaining the YOLOV3 recognition result and the YOLOV3 detection region, and proceeding to step 3;
step 3, classifying the region detected by YOLOV3 with a lightcnn model to judge whether a dangerous driver action occurs; if not, the driver is considered to be driving normally; if so, obtaining the lightcnn recognition result; if the lightcnn recognition result is the same as the YOLOV3 recognition result and this agreement persists for a first set number of consecutive frames, an alarm is raised; if the lightcnn recognition result differs from the YOLOV3 recognition result and this disagreement persists for a second set number of consecutive frames, an alarm is raised.
Further, the method also comprises step 4: uploading the YOLOV3 recognition result and the lightcnn recognition result to the cloud.
Further, step 1 is more specifically: acquiring an infrared image of the driver, detecting the driver's face position with a face detection algorithm, fixing the upper edge of the face and extending a square downward enlarged by a set multiple, cropping that region, and normalizing its size to a set size to obtain the region to be analyzed.
Further, the YOLOV3 algorithm is trained as follows:
collecting driver pictures from real application scenarios; the collected pictures are labeled with 5 dangerous-behavior classes: normal behavior, making a call, chatting on WeChat, drinking water, and touching the ear/face; rectangular boxes are annotated on the occurrence regions of the 4 action classes (calling, WeChat chatting, drinking, and ear/face touching), recording the position and class of each action in the full image; normal behavior requires no box;
performing face detection on each sample-library picture to record the face position (face_x, face_y, face_w, face_h); fixing the upper edge of the face and expanding downward by a set multiple to form the dangerous-behavior analysis region (roi_x, roi_y, roi_w, roi_h), where roi_x = face_x - face_w × (2.5 - 1)/2, roi_y = face_y, roi_w = 2.5 × face_w, and roi_h = 2.5 × face_h; training-sample preprocessing crops (roi_x, roi_y, roi_w, roi_h) directly from the image and normalizes it to a set size; for label preprocessing, the labels learned by YOLOV3 are relative offsets of the box, so the true training labels of a target box are (label_x, label_y, label_w, label_h), where label_x = (box_x - roi_x)/roi_w, label_y = (box_y - roi_y)/roi_h, label_w = box_w/roi_w, and label_h = box_h/roi_h; the class label label_class is 0 for normal behavior, 1 for calling, 2 for WeChat chatting, 3 for drinking, and 4 for ear/face touching;
training the clustering selection of candidate boxes: the ratios of the 6 candidate boxes are clustered with the k-means algorithm, and the 6 boxes are divided between 2 network output scales, i.e. each output scale carries 3 candidate-box ratios; the backbone network is a heavily pruned VGG-MobileNet; the model is trained with a deep learning framework, and the training task is: given an input image, predict whether a dangerous driving behavior is present and, if so, which behavior it is (calling, WeChat chatting, drinking, or ear/face touching) and the precise position where it occurs.
Further, the lightcnn model is trained as follows:
sample collection: sample regions are cropped from the annotated boxes for calling, WeChat chatting, drinking, and ear/face touching, and background samples are randomly cropped from image regions without dangerous behavior; the sample size is normalized to 128 × 128;
model selection and training: the classification model uses the lightcnn network structure with 5 output classes, namely 0 for normal behavior, 1 for calling, 2 for WeChat chatting, 3 for drinking, and 4 for ear/face touching; the training objective is Softmax cross-entropy loss, whose implementation is provided by the caffe framework; model training is complete when the Softmax cross-entropy loss converges stably to a small value.
A second aspect of the invention is realized as follows: a YOLOV3-based driver dangerous action recognition system, comprising:
a recognition-region determining module, which acquires an infrared image of the driver, detects the driver's face position with a face detection algorithm, and selects the region to be analyzed for dangerous driver behavior according to the face position;
a first recognition module, which uses a YOLOV3 model to detect whether a dangerous driver action occurs in the region to be analyzed; if not, the driver is considered to be driving normally and the process ends; if so, the YOLOV3 recognition result and the YOLOV3 detection region are obtained and passed to the recognition and alarm module;
a recognition and alarm module, which classifies the region detected by YOLOV3 with a lightcnn model to judge whether a dangerous driver action occurs; if not, the driver is considered to be driving normally; if so, the lightcnn recognition result is obtained; if the lightcnn recognition result is the same as the YOLOV3 recognition result and this agreement persists for a first set number of consecutive frames, an alarm is raised; if the lightcnn recognition result differs from the YOLOV3 recognition result and this disagreement persists for a second set number of consecutive frames, an alarm is raised.
Further, the system also comprises an uploading module, which uploads the YOLOV3 recognition result and the lightcnn recognition result to the cloud.
Further, the recognition-region determining module is more specifically configured to: acquire an infrared image of the driver, detect the driver's face position with a face detection algorithm, fix the upper edge of the face and extend a square downward enlarged by a set multiple, crop that region, and normalize its size to a set size to obtain the region to be analyzed.
Further, the YOLOV3 algorithm is trained as follows:
collecting driver pictures from real application scenarios; the collected pictures are labeled with 5 dangerous-behavior classes: normal behavior, making a call, chatting on WeChat, drinking water, and touching the ear/face; rectangular boxes are annotated on the occurrence regions of the 4 action classes, recording the position and class of each action in the full image; normal behavior requires no box;
performing face detection on each sample-library picture to record the face position (face_x, face_y, face_w, face_h); fixing the upper edge of the face and expanding downward by a set multiple to form the dangerous-behavior analysis region (roi_x, roi_y, roi_w, roi_h), where roi_x = face_x - face_w × (2.5 - 1)/2, roi_y = face_y, roi_w = 2.5 × face_w, and roi_h = 2.5 × face_h; training-sample preprocessing crops (roi_x, roi_y, roi_w, roi_h) directly from the image and normalizes it to a set size; for label preprocessing, the labels learned by YOLOV3 are relative offsets of the box, so the true training labels of a target box are (label_x, label_y, label_w, label_h), where label_x = (box_x - roi_x)/roi_w, label_y = (box_y - roi_y)/roi_h, label_w = box_w/roi_w, and label_h = box_h/roi_h; the class label label_class is 0 for normal behavior, 1 for calling, 2 for WeChat chatting, 3 for drinking, and 4 for ear/face touching;
training the clustering selection of candidate boxes: the ratios of the 6 candidate boxes are clustered with the k-means algorithm, and the 6 boxes are divided between 2 network output scales, i.e. each output scale carries 3 candidate-box ratios; the backbone network is a heavily pruned VGG-MobileNet; the model is trained with a deep learning framework, and the training task is: given an input image, predict whether a dangerous driving behavior is present and, if so, which behavior it is (calling, WeChat chatting, drinking, or ear/face touching) and the precise position where it occurs.
Further, the lightcnn model is trained as follows:
sample collection: sample regions are cropped from the annotated boxes for calling, WeChat chatting, drinking, and ear/face touching, and background samples are randomly cropped from image regions without dangerous behavior; the sample size is normalized to 128 × 128;
model selection and training: the classification model uses the lightcnn network structure with 5 output classes, namely 0 for normal behavior, 1 for calling, 2 for WeChat chatting, 3 for drinking, and 4 for ear/face touching; the training objective is Softmax cross-entropy loss, whose implementation is provided by the caffe framework; model training is complete when the Softmax cross-entropy loss converges stably to a small value.
The invention has the following advantages:
the method has the advantages that a YOLOV3 detection architecture is introduced to detect dangerous actions of a driver, the detection range is wide, multiple dangerous actions such as calling, chatting, drinking, grabbing ears and touching faces of the driver can be detected at the same time, instead of detecting one action by one model, one model can be used for detecting multiple actions, and meanwhile, the design complexity of a system can be reduced, and the use of resources of the system can be reduced; and secondly, introducing a fine quadratic verification lightcnn classification network, intercepting and classifying the region detected by YOLOV3, namely enabling a lightcnn fine classification model to pay more specific attention to the fine features of the region of interest, so that the prediction result is more accurate and the error identification of the alarm dangerous behavior can be greatly reduced.
Drawings
The invention is further described below with reference to embodiments and the accompanying drawings.
FIG. 1 is a flow chart of the method of the present invention.
FIG. 2 is a schematic view of the YOLOV3 prediction flow;
FIG. 3 is a schematic view of the lightcnn verification-classification prediction flow;
FIG. 4 is a schematic flow chart of the whole process of YOLOV3 detection combined with lightcnn secondary classification for predicting dangerous driver behavior.
Detailed Description
As shown in FIG. 1, the YOLOV3-based driver dangerous action recognition method of the invention comprises:
step 1, acquiring an infrared image of the driver, detecting the driver's face position with a face detection algorithm, fixing the upper edge of the face and extending a square downward enlarged by a set multiple, cropping that region, and normalizing its size to a set size to obtain the region to be analyzed;
step 2, using a YOLOV3 model to detect whether a dangerous driver action occurs in the region to be analyzed; if not, the driver is considered to be driving normally and the process ends; if so, obtaining the YOLOV3 recognition result and the YOLOV3 detection region (the YOLOV3 detection region is the region found by detection within the region to be analyzed and contains the image of the driver's action), and proceeding to step 3;
step 3, classifying the YOLOV3 detection region with a lightcnn model to judge whether a dangerous driver action occurs; if not, the driver is considered to be driving normally; if so, obtaining the lightcnn recognition result; if the lightcnn recognition result is the same as the YOLOV3 recognition result and this agreement persists for a first set number of consecutive frames, an alarm is raised; if the lightcnn recognition result differs from the YOLOV3 recognition result and this disagreement persists for a second set number of consecutive frames, an alarm is raised;
step 4, uploading the YOLOV3 recognition result and the lightcnn recognition result to the cloud.
The YOLOV3 algorithm is trained as follows:
collecting driver pictures from real application scenarios; the collected pictures are labeled with 5 dangerous-behavior classes: normal behavior, making a call, chatting on WeChat, drinking water, and touching the ear/face; rectangular boxes are annotated on the occurrence regions of the 4 action classes, recording the position and class of each action in the full image; normal behavior requires no box;
performing face detection on each sample-library picture to record the face position (face_x, face_y, face_w, face_h); fixing the upper edge of the face and expanding downward by a set multiple to form the dangerous-behavior analysis region (roi_x, roi_y, roi_w, roi_h), where roi_x = face_x - face_w × (2.5 - 1)/2, roi_y = face_y, roi_w = 2.5 × face_w, and roi_h = 2.5 × face_h; training-sample preprocessing crops (roi_x, roi_y, roi_w, roi_h) directly from the image and normalizes it to a set size; for label preprocessing, the labels learned by YOLOV3 are relative offsets of the box, so the true training labels of a target box are (label_x, label_y, label_w, label_h), where label_x = (box_x - roi_x)/roi_w, label_y = (box_y - roi_y)/roi_h, label_w = box_w/roi_w, and label_h = box_h/roi_h; the class label label_class is 0 for normal behavior, 1 for calling, 2 for WeChat chatting, 3 for drinking, and 4 for ear/face touching;
training the clustering selection of candidate boxes: the ratios of the 6 candidate boxes are clustered with the k-means algorithm, and the 6 boxes are divided between 2 network output scales, i.e. each output scale carries 3 candidate-box ratios; the backbone network is a heavily pruned VGG-MobileNet (a classical network structure); the model is trained with a deep learning framework, and the training task is: given an input image, predict whether a dangerous driving behavior is present and, if so, which behavior it is (calling, WeChat chatting, drinking, or ear/face touching) and the precise position where it occurs.
The lightcnn model is trained as follows:
sample collection: sample regions are cropped from the annotated boxes for calling, WeChat chatting, drinking, and ear/face touching, and background samples are randomly cropped from image regions without dangerous behavior; the sample size is normalized to 128 × 128;
model selection and training: the classification model uses the lightcnn network structure with 5 output classes, namely 0 for normal behavior, 1 for calling, 2 for WeChat chatting, 3 for drinking, and 4 for ear/face touching; the training objective is Softmax cross-entropy loss, whose implementation is provided by the caffe framework; model training is complete when the Softmax cross-entropy loss converges stably to a small value.
The YOLOV3-based driver dangerous action recognition system of the invention comprises:
a recognition-region determining module, which acquires an infrared image of the driver, detects the driver's face position with a face detection algorithm, fixes the upper edge of the face and extends a square downward enlarged by a set multiple, crops that region, and normalizes its size to a set size to obtain the region to be analyzed;
a first recognition module, which uses a YOLOV3 model to detect whether a dangerous driver action occurs in the region to be analyzed; if not, the driver is considered to be driving normally and the process ends; if so, the YOLOV3 recognition result and the YOLOV3 detection region are obtained and passed to the recognition and alarm module;
a recognition and alarm module, which classifies the YOLOV3 detection region with a lightcnn model to judge whether a dangerous driver action occurs; if not, the driver is considered to be driving normally; if so, the lightcnn recognition result is obtained; if the lightcnn recognition result is the same as the YOLOV3 recognition result and this agreement persists for a first set number of consecutive frames, an alarm is raised; if the lightcnn recognition result differs from the YOLOV3 recognition result and this disagreement persists for a second set number of consecutive frames, an alarm is raised;
an uploading module, which uploads the YOLOV3 recognition result and the lightcnn recognition result to the cloud.
The YOLOV3 algorithm is trained as follows:
collecting driver pictures from real application scenarios; the collected pictures are labeled with 5 dangerous-behavior classes: normal behavior, making a call, chatting on WeChat, drinking water, and touching the ear/face; rectangular boxes are annotated on the occurrence regions of the 4 action classes, recording the position and class of each action in the full image; normal behavior requires no box;
performing face detection on each sample-library picture to record the face position (face_x, face_y, face_w, face_h); fixing the upper edge of the face and expanding downward by a set multiple to form the dangerous-behavior analysis region (roi_x, roi_y, roi_w, roi_h), where roi_x = face_x - face_w × (2.5 - 1)/2, roi_y = face_y, roi_w = 2.5 × face_w, and roi_h = 2.5 × face_h; training-sample preprocessing crops (roi_x, roi_y, roi_w, roi_h) directly from the image and normalizes it to a set size; for label preprocessing, the labels learned by YOLOV3 are relative offsets of the box, so the true training labels of a target box are (label_x, label_y, label_w, label_h), where label_x = (box_x - roi_x)/roi_w, label_y = (box_y - roi_y)/roi_h, label_w = box_w/roi_w, and label_h = box_h/roi_h; the class label label_class is 0 for normal behavior, 1 for calling, 2 for WeChat chatting, 3 for drinking, and 4 for ear/face touching;
training the clustering selection of candidate boxes: the ratios of the 6 candidate boxes are clustered with the k-means algorithm, and the 6 boxes are divided between 2 network output scales, i.e. each output scale carries 3 candidate-box ratios; the backbone network is a heavily pruned VGG-MobileNet (a classical network structure); the model is trained with a deep learning framework, and the training task is: given an input image, predict whether a dangerous driving behavior is present and, if so, which behavior it is (calling, WeChat chatting, drinking, or ear/face touching) and the precise position where it occurs.
The lightcnn model is trained as follows:
sample collection: sample regions are cropped from the annotated boxes for calling, WeChat chatting, drinking, and ear/face touching, and background samples are randomly cropped from image regions without dangerous behavior; the sample size is normalized to 128 × 128;
model selection and training: the classification model uses the lightcnn network structure with 5 output classes, namely 0 for normal behavior, 1 for calling, 2 for WeChat chatting, 3 for drinking, and 4 for ear/face touching; the training objective is Softmax cross-entropy loss, whose implementation is provided by the caffe framework; model training is complete when the Softmax cross-entropy loss converges stably to a small value.
A specific embodiment of the invention:
The two-stage driver dangerous action recognition method based on YOLOV3 detection plus classification mainly comprises three parts: first, designing and training, on the deep learning YOLOV3 framework, a detection algorithm for multiple dangerous driver behaviors such as calling, chatting on WeChat, drinking, and ear/face touching; second, organizing the annotated calling, WeChat chatting, drinking, and ear/face-touching regions, together with falsely-detected background regions, into 5 sample classes and training a fine CNN verification-classification network; third, using the YOLOV3 multi-action detection algorithm in cooperation with the fine CNN verification-classification network to comprehensively predict the driver's dangerous behaviors. These three parts are described in detail below.
First, the flow of the detection algorithm for the four dangerous driver behaviors (calling, WeChat chatting, drinking, and ear/face touching), based on the deep learning YOLOV3 framework, mainly comprises:
(1) Collecting driver pictures from real application scenarios (different lighting, time periods, devices, drivers, actions, etc.). The collected pictures are labeled with 5 dangerous-behavior classes: normal behavior, calling, WeChat chatting, drinking, and ear/face touching; rectangular boxes are annotated on the occurrence regions of the 4 action classes, recording the position and class of each action in the full image; normal behavior requires no box.
(2) Preprocessing of samples and labels. Face detection is run on each sample-library picture to record (face_x, face_y, face_w, face_h); the upper edge of the face is fixed and the box is expanded downward to form the dangerous-behavior analysis region (roi_x, roi_y, roi_w, roi_h), where roi_x = face_x - face_w × (2.5 - 1)/2, roi_y = face_y, roi_w = 2.5 × face_w, and roi_h = 2.5 × face_h. Training-sample preprocessing crops (roi_x, roi_y, roi_w, roi_h) directly from the image and normalizes its size to 256 × 256. For label preprocessing, the labels learned by YOLOV3 are relative offsets of the box, so the true training labels of a target box are (label_x, label_y, label_w, label_h), where label_x = (box_x - roi_x)/roi_w, label_y = (box_y - roi_y)/roi_h, label_w = box_w/roi_w, and label_h = box_h/roi_h; the class label label_class is 0 for normal behavior, 1 for calling, 2 for WeChat chatting, 3 for drinking, and 4 for ear/face touching.
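A minimal Python sketch of this preprocessing, assuming boxes are plain (x, y, w, h) pixel tuples (the helper names are illustrative, not from the patent):

```python
import cv2

SCALE = 2.5  # face-to-ROI expansion multiple used by the patent

def face_to_roi(face_x, face_y, face_w, face_h):
    """Fix the upper edge of the face and expand a region downward by SCALE."""
    roi_x = face_x - face_w * (SCALE - 1) / 2
    roi_y = face_y
    roi_w = SCALE * face_w
    roi_h = SCALE * face_h
    return roi_x, roi_y, roi_w, roi_h

def box_to_label(box, roi):
    """Convert an absolute action box into YOLOV3 relative-offset labels."""
    box_x, box_y, box_w, box_h = box
    roi_x, roi_y, roi_w, roi_h = roi
    return ((box_x - roi_x) / roi_w,
            (box_y - roi_y) / roi_h,
            box_w / roi_w,
            box_h / roi_h)

def crop_and_normalize(img, roi, size=256):
    """Crop the analysis region from the image and resize it to 256 x 256."""
    x, y, w, h = (int(round(v)) for v in roi)
    x, y = max(x, 0), max(y, 0)
    return cv2.resize(img[y:y + h, x:x + w], (size, size))
```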
(3) Training of the YOLOV3 algorithm. For the clustering selection of candidate boxes, the ratios of the 6 candidate boxes are clustered with the k-means algorithm (an iteratively solved cluster-analysis algorithm); the 6 boxes are divided between 2 network output scales, i.e. each output scale carries 3 candidate-box ratios. The backbone is a heavily pruned VGG-MobileNet (a classical network structure); the pruned network's parameters total about 1 M, and the pruned structure is not disclosed. Model training uses the caffe deep learning framework; the training task is: given a 256 × 256 input image, predict whether a dangerous driving behavior is present and, if so, which behavior it is (calling, WeChat chatting, drinking, or ear/face touching) and the precise position where it occurs, as shown in FIG. 2.
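The candidate-box selection can be sketched as standard k-means over the annotated box sizes with k = 6, split across the two output scales. The patent does not specify the distance metric, so plain Euclidean k-means via scikit-learn is assumed here (YOLO implementations often use an IoU-based distance instead):

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_candidate_boxes(box_whs: np.ndarray, k: int = 6):
    """box_whs: (N, 2) array of relative (label_w, label_h) pairs from annotation.
    Returns two groups of 3 anchors, one per network output scale."""
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(box_whs)
    anchors = km.cluster_centers_
    # Sort by area: the 3 smallest anchors go to the finer output scale,
    # the 3 largest to the coarser one.
    anchors = anchors[np.argsort(anchors[:, 0] * anchors[:, 1])]
    return anchors[:3], anchors[3:]
```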
Second, the fine secondary-verification classification model is designed as follows:
(1) Sample collection. Sample regions are cropped from the annotated boxes for calling, WeChat chatting, drinking, and ear/face touching, and background samples are randomly cropped from image regions without dangerous behavior. The sample size is normalized to 128 × 128; the background label is 0, calling is 1, WeChat chatting is 2, drinking is 3, and ear/face touching is 4.
(2) Model selection and training. The classification model uses a lightcnn network structure (an existing, typical classification network), simplified relative to the original; the network outputs 5 classes: background 0, calling 1, WeChat chatting 2, drinking 3, and ear/face touching 4. The purpose of the fine secondary-verification algorithm is to classify the dangerous-action regions predicted by YOLOV3 more accurately; balancing the loss of discriminative detail at small input sizes against prediction speed, the network input is finally fixed at 128 × 128. Because the verification network is a multi-class classifier, it learns directly with Softmax cross-entropy loss, a typical loss function for multi-class problems whose implementation is provided by the caffe framework. Model training is complete when the Softmax cross-entropy loss converges stably to a small value. The prediction flow of the fine secondary-verification classification model is shown in FIG. 3.
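As a rough illustration (the patent trains in caffe and does not disclose the simplified lightcnn layers, so this small PyTorch stand-in is an assumption), a 5-class CNN over 128 × 128 crops trained with Softmax cross-entropy could look like:

```python
import torch
import torch.nn as nn

class TinyVerifier(nn.Module):
    """Illustrative stand-in for the simplified lightcnn verifier (5 classes)."""
    def __init__(self, num_classes: int = 5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),   # 128 -> 64
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),  # 64 -> 32
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

model = TinyVerifier()
criterion = nn.CrossEntropyLoss()  # Softmax cross-entropy, as in the patent
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# One illustrative training step on a dummy batch of single-channel infrared crops:
images = torch.randn(8, 1, 128, 128)
labels = torch.randint(0, 5, (8,))  # 0=background, 1=call, 2=chat, 3=drink, 4=ear/face
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```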
Third, as shown in FIG. 4, the YOLOV3 multi-action detection algorithm is used in cooperation with the fine CNN verification-classification network to comprehensively predict the driver's dangerous behaviors. The prediction process comprises the following 4 steps:
(1) An infrared image of the driver is acquired, and a face detection algorithm locates the driver's face. The region to be analyzed for dangerous behavior is determined from the real-time face position: the upper edge of the face is fixed and a square enlarged 2.5 times is extended downward; the region is cropped and its size normalized to 256 × 256.
(2) The trained YOLOV3 dangerous-behavior detection model is run on the normalized image block to predict whether a dangerous behavior occurs in it. If a dangerous behavior occurs, the model predicts which behavior it is (calling, WeChat chatting, drinking, or ear/face touching) and outputs the probability of the predicted behavior; otherwise the driver is considered to be driving normally.
(3) If YOLOV3 predicts calling, WeChat chatting, drinking, or ear/face touching, the program should not raise an alarm immediately, because the hand information that YOLOV3 learns as an important feature easily causes false detections of these behaviors. The regions detected by YOLOV3 are therefore verified by the finer secondary classifier, and the simplified lightcnn outputs its classification probability.
(4) The alarm type is determined by combining the YOLOV3 and lightcnn classification results, as sketched below. If YOLOV3 detects a dangerous behavior and lightcnn's secondary verification of the detected region predicts the same behavior, and the two predictions coincide for 5 consecutive frames, the dangerous behavior is considered to have occurred. If YOLOV3 detects one dangerous behavior but lightcnn's verification predicts a different dangerous behavior, and this persists for 10 consecutive frames, the alarm reports the dangerous behavior predicted by lightcnn's verification. If YOLOV3 detects a dangerous behavior but lightcnn's verification predicts no dangerous behavior, the driver is considered to be driving normally.
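A condensed sketch of this two-stage decision, assuming per-frame class predictions are integers (0 = normal/background, 1-4 = the dangerous classes) and the 5-frame and 10-frame thresholds above; the counter bookkeeping is illustrative, not from the patent:

```python
AGREE_FRAMES = 5      # first set value: YOLOV3 and lightcnn agree
DISAGREE_FRAMES = 10  # second set value: lightcnn overrides YOLOV3

class TwoStageAlarm:
    """Fuse per-frame YOLOV3 and lightcnn class predictions into an alarm."""
    def __init__(self):
        self.agree = 0
        self.disagree = 0

    def update(self, yolo_cls: int, lightcnn_cls: int):
        """Return the dangerous class to alarm on, or None."""
        if yolo_cls == 0 or lightcnn_cls == 0:
            # YOLOV3 saw nothing, or lightcnn rejected the region: normal driving.
            self.agree = self.disagree = 0
            return None
        if yolo_cls == lightcnn_cls:
            self.agree += 1
            self.disagree = 0
            if self.agree >= AGREE_FRAMES:
                return yolo_cls          # alarm on the agreed behavior
        else:
            self.disagree += 1
            self.agree = 0
            if self.disagree >= DISAGREE_FRAMES:
                return lightcnn_cls      # alarm on lightcnn's verified behavior
        return None
```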
Although specific embodiments of the invention have been described above, it will be understood by those skilled in the art that the specific embodiments described are illustrative only and are not limiting upon the scope of the invention, and that equivalent modifications and variations can be made by those skilled in the art without departing from the spirit of the invention, which is to be limited only by the appended claims.

Claims (10)

1. A YOLOV3-based driver dangerous action recognition method, characterized in that it comprises the following steps:
step 1, acquiring an infrared image of the driver, detecting the driver's face position with a face detection algorithm, and selecting the region to be analyzed for dangerous driver behavior according to the face position;
step 2, using a YOLOV3 model to detect whether a dangerous driver action occurs in the region to be analyzed; if not, the driver is considered to be driving normally and the process ends; if so, obtaining the YOLOV3 recognition result and the YOLOV3 detection region, and proceeding to step 3;
step 3, classifying the YOLOV3 detection region with a lightcnn model to judge whether a dangerous driver action occurs; if not, the driver is considered to be driving normally; if so, obtaining the lightcnn recognition result; if the lightcnn recognition result is the same as the YOLOV3 recognition result and this agreement persists for a first set number of consecutive frames, raising an alarm; if the lightcnn recognition result differs from the YOLOV3 recognition result and this disagreement persists for a second set number of consecutive frames, raising an alarm.
2. The YOLOV3-based driver dangerous action recognition method as claimed in claim 1, wherein the method further comprises step 4: uploading the YOLOV3 recognition result and the lightcnn recognition result to the cloud.
3. The YOLOV3-based driver dangerous action recognition method as claimed in claim 1, wherein step 1 is more specifically: acquiring an infrared image of the driver, detecting the driver's face position with a face detection algorithm, fixing the upper edge of the face and extending a square downward enlarged by a set multiple, cropping that region, and normalizing its size to a set size to obtain the region to be analyzed.
4. The YOLOV3-based driver dangerous action recognition method as claimed in claim 1, wherein the YOLOV3 algorithm is trained as follows:
collecting driver pictures from real application scenarios; the collected pictures are labeled with 5 dangerous-behavior classes: normal behavior, making a call, chatting on WeChat, drinking water, and touching the ear/face; rectangular boxes are annotated on the occurrence regions of the 4 action classes, recording the position and class of each action in the full image; normal behavior requires no box;
performing face detection on each sample-library picture to record the face position (face_x, face_y, face_w, face_h); fixing the upper edge of the face and expanding downward by a set multiple to form the dangerous-behavior analysis region (roi_x, roi_y, roi_w, roi_h), where roi_x = face_x - face_w × (2.5 - 1)/2, roi_y = face_y, roi_w = 2.5 × face_w, and roi_h = 2.5 × face_h; training-sample preprocessing crops (roi_x, roi_y, roi_w, roi_h) directly from the image and normalizes it to a set size; for label preprocessing, the labels learned by YOLOV3 are relative offsets of the box, so the true training labels of a target box are (label_x, label_y, label_w, label_h), where label_x = (box_x - roi_x)/roi_w, label_y = (box_y - roi_y)/roi_h, label_w = box_w/roi_w, and label_h = box_h/roi_h; the class label label_class is 0 for normal behavior, 1 for calling, 2 for WeChat chatting, 3 for drinking, and 4 for ear/face touching;
training the clustering selection of candidate boxes: the ratios of the 6 candidate boxes are clustered with the k-means algorithm, and the 6 boxes are divided between 2 network output scales, i.e. each output scale carries 3 candidate-box ratios; the backbone network is a heavily pruned VGG-MobileNet; the model is trained with a deep learning framework, and the training task is: given an input image, predict whether a dangerous driving behavior is present and, if so, which behavior it is (calling, WeChat chatting, drinking, or ear/face touching) and the precise position where it occurs.
5. The YOLOV3-based driver dangerous action recognition method as claimed in claim 1, wherein the lightcnn model is trained as follows:
sample collection: sample regions are cropped from the annotated boxes for calling, WeChat chatting, drinking, and ear/face touching, and background samples are randomly cropped from image regions without dangerous behavior; the sample size is normalized to 128 × 128;
model selection and training: the classification model uses the lightcnn network structure with 5 output classes, namely 0 for normal behavior, 1 for calling, 2 for WeChat chatting, 3 for drinking, and 4 for ear/face touching; the training objective is Softmax cross-entropy loss, whose implementation is provided by the caffe framework; model training is complete when the Softmax cross-entropy loss converges stably to a small value.
6. A YOLOV3-based driver dangerous action recognition system, characterized in that it comprises:
a recognition-region determining module, which acquires an infrared image of the driver, detects the driver's face position with a face detection algorithm, and selects the region to be analyzed for dangerous driver behavior according to the face position;
a first recognition module, which uses a YOLOV3 model to detect whether a dangerous driver action occurs in the region to be analyzed; if not, the driver is considered to be driving normally and the process ends; if so, the YOLOV3 recognition result and the YOLOV3 detection region are obtained and passed to the recognition and alarm module;
a recognition and alarm module, which classifies the region detected by YOLOV3 with a lightcnn model to judge whether a dangerous driver action occurs; if not, the driver is considered to be driving normally; if so, the lightcnn recognition result is obtained; if the lightcnn recognition result is the same as the YOLOV3 recognition result and this agreement persists for a first set number of consecutive frames, an alarm is raised; if the lightcnn recognition result differs from the YOLOV3 recognition result and this disagreement persists for a second set number of consecutive frames, an alarm is raised.
7. The YOLOV3-based driver dangerous action recognition system of claim 6, wherein the system further comprises an uploading module, which uploads the YOLOV3 recognition result and the lightcnn recognition result to the cloud.
8. The YOLOV3-based driver dangerous action recognition system of claim 6, wherein the recognition-region determining module is more specifically configured to: acquire an infrared image of the driver, detect the driver's face position with a face detection algorithm, fix the upper edge of the face and extend a square downward enlarged by a set multiple, crop that region, and normalize its size to a set size to obtain the region to be analyzed.
9. The YOLOV3-based driver dangerous action recognition system of claim 6, wherein the YOLOV3 algorithm is trained as follows:
collecting driver pictures from real application scenarios; the collected pictures are labeled with 5 dangerous-behavior classes: normal behavior, making a call, chatting on WeChat, drinking water, and touching the ear/face; rectangular boxes are annotated on the occurrence regions of the 4 action classes, recording the position and class of each action in the full image; normal behavior requires no box;
performing face detection on each sample-library picture to record the face position (face_x, face_y, face_w, face_h); fixing the upper edge of the face and expanding downward by a set multiple to form the dangerous-behavior analysis region (roi_x, roi_y, roi_w, roi_h), where roi_x = face_x - face_w × (2.5 - 1)/2, roi_y = face_y, roi_w = 2.5 × face_w, and roi_h = 2.5 × face_h; training-sample preprocessing crops (roi_x, roi_y, roi_w, roi_h) directly from the image and normalizes it to a set size; for label preprocessing, the labels learned by YOLOV3 are relative offsets of the box, so the true training labels of a target box are (label_x, label_y, label_w, label_h), where label_x = (box_x - roi_x)/roi_w, label_y = (box_y - roi_y)/roi_h, label_w = box_w/roi_w, and label_h = box_h/roi_h; the class label label_class is 0 for normal behavior, 1 for calling, 2 for WeChat chatting, 3 for drinking, and 4 for ear/face touching;
training the clustering selection of candidate boxes: the ratios of the 6 candidate boxes are clustered with the k-means algorithm, and the 6 boxes are divided between 2 network output scales, i.e. each output scale carries 3 candidate-box ratios; the backbone network is a heavily pruned VGG-MobileNet; the model is trained with a deep learning framework, and the training task is: given an input image, predict whether a dangerous driving behavior is present and, if so, which behavior it is (calling, WeChat chatting, drinking, or ear/face touching) and the precise position where it occurs.
10. The YOLOV3-based driver dangerous action recognition system of claim 6, wherein the lightcnn model is trained as follows:
sample collection: sample regions are cropped from the annotated boxes for calling, WeChat chatting, drinking, and ear/face touching, and background samples are randomly cropped from image regions without dangerous behavior; the sample size is normalized to 128 × 128;
model selection and training: the classification model uses the lightcnn network structure with 5 output classes, namely 0 for normal behavior, 1 for calling, 2 for WeChat chatting, 3 for drinking, and 4 for ear/face touching; the training objective is Softmax cross-entropy loss, whose implementation is provided by the caffe framework; model training is complete when the Softmax cross-entropy loss converges stably to a small value.
Priority Applications (1)

CN201911220885.5A (priority and filing date 2019-12-03): Driver dangerous action identification method and system based on YOLOV3; status: Active, granted as CN110969130B

Publications (2)

CN110969130A: published 2020-04-07
CN110969130B: granted, published 2023-04-18

Family ID: 70032784
Country: China (CN)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6227862B1 (en) * 1999-02-12 2001-05-08 Advanced Drivers Education Products And Training, Inc. Driver training system
CN107766876A (en) * 2017-09-19 2018-03-06 平安科技(深圳)有限公司 Driving model training method, driver's recognition methods, device, equipment and medium
CN109460699A (en) * 2018-09-03 2019-03-12 厦门瑞为信息技术有限公司 A kind of pilot harness's wearing recognition methods based on deep learning
CN109829386A (en) * 2019-01-04 2019-05-31 清华大学 Intelligent vehicle based on Multi-source Information Fusion can traffic areas detection method

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3961498A4 (en) * 2020-06-29 2023-05-24 Beijing Baidu Netcom Science And Technology Co., Ltd. Dangerous driving behavior recognition method and apparatus, and electronic device and storage medium
CN111814637A (en) * 2020-06-29 2020-10-23 北京百度网讯科技有限公司 Dangerous driving behavior recognition method and device, electronic equipment and storage medium
CN111832450A (en) * 2020-06-30 2020-10-27 成都睿沿科技有限公司 Knife holding detection method based on image recognition
CN111832450B (en) * 2020-06-30 2023-11-28 成都睿沿科技有限公司 Knife holding detection method based on image recognition
CN114093019A (en) * 2020-07-29 2022-02-25 顺丰科技有限公司 Training method and device for throwing motion detection model and computer equipment
CN112699750A (en) * 2020-12-22 2021-04-23 南方电网深圳数字电网研究院有限公司 Safety monitoring method and system for intelligent gas station based on edge calculation and AI (Artificial Intelligence)
CN113033374A (en) * 2021-03-22 2021-06-25 开放智能机器(上海)有限公司 Artificial intelligence dangerous behavior identification method and device, electronic equipment and storage medium
WO2023273060A1 (en) * 2021-06-30 2023-01-05 上海商汤临港智能科技有限公司 Dangerous action identifying method and apparatus, electronic device, and storage medium
CN113505709A (en) * 2021-07-15 2021-10-15 开放智能机器(上海)有限公司 Method and system for monitoring dangerous behaviors of human body in real time
CN114266934A (en) * 2021-12-10 2022-04-01 上海应用技术大学 Dangerous action detection method based on cloud storage data
CN114724246B (en) * 2022-04-11 2024-01-30 中国人民解放军东部战区总医院 Dangerous behavior identification method and device
CN114724246A (en) * 2022-04-11 2022-07-08 中国人民解放军东部战区总医院 Dangerous behavior identification method and device
CN115546875A (en) * 2022-11-07 2022-12-30 科大讯飞股份有限公司 Multitask-based cabin internal behavior detection method, device and equipment
TWI831524B (en) * 2022-12-15 2024-02-01 國立勤益科技大學 System and method for abnormal driving behavior detection based on spatial-temporal relationship between objects
CN117671592A (en) * 2023-12-08 2024-03-08 中化现代农业有限公司 Dangerous behavior detection method, dangerous behavior detection device, electronic equipment and storage medium
CN117671592B (en) * 2023-12-08 2024-09-06 中化现代农业有限公司 Dangerous behavior detection method, dangerous behavior detection device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN110969130B (en) 2023-04-18

Similar Documents

Publication Publication Date Title
CN110969130B (en) Driver dangerous action identification method and system based on YOLOV3
CN110796046B (en) Intelligent steel slag detection method and system based on convolutional neural network
CN102332096B (en) Video caption text extraction and identification method
JP5127392B2 (en) Classification boundary determination method and classification boundary determination apparatus
CN101221623B (en) Object type on-line training and recognizing method and system thereof
US11335086B2 (en) Methods and electronic devices for automated waste management
CN113642474A (en) Hazardous area personnel monitoring method based on YOLOV5
CN110135327B (en) Driver behavior identification method based on multi-region feature learning model
Saleh et al. Traffic signs recognition and distance estimation using a monocular camera
CN102360434B (en) Target classification method of vehicle and pedestrian in intelligent traffic monitoring
TW200529093A (en) Face image detection method, face image detection system, and face image detection program
CN112733815B (en) Traffic light identification method based on RGB outdoor road scene image
CN115131590B (en) Training method of target detection model, target detection method and related equipment
WO2023241102A1 (en) Label recognition method and apparatus, and electronic device and storage medium
Satti et al. R‐ICTS: Recognize the Indian cautionary traffic signs in real‐time using an optimized adaptive boosting cascade classifier and a convolutional neural network
CN116152576B (en) Image processing method, device, equipment and storage medium
CN117372956A (en) Method and device for detecting state of substation screen cabinet equipment
Ciuntu et al. Real-time traffic sign detection and classification using machine learning and optical character recognition
CN117037081A (en) Traffic monitoring method, device, equipment and medium based on machine learning
CN104573663B (en) A kind of English scene character recognition method based on distinctive stroke storehouse
CN116258908A (en) Ground disaster prediction evaluation classification method based on unmanned aerial vehicle remote sensing image data
Rao et al. Convolutional Neural Network Model for Traffic Sign Recognition
CN113920327A (en) Insulator target identification method based on improved Faster Rcnn
Sharma et al. Smart vehicle accident detection system using faster r-cnn
CN112232124A (en) Crowd situation analysis method, video processing device and device with storage function

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant