WO2020253349A1 - Driving behavior early warning method and apparatus based on image recognition, and computer device - Google Patents
- Publication number
- WO2020253349A1 (PCT/CN2020/085576)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- expression
- driver
- layer
- judgment result
- image
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/59—Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
- G06V20/597—Recognising the driver's state or behaviour, e.g. attention or drowsiness
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/174—Facial expression recognition
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
Definitions
- This application belongs to the field of artificial intelligence technology, and in particular relates to a driving behavior early warning method, device, computer equipment, and computer-readable storage medium based on image recognition.
- In the prior art, the dangerous driving behavior of the driver is mainly monitored through hardware equipment installed on the car (such as an OBD, i.e., On-Board Diagnostics system), which issues a reminder for a driving violation, for example a voice speed-limit reminder when the current speed exceeds the speed limit.
- the embodiments of the present application provide a driving behavior early warning method, device, computer equipment, and computer readable storage medium based on image recognition to solve the problem of low warning accuracy in the existing driving behavior early warning method.
- the first aspect of the embodiments of the present application provides a driving behavior early warning method based on image recognition, which includes: acquiring a driver's face image and a driver's body motion image while the vehicle is running; acquiring corresponding expression information according to the driver's face image; acquiring the driver's driving action information according to the driver's body motion image; performing matching judgment between the expression information and a dangerous expression set, and obtaining a corresponding expression judgment result; performing matching judgment between the driving action information and a dangerous driving action set, and obtaining a corresponding driving action judgment result; and issuing an alarm prompt if the expression judgment result meets a preset condition and/or the driving action judgment result meets a preset condition.
- a second aspect of the embodiments of the present application provides a driving behavior early warning device based on image recognition, including: a first acquisition module, configured to acquire a driver's face image and a driver's body motion image while the vehicle is running; a second acquisition module, configured to acquire corresponding expression information according to the driver's face image; a third acquisition module, configured to acquire the driver's driving action information according to the driver's body motion image; a first matching judgment module, configured to perform matching judgment between the expression information and a dangerous expression set and obtain a corresponding expression judgment result; a second matching judgment module, configured to perform matching judgment between the driving action information and a dangerous driving action set and obtain a corresponding driving action judgment result; and an alarm prompt module, configured to issue an alarm prompt if the expression judgment result meets a preset condition and/or the driving action judgment result meets a preset condition.
- the third aspect of the embodiments of the present application provides a computer device, including: a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein when the processor executes the computer program, the steps of the above driving behavior early warning method based on image recognition are implemented.
- the fourth aspect of the embodiments of the present application provides a computer-readable storage medium storing a computer program, and when the computer program is executed by a processor, the following steps are implemented: acquiring a driver's face image and a driver's body motion image while the vehicle is running; acquiring corresponding expression information according to the driver's face image; acquiring the driver's driving action information according to the driver's body motion image; performing matching judgment between the expression information and a dangerous expression set, and obtaining a corresponding expression judgment result; performing matching judgment between the driving action information and a dangerous driving action set, and obtaining a corresponding driving action judgment result; and issuing an alarm prompt if the expression judgment result meets a preset condition and/or the driving action judgment result meets a preset condition.
- This application can measure the driver's hazard information during driving from multiple dimensions and give warnings in advance, improving the accuracy and comprehensiveness of the warning and solving the problem that a single warning reference dimension leads to low warning accuracy.
- FIG. 1 is a schematic diagram of an application environment of a driving behavior early warning method based on image recognition in an embodiment of the present application
- FIG. 2 is a schematic diagram of the implementation process of the driving behavior early warning method based on image recognition provided in Embodiment 1 of the present application;
- FIG. 3 is a schematic diagram of a driving behavior early warning device based on image recognition provided in Embodiment 2 of the present application;
- FIG. 4 is a schematic diagram of the first acquisition module in the image recognition-based driving behavior early warning device provided by an embodiment of the present application;
- FIG. 5 is another schematic diagram of a driving behavior early warning device based on image recognition provided by an embodiment of the present application.
- FIG. 6 is a schematic diagram of the second acquisition module in the driving behavior early warning device based on image recognition provided by an embodiment of the present application;
- FIG. 7 is a schematic diagram of the first matching judgment module in the driving behavior early warning device based on image recognition provided by an embodiment of the present application;
- FIG. 8 is a schematic diagram of a computer device provided in Embodiment 3 of the present application.
- the driving behavior early warning method based on image recognition can be applied in the application environment as shown in FIG. 1, wherein the client communicates with the server through the network.
- the client can be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices.
- the server can be implemented as an independent server or a server cluster composed of multiple servers.
- FIG. 2 shows a schematic diagram of the implementation process of the image recognition-based driving behavior early warning method provided in Embodiment 1 of the present application.
- the driving behavior early warning method based on image recognition specifically includes the following steps 101 to 106, which are detailed as follows:
- Step 101 Obtain the driver's face image and the driver's body motion image during the running of the vehicle.
- one camera or a group of cameras installed in the car can be used to photograph the driver's face and body parts (such as the hands) while the vehicle is running, to obtain photo or video information including the driver's face. If photos are taken, the driver's face and limbs can be captured at a set time interval, and the captured photos are adjusted accordingly (such as by cropping) to obtain the driver's face images and body motion images.
- acquiring the driver's body motion image while the vehicle is running may specifically include step 201, step 202, and step 203, which are detailed as follows:
- Step 201 Record the driver's video in real time while the vehicle is running.
- Step 202 Determine each frame drawing time point at equal intervals from the start time point of the video clip in the video information according to a preset time interval.
- Step 203 Extract the video frame corresponding to each frame drawing time point in the video clip to obtain each body motion image.
- a terminal device (such as a vehicle-mounted terminal) can be used to record the driver's video in real time while the vehicle is running.
- when the server extracts the body motion images from the video clip in the video information, it may specifically extract the video frames in the video clip at equal intervals.
- the server may determine each frame drawing time point at equal intervals from the start time point of the video clip according to a preset time interval.
- the preset time interval can be set according to actual needs, for example, 100 milliseconds, that is, one video frame is extracted every 100 milliseconds. For example, if the total duration of the video clip is 2 minutes, i.e., 120 s, and its start time point is 0, the server determines the frame extraction time points on the video clip as 100 ms, 200 ms, 300 ms, 400 ms, and so on, with the last frame extraction time point at the 120 s position. Therefore, a total of 1200 frame extraction time points are determined in the 2-minute video clip.
- in step 203, it can be understood that after each frame extraction time point is determined, it is equivalent to determining which video frames should be extracted from the video clip as the body motion images.
- the server may extract the video frames corresponding to each of the frame extraction time points in the video clip to obtain each body motion image. Continuing the above example, the server extracts the video frames at the time points of 100 ms, 200 ms, 300 ms, 400 ms, ..., 120 s on the video clip, and obtains a total of 1200 video frames as the body motion images, that is, a total of 1200 body motion images.
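The frame extraction described in steps 201 to 203 can be sketched as follows (a minimal illustration; the function name and millisecond units are assumptions, not part of the original disclosure):

```python
def frame_extraction_points(duration_ms: int, interval_ms: int = 100) -> list:
    """Determine equally spaced frame extraction time points (in milliseconds)
    from the start time point (0) of a video clip."""
    return list(range(interval_ms, duration_ms + 1, interval_ms))

# A 2-minute (120 s) clip sampled at a 100 ms interval yields 1200 time points,
# matching the worked example above.
points = frame_extraction_points(duration_ms=120_000, interval_ms=100)
```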
- Step 102 Obtain corresponding facial expression information according to the driver's face image.
- step 102 specifically includes step 301 and step 302, which are detailed as follows:
- Step 301 Combine the action morphological features of various parts of the face on the driver's face image to obtain an action morphological feature set.
- the action morphological features of each part of the face on the driver's face image include action morphological features corresponding to salient expressions and micro-expressions. Since these features are all reflected in the parts of the face on the face image, the features corresponding to salient expressions and those corresponding to micro-expressions are mixed together; therefore, the action morphological feature set is obtained first, which facilitates identifying them separately in the next step.
- Step 302 Perform matching analysis on the action morphological feature set with the salient expression database and the micro-expression database to obtain corresponding salient expression information and micro-expression information, and use the salient expression information and the micro-expression information as the expression information.
- the micro-expression database includes, but is not limited to, the following micro-expression action morphological features.
- the action morphological features of full angry expression are as follows:
- the lower eyelid is tight.
- the shape of the upper eyelid matches the tight lower eyelid, which is called glare.
- the combination of action morphological features of a suppressed angry expression is as follows:
- the eyebrows are flattened as a whole and still remain twisted.
- the eyebrows are raised, but the degree is slightly reduced; the frown muscles cause slight longitudinal wrinkles.
- the degree of eye opening increases, but it is not exaggerated.
- the upper eyelid lift is not as obvious as in an expression of fear, but the exposed area at the upper edge of the iris is larger than that of a normal relaxed face.
- the raised and twisted eyebrows indicate inner pressure, but not disgust or anger.
- the combination of slightly worried action morphological features is as follows: at this time, the eyebrows are not greatly raised; only the uplift of the eyebrows and their straight, twisted shape can be observed. The eyelids are natural as a whole, but the upper eyelids are still slightly higher than normal, so the exposed iris area is larger. This combination of eyebrows and eyes is the morphological feature of the micro-expression movement of fear.
- Step 103 Acquire driving motion information of the driver according to the driver's body motion image.
- the joint information and angle information of the limbs in the limb motion image are analyzed, and the joint information and angle information are queried in a pre-stored mapping table to match the driving motion of the driver.
- the mapping table includes a one-to-one correspondence between the joint information and angle information and the driving actions.
- a pre-trained driving motion recognition model can also be used to recognize the input body motion image and output the driver's driving motion.
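The mapping-table lookup described above can be sketched as follows; the table entries and key format are hypothetical placeholders, since the disclosure does not specify a concrete joint or angle encoding:

```python
# Hypothetical pre-stored mapping table: (joint information, angle information)
# pairs correspond one-to-one to driving actions.
ACTION_TABLE = {
    ("right_hand_near_ear", "elbow_angle_45"): "making a phone call",
    ("both_hands_off_wheel", "any"): "not holding the steering wheel with both hands",
}

def match_driving_action(joint_info: str, angle_info: str) -> str:
    """Query the mapping table with the analyzed joint and angle information;
    return the matched driving action, or 'unknown' if no entry matches."""
    return ACTION_TABLE.get((joint_info, angle_info), "unknown")
```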
- Step 104 Perform matching judgment on the expression information and the set of dangerous expressions, and obtain a corresponding expression judgment result.
- the dangerous expression set includes a dangerous salient expression set and a dangerous micro expression set
- step 104 specifically includes step 401, step 402, and step 403, as detailed below:
- Step 401 Perform matching judgment on the salient expression information and the dangerous salient expression set, and obtain a corresponding salient expression judgment result.
- the set of dangerous salient expressions includes, but is not limited to, yawning, blinking, and closing the eyes. If the salient expression information is successfully matched with the dangerous salient expression set, the corresponding salient expression judgment result is that a dangerous salient expression exists.
- Step 402 Perform matching judgment on the micro-expression information and the dangerous micro-expression set, and obtain a corresponding micro-expression judgment result.
- the set of dangerous micro-expression includes but is not limited to pain, anger, and fear.
- if the micro-expression information includes at least one of pain, anger, and fear, it is determined that the corresponding micro-expression judgment result is that a dangerous micro-expression exists.
- Step 403 Synthesize the salient expression judgment result and the micro expression judgment result to obtain the expression judgment result.
- the expression judgment result includes the presence of a dangerously significant expression and a dangerous micro-expression, no dangerously significant expression and no dangerous micro-expression, a dangerously significant expression and no dangerous micro-expression, and a dangerous micro-expression and no dangerously significant expression.
- Step 105 Perform matching judgment on the driving action information and the dangerous driving action set, and obtain the corresponding driving action judgment result.
- the dangerous driving action set includes, but is not limited to, the action of making a phone call, the action of not holding the steering wheel with both hands, and the action of eating. If the driving action information contains any element in the dangerous driving action set, it is determined that the corresponding driving action judgment result is that a dangerous driving action exists. If the driving action information does not contain any element in the dangerous driving action set, it is determined that the corresponding driving action judgment result is that no dangerous driving action exists.
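The matching judgments of steps 104 to 106 can be sketched as simple set intersections (a minimal illustration; the set contents follow the examples above, and the function name is an assumption):

```python
DANGEROUS_SALIENT = {"yawning", "closing eyes"}
DANGEROUS_MICRO = {"pain", "anger", "fear"}
DANGEROUS_ACTIONS = {"making a phone call",
                     "not holding the steering wheel with both hands",
                     "eating"}

def should_alarm(salient: set, micro: set, actions: set) -> bool:
    """Issue an alarm if the expression judgment result and/or the driving
    action judgment result indicates danger."""
    expression_danger = bool(salient & DANGEROUS_SALIENT) or bool(micro & DANGEROUS_MICRO)
    action_danger = bool(actions & DANGEROUS_ACTIONS)
    return expression_danger or action_danger
```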
- Step 106 If the expression judgment result meets the preset condition and/or the driving action judgment result meets the preset condition, an alarm is issued.
- in step 106, for example, if one of the preset warning conditions is that the driver is angry, then when it is determined that the driver's expression is full anger, warning information is issued to the driver. If one of the preset warning conditions is that the driver takes both hands off the steering wheel while driving, warning information is issued to the driver. If one of the preset warning conditions is that the driver's expression is uneasy and he is making a phone call, warning information is issued to the driver. Preferably, the warning information may be issued to the driver by broadcasting a warning voice message.
- after step 101, the following steps are further included:
- Step 501 Obtain all face images from the identity information database, and denote the total number of face images as N, where N is an integer greater than 1.
- the identity information database can be updated. It is understandable that when the driver's face image is not in the identity information database, the update will store the driver's face image in the identity information database, ensuring the comprehensiveness of the identity information database.
- Step 502 Input the N face images and the driver's face image, N+1 face images in total, into a preset neural network, and output from the hidden layer of the preset neural network the feature vectors corresponding to the N+1 face images.
- a human face image is an image including a human face, where the human face may also refer to the face of an individual on the electronic ID card.
- the hidden layer in this embodiment is any layer except the last layer in the preset neural network.
- the preset neural network is trained based on preset images with faces and corresponding group numbers of people, and images with the same group number are all face images of the same person.
- each face image input to the preset neural network will output a vector, which is used as the feature vector of the corresponding face image.
- the dimension of the feature vector in this embodiment may be 512.
- the preset neural network is a deep convolutional neural network, and a general deep convolutional neural network usually has 5 or more layers; in this embodiment, in order to reduce the amount of calculation and increase the calculation speed, the preset neural network can adopt a simplified design: an appropriate number of network layers is chosen, and layers such as the normalization layer and the batch normalization layer are removed.
- the model of the preset neural network consists of six layers from input to output:
- the first layer structure includes: a first convolution layer, a first activation layer, and a first down-sampling layer;
- the second layer structure includes: a second convolution layer, a second activation layer, and a second down-sampling layer;
- the third layer structure includes: a third convolution layer, a third activation layer, and a third down-sampling layer;
- the fourth layer structure includes: a fourth-a convolution layer, a fourth-a activation layer, a fourth-b convolution layer, a fourth-b activation layer, and a fourth down-sampling layer;
- the fifth layer structure includes: a fifth convolution layer and a fifth activation layer;
- the sixth layer structure includes: a first fully connected layer and a second fully connected layer.
- the convolution kernel of each convolution layer in the preset neural network is different, and the activation function of each activation layer is also different.
- the preferred hidden layer is the penultimate layer of the preset neural network, that is, the first fully connected layer.
- the output of the first fully connected layer has a low dimensionality and the most compact facial features; therefore, the first fully connected layer is used as the preferred hidden layer.
- the fourth-a convolution layer and the fourth-a activation layer can be replaced by a first sub-convolution layer, a first sub-activation layer, a second sub-convolution layer, and a second sub-activation layer; the fifth convolution layer can be replaced by a third sub-convolution layer, a third sub-activation layer, a fourth sub-convolution layer, and a fourth sub-activation layer. Among them, the convolution kernel of each sub-convolution layer is different, and the activation function of each sub-activation layer is also different.
- a branch is added after the third downsampling layer.
- the output of the third downsampling layer and the output of the second subconvolutional layer are input into the fourth convolutional layer together.
- a branch is also added after the fourth down-sampling layer, and the output of the fourth down-sampling layer and the output of the fifth convolution layer are input into the first fully connected layer together.
- the above two branches can accelerate the convergence of the model and improve the accuracy.
- the output dimension of the second fully connected layer is the number of people in the training set.
- for example, if the training set contains 8 people and a face image belongs to the first group, the output vector of the preset neural network should be (1,0,0,0,0,0,0,0); that is, the bit in the vector corresponding to the group number is 1.
- if the input face image is not a face image of one of the above 8 people, there is no group number corresponding to the output of the preset neural network; the output vector is artificially defined and cannot express facial features. It is therefore necessary to select the output of a hidden layer, which contains facial features, as the feature vector, so that even face images of people never input to the preset neural network during training can be well recognized.
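The idea of taking a hidden-layer output as the feature vector, rather than the one-hot classification output, can be illustrated with a toy fully connected stand-in for the network (random weights, purely for shape illustration; the 512-dimension and 8-person figures follow the embodiment above):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the trained network: the first fully connected layer
# (the preferred hidden layer) maps a flattened face to a 512-dim feature
# vector; the second fully connected layer maps it to 8 outputs, one per
# person in the toy training set.
W_fc1 = rng.standard_normal((1024, 512))
W_fc2 = rng.standard_normal((512, 8))

def feature_vector(flattened_face: np.ndarray) -> np.ndarray:
    """Output of the first fully connected layer, used as the facial feature
    vector; it remains meaningful even for faces outside the training set."""
    return flattened_face @ W_fc1

face = rng.standard_normal(1024)
vec = feature_vector(face)   # 512-dim feature vector
logits = vec @ W_fc2         # classification output, only meaningful for
                             # the 8 trained identities
```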
- Step 503 Determine the distances between the feature vector of the driver's face image and the feature vectors of the N face images, respectively, to obtain N vector distances.
- in step 503, the distances between the feature vector of the driver's face image and the feature vectors of the N face images are respectively determined.
- the distance can be calculated by the Euclidean distance formula or other distance formulas, which is not limited in this embodiment.
- Step 504 When one of the N vector distances is less than a preset distance reference value, determine that the driver's face image and the face image corresponding to that vector distance are face images of the same driver, and determine the driver's identity information according to the identity information database.
- the preset distance reference value is used to distinguish the distance between face images belonging to the same driver from the distance between face images belonging to different drivers. If the distance is less than the preset distance reference value, the two face images belong to the same driver; if the distance is greater than or equal to the preset distance reference value, the two face images belong to two different drivers.
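Steps 503 and 504 can be sketched as follows (the Euclidean distance is one of the distance formulas the embodiment allows; the function and variable names are illustrative):

```python
import numpy as np

def identify(driver_vec: np.ndarray, db_vecs: np.ndarray, threshold: float):
    """Compare the driver's feature vector against the N database feature
    vectors; return the index of the matching identity when the smallest
    distance is below the preset distance reference value, else None."""
    dists = np.linalg.norm(db_vecs - driver_vec, axis=1)  # N vector distances
    best = int(np.argmin(dists))
    return best if dists[best] < threshold else None
```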
- FIG. 3 shows a schematic diagram of a driving behavior early warning device 30 based on image recognition provided in the second embodiment of the present application.
- the driving behavior early warning device 30 based on image recognition includes: a first acquisition module 31, a second acquisition module 32, a third acquisition module 33, a first matching judgment module 34, a second matching judgment module 35, and an alarm prompt module 36 .
- the specific functions of each module are as follows:
- the first acquisition module 31 is used to acquire the driver's face image and the driver's limb motion image during the running of the vehicle.
- the second acquiring module 32 is configured to acquire corresponding facial expression information according to the driver's face image.
- the third acquisition module 33 is configured to acquire the driver's driving motion information according to the driver's limb motion image.
- the first matching judgment module 34 is configured to perform matching judgment on the expression information and the dangerous expression set, and obtain a corresponding expression judgment result.
- the second matching judgment module 35 is used to perform matching judgment on the driving action information and the dangerous driving action set, and obtain the corresponding driving action judgment result.
- the warning prompt module 36 is configured to send an alarm prompt if the expression judgment result meets a preset condition and/or the driving action judgment result meets a preset condition.
- the first obtaining module 31 includes:
- the recording unit 311 is used to record the video of the driver in real time during the driving of the vehicle.
- the determining unit 312 is configured to determine each frame drawing time point at equal intervals according to a preset time interval from the starting point of the shooting time of the video.
- the frame extraction unit 313 is used to extract the video frames corresponding to each frame extraction time point in the video to obtain each driver's body motion image.
- the driving behavior early warning device 30 based on image recognition further includes:
- the fourth acquisition module 37 is configured to acquire all face images from the identity information database, and denote the total number of face images as N, where N is an integer greater than 1.
- the output module 38 is configured to input the N face images and the driver's face image, N+1 face images in total, into the preset neural network, and output from the hidden layer of the preset neural network the feature vectors corresponding to the N+1 face images.
- the first determining module 39 is configured to determine the distances between the feature vector of the driver's face image and the feature vectors of the N face images respectively to obtain N vector distances.
- the second determining module 310 is configured to, when one of the N vector distances is less than the preset distance reference value, determine that the driver's face image and the face image corresponding to that vector distance are face images of the same driver, and determine the driver's identity information according to the identity information database.
- the second acquiring module 32 includes:
- the combining unit 321 is configured to combine the action morphological features of various parts of the face on the driver's face image to obtain an action morphological feature set.
- the analysis unit 322 is configured to perform matching analysis on the action morphological feature set with the salient expression database and the micro-expression database to obtain corresponding salient expression information and micro-expression information, and use the salient expression information and the micro-expression information as the expression information.
- the first matching judgment module 34 includes:
- the first matching judgment unit 341 is configured to perform matching judgment between the salient expression information and the dangerous salient expression set, and obtain a corresponding salient expression judgment result.
- the second matching judgment unit 342 is configured to perform matching judgment on the micro-expression information and the dangerous micro-expression set, and obtain a corresponding micro-expression judgment result.
- the comprehensive judgment unit 343 is configured to combine the salient expression judgment result and the micro-expression judgment result to obtain the expression judgment result.
- the various modules in the above-mentioned image recognition-based driving behavior early warning device can be implemented in whole or in part by software, hardware, and combinations thereof.
- the foregoing modules may be embedded in the form of hardware or independent of the processor in the computer device, or may be stored in the memory of the computer device in the form of software, so that the processor can call and execute the operations corresponding to the foregoing modules.
- a computer device is provided.
- the computer device may be a client, and its internal structure diagram may be as shown in FIG. 8.
- the computer equipment includes a processor, a memory, a network interface and a database connected through a system bus. Among them, the processor of the computer device is used to provide calculation and control capabilities.
- the memory of the computer device includes a non-volatile storage medium and an internal memory.
- the non-volatile storage medium stores an operating system, a computer program, and a database.
- the internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage medium.
- the database of the computer equipment is used to store the data involved in the driving behavior early warning method based on image recognition.
- the network interface of the computer device is used to communicate with an external terminal through a network connection.
- the computer program is executed by the processor to realize a driving behavior early warning method based on image recognition.
- a computer device including a memory, a processor, and a computer program stored in the memory and running on the processor.
- when the processor executes the computer program, the steps of the driving behavior early warning method based on image recognition in the above embodiments are implemented, for example, step 101 to step 106 shown in FIG. 2.
- when the processor executes the computer program, the functions of the modules/units of the driving behavior early warning device based on image recognition in the above embodiments are realized, for example, the functions of modules 31 to 36 shown in FIG. 3. To avoid repetition, details are not described here again.
- a computer-readable storage medium may be non-volatile or volatile, and has a computer program stored thereon.
- when the computer program is executed by a processor, the steps of the driving behavior early warning method based on image recognition in the foregoing embodiments are implemented, for example, step 101 to step 106 shown in FIG. 2.
- when the computer program is executed by the processor, the functions of the modules/units of the driving behavior early warning device based on image recognition in the above embodiments are realized, for example, the functions of modules 31 to 36 shown in FIG. 3. To avoid repetition, details are not described here again.
- Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
- Volatile memory may include random access memory (RAM) or external cache memory.
- RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Psychiatry (AREA)
- Social Psychology (AREA)
- Image Processing (AREA)
- Image Analysis (AREA)
Abstract
Description
Claims (21)
- An image-recognition-based driving behavior early warning method, comprising: acquiring a driver face image and a driver body-movement image while the vehicle is in motion; obtaining corresponding expression information from the driver face image; obtaining the driver's driving action information from the driver body-movement image; matching the expression information against a dangerous expression set to obtain a corresponding expression judgment result; matching the driving action information against a dangerous driving action set to obtain a corresponding driving action judgment result; and issuing a warning prompt if the expression judgment result satisfies a preset condition and/or the driving action judgment result satisfies a preset condition.
- The image-recognition-based driving behavior early warning method of claim 1, wherein, in acquiring the driver face image and the driver body-movement image while the vehicle is in motion, acquiring the driver body-movement image comprises: recording a video of the driver in real time while the vehicle is in motion; determining frame-extraction time points spaced at equal preset time intervals from the start of the video; and extracting from the video the video frames corresponding to the frame-extraction time points to obtain the driver body-movement images.
- The image-recognition-based driving behavior early warning method of claim 1, wherein, after acquiring the driver face image and the driver body-movement image while the vehicle is in motion, the method further comprises: obtaining all face images from an identity information database, the face images numbering N, N being an integer greater than 1; inputting the N face images and the driver face image, N+1 face images in total, into a preset neural network, and outputting from a hidden layer of the preset neural network the feature vectors of the N+1 face images; determining the distances between the feature vector of the driver face image and the feature vectors of the N face images, respectively, to obtain N vector distances; and, when one of the N vector distances is smaller than a preset distance reference value, determining that the driver face image and the face image corresponding to that vector distance are face images of the same driver, and determining the driver's identity information from the identity information database.
- The image-recognition-based driving behavior early warning method of claim 3, wherein the model of the preset neural network consists of six layers from input to output: the first layer comprises a first convolutional layer, a first activation layer, and a first down-sampling layer; the second layer comprises a second convolutional layer, a second activation layer, and a second down-sampling layer; the third layer comprises a third convolutional layer, a third activation layer, and a third down-sampling layer; the fourth layer comprises a fourth convolutional layer, a fourth activation layer, another fourth convolutional layer, another fourth activation layer, and a fourth down-sampling layer; the fifth layer comprises a fifth convolutional layer and a fifth activation layer; and the sixth layer comprises a first fully connected layer and a second fully connected layer.
- The image-recognition-based driving behavior early warning method of claim 4, wherein the hidden layer of the preset neural network is the first fully connected layer, and the feature vectors of the N+1 face images are output through the first fully connected layer.
- The image-recognition-based driving behavior early warning method of claim 1, wherein obtaining the corresponding expression information from the driver face image comprises: combining the action-form features of the respective facial parts in the driver face image to obtain an action-form feature set; and matching the action-form feature set against a salient-expression database and a micro-expression database to obtain corresponding salient-expression information and micro-expression information, the salient-expression information and the micro-expression information serving as the expression information.
- The image-recognition-based driving behavior early warning method of claim 6, wherein the dangerous expression set comprises a dangerous salient-expression set and a dangerous micro-expression set, and matching the expression information against the dangerous expression set to obtain the corresponding expression judgment result comprises: matching the salient-expression information against the dangerous salient-expression set to obtain a corresponding salient-expression judgment result; matching the micro-expression information against the dangerous micro-expression set to obtain a corresponding micro-expression judgment result; and combining the salient-expression judgment result and the micro-expression judgment result to obtain the expression judgment result.
- A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the following steps: acquiring a driver face image and a driver body-movement image while the vehicle is in motion; obtaining corresponding expression information from the driver face image; obtaining the driver's driving action information from the driver body-movement image; matching the expression information against a dangerous expression set to obtain a corresponding expression judgment result; matching the driving action information against a dangerous driving action set to obtain a corresponding driving action judgment result; and issuing a warning prompt if the expression judgment result satisfies a preset condition and/or the driving action judgment result satisfies a preset condition.
- The computer device of claim 8, wherein, in acquiring the driver face image and the driver body-movement image while the vehicle is in motion, acquiring the driver body-movement image comprises: recording a video of the driver in real time while the vehicle is in motion; determining frame-extraction time points spaced at equal preset time intervals from the start of the video; and extracting from the video the video frames corresponding to the frame-extraction time points to obtain the driver body-movement images.
- The computer device of claim 8, wherein, after acquiring the driver face image and the driver body-movement image while the vehicle is in motion, the steps further comprise: obtaining all face images from an identity information database, the face images numbering N, N being an integer greater than 1; inputting the N face images and the driver face image, N+1 face images in total, into a preset neural network, and outputting from a hidden layer of the preset neural network the feature vectors of the N+1 face images; determining the distances between the feature vector of the driver face image and the feature vectors of the N face images, respectively, to obtain N vector distances; and, when one of the N vector distances is smaller than a preset distance reference value, determining that the driver face image and the face image corresponding to that vector distance are face images of the same driver, and determining the driver's identity information from the identity information database.
- The computer device of claim 10, wherein the model of the preset neural network consists of six layers from input to output: the first layer comprises a first convolutional layer, a first activation layer, and a first down-sampling layer; the second layer comprises a second convolutional layer, a second activation layer, and a second down-sampling layer; the third layer comprises a third convolutional layer, a third activation layer, and a third down-sampling layer; the fourth layer comprises a fourth convolutional layer, a fourth activation layer, another fourth convolutional layer, another fourth activation layer, and a fourth down-sampling layer; the fifth layer comprises a fifth convolutional layer and a fifth activation layer; and the sixth layer comprises a first fully connected layer and a second fully connected layer.
- The computer device of claim 11, wherein the hidden layer of the preset neural network is the first fully connected layer, and the feature vectors of the N+1 face images are output through the first fully connected layer.
- The computer device of claim 8, wherein obtaining the corresponding expression information from the driver face image comprises: combining the action-form features of the respective facial parts in the driver face image to obtain an action-form feature set; and matching the action-form feature set against a salient-expression database and a micro-expression database to obtain corresponding salient-expression information and micro-expression information, the salient-expression information and the micro-expression information serving as the expression information.
- The computer device of claim 13, wherein the dangerous expression set comprises a dangerous salient-expression set and a dangerous micro-expression set, and matching the expression information against the dangerous expression set to obtain the corresponding expression judgment result comprises: matching the salient-expression information against the dangerous salient-expression set to obtain a corresponding salient-expression judgment result; matching the micro-expression information against the dangerous micro-expression set to obtain a corresponding micro-expression judgment result; and combining the salient-expression judgment result and the micro-expression judgment result to obtain the expression judgment result.
- A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the following steps: acquiring a driver face image and a driver body-movement image while the vehicle is in motion; obtaining corresponding expression information from the driver face image; obtaining the driver's driving action information from the driver body-movement image; matching the expression information against a dangerous expression set to obtain a corresponding expression judgment result; matching the driving action information against a dangerous driving action set to obtain a corresponding driving action judgment result; and issuing a warning prompt if the expression judgment result satisfies a preset condition and/or the driving action judgment result satisfies a preset condition.
- The computer-readable storage medium of claim 15, wherein, in acquiring the driver face image and the driver body-movement image while the vehicle is in motion, acquiring the driver body-movement image comprises: recording a video of the driver in real time while the vehicle is in motion; determining frame-extraction time points spaced at equal preset time intervals from the start of the video; and extracting from the video the video frames corresponding to the frame-extraction time points to obtain the driver body-movement images.
- The computer-readable storage medium of claim 15, wherein, after acquiring the driver face image and the driver body-movement image while the vehicle is in motion, the steps further comprise: obtaining all face images from an identity information database, the face images numbering N, N being an integer greater than 1; inputting the N face images and the driver face image, N+1 face images in total, into a preset neural network, and outputting from a hidden layer of the preset neural network the feature vectors of the N+1 face images; determining the distances between the feature vector of the driver face image and the feature vectors of the N face images, respectively, to obtain N vector distances; and, when one of the N vector distances is smaller than a preset distance reference value, determining that the driver face image and the face image corresponding to that vector distance are face images of the same driver, and determining the driver's identity information from the identity information database.
- The computer-readable storage medium of claim 17, wherein the model of the preset neural network consists of six layers from input to output: the first layer comprises a first convolutional layer, a first activation layer, and a first down-sampling layer; the second layer comprises a second convolutional layer, a second activation layer, and a second down-sampling layer; the third layer comprises a third convolutional layer, a third activation layer, and a third down-sampling layer; the fourth layer comprises a fourth convolutional layer, a fourth activation layer, another fourth convolutional layer, another fourth activation layer, and a fourth down-sampling layer; the fifth layer comprises a fifth convolutional layer and a fifth activation layer; and the sixth layer comprises a first fully connected layer and a second fully connected layer.
- The computer-readable storage medium of claim 18, wherein the hidden layer of the preset neural network is the first fully connected layer, and the feature vectors of the N+1 face images are output through the first fully connected layer.
- The computer-readable storage medium of claim 15, wherein obtaining the corresponding expression information from the driver face image comprises: combining the action-form features of the respective facial parts in the driver face image to obtain an action-form feature set; and matching the action-form feature set against a salient-expression database and a micro-expression database to obtain corresponding salient-expression information and micro-expression information, the salient-expression information and the micro-expression information serving as the expression information.
- The computer-readable storage medium of claim 16, wherein the dangerous expression set comprises a dangerous salient-expression set and a dangerous micro-expression set, and matching the expression information against the dangerous expression set to obtain the corresponding expression judgment result comprises: matching the salient-expression information against the dangerous salient-expression set to obtain a corresponding salient-expression judgment result; matching the micro-expression information against the dangerous micro-expression set to obtain a corresponding micro-expression judgment result; and combining the salient-expression judgment result and the micro-expression judgment result to obtain the expression judgment result.
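As a compact illustration of the claimed method, the core decision steps — the equal-interval frame-extraction time points of claim 2, the vector-distance identity check of claim 3, and the and/or warning condition of claims 1 and 7 — can be sketched in pure Python. All function names, the Euclidean distance metric, and the numeric values below are illustrative assumptions; the claims do not fix a particular distance measure, threshold, or data layout:

```python
import math


def frame_times(duration_s, interval_s):
    """Claim 2: frame-extraction time points spaced at equal preset
    intervals from the start of the video, up to its duration."""
    n = int(duration_s // interval_s) + 1
    return [i * interval_s for i in range(n)]


def euclidean(v1, v2):
    """Illustrative distance measure between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v1, v2)))


def identify_driver(driver_vec, db_vecs, threshold):
    """Claim 3: compare the driver's feature vector with the N database
    vectors; a distance below the preset reference value is treated as
    an identity match. Returns the matching database index, or None."""
    for idx, vec in enumerate(db_vecs):
        if euclidean(driver_vec, vec) < threshold:
            return idx
    return None


def should_warn(expressions, actions, dangerous_expressions, dangerous_actions):
    """Claims 1 and 7: match expression information and driving-action
    information against the dangerous sets; warn if either side matches
    (the and/or preset condition)."""
    expr_hit = any(e in dangerous_expressions for e in expressions)
    action_hit = any(a in dangerous_actions for a in actions)
    return expr_hit or action_hit
```

For example, with a 10-second video and a 2.5-second preset interval, `frame_times(10, 2.5)` yields the extraction points `[0.0, 2.5, 5.0, 7.5, 10.0]`; in a production system the feature vectors would come from the first fully connected layer of the six-layer network of claims 4 and 5 rather than being handcrafted.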
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910532866.XA CN110399793A (zh) | 2019-06-19 | 2019-06-19 | 基于图像识别的驾驶行为预警方法、装置和计算机设备 |
CN201910532866.X | 2019-06-19 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020253349A1 true WO2020253349A1 (zh) | 2020-12-24 |
Family
ID=68324171
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/085576 WO2020253349A1 (zh) | 2019-06-19 | 2020-04-20 | 基于图像识别的驾驶行为预警方法、装置和计算机设备 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110399793A (zh) |
WO (1) | WO2020253349A1 (zh) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110399793A (zh) * | 2019-06-19 | 2019-11-01 | 深圳壹账通智能科技有限公司 | 基于图像识别的驾驶行为预警方法、装置和计算机设备 |
CN111191523A (zh) * | 2019-12-11 | 2020-05-22 | 秒针信息技术有限公司 | 信息的显示方法和装置、存储介质及电子装置 |
CN111626101A (zh) * | 2020-04-13 | 2020-09-04 | 惠州市德赛西威汽车电子股份有限公司 | 一种基于adas的吸烟监测方法及系统 |
CN112016457A (zh) * | 2020-08-27 | 2020-12-01 | 青岛慕容信息科技有限公司 | 驾驶员分神以及危险驾驶行为识别方法、设备和存储介质 |
CN113129551A (zh) * | 2021-03-23 | 2021-07-16 | 广州宸祺出行科技有限公司 | 一种通过司机微表情自动报警的方法、系统、介质和设备 |
CN114170585B (zh) * | 2021-11-16 | 2023-03-24 | 广西中科曙光云计算有限公司 | 危险驾驶行为的识别方法、装置、电子设备及存储介质 |
CN114663863A (zh) * | 2022-02-24 | 2022-06-24 | 北京百度网讯科技有限公司 | 图像处理方法、装置、电子设备和计算机存储介质 |
CN114708628A (zh) * | 2022-03-07 | 2022-07-05 | 深圳市德驰微视技术有限公司 | 基于域控制器平台的车辆驾驶员监测方法及装置 |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2011159214A (ja) * | 2010-02-03 | 2011-08-18 | Fuji Heavy Ind Ltd | 行動検出装置 |
CN107220591A (zh) * | 2017-04-28 | 2017-09-29 | 哈尔滨工业大学深圳研究生院 | 多模态智能情绪感知系统 |
CN107697069A (zh) * | 2017-10-31 | 2018-02-16 | 上海汽车集团股份有限公司 | 汽车驾驶员疲劳驾驶智能控制方法 |
CN108537198A (zh) * | 2018-04-18 | 2018-09-14 | 济南浪潮高新科技投资发展有限公司 | 一种基于人工智能的驾驶习惯的分析方法 |
CN109766840A (zh) * | 2019-01-10 | 2019-05-17 | 腾讯科技(深圳)有限公司 | 人脸表情识别方法、装置、终端及存储介质 |
CN110399793A (zh) * | 2019-06-19 | 2019-11-01 | 深圳壹账通智能科技有限公司 | 基于图像识别的驾驶行为预警方法、装置和计算机设备 |
- 2019-06-19 CN CN201910532866.XA patent/CN110399793A/zh active Pending
- 2020-04-20 WO PCT/CN2020/085576 patent/WO2020253349A1/zh active Application Filing
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112820072A (zh) * | 2020-12-28 | 2021-05-18 | 深圳壹账通智能科技有限公司 | 危险驾驶预警方法、装置、计算机设备及存储介质 |
CN112634188A (zh) * | 2021-02-02 | 2021-04-09 | 深圳市爱培科技术股份有限公司 | 一种车辆远近景组合成像方法及装置 |
CN113723165A (zh) * | 2021-03-25 | 2021-11-30 | 山东大学 | 基于深度学习的待检测人员危险表情检测方法及系统 |
CN113723165B (zh) * | 2021-03-25 | 2022-06-07 | 山东大学 | 基于深度学习的待检测人员危险表情检测方法及系统 |
CN112959966A (zh) * | 2021-04-26 | 2021-06-15 | 积善云科技(武汉)有限公司 | 基于互联网出行和用户习惯深度学习的车载多媒体调控方法、系统、设备和计算机存储介质 |
CN113347381A (zh) * | 2021-05-24 | 2021-09-03 | 随锐科技集团股份有限公司 | 预测不雅举止轨迹的方法及系统 |
CN114475620A (zh) * | 2022-01-26 | 2022-05-13 | 南京科融数据系统股份有限公司 | 用于款箱押运系统的驾驶员验证方法及系统 |
CN114475620B (zh) * | 2022-01-26 | 2024-03-12 | 南京科融数据系统股份有限公司 | 用于款箱押运系统的驾驶员验证方法及系统 |
CN116729254A (zh) * | 2023-08-10 | 2023-09-12 | 山东恒宇电子有限公司 | 基于俯瞰图像的公交驾驶舱安全驾驶行为监测系统 |
CN117493434A (zh) * | 2023-11-03 | 2024-02-02 | 青岛以萨数据技术有限公司 | 一种人脸图像存储方法、设备及介质 |
CN117493434B (zh) * | 2023-11-03 | 2024-05-03 | 青岛以萨数据技术有限公司 | 一种人脸图像存储方法、设备及介质 |
Also Published As
Publication number | Publication date |
---|---|
CN110399793A (zh) | 2019-11-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2020253349A1 (zh) | 基于图像识别的驾驶行为预警方法、装置和计算机设备 | |
WO2019104930A1 (zh) | 一种身份鉴定方法、电子装置及计算机可读存储介质 | |
KR102174595B1 (ko) | 비제약형 매체에 있어서 얼굴을 식별하는 시스템 및 방법 | |
CN110348420B (zh) | 手语识别方法、装置、计算机可读存储介质和计算机设备 | |
US10528849B2 (en) | Liveness detection method, liveness detection system, and liveness detection device | |
WO2021196738A1 (zh) | 儿童状态检测方法及装置、电子设备、存储介质 | |
CN109241842B (zh) | 疲劳驾驶检测方法、装置、计算机设备及存储介质 | |
WO2018218839A1 (zh) | 一种活体识别方法和系统 | |
US11328418B2 (en) | Method for vein recognition, and apparatus, device and storage medium thereof | |
CN109830280A (zh) | 心理辅助分析方法、装置、计算机设备和存储介质 | |
CN109299690B (zh) | 一种可提高视频实时人脸识别精度的方法 | |
Zhao et al. | Applying contrast-limited adaptive histogram equalization and integral projection for facial feature enhancement and detection | |
Hassanat | Visual words for automatic lip-reading | |
CN113627256B (zh) | 基于眨眼同步及双目移动检测的伪造视频检验方法及系统 | |
CN112329727A (zh) | 一种活体检测方法和装置 | |
CN111507149B (zh) | 基于表情识别的交互方法、装置和设备 | |
CN109784179A (zh) | 基于微表情识别的智能监护方法、装置、设备及介质 | |
CN110147740B (zh) | 人脸识别方法、装置、设备和存储介质 | |
Lee | Detection and recognition of facial emotion using bezier curves | |
CN111369559A (zh) | 妆容评估方法、装置、化妆镜和存储介质 | |
WO2022034779A1 (ja) | 画像処理装置および画像処理方法 | |
TWI767775B (zh) | 影像式情緒辨識系統和方法 | |
Wei et al. | Three-dimensional joint geometric-physiologic feature for lip-reading | |
Chetty et al. | Multimedia sensor fusion for retrieving identity in biometric access control systems | |
CN112487980A (zh) | 基于微表情治疗方法、装置、系统与计算机可读存储介质 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 20826517 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 20826517 Country of ref document: EP Kind code of ref document: A1 |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 19.08.2022) |
|