CN109965889B - Fatigue driving detection method by using smart phone loudspeaker and microphone - Google Patents

Fatigue driving detection method by using smart phone loudspeaker and microphone

Info

Publication number
CN109965889B
Authority
CN
China
Prior art keywords
neural network
lstm
fatigue driving
driver
microphone
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910256804.0A
Other languages
Chinese (zh)
Other versions
CN109965889A (en)
Inventor
李凡
解亚东
吴玥
杨松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Technology BIT filed Critical Beijing Institute of Technology BIT
Priority to CN201910256804.0A priority Critical patent/CN109965889B/en
Publication of CN109965889A publication Critical patent/CN109965889A/en
Application granted granted Critical
Publication of CN109965889B publication Critical patent/CN109965889B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/16 Devices for psychotechnics; Testing reaction times; Devices for evaluating the psychological state
    • A61B5/18 Devices for psychotechnics; Testing reaction times; Devices for evaluating the psychological state for vehicle drivers or machine operators
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/72 Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7235 Details of waveform analysis
    • A61B5/7264 Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/59 Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
    • G06V20/597 Recognising the driver's state or behaviour, e.g. attention or drowsiness
    • G PHYSICS
    • G08 SIGNALLING
    • G08B SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B21/00 Alarms responsive to a single specified undesired or abnormal condition and not otherwise provided for
    • G08B21/02 Alarms for ensuring the safety of persons
    • G08B21/06 Alarms for ensuring the safety of persons indicating a condition of sleep, e.g. anti-dozing alarms
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B2503/00 Evaluating a particular growth phase or type of persons or animals
    • A61B2503/20 Workers
    • A61B2503/22 Motor vehicles operators, e.g. drivers, pilots, captains

Abstract

The invention discloses a fatigue driving detection method using the loudspeaker and microphone of a smartphone. The loudspeaker and microphone of the smartphone form a simple Doppler radar system: the Doppler effect produced when the driver moves is used to judge whether three typical fatigue-driving actions (nodding, yawning and abnormal control of the steering wheel) occur, and whether the driver is driving while fatigued is finally analyzed from the detection results of the three typical actions. The invention relies only on the smartphone loudspeaker emitting high-frequency sound beyond the range of human hearing and the microphone receiving the sound signals, so it has the advantages of low cost, strong anti-interference performance, no privacy-leakage problem and good user experience, and is particularly suitable for monitoring environments in which only the driver sits in the front row of the vehicle and road conditions are good. The three typical fatigue-driving actions detected by the method differ greatly from the actions of normal driving, and accurate action information can be obtained on a smartphone with limited sensing capability by combining undersampling and deep-learning techniques, so the method has high accuracy.

Description

Fatigue driving detection method by using smart phone loudspeaker and microphone
Technical Field
The invention relates to a fatigue driving detection method, in particular to one using the loudspeaker and microphone of a smartphone to monitor whether a driver is driving while fatigued, and belongs to the technical field of mobile computing applications.
Background
With the development of economy, the number of automobiles is increasing, and more traffic accidents are accompanied. Many traffic accidents are greatly related to fatigue driving of drivers. Since many fatigue driving events are not obvious and do not have serious consequences, most drivers are unaware of the dangers of fatigue driving. Studies have shown that drivers are much less concerned about fatigue driving than other problems, such as using a cell phone while driving, speeding or drunk driving. Therefore, it is necessary to develop a fatigue driving detection system that can remind the driver to rest.
At present, fatigue driving detection mainly relies on special devices additionally deployed on the vehicle or on the driver's body, which collect information on the driver's actions and the vehicle's state to analyze whether the driver is fatigued. For example, brain waves are analyzed with an electroencephalograph headset worn by the driver; the driver's head movement is sensed by an infrared sensor embedded in the headrest; and the driver's fatigue state is analyzed with a camera installed in the vehicle. Such monitoring devices generally suffer from high cost, the need to force users to wear them, and poor anti-interference performance, and most of them provide only coarse-grained monitoring information and cannot accurately detect the driver's fatigue state.
At present, there are also methods on the market that detect whether a driver is fatigued with a mobile-phone camera. For example, the camera tracks the direction of the driver's gaze to analyze whether the driver is paying attention; the fatigue state is analyzed by detecting the timing and frequency of blinking; or both front and rear cameras are used to monitor the driver and the road conditions simultaneously. However, such vision-based monitoring methods raise privacy concerns, their accuracy is strongly affected by ambient illumination and weather, and at least one camera must face the driver.
In addition, there are methods to monitor the driver's driving behavior using non-visual sensors on the cell phone. For example, an accelerometer and a gyroscope built in a mobile phone are used for estimating the running speed of the vehicle, and dynamic actions such as lane changing and turning can be recognized; the angle of rotation of the steering wheel is detected using an audio sensor in the handset. However, most methods of detecting driver behavior using non-visual sensors on cell phones do not focus on the field of fatigue driving.
In summary, there is an urgent need for a method for detecting fatigue driving using a non-visual sensor in a smart phone.
Disclosure of Invention
The invention aims to solve the problem that a low-cost and effective fatigue driving detection method is lacked at present, and provides a fatigue driving detection method by using a loudspeaker and a microphone of a smart phone.
The idea of the invention is as follows. The loudspeaker and microphone commonly equipped on existing smartphones form a simple Doppler radar system. The Doppler effect means that relative motion between a sound source and a receiver shifts the frequency of the sound observed at the receiver. The Doppler effect produced when the driver moves is used to judge whether the three typical fatigue-driving actions (nodding, yawning and abnormal control of the steering wheel) occur, and whether the driver is driving while fatigued is finally analyzed from the detection results of the three typical actions. The method is particularly suitable for environments in which only the driver sits in the front row of the vehicle and road conditions are good.
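To see the scale of the effect being measured, a back-of-the-envelope sketch of the expected Doppler shift may help; this is not part of the patent, and the ~1 m/s reflector speed is an assumed, typical value for a head movement:

```python
# Rough two-way Doppler-shift estimate for a tone reflected off a moving body.
# The reflector speed of ~1 m/s is an assumption, not a figure from the patent.
SPEED_OF_SOUND = 343.0  # m/s in air at room temperature

def doppler_shift(f0_hz: float, v_mps: float) -> float:
    """Approximate two-way (emit + reflect) Doppler shift for v << c."""
    return 2.0 * v_mps * f0_hz / SPEED_OF_SOUND

shift_hz = doppler_shift(20_000.0, 1.0)  # on the order of 100 Hz
```

A ~1 m/s motion thus shifts a 20 kHz tone by roughly 117 Hz, consistent in scale with the ±200 Hz band reported in the embodiment below.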
The purpose of the invention is realized by the following technical scheme:
a fatigue driving detection method by using a loudspeaker and a microphone of a smart phone comprises the following steps:
step one, training two long-time memory neural networks and a full-connection neural network model.
Step 1.1, collecting state data of a driver under a normal driving condition. A smart phone is placed in a vehicle normally driven by a driver, a speaker of the smart phone emits continuous high-frequency sound signals, Doppler frequency shift is generated after the continuous high-frequency sound signals are reflected by a moving human body, and the Doppler frequency shift is received by a microphone of the smart phone. Meanwhile, the real action information of the driver is collected through video recording equipment arranged in the vehicle.
And step 1.2, cutting the high-frequency sound signals collected in the step 1.1 into frames with the same length. For each acquired frame, a band pass filter is used to filter out signals outside the desired frequency. And performing undersampling and fast Fourier transform on the filtered signals to obtain the feature vector required by the subsequent training classifier.
In order to compute the detection result as soon as possible and remind the driver in time, it is preferable to segment the sound signal into frames of 0.25 second each.
Due to the limitations of the mobile phone's hardware performance, the collected signal cannot clearly distinguish different actions, so an undersampling technique is adopted to improve the resolution of each action. First, let fL and fH respectively denote the lower and upper frequency bounds after the frequency is changed by the Doppler effect, and let B = fH − fL denote the bandwidth of the signal. From the theory of undersampling (bandpass sampling), the new sampling rate f′s should satisfy

2fH / n ≤ f′s ≤ 2fL / (n − 1),  1 ≤ n ≤ ⌊fH / B⌋,

where ⌊·⌋ denotes rounding down to an integer and n is the undersampling factor. Then, for different n, it is calculated whether f′s satisfies the above inequality. The values of f′s calculated to satisfy the inequality include 11 kHz, 8.8 kHz, 7.3 kHz, 6.3 kHz and 5.5 kHz. In order to improve the system performance as much as possible, n = 8 is preferred, that is, f′s = 5.5 kHz.
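The rate-selection step above can be sketched as follows — a minimal check, assuming fL = 19.8 kHz and fH = 20.2 kHz (the 20 kHz carrier ± 200 Hz) and candidate rates obtained by dividing the 44.1 kHz microphone rate by an integer:

```python
import math

FS = 44_100.0                  # native microphone sampling rate (Hz)
F_L, F_H = 19_800.0, 20_200.0  # band edges after the Doppler shift (Hz)
B = F_H - F_L                  # signal bandwidth (400 Hz)

def valid_undersampling_rates():
    """Candidate rates FS/k that satisfy the bandpass-sampling condition
    2*F_H/n <= fs' <= 2*F_L/(n-1) for some integer undersampling factor n."""
    rates = []
    for k in range(2, 9):                    # integer dividers of the native rate
        fs_new = FS / k
        for n in range(2, math.floor(F_H / B) + 1):
            if 2 * F_H / n <= fs_new <= 2 * F_L / (n - 1):
                rates.append(fs_new)
                break
    return rates
```

Running this yields, among others, 11025, 8820, 7350, 6300 and 5512.5 Hz — the 11 kHz, 8.8 kHz, 7.3 kHz, 6.3 kHz and 5.5 kHz values quoted (rounded), with FS/8 = 5512.5 Hz corresponding to the preferred n = 8.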
The signal after the undersampling process only shows information in the time domain, and different actions are difficult to distinguish there. Therefore, to convert the signal from the time domain to the frequency domain, an N-point fast Fourier transform is performed on the undersampled frame, the phase value of each point is calculated from the transformed result, and the N phase values constitute the feature vector of the frame. The larger the value of N, the higher the resolution for different actions, but also the greater the computational burden on the handset. Preferably, N is 2048, which achieves high resolution without unduly increasing the computational burden on the handset.
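The per-frame feature extraction just described can be sketched as follows — a simplified illustration under the parameters in the text (0.25 s frames, undersampling factor 8, N = 2048). The synthetic input tone and the use of plain decimation as the undersampling step are assumptions for the sketch; in the actual pipeline, band-pass filtering precedes this stage.

```python
import numpy as np

FS = 44_100          # microphone sampling rate (Hz)
FRAME_SEC = 0.25     # frame length from the text
UNDERSAMPLE = 8      # undersampling factor n = 8
N_FFT = 2048         # FFT size N

def frame_feature_vector(frame: np.ndarray) -> np.ndarray:
    """Map one 0.25 s audio frame to an N-dim vector of FFT phase values."""
    decimated = frame[::UNDERSAMPLE]           # keep every 8th sample (fs' = FS/8)
    spectrum = np.fft.fft(decimated, n=N_FFT)  # N-point FFT (zero-padded)
    return np.angle(spectrum)                  # phase value of each point

# A synthetic 20 kHz tone standing in for the reflected signal.
t = np.arange(int(FS * FRAME_SEC)) / FS
features = frame_feature_vector(np.sin(2 * np.pi * 20_000 * t))
```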
Step 1.3, for each feature vector obtained in step 1.2, mark the true action category to which it belongs — nodding, yawning, abnormal control of the steering wheel, or other actions (non-fatigue actions such as playing with the phone, drinking water, eating or smoking) — according to the real action information collected by the video equipment, and use it as the label of the feature vector; at the same time, mark whether the driver was in a fatigue state at that moment. Then, a three-layer long short-term memory neural network LSTM-S is trained with the feature vectors and labels of nodding and yawning; this network can recognize the driver's nodding and yawning actions. A four-layer long short-term memory neural network LSTM-L is trained with the feature vectors and labels of abnormal control of the steering wheel and the other actions; this network can recognize the driver's abnormal steering-wheel control. Finally, taking the judgment results produced by LSTM-S and LSTM-L as feature vectors and whether the driver is driving while fatigued as the label, a three-layer fully-connected neural network DNN is trained to judge whether the driver is fatigued. The specific training method is as follows:
the LSTM-S comprises two LSTM layers and one Softmax layer, each input is 11 timestamps, that is, each input is a feature vector of the current frame and a feature vector of the previous 10 frames. And training the network by using the feature vectors and the labels which are processed before. For the tth timeout, the LSTM layer uses the formula ht=σ(W0[ht-1,xt+b0])·tanh(Ct) Will input xtMapping to a compressed vector htWherein W is0And b0Respectively representing a weight matrix and an offset vector, CtRepresenting the status of the tth timeout. For LSTM-S, the output result contains 3 categories in total, and the category probability vector is
Figure GDA0002431685970000039
Representing the probability that the tth timestep belongs to each class respectively, through Pt=s(WTht+ b) is calculated, where s (-) is the Softmax function, WTB is a weight matrix and b is a bias vector. Then, classifying the tth timestamp into PtThe class with the highest probability. In the training process, the difference between the predicted value and the true value is minimized by the network, the error is reduced by adopting a cross entropy cost function, and finally the LSTM-S is obtained by training.
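The classification step Pt = s(WT ht + b) can be sketched in isolation as follows; the hidden-vector size of 64 and the random weights are assumed values for illustration (the patent does not fix them), while the 3 output categories match LSTM-S:

```python
import numpy as np

def softmax(z: np.ndarray) -> np.ndarray:
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

HIDDEN, CLASSES = 64, 3                     # hidden size assumed; 3 classes as in LSTM-S
rng = np.random.default_rng(0)
W = rng.standard_normal((HIDDEN, CLASSES))  # weight matrix
b = rng.standard_normal(CLASSES)            # bias vector
h_t = rng.standard_normal(HIDDEN)           # compressed vector from the LSTM layer

P_t = softmax(W.T @ h_t + b)                # category probability vector
predicted = int(np.argmax(P_t))             # class with the highest probability
```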
The LSTM-L comprises three LSTM layers and one Softmax layer; each input consists of 28 timesteps, i.e., the feature vector of the current frame together with the feature vectors of the previous 27 frames, and the output contains 2 categories: abnormal control of the steering wheel, and other actions (non-fatigue actions such as playing with the phone, drinking water, eating or smoking). The training process is the same as that of LSTM-S: the network is trained with the processed feature vectors and labels. For the t-th timestep, the LSTM layer uses ht = σ(Wo[ht−1, xt] + bo) · tanh(Ct) to map the input xt to a compressed vector ht, where Wo and bo respectively denote a weight matrix and a bias vector, and Ct denotes the cell state at the t-th timestep. For LSTM-L, the output contains 2 categories in total; the category probability vector Pt, whose components represent the probabilities that the t-th timestep belongs to each class, is computed as Pt = s(WT ht + b), where s(·) is the Softmax function, WT is a weight matrix and b is a bias vector. The t-th timestep is then classified into the class with the highest probability in Pt. During training, the network minimizes the difference between the predicted and true values, using a cross-entropy cost function to reduce the error, finally yielding the trained LSTM-L.
LSTM-L differs from LSTM-S in that it has more timesteps and one more LSTM layer, so LSTM-L can evaluate the driver's steering-wheel behavior over a longer time period.
The final neural network DNN comprises three fully connected layers. During training, the input data are the judgment results produced by LSTM-S and LSTM-L, and the output is the ground-truth label of whether the driver is fatigued. A cross-entropy cost function is used to reduce the error, yielding the trained DNN.
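A minimal sketch of the fusion network's forward pass — three fully connected layers as stated, but with assumed layer widths and untrained random weights, since the patent fixes neither:

```python
import numpy as np

rng = np.random.default_rng(1)

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Assumed sizes: input = LSTM-S probabilities (3) + LSTM-L probabilities (2);
# output = {fatigued, not fatigued}. Hidden widths 16 and 8 are illustrative.
SIZES = [5, 16, 8, 2]
Ws = [rng.standard_normal((m, n)) * 0.1 for m, n in zip(SIZES, SIZES[1:])]
bs = [np.zeros(n) for n in SIZES[1:]]

def dnn_forward(x: np.ndarray) -> np.ndarray:
    """Three fully connected layers, softmax on the last."""
    for W, b in zip(Ws[:-1], bs[:-1]):
        x = relu(x @ W + b)
    return softmax(x @ Ws[-1] + bs[-1])

x = np.concatenate([[0.7, 0.2, 0.1], [0.9, 0.1]])  # example LSTM-S / LSTM-L outputs
p = dnn_forward(x)
```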
And step two, fatigue detection is carried out on the driver in the driving state by using the trained DNN.
First, the loudspeaker and microphone of the smartphone collect, in real time, the high-frequency sound signals reflected back by the driver while driving, and the feature vector of the signal at the current moment is obtained with the method of step 1.2. A fixed number of feature vectors from the period before the current moment are selected and fed into the two trained long short-term memory networks LSTM-S and LSTM-L of step 1.3; the results produced by the two networks are then fed together into the DNN, which finally judges whether the driver is driving while fatigued.
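The buffering implied by step two — each new frame must be classified together with its recent history — can be sketched as follows, with the window sizes 11 and 28 taken from step 1.3 and placeholder feature vectors standing in for real ones:

```python
from collections import deque

WIN_S, WIN_L = 11, 28  # timesteps expected by LSTM-S and LSTM-L

class FeatureBuffer:
    """Keeps the most recent feature vectors; the longer LSTM-L window
    also covers the shorter LSTM-S window."""

    def __init__(self):
        self._buf = deque(maxlen=WIN_L)

    def push(self, fv):
        self._buf.append(fv)

    def windows(self):
        """Return (LSTM-S input, LSTM-L input), or None until enough frames."""
        if len(self._buf) < WIN_L:
            return None
        frames = list(self._buf)
        return frames[-WIN_S:], frames

buf = FeatureBuffer()
for i in range(30):        # 30 frames = 7.5 s of 0.25 s frames
    buf.push([float(i)])   # placeholder feature vectors
short_win, long_win = buf.windows()
```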
Advantageous effects
According to the invention, the identification of the typical actions of fatigue driving can be realized only by means of the fact that the loudspeaker in the smart phone emits high-frequency sound beyond the hearing range of human ears and the microphone receives sound signals, so that the detection of the fatigue driving of the driver is realized. Therefore, the invention does not depend on various sensors and wearable equipment, has low cost, strong anti-interference performance, no privacy disclosure problem and good user experience, and is suitable for the monitoring environment with only drivers at the front row in the vehicle and better road conditions.
In addition, the high-frequency sound adopted by the method exceeds the frequency of most noise in life, so that the method is not easily interfered by environmental noise, and the environmental robustness of the fatigue driving monitoring method is greatly enhanced; the three typical fatigue driving actions detected by the method are greatly different from the actions in normal driving, and accurate action information can be obtained on a smart phone with limited perception capability by combining an undersampling technology and a deep learning technology, so that the method has high accuracy.
Drawings
Fig. 1 is a schematic diagram of a fatigue driving detection method according to an embodiment of the present invention.
Fig. 2 is a plan view of an experimental site according to an embodiment of the present invention.
FIG. 3 shows the accuracy of the motion and fatigue detection for different drivers according to an embodiment of the present invention.
Fig. 4 shows the detection accuracy of the motion and fatigue detection of a mobile phone placed at different positions according to the embodiment of the present invention.
FIG. 5 shows the accuracy of motion and fatigue detection under different interferences according to an embodiment of the present invention.
Detailed Description
The method of the present invention will be described in further detail with reference to the following examples and the accompanying drawings.
As shown in fig. 1, a fatigue driving detection method using a speaker and a microphone of a smart phone includes the steps of:
step one, training two long-time memory neural networks and a full-connection neural network model.
Step 1.1, collecting state data of a driver under a normal driving condition.
5 drivers (3 males and 2 females) were recruited to drive different vehicles for data collection. A mobile phone loudspeaker arranged in the vehicle emits continuous high-frequency sound signals, Doppler frequency shift is generated after the continuous high-frequency sound signals are reflected by a moving human body, and the Doppler frequency shift is received by a microphone of the mobile phone. Meanwhile, real action information of the driver is collected through video equipment arranged in the vehicle.
The Doppler effect means that the frequency of the sound emitted by a sound source, as observed at a receiver, varies due to the relative motion of the source and the receiver. In this embodiment, the speaker of the mobile phone emits ultrasonic waves as the sound source, and the sound signal is reflected by an object and then received by the microphone of the mobile phone. Thus, the handset is both the source and the receiver, and the object reflecting the signal can be regarded as a virtual sound source.
The mobile-phone loudspeaker continuously emits ultrasonic waves. So that the driver is not disturbed and interference from environmental noise is reduced, the ultrasonic frequency should be as high as possible; most existing smartphones support emission up to 20 kHz, so 20 kHz is selected as the ultrasonic frequency. The sampling rate of the microphone is set to 44.1 kHz, the default rate, which by the Nyquist criterion suffices for sounds at frequencies of 20 kHz and somewhat above.
And step 1.2, cutting the high-frequency sound signals collected in the step 1.1 into frames with the same length. And for each acquired frame, filtering out signals beyond the required frequency by adopting a band-pass filter, and performing undersampling and fast Fourier transform on the filtered signals to obtain the feature vector required by a classifier for subsequent training.
In order to compute the detection result as soon as possible and remind the driver in time, the sound signal is segmented into frames of 0.25 second each. Experiments show that the Doppler frequency shifts generated by the three typical fatigue-driving actions are concentrated between −200 Hz and 200 Hz; that is, the three kinds of action information are mainly contained in [19.8 kHz, 20.2 kHz]. Therefore, a band-pass filter is first used to filter out the useless information at other frequencies, retaining only the information in [19.8 kHz, 20.2 kHz].
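The filtering step can be illustrated with a simple FFT-mask band-pass filter — a sketch only, not the authors' filter design; any band-pass filter passing the 19.8–20.2 kHz band serves the same purpose:

```python
import numpy as np

FS = 44_100
BAND = (19_800.0, 20_200.0)  # pass band around the 20 kHz carrier

def bandpass_fft(x: np.ndarray, fs: float, band) -> np.ndarray:
    """Zero all FFT bins outside the pass band, then invert the FFT."""
    X = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    X[(freqs < band[0]) | (freqs > band[1])] = 0.0
    return np.fft.irfft(X, n=len(x))

# 20 kHz carrier plus a 5 kHz "noise" tone; the filter keeps only the former.
t = np.arange(int(FS * 0.25)) / FS
x = np.sin(2 * np.pi * 20_000 * t) + np.sin(2 * np.pi * 5_000 * t)
y = bandpass_fft(x, FS, BAND)
```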
Due to the limitations of the mobile phone's hardware performance, the collected signal cannot clearly distinguish different actions, so this embodiment adopts an undersampling technique to improve the resolution of each action. First, let fL and fH respectively denote the lower and upper frequency bounds after the frequency is changed by the Doppler effect, and let B = fH − fL denote the bandwidth of the signal. From the theory of undersampling (bandpass sampling), the new sampling rate f′s should satisfy

2fH / n ≤ f′s ≤ 2fL / (n − 1),  1 ≤ n ≤ ⌊fH / B⌋,

where ⌊·⌋ denotes rounding down to an integer and n is the undersampling factor. Then, for different n, it is calculated whether f′s satisfies the above inequality. The values of f′s calculated to satisfy the inequality include 11 kHz, 8.8 kHz, 7.3 kHz, 6.3 kHz and 5.5 kHz. To maximize system performance, n = 8 is finally determined, i.e. f′s = 5.5 kHz.
The signal after the undersampling process can only show information in the time domain, and different actions are difficult to distinguish in the time domain. Therefore, in order to convert the signal in the time domain to the frequency domain, the present embodiment performs N-point fast fourier transform on the undersampled frame, calculates a phase value of each point according to the transformed result, and the N phase values constitute a feature vector of the frame.
The larger the value of N, the higher the resolution for different actions, but also the greater the computational burden on the handset. Therefore, N = 2048 is selected through experiments, which achieves high resolution without increasing the computational pressure on the mobile phone.
Step 1.3, for each feature vector obtained after the processing of step 1.2, manually mark the true action category to which it belongs — nodding, yawning, abnormal control of the steering wheel, or other actions (non-fatigue actions such as playing with the phone, drinking water, eating or smoking) — according to the real action information collected by the video equipment, and use this behavior as the label of the feature vector; at the same time, mark whether the driver was in a fatigue state at that moment. Then, a three-layer long short-term memory neural network LSTM-S is trained with the feature vectors and labels of nodding and yawning; this network can recognize the driver's nodding and yawning actions. A four-layer long short-term memory neural network LSTM-L is trained with the feature vectors and labels of abnormal control of the steering wheel and the other actions; this network can recognize the driver's abnormal steering-wheel control. Finally, taking the judgment results produced by LSTM-S and LSTM-L as feature vectors and whether the driver is driving while fatigued as the label, a three-layer fully-connected neural network DNN is trained to judge whether the driver is fatigued.
First, the LSTM-S contains two LSTM layers and one Softmax layer; each input consists of 11 timesteps, i.e., the feature vector of the current frame together with the feature vectors of the previous 10 frames.
The network is then trained using the previously processed feature vectors and labels. For the t-th timestep, the LSTM layer uses the formula ht = σ(Wo[ht−1, xt] + bo) · tanh(Ct) to map the input xt to a compressed vector ht, where Wo and bo respectively denote a weight matrix and a bias vector, and Ct denotes the cell state at the t-th timestep. For LSTM-S, the output contains 3 categories in total; the category probability vector Pt, whose components represent the probabilities that the t-th timestep belongs to each class, can be computed as Pt = s(WT ht + b), where s(·) is the Softmax function, WT is a weight matrix and b is a bias vector. The t-th timestep is then classified into the class with the highest probability in Pt. During training, the network minimizes the difference between the predicted and true values, using a cross-entropy cost function to reduce the error, finally yielding the trained LSTM-S.
The LSTM-L comprises three LSTM layers and one Softmax layer; each input consists of 28 timesteps, i.e., the feature vector of the current frame together with the feature vectors of the previous 27 frames, and the output contains 2 categories: abnormal control of the steering wheel, and other actions (non-fatigue actions such as playing with the phone, drinking water, eating or smoking). The training process is the same as the LSTM-S training process described above. LSTM-L differs from LSTM-S in that it has more timesteps and one more LSTM layer, so LSTM-L can evaluate the driver's steering-wheel behavior over a longer time period.
The final neural network DNN comprises three fully connected layers. During training, the input data are the judgment results produced by LSTM-S and LSTM-L, and the output is the ground-truth label of whether the driver is fatigued. A cross-entropy cost function is used to reduce the error, yielding the trained DNN.
Step two: in practical application, the mobile-phone loudspeaker and microphone first collect, in real time, the high-frequency sound signals reflected back while the driver is driving, and the method of step 1.2 is then used to obtain the feature vector of the signal at the current moment. A fixed number of feature vectors from the short period before the current moment are selected and fed into the two long short-term memory networks trained in step 1.3; the results produced by the two networks are then fed together into the DNN, which finally judges whether the driver is driving while fatigued.
Firstly, when a driver drives the vehicle, the mobile phone is placed in the vehicle, and the speaker of the mobile phone continuously emits 20kHz high-frequency sound which is reflected by the human body of the driver and then is received by the microphone of the mobile phone. The mobile phone continuously divides the received signals into frames with the size of 0.25 second, immediately sends the current frames into a band-pass filter for filtering, then performs undersampling on the filtered signals to improve the resolution, and finally obtains the feature vector through fast Fourier transform of 2048 points.
Then this feature vector, together with the 10 feature vectors before the current moment, is fed into the LSTM-S trained in step 1.3; at the same time it is fed, together with the 27 feature vectors before the current moment, into the LSTM-L trained in step 1.3. The results produced by the two networks are then fed together into the trained DNN, which finally judges whether the driver is driving while fatigued.
Examples
In order to test the performance of the method, the method is compiled into an android application program which is deployed in android mobile phones of different models. And 5 drivers (3 males and 2 females) were recruited to drive different vehicles for the experiment. After each driving, the driver needs to record the fatigue state during driving. In order to ensure safety, an open and spacious road is selected as an experimental site, and the speed of the vehicle is limited to 40 km/h. Fig. 2 is a plan view of an experimental scene.
First, the overall accuracy of the method was tested. The 5 drivers drove the experimental vehicle at random around the experimental site according to their own driving habits, and the mobile phone in the vehicle analyzed the driver's fatigue state. Fig. 3 shows the experimental results for the different drivers; the detection accuracy is the ratio of the number of times a behavior is correctly detected to the number of times the driver actually performs that behavior. As can be seen from the figure, the average detection accuracy for the 3 typical fatigue-driving actions is 93.31%, and the lowest accuracy among all 5 drivers and the 3 typical actions is 89.57%. The accuracy of fatigue detection is not less than 94.31%, which fully shows that the method has high accuracy.
Then, the detection capability with the phone placed at different positions was tested. Different drivers prefer different placements in the vehicle, mainly the left side of the instrument panel, the right side of the instrument panel, the storage space near the door, and the storage space near the gear lever. Fig. 4 shows the results for these positions; the method achieves high accuracy at all of them. When the phone is placed on the instrument panel it directly faces the driver, so higher accuracy is obtained; when it is placed in the storage space near the gear lever, accuracy is lower because the phone is farther from the driver's head and more objects interfere. Nevertheless, at all positions the average detection accuracy for the three typical actions is no less than 89.43%, and the fatigue detection accuracy is no less than 94.84%.
Finally, the robustness of the method against other interference factors was tested. Four interference conditions were set up: listening to music while driving, driving on an uneven road, a fixed speed of 20 km/h, and a fixed speed of 40 km/h. Fig. 5 shows the results under these interferences; all four have little influence on accuracy. The uneven road causes the largest interference, because the driver must frequently adjust the steering wheel and the vehicle body shakes considerably. Even so, accuracy remains above 92%, showing that the method works well regardless of other interference in the environment.
When fatigued, a driver exhibits characteristic behaviors such as nodding, yawning, and abnormal control of the steering wheel. Different actions produce different Doppler frequency shifts; by analyzing the phase information in the collected Doppler shifts, the driver's current motion state can be obtained and fatigue can be inferred. The method therefore uses the phone's loudspeaker and microphone to form a simple Doppler radar system, collects the Doppler shifts produced by the driver's movements, and applies effective algorithms to ensure high stability and accuracy.
The above-described embodiments are further illustrative of the present invention and are not intended to limit the scope of the invention, which is to be accorded the widest scope consistent with the principles and spirit of the present invention.

Claims (8)

1. A fatigue driving detection method by using a loudspeaker and a microphone of a smart phone is characterized by comprising the following steps:
step one, training two long short-term memory neural networks and one fully-connected neural network model, comprising the following steps:
step 1.1, collecting state data of the driver under normal driving conditions: a smart phone is placed in the vehicle while the driver drives normally; the mobile phone loudspeaker emits a continuous high-frequency sound signal, which is reflected by the moving human body, thereby acquiring a Doppler frequency shift, and is received by the mobile phone microphone; meanwhile, the real action information of the driver is collected by video equipment installed in the vehicle;
step 1.2, dividing the high-frequency sound signal collected in step 1.1 into frames of equal length; for each frame, filtering out signals outside the required frequency band with a band-pass filter, then performing undersampling and an N-point fast Fourier transform on the filtered signal, and calculating the phase value of each point from the transform result, the N phase values forming the feature vector of the frame;
step 1.3, marking, according to the real action information collected by the video equipment, the action type to which each feature vector obtained in step 1.2 actually belongs, wherein the action types comprise nodding, yawning, abnormal steering-wheel control and non-fatigue driving actions and serve as the labels of the feature vectors; simultaneously marking whether the driver is in a fatigue state at that moment;
then, training a three-layer long short-term memory neural network LSTM-S with the feature vectors and labels of nodding and yawning, so that this network can identify the driver's nodding and yawning actions; and training a four-layer long short-term memory neural network LSTM-L with the feature vectors and labels of abnormal steering-wheel control and non-fatigue driving actions, so that this network can identify the driver's abnormal steering-wheel control;
finally, taking the judgment results produced by the three-layer LSTM-S and the four-layer LSTM-L as feature vectors, and whether the driver is driving while fatigued as the label, training a three-layer fully-connected neural network DNN to judge whether fatigue driving exists, while adopting a cross-entropy cost function to reduce the error;
step two, performing fatigue detection on the driver in the driving state with the trained three-layer fully-connected neural network DNN:
firstly, acquiring in real time, with the smart phone loudspeaker and microphone, the high-frequency sound signal reflected by the driver during driving, and obtaining the feature vector of the signal at the current moment by the method of step 1.2; then selecting a fixed number of feature vectors from the period before the current moment and feeding them into the three-layer LSTM-S and four-layer LSTM-L trained by the method of step 1.3; the results produced by the two networks are fed together into the three-layer fully-connected neural network DNN, which finally judges whether the driver is driving while fatigued.
2. The method for detecting fatigue driving using a speaker and a microphone of a smart phone according to claim 1, wherein in step 1.2, the collected high frequency sound signals are divided into one frame every 0.25 seconds.
3. The method for detecting fatigue driving using a smart phone speaker and microphone according to claim 1, wherein in step 1.2, the method for performing undersampling processing on the filtered signal to improve the resolution of each action comprises:
first, let fL and fH denote respectively the lower-limit and upper-limit frequency values after the frequency has been changed by the Doppler effect, and let B = fH − fL denote the bandwidth of the signal; the new sampling rate fs must satisfy

2·fH / n ≤ fs ≤ 2·fL / (n − 1), 1 ≤ n ≤ ⌊fH / B⌋,

where ⌊·⌋ denotes rounding down to an integer and n is the undersampling factor;

then, for different values of n, it is checked whether the above inequality is satisfied.
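The band-pass (undersampling) sampling condition of claim 3 can be checked programmatically. The sketch below assumes the standard band-pass sampling condition 2·fH/n ≤ fs ≤ 2·fL/(n−1) for integer n up to ⌊fH/B⌋; the illustrative values fL = 20 kHz and fH = 22 kHz are assumptions (not stated in the claim), chosen because they make the lower bounds for n = 4…7 come out near the 11 kHz, 8.8 kHz, 7.3 kHz and 6.3 kHz rates listed in claim 4.

```python
import math

def undersampling_rates(f_l, f_h):
    """Enumerate the admissible undersampling rate intervals:
    for each n, 2*f_h/n <= f_s <= 2*f_l/(n-1) must be satisfiable."""
    bandwidth = f_h - f_l
    n_max = math.floor(f_h / bandwidth)     # upper bound on the undersampling factor
    rates = []
    for n in range(2, n_max + 1):
        lo, hi = 2 * f_h / n, 2 * f_l / (n - 1)
        if lo <= hi:                        # inequality satisfiable for this n
            rates.append((n, lo, hi))
    return rates
```

Any fs inside one of the returned intervals preserves the Doppler-shifted band without aliasing distortion.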
4. The method of claim 3, wherein the smart phone speaker and microphone are used to detect fatigue driving, and the values of the new sampling rate fs include 11 kHz, 8.8 kHz, 7.3 kHz and 6.3 kHz.
5. The method as claimed in claim 3 for detecting fatigue driving using a smart phone speaker and microphone, wherein n is taken as 8.
6. The method for detecting fatigue driving by using the speaker and the microphone of the smart phone according to claim 1, wherein in step 1.2, the value of N is 2048.
7. The method for detecting fatigue driving by using the speaker and the microphone of the smart phone according to claim 1, wherein in step 1.3, the training method of the three-layer long short-term memory neural network LSTM-S is as follows:

the three-layer LSTM-S comprises two LSTM layers and a Softmax layer, and 11 timestamps are input each time, i.e., each input consists of the feature vector of the current frame and the feature vectors of the 10 previous frames;

the network is trained with the processed feature vectors and labels; for the t-th timestamp, the LSTM layer uses ht = σ(W0·[ht−1, xt] + b0) · tanh(Ct) to map the input xt to a compressed vector ht, where W0 and b0 denote a weight matrix and a bias vector respectively, and Ct denotes the cell state at the t-th timestamp;

for LSTM-S, the output contains 3 categories in total, and the category probability vector Pt = [pt1, pt2, pt3] represents the probabilities that the t-th timestamp belongs to each category; it is calculated by Pt = s(WT·ht + b), where s(·) is the Softmax function, WT is a weight matrix and b is a bias vector;

then the t-th timestamp is classified into the category with the highest probability in Pt; during training, the network minimizes the difference between the predicted value and the true value, a cross-entropy cost function is adopted to reduce the error, and the three-layer long short-term memory neural network LSTM-S is finally obtained.
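The per-timestamp Softmax classification described in claim 7 can be sketched as follows. The weight shapes below are illustrative; the actual W and b are learned during LSTM-S training.

```python
import numpy as np

def softmax(z):
    """Numerically stable Softmax: s(z) in the claim."""
    e = np.exp(z - z.max())
    return e / e.sum()

def classify_timestamp(h_t, W, b):
    """Compute P_t = s(W^T h_t + b) and assign the timestamp to the
    category with the highest probability in P_t."""
    p_t = softmax(W.T @ h_t + b)
    return int(np.argmax(p_t)), p_t
```

For LSTM-S, W would map the compressed vector h_t onto 3 logits (nodding, yawning, other); for LSTM-L, onto 2.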
8. The method for detecting fatigue driving by using the speaker and the microphone of the smart phone according to claim 1, wherein in step 1.3, the training method of the four-layer long short-term memory neural network LSTM-L is as follows:

the four-layer LSTM-L comprises three LSTM layers and a Softmax layer; 28 timestamps are input each time, i.e., each input consists of the feature vector of the current frame and the feature vectors of the 27 previous frames, and the output contains 2 categories, namely abnormal steering-wheel control and non-fatigue driving actions;

the network is trained with the processed feature vectors and labels; for the t-th timestamp, the LSTM layer uses ht = σ(W0·[ht−1, xt] + b0) · tanh(Ct) to map the input xt to a compressed vector ht, where W0 and b0 denote a weight matrix and a bias vector respectively, and Ct denotes the cell state at the t-th timestamp;

for the four-layer LSTM-L, the output contains 2 categories in total, and the category probability vector Pt = [pt1, pt2] represents the probabilities that the t-th timestamp belongs to each category; it is calculated by Pt = s(WT·ht + b), where s(·) is the Softmax function, WT is a weight matrix and b is a bias vector;

then the t-th timestamp is classified into the category with the highest probability in Pt; during training, to minimize the difference between the predicted value and the true value, a cross-entropy cost function is adopted to reduce the error, and the four-layer long short-term memory neural network LSTM-L is finally obtained;

the four-layer LSTM-L differs from the three-layer LSTM-S in that it takes more timestamps and has one more LSTM layer, so it can evaluate the driver's steering-wheel control behavior over a longer period of time.
CN201910256804.0A 2019-04-01 2019-04-01 Fatigue driving detection method by using smart phone loudspeaker and microphone Active CN109965889B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910256804.0A CN109965889B (en) 2019-04-01 2019-04-01 Fatigue driving detection method by using smart phone loudspeaker and microphone


Publications (2)

Publication Number Publication Date
CN109965889A CN109965889A (en) 2019-07-05
CN109965889B true CN109965889B (en) 2020-06-16


Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111914707A (en) * 2020-07-22 2020-11-10 上海大学 System and method for detecting drunkenness behavior
CN113033407B (en) * 2021-03-26 2022-07-22 北京理工大学 Non-contact type fitness monitoring method using intelligent sound box
CN113069105B (en) * 2021-03-26 2022-03-04 北京理工大学 Method for detecting smoking behavior of driver by using loudspeaker and microphone of smart phone
CN114212093B (en) * 2021-12-08 2024-03-12 浙江大学 Safe driving monitoring method, system and storable medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN201142112Y (en) * 2008-01-08 2008-10-29 徐建荣 Vehicle safety drive reminding instrument
US10004431B2 (en) * 2014-03-13 2018-06-26 Gary Stephen Shuster Detecting medical status and cognitive impairment utilizing ambient data
KR20170036428A (en) * 2015-09-24 2017-04-03 삼성전자주식회사 Driver monitoring method and driver monitoring apparatus using wearable device
CN108670278A (en) * 2018-05-30 2018-10-19 东南大学 A kind of driver fatigue detection and alarm system and method based on smart mobile phone



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant