CN115359462A - Bus driver fatigue parameter compensation and double-track parallel detection method - Google Patents

Bus driver fatigue parameter compensation and double-track parallel detection method

Info

Publication number
CN115359462A
Authority
CN
China
Prior art keywords
eye
mouth
time
frames
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210943716.XA
Other languages
Chinese (zh)
Inventor
Dong Hongzhao
Fang Haojie
Quan Cheng
Lin Shaoxuan
Yang Jiawei
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT filed Critical Zhejiang University of Technology ZJUT
Priority to CN202210943716.XA
Publication of CN115359462A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/59 Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
    • G06V20/597 Recognising the driver's state or behaviour, e.g. attention or drowsiness
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/62 Extraction of image or video features relating to a temporal dimension, e.g. time-based feature extraction; Pattern tracking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18 Eye characteristics, e.g. of the iris
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Ophthalmology & Optometry (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Traffic Control Systems (AREA)

Abstract

A bus driver fatigue parameter compensation and double-track parallel detection method comprises the following steps: S1, making labels for the driver's blink and yawn image data; S2, using a facial key-point algorithm to count the bus driver's closed-eye and open-mouth frames in the frontal-face state, comparing the counts with the current data preprocessing method, and compensating according to the ratio; S3, carrying out double-track division of the fatigue-state time series: after the image frames are detected in time order, the detection results of all regions are output and, combining vehicle speed and vehicle condition, divided into a double-track time sequence according to the complete time-slice results; S4, setting a double-track early-warning mechanism for the fatigue-state time series. The analyzable data comprise the number of blink frames, the blink frequency, the number of open-mouth frames, and the number of yawns. perclos_eye is defined to calculate the ratio of closed-eye frames to total frames per unit time, which reflects the ratio of eye-closure duration to detection time and the driver's fatigue state.

Description

Bus driver fatigue parameter compensation and double-track parallel detection method
Technical Field
The invention relates to the field of bus driver fatigue state detection methods, in particular to a bus driver fatigue parameter compensation and double-track parallel detection method.
Background
Traffic accidents caused by bus driver fatigue have serious consequences and a harmful social impact. Bus safety is directly related to the driver's driving behavior, and the driver's fatigue state directly influences that behavior. Real-time detection and identification of the driver's fatigue state can therefore reduce traffic accidents and improve overall bus operation efficiency and safety.
At present, practical systems calculate blink and yawn frequency with a facial key-point algorithm, which achieves real-time fatigue-state detection under normal driving, but several problems remain. 1) Current detection methods target only the normal driving state, whereas bus driving involves frequent stops, waiting at traffic lights, yielding to pedestrians, and other high-frequency idling states that still belong to the driver's working time; in subsequent bus simulated-driving experiments, idling at stops, traffic lights, and the like accounted for 15%-20% of the total driving time, yet blinks accounted for 30%-35% of the total count. The driver's attention switches in the idle state, he tends to relax, blink and yawn frequency increase, the fatigue state is expressed more strongly, and the true fatigue state is more exposed. 2) When the bus is at a stop, the driver must watch whether passengers scan the fare code and communicates with them to some degree. During this period the driver turns the head widely, nods frequently, adjusts the mask repeatedly, and so on; the facial key-point algorithm cannot accurately locate the eye and mouth regions in such images and deviates greatly, so the subsequent fatigue parameters are computed inaccurately and the driver's fatigue state cannot be judged accurately. 3) The currently set thresholds suit only the normal driving state; in idle states such as stops and waiting at traffic lights the driver's attention switches, he easily relaxes, and blink and yawn frequency increase.
To address scenes the facial key-point algorithm cannot detect, many researchers use target detection algorithms for recognition, but the label-making standards easily diverge, because quantitative criteria such as P80 (the eye is considered closed when the eyelid covers more than 80% of the pupil) cannot be applied during image labelling.
In view of these problems, a bus driver fatigue parameter compensation and double-track parallel detection method is urgently needed: one that compensates the blink and yawn fatigue parameters obtained by a target detection algorithm, detects the driver's fatigue state in both the normal driving state and the idle state (stops, waiting at traffic lights, and the like), sets thresholds for each separately, gives real-time early warning of fatigue over the whole driving process, and improves bus driving safety.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a bus driver fatigue parameter compensation and double-track parallel detection method.
The method distinguishes the bus driver's normal driving state from the idle state (stops, waiting at traffic lights, and the like) by combining vehicle speed and vehicle condition, uses an image recognition algorithm to identify the driver's eye open/closed and mouth open/closed states in real time, and assigns the recognition results to different driving-state time dimensions according to vehicle speed and condition; that is, the recognition results of the normal driving state and the idle state are separately extracted and spliced, and double-track parallel compensation, calculation, and analysis of the fatigue-state parameters are carried out on the respective time dimensions.
The technical scheme of the invention is as follows:
a bus driver fatigue parameter compensation and double-track parallel detection method comprises the following steps:
s1, formulating an image data label;
the current target detection algorithm has excellent detection effects on image classification and positioning, such as SSD, YOLO, faster-RCNN and the like, and can be used for the method. Because the facial fatigue parameters are calculated by using the image data, the phenomena of blinking and yawning directly reflect the fatigue degree of the driver. The image data label making method comprises the following steps:
a) Blinking: the image data are labelled from the time-series perspective; that is, over the complete process from eye open to eye closed and back to eye open, frames with fully closed eyes are labelled closed-eye by manual judgment, and the remaining frames are labelled open-eye.
b) Yawning: a yawn runs from closed mouth to open mouth and back to closed mouth, so the critical open-mouth state is hard to define precisely. Repeated experimental observation shows that a yawn normally lasts 2-3 seconds, and some last even longer, 4-5 seconds. Different labelling methods are set for yawn image data with and without a mask. The unmasked yawn data label the mouth region of the image from the continuous time-series perspective to avoid confusion with states such as normal speech; after several experiments, the middle X% of the image data of the complete unmasked yawn time sequence is determined and intercepted, the mouth region is labelled open-mouth, and the rest is labelled closed-mouth.
During the experiments, emotions such as squinting, raised eyebrows, and frowning were observed when the driver wears a mask, and the region near the eyes visibly changes as the mask is stretched wide. A single feature can be confused with other actions and expressions: a person may squint and frown when angry, and the mask stretches wide when laughing. Therefore the features of the two partial regions (the mask region and the eye and nearby region) are combined, and the masked yawn image data label the mask and the eye and nearby regions, where the face is stretched wide, from the complete time-series perspective. Because the mouth-region features almost disappear behind a mask, and in the beginning and ending parts of a yawn the mask is no longer stretched wide while the facial expression returns to calm, the middle Y% of the image data of the complete masked yawn time sequence is labelled yawning, to improve the accuracy of yawn counting with a mask. A sketch of this middle-percentage labelling rule follows.
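The helper below is a minimal Python illustration of the middle-percentage rule under stated assumptions; the function name and the sample clip are hypothetical, not from the patent:

```python
def middle_fraction(frames, fraction):
    """Keep only the central `fraction` of one complete yawn frame sequence."""
    n = len(frames)
    keep = int(round(n * fraction))
    start = (n - keep) // 2
    return frames[start:start + keep]

# Hypothetical 3-second yawn clip at 30 fps (90 frames):
yawn_clip = list(range(90))
open_mouth_frames = middle_fraction(yawn_clip, 0.70)  # unmasked: middle X% (70%)
yawning_frames = middle_fraction(yawn_clip, 0.50)     # masked: middle Y% (50%)
```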
S2, setting the parameter compensation method;
According to the principle of the image data preprocessing method, the current image-frame predictions of closed eyes, open mouth, and yawning are fewer than those of the facial key-point algorithm, so a fatigue-parameter compensation method is proposed. The facial key-point algorithm (Dlib library) counts the bus driver's closed-eye and open-mouth frames in the frontal-face state; these counts are compared with the current data preprocessing method and compensated according to the ratio.
S21) calculating the eye aspect ratio;
E_AR, the ratio of Euclidean distances between the eye's longitudinal and transverse landmarks, directly reflects the degree of eye closure. Using the 6 eye key-point positions, E_AR is calculated as
E_AR = (||p2 - p6|| + ||p3 - p5||) / (2 ||p1 - p4||)   (1)
When the bus driver's eyes are open, E_AR remains dynamically balanced, fluctuating only slightly; when the eyes close, E_AR drops rapidly, and it returns rapidly to the dynamic balance when the eyes reopen. Combined with the P80 criterion, the E_AR value therefore directly reflects the eye open/closed state. Blink images of multiple simulated bus drivers are captured in the frontal-face state, and the facial key-point detector in the Dlib library computes E_AR for the driver's left and right eyes, taking the mean.
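A minimal sketch of this E_AR computation with the Dlib 68-landmark model is given below; the model file path is an assumption, and the eye-landmark indices (36-41, 42-47) follow the standard 68-point layout rather than anything stated in the patent:

```python
import dlib
from scipy.spatial import distance as dist

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")  # assumed path

def eye_aspect_ratio(p):
    """Formula (1) over the 6 eye landmarks p1..p6 given as (x, y) pairs."""
    a = dist.euclidean(p[1], p[5])        # ||p2 - p6||
    b = dist.euclidean(p[2], p[4])        # ||p3 - p5||
    c = dist.euclidean(p[0], p[3])        # ||p1 - p4||
    return (a + b) / (2.0 * c)

def mean_ear(gray_frame):
    """Mean left/right E_AR for the first detected face (assumes one face)."""
    face = detector(gray_frame, 0)[0]
    shape = predictor(gray_frame, face)
    pts = [(shape.part(i).x, shape.part(i).y) for i in range(68)]
    return (eye_aspect_ratio(pts[36:42]) + eye_aspect_ratio(pts[42:48])) / 2.0
```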
S22) calculating the mouth aspect ratio;
Following the eye-aspect-ratio definition, the mouth aspect ratio M_AR is defined from 8 mouth feature points and calculated as
M_AR = (||p2 - p8|| + ||p3 - p7|| + ||p4 - p6||) / (3 ||p1 - p5||)   (2)
The mouth aspect ratio is calculated from 8 feature points on the inner mouth contour and directly reflects how wide the mouth opens. A bus driver frequently communicates with boarding and alighting passengers at stops, so an M_AR threshold must be set to distinguish the mouth-opening amplitude of yawning from that of normal speech.
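A companion sketch for M_AR, under the assumption that the 8 inner-mouth-contour landmarks (indices 60-67 in the standard Dlib 68-point layout) are ordered with the corners at p1 and p5:

```python
from scipy.spatial import distance as dist

def mouth_aspect_ratio(p):
    """Formula (2) over the 8 inner-mouth landmarks p1..p8 as (x, y) pairs."""
    a = dist.euclidean(p[1], p[7])        # ||p2 - p8||
    b = dist.euclidean(p[2], p[6])        # ||p3 - p7||
    c = dist.euclidean(p[3], p[5])        # ||p4 - p6||
    d = dist.euclidean(p[0], p[4])        # ||p1 - p5||
    return (a + b + c) / (3.0 * d)

# With the landmark list from the E_AR sketch above: mouth_aspect_ratio(pts[60:68])
```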
S23) establishing the E_AR and M_AR thresholds;
To determine the fluctuation ranges of the driver's E_AR and M_AR values in the frontal-face state, the frontal-face data in the video are intercepted, and the maximum and minimum of E_AR and M_AR in the processed video-stream data are calculated. Finally, referring to the P80 index in PERCLOS, the E_AR and M_AR thresholds are determined by
E_AR = (E_AR,max - E_AR,min)(1 - X_1) + E_AR,min   (3)
M_AR = (M_AR,max - M_AR,min)(1 - X_2) + M_AR,min   (4)
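A short sketch of formulas (3) and (4); the extrema below are hypothetical placeholders, while the 0.8/0.2 fractions are the X_1/X_2 values given in the preferred embodiment:

```python
def threshold(v_max, v_min, x):
    """Formulas (3)/(4): (max - min)(1 - x) + min."""
    return (v_max - v_min) * (1.0 - x) + v_min

# Hypothetical extrema, as if measured from the frontal-face video segment:
ear_thr = threshold(v_max=0.32, v_min=0.16, x=0.8)  # -> 0.192, closed-eye threshold
mar_thr = threshold(v_max=0.74, v_min=0.04, x=0.2)  # -> 0.600, yawning threshold
```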
S24) solving the compensation parameters;
With E_AR and M_AR determined by the above method, image-sequence frames of the bus driver blinking and yawning (without a mask) in the frontal-face state are intercepted in full, the numbers of closed-eye and yawning frames are determined against the thresholds and compared with the counts under the target-detection-algorithm labels, and the ratio S is calculated from the comparison as
S = N_keypoint / N_label   (5)
The key-point frame counts and target-detection label frame counts are tallied at 100, 500, and 1000 blinks respectively, and the ratio interval S_eye is calculated. Increasing the number of blink samples makes the ratio S_eye more accurate and more convincing.
The yawn frame count is tallied the same way as blinking, and the ratio interval S_yawn of key-point frames to target-detection label frames is calculated.
Because the ratio S_yawn is obtained without a mask, the value S_yawn-mask for the masked case follows from the target-detection label preprocessing method as
S_yawn-mask = S_yawn × (X% / Y%)   (6)
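The compensation can be sketched as follows; formula (5) is the key-point frame count over the label frame count, while the X%/Y% scaling in the last line reflects one plausible reading of formula (6), whose original image is not recoverable, so both that scaling form and the counts are assumptions:

```python
def ratio(keypoint_frames, label_frames):
    """Formula (5): frames found by the key-point algorithm per label frame."""
    return keypoint_frames / label_frames

s_eye = ratio(keypoint_frames=1320, label_frames=1000)  # hypothetical counts
s_yawn = ratio(keypoint_frames=605, label_frames=500)   # hypothetical counts
s_yawn_mask = s_yawn * (0.70 / 0.50)                    # formula (6), assumed form
```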
S3, carrying out the double-track division of the fatigue-state time series;
After the image frames are detected in time order, the detection results of all regions are output and, combining vehicle speed and vehicle condition, divided according to the complete time-slice results into a double-track time sequence: a normal driving sequence and an idle driving sequence. The normal driving sequence means the bus is in the normal driving state; the idle driving sequence means the bus is in a driving state with a long waiting time, such as stops, waiting at traffic lights, or yielding to pedestrians. The two sequences splice their time slices separately, and the fatigue state is analyzed from the continuous-time perspective on each sequence. The specific steps are as follows (a sketch of the division logic follows the steps):
s31: and acquiring the speed of the bus in real time by combining a speed sensor, and checking the consistency of the time dimensions of the speed acquisition equipment and the mobile terminal image acquisition equipment again.
S32: the bus speed reaching 0 km/h, and the speed recovering from 0 km/h, are taken as the time-slice division nodes. To avoid a node division interrupting a complete eye-detection (blink) or mouth-detection (yawn) counting sequence, the counting-sequence frame numbers before and after the node are compared and the counting sequence is assigned whole to the time sequence with more frames.
S33: if the bus speed is not 0 km/h for 10 seconds, the image-frame detection results of that 10-second time slice are divided directly into the normal driving sequence. If a node appears, the image-frame detection results of the time slice before the node are divided into the normal driving sequence, and the bus idling duration is counted from the node. If the idling duration is under 5 seconds, i.e. the bus recovers speed within 5 seconds (usually a brief stop at an intersection or zebra crossing), the driver's attention does not shift noticeably in this period and remains focused on the road, so the driver is still in the normal driving state; the time slice and related results from the speed-recovery node are therefore divided into the normal driving sequence. If the idling duration is longer than 5 seconds and shorter than 10 seconds (passengers boarding or alighting, or yielding to pedestrians), the time slice and results are divided into the idle driving sequence. If the idling duration exceeds 10 seconds (common at traffic lights, in congestion, and at stops), it is divided with 10 seconds as the time-slice length, and the final slice of at most 10 seconds is divided in turn into the idle driving sequence.
S34: each driving sequence splices the time slices divided into it together with the corresponding image-frame detection results, and the image-frame results are re-mapped onto the two sequence time dimensions; that is, each sequence starts from its own beginning, with frame number as the abscissa.
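A simplified sketch of this S31-S34 division logic, under stated assumptions: one speed sample per second, frames mapped 1:1 onto seconds, and the node-side reassignment of interrupted blink/yawn counting sequences (S32) omitted for brevity:

```python
def split_tracks(speed):
    """speed: km/h samples, one per second. Returns (normal, idle) slice lists."""
    normal, idle = [], []               # (start_s, end_s) time slices
    t = 0
    while t < len(speed):
        if speed[t] > 0:                # driving: cut into <=10 s slices
            start = t
            while t < len(speed) and speed[t] > 0 and t - start < 10:
                t += 1
            normal.append((start, t))
        else:                           # node: speed fell to 0 km/h
            start = t
            while t < len(speed) and speed[t] == 0:
                t += 1
            dur = t - start
            if dur < 5:                 # brief stop: attention unchanged
                normal.append((start, t))
            elif dur <= 10:             # short idle (boarding, yielding)
                idle.append((start, t))
            else:                       # long idle: split into 10 s chunks
                for s in range(start, t, 10):
                    idle.append((s, min(s + 10, t)))
    return normal, idle
```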
S4, setting the fatigue-state time-series double-track early-warning mechanism;
Because the time series is divided into two tracks, an early-warning mechanism must be set for each driving sequence. The analyzable data comprise the number of blink frames, the blink frequency, the number of open-mouth frames, and the number of yawns. perclos_eye is defined to calculate the ratio of closed-eye frames to total frames per unit time, which reflects the ratio of eye-closure duration to detection time and hence the driver's fatigue state, as shown in formula 7. The blink count N_eye is accumulated within each sequence.
perclos_eye = t_eye / T_eye × 100%   (7)
where t_eye is the number of closed-eye frames and T_eye the total number of frames per unit time.
perclos_mouth and perclos_mouth-mask are defined from the relationship between time and frame count per minute; fatigue can be described as an increase in the locally continuous open-mouth characteristic frequency per unit time, as shown in formulas 8 and 9.
perclos_mouth = t_mouth / T_mouth × 100%   (8)
perclos_mouth-mask = t_mouth-mask / T_mouth-mask × 100%   (9)
where t_mouth is the number of open-mouth frames; T_mouth the total number of frames per unit time; t_mouth-mask the number of yawning frames; and T_mouth-mask the total number of frames per unit time.
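Formulas (7)-(9) share one form, sketched below; the one-minute window at 30 fps (1800 frames) and the feature count are hypothetical numbers:

```python
def perclos(feature_frames, total_frames):
    """Formulas (7)-(9): feature frames over total frames, as a percentage."""
    return feature_frames / total_frames * 100.0

p_eye = perclos(feature_frames=180, total_frames=1800)  # 10.0, i.e. 10% of the minute
```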
S41) setting the normal-driving-sequence early-warning mechanism:
Simulated-driving-platform experiments show that the bus driver's blink frequency drops markedly while concentrating on driving, and that bursts of blinking occur over short periods because the eyes turn dry after prolonged concentration. A driver in a fatigue state shows dull eyes with a blink frequency below the normal value; moreover, some drivers resist fatigue by blinking frequently, pushing the blink frequency above normal, so either too low or too high indicates a higher degree of fatigue. Accordingly, the normal-driving-sequence warning mechanism is as follows (the unit time is 1 minute):
[Formula (10): the normal-driving-sequence warning condition set over perclos_eye, perclos_mouth, perclos_mouth-mask, N_eye, and states_eye; the original equation image is not recoverable]
where perclos_eye is the per-unit-time ratio of closed-eye frames, perclos_mouth the per-unit-time ratio of open-mouth frames, perclos_mouth-mask the per-unit-time ratio of yawning frames when a mask is worn, N_eye the number of blinks per unit time, and states_eye the eye detection result. If any condition is met, the bus driver is judged to be in a fatigue state under the normal driving sequence, and an early warning is issued.
S42) setting the idle-driving-sequence early-warning mechanism:
Simulated-driving-platform experiments show that the driver's concentration is high in the normal driving state; once the bus is idle, the overall concentration decreases, yawning and a substantially higher blink frequency appear, and the driver can still drive normally after a brief readjustment in the idle stage. Accordingly, the idle-driving-sequence warning mechanism is as follows (the unit time is 1 minute):
[Formula (11): the idle-driving-sequence warning condition set over the same quantities as formula (10); the original equation image is not recoverable]
where the symbols have the same meanings as in formula 10.
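Because the published text does not preserve formulas (10) and (11), the two warning mechanisms can only be sketched with placeholder limits; every numeric threshold below is an assumption, not a value from the patent:

```python
def fatigue_warning(p_eye, p_mouth, p_mouth_mask, n_eye, sequence="normal"):
    """Per-minute warning check for either sequence; all limits are assumed."""
    if sequence == "normal":
        lim = dict(eye=12.0, mouth=6.0, mask=6.0, blink_lo=8, blink_hi=35)
    else:                               # idle sequence: looser assumed limits
        lim = dict(eye=18.0, mouth=10.0, mask=10.0, blink_lo=10, blink_hi=45)
    return (p_eye > lim["eye"]
            or p_mouth > lim["mouth"]
            or p_mouth_mask > lim["mask"]
            or not (lim["blink_lo"] <= n_eye <= lim["blink_hi"]))
```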
Preferably, in step S23) X_1 is taken as 0.8, indicating the eye is considered closed when the eyelid covers more than 80% of the pupil; likewise X_2 is taken as 0.2, i.e. a mouth opening greater than 80% of the maximum opening is considered a yawn.
The working principle of the invention is as follows: the bus driver's normal driving state and idle state (stops, waiting at traffic lights, and the like) are distinguished by combining vehicle speed and vehicle condition to divide a double-track time sequence; a target detection algorithm identifies the driver's eye open/closed and mouth open/closed states in real time; the detection results are parameter-compensated and assigned to the different driving-state time dimensions according to vehicle speed and condition, then extracted and spliced; the fatigue-state parameters are computed and analyzed in double-track parallel on the respective time dimensions; and different early-warning thresholds are set.
The invention has the advantages that a target detection algorithm detects the bus driver's fatigue state in both the normal driving state and the idle state; after parameter compensation, the fatigue-state detection better conforms to the fatigue-parameter quantification standard; thresholds are formulated separately for the two states; the fatigue state is warned in real time over the whole driving process; and bus driving safety is improved.
Drawings
Fig. 1(a) to 1(c) are schematic diagrams of the image annotation of the present invention: fig. 1(a) shows eye opening and closing, fig. 1(b) shows mouth opening and closing, and fig. 1(c) shows the yawn annotation when wearing a mask;
Fig. 2(a) to 2(b) are schematic views of the eye feature points of the present invention, wherein fig. 2(a) is the open-eye state and fig. 2(b) is the closed-eye state;
Fig. 3(a) to 3(b) are schematic views of the mouth feature points of the present invention, wherein fig. 3(a) is the open-mouth state and fig. 3(b) is the closed-mouth state;
Fig. 4 is a schematic diagram of the fatigue-state time-series double-track division of the present invention.
Detailed Description
The technical scheme of the invention is further explained below with reference to the drawings.
The bus driver fatigue parameter compensation and double-track parallel detection method comprises the following steps:
s1, formulating an image data label;
the current target detection algorithm has excellent detection effects on image classification and positioning, such as SSD, YOLO, faster-RCNN and the like, and can be used for the method. Because the image data is used for calculating the facial fatigue parameters, the phenomena of blinking and yawning directly reflect the fatigue degree of a driver. The image data label making method of the invention is shown in fig. 1, and specifically comprises the following steps:
a) Blinking: the image data are labelled from the time-series perspective; that is, over the complete process from eye open to eye closed and back to eye open, frames with fully closed eyes are labelled closed-eye by manual judgment, and the remaining frames are labelled open-eye.
b) Yawning: a yawn runs from closed mouth to open mouth and back to closed mouth, so the critical open-mouth state is hard to define precisely. Repeated experimental observation shows that a yawn usually lasts 2-3 seconds, and some last even longer, 4-5 seconds. Different labelling methods are set for yawn image data with and without a mask. The unmasked yawn data label the mouth region of the image from the continuous time-series perspective to avoid confusion with states such as normal communication; after several experiments, the middle X% (70%) of the image data of the complete unmasked yawn time sequence is confirmed and intercepted, the mouth region is labelled open-mouth, and the rest is labelled closed-mouth.
During the experiments, emotions such as squinting, raised eyebrows, and frowning were observed when the driver wears a mask, and the region near the eyes visibly changes as the mask is stretched wide. A single feature can be confused with other actions and expressions: a person may squint and frown when angry, and the mask stretches wide when laughing. Therefore the features of the two partial regions (the mask region and the eye and nearby region) are combined, and the masked yawn image data label the mask and the eye and nearby regions, where the face is stretched wide, from the complete time-series perspective. Because the mouth-region features almost disappear behind a mask, and in the beginning and ending parts of a yawn the mask is no longer stretched wide while the facial expression returns to calm, the middle Y% (50%) of the image data of the complete masked yawn time sequence is labelled yawning, to improve the accuracy of yawn counting with a mask.
S2, setting the parameter compensation method;
According to the principle of the image data preprocessing method, the current image-frame predictions of closed eyes, open mouth, and yawning are fewer than those of the facial key-point algorithm, so a fatigue-parameter compensation method is proposed. The facial key-point algorithm (Dlib library) counts the bus driver's closed-eye and open-mouth frames in the frontal-face state; these counts are compared with the current data preprocessing method and compensated according to the ratio.
1) Calculating an eye aspect ratio;
E_AR, the ratio of Euclidean distances between the eye's longitudinal and transverse landmarks, directly reflects the degree of eye closure. Using the 6 eye key-point positions shown in fig. 2, E_AR is calculated as
E_AR = (||p2 - p6|| + ||p3 - p5||) / (2 ||p1 - p4||)   (1)
When the bus driver's eyes are open, E_AR remains dynamically balanced, fluctuating only slightly; when the eyes close, E_AR drops rapidly, and it returns rapidly to the dynamic balance when the eyes reopen. Combined with the P80 criterion, the E_AR value therefore directly reflects the eye open/closed state. Blink images of multiple simulated bus drivers are captured in the frontal-face state, and the facial key-point detector in the Dlib library computes E_AR for the driver's left and right eyes, taking the mean.
2) Calculating the mouth aspect ratio;
Following the eye-aspect-ratio definition, and using the 8 mouth feature points shown in fig. 3, the mouth aspect ratio M_AR is calculated as
M_AR = (||p2 - p8|| + ||p3 - p7|| + ||p4 - p6||) / (3 ||p1 - p5||)   (2)
The mouth aspect ratio is calculated from 8 feature points on the inner mouth contour and directly reflects how wide the mouth opens. A bus driver frequently communicates with boarding and alighting passengers at stops, so an M_AR threshold must be set to distinguish the mouth-opening amplitude of yawning from that of normal speech.
3) Establishing the E_AR and M_AR thresholds;
To determine the fluctuation ranges of the driver's E_AR and M_AR values in the frontal-face state, the frontal-face data in the video are intercepted, and the maximum and minimum of E_AR and M_AR in the processed video-stream data are calculated. Finally, referring to the P80 index in PERCLOS, the E_AR and M_AR thresholds are determined by
E_AR = (E_AR,max - E_AR,min)(1 - X_1) + E_AR,min   (3)
M_AR = (M_AR,max - M_AR,min)(1 - X_2) + M_AR,min   (4)
X_1 is taken as 0.8, meaning the eye is considered closed when the eyelid covers more than 80% of the pupil. Likewise, X_2 is taken as 0.2, i.e. a mouth opening greater than 80% of the maximum opening is considered a yawning action. The determined thresholds are shown in Table 1: an E_AR value below 0.19 is judged closed-eye, and an M_AR value above 0.60 is judged a yawning action.
TABLE 1. E_AR and M_AR thresholds
E_AR threshold: 0.19 (closed-eye below this value)
M_AR threshold: 0.60 (yawning above this value)
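Applied per frame, the Table 1 thresholds give a direct classification; the helper name below is illustrative, not from the patent:

```python
def classify_frame(ear_value, mar_value):
    """Returns (eye_closed, yawning) under the Table 1 thresholds."""
    return ear_value < 0.19, mar_value > 0.60
```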
4) Calculating the compensation parameters;
With E_AR and M_AR determined by the above method, image-sequence frames of the bus driver blinking and yawning (without a mask) in the frontal-face state are intercepted in full, the numbers of closed-eye and yawning frames are determined against the thresholds and compared with the counts under the target-detection-algorithm labels, and the ratio S is calculated from the comparison as
S = N_keypoint / N_label   (5)
The closed-eye frame statistics are shown in Table 2; counting at 100, 500, and 1000 blinks respectively, the ratio S_eye of key-point frames to target-detection label frames falls in the interval 1.31 to 1.34. Increasing the number of blink samples would make S_eye more accurate and more convincing; here S_eye is taken as 1.32.
[Table 2: closed-eye frame statistics; key-point frame counts versus target-detection label frame counts at 100, 500, and 1000 blinks, with the ratio S_eye between 1.31 and 1.34; the original table image is not recoverable]
The yawn frame statistics are shown in Table 3; because yawn samples are few, the key-point and target-detection-label frame counts are tallied only at 5, 10, and 20 yawns, and the ratio S_yawn falls in the interval 1.21-1.22; here S_yawn is taken as 1.21.
[Table 3: yawn frame statistics; key-point frame counts versus target-detection label frame counts at 5, 10, and 20 yawns, with the ratio S_yawn between 1.21 and 1.22; the original table image is not recoverable]
Because the ratio S_yawn is obtained without a mask, the value S_yawn-mask for the masked case follows from the target-detection label preprocessing method as
S_yawn-mask = S_yawn × (X% / Y%)   (6)
Substituting the current calculation results into formula (6), S_yawn is taken here as 1.21.
S3, carrying out the double-track division of the fatigue-state time series;
After the image frames are detected in time order, the detection results of all regions are output and, combining vehicle speed and vehicle condition, divided according to the complete time-slice results into a double-track time sequence: a normal driving sequence and an idle driving sequence. The normal driving sequence means the bus is in the normal driving state; the idle driving sequence means the bus is in a driving state with a long waiting time, such as stops, waiting at traffic lights, or yielding to pedestrians. The two sequences splice their time slices separately, and the fatigue state is analyzed from the continuous-time perspective on each sequence; the double-track division of the fatigue-state time series is shown in fig. 4. The specific steps are as follows:
the method comprises the following steps: the bus speed is acquired in real time by combining a vehicle speed and vehicle condition sensor, data of a driving simulator (Luzhi G29) is acquired in real time by using Simulink in a bus simulation driving platform, the sampling frequency is 30 times/second, and the sampling frequency is consistent with the image frame rate. And checking the consistency of the time dimensions of the vehicle speed and vehicle condition acquisition equipment and the mobile terminal image acquisition equipment again.
Step two: the bus speed reaching 0 km/h, and the speed recovering from 0 km/h, are taken as the time-slice division nodes. To avoid a node division interrupting a complete eye-detection (blink) or mouth-detection (yawn) counting sequence, the counting-sequence frame numbers before and after the node are compared and the counting sequence is assigned whole to the time sequence with more frames.
Step three: if the bus speed is not 0 km/h for 10 seconds, the image-frame detection results of that 10-second time slice are divided directly into the normal driving sequence. If a node appears, the image-frame detection results of the time slice before the node are divided into the normal driving sequence, and the bus idling duration is counted from the node. If the idling duration is under 5 seconds, i.e. the bus recovers speed within 5 seconds (usually a brief stop at an intersection or zebra crossing), the driver's attention does not shift noticeably in this period and remains focused on the road, so the driver is still in the normal driving state; the time slice and related results from the speed-recovery node are therefore divided into the normal driving sequence. If the idling duration is longer than 5 seconds and shorter than 10 seconds (passengers boarding or alighting, or yielding to pedestrians), the time slice and results are divided into the idle driving sequence. If the idling duration exceeds 10 seconds (common at traffic lights, in congestion, and at stops), it is divided with 10 seconds as the time-slice length, and the final slice of at most 10 seconds is divided in turn into the idle driving sequence.
Step four: each driving sequence splices the time slices divided into it together with the corresponding image-frame detection results, and the image-frame results are re-mapped onto the two sequence time dimensions; that is, each sequence starts from its own beginning, with frame number as the abscissa.
After the time-series double-track division is completed by this method, the idle state accounts for about 20% of the total time in the current bus simulated-driving experiment, yet blinks account for 30% to 35% of the total count. The driver's attention switches in the idle state, he easily relaxes, blink and yawn frequency increase, the fatigue state is expressed more strongly, and the true fatigue state is revealed.
S4, setting the fatigue-state time-series double-track early-warning mechanism;
Because the time series is divided into two tracks, an early-warning mechanism must be set for each driving sequence. The analyzable data comprise the number of blink frames, the blink frequency, the number of open-mouth frames, and the number of yawns. perclos_eye is defined to calculate the ratio of closed-eye frames to total frames per unit time, which reflects the ratio of eye-closure duration to detection time and hence the driver's fatigue state, as shown in formula 7. The blink count N_eye is accumulated within each sequence.
perclos_eye = t_eye / T_eye × 100%   (7)
where t_eye is the number of closed-eye frames and T_eye the total number of frames per unit time.
perclos_mouth and perclos_mouth-mask are defined from the relationship between time and frame count per minute; fatigue can be described as an increase in the locally continuous open-mouth characteristic frequency per unit time, as shown in formulas 8 and 9.
perclos_mouth = t_mouth / T_mouth × 100%   (8)
perclos_mouth-mask = t_mouth-mask / T_mouth-mask × 100%   (9)
where t_mouth is the number of open-mouth frames; T_mouth the total number of frames per unit time; t_mouth-mask the number of yawning frames; and T_mouth-mask the total number of frames per unit time.
1) Setting the normal-driving-sequence early-warning mechanism:
Simulated-driving-platform experiments show that the bus driver's blink frequency drops markedly while concentrating on driving, and that bursts of blinking occur over short periods because the eyes turn dry after prolonged concentration. A driver in a fatigue state shows dull eyes with a blink frequency below the normal value; moreover, some drivers resist fatigue by blinking frequently, pushing the blink frequency above normal, so either too low or too high indicates a higher degree of fatigue. Accordingly, the normal-driving-sequence warning mechanism is as follows (the unit time is 1 minute):
[Formula (10): the normal-driving-sequence warning condition set over perclos_eye, perclos_mouth, perclos_mouth-mask, N_eye, and states_eye; the original equation image is not recoverable]
where perclos_eye is the per-unit-time ratio of closed-eye frames, perclos_mouth the per-unit-time ratio of open-mouth frames, perclos_mouth-mask the per-unit-time ratio of yawning frames when a mask is worn, N_eye the number of blinks per unit time, and states_eye the eye detection result. If any condition is met, the bus driver is judged to be in a fatigue state under the normal driving sequence, and an early warning is issued.
2) Setting the idle-driving-sequence early-warning mechanism:
Simulated-driving-platform experiments show that the driver's concentration is high in the normal driving state; once the bus is idle, the overall concentration decreases, yawning and a substantially higher blink frequency appear, and the driver can still drive normally after a brief readjustment in the idle stage. Accordingly, the idle-driving-sequence warning mechanism is as follows (the unit time is 1 minute):
[Formula (11): the idle-driving-sequence warning condition set over the same quantities as formula (10); the original equation image is not recoverable]
where the symbols have the same meanings as in formula 10.

Claims (2)

1. A bus driver fatigue parameter compensation and double-track parallel detection method comprises the following steps:
s1, formulating an image data label;
the method comprises the following steps of making labels for blink and yawning image data of a driver, and specifically comprises the following steps:
a) Blinking: the image data are labelled from the time-series perspective; that is, over the complete process from eye open to eye closed and back to eye open, frames with fully closed eyes are labelled closed-eye by manual judgment, and the remaining frames are labelled open-eye;
b) Yawning: different labelling methods are set for yawn image data with and without a mask; the unmasked yawn data label the mouth region from the continuous time-series perspective to avoid confusion with states such as normal speech; after several experiments, the middle X% of the image data of the complete unmasked yawn time sequence is intercepted, the mouth region is labelled open-mouth, and the rest is labelled closed-mouth;
combining the features of the mask region and the eye and nearby region, the masked yawn image data label the mask and the eye and nearby regions, where the face is stretched wide, from the complete time-series perspective; because the mouth-region features almost disappear behind a mask, and at the beginning and end of a yawn the mask is no longer stretched wide while the facial expression returns to calm, the middle Y% of the image data of the complete masked yawn time sequence is labelled yawning, to improve the accuracy of yawn counting with a mask;
s2, setting a parameter compensation method;
according to the principle of an image data preprocessing method, the number of image frame predictions of eye closure, mouth opening and yawning is smaller than that of a human face key point algorithm, so that a fatigue parameter compensation method is set; calculating the frame number of eyes closed and mouth opened of the bus driver in a face state by using a face key point algorithm (Dlib library), comparing the frame number with a current data preprocessing method, and compensating according to a ratio;
s21) calculating the eye aspect ratio;
E_AR, the ratio of Euclidean distances between the eye's longitudinal and transverse landmarks, directly reflects the degree of eye closure and is computed from the 6 eye key-point positions as
E_AR = (||p2 - p6|| + ||p3 - p5||) / (2 ||p1 - p4||)   (1)
when the bus driver's eyes are open, E_AR remains dynamically balanced, i.e. fluctuates only slightly; when the eyes close, E_AR drops rapidly and returns rapidly to the dynamic balance when the eyes reopen; combined with the P80 criterion, the E_AR value therefore directly reflects the eye open/closed state; blink images of multiple simulated bus drivers are captured in the frontal-face state, and the facial key-point detector in the Dlib library computes E_AR for the driver's left and right eyes, taking the mean;
s22) calculating the aspect ratio of the mouth;
referring to the eye aspect ratio definition method, the mouth aspect ratio M is defined based on 8 characteristic points of the mouth AR The characteristic point is calculated by the formula
M_AR = (||p2 - p8|| + ||p3 - p7|| + ||p4 - p6||) / (3 ||p1 - p5||)   (2)
the mouth aspect ratio is calculated from 8 feature points on the inner mouth contour and directly reflects how wide the mouth opens; a bus driver frequently communicates with boarding and alighting passengers at stops, so an M_AR threshold is set to distinguish the mouth-opening amplitude of yawning from that of normal speech;
s23) establishment of E AR And M AR A threshold value;
to determine the face-up state of the bus driver AR Value sum M AR The fluctuation range of the value intercepts the front face state data in the video and respectively calculates E in the processed video stream data AR And M AR Maximum and minimum values of (c); finally, E is determined by referring to P80 index in perclos AR And M AR The threshold value is calculated according to the formula
E_AR = (E_AR,max - E_AR,min)(1 - X_1) + E_AR,min   (3)
M_AR = (M_AR,max - M_AR,min)(1 - X_2) + M_AR,min   (4)
S24) solving the compensation parameters;
with E_AR and M_AR determined by the above method, image-sequence frames of the driver blinking and yawning (without a mask) in the frontal-face state are intercepted in full, the numbers of closed-eye and yawning frames are determined against the thresholds and compared with the counts under the target-detection-algorithm labels, and the ratio S is calculated from the comparison as
S = N_keypoint / N_label   (5)
the key-point frame counts and target-detection label frame counts are tallied at 100, 500, and 1000 blinks respectively, and the ratio interval S_eye is calculated; increasing the number of blink samples improves the accuracy of the ratio S_eye;
the statistical method of the number of yawning frames is the same as that of blinking, and the ratio S of the number of key point frames to the number of target detection label frames is obtained through calculation yawn An interval;
due to the ratio S yawn If the mask is not worn, the method of preprocessing the target detection label and the method of S under the mask wearing condition are adopted yawn-mask The value is calculated by the formula
Figure FDA0003786827440000023
S3, carrying out the double-track division of the fatigue-state time series;
after the image frames are detected in time order, the detection results of all regions are output and, combining vehicle speed and vehicle condition, divided according to the complete time-slice results into a double-track time sequence: a normal driving sequence and an idle driving sequence; the normal driving sequence means the bus is in the normal driving state, and the idle driving sequence means the bus is in a driving state with a long waiting time, such as stops, waiting at traffic lights, or yielding to pedestrians; the two sequences splice their time slices separately, and the fatigue state is analyzed from the continuous-time perspective on each sequence, with the following specific steps:
s31: acquiring the speed of the bus in real time by combining a speed vehicle condition sensor, and checking the consistency of the time dimensions of the speed vehicle condition acquisition equipment and the mobile terminal image acquisition equipment again;
s32: taking the bus speed of 0km/h and the speed recovered from 0km/h as time slice division nodes; when the node is divided, the complete sequence of eye detection (blinking) and mouth detection (yawning) counting is interrupted, so that the number of the counting sequence frames before and after the node is compared, and the counting sequence is completely divided into a time sequence with more frames;
s33: if the bus speed is not 0km/h within 10 seconds, directly dividing the image frame detection result corresponding to the time slice of 10 seconds into a normal driving sequence; if the node appears, dividing an image frame detection result corresponding to a time slice before the node into a normal driving sequence, and calculating the idle speed duration of the bus from the node; if the idling time is less than 5 seconds, namely the bus recovers the speed within 5 seconds (the bus is usually stopped at the intersection or the zebra crossing temporarily), the bus driver can know that the attention of the bus driver is not obviously transferred in the period, and the bus driver still concentrates on the road surface and is in a normal driving state after multiple times; therefore, dividing the time slice and the related result into a normal driving sequence from the node for recovering the vehicle speed; if the idle speed duration is longer than 5 seconds and shorter than 10 seconds (no passenger gets on or off the bus or passengers are given a gift), dividing the time segment and the result into an idle driving sequence; if the idle speed duration is longer than 10 seconds (common in traffic lights, traffic jams and stops), dividing every 10 seconds as the length of a time slice, and sequentially dividing the last time slice of less than or equal to 10 seconds into an idle speed driving sequence;
s34: the two driving sequences respectively splice the time slices divided into the two driving sequences and the corresponding image frame detection results, and the image frame results correspond to the two sequence time dimensions again, namely, the driving sequences start from the beginning and take the frame number as an abscissa;
s4, setting a fatigue state time sequence double-track early warning mechanism;
because the time sequence double-track division is carried out, early warning mechanisms are required to be respectively set for two driving sequences; the analyzable data comprises the number of blink frames, the blink frequency, the number of frame opening the mouth and the number of yawning; defining perclos eye Computing in-unit closed-eye framesThe proportional relation between the number and the total frame number can reflect the ratio of the eye closure duration to the detection time and the fatigue state of the driver, as shown in formula 7; number of blinks N eye Counting and adding according to each sequence;
perclos_eye = t_eye / T_eye × 100%   (7)
where t_eye is the number of closed-eye frames and T_eye the total number of frames per unit time;
perclos is defined according to the relationship between time and frame number per minute mouth And perclos mouth-mask The fatigue evaluation can be described as the increase of the open mouth characteristic frequency in unit time of local continuity, as shown in formulas 8 and 9;
perclos_mouth = t_mouth / T_mouth × 100%   (8)
perclos_mouth-mask = t_mouth-mask / T_mouth-mask × 100%   (9)
where t_mouth is the number of open-mouth frames, T_mouth the total number of frames per unit time, t_mouth-mask the number of yawning frames, and T_mouth-mask the total number of frames per unit time;
s41) setting a normal driving sequence early warning mechanism:
through simulation of a driving platform experiment, the blinking frequency of a bus driver is obviously reduced when the bus driver concentrates on driving, and the phenomenon of multi-blinking in a short time can occur due to dry eyes after the bus driver concentrates on the driving for a long time; when a driver is in a fatigue state, the phenomenon of eye dullness can occur, and the blinking frequency is lower than a normal value; furthermore, some drivers may resist fatigue by blinking frequently, resulting in a blinking frequency higher than normal, so that either too low or too high indicates a higher degree of fatigue; according to the phenomenon, the driving early warning mechanism of the normal driving sequence is as follows (the unit time is 1 minute):
[Formula (10): the normal-driving-sequence warning condition set over perclos_eye, perclos_mouth, perclos_mouth-mask, N_eye, and states_eye; the original equation image is not recoverable]
where perclos_eye is the per-unit-time ratio of closed-eye frames, perclos_mouth the per-unit-time ratio of open-mouth frames, perclos_mouth-mask the per-unit-time ratio of yawning frames when a mask is worn, N_eye the number of blinks per unit time, and states_eye the eye detection result; if any condition is met, the bus driver is judged to be in a fatigue state under the normal driving sequence and an early warning is issued;
s42) setting an idle driving sequence early warning mechanism:
through a simulation driving platform experiment, the fact that after a bus driver is in an idle state due to high concentration degree under a normal driving condition, the overall concentration degree is reduced, the frequency of yawning and blinking is greatly improved, and the bus driver can still normally drive the bus after transient correction is carried out in an idle stage; according to the above phenomenon, the idling driving sequence driving warning mechanism is as follows (unit time is 1 minute):
[Formula (11): the idle-driving-sequence warning condition set over the same quantities as formula (10); the original equation image is not recoverable]
where the symbols have the same meanings as in formula 10.
2. The bus driver fatigue parameter compensation and double-track parallel detection method according to claim 1, characterized in that: in step S23) X_1 is taken as 0.8, meaning the eye is considered closed when the eyelid covers more than 80% of the pupil; likewise X_2 is taken as 0.2, i.e. a mouth opening greater than 80% of the maximum opening is considered a yawn.
CN202210943716.XA 2022-08-08 2022-08-08 Bus driver fatigue parameter compensation and double-track parallel detection method Pending CN115359462A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210943716.XA CN115359462A (en) 2022-08-08 2022-08-08 Bus driver fatigue parameter compensation and double-track parallel detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210943716.XA CN115359462A (en) 2022-08-08 2022-08-08 Bus driver fatigue parameter compensation and double-track parallel detection method

Publications (1)

Publication Number Publication Date
CN115359462A (en) 2022-11-18

Family

ID=84001450

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210943716.XA Pending CN115359462A (en) 2022-08-08 2022-08-08 Bus driver fatigue parameter compensation and double-track parallel detection method

Country Status (1)

Country Link
CN (1) CN115359462A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117079256A (en) * 2023-10-18 2023-11-17 南昌航空大学 Fatigue driving detection algorithm based on target detection and key frame rapid positioning
CN117079256B (en) * 2023-10-18 2024-01-05 南昌航空大学 Fatigue driving detection algorithm based on target detection and key frame rapid positioning
CN117542025A (en) * 2023-11-02 2024-02-09 电子科技大学 Fatigue state detection system and method for multi-feature fusion of driver
CN117465460A (en) * 2023-12-27 2024-01-30 江苏高速公路信息工程有限公司 Highway dangerous chemical vehicle flow monitoring system and method based on AI
CN117465460B (en) * 2023-12-27 2024-03-22 江苏高速公路信息工程有限公司 Highway dangerous chemical vehicle flow monitoring system and method based on AI


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination