CN115836868A - Driver fatigue state identification method based on multi-scale convolution kernel size CNN - Google Patents

Driver fatigue state identification method based on multi-scale convolution kernel size CNN

Info

Publication number: CN115836868A
Application number: CN202211488681.1A
Authority: CN (China)
Prior art keywords: data, sample, convolution kernel, layer, kernel size
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 付荣荣, 侯启恩
Current Assignee: Yanshan University
Original Assignee: Yanshan University
Application filed by Yanshan University; priority to CN202211488681.1A; publication of CN115836868A

Landscapes

  • Measurement And Recording Of Electrical Phenomena And Electrical Characteristics Of The Living Body (AREA)

Abstract

The invention discloses a driver fatigue state identification method based on a multi-scale convolution kernel size CNN, which comprises the following steps: S1, data preparation: the acquired electroencephalogram (EEG) signals are preprocessed to obtain data in a standard format; S2, data enhancement: the original data are augmented with frequency-masking and frequency-domain noise-addition algorithms; S3, model training: a data set composed of the original data and the enhanced data is trained with a CNN model that mixes multiple convolution kernel sizes, yielding a classifier; S4, state identification: the preprocessed EEG data are fed into the classifier to obtain the state label of each sample together with an interpretable basis for the model's classification. The method improves the classification performance of the model and achieves higher accuracy in the fatigue state identification task on the sustained-attention driving-task data set; furthermore, the two data enhancement methods, frequency-domain noise addition and frequency masking, are integrated with the CNN model, further improving the generalization ability of the model.

Description

Driver fatigue state identification method based on multi-scale convolution kernel size CNN
Technical Field
The invention relates to a driver fatigue state identification method based on a multi-scale convolution kernel size CNN, and belongs to the technical field of electroencephalogram signal processing.
Background
With the frequent occurrence of serious traffic accidents, driving safety has attracted increasing attention. Driving fatigue is a direct cause of traffic accidents and poses a serious threat to people's lives and property. It refers to the decline of a driver's physical and mental functions caused by insufficient rest or prolonged driving, and is usually manifested as mental fatigue. Accurately identifying the fatigue state of a driver can therefore effectively reduce the occurrence of traffic accidents.
Research has shown that the electroencephalogram (EEG) of a driver in a fatigued state differs significantly from that in a non-fatigued state, and the identification of fatigue states from EEG signals with deep learning methods has been widely studied. However, because EEG signals differ greatly across subjects and across recording sessions, a convolutional neural network with a single convolution scale cannot fit the EEG signals of different subjects at the same time, and the fatigue-state recognition accuracy of such models still needs to be improved. In cross-subject fatigue state identification, fully extracting the temporal, frequency and spatial characteristics of the EEG signal is essential for accurate recognition.
For deep learning algorithms, the accuracy of EEG classification depends not only on the performance of the designed network structure but also, to a large extent, on the amount of data available for training. When the amount of training data is limited, the neural network model easily overfits during training, leading to low classification accuracy on the test set. The first solution that comes to mind is to collect as much additional training data as possible; in most cases, however, acquiring more EEG signals is difficult or even impossible. Making full use of the EEG data already acquired by generating more data through data enhancement is therefore of great importance.
Disclosure of Invention
In order to overcome the shortcomings of the prior art, the invention provides a driver fatigue state identification method based on a multi-scale convolution kernel size CNN. A convolutional neural network that mixes multiple convolution kernel sizes is designed to identify the driver's fatigue state, improving the classification performance of the fatigue state identification model; at the same time, data enhancement methods based on frequency masking and frequency-domain noise addition are integrated with the multi-scale convolution kernel size CNN model, further improving the classification performance and generalization ability of the model.
In order to solve the technical problems, the technical scheme adopted by the invention is as follows:
the driver fatigue state identification method based on the multi-scale convolution kernel size CNN comprises the following steps:
s1, data preparation: preprocessing the acquired electroencephalogram signals to obtain standard format data;
s2, data enhancement: performing data enhancement on the original data by adopting a frequency masking or frequency domain noise adding algorithm;
s3: training a model, namely training a data set formed by original data and enhanced data by adopting a multi-scale convolution kernel size mixed CNN model to obtain a classifier;
s4, state identification: and inputting the preprocessed electroencephalogram data into a classifier model to obtain a state label of the sample and an interpretable model classification basis.
The technical scheme of the invention is further improved as follows: the specific operation of step S1 is as follows:
s11, down-sampling the original data set to 128 Hz, and extracting a 3 s electroencephalogram sample before each lane-deviation event;
s12, calculating the local reaction time R_t of each sample:
R_t = t_d - t_r,
where t_d is the time at which the vehicle starts to drift and t_r is the time at which the subject steers the vehicle back to the original lane;
s13, calculating the global reaction time GR_t of each sample:
GR_t = (1/N)·Σ_{i=1}^{N} R_t(i),
where N is the number of samples falling within the 90 s window before the vehicle-drift event of each sample, so that GR_t is the mean of the local reaction times R_t within that 90 s window;
s14, defining the baseline alert reaction time R_t^alert:
the fifth percentile of the local reaction times in each session is taken as the baseline alert reaction time R_t^alert and serves as the basis for labeling the samples in the next step;
s15, labeling each sample:
a sample is labeled as alert when both its local and global reaction times are less than 1.5 times the baseline alert reaction time; a sample is labeled as fatigued when both its local and global reaction times are greater than 2.5 times the baseline alert reaction time.
The technical scheme of the invention is further improved as follows: after the labeling in step S15 is completed, 2022 samples are obtained, each containing 3 s of electroencephalogram data.
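For illustration, a minimal Python sketch of the labeling rule in steps s11-s15 is given below; the function name, the NumPy usage and the treatment of samples that satisfy neither criterion are assumptions of this sketch, not part of the patent.

```python
import numpy as np

def label_samples(local_rt, event_times, window_s=90.0):
    """Label each lane-deviation sample: 0 = alert, 1 = fatigued, -1 = discarded.

    local_rt    : local reaction times R_t, one per deviation event (seconds)
    event_times : onset time of each deviation event, same order (seconds)
    """
    local_rt = np.asarray(local_rt, dtype=float)
    event_times = np.asarray(event_times, dtype=float)

    # Baseline alert reaction time: 5th percentile of the local RTs in the session
    rt_alert = np.percentile(local_rt, 5)

    labels = np.full(len(local_rt), -1, dtype=int)
    for k, t0 in enumerate(event_times):
        # Global RT: mean local RT over the 90 s window preceding this event
        in_window = (event_times >= t0 - window_s) & (event_times < t0)
        gr = local_rt[in_window].mean() if in_window.any() else local_rt[k]

        if local_rt[k] < 1.5 * rt_alert and gr < 1.5 * rt_alert:
            labels[k] = 0      # alert
        elif local_rt[k] > 2.5 * rt_alert and gr > 2.5 * rt_alert:
            labels[k] = 1      # fatigued
    return labels
```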
The technical scheme of the invention is further improved as follows: the specific operation of the frequency-masking data enhancement in step S2 is as follows:
s21, performing a fast Fourier transform on the original signal x(t) to obtain the frequency-domain signal X(jω):
X(jω) = F(x(t)),
where F(·) denotes the fast Fourier transform;
s22, determining the hyper-parameters s and t, where s denotes the number of masked frequency points and t denotes the number of masked regions; the 20 frequency points randomly selected within a region are set to zero to obtain the enhanced data.
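A minimal sketch of this frequency-masking augmentation is shown below (Python/NumPy); the use of a real FFT, the masking of s consecutive bins per region, and the function name are assumptions made for the sketch.

```python
import numpy as np

def frequency_mask(x, s=20, t=1, rng=None):
    """Frequency-masking augmentation of a 1-D EEG channel.

    x : time-domain signal;  s : masked points per region;  t : number of regions
    """
    rng = np.random.default_rng(rng)
    X = np.fft.rfft(x)                       # frequency-domain signal X(jw)
    for _ in range(t):
        start = rng.integers(0, len(X) - s)  # pick a region of s frequency bins
        X[start:start + s] = 0.0             # set the selected bins to zero
    return np.fft.irfft(X, n=len(x))         # inverse transform -> enhanced data
```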
The technical scheme of the invention is further improved as follows: the specific operation of the frequency-domain noise-addition data enhancement in step S2 is as follows:
the original signal x(t) is transformed into the frequency domain,
X(jω) = F[x(t)] = H(jω)·e^{jθ(ω)},
where H(jω) = |X(jω)| is the amplitude and θ(ω) = Arg[X(jω)] is the phase of the frequency-domain signal; additive Gaussian noise G_i(λ) ~ N(0, σ_i^2) (i = 0, 1) is applied to the amplitude and to the phase, respectively, giving the noisy spectrum
X_noise(jω) = [H(jω) + G_0(ω)]·e^{j[θ(ω) + G_1(ω)]};
the enhanced time-domain signal is obtained by the inverse Fourier transform:
x_noise(t) = F^{-1}(X_noise).
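A minimal sketch of this amplitude/phase noise augmentation follows (Python/NumPy); the noise standard deviations and the use of a real FFT are illustrative assumptions rather than values prescribed by the patent.

```python
import numpy as np

def add_frequency_domain_noise(x, sigma_amp=0.1, sigma_phase=0.05, rng=None):
    """Add Gaussian noise to the amplitude and phase of the spectrum of a 1-D signal."""
    rng = np.random.default_rng(rng)
    X = np.fft.rfft(x)
    amp, phase = np.abs(X), np.angle(X)                              # H(jw) and Arg[X(jw)]
    amp = amp + rng.normal(0.0, sigma_amp * amp.std(), amp.shape)    # G_0 on the amplitude
    phase = phase + rng.normal(0.0, sigma_phase, phase.shape)        # G_1 on the phase
    X_noise = amp * np.exp(1j * phase)
    return np.fft.irfft(X_noise, n=len(x))                           # enhanced time-domain signal
```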
the technical scheme of the invention is further improved as follows: the specific operation of the step S3 is:
s31, data filtering: assume the original EEG signals recorded by m electrodes are X = {X_i}, i = 1, 2, ..., m; band-pass filtering X yields EEG signals in three frequency bands, X_1 (4-7 Hz), X_2 (8-13 Hz) and X_3 (13-32 Hz);
s32, determining the convolution kernel sizes:
let the convolution kernel sizes be K = {K_1, K_2, K_3}, where K_i (i = 1, 2, 3) is the kernel size of the Depthwise convolution of the i-th branch;
s33, network structure design:
the first branch of the network takes as input the first frequency band X_1(m, n) of the raw EEG signal, where m is the number of EEG channels (30) and n is the number of sampling points per sample (384);
in the following, a superscript in parentheses, e.g. w^(x), denotes a parameter of network layer x;
the output of the first-layer Pointwise convolution is
h_i^(1)(j) = Σ_{p=1}^{m} w_{i,p}^(1)·x_p(j) + b_i^(1),
where i = 1, 2, ..., N_1 and N_1 = 16 is the number of Pointwise convolutions, w_{i,p}^(1) is the weight of the p-th channel of the i-th Pointwise convolution, x_p(j) is the p-th channel of the j-th sampling point of the input EEG sample, and b_i^(1) is the bias of the i-th Pointwise convolution; after the Pointwise layer an output h^(1) of dimension (16, 384) is obtained;
in the first branch of the second-layer Depthwise convolution, the signal from the first layer has dimension (16, 384); each of its 16 channels is convolved with two Depthwise convolutions, so this branch outputs 32 channels, and the number of output sampling points j^(2) is calculated as
j^(2) = (j^(1) - K_1 + 2·padding)/stride + 1,
with kernel size K_1 = 36, stride = 1 and padding = 0, giving j^(2) = 349;
when i is odd, the output of the second-layer Depthwise convolution is
h_i^(2)(j) = Σ_{l=1}^{K_1} w_{i,l}^(2)·h_{(i+1)/2}^(1)(j + l - 1) + b_i^(2);
when i is even, the output of the second-layer Depthwise convolution is
h_i^(2)(j) = Σ_{l=1}^{K_1} w_{i,l}^(2)·h_{i/2}^(1)(j + l - 1) + b_i^(2),
where K_1 is the Depthwise kernel size, i is the output channel index and j is the sampling-point index;
the third layer is an activation layer, h^(3) = φ(h^(2)), with φ(·) the nonlinear activation function applied element-wise;
the fourth layer is a batch normalization layer, h^(4) = BN(h^(3));
the fifth layer is a global average pooling layer, h_i^(5) = (1/(n - K_1 + 1))·Σ_j h_i^(4)(j);
the second Depthwise branch uses kernel size K_2 = 51; the output-length calculation and the subsequent steps are repeated, giving j^(2) = 334 for this branch;
the third Depthwise branch uses kernel size K_3 = 80; the output-length calculation and the subsequent steps are repeated, giving j^(3) = 305 for this branch;
the above describes the first branch of the network, which processes the EEG signal X_1 of the 4-7 Hz band; the operations of the first branch are repeated for the 8-13 Hz EEG signal X_2 and for the 13-32 Hz EEG signal to obtain the outputs of the second and third branches of the network; finally, the outputs of the three branches are fully connected and processed as follows:
the sixth layer is a hidden layer,
h_c^(6) = Σ_i w_{c,i}^(6)·h_i^(5) + b_c^(6),
where c = 0 or c = 1; c = 0 represents the alert state and c = 1 represents the fatigue state;
the seventh layer applies the softmax function,
P(c) = exp(h_c^(6)) / Σ_{c'} exp(h_{c'}^(6)),
and yields the classification result.
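To make the branch structure of step s33 concrete, a minimal PyTorch sketch of one possible implementation is given below; the channel counts (30 -> 16 -> 32) and the kernel sizes 36/51/80 follow the description, while the use of ELU, the ordering of activation and batch normalization, and the class names are assumptions not fixed by the text.

```python
import torch
import torch.nn as nn

class BandBranch(nn.Module):
    """One frequency-band branch: Pointwise conv, then three Depthwise branches
    with kernel sizes 36/51/80, each followed by activation, batch norm and GAP."""
    def __init__(self, in_ch=30, n1=16, kernel_sizes=(36, 51, 80)):
        super().__init__()
        self.pointwise = nn.Conv1d(in_ch, n1, kernel_size=1)       # (30, 384) -> (16, 384)
        self.depthwise = nn.ModuleList([
            nn.Sequential(
                nn.Conv1d(n1, 2 * n1, kernel_size=k, groups=n1),   # 2 depthwise filters per channel
                nn.ELU(),                                          # activation (assumed ELU)
                nn.BatchNorm1d(2 * n1),
                nn.AdaptiveAvgPool1d(1),                           # global average pooling
            )
            for k in kernel_sizes
        ])

    def forward(self, x):                                  # x: (batch, 30, 384)
        h = self.pointwise(x)
        feats = [d(h).squeeze(-1) for d in self.depthwise] # each (batch, 32)
        return torch.cat(feats, dim=1)                     # (batch, 96)

class MultiScaleCNN(nn.Module):
    """Three band branches (theta/alpha/beta) followed by a linear classifier;
    the softmax is applied by the loss function or at inference time."""
    def __init__(self, n_bands=3, n_classes=2):
        super().__init__()
        self.branches = nn.ModuleList([BandBranch() for _ in range(n_bands)])
        self.classifier = nn.Linear(n_bands * 96, n_classes)

    def forward(self, bands):                              # list of 3 tensors (batch, 30, 384)
        feats = torch.cat([b(x) for b, x in zip(self.branches, bands)], dim=1)
        return self.classifier(feats)
```

The grouped 1-D convolution with groups=n1 and 2·n1 output channels reproduces the two Depthwise convolutions applied to each of the 16 channels.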
The technical scheme of the invention is further improved as follows: the specific operation of the step S4 is:
s41, adopting 11-fold cross validation
The first subject is selected as the test set, the electroencephalogram data of all other subjects are used as the training set, and the recognition accuracy ACC_1 of the model on the first subject is calculated; the same procedure is applied to the remaining subjects, finally yielding the recognition accuracies ACC_i (i = 1, 2, ..., 11) of the 11 subjects;
s42, calculating the average accuracy:
ACC_mean = (1/n)·Σ_{i=1}^{n} ACC_i,
where n = 11 is the number of subjects;
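A minimal sketch of the leave-one-subject-out (11-fold) evaluation of steps s41-s42 is given below; build_model, train and evaluate are hypothetical placeholders standing in for the multi-scale CNN training code.

```python
import numpy as np

def leave_one_subject_out(data_by_subject, build_model, train, evaluate):
    """data_by_subject: list of (X_i, y_i) pairs, one per subject (11 in total).
    Returns the per-subject accuracies ACC_i and their mean."""
    accs = []
    for i, (x_test, y_test) in enumerate(data_by_subject):
        # training set: all subjects except subject i
        x_train = np.concatenate([x for j, (x, _) in enumerate(data_by_subject) if j != i])
        y_train = np.concatenate([y for j, (_, y) in enumerate(data_by_subject) if j != i])

        model = build_model()
        train(model, x_train, y_train)
        accs.append(evaluate(model, x_test, y_test))    # ACC_i
    return accs, float(np.mean(accs))                    # mean accuracy over n = 11 subjects
```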
s43, interpretability analysis
A class activation mapping method is used to locate, for each input sample, the discriminative regions on which the CNN model bases its classification;
assume a given EEG sample X(m, n) is classified with label c, where c = 0 denotes the alert state and c = 1 the fatigue state; the input sample produces an activation h_c^(6) at the sixth layer of the network, and substituting the global-average-pooling output into the hidden-layer formula gives
h_c^(6) = Σ_i w_{c,i}^(6)·(1/(n - K + 1))·Σ_j h_i^(4)(j) + b_c^(6);
further neglecting the constant (n - K + 1), the quantity
M_{i,j} = w_{c,i}^(6)·h_i^(4)(j)
can be regarded as a map of size 2N_1 × (n - K + 1) describing how the final activation layer of class c is distributed over channels and sampling points;
each discriminative point of M is then projected back onto the input signal through a smoothing kernel, in which σ is a constant that determines the radius of the region of influence of each discriminative point in the input signal; the resulting map is further normalized to the range (-1, 1) for visualization;
according to the Pointwise and Depthwise output formulas, each point h_{i_k}^(2)(j_k) of the Depthwise output results from a local set of input signals from time point j_k to j_k + K - 1, i.e. a weighted sum over the convolution signals of the m = 30 channels, the weight of the p-th channel being w_{(i_k+1)/2,p}^(1) when i_k is odd, or w_{i_k/2,p}^(1) when i_k is even;
accordingly, a position (i_k, j_k) in the map can be traced back to the center (p_k, q_k) of the most strongly contributing set in the input signal: when i_k is odd, p_k is determined from the Pointwise weights w_{(i_k+1)/2,p}^(1); when i_k is even, p_k is determined from the Pointwise weights w_{i_k/2,p}^(1); and q_k = j_k + (l - 1)/2;
in this way, the entire set of the most strongly contributing signals is highlighted at the discriminative position of the input signal.
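As a rough illustration of the class-activation-mapping idea in s43, the sketch below forms the map M_{i,j} = w_{c,i}·h_i(j) from the final feature maps and the classifier weights, smooths it along time and normalizes it to (-1, 1); the Gaussian smoothing and all variable names are assumptions, not the exact back-projection procedure of the patent.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def class_activation_map(feature_maps, class_weights, c, sigma=5.0):
    """feature_maps  : array (channels, time) from the final activation layer h^(4)
    class_weights    : array (n_classes, channels) of hidden-layer weights w^(6)
    c                : class index (0 = alert, 1 = fatigued)
    Returns a (channels, time) map normalized to (-1, 1)."""
    M = class_weights[c][:, None] * feature_maps      # M_{i,j} = w_{c,i} * h_i(j)
    M = gaussian_filter1d(M, sigma=sigma, axis=1)     # smooth along time (radius ~ sigma)
    return M / (np.abs(M).max() + 1e-12)              # normalize to (-1, 1)
```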
Due to the adoption of the above technical solution, the invention achieves the following technical progress:
the invention designs a convolutional neural network mixing multiple convolution kernel sizes to identify the driver's fatigue state, obtains a 2% improvement in classification performance on the sustained-attention fatigue-driving data set, and improves the classification performance of the fatigue state identification model; by integrating the frequency-masking and frequency-domain noise-addition data enhancement methods with the multi-scale convolution kernel size CNN model, the classification performance and generalization ability of the model are further improved.
Drawings
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2 is a schematic diagram of the frequency masking of the present invention;
FIG. 3 is a diagram of a CNN model architecture for multi-scale convolution kernel sizes in accordance with the present invention;
FIG. 4 is an overall schematic view of the present invention;
FIG. 5 is a t-SNE visualization of the output-layer classification results of the EEG data of subject 1 obtained with the multi-scale convolution kernel size CNN model in embodiment 2 of the present invention;
FIG. 6 is a t-SNE visualization of the output-layer classification results of the EEG data of subject 1 obtained with the interpretable CNN model in embodiment 2 of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the following examples:
example 1:
a driver fatigue state identification method based on a multi-scale convolution kernel size CNN, fig. 1 is a schematic flow diagram of the method, and fig. 4 is a schematic overall diagram of the method.
The specific operation steps are as follows:
s1, data preparation: preprocessing the acquired electroencephalogram signals to obtain eleven subjects and 2022 samples, wherein the samples are shown in table 1:
TABLE 1
This step is divided into the following 5 sub-steps:
S11, down-sampling the original data set to 128 Hz and extracting a 3 s electroencephalogram sample before each lane-deviation event.
S12, calculating the local reaction time R_t of each sample:
R_t = t_d - t_r,
where t_d is the time at which the vehicle starts to drift and t_r is the time at which the subject steers the vehicle back to the original lane.
S13, calculating the global reaction time GR_t of each sample:
GR_t = (1/N)·Σ_{i=1}^{N} R_t(i),
where N is the number of samples falling within the 90 s window before the vehicle-drift event of each sample, so that GR_t is the mean of the local reaction times R_t within that 90 s window.
S14, defining the baseline alert reaction time R_t^alert:
the fifth percentile of the local reaction times in each session is taken as the baseline alert reaction time R_t^alert and serves as the basis for labeling the samples in the next step.
S15, labeling each sample:
a sample is labeled as alert when both its local and global reaction times are less than 1.5 times the baseline alert reaction time; a sample is labeled as fatigued when both its local and global reaction times are greater than 2.5 times the baseline alert reaction time. After labeling was completed, 2022 samples were obtained, each containing 3 s of electroencephalogram data.
S2, data enhancement: the example uses frequency masking for data enhancement of raw data
A fast Fourier transform is applied to the original signal x(t) to obtain the frequency-domain signal X(jω):
X(jω) = F(x(t)),
where F(·) denotes the fast Fourier transform.
A schematic diagram of frequency masking is shown in FIG. 2.
The hyper-parameters s and t are then determined, where s denotes the number of masked frequency points and t the number of masked regions; in this example t = 1 and s = 20. The 20 randomly selected frequency points of a region are set to zero, and an inverse Fourier transform yields the time-domain signal, i.e. the enhanced data.
S3, model training: training a data set formed by original data and enhanced data by adopting a multi-scale convolution kernel size mixed CNN model to obtain a classifier
S31, data filtering: assume the original EEG signals recorded by m electrodes are X = {X_i}, i = 1, 2, ..., m; band-pass filtering X yields EEG signals in three frequency bands, X_1 (4-7 Hz), X_2 (8-13 Hz) and X_3 (13-32 Hz).
S32, determining the convolution kernel sizes:
Let the convolution kernel sizes be K = {K_1, K_2, K_3}, where K_i (i = 1, 2, 3) is the kernel size of the Depthwise convolution of the i-th branch; in this example K_1 = 36, K_2 = 51 and K_3 = 80 are adopted.
S33, network structure design:
the first branch of the network takes as input the first frequency band X_1(m, n) of the raw EEG signal, where m = 30 is the number of EEG channels and n = 384 is the number of sampling points per sample. All superscript numbers in parentheses below indicate the network layer, e.g. w^(1) denotes a parameter of the first layer of the network.
The output of the first-layer Pointwise convolution is
h_i^(1)(j) = Σ_{p=1}^{m} w_{i,p}^(1)·x_p(j) + b_i^(1),
where i = 1, 2, ..., N_1 and N_1 = 16 is the number of Pointwise convolutions, w_{i,p}^(1) is the weight of the p-th channel of the i-th Pointwise convolution, x_p(j) is the p-th channel of the j-th sampling point of the input EEG sample, and b_i^(1) is the bias of the i-th Pointwise convolution. After the Pointwise layer an output h^(1) of dimension (16, 384) is obtained.
In the first branch of the second-layer Depthwise convolution, the signal from the first layer has dimension (16, 384); each of its 16 channels is convolved with two Depthwise convolutions, so this branch outputs 32 channels, and the number of output sampling points j^(2) is calculated as
j^(2) = (j^(1) - K_1 + 2·padding)/stride + 1,
with kernel size K_1 = 36, stride = 1 and padding = 0, giving j^(2) = 349.
When i is odd, the output of the second-layer Depthwise convolution is
h_i^(2)(j) = Σ_{l=1}^{K_1} w_{i,l}^(2)·h_{(i+1)/2}^(1)(j + l - 1) + b_i^(2);
when i is even, the output of the second-layer Depthwise convolution is
h_i^(2)(j) = Σ_{l=1}^{K_1} w_{i,l}^(2)·h_{i/2}^(1)(j + l - 1) + b_i^(2),
where K_1 is the Depthwise kernel size, i is the output channel index and j is the sampling-point index.
The third layer is an activation layer, h^(3) = φ(h^(2)), with φ(·) the nonlinear activation function applied element-wise.
The fourth layer is a batch normalization layer, h^(4) = BN(h^(3)).
The fifth layer is a global average pooling layer, h_i^(5) = (1/(n - K_1 + 1))·Σ_j h_i^(4)(j).
The second Depthwise branch uses kernel size K_2 = 51; the output-length calculation and the subsequent steps are repeated, giving j^(2) = 334 for this branch.
The third Depthwise branch uses kernel size K_3 = 80; the output-length calculation and the subsequent steps are repeated, giving j^(3) = 305 for this branch.
As shown in FIG. 3, the outputs of the three Depthwise branches are all concatenated.
The above describes the first branch of the network, which processes the EEG signal X_1 of the 4-7 Hz band. The operations of the first branch are repeated for the 8-13 Hz EEG signal X_2 and for the 13-32 Hz EEG signal to obtain the outputs of the second and third branches of the network; finally, the outputs of the three branches are fully connected and processed as follows:
the sixth layer is a hidden layer,
h_c^(6) = Σ_i w_{c,i}^(6)·h_i^(5) + b_c^(6),
where c = 0 or c = 1; c = 0 represents the alert state and c = 1 represents the fatigue state.
The seventh layer applies the softmax function,
P(c) = exp(h_c^(6)) / Σ_{c'} exp(h_{c'}^(6)),
and yields the classification result.
S4, state identification: inputting the preprocessed electroencephalogram data into a classifier model to obtain a state label of a sample and an interpretable model classification basis;
s41, adopting 11-fold cross validation
The first subject is selected as the test set, the electroencephalogram data of all other subjects are used as the training set, and the recognition accuracy ACC_1 of the model on the first subject is calculated; the same procedure is applied to the remaining subjects, finally yielding the recognition accuracies ACC_i (i = 1, 2, ..., 11) of the 11 subjects.
S42, calculating the average accuracy:
ACC_mean = (1/n)·Σ_{i=1}^{n} ACC_i,
where n = 11 is the number of subjects.
S43, interpretability analysis
A class activation mapping method is used to locate, for each input sample, the discriminative regions on which the CNN model bases its classification.
Assume a given EEG sample X(30, 384) is classified with label c, where c = 0 denotes the alert state and c = 1 the fatigue state; the input sample produces an activation h_c^(6) at the sixth layer of the network, and substituting the global-average-pooling output into the hidden-layer formula gives
h_c^(6) = Σ_i w_{c,i}^(6)·(1/(n - K + 1))·Σ_j h_i^(4)(j) + b_c^(6).
Further neglecting the constant (n - K + 1), the quantity
M_{i,j} = w_{c,i}^(6)·h_i^(4)(j)
can be regarded as a map of size 2N_1 × (n - K + 1) describing how the final activation layer of class c is distributed over channels and sampling points.
Each discriminative point of M is then projected back onto the input signal through a smoothing kernel, in which σ is a constant that determines the radius of the region of influence of each discriminative point in the input signal; the resulting map is further normalized to the range (-1, 1) for visualization.
According to the Pointwise and Depthwise output formulas, each point h_{i_k}^(2)(j_k) of the Depthwise output results from a local set of input signals from time point j_k to j_k + K - 1, i.e. a weighted sum over the convolution signals of the m = 30 channels, the weight of the p-th channel being w_{(i_k+1)/2,p}^(1) when i_k is odd, or w_{i_k/2,p}^(1) when i_k is even.
Accordingly, a position (i_k, j_k) in the map can be traced back to the center (p_k, q_k) of the most strongly contributing set in the input signal: when i_k is odd, p_k is determined from the Pointwise weights w_{(i_k+1)/2,p}^(1); when i_k is even, p_k is determined from the Pointwise weights w_{i_k/2,p}^(1); and q_k = j_k + (l - 1)/2.
In this way, the entire set of the most strongly contributing signals is highlighted at the discriminative position of the input signal.
Example 2:
this example introduces the frequency-domain noise-addition data enhancement method into the fatigue state recognition task; the processing of steps S1, S3 and S4 is the same as in embodiment 1:
S2, data enhancement: in this embodiment the original data are enhanced by adding frequency-domain noise.
A fast Fourier transform is applied to the original signal x(t) to obtain the frequency-domain signal
X(jω) = F[x(t)] = H(jω)·e^{jθ(ω)},
where H(jω) = |X(jω)| is the amplitude and θ(ω) = Arg[X(jω)] is the phase of the frequency-domain signal; additive Gaussian noise G_i(λ) ~ N(0, σ_i^2) (i = 0, 1) is applied to the amplitude and to the phase, respectively, giving the noisy spectrum
X_noise(jω) = [H(jω) + G_0(ω)]·e^{j[θ(ω) + G_1(ω)]}.
The enhanced time-domain signal is obtained by the inverse Fourier transform:
x_noise(t) = F^{-1}(X_noise).
In order to demonstrate the performance of the algorithm of the present invention, the experimental results of the multi-scale convolution kernel size CNN of the present invention are compared with the results of the Conv-bellownet, EEGNet and InterpretableCNN network structures, as shown in Table 2, where the bold numbers mark the best recognition accuracy for each subject.
The t-SNE visualizations of the output-layer classification results of the EEG data of subject 1 obtained with the multi-scale convolution kernel size CNN and with the interpretable CNN model are shown in FIG. 5 and FIG. 6, from which the clear advantage of the multi-scale convolution kernel size CNN model in classification effect can be seen.
TABLE 2
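For reference, a minimal sketch of the kind of t-SNE projection shown in FIG. 5 and FIG. 6 is given below (Python with scikit-learn and matplotlib); the perplexity value and the way the output-layer features are collected are assumptions, not the settings of this embodiment.

```python
import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

def plot_tsne(output_features, labels, title="t-SNE of output-layer features"):
    """output_features : array (n_samples, n_features) taken from the model's output layer
    labels             : array (n_samples,) with 0 = alert, 1 = fatigued"""
    emb = TSNE(n_components=2, perplexity=30, init="pca", random_state=0).fit_transform(output_features)
    for c, name in [(0, "alert"), (1, "fatigued")]:
        pts = emb[labels == c]
        plt.scatter(pts[:, 0], pts[:, 1], s=8, label=name)
    plt.legend()
    plt.title(title)
    plt.show()
```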
Finally, it should be noted that: the above-mentioned embodiments are only used for illustrating the technical solution of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (7)

1. The method for identifying the fatigue state of the driver based on the multi-scale convolution kernel size CNN is characterized by comprising the following steps of:
s1, data preparation: preprocessing the acquired electroencephalogram signals to obtain standard format data;
s2, data enhancement: performing data enhancement on the original data by adopting a frequency masking or frequency domain noise adding algorithm;
s3: training a model, namely training a data set formed by original data and enhanced data by adopting a multi-scale convolution kernel size mixed CNN model to obtain a classifier;
s4, state identification: and inputting the preprocessed electroencephalogram data into a classifier model to obtain a state label of the sample and an interpretable model classification basis.
2. The method for identifying the fatigue state of the driver based on the multi-scale convolution kernel size CNN as claimed in claim 1, wherein the specific operation of step S1 is as follows:
s11, down-sampling the original data set to 128 Hz, and extracting a 3 s electroencephalogram sample before each lane-deviation event;
s12, calculating the local reaction time R_t of each sample:
R_t = t_d - t_r,
where t_d is the time at which the vehicle starts to drift and t_r is the time at which the subject steers the vehicle back to the original lane;
s13, calculating the global reaction time GR_t of each sample:
GR_t = (1/N)·Σ_{i=1}^{N} R_t(i),
where N is the number of samples falling within the 90 s window before the vehicle-drift event of each sample, so that GR_t is the mean of the local reaction times R_t within that 90 s window;
s14, defining the baseline alert reaction time R_t^alert:
the fifth percentile of the local reaction times in each session is taken as the baseline alert reaction time R_t^alert and serves as the basis for labeling the samples in the next step;
s15, labeling each sample:
a sample is labeled as alert when both its local and global reaction times are less than 1.5 times the baseline alert reaction time; a sample is labeled as fatigued when both its local and global reaction times are greater than 2.5 times the baseline alert reaction time.
3. The method for identifying the fatigue state of the driver based on the multi-scale convolution kernel size CNN as claimed in claim 2, wherein 2022 samples are obtained after the labeling in step S15 is completed, each sample containing 3 s of electroencephalogram data.
4. The method for identifying the fatigue state of the driver based on the multi-scale convolution kernel size CNN as claimed in claim 1, wherein the specific operation of the frequency-masking data enhancement in step S2 is as follows:
s21, performing a fast Fourier transform on the original signal x(t) to obtain the frequency-domain signal X(jω):
X(jω) = F(x(t)),
where F(·) denotes the fast Fourier transform;
s22, determining the hyper-parameters s and t, where s denotes the number of masked frequency points and t denotes the number of masked regions; the 20 frequency points randomly selected within a region are set to zero to obtain the enhanced data.
5. The method for identifying the fatigue state of the driver based on the multi-scale convolution kernel size CNN as claimed in claim 1, wherein the specific operation of the frequency-domain noise-addition data enhancement in step S2 is as follows:
the original signal x(t) is transformed into the frequency domain,
X(jω) = F[x(t)] = H(jω)·e^{jθ(ω)},
where H(jω) = |X(jω)| is the amplitude and θ(ω) = Arg[X(jω)] is the phase of the frequency-domain signal; additive Gaussian noise G_i(λ) ~ N(0, σ_i^2) (i = 0, 1) is applied to the amplitude and to the phase, respectively, giving the noisy spectrum
X_noise(jω) = [H(jω) + G_0(ω)]·e^{j[θ(ω) + G_1(ω)]};
the enhanced time-domain signal is obtained by the inverse Fourier transform:
x_noise(t) = F^{-1}(X_noise).
6. The method for identifying the fatigue state of the driver based on the multi-scale convolution kernel size CNN as claimed in claim 1, wherein the specific operation of step S3 is as follows:
s31, data filtering: assume the original EEG signals recorded by m electrodes are X = {X_i}, i = 1, 2, ..., m; band-pass filtering X yields EEG signals in three frequency bands, X_1 (4-7 Hz), X_2 (8-13 Hz) and X_3 (13-32 Hz);
s32, determining the convolution kernel sizes:
let the convolution kernel sizes be K = {K_1, K_2, K_3}, where K_i (i = 1, 2, 3) is the kernel size of the Depthwise convolution of the i-th branch;
s33, network structure design:
the first branch of the network takes as input the first frequency band X_1(m, n) of the raw EEG signal, where m is the number of EEG channels (30) and n is the number of sampling points per sample (384);
in the following, a superscript in parentheses, e.g. w^(x), denotes a parameter of network layer x;
the output of the first-layer Pointwise convolution is
h_i^(1)(j) = Σ_{p=1}^{m} w_{i,p}^(1)·x_p(j) + b_i^(1),
where i = 1, 2, ..., N_1 and N_1 = 16 is the number of Pointwise convolutions, w_{i,p}^(1) is the weight of the p-th channel of the i-th Pointwise convolution, x_p(j) is the p-th channel of the j-th sampling point of the input EEG sample, and b_i^(1) is the bias of the i-th Pointwise convolution; after the Pointwise layer an output h^(1) of dimension (16, 384) is obtained;
in the first branch of the second-layer Depthwise convolution, the signal from the first layer has dimension (16, 384); each of its 16 channels is convolved with two Depthwise convolutions, so this branch outputs 32 channels, and the number of output sampling points j^(2) is calculated as
j^(2) = (j^(1) - K_1 + 2·padding)/stride + 1,
with kernel size K_1 = 36, stride = 1 and padding = 0, giving j^(2) = 349;
when i is odd, the output of the second-layer Depthwise convolution is
h_i^(2)(j) = Σ_{l=1}^{K_1} w_{i,l}^(2)·h_{(i+1)/2}^(1)(j + l - 1) + b_i^(2);
when i is even, the output of the second-layer Depthwise convolution is
h_i^(2)(j) = Σ_{l=1}^{K_1} w_{i,l}^(2)·h_{i/2}^(1)(j + l - 1) + b_i^(2),
where K_1 is the Depthwise kernel size, i is the output channel index and j is the sampling-point index;
the third layer is an activation layer, h^(3) = φ(h^(2)), with φ(·) the nonlinear activation function applied element-wise;
the fourth layer is a batch normalization layer, h^(4) = BN(h^(3));
the fifth layer is a global average pooling layer, h_i^(5) = (1/(n - K_1 + 1))·Σ_j h_i^(4)(j);
the second Depthwise branch uses kernel size K_2 = 51; the output-length calculation and the subsequent steps are repeated, giving j^(2) = 334 for this branch;
the third Depthwise branch uses kernel size K_3 = 80; the output-length calculation and the subsequent steps are repeated, giving j^(3) = 305 for this branch;
the above describes the first branch of the network, which processes the EEG signal X_1 of the 4-7 Hz band; the operations of the first branch are repeated for the 8-13 Hz EEG signal X_2 and for the 13-32 Hz EEG signal to obtain the outputs of the second and third branches of the network; finally, the outputs of the three branches are fully connected and processed as follows:
the sixth layer is a hidden layer,
h_c^(6) = Σ_i w_{c,i}^(6)·h_i^(5) + b_c^(6),
where c = 0 or c = 1; c = 0 represents the alert state and c = 1 represents the fatigue state;
the seventh layer applies the softmax function,
P(c) = exp(h_c^(6)) / Σ_{c'} exp(h_{c'}^(6)),
and yields the classification result.
7. The method for identifying the fatigue state of the driver based on the multi-scale convolution kernel size CNN as claimed in claim 1, wherein the specific operation of step S4 is as follows:
s41, adopting 11-fold cross-validation:
the first subject is selected as the test set, the electroencephalogram data of all other subjects are used as the training set, and the recognition accuracy ACC_1 of the model on the first subject is calculated; the same procedure is applied to the remaining subjects, finally yielding the recognition accuracies ACC_i (i = 1, 2, ..., 11) of the 11 subjects;
s42, calculating the average accuracy:
ACC_mean = (1/n)·Σ_{i=1}^{n} ACC_i,
where n = 11 is the number of subjects;
s43, interpretability analysis:
a class activation mapping method is used to locate, for each input sample, the discriminative regions on which the CNN model bases its classification;
assume a given electroencephalogram sample X(m, n) is classified with label c, where c = 0 denotes the alert state and c = 1 the fatigue state; the input sample produces an activation h_c^(6) at the sixth layer of the network, and substituting the global-average-pooling output into the hidden-layer formula gives
h_c^(6) = Σ_i w_{c,i}^(6)·(1/(n - K + 1))·Σ_j h_i^(4)(j) + b_c^(6);
further neglecting the constant (n - K + 1), the quantity
M_{i,j} = w_{c,i}^(6)·h_i^(4)(j)
can be regarded as a map of size 2N_1 × (n - K + 1) describing how the final activation layer of class c is distributed over channels and sampling points;
each discriminative point of M is then projected back onto the input signal through a smoothing kernel, in which σ is a constant that determines the radius of the region of influence of each discriminative point in the input signal; the resulting map is further normalized to the range (-1, 1) for visualization;
according to the Pointwise and Depthwise output formulas, each point h_{i_k}^(2)(j_k) of the Depthwise output results from a local set of input signals from time point j_k to j_k + K - 1, i.e. a weighted sum over the convolution signals of the m = 30 channels, the weight of the p-th channel being w_{(i_k+1)/2,p}^(1) when i_k is odd, or w_{i_k/2,p}^(1) when i_k is even;
accordingly, a position (i_k, j_k) in the map can be traced back to the center (p_k, q_k) of the most strongly contributing set in the input signal: when i_k is odd, p_k is determined from the Pointwise weights w_{(i_k+1)/2,p}^(1); when i_k is even, p_k is determined from the Pointwise weights w_{i_k/2,p}^(1); and q_k = j_k + (l - 1)/2;
in this way, the entire set of the most strongly contributing signals is highlighted at the discriminative position of the input signal.
CN202211488681.1A 2022-11-25 2022-11-25 Driver fatigue state identification method based on multi-scale convolution kernel size CNN Pending CN115836868A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211488681.1A CN115836868A (en) 2022-11-25 2022-11-25 Driver fatigue state identification method based on multi-scale convolution kernel size CNN

Publications (1)

Publication Number Publication Date
CN115836868A true CN115836868A (en) 2023-03-24

Family

ID=85576091

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211488681.1A Pending CN115836868A (en) 2022-11-25 2022-11-25 Driver fatigue state identification method based on multi-scale convolution kernel size CNN

Country Status (1)

Country Link
CN (1) CN115836868A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110309813A (en) * 2019-07-10 2019-10-08 南京行者易智能交通科技有限公司 A kind of model training method, detection method, device, mobile end equipment and the server of the human eye state detection based on deep learning
CN111460892A (en) * 2020-03-02 2020-07-28 五邑大学 Electroencephalogram mode classification model training method, classification method and system
US20200367800A1 (en) * 2019-01-23 2020-11-26 Wuyi University Method for identifying driving fatigue based on cnn-lstm deep learning model
CN113180692A (en) * 2021-02-11 2021-07-30 北京工业大学 Electroencephalogram signal classification and identification method based on feature fusion and attention mechanism
CN113673442A (en) * 2021-08-24 2021-11-19 燕山大学 Variable working condition fault detection method based on semi-supervised single classification network
US20210365741A1 (en) * 2019-05-08 2021-11-25 Tencent Technology (Shenzhen) Company Limited Image classification method, computer-readable storage medium, and computer device
CN113934302A (en) * 2021-10-21 2022-01-14 燕山大学 Myoelectric gesture recognition method based on SeNet and gating time sequence convolution network
CN114399642A (en) * 2021-12-29 2022-04-26 燕山大学 Convolutional neural network fluorescence spectrum feature extraction method
CN115357113A (en) * 2022-07-08 2022-11-18 西安电子科技大学 SSVEP brain-computer interface stimulation modulation and decoding method under dynamic background


Similar Documents

Publication Publication Date Title
Supriya et al. Automated epilepsy detection techniques from electroencephalogram signals: a review study
Lu et al. Classification of single-channel EEG signals for epileptic seizures detection based on hybrid features
Chen et al. Driving safety risk prediction using cost-sensitive with nonnegativity-constrained autoencoders based on imbalanced naturalistic driving data
Khare et al. Optimized tunable Q wavelet transform based drowsiness detection from electroencephalogram signals
WO2021017329A1 (en) Method and device for detecting when driver is distracted
Mehla et al. A novel approach for automated alcoholism detection using Fourier decomposition method
Yildiz et al. Classification and analysis of epileptic EEG recordings using convolutional neural network and class activation mapping
CN113180696A (en) Intracranial electroencephalogram detection method and device, electronic equipment and storage medium
Babaeian et al. Driver drowsiness detection algorithms using electrocardiogram data analysis
Dash et al. Hidden Markov model based epileptic seizure detection using tunable Q wavelet transform
Li et al. FuzzyEn-based features in FrFT-WPT domain for epileptic seizure detection
Wijayanto et al. Comparison of empirical mode decomposition and coarse-grained procedure for detecting pre-ictal and ictal condition in electroencephalography signal
Wang et al. Automated recognition of epilepsy from EEG signals using a combining space–time algorithm of CNN-LSTM
Lian et al. Spatial enhanced pattern through graph convolutional neural network for epileptic EEG identification
Thilagaraj et al. Identification of drivers drowsiness based on features extracted from EEG signal using SVM classifier
Gao et al. Automatic epileptic seizure classification in multichannel EEG time series with linear discriminant analysis
CN113343869A (en) Electroencephalogram signal automatic classification and identification method based on NTFT and CNN
CN115836868A (en) Driver fatigue state identification method based on multi-scale convolution kernel size CNN
Kumar et al. Classification of driver cognitive load based on physiological data: Exploring recurrent neural networks
Ding et al. EEG-Fest: Few-shot based Attention Network for Driver's Vigilance Estimation with EEG Signals
Yu et al. SQNN: a spike-wave index quantification neural network with a pre-labeling algorithm for epileptiform activity identification and quantification in children
Wang et al. Combining STFT and random forest algorithm for epileptic detection
Ding et al. EEG-fest: few-shot based attention network for driver's drowsiness estimation with EEG signals
Chandra et al. Neuromuscular disease detection employing 1D-local binary pattern of electromyography signals
Xie et al. An SVM parameter learning algorithm scalable on large data size for driver fatigue detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination