CN114170588B - Eye feature-based bad state identification method for railway dispatcher - Google Patents

Eye feature-based bad state identification method for railway dispatcher

Info

Publication number
CN114170588B
CN114170588B
Authority
CN
China
Prior art keywords
time
data
eye movement
eye
dispatcher
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111521459.2A
Other languages
Chinese (zh)
Other versions
CN114170588A (en)
Inventor
张光远
张帆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest Jiaotong University
Original Assignee
Southwest Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southwest Jiaotong University filed Critical Southwest Jiaotong University
Priority to CN202111521459.2A priority Critical patent/CN114170588B/en
Publication of CN114170588A publication Critical patent/CN114170588A/en
Application granted granted Critical
Publication of CN114170588B publication Critical patent/CN114170588B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition; G06F18/20 Analysing
    • G06F18/23213 Non-hierarchical clustering techniques using statistics or function optimisation with a fixed number of clusters, e.g. K-means clustering
    • G06F18/2411 Classification techniques based on the proximity to a decision surface, e.g. support vector machines
    • G06F18/2414 Smoothing the distance, e.g. radial basis function networks [RBFN]
    • G06F18/2415 Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F18/24323 Tree-organised classifiers
    • G06F18/256 Fusion techniques of classification results relating to different input data, e.g. multimodal recognition
    • G06F18/295 Markov models or related models, e.g. semi-Markov models; Markov random fields; networks embedding Markov models

Abstract

The invention provides a method for identifying bad states of railway dispatchers based on eye features. First, pre-experiment preparation is carried out; the dispatcher then completes a dispatching task according to the experimental requirements while the eye data needed for the experiment are collected in real time. A clustering algorithm is used to locate time segments with a good classification effect; six eye movement features are extracted from the eye tracker's frame data and event data using a common sliding window to form the model input data set; labels corresponding to the different working states serve as the outputs; and the data are fed into the constructed machine learning models for training and comparative analysis. By adopting video emotion recognition, the invention overcomes the ambiguity of subjective evaluation indices, and time-domain synchronization makes the division of eye movement data into time intervals more accurate and controllable.

Description

Eye feature-based bad state identification method for railway dispatcher
Technical Field
The invention belongs to the technical field of machine learning, and particularly relates to a method for identifying bad states of railway dispatchers based on eye features.
Background
The railway system in China follows the principle of centralized leadership and unified command. The dispatching system is the nerve center of railway transportation, responsible for organizing passenger and freight transport and safe train operation. The train dispatcher is the core of dispatching command, and his or her decisions directly affect the safety and efficiency of railway operation.
According to the characteristics of railway dispatching work, operations can be divided into normal and abnormal driving. Under normal driving, video supervision is the main activity; attention drops markedly and fatigue arises involuntarily. Under abnormal conditions, dispatching leadership typically supervises on site; the dispatcher's workload and stress rise, and when the incident being handled is complex, both psychological pressure and time pressure increase, producing irritability. Moreover, given the nature of dispatching work, and especially during night shifts, physiological function declines noticeably and drowsiness becomes the most direct manifestation of fatigue. Numerous studies show that bad emotions and fatigue impair attention, decision-making and judgment, and a dispatcher who continues working in such a state poses a serious threat to train operation safety. Identifying the working state of railway dispatchers therefore has high research value.
Eye movement feature information plays a vital role as the input for bad state identification in this work. Many studies of attention and fatigue in train and motor vehicle drivers record eye feature information, mostly via head-mounted instruments or computer image recognition. Head-mounted instruments record eye features with high precision; image recognition interferes little with the subject but records with slightly lower precision. This study therefore records the dispatcher's eye movement features with a head-mounted eye tracker.
Domestic and foreign research on recognizing and predicting fatigue and emotional states has mostly concentrated on motor vehicle and train drivers: raw data sets are extracted by contact or non-contact means, combined with subjective evaluation or physiological measurement, and fed into machine learning or regression algorithms for validity analysis. Fatigue analysis of railway dispatchers, however, has mainly relied on subjective methods such as workload assessment and human error probability, whose reliability is low, and the emotional states of railway dispatchers have not yet been studied.
The work of a railway dispatcher differs from that of drivers and other occupations: the post runs around the clock, incident handling takes longer, and the psychological and physiological changes involved are larger. Different working conditions produce different states; fatigue is one consequence of prolonged working time, but it is not the only one, and bad emotional states have equal research value. A recognition model for the fatigue and bad emotional states of railway dispatchers based on eye features therefore provides an important theoretical basis for practical dispatching work.
In the first prior art, the facial features of dispatchers are analyzed with a BP neural network and an HMM. Dispatchers wearing a head-mounted eye tracker perform dispatching tasks from 12:00 to 15:00, fill in the Karolinska Sleepiness Scale according to subjective feeling, and have their yawning frequency recorded by manual counting. After the experiment, seven eye movement indices are extracted from the eye tracker's raw data set and combined with yawning frequency into eight feature indices as the initial input data set.
Next, the work tasks of railway dispatchers are divided by cognitive function, and 432 triangular fuzzy numbers of human error probability are calculated and imported into k-means for cluster analysis; among cluster counts of 2, 3, 4 and 5, three clusters yield the largest silhouette coefficient, so the output fatigue states are classified as light, medium and heavy. The analytic hierarchy process is then used to weight the 9-level KSS scale and DORATASK workload values, and fatigue-degree intervals for the three fatigue states are output as the output data set.
By analyzing the Pearson correlation coefficients of the initial data set and the significance of the input indices, their mutual independence is verified, and five strongly correlated indices (fixation time, average pupil size, blink duration, yawning frequency and blink frequency) are screened out as the final input data.
A Markov model and a BP neural network model are established for time-period and time-point features respectively; comparing model accuracy and the AUC of the ROC curves shows that the BP neural network suits fatigue judgment on time-point data, while the HMM suits fatigue judgment on time-period data.
The disadvantages of this technique are:
1. card Luo Linsi cards have indirect human factor effects on the sleepiness scale and on the artificial yawning scale. The distinction between class 9 classification of the KSS scale is fuzzy in the subjective cognition level of people, and a supervisor and a tested person are not easy to distinguish the classification, so that subjective influence factors are large. The existence of the yawner tends to generate certain tension and alertness on the tested person so as to influence the counting result.
2. Human error probability and eye movement data lack spatio-temporal synchronization. Defining the eye movement data classification by clustering human error probabilities derived from the dispatcher's tasks is innovative but not entirely persuasive.
3. Both models fit the task reasonably well, but the authors' accuracy analysis shows that for certain classes the recognition rate of the BP neural network and the HMM can be extremely low, even zero, which is unsound from a mathematical standpoint.
In the second prior art, multiple linear regression is applied to the emotional eye movement indices of coal mine drivers. First, a 5-minute positive-emotion induction with a landscape video is performed, followed by 30 minutes of man-car driving in a roadway; after a 5-minute rest, a 5-minute negative-emotion induction is performed, and the process is repeated. While driving, the driver completes 10 emotional-state test questions and fills in a nine-point emotion scale.
After the experiment, negative- and positive-emotion evaluation values of 1.00 and 8.75 on the emotion scale confirm the validity of the induction. Variance and correlation analysis between the work-task scenario and each raw index then selects four indices (number of fixation points, saccade speed, reaction time, and number of hazard sources identified) as the input data set.
A four-variable linear regression equation is solved for the regression constant, regression parameters and random error; the resulting mean error of 8.16% indicates high prediction accuracy.
The disadvantages of this technique are:
1. Emotion induction can simulate real-scene emotions well, but having the same subject undergo different inductions in a very short time makes the effect highly unstable. The subject must also answer 10 test questions and fill in emotion scales unrelated to driving while driving; the experiment contains too many irrelevant items and too obvious confounds. Subjective emotion verification alone cannot fully eliminate these sources of influence.
2. The authors' emotion assessment questions describe hypothetical scenarios rather than tasks set in the experiment; using them to verify the subject's true emotional state is somewhat unreasonable and cannot directly show whether the induction succeeded.
In the third prior art, driver fatigue is studied with eye movement and electro-oculography (EOG) data. PERCLOS values extracted by video recognition and by an eye tracker are first compared, and the eye tracker is chosen for the driving experiment, with PERCLOS as the output fatigue-labeling standard. EOG equipment simultaneously collects eye feature indices, and a mapping from EOG feature indices to PERCLOS is established. Sequential feature selection and the mRMR feature selection algorithm are compared for screening EOG features; the latter performs better, so mRMR is used for feature ranking. The top-ranked features are combined with linear-kernel and radial-basis-function-kernel support vector machines for classification training, with good verified performance: EOG data in fatigued and awake states are well distinguished, demonstrating the feasibility of PERCLOS as an evaluation standard.
The disadvantages of this technique are:
1. When PERCLOS is used as the fatigue index, its threshold is set manually, and the work does not discuss whether this is reasonable; that is, whether a person is actually awake when the value falls below the manually set threshold remains unknown.
2. When labeling and dividing the transition from wakefulness to fatigue, the authors use an already trained classification model. In theory the model has learned the data differences in this fuzzy state, but this still needs verification, which is not given.
3. Fatigue classification can involve multiple classes, and the SVM used by the authors only handles the binary case; it obviously cannot be applied directly when multi-class problems are encountered.
Disclosure of Invention
The present invention aims to solve the following problems:
1. To avoid the influence of subjective evaluation methods involving the KSS scale and nine-point emotion scales, video emotion recognition is adopted, overcoming the ambiguity of such evaluation indices; time-domain synchronization makes the time division of eye movement data more accurate and controllable.
2. The emotional states of railway dispatchers have not yet been studied for recognition or prediction, but previous research on drivers and similar occupations readily shows the influence of bad emotions, as well as fatigue, on work safety.
3. The eye aspect ratio is chosen in place of PERCLOS for fatigue labeling. PERCLOS is undoubtedly accurate for fatigue judgment, but its data extraction has technical limitations: the eyes' position relative to the camera, eyelash position and eyelid image processing all affect pupil detection, and these influences are hard to separate one by one because dynamic detection is heavily constrained. The eye aspect ratio differs little from PERCLOS in quantifying fatigue while avoiding these issues, so studying it is clearly reasonable.
4. A clustering algorithm slices the data to minimize the influence of outliers; several machine learning algorithms (KNN, GRU, etc.) are fused and compared side by side, increasing model richness, broadening applicability, and improving robustness and accuracy.
The specific technical scheme is as follows:
In the method for identifying bad states of railway dispatchers based on eye features, pre-experiment preparation is first carried out; the dispatcher then completes a dispatching task according to the experimental requirements while the eye data needed for the experiment are collected in real time. A clustering algorithm is used to locate time segments with a good classification effect; six eye movement features are extracted from the eye tracker's frame data and event data using a common sliding window to form the model input data set; labels corresponding to the different working states serve as the outputs; and the data are fed into the constructed machine learning models for training and comparative analysis.
The method comprises the following specific steps:
step one: early preparation of experiment
The eye aspect ratio is used as a marking index of the fatigue state.
Building emotion recognition model
In the work of railway dispatchers, facial expression pictures with obvious features under tension, irritability and boredom are collected in the field to form the data set for emotion verification.
Image grayscale conversion uses the classical weighting method:
gray[i,j] = 0.299*imageR[i,j] + 0.587*imageG[i,j] + 0.114*imageB[i,j]
Sampling parameters are initialized, each training set picture is partitioned, and a sequence containing several face matrices with their corresponding labels is stored.
Offset coordinates of each pixel on the x and y axes are calculated, and bilinear interpolation is used to compute the gray values and LBP(0,1) codes between adjacent pixels of each region, giving the LBP code map of the whole image.
Dimension reduction yields ULBP values; the coded image is cut into 8 x 8 regions, a sub-histogram of each region is obtained from the LBP and ULBP values, and the full-image LBPH map is then formed. Classification results are obtained by comparing the LBPH maps of different face images.
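As a minimal NumPy sketch of the grayscale weighting and the regional LBP histogram (LBPH) descriptor described above (basic 8-neighbour LBP without the ULBP dimension reduction; function names are illustrative, not from the patent):

```python
import numpy as np

def to_gray(rgb):
    """Classical weighted grayscale conversion: 0.299 R + 0.587 G + 0.114 B."""
    return 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]

def lbp_code(gray, y, x):
    """Basic 8-neighbour LBP code for pixel (y, x) of a 2-D grayscale array."""
    center = gray[y, x]
    # Neighbours clockwise from top-left; each bit set if neighbour >= centre.
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dy, dx) in enumerate(offs):
        if gray[y + dy, x + dx] >= center:
            code |= 1 << bit
    return code

def lbph(gray, grid=(8, 8)):
    """Concatenate per-region LBP histograms into the LBPH descriptor."""
    h, w = gray.shape
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            codes[y - 1, x - 1] = lbp_code(gray, y, x)
    gy, gx = grid
    hists = []
    for i in range(gy):
        for j in range(gx):
            block = codes[i * (h - 2) // gy:(i + 1) * (h - 2) // gy,
                          j * (w - 2) // gx:(j + 1) * (w - 2) // gx]
            hist, _ = np.histogram(block, bins=256, range=(0, 256))
            hists.append(hist)
    return np.concatenate(hists)
```

Two face images can then be compared by a histogram distance (e.g. chi-squared) between their LBPH descriptors.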
Building a face feature detection model
(1) Frame images are extracted from the video, 30 face images per second via a for-loop, to align with the eye tracker's 30 frames/second sampling rate.
(2) The image is gray-processed, the magnitude and direction of each pixel gradient are computed, the image is divided into 6*6 blocks, region gradient histograms are extracted within each block, and all sub-maps are combined into the HOG gradient histogram vector set of the whole image.
(3) The SVM kernel is set to linear. Each HOG feature vector is a one-dimensional matrix; to meet SVM training requirements, all image feature sets are stacked vertically into an n-dimensional matrix, which is fed to the SVM classifier to judge whether each frame contains the same number of faces.
(4) The 68-landmark classifier of the dlib library is called on each detected face for slice positioning; the detected face positions are iterated, position information is extracted, and the (x, y) coordinate values of landmarks 36-47 are traversed in a dictionary.
(5) The height and width of both eyes are calculated, along with the height-width ratio:
left eye width = distance(points[36], points[39]);
right eye width = distance(points[42], points[45]);
left eye height = distance((points[37] + points[38])/2, (points[41] + points[40])/2);
right eye height = distance((points[43] + points[44])/2, (points[46] + points[47])/2);
(6) The results are written to an Excel file, and the experiment ends.
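The height-width calculation above can be sketched as follows, assuming a (68, 2) landmark array in dlib's 68-point indexing (left eye 36-41, right eye 42-47) as the description states; the averaging of the two per-eye ratios is an assumption:

```python
import numpy as np

def eye_aspect_ratio(points):
    """Mean height/width ratio of both eyes from a (68, 2) landmark array."""
    def dist(a, b):
        return float(np.linalg.norm(np.asarray(a) - np.asarray(b)))
    left_w  = dist(points[36], points[39])
    right_w = dist(points[42], points[45])
    # Eye height: distance between midpoints of the upper and lower lid pairs.
    left_h  = dist((points[37] + points[38]) / 2, (points[41] + points[40]) / 2)
    right_h = dist((points[43] + points[44]) / 2, (points[46] + points[47]) / 2)
    return (left_h / left_w + right_h / right_w) / 2
```

The per-frame ratios form the time series later clustered in step three.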
Step two: experimental design and data acquisition
During the experiment, the dispatcher's eye movement data are recorded in real time together with real-time video acquisition. The experimental steps are as follows:
(1) State induction is performed for groups 1, 2 and 3 under simultaneous video monitoring;
(2) The dispatcher's emotion induction effect is determined from the video monitoring using video recognition, and dispatchers whose induction succeeded are invited to the dispatching experiment;
(3) Group 4 performs dispatching operations while facial video information is collected in real time;
(4) Eye movement indices are collected, and the dispatcher's fixation time, pupil size and average saccade speed during work are analyzed;
(5) The facial videos of group 4 are analyzed with the face feature detection model built above, the eye aspect ratio is extracted, and cluster analysis finds the optimal fatigue state classification and segment;
(6) A machine-learning-based dispatcher state recognition model is established to judge the dispatcher's working state.
Step three: k-means cluster analysis
The eye aspect ratio data are clustered with the k-means algorithm, chaotic segments are removed, and cutting time points are marked:
(1) The optimal cluster number K is obtained by the elbow method.
(2) The clustering result for K is verified with the silhouette coefficient method.
(3) With the optimal cluster number, the K-means algorithm determines each sample's cluster membership: the sample matrix is input, K initial cluster centers are chosen at random, and the point assignment minimizing each cluster's sum of squared Euclidean distances is found; the centroid of the observations in each cluster becomes the new cluster center. These steps are repeated 10 times to obtain the final cluster centers, and each observation's cluster membership is retained.
(4) The cluster-membership sequence is extracted for cluster bar-chart analysis, a relatively stable classification segment is found, and the corresponding fatigue-group eye movement data in that period are intercepted.
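A minimal sketch of the clustering step on the 1-D eye aspect ratio series, with SSE values per candidate K for the elbow method (quantile seeding of the centers is an assumption made for determinism; the patent specifies random initialization with 10 restarts):

```python
import numpy as np

def kmeans_1d(x, k, iters=100):
    """Tiny k-means on a 1-D series such as the eye aspect ratio."""
    x = np.asarray(x, dtype=float)
    centers = np.quantile(x, np.linspace(0, 1, k))  # deterministic seeding
    for _ in range(iters):
        labels = np.abs(x[:, None] - centers[None, :]).argmin(axis=1)
        new = np.array([x[labels == j].mean() if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    labels = np.abs(x[:, None] - centers[None, :]).argmin(axis=1)
    sse = float(((x - centers[labels]) ** 2).sum())
    return centers, labels, sse

def elbow_sses(x, ks=(1, 2, 3, 4, 5)):
    """SSE per candidate cluster count; the elbow where SSE stops dropping
    sharply suggests the K to use."""
    return {k: kmeans_1d(x, k)[2] for k in ks}
```

The resulting label sequence is what would be plotted as the cluster bar chart to locate stable segments.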
Step four: eye movement instrument original data extraction and processing
The following six eye movement indexes are extracted:
Pupil size: from the raw data with a sampling rate of 30 Hz, two sequences of left and right pupil diameter are selected at 10 Hz; each is squared, the two are added and averaged, and the result is scaled to give an average pupil size time series with units and corresponding timestamps.
Saccade speed and saccade count: the time series labeled Saccade is selected from the 30 Hz raw data, and the saccade angular velocity and its timestamps are extracted. Using the 10 Hz timestamps from the pupil-size calculation as reference points (i.e. a sliding step of 0.01 s) and a 5 s window, the mean saccade angular velocity and the number of saccades of the data within each window of the Saccade sequence are counted as the saccade speed and saccade count at that moment.
Fixation time: the time series labeled Fixation is selected from the 30 Hz raw data, its start and end times are extracted, and the fixation durations are computed. Using the 10 Hz timestamps from the pupil-size calculation (sliding step 0.01 s) and a 5 s window, the mean fixation duration of the data within each window of the Fixation sequence is counted as the fixation time at that moment.
Blink count and blink time: the time series labeled Blink is selected from the 30 Hz raw data, its start and end times are extracted, and the blink durations are computed. Using the same 10 Hz timestamps (sliding step 0.01 s) and a 5 s window, the mean blink duration and the number of blinks of the data within each window of the Blink sequence are counted as the blink time and blink count at that moment.
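The windowed statistics above share one pattern: for each reference timestamp, take the mean value and the count of events inside a 5 s window. A sketch, assuming a trailing window ending at each timestamp (the patent does not state the alignment):

```python
import numpy as np

def windowed_stats(event_times, event_values, stamps, window=5.0):
    """For each reference timestamp, mean value and count of events whose
    start time falls inside the trailing `window` seconds."""
    t = np.asarray(event_times, dtype=float)
    v = np.asarray(event_values, dtype=float)
    means, counts = [], []
    for s in stamps:
        m = (t >= s - window) & (t <= s)          # events inside the window
        counts.append(int(m.sum()))
        means.append(float(v[m].mean()) if m.any() else 0.0)
    return means, counts
```

Applied to the Saccade sequence this yields saccade speed and count; applied to Blink durations, blink time and count.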
Step five: construction of machine learning model
The classification data set is a seven-dimensional matrix: six eye movement feature indices at the input, and the four bad working states of fatigue, tension, boredom and irritability at the output, labeled 1, 2, 3 and 4 in turn. The time-series data set is an eight-dimensional matrix whose input is the fatigue state with time features plus the six eye movement feature indices, and whose output is the K-means cluster membership.
KNN: the optimal K value is first obtained by cross-validation. The classification data set is divided into k disjoint subsets; one subset serves as the validation set and the remaining k-1 as the training set; after training, the validation set yields a classification accuracy; k accuracies are obtained in turn and averaged. The K in 1-20 with the maximum mean classification accuracy is selected.
With the optimal K initialized, the dependency between the feature vectors x and target vector y of the classification data set is defined, and training and test sets are divided. Euclidean distances between the test set and the training set are computed and sorted by magnitude, and the most frequent class label among the first K training samples is taken as the test sample's class.
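A minimal NumPy sketch of the majority-vote prediction and the cross-validated K selection described above (the strided fold split is an assumption; the patent only specifies k disjoint subsets):

```python
import numpy as np

def knn_predict(train_x, train_y, query, k):
    """Label of the majority among the k nearest training samples (Euclidean)."""
    d = np.linalg.norm(train_x - query, axis=1)
    nearest = train_y[np.argsort(d)[:k]]
    labels, counts = np.unique(nearest, return_counts=True)
    return labels[counts.argmax()]

def cv_accuracy(x, y, k, folds=5):
    """Mean k-NN accuracy over `folds` disjoint validation subsets."""
    idx = np.arange(len(x))
    accs = []
    for f in range(folds):
        val = idx[f::folds]                      # every folds-th sample
        trn = np.setdiff1d(idx, val)
        hits = [knn_predict(x[trn], y[trn], x[i], k) == y[i] for i in val]
        accs.append(np.mean(hits))
    return float(np.mean(accs))

def best_k(x, y, k_range=range(1, 21), folds=5):
    """K in 1..20 with the highest cross-validated accuracy."""
    return max(k_range, key=lambda k: cv_accuracy(x, y, k, folds))
```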
ID3 decision tree: continuous indices are discretized by bisection.
The numerical set A of n continuous index values is sorted in ascending order to obtain a new set B for the attribute. Each pair of adjacent elements b_i, b_{i+1} is averaged in turn, giving a set C of (n-1) candidate split points. Each midpoint c in C divides B into two subsets: B+ (values above c) and B- (values below c). The information entropy Ent(B) of the set B is calculated, with u+ and u- denoting the proportions of the elements of B+ and B- in B.
For each midpoint c, the information gain Gain(B, c) = Ent(B) - u+ Ent(B+) - u- Ent(B-) is calculated; the point with the maximum information gain is selected as the split node, and the decision tree is constructed.
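The bisection discretization can be sketched as follows, assuming entropy over class labels and midpoint candidate thresholds as described (function names are illustrative):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a non-empty label list, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split(values, labels):
    """Best bisection threshold for a continuous attribute: candidates are
    midpoints of adjacent sorted values; return (threshold, info gain)."""
    pairs = sorted(zip(values, labels))
    vs = [v for v, _ in pairs]
    ls = [l for _, l in pairs]
    base = entropy(ls)
    best_t, best_gain = None, -1.0
    for i in range(len(vs) - 1):
        t = (vs[i] + vs[i + 1]) / 2
        lo = [l for v, l in pairs if v <= t]   # the B- side
        hi = [l for v, l in pairs if v > t]    # the B+ side
        if not lo or not hi:
            continue
        gain = (base
                - (len(lo) / len(ls)) * entropy(lo)
                - (len(hi) / len(ls)) * entropy(hi))
        if gain > best_gain:
            best_gain, best_t = gain, t
    return best_t, best_gain
```

Applying `best_split` recursively to each continuous eye movement index yields the ID3 tree's split nodes.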
Hidden Markov model:
The observation sequence O and initial parameters λ_0 are input: the eye feature information of the train dispatcher recorded by the eye tracker is taken as the observation sequence O of the hidden Markov model. The initial parameters are λ_0 = (π_0, A_0, B_0), and the number of hidden states N corresponds to the K-means clustering result above.
Training values of the state transition matrix and observation probability matrix are calculated. The initial observation probability matrix B_0 gives the probability of observation j in state i. The Baum-Welch algorithm trains on the input observation sequence O and initial parameters A_0, B_0, finally giving the trained matrices A* and B*, which are respectively the hidden state transition matrix and the observation probability matrix of the hidden Markov model for evaluating the safety behaviour of the railway traffic dispatcher.
The hidden Markov model of the railway traffic dispatcher's safety behaviour is trained under the three-class scheme, with the iteration count set to 10000 and the precision to 0.0001.
From the trained matrices A* and B* and the observation sequence O = (o_1, o_2, ..., o_T), the most likely hidden state sequence I corresponding to O is solved.
Initialization:
δ1(i) = πi·bi(o1), ψ1(i) = 0
Recursion: for t = 2, 3, ..., T,
δt(i) = max_{1≤j≤N} [δ_{t-1}(j)·a_{ji}]·bi(ot)
ψt(i) = argmax_{1≤j≤N} [δ_{t-1}(j)·a_{ji}]
Termination:
P* = max_{1≤i≤N} δT(i)
i*_T = argmax_{1≤i≤N} δT(i)
Backtracking: for t = T-1, T-2, ..., 1,
i*_t = ψ_{t+1}(i*_{t+1}),
which yields the most likely hidden state path I* = (i*_1, i*_2, ..., i*_T).
GRU model:
The GRU model is built with two inputs: the eye-movement index X_t at the current time and the hidden state S_{t-1} carrying the related information of the previous time. The outputs are the fatigue degree Y_t at the current time and the hidden state S_t carrying the related information of the current time.
The interior of the GRU is controlled by a reset gate r and an update gate z. The update gate controls how much of the previous time's fatigue information is retained: the weight matrix multiplies the previous hidden state and the current eye-movement index, and the result is passed through a sigmoid activation: z_t = sigmoid(W_z·[X_t, S_{t-1}]). The reset gate keeps the history information before the current time in the candidate data S′ from which the hidden state S_t is produced; its computation is analogous to the update gate: r_t = sigmoid(W_r·[X_t, S_{t-1}]).
After the gating step, the reset-gated data r_t is passed through a tanh activation: S′ = tanh(W_{s′}·[r_t * S_{t-1}, X_t]); at this point S′ holds the current eye-movement input and the partially retained hidden state.
The update-gate output z_t then performs the screening-and-retaining operation to obtain the hidden state of the current time:
S_t = (1 - z_t) * S_{t-1} + z_t * S′, where the first term selectively forgets part of S_{t-1} and the second term retains the eye-movement input information of the current time. Finally S_t is left-multiplied by a weight matrix and passed through a sigmoid to obtain the current fatigue state Y_t: Y_t = sigmoid(W_t·[S_t]).
In the technical scheme provided by the invention, HOG feature extraction and the SVM algorithm are fused for extracting the eye aspect ratio, improving recognition precision and efficiency; since the eye aspect ratio requires little computation and is affected by few factors, it can to a certain extent replace PERCLOS for quantifying fatigue. The K value of the eye aspect ratio is examined from different angles: the optimal cluster number is found by the SSE elbow method and the clustering effect is then verified by the silhouette coefficient method, which indirectly demonstrates the feasibility of the data under K-means clustering. Arranging the clusters in sequence into a clustered bar chart for analysis, a period with good clustering effect, from the 3301st to the 9864th time point, is located for subsequent data processing; this avoids eye-data interference from external factors, such as unfamiliarity with the equipment and environment, during the short periods when the dispatcher starts and finishes work, and eliminates regions where the clustering is disordered.
A key emotion-recognition step is added to emotion induction, objectively and quantitatively demonstrating the feasibility of the emotion-induction method; this ensures the validity of the subsequent experiments conducted under the induced emotions and prevents failed emotion induction from degrading the accuracy of the subsequent experimental data.
The overall accuracy of both KNN and the decision tree in identifying the railway dispatcher's working state exceeds 90%, proving the feasibility of the technique and filling the gap in emotion-recognition research on railway dispatchers. Although the overall difference between the two models is small, KNN is 2.2% more accurate than the decision tree in evaluating fatigue, while in evaluating boredom, irritability and tension it is more than 6% less accurate; the improved ID3 decision tree is therefore best overall, solves the problem that the ID3 decision tree is limited to discrete values, and has wider applicability. The improved ID3 decision tree model is accordingly used as the working-state identification model for the railway dispatcher.
The accuracy achieved on the time-series models by the data cleaned by K-means clustering differs from that of the raw data by more than 14.3%, proving the effectiveness of fusing a clustering algorithm with a time-series model; furthermore, the accuracy of the GRU model, 92.33%, is far higher than the HMM model's 79.46%, so the fusion of K-means clustering and the GRU algorithm is used as the dispatcher fatigue-degree prediction model.
Drawings
FIG. 1 is a schematic flow chart of the present invention;
FIG. 2 is a schematic diagram showing a specific flow of the first step;
FIG. 3 is a schematic flow chart of a face feature detection model;
FIG. 4 is an example railway traffic dispatcher work flow;
fig. 5 is a schematic flow chart of the third step.
Detailed Description
The specific technical scheme of the invention is described below in combination with the embodiments.
By analyzing the operation tasks and working characteristics of the railway dispatcher, the invention finds that the dispatcher's bad working states derive mainly from psychological and physiological influences, whose most obvious manifestations are the emotional state and the fatigue state. According to the work tasks, the emotional states are further divided into three emotions: boredom, tension and irritability. The invention thus establishes its design scope: a state identification model is built by acquiring eye data of the railway dispatcher in four working states, namely fatigue, tension, boredom and irritability.
The design scheme mainly comprises the following six steps.
As in the flow of fig. 1, first the preparation work in the early stage of the experiment is carried out; the scheduling task is then completed according to the requirements of the scheduling experiment, and the eye data required by the experiment are collected in real time.
A clustering algorithm is adopted to find the time segments with good classification effect; six eye-movement features in the frame data and event data of the eye tracker are extracted through the same sliding window to form the model's input data set, with labels 1, 2, 3 and 4 for the different working states as the output; these are put into the constructed machine-learning models for training and comparative analysis.
Step one: early preparation of experiment
Expression is an important means and objective index for studying and understanding people's true emotions, so the invention labels the emotional state by video recognition. Considering that the dispatcher's working environment is relatively enclosed, the LBPH algorithm is adopted to extract facial features and avoid the influence of illumination. Both PERCLOS and the eye aspect ratio can serve as important means of fatigue detection; since pupil-position detection is affected by the position of the eyes relative to the camera, the eyelash position, the eyelid image-processing result and so on, the invention selects the latter as the labeling index of the fatigue state.
As in fig. 2, comprising:
building emotion recognition model
Using the Python 3.6 toolkit on the PyCharm platform, the CK+ data set and other suitable expression sources are screened, and 500 facial-expression images with obvious features under the tension, irritability and boredom emotions in railway dispatchers' work are collected in advance to form the data set for emotion verification.
The invention adopts a classical weighting method to carry out image gray scale processing:
gray[i,j] = 0.299*imageR[i,j] + 0.587*imageG[i,j] + 0.114*imageB[i,j]
initializing sampling parameters, dividing each training set picture, and storing a sequence containing a plurality of face matrixes and corresponding labels by using a labels model.
The offset coordinates of the pixels on the x and y axes are calculated, and the gray values and LBP (0, 1) coding values between adjacent pixels of each region are calculated by bilinear interpolation, thereby obtaining the LBP coding map of the whole image.
Dimension reduction is performed to obtain the ULBP values; the coded image is cut into 8 x 8 regions, the sub-histogram of each region is obtained from the LBP and ULBP values, and the full-image LBPH map is then obtained. The classification result is obtained by comparing the LBPH maps of different face images.
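As a rough illustration of the LBP/LBPH pipeline described above, the sketch below computes basic 3×3 LBP codes and concatenates 8 x 8 regional histograms into one descriptor; the ULBP dimension-reduction step is omitted for brevity, and all function names and parameters are illustrative, not the invention's actual implementation.

```python
import numpy as np

def lbp_code(img, y, x):
    """Basic 3x3 LBP: threshold the 8 neighbours against the centre pixel,
    reading them clockwise from the top-left, to form an 8-bit code."""
    center = img[y, x]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dy, dx) in enumerate(offsets):
        if img[y + dy, x + dx] >= center:
            code |= 1 << bit
    return code

def lbph_descriptor(img, grid=(8, 8), bins=256):
    """Split the LBP map into grid regions, histogram each region,
    and concatenate the normalised sub-histograms into one LBPH vector."""
    h, w = img.shape
    lbp = np.zeros((h - 2, w - 2), dtype=np.uint8)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            lbp[y - 1, x - 1] = lbp_code(img, y, x)
    gh, gw = lbp.shape[0] // grid[0], lbp.shape[1] // grid[1]
    hists = []
    for gy in range(grid[0]):
        for gx in range(grid[1]):
            region = lbp[gy * gh:(gy + 1) * gh, gx * gw:(gx + 1) * gw]
            hist, _ = np.histogram(region, bins=bins, range=(0, bins))
            hists.append(hist / max(region.size, 1))
    return np.concatenate(hists)  # grid[0]*grid[1]*bins-dimensional vector
```

Two face images can then be compared by a histogram distance (e.g. chi-square) between their LBPH vectors, which is how the classification result is obtained.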
Building a face feature detection model
The flow is as follows in FIG. 3:
(1) Frame images are extracted from the video; using a for-loop, 30 face images are taken per second to align with the eye tracker's 30 frames/second sampling rate.
(2) The image is gray-processed, and the gradient magnitude and direction of each pixel are calculated; the image is divided into 6*6 blocks, pixels of the different regions in each block are extracted to obtain the regional gradient histograms, and all sub-graphs are combined to form the HOG gradient-histogram vector set of the whole image.
(3) The SVM kernel function is set to linear. The HOG feature vector is a one-dimensional matrix, so to meet the SVM training requirement all image feature sets are spliced downward into an n-dimensional matrix, which is put into the SVM classifier to judge whether the number of faces in each frame is the same.
(4) The 68-point feature classifier of the dlib library is called on each detected face for slice positioning; the detected face positions are looped over, the position information of the face is detected, and the (x, y) coordinate values of points 36-47 in the dictionary are traversed.
(5) The height and width values of both eyes are calculated, and the height-width ratio is computed:
left eye width = distance(points[36,:], points[39,:]);
right eye width = distance(points[42,:], points[45,:]);
left eye height = distance((points[37,:] + points[38,:])/2, (points[41,:] + points[40,:])/2);
right eye height = distance((points[43,:] + points[44,:])/2, (points[46,:] + points[47,:])/2).
(6) The results are written to Excel, and the experiment ends.
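The eye height and width distances listed in step (5) can be sketched as follows, using the dlib 0-indexed 68-landmark convention (points 36-41 for the left eye, 42-47 for the right); the `points` array here is a hypothetical (68, 2) landmark matrix, and the function name is illustrative.

```python
import numpy as np

def dist(p, q):
    """Euclidean distance between two landmark points."""
    return float(np.linalg.norm(p - q))

def eye_aspect_ratio(points):
    """points: (68, 2) array of dlib facial landmarks.
    Width: eye-corner to eye-corner; height: midpoint of the two upper
    lid points to the midpoint of the two lower lid points."""
    lw = dist(points[36], points[39])
    rw = dist(points[42], points[45])
    lh = dist((points[37] + points[38]) / 2, (points[41] + points[40]) / 2)
    rh = dist((points[43] + points[44]) / 2, (points[46] + points[47]) / 2)
    return (lh / lw + rh / rw) / 2  # mean height-width ratio of both eyes
```

Per frame, the per-eye ratios (or their mean, as here) can be appended to a row and written out to Excel.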
Step two: experimental design and data acquisition
The railway traffic dispatcher operation flow is shown in fig. 4.
The instrument recording eye-feature data during the experiment is an SMI ETG eye tracker. The experiment is carried out in the integrated dispatching command simulation laboratory of Southwest Jiaotong University; the railway dispatching desk is equipped with four displays, whose displayable contents include: section diagrams, station diagrams, train operation diagrams and the related dispatching command interfaces.
Sixteen trainees of a dispatcher training class were invited as experimental subjects. All participants were required to be in good health and emotionally stable, with binocular visual acuity above 5.0 and no framed glasses; their mean age was 33 years with a standard deviation of 5 years, and all had mastered railway dispatching operation skills.
While a trainee carries out the dispatching experiment, the eye tracker is worn and its parameters calibrated, and the trainee's eye-movement data are recorded in real time. Each experiment lasts 60 min in total. The fatigue-state experiment only requires monitoring work; in the other state groups, every 10 minutes the experimenters alternately issue a construction-maintenance command that must be released temporarily and an emergency-fault situation requiring a train to slow down and stop temporarily at the nearest station. Eye-movement data of the dispatcher are recorded in real time and video is captured throughout. The experimental steps are as follows:
(1) Performing state induction of 1, 2 and 3 groups and simultaneously performing video monitoring;
(2) Determining the emotion induction effect of the dispatcher according to video monitoring by utilizing a video recognition technology, and inviting the dispatcher with successful induction to carry out a dispatching experiment;
(3) The 4 th group performs scheduling operation and acquires face video information in real time;
(4) Eye movement indexes are collected, and eye movement data such as the fixation time, pupil size and average saccade speed during the dispatcher's work are analyzed;
(5) Analyzing the 4 th group of facial videos by using a built face feature detection model, extracting the eye height-width ratio, and carrying out cluster analysis to find the optimal fatigue state classification and section;
(6) And establishing a dispatcher state identification model based on machine learning, and judging the working state of the dispatcher.
Step three: k-means cluster analysis
The k-means clustering algorithm is adopted to cluster the eye-aspect-ratio data. Because a sleep-deprivation experiment was carried out in advance, the fourth group's data are taken by default as fatigue-degree data; however, to avoid interference from external factors, such as unfamiliarity with the equipment and environment, during the short periods when the dispatcher starts and finishes work, the chaotic segments are removed and the cutting time points are marked for the subsequent eye-movement-data research.
(1) As shown in fig. 5, the main idea of the elbow method is to calculate the sum of squared errors (SSE) of the samples. As the cluster number K increases from 1, the sum of squared distances from each sample point N of cluster Z to the cluster mean Y decreases, so the SSE falls and the degree of aggregation improves. While K approaches the optimum, each increase in K reduces the SSE steeply; once K passes the optimum, increasing K further improves the aggregation only slightly, and the decline of the SSE suddenly flattens and approaches stability. The whole scatter plot thus resembles a human elbow, and the elbow point is the optimal cluster number.
(2) After the optimal cluster number K is determined by the elbow method, the clustering result is verified by the silhouette coefficient method. The silhouette coefficient S(n) of sample n is calculated from A(n), the average distance of sample n to the other samples in its cluster Z, and B(n), the minimum average distance of sample n to the samples of the other clusters: S(n) = (B(n) - A(n)) / max(A(n), B(n)).
If S(n) is close to 1, the sample is reasonably clustered; if S(n) is close to -1, the sample should rather be assigned to another cluster; if S(n) is approximately 0, the sample lies on the boundary of two adjacent clusters. The larger the silhouette coefficient, therefore, the better.
(3) After the optimal cluster number is obtained, the cluster membership of each sample is determined by the K-means clustering algorithm: the sample matrix is input, K initial cluster centers are selected at random, and the set of observation points minimizing each cluster's sum of squared Euclidean distances is found. The centroid of the observation samples in each cluster is calculated as the new cluster center; these steps are repeated 10 times to obtain the final cluster centers, and the cluster membership of each observation sample is retained.
(4) The cluster-member sequence is extracted for cluster bar-graph analysis; a relatively stable classification segment is found, and the corresponding fatigue-group eye-movement data in that period are intercepted.
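A minimal sketch of the elbow and silhouette checks of steps (1)-(2), in plain numpy on synthetic aspect-ratio data; the deterministic quantile initialisation, the data values and all names are illustrative assumptions, not the invention's actual procedure.

```python
import numpy as np

def kmeans(x, k, n_iter=20):
    """Plain 1-D K-means; returns labels and the within-cluster
    sum of squared errors (SSE) used by the elbow method."""
    x = np.asarray(x, dtype=float)
    centers = np.quantile(x, (np.arange(k) + 0.5) / k)  # deterministic init
    for _ in range(n_iter):
        labels = np.abs(x[:, None] - centers[None, :]).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean()
    labels = np.abs(x[:, None] - centers[None, :]).argmin(axis=1)
    sse = float(((x - centers[labels]) ** 2).sum())
    return labels, sse

def silhouette(x, labels):
    """Mean silhouette coefficient S(n) = (B(n) - A(n)) / max(A(n), B(n))."""
    x = np.asarray(x, dtype=float)
    scores = []
    for i in range(len(x)):
        same = x[labels == labels[i]]
        a = np.abs(same - x[i]).sum() / max(len(same) - 1, 1)
        b = min(np.abs(x[labels == c] - x[i]).mean()
                for c in np.unique(labels) if c != labels[i])
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

# Synthetic eye-aspect-ratio stream with three hypothetical fatigue levels.
rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(m, 0.02, 200) for m in (0.15, 0.25, 0.35)])
sse_curve = {k: kmeans(data, k)[1] for k in range(1, 7)}  # elbow-plot input
labels3, _ = kmeans(data, 3)
score = silhouette(data, labels3)  # near 1 => reasonable clustering
```

Plotting `sse_curve` against k shows the drop flattening after the true cluster count, and `score` confirms the chosen k, mirroring the two-step check above.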
Step four: eye movement instrument original data extraction and processing
The following six eye movement indicators were extracted using Matlab 2017a software:
Pupil size: two time sequences of left and right pupil diameters at a 10 Hz sampling rate are selected from the 30 Hz raw data; they are respectively squared, added and averaged to obtain an average pupil-size time sequence in 0.1 s units with corresponding timestamps.
Saccade speed, saccade count: the time sequence labeled Saccade is selected from the 30 Hz raw data, and the saccade angular velocity and its timestamps are extracted. Taking the 10 Hz timestamps from the pupil-size calculation as reference points (i.e. a sliding step of 0.1 s) and 5 s as the window, the mean saccade angular velocity and the number of saccades of the data falling in the window are counted in the Saccade sequence as the saccade speed and saccade count at that moment.
Fixation time: the time sequence labeled Fixation is selected from the 30 Hz raw data, its start and end times are extracted, and the fixation duration is computed. Taking the 10 Hz timestamps as reference points (sliding step 0.1 s) and 5 s as the window, the mean fixation time of the data in the window is counted in the Fixation sequence as the fixation time at that moment.
Blink count, blink time: the time sequence labeled Blink is selected from the 30 Hz raw data, its start and end times are extracted, and the blink duration is computed. Taking the 10 Hz timestamps as reference points (sliding step 0.1 s) and 5 s as the window, the mean blink time and the number of blinks of the data in the window are counted in the Blink sequence as the blink time and blink count at that moment.
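The sliding-window statistic shared by the four extractions above (a 5 s window advanced along the 10 Hz timestamp grid) can be sketched as follows; the function name and event format are illustrative assumptions.

```python
import numpy as np

def windowed_mean(timestamps, values, step=0.1, window=5.0):
    """For each output timestamp t on a 10 Hz grid, average the event
    values whose own timestamps fall inside the 5 s window ending at t."""
    timestamps = np.asarray(timestamps, dtype=float)
    values = np.asarray(values, dtype=float)
    t_out = np.arange(timestamps.min() + window, timestamps.max() + step, step)
    feats = []
    for t in t_out:
        mask = (timestamps > t - window) & (timestamps <= t)
        feats.append(values[mask].mean() if mask.any() else np.nan)
    return t_out, np.array(feats)
```

Replacing the mean with a count of events in the window gives the saccade-count and blink-count variants in the same way.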
Step five: construction of machine learning model
The data set of the invention is now ready. The classification data set is a seven-dimensional matrix whose input is the six eye-movement feature indices and whose output is the four bad working states of fatigue, tension, boredom and irritability (labels 1, 2, 3 and 4 in turn). The time-series data set is an eight-dimensional matrix whose input is the fatigue-state data containing the time feature and the six eye-movement feature indices, and whose output is the K-means cluster membership.
KNN: first the optimal K value is obtained by cross-validation. The classification data set is divided into k disjoint subsets; one subset is chosen as the validation set and the remaining k-1 as the training sets; after training, the validation set is evaluated to obtain the classification accuracy. In turn, k accuracies are obtained and averaged, and among the candidate values 1-20 the K corresponding to the maximum classification accuracy is selected.
With the optimal K initialized, the dependency between the feature vectors x and the target vector y of the input classification data set is defined, and training and test sets are divided. The Euclidean distances between the test and training samples are computed and sorted; the classification label occurring most often among the first k training samples is taken as the classification of the test sample.
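The KNN procedure above can be sketched in plain numpy; the fold count, candidate range and all names are illustrative, and the synthetic data below stand in for the eye-movement feature matrix.

```python
import numpy as np

def knn_predict(train_x, train_y, test_x, k):
    """Euclidean-distance KNN: majority label among the k nearest neighbours."""
    preds = []
    for x in test_x:
        d = np.linalg.norm(train_x - x, axis=1)
        nearest = train_y[np.argsort(d)[:k]]
        vals, counts = np.unique(nearest, return_counts=True)
        preds.append(vals[counts.argmax()])
    return np.array(preds)

def cv_best_k(x, y, folds=5, k_range=range(1, 21), seed=0):
    """K-fold cross-validation: mean accuracy per candidate k,
    returning the k with the highest mean accuracy."""
    rng = np.random.default_rng(seed)
    splits = np.array_split(rng.permutation(len(x)), folds)
    scores = {}
    for k in k_range:
        accs = []
        for f in range(folds):
            test = splits[f]
            train = np.concatenate([splits[g] for g in range(folds) if g != f])
            accs.append(
                (knn_predict(x[train], y[train], x[test], k) == y[test]).mean())
        scores[k] = float(np.mean(accs))
    return max(scores, key=scores.get)
```

With the real data set, `x` would hold the six eye-movement indices and `y` the working-state labels 1-4.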
ID3 decision tree: the conventional ID3 decision tree algorithm can only process discrete numerical indices, but numerical indices such as pupil size and saccade speed in the present data, although discrete points, almost never repeat, so they should approximately be treated as continuous values. The invention adopts the dichotomy to discretize the continuous indices:
Sort the numerical value set A of the n continuous index values from small to large to obtain a new set B for the attribute. Take the mean of each pair of adjacent elements b_i, b_{i+1} in turn to obtain a candidate split set C with (n-1) elements, and for each split point c in C divide B into two new sets B+ and B- according to which side of c the values fall on. Compute the information entropy Ent(B) = -Σ_k p_k·log2(p_k) of set B, where p_k is the proportion of class k in B; U_{B+} and U_{B-} denote the ratios of the element counts of B+ and B- to the element count of B.
For each split point c, compute the information gain Gain(B, c) = Ent(B) - U_{B+}·Ent(B+) - U_{B-}·Ent(B-). Select the point with the maximum information gain as the split node and construct the decision tree.
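The bi-partition step above can be sketched as follows; a minimal example over one continuous attribute, with illustrative names.

```python
import math
from collections import Counter

def entropy(labels):
    """Ent(B) = -sum_k p_k * log2(p_k) over class proportions p_k."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split(values, labels):
    """Candidate thresholds are midpoints of adjacent sorted values (set C);
    pick the c maximising
    Gain(B, c) = Ent(B) - U_{B+}*Ent(B+) - U_{B-}*Ent(B-)."""
    pairs = sorted(zip(values, labels))
    vs = [v for v, _ in pairs]
    base = entropy([y for _, y in pairs])
    best_gain, best_c = -1.0, None
    for i in range(len(vs) - 1):
        c = (vs[i] + vs[i + 1]) / 2              # midpoint in set C
        left = [y for v, y in pairs if v <= c]   # B-
        right = [y for v, y in pairs if v > c]   # B+
        if not left or not right:
            continue
        gain = base - (len(left) / len(pairs)) * entropy(left) \
                    - (len(right) / len(pairs)) * entropy(right)
        if gain > best_gain:
            best_gain, best_c = gain, c
    return best_c, best_gain
```

Applied recursively per attribute, this yields the split nodes of the improved ID3 tree for continuous eye-movement indices.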
Hidden Markov model:
Input the observation state sequence O and determine the initial parameters λ0. The eye-feature information of the railway traffic dispatcher recorded by the eye tracker is taken as the observation state sequence O of the hidden Markov model. The initial parameters of the model are λ0 = (π0, A0, B0), where N denotes the number of hidden states given by the preceding K-means clustering result.
Compute the trained values of the state transition matrix and the observation probability matrix. The initial observation probability matrix B0 gives the probability of observation j when the hidden state is i. The Baum-Welch algorithm is run on the input observation sequence O with the initial parameters A0 and B0, finally yielding the trained values Â and B̂, which are respectively the hidden-state transition matrix and the observation probability matrix of the hidden Markov model for evaluating the safety behavior of the railway traffic dispatcher.
The Matlab 2017a software was used to train a hidden markov model of the railway traffic dispatcher's safety behavior in three classification modes. The iteration number is set to 10000 and the precision is 0.0001.
Using the trained state transition matrix Â, the trained observation probability matrix B̂ and the observed state sequence O = (o1, o2, ..., oT), solve for the most likely hidden state sequence I corresponding to O.
Initialization:
δ1(i) = πi·bi(o1), ψ1(i) = 0
Recursion: for t = 2, 3, ..., T,
δt(i) = max_{1≤j≤N} [δ_{t-1}(j)·a_{ji}]·bi(ot)
ψt(i) = argmax_{1≤j≤N} [δ_{t-1}(j)·a_{ji}]
Termination:
P* = max_{1≤i≤N} δT(i)
i*_T = argmax_{1≤i≤N} δT(i)
Backtracking: for t = T-1, T-2, ..., 1,
i*_t = ψ_{t+1}(i*_{t+1}),
which yields the most likely hidden state path I* = (i*_1, i*_2, ..., i*_T).
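The initialization-recursion-termination-backtracking procedure above is the Viterbi algorithm, sketched below in numpy; the small π, A, B matrices in the usage note are hypothetical stand-ins for the trained HMM parameters.

```python
import numpy as np

def viterbi(pi, A, B, obs):
    """Most likely hidden-state path for an observation sequence.
    pi: (N,) initial probs; A: (N, N) transitions a_{ji}; B: (N, M) emissions."""
    N, T = len(pi), len(obs)
    delta = np.zeros((T, N))
    psi = np.zeros((T, N), dtype=int)
    delta[0] = pi * B[:, obs[0]]                  # initialization
    for t in range(1, T):                         # recursion
        scores = delta[t - 1][:, None] * A        # scores[j, i] = delta*a_{ji}
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * B[:, obs[t]]
    path = np.zeros(T, dtype=int)                 # termination + backtracking
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = psi[t + 1][path[t + 1]]
    return path, float(delta[-1].max())
```

For example, with pi = (0.6, 0.4), A = [[0.7, 0.3], [0.4, 0.6]], B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]] and obs = (0, 1, 2), the most likely path is (0, 0, 1).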
GRU model:
The GRU model is built with two inputs: the eye-movement index X_t at the current time and the hidden state S_{t-1} carrying the related information of the previous time. The outputs are the fatigue degree Y_t at the current time and the hidden state S_t carrying the related information of the current time.
The interior of the GRU is controlled by a reset gate r and an update gate z. The update gate controls how much of the previous time's fatigue information is retained: the weight matrix multiplies the previous hidden state and the current eye-movement index, and the result is passed through a sigmoid activation: z_t = sigmoid(W_z·[X_t, S_{t-1}]). The reset gate keeps the history information before the current time in the candidate data S′ from which the hidden state S_t is produced; its computation is analogous to the update gate: r_t = sigmoid(W_r·[X_t, S_{t-1}]).
After the gating step, the reset-gated data r_t is passed through a tanh activation: S′ = tanh(W_{s′}·[r_t * S_{t-1}, X_t]); at this point S′ holds the current eye-movement input and the partially retained hidden state.
The invention then uses the update-gate output z_t to perform the screening-and-retaining operation, obtaining the hidden state of the current time: S_t = (1 - z_t) * S_{t-1} + z_t * S′, where the first term selectively forgets part of S_{t-1} and the second term retains the eye-movement input information of the current time. Finally S_t is left-multiplied by a weight matrix and passed through a sigmoid to obtain the current fatigue state Y_t: Y_t = sigmoid(W_t·[S_t]).
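One GRU step, exactly as the gate equations above describe it, can be sketched as follows; the weight matrices here are hypothetical untrained parameters, and the dimensions (six eye-movement indices, hidden size 4) are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, s_prev, W_z, W_r, W_s, W_y):
    """One GRU step. Concatenated input [x_t, s_prev]; shapes:
    W_z, W_r, W_s: (H, D + H); W_y: (1, H), with D = len(x_t), H = len(s_prev)."""
    concat = np.concatenate([x_t, s_prev])
    z = sigmoid(W_z @ concat)                                   # update gate
    r = sigmoid(W_r @ concat)                                   # reset gate
    s_cand = np.tanh(W_s @ np.concatenate([r * s_prev, x_t]))   # candidate S'
    s_t = (1 - z) * s_prev + z * s_cand   # forget part of S_{t-1}, keep input
    y_t = sigmoid(W_y @ s_t)              # current fatigue-degree output
    return y_t, s_t
```

Unrolling `gru_step` over the 10 Hz eye-movement feature sequence, with the weights learned by backpropagation, gives the fatigue-degree prediction Y_t at each step.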

Claims (1)

1. The method for identifying the bad state of the railway dispatcher based on the eye features is characterized by comprising the following steps of:
firstly, preparing in the early stage of an experiment, then completing a dispatching task according to the requirement of dispatching the experiment, and collecting eye data required by the experiment in real time;
A clustering algorithm is adopted to find out time slices with good classification effect, six eye movement characteristics in frame data and event data of an eye movement instrument are extracted through setting the same sliding window to form a model input end data set, labels are respectively used as output ends according to different working states, and a built machine learning model is put into the model input end data set for training comparison analysis;
the method specifically comprises the following steps:
step one: early preparation of experiment
Building emotion recognition model
In the work of a railway dispatcher, facial expression pictures with obvious characteristics under tension, irritability and boredom emotion are collected in the field to form a data set for the emotion verification;
performing image gray scale processing by adopting a classical weighting method:
initializing sampling parameters, dividing each training set picture, and storing a sequence containing a plurality of face matrixes and corresponding labels by using a labels model;
calculating offset coordinates of pixels on x and y axes, and calculating gray values and LBP (0, 1) coding values between adjacent pixels of each region by using a bilinear interpolation method, so as to obtain an LBP coding diagram of the full diagram;
performing dimension reduction processing to obtain ULBP values, cutting the coded image into 8 x 8 areas, obtaining a sub-histogram of each area by using the LBP values and the ULBP values, and then obtaining a full-image LBPH map; obtaining a classification result by comparing LBPH images among different face images;
Building a face feature detection model
(1) Extracting frame images of the video, and setting for 1 second to obtain 30 face images for alignment with a sampling rate of 30 frames/second of the eye tracker by using a for-loop function;
(2) Carrying out gray processing on the image, calculating the size and the direction of each pixel, dividing the image into 6*6 blocks, extracting pixels with different areas of each block to obtain an area gradient histogram, and combining all subgraphs to form an HOG gradient histogram vector set of the whole image;
(3) Setting the SVM kernel function as linear; the HOG feature vector is a one-dimensional matrix, and all image feature sets are spliced downward to form an n-dimensional matrix to meet the SVM training requirement; putting it into an SVM classifier to judge whether the number of faces in each frame is the same;
(4) The 68 feature classifier of dlib library is called to the detected face for slicing and positioning, the detected face position is circulated, the face position information is detected, and 37-48 bits (x, y) coordinate values in the dictionary are traversed;
(5) Calculating the height and width values of the double eyes, and calculating the height and width ratio;
(6) Writing excel, and ending the experiment;
step two: experimental design and data acquisition
In the experimental process, eye movement data of a dispatcher are recorded in real time and real-time video acquisition is carried out, and the experimental steps are as follows:
(1) Performing state induction of 1, 2 and 3 groups and simultaneously performing video monitoring;
(2) Determining the emotion induction effect of the dispatcher according to video monitoring by utilizing a video recognition technology, and inviting the dispatcher with successful induction to carry out a dispatching experiment;
(3) Carrying out scheduling operation on the 4 th group, and collecting face video information in real time;
(4) Collecting eye movement indexes, and analyzing the eye movement data of the fixation time, pupil size and average saccade speed of the dispatcher in the working process;
(5) Analyzing the 4 th group of facial videos by using a built face feature detection model, extracting the eye height-width ratio, and carrying out cluster analysis to find the optimal fatigue state classification and section;
(6) Establishing a dispatcher state identification model based on machine learning, and judging the working state of a dispatcher;
step three: k-means cluster analysis
Clustering the data of the eye aspect ratio by adopting a k-means clustering algorithm, removing chaotic sections, and marking cutting time points;
(1) Obtaining an optimal clustering number by adopting an elbow method;
(2) Determining an optimal clustering number K by using an elbow method, and verifying a clustering result by using a contour coefficient method;
(3) Determining cluster members of each sample through a K-means clustering algorithm after obtaining the optimal cluster number, inputting a sample matrix, randomly selecting K initial cluster centers, and finding an observation point set which enables the Euclidean distance square sum of each cluster to be minimum; calculating the mass center of the observation sample in each cluster as a new cluster center;
Repeating the steps for 10 times to obtain a final clustering center and reserving clustering members of each observation sample;
(4) Extracting a cluster member sequence for cluster column diagram analysis, finding a more stable classification segment, and intercepting the corresponding fatigue group eye movement data in the time period;
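The assignment/update loop of step (3) can be sketched as a minimal 1-D K-means over eye-aspect-ratio values. Assumptions in this sketch: initial centers are spread evenly over the data range (the patent picks them at random; an even spread keeps the example reproducible), and the EAR samples are hypothetical. The `inertia` helper computes the within-cluster sum of squares that the elbow method plots against K.

```python
def kmeans_1d(values, k, iters=10):
    """Minimal 1-D K-means: alternate assignment and centroid update."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * j / (k - 1) for j in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # assignment step: each sample joins the nearest center (squared distance)
        labels = [min(range(k), key=lambda j: (v - centers[j]) ** 2) for v in values]
        # update step: each center moves to the mean (centroid) of its members
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:
                centers[j] = sum(members) / len(members)
    return centers, labels

def inertia(values, centers, labels):
    # within-cluster sum of squared distances -- the elbow-method quantity
    return sum((v - centers[lab]) ** 2 for v, lab in zip(values, labels))

# hypothetical EAR samples: a low (drowsy) group and a high (alert) group
ears = [0.12, 0.10, 0.11, 0.31, 0.33, 0.30, 0.32]
centers, labels = kmeans_1d(ears, k=2)
```

Running `inertia` for K = 1, 2, 3, ... and looking for the bend in the curve reproduces the elbow selection of step (1).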
step four: eye movement instrument original data extraction and processing
The following six eye movement indexes are extracted:
pupil size: selecting two rows of time sequences of left pupil diameter and right pupil diameter with the sampling rate of 10hz from the original data with the sampling rate of 30hz, respectively squaring, adding and averaging to obtain an average pupil size time sequence taking 0.1 second as a unit and a corresponding time stamp;
glance speed, glance times: selecting a time sequence with a label of Saccade from the original data with a sampling rate of 30hz, extracting a Saccade angular velocity and a time stamp thereof, and taking the time stamp of 10hz in pupil size calculation as an object and taking 5s as a window, respectively counting a Saccade angular velocity average value and a Saccade frequency of the data in the time window in the Saccade sequence, and taking the Saccade angular velocity average value and the Saccade frequency as the Saccade velocity and the Saccade frequency at the moment;
gaze time: selecting a time sequence with a label of fix from original data with a sampling rate of 30hz, extracting a starting time and an ending time of the time sequence, calculating a Fixation time length, taking a time stamp of 10hz in pupil size calculation as an object, taking 5s as a window, and respectively counting Fixation time average values of data in the window in the fix sequence as Fixation time at the moment;
Number of blinks, blink time: selecting a time sequence with a label of Blink from original data with a sampling rate of 30hz, extracting a starting time and an ending time of the time sequence, calculating a blinking time length, taking a time stamp of 10hz in pupil size calculation as an object, taking 5s as a window, and respectively counting the blinking time average value and the blinking times of data in the window in the Blink sequence as the blinking time and the blinking times of the moment;
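The 5 s windowing applied above to the Saccade, Fixation and Blink sequences can be sketched as follows. Two assumptions are made here: the window trails each 10 Hz reference timestamp (the patent does not state whether the window is trailing or centered), and events arrive as `(event_time, value)` pairs where `value` is the quantity being averaged (angular velocity, fixation duration, or blink duration).

```python
def window_stats(timestamps, events, window=5.0):
    """For each reference timestamp, average the value of events whose time
    falls within the trailing `window` seconds, and count those events."""
    means, counts = [], []
    for t in timestamps:
        hits = [v for (et, v) in events if t - window <= et <= t]
        counts.append(len(hits))
        means.append(sum(hits) / len(hits) if hits else 0.0)
    return means, counts

# hypothetical saccade events: (time in seconds, angular velocity in deg/s)
saccades = [(1.0, 100.0), (4.0, 200.0), (10.0, 300.0)]
speeds, counts = window_stats([5.0], saccades)  # one 10 Hz reference point at t=5 s
```

At t = 5 s only the first two saccades fall inside the trailing window, so the pair (mean velocity, count) summarizes exactly what the patent stores for that moment.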
step five: construction of machine learning model
The classified data set is a seven-dimensional matrix with six eye movement characteristic indexes at the input end and four bad working states of fatigue, tension, boredom and irritability at the output end, and the seven-dimensional matrix is labeled 1, 2, 3 and 4 in sequence; the time sequence data set is an eight-dimensional matrix with an input end being in a fatigue state and containing time characteristics and six eye movement characteristic indexes, and an output end being a K-means cluster member;
KNN: firstly, obtaining an optimal K value by adopting a cross verification method, dividing a classification data set into K disjoint subsets, arbitrarily selecting one subset as a verification set and the rest K-1 subsets as training sets, verifying the verification set after training is finished to obtain classification accuracy, sequentially obtaining K times of classification accuracy and carrying out mean value processing; screening K, namely obtaining a K value corresponding to the maximum classification accuracy of 1-20;
Initializing an optimal K value, defining the dependency relationship of the characteristic vector x and the target vector y of the input classified data set, and dividing a training set and a testing set; calculating Euclidean distance between the test set and the training set, sorting according to the size relation, and taking the classification label with the largest occurrence number corresponding to the first k sample data as the classification of the test set;
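The distance-sort-and-vote classification described above can be sketched as a minimal KNN (the cross-validated search for k over 1-20 is omitted; the training samples below are hypothetical eye-movement feature points, not experimental data).

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k):
    """Classify x by majority vote among its k nearest training samples
    under Euclidean distance."""
    # sort all training samples by distance to the query point
    dists = sorted((math.dist(x, tx), ty) for tx, ty in zip(train_X, train_y))
    # vote among the k nearest labels
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# hypothetical 2-D feature points with state labels 1 (fatigue) and 4 (irritability)
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = [1, 1, 1, 4, 4, 4]
```

A query near one cluster inherits that cluster's state label, which is exactly the "label with the largest occurrence among the first k samples" rule in the text.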
GRU model:
The GRU model has two inputs: the eye movement index X_t at the current time and the hidden state S_{t-1} carrying the relevant information of the previous time; it outputs the fatigue degree Y_t at the current time and a hidden state S_t carrying the relevant information of the current time.
The interior of the GRU is controlled by a reset gate r and an update gate z. The update gate controls how much of the fatigue information of the previous time is retained: a weight matrix is multiplied with the previous hidden state and the current eye movement index, and the result is passed through a sigmoid activation: z_t = sigmoid(W_z [X_t, S_{t-1}]); the reset gate keeps the history information before the current time for the hidden state S_t in the intermediate data S', and is computed in the same way: r_t = sigmoid(W_r [X_t, S_{t-1}]);
After gating, the reset-gate data r_t enter a tanh activation: S' = tanh(W_s' [r_t * S_{t-1}, X_t]); at this point S' already holds the current eye movement input and part of the hidden state;
The update-gate data z_t then perform the screening and retention operation to obtain the hidden state of the current time S_t: S_t = (1 - z_t) * S_{t-1} + z_t * S'; the first half of the sum selectively forgets S_{t-1}, and the second half retains the current eye movement input information; finally, S_t is multiplied by a weight matrix and passed through a sigmoid function to obtain the current fatigue state Y_t: Y_t = sigmoid(W_t [S_t]).
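The gate equations above can be sketched as one GRU step in NumPy. Assumptions in this sketch: the weight shapes, the absence of bias terms (matching the formulas as written), the hidden size of 4, and the name `Wy` for the output weight matrix are all illustrative choices, not values from the patent.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, s_prev, Wz, Wr, Ws, Wy):
    xs = np.concatenate([x_t, s_prev])
    # update gate z_t = sigmoid(W_z [X_t, S_{t-1}]): how much past state to keep
    z = sigmoid(Wz @ xs)
    # reset gate r_t = sigmoid(W_r [X_t, S_{t-1}]): how much past state to expose
    r = sigmoid(Wr @ xs)
    # candidate state S' = tanh(W_s' [r_t * S_{t-1}, X_t])
    s_cand = np.tanh(Ws @ np.concatenate([r * s_prev, x_t]))
    # S_t = (1 - z_t) * S_{t-1} + z_t * S': forget part of the old state, keep the new
    s_t = (1.0 - z) * s_prev + z * s_cand
    # fatigue output Y_t = sigmoid(Wy S_t)
    y_t = sigmoid(Wy @ s_t)
    return y_t, s_t

rng = np.random.default_rng(0)
hidden, feats = 4, 6  # six eye movement feature indexes per time step
Wz = rng.standard_normal((hidden, feats + hidden))
Wr = rng.standard_normal((hidden, feats + hidden))
Ws = rng.standard_normal((hidden, feats + hidden))
Wy = rng.standard_normal((1, hidden))
y_t, s_t = gru_step(np.zeros(feats), np.zeros(hidden), Wz, Wr, Ws, Wy)
```

With zero input and zero initial state, every gate sees a zero pre-activation, so S_t stays at zero and the fatigue output is exactly sigmoid(0) = 0.5, which is a quick sanity check on the wiring.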
CN202111521459.2A 2021-12-13 2021-12-13 Eye feature-based bad state identification method for railway dispatcher Active CN114170588B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111521459.2A CN114170588B (en) 2021-12-13 2021-12-13 Eye feature-based bad state identification method for railway dispatcher

Publications (2)

Publication Number Publication Date
CN114170588A CN114170588A (en) 2022-03-11
CN114170588B true CN114170588B (en) 2023-09-12

Family

ID=80486149

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111521459.2A Active CN114170588B (en) 2021-12-13 2021-12-13 Eye feature-based bad state identification method for railway dispatcher

Country Status (1)

Country Link
CN (1) CN114170588B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115844404B (en) * 2023-03-01 2023-05-12 中国民航大学 Eye movement data-based controller attention characteristic evaluation method and device

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101859383A (en) * 2010-06-08 2010-10-13 河海大学 Hyperspectral remote sensing image band selection method based on time sequence important point analysis
CN106251583A (en) * 2016-09-30 2016-12-21 防城港市港口区高创信息技术有限公司 Fatigue driving discrimination method based on driving behavior Yu eye movement characteristics
CN106446849A (en) * 2016-09-30 2017-02-22 防城港市港口区高创信息技术有限公司 Fatigue driving detection method
CN110008965A (en) * 2019-04-02 2019-07-12 杭州嘉楠耘智信息科技有限公司 Target identification method and identification system
CN112450947A (en) * 2020-11-20 2021-03-09 杭州电子科技大学 Dynamic brain network analysis method for emotional arousal degree
CN112508275A (en) * 2020-12-07 2021-03-16 国网湖南省电力有限公司 Power distribution network line load prediction method and equipment based on clustering and trend indexes
CN112733772A (en) * 2021-01-18 2021-04-30 浙江大学 Real-time cognitive load and fatigue degree detection method and system in storage sorting task
CN113705360A (en) * 2021-08-03 2021-11-26 重庆工业职业技术学院 Reminding method and reminding device for teaching assistance
CN113743471A (en) * 2021-08-05 2021-12-03 暨南大学 Driving evaluation method and system
CN113762034A (en) * 2021-04-21 2021-12-07 腾讯科技(深圳)有限公司 Video classification method and device, storage medium and electronic equipment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11468288B2 (en) * 2020-07-28 2022-10-11 Oken Technologies, Inc. Method of and system for evaluating consumption of visual information displayed to a user by analyzing user's eye tracking and bioresponse data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Gao Jianping et al. Estimation of the driving range of electric vehicles by fusing vehicle, road and driver information. China Mechanical Engineering. 2018, Vol. 29, No. 15, pp. 1854-1862. *

Similar Documents

Publication Publication Date Title
WO2021174618A1 (en) Training method for electroencephalography mode classification model, classification method and system
CN107224291B (en) Dispatcher capability test system
Li et al. Automated detection of cognitive engagement to inform the art of staying engaged in problem-solving
Joshi et al. In-the-wild drowsiness detection from facial expressions
CN113743471B (en) Driving evaluation method and system
CN114170588B (en) Eye feature-based bad state identification method for railway dispatcher
Wei et al. Driver's mental workload classification using physiological, traffic flow and environmental factors
Yang et al. Real-time driver cognitive workload recognition: Attention-enabled learning with multimodal information fusion
Liu et al. A novel fatigue driving state recognition and warning method based on EEG and EOG signals
CN112861879A (en) Cognitive disorder assistant decision support method and system based on dual migration
Wang et al. A real-time driver fatigue identification method based on GA-GRNN
Wang et al. A fatigue driving detection method based on deep learning and image processing
CN114120296B (en) Method and device for quantitatively grading fatigue degree of high-speed railway dispatcher
CN112274154B (en) Cross-subject fatigue driving classification method based on electroencephalogram sample weight adjustment
Bundele et al. An svm classifier for fatigue-detection using skin conductance for use in the bits-lifeguard wearable computing system
Anumas et al. Driver fatigue monitoring system using video face images & physiological information
Fu et al. Detecting drowsiness in driving simulation based on EEG
CN114298469A (en) User experience test evaluation method for intelligent cabin of automobile
Vardhan et al. Driver’s drowsiness detection based on facial multi-feature fusion
Novak et al. School children dyslexia analysis using self organizing maps
Kim et al. A study on user recognition using 2D ECG image based on ensemble networks for intelligent vehicles
Richardson et al. COACH-Cumulative Online Algorithm for Classification of Handwriting Deficiencies.
Rusydi et al. Facial Features Extraction Based on Distance and Area of Points for Expression Recognition
CN112274119B (en) Pulse wave model prediction method based on neural network
Nuralif et al. Driver Fatigue Detection Based On Face Mesh Features Using Deep Learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant