CN111933275A - Depression evaluation system based on eye movement and facial expression - Google Patents

Depression evaluation system based on eye movement and facial expression

Info

Publication number
CN111933275A
CN111933275A (application CN202010692613.1A)
Authority
CN
China
Prior art keywords
eye movement
expression
module
testee
eye
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010692613.1A
Other languages
Chinese (zh)
Other versions
CN111933275B (en)
Inventor
胡斌 (Hu Bin)
杨民强 (Yang Minqiang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lanzhou University
Original Assignee
Lanzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lanzhou University
Priority to CN202010692613.1A
Publication of CN111933275A
Application granted
Publication of CN111933275B
Legal status: Active
Anticipated expiration

Classifications

    • G16H 50/20: ICT specially adapted for medical diagnosis, medical simulation or medical data mining, for computer-aided diagnosis, e.g. based on medical expert systems
    • G06F 18/2135: Feature extraction, e.g. by transforming the feature space; summarisation; mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
    • G06F 18/24323: Tree-organised classifiers
    • G06F 18/253: Fusion techniques of extracted features
    • G06N 3/045: Neural networks; combinations of networks
    • G06N 3/08: Neural networks; learning methods
    • G06V 10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
    • G06V 10/48: Extraction of image or video features by mapping characteristic values of the pattern into a parameter space, e.g. Hough transformation
    • G06V 40/168: Human faces; feature extraction; face representation
    • G06V 40/174: Facial expression recognition
    • G06V 40/193: Eye characteristics, e.g. of the iris; preprocessing; feature extraction

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Human Computer Interaction (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Public Health (AREA)
  • Databases & Information Systems (AREA)
  • Ophthalmology & Optometry (AREA)
  • Epidemiology (AREA)
  • Pathology (AREA)
  • Primary Health Care (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a depression evaluation system based on eye movement and facial expression, which realizes an objective, quantitative and non-invasive means of assessing depression by extracting and analyzing effective eye movement and facial expression features. The system comprises an emotion stimulation module, an expression acquisition module, an eye movement acquisition module, an eye movement feature extraction module, an expression feature extraction module, a machine learning classification module and an automatic evaluation module. The expression acquisition module acquires expression information while the subject views the different emotional stimulation pictures output by the emotion stimulation module; the eye movement acquisition module acquires eye movement information of the subject under the same stimulation; the eye movement feature extraction module extracts eye movement features from the acquired eye image information, and the expression feature extraction module extracts expression features from the acquired expression image information; the machine learning classification module performs feature fusion and machine learning classification; and the automatic evaluation module evaluates the subject's degree of depression according to the classification result.

Description

Depression evaluation system based on eye movement and facial expression
Technical Field
The invention relates to the technical field of computer-aided early depression detection, in particular to a depression assessment system based on eye movement and facial expression.
Background
Depression is a common psychological disorder affecting about 350 million people worldwide, and the World Health Organization (WHO) predicted that by 2020 depression would become the second leading cause of disease burden globally, second only to heart disease. However, the diagnosis and efficacy evaluation of depression still rely mainly on subjective methods such as family history, patient self-report and clinical rating scales, and lack objective measurement methods and tools. As a result, early affective disorders are difficult to identify and patients often miss the optimal window for treatment.
In the biomedical field, spontaneous behaviors such as eye movements and facial expressions have been found to be closely linked to psychological state, and in particular to depressive symptoms. Studies have shown that patients with depressive disorders respond differently from non-depressed individuals to stimuli of different emotional valence, and that these responses, such as eye movements and expressions, are largely subconscious. Compared with traditional assessment methods, such physiological indicators are more objective, and non-invasive devices make the data convenient to acquire and easy to operate.
Disclosure of Invention
To address the problems in the prior art, the invention provides a depression assessment system based on eye movement and facial expression. By extracting effective eye movement and expression features and performing fusion analysis on them, the association between eye movement, expression and depressive disorder is established, realizing an objective, quantitative and non-invasive means of depression assessment.
The technical scheme of the invention is as follows:
1. a depression assessment system based on eye movement and facial expression is characterized by comprising an emotion stimulation module, an expression acquisition module, an eye movement feature extraction module, an expression feature extraction module, a machine learning classification module and an automatic assessment module; the expression acquisition module is used for acquiring expression information of a testee when watching different emotional stimulation pictures output by the emotional stimulation module; the eye movement acquisition module is used for acquiring eye movement information of a testee when watching different emotional stimulation pictures output by the emotional stimulation module; the eye movement characteristic extraction module extracts eye movement characteristics from the obtained eye movement image information, and the expression characteristic extraction module extracts expression characteristics from the obtained expression image information; the machine learning classification module performs feature fusion and machine learning classification; and the automatic evaluation module evaluates the depression degree of the testee according to the machine learning classification result.
2. The eye movement acquisition module comprises a foreground camera and eye movement cameras. The foreground camera is arranged over the middle of the subject's forehead to capture the subject's field of view, with a resolution of 1080p and a sampling rate of 30 fps. The eye movement cameras are arranged near the left and right cheeks to capture pupil images of the left and right eyes; a high frame rate is required, with a resolution of 120x120 and a sampling rate of 200 fps.
3. The eye movement acquisition module comprises a spectacle frame that holds the foreground camera and the eye movement cameras. The frame is 3D printed from polyurethane and comprises a foreground camera support, a lengthened nose support and eye movement camera supports. The foreground camera support sits above the eyebrows, with the foreground camera fixed at its centre, and rests on the nose through the lengthened nose support. The eye movement camera supports are connected to the left and right sides of the foreground camera support, and an arc-shaped structure at each joint allows the temples of ordinary spectacles to pass through. An eye movement camera is fixed at the tip of each eye movement camera support; each support is rotated outward by a certain angle so that the camera does not occlude the cheek and shoots the pupil obliquely upward, and the supports can be extended and rotated to fit subjects with different face shapes.
4. The expression acquisition module comprises an expression acquisition camera placed at a suitable position in front of the subject to capture the subject's complete facial area; a Logitech C1000e is used, with a resolution of 4096x2160 and a sampling rate of 60 fps.
5. The emotion stimulation module comprises picture materials that give the subject positive, neutral and negative emotional stimulation respectively, and audio materials with the same emotional valence as the picture stimuli.
6. The eye movement feature extraction module extracts eye movement features from the acquired pupil images: a Canny edge detection operator and a Hough circle detection algorithm extract the pupil radius and pupil centre coordinates in each image, from which the eye movement trajectory and pupil size changes are calculated. The eye movement data are first denoised with Gaussian filtering, then the pupil centre and size are obtained with Canny edge detection and Hough circle detection, and the subject's gaze region and gaze duration features are calculated at the same time.
7. The expression feature extraction module extracts TOFS features from the acquired expression images: based on MDMO optical-flow-field features with an added sliding window, the video stream of the whole sequence is cut into several video segments and expression features are extracted to obtain a 41-dimensional feature vector. For the expression data, a CNN convolutional neural network first locates the face region, then 66 facial feature points and 36 regions of interest (ROIs) are computed, and finally the TOFS features are calculated.
8. The machine learning classification module includes a feature fusion step: the two groups of feature vectors, the extracted eye movement features and expression features, undergo multi-modal parallel feature fusion; they are combined into a complex vector space via complex vectors, and principal component analysis is then used to reduce the dimensionality of that space.
9. The machine learning classification module includes a classifier training step: the collected eye movement and expression data are labelled according to whether the subject is a depressed patient; the data and labels are then used as training data, a decision tree performs the classification computation, and a classifier model is built and trained.
10. The automatic evaluation module includes an automatic evaluation step: eye movement and expression data of a subject whose depression status is unknown are collected, features are extracted and fused, and the result is input to the trained classifier model; the classifier automatically evaluates whether a depressive tendency exists based on the input features and outputs a depression classification, the result being either normal or depressed.
The invention has the technical effects that:
The depression evaluation system based on eye movement and facial expression extracts effective eye movement and expression features, performs fusion analysis on them, and establishes the association between eye movement, expression and depressive disorder, thereby realizing an objective, quantitative and non-invasive means of depression assessment.
The invention improves the hardware so that important facial regions are not occluded when eye movement and expression data are acquired, and the spectacle frame structure is improved in consideration of the many subjects who wear glasses. In addition, the classification algorithm combines eye movement and expression data processing, feature computation and extraction, and multi-modal analysis, and a classification model is obtained through a machine learning classification algorithm. The trained model is applied in an automatic depression evaluation system with a visualized, easy-to-operate detection process, and the subject's degree of depression is evaluated with a wearable, non-invasive method, providing early assessment.
Drawings
FIG. 1 is a general framework schematic of the system of the present invention.
Fig. 2a is a structural perspective view of a 3D printing frame according to an embodiment of the present invention.
Fig. 2b is a front view of a 3D printing frame of an embodiment of the present invention.
Fig. 2c is a side view of a 3D printing frame of an embodiment of the present invention.
Fig. 2D is a top view of a 3D printing frame of an embodiment of the present invention.
Fig. 3a is a front photograph of the 3D printed frame of an embodiment of the invention.
Fig. 3b is a side photograph of the 3D printed frame of an embodiment of the invention.
Fig. 4 is a schematic system flow diagram of the present invention.
Fig. 5 is a schematic flow chart of extracting the TOFS feature.
1-foreground camera support, 2-lengthened nose support, 3-foreground camera, 4-eye movement camera, 5-arc structure, 6-eye movement camera support
Detailed Description
Embodiments of the present invention will be described in further detail below with reference to the accompanying drawings.
FIG. 1 is a general framework schematic of the system of the present invention. The depression assessment system based on eye movement and facial expression comprises an emotion stimulation module, an expression acquisition module, an eye movement acquisition module, an eye movement feature extraction module, an expression feature extraction module, a machine learning classification module and an automatic assessment module. The expression acquisition module acquires expression information of the subject while viewing the different emotional stimulation pictures output by the emotion stimulation module; the eye movement acquisition module acquires eye movement information of the subject under the same stimulation; the eye movement feature extraction module extracts eye movement features from the acquired eye image information, and the expression feature extraction module extracts expression features from the acquired expression image information; the machine learning classification module performs feature fusion and machine learning classification; and the automatic assessment module evaluates the subject's degree of depression according to the classification result.
The eye movement acquisition module comprises a foreground camera and eye movement cameras. The foreground camera is arranged over the middle of the subject's forehead to capture the subject's field of view, with a resolution of 1080p and a sampling rate of 30 fps. The eye movement cameras are arranged near the left and right cheeks to capture pupil images of the left and right eyes; a high frame rate is required, with a resolution of 120x120 and a sampling rate of 200 fps. The expression acquisition module comprises an expression acquisition camera placed at a suitable position in front of the subject to capture the complete facial area; a Logitech C1000e is used, with a resolution of 4096x2160 and a sampling rate of 60 fps.
The embodiment of the invention uses a 3D-printed spectacle frame to mount the foreground camera and eye movement cameras. Fig. 2a, 2b, 2c and 2d are, respectively, a perspective view, a front view, a side view and a top view of the 3D-printed frame of an embodiment of the present invention. The frame is 3D printed from polyurethane and comprises a foreground camera support 1, a lengthened nose support 2 and eye movement camera supports 6. The foreground camera support 1 sits above the eyebrows, with the foreground camera 3 fixed at its centre, and rests on the nose through the lengthened nose support 2. The eye movement camera supports 6 are connected to the left and right sides of the foreground camera support 1, and an arc-shaped structure 5 at each joint allows the temples of ordinary spectacles to pass through. An eye movement camera 4 is fixed at the tip of each eye movement camera support; the supports are rotated outward by a certain angle so that the cameras do not occlude the cheeks and shoot the pupils obliquely upward, and the supports can be extended and rotated to fit subjects with different face shapes.
Fig. 3a and 3b are front and side photographs of the 3D-printed frame of an embodiment of the present invention. The design of the frame is mainly improved in two respects: it does not occlude the facial expression regions of interest, and it accommodates myopic wearers. The frame body is made of polyurethane, which keeps it thin while retaining good toughness. Because the acquired data include both eye movement and expression, the frame must occlude the face as little as possible. Experiments show that occlusion of the forehead and nose has little influence on expression acquisition, while key regions such as the eyebrows, eyes and mouth should be left unobstructed as far as possible. The frame is therefore raised as a whole so that the front bar sits above the eyebrows, the nose supports are lengthened and concentrated in front of the nose, and a tough, small-profile material is used to reduce occlusion elsewhere. In addition, for the growing number of subjects who wear glasses for myopia, the frame is adapted by suitably raising the height of the temples, and an arc-shaped notch is added where the camera supports on both sides pass near the head, so that the temples of myopia glasses can pass through and spectacle wearers can use the device conveniently. Furthermore, in this embodiment the eye movement camera supports are rotated outward by 15 degrees, so that the pupil cameras do not occlude the cheeks; the camera orientation is chosen so that the pupils are shot obliquely upward while leaving the face unobstructed, guaranteeing the collection of pupil images.
Fig. 4 is a schematic system flow diagram of the present invention. Before the experiment starts, the experimental environment is made comfortable and relatively quiet to reduce external interference to the subject, and noise sources are eliminated. Once the experimental scene meets the requirements, synchronous eye movement and expression acquisition begins. During acquisition the subject is shown pictures of different emotional valence, supplemented with audio stimulation of the same valence; the stimulation pictures are divided into positive, neutral and negative. After the raw eye movement and expression data are obtained, the data are first preprocessed, then eye movement and expression features are extracted and fused at the feature level to obtain effective features. A decision tree machine learning classification algorithm is used for classification, and a classifier model is built and trained so that its predictions fit the true labels and a good accuracy is guaranteed in the actual evaluation task. The automatic evaluation system based on the trained classifier then evaluates the depression degree of a subject: a subject with unknown depression status undergoes the same data acquisition and feature extraction process, the computed features are input to the automatic evaluation system, and the classifier outputs a depression classification from the input features, the result being either normal or depressed.
The emotion stimulation module is designed as follows. 1) Nine-point calibration: yellow markers are displayed in turn at nine points of the screen (centre, up, down, left, right, upper left, lower right, lower left and upper right), and the subject fixates on them for pupil positioning calibration. 2) Continuous picture stimulation: the picture material comes from the International Affective Picture System (IAPS) and comprises pictures that give the subject positive, neutral and negative emotional stimulation respectively, together with audio material of the same emotional valence as the picture stimuli. The experimental paradigm shows neutral, positive, neutral and negative pictures in sequence, repeated twice; each group contains 5 pictures of the same valence, each picture is shown for 5 s, and there is a 5 s interval between groups.
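As a concrete illustration of this paradigm, the following Python sketch builds the presentation schedule described above (nine-point calibration excluded); the event representation and the example output are illustrative assumptions rather than the module's actual implementation.

```python
# Paradigm from the description: groups of neutral / positive / neutral / negative
# pictures, repeated twice; each group holds 5 same-valence pictures shown 5 s
# each, with a 5 s interval between groups.
GROUP_ORDER = ["neutral", "positive", "neutral", "negative"] * 2
PICTURES_PER_GROUP = 5
PICTURE_SECONDS = 5
GAP_SECONDS = 5

def build_schedule():
    """Return a list of (valence, picture_index, duration_s) events, gaps included."""
    events = []
    for group_no, valence in enumerate(GROUP_ORDER):
        for idx in range(PICTURES_PER_GROUP):
            events.append((valence, idx, PICTURE_SECONDS))
        if group_no < len(GROUP_ORDER) - 1:          # no gap after the last group
            events.append(("blank", None, GAP_SECONDS))
    return events

if __name__ == "__main__":
    schedule = build_schedule()
    total = sum(duration for _, _, duration in schedule)
    print(f"{len(schedule)} events, {total} s total")   # 40 pictures + 7 gaps = 235 s
```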
Once data acquisition is ready to begin, the screen starts to play the video stimulation paradigm, and the eye movement acquisition module and expression acquisition module simultaneously start recording the subject's eye images and facial images while the different emotional stimulation pictures are viewed. Both modules record image data; after the stimulation paradigm finishes, the recordings are saved in video format. The eye movement and expression data are also labelled according to whether the subject is a depressed patient, and the data together with this label are used as training data. Whether a subject is a depressed patient is determined by a hospital physician using conventional diagnostic methods such as clinical interview.
The eye movement feature extraction module extracts eye movement features from the acquired pupil images: a Canny edge detection operator and a Hough circle detection algorithm extract the pupil radius and pupil centre coordinates in each image, from which the eye movement trajectory and pupil size changes are calculated. The eye movement data are first denoised with Gaussian filtering, then the pupil centre and size are obtained with Canny edge detection and Hough circle detection, and the subject's gaze region and gaze duration features are calculated at the same time. The expression feature extraction module extracts TOFS features from the acquired expression images: based on optical-flow-field features with an added sliding window, the video stream of the whole sequence is cut into several video segments and expression features are extracted to obtain a 41-dimensional feature vector. For the expression data, a CNN convolutional neural network first locates the face region, then 66 facial feature points and 36 regions of interest (ROIs) are computed, and finally the TOFS features are calculated.
The module extracts the pupil radius and pupil centre coordinates from the images with a Canny edge detection operator and Hough circle detection, and from these calculates the eye movement trajectory and pupil size changes. The detailed steps are as follows:
The first step: the Canny edge detection operator
(1) Gaussian filtering
The Canny edge detection algorithm is sensitive to noise, so the image is first smoothed to reduce the influence of noise on the edge detection result: the image is convolved with a Gaussian filter. The generating equation of a Gaussian filter kernel of size (2k+1) x (2k+1) is:
H(i, j) = (1 / (2πσ²)) · exp( -[(i - k - 1)² + (j - k - 1)²] / (2σ²) ),  1 ≤ i, j ≤ 2k + 1
(2) calculating gradient strength and direction
The most important characteristic of an edge is a strong change in grey value, and this change is described by the gradient. A pixel has 8 neighbours, so gradients exist in the horizontal, vertical and two diagonal directions; the Canny algorithm therefore uses four operators to detect horizontal, vertical and diagonal edges in the image. The operators compute the gradient by image convolution: two templates are convolved with the original image to obtain difference maps along the x and y axes, from which the gradient magnitude G and direction θ at each point are calculated as follows:
G = sqrt(Gx² + Gy²)
θ = arctan(Gy / Gx)
(3) non-maximum suppression
After the gradient computation of the previous step, the edge extracted from the raw gradient values is still blurred. An edge-thinning step, non-maximum suppression, is therefore applied: the gradient strength of the current pixel is compared with the two pixels along the positive and negative gradient directions. If the current pixel has the largest gradient strength of the three, it is kept as an edge point; otherwise it is suppressed. This identifies the actual edges of the image more accurately.
(4) Dual threshold detection
Although non-maximum suppression detects the actual edges more accurately, noise and colour variation still affect the result. To remove these spurious responses, edge pixels with weak gradient values are filtered out and those with high gradient values are kept: pixels whose gradient value exceeds the high threshold are marked as strong edge pixels, pixels below the low threshold are suppressed, and pixels in between are marked as weak edge pixels to be resolved in the next step.
(5) Suppression of isolated low threshold points
Strong edge pixels detected in the previous step are accepted as edges, whereas a weak edge pixel may be either a true edge or the result of noise or colour variation. To filter out noise while preserving true edges, each weak edge pixel and its 8-neighbourhood are examined: if at least one neighbour is a strong edge pixel, the weak edge pixel is kept as a true edge.
The second step: Hough circle detection
The Hough transform detects a curve by exploiting the duality between points on the curve and the parameters of the curve. It is widely used to detect analytically defined curves in grey-scale images, in particular straight lines, circles and parabolas.
When a circle is present in an image, its boundary must belong to the edges of the image. In the x-y coordinate system the general equation of a circle is:
(x - a)² + (y - b)² = r²
Converting to the a-b parameter space, this can be written as (a - x)² + (b - y)² = r². A point on the circle boundary in the x-y coordinate system therefore corresponds to a circle in the a-b space. The circle boundary contains countless points, giving countless circles in a-b space, all at the same distance r from the centre (a, b); these circles intersect at one point, which is the circle centre (a, b). The number of circles passing through each point of the a-b accumulator is counted, each local maximum yields a circle centre, and the radius r is obtained from the circles intersecting there.
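The two steps above map directly onto standard OpenCV primitives. The sketch below is a minimal illustration, not the patent's implementation: OpenCV's gradient Hough transform runs the Canny stage internally (param1 is the upper Canny threshold), and all thresholds and radius bounds are assumed values that would need tuning to the 120x120 eye images.

```python
import cv2
import numpy as np

def detect_pupil(gray_frame):
    """Estimate the pupil centre (a, b) and radius r from one eye-camera frame.

    Pipeline: Gaussian denoising with a (2k+1)x(2k+1) kernel, then Hough circle
    voting; cv2.HoughCircles with HOUGH_GRADIENT performs the Canny edge step
    internally (param1 = upper Canny threshold, lower threshold = param1 / 2).
    """
    blurred = cv2.GaussianBlur(gray_frame, (5, 5), 1.5)
    circles = cv2.HoughCircles(
        blurred, cv2.HOUGH_GRADIENT, dp=1, minDist=60,
        param1=80, param2=20, minRadius=5, maxRadius=45)
    if circles is None:
        return None
    a, b, r = np.round(circles[0, 0]).astype(int)   # strongest accumulator peak
    return (a, b), r
```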
From the change of the pupil centre point, the fixation points, the first target point of interest and the fixation time are calculated as eye movement feature values, while the pupil radius serves as a direct indicator of pupil dilation and constriction. The eye movement trajectory and pupil changes together form the eye movement features.
A sliding window is applied over the whole video sequence, and the change in pupil radius and pupil centre coordinates between consecutive frames is computed:
dx_i = |x_i - x_{i+1}|  (i = 0, 1, 2, ...)
dy_i = |y_i - y_{i+1}|  (i = 0, 1, 2, ...)
dr_i = |r_i - r_{i+1}|  (i = 0, 1, 2, ...)
If dr_i is smaller than the set threshold, the eye movement between the two frames is judged to be a fixation; if the following dr_{i+1} is still below the threshold, the fixation state is considered to continue, until some dr_n exceeds the threshold, at which point the fixation duration is counted. Finally the first fixation time t_f and the mean fixation time are calculated:
t̄_f = (1/N) Σ t_f,j   (average over the N fixation segments)
dx, dy and dr are then averaged over the sequence in the same way:
d̄x = (1/n) Σ dx_i,   d̄y = (1/n) Σ dy_i,   d̄r = (1/n) Σ dr_i
finally yielding a five-dimensional eye movement feature vector composed of the fixation-time features together with d̄x, d̄y and d̄r.
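A possible realisation of this feature computation is sketched below. The fixation rule follows the description (consecutive dr_i below a threshold), while the threshold value and the exact ordering of the five components are assumptions.

```python
import numpy as np

def eye_movement_features(xs, ys, rs, fps=200, dr_threshold=1.5):
    """Frame-to-frame differences and fixation statistics from pupil tracks.

    xs, ys, rs: per-frame pupil centre coordinates and radius from the Hough
    detection step.  dr_threshold (pixels) is an assumed value.  Returns a
    five-dimensional vector laid out here, for illustration, as
    (mean fixation time, mean dx, mean dy, mean dr, first fixation time).
    """
    xs, ys, rs = map(np.asarray, (xs, ys, rs))
    dx = np.abs(np.diff(xs))
    dy = np.abs(np.diff(ys))
    dr = np.abs(np.diff(rs))

    # Group consecutive frames with dr_i below the threshold into fixations
    # and convert their lengths from frames to seconds.
    fixation_lengths, run = [], 0
    for below in dr < dr_threshold:
        if below:
            run += 1
        elif run:
            fixation_lengths.append(run)
            run = 0
    if run:
        fixation_lengths.append(run)
    fixation_times = np.array(fixation_lengths, dtype=float) / fps

    t_first = fixation_times[0] if fixation_times.size else 0.0
    t_mean = fixation_times.mean() if fixation_times.size else 0.0
    return np.array([t_mean, dx.mean(), dy.mean(), dr.mean(), t_first])
```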
The expression feature extraction extracts TOFS features from the acquired expression images: based on optical-flow-field features and a sliding window, expression features are extracted over the whole sequence to obtain a 41-dimensional feature vector. The detailed steps are as follows:
Fig. 5 is a schematic flow chart of TOFS feature extraction. The expression feature extraction module comprises CNN face-region recognition, computation of 68 facial key points and 36 regions of interest, and TOFS feature calculation. The TOFS feature is based on the MDMO (Main Directional Mean Optical-flow) feature combined with a sliding-window feature selection step. Optical-flow features achieve good accuracy for expression recognition on short video sequences, but the video sequences in this experiment are long; to guarantee the accuracy and robustness of the recognition algorithm, a sliding window is added on top of the feature extraction to extract feature vectors per unit time, from which the 41-dimensional TOFS feature is obtained.
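The face-region and key-point step can be sketched as follows. The patent does not name a toolkit, so dlib's CNN face detector and 68-point shape predictor are used here as stand-ins, and the model file paths are placeholders; the optical-flow step is sketched separately after the MDMO description below.

```python
import cv2
import dlib
import numpy as np

# Assumed model files (published dlib models; the paths are placeholders).
face_detector = dlib.cnn_face_detection_model_v1("mmod_human_face_detector.dat")
landmark_model = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def face_landmarks(bgr_frame):
    """Return the 68 facial key points of the largest detected face, or None.

    The 36 ROIs used by the MDMO-style step would then be polygons built from
    these landmarks; that partition is not reproduced here.
    """
    rgb = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2RGB)
    detections = face_detector(rgb, 1)
    if not detections:
        return None
    rect = max(detections, key=lambda d: d.rect.area()).rect
    shape = landmark_model(rgb, rect)
    return np.array([(p.x, p.y) for p in shape.parts()])   # shape (68, 2)
```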
The MDMO feature is based on optical flow, and the input can be a video segment or an image sequence. For an image sequence (f_1, f_2, ..., f_m), the feature builds on the Facial Action Coding System: 68 facial key points divide the face area of each frame into 36 regions of interest (ROIs), and the optical flow between frames is computed. For each frame f_i (i > 1) and each ROI R_i^k (k = 1, 2, ..., 36), the optical flow vectors are divided into 8 direction bins; the bin containing the most vectors is selected, and the main direction of the optical flow is the mean of all the optical flow vectors in that bin. Each optical flow vector is expressed in polar coordinates (ρ_i, θ_i), where ρ_i and θ_i are its magnitude and direction. To eliminate the influence of the differing frame counts of different video segments, the feature is normalized over the number of frames to obtain the final characteristics:
(formula: the frame-number-normalized MDMO feature ψ̄, obtained by averaging the per-frame ROI features (ρ_i^k, θ_i^k) over the m - 1 frame pairs)
The resulting 72-dimensional feature is denoted:
(formula: the 72-dimensional MDMO feature vector with a weighting parameter α)
where α is an adjustable parameter; following the experimental results reported for the optical flow feature, its value is set to 0.9.
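The per-ROI main-direction averaging can be illustrated as below. Farneback dense optical flow is used purely as a convenient stand-in for whichever flow estimator the module actually uses, and the ROI masks are assumed to have been built from the 68 landmarks beforehand.

```python
import cv2
import numpy as np

def main_direction_flow(prev_gray, cur_gray, roi_masks, bins=8):
    """Per-ROI main-direction mean optical flow between two consecutive frames.

    roi_masks: list of boolean masks, one per ROI (e.g. 36 of them).  For each
    ROI the flow vectors are binned into 8 directions, the most populated bin
    is kept, and its vectors are averaged; the result is (rho, theta) per ROI.
    """
    flow = cv2.calcOpticalFlowFarneback(prev_gray, cur_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    fx, fy = flow[..., 0], flow[..., 1]
    rho = np.hypot(fx, fy)
    theta = np.mod(np.arctan2(fy, fx), 2 * np.pi)

    features = []
    for mask in roi_masks:
        r, t = rho[mask], theta[mask]
        if r.size == 0:
            features.append((0.0, 0.0))
            continue
        bin_idx = np.minimum((t / (2 * np.pi / bins)).astype(int), bins - 1)
        main_bin = np.bincount(bin_idx, minlength=bins).argmax()
        sel = bin_idx == main_bin
        # Mean vector of the dominant-direction bin, reported in polar form.
        mean_fx, mean_fy = fx[mask][sel].mean(), fy[mask][sel].mean()
        features.append((np.hypot(mean_fx, mean_fy),
                         np.mod(np.arctan2(mean_fy, mean_fx), 2 * np.pi)))
    return np.array(features)   # shape (len(roi_masks), 2): 72 values for 36 ROIs
```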
Time-frequency-domain statistics over a sliding window: each video segment collected in the experiment lasts about 2 minutes and records the subject's emotional changes during the test. Experiments showed that evaluating the optical flow features over the entire video dilutes the salience of the emotional changes to some extent, and the subject's facial expression changes are not very pronounced in parts of the recording; a sliding-window algorithm is therefore used to locate key video segments and extract the optical flow features more effectively. For an image sequence (f_1, f_2, ..., f_m) and a sliding window of n frames, the image subsequence obtained with the sliding window is γ, with frame count frame_γ; the sliding window can be described as:
(formula: definition of the sliding-window subsequence γ of length frame_γ)
where the relation between i and n is:
(formula: constraint relating the window index i to the window length n)
The sliding-window algorithm yields the optical flow changes of the subject in each short time period. To make better use of this information, the time-frequency-domain statistics of the sliding windows are extracted: mean μ, variance s, standard deviation σ, skewness γ1 and kurtosis Kr. For each subject's sequence of sliding-window optical flow features (x_1, x_2, ..., x_n), the time-frequency-domain statistics are:
μ = (1/n) Σ x_i
s = (1/n) Σ (x_i - μ)²
σ = sqrt(s)
Skewness measures the direction and degree of asymmetry of a statistical distribution; it is used here to measure the symmetry of the optical flow features over the whole process:
γ1 = κ3 / κ2^(3/2) = E[(x - μ)³] / ( E[(x - μ)²] )^(3/2)
where κ2 and κ3 are the second and third central moments, respectively, and E denotes the expectation (averaging) operation.
Kurtosis is the normalized fourth-order central moment; the kurtosis of the sliding-window data is extracted to measure the distribution of the optical flow:
Kr = E[(x - μ)⁴] / σ⁴
These statistics are combined with the optical flow features to form the final 41-dimensional TOFS feature:
(formula: concatenation of the optical flow features and the five sliding-window statistics into the 41-dimensional TOFS vector)
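The sliding-window statistics can be computed as below. The window length, step and the choice of per-frame signal (mean flow magnitude) are assumptions, and the concatenation into the full 41 dimensions is left out because the exact layout is not spelled out above.

```python
import numpy as np
from scipy.stats import skew, kurtosis

def window_statistics(flow_magnitudes, window, step):
    """Statistics over sliding windows of a per-frame optical-flow signal.

    flow_magnitudes: 1-D array, e.g. the mean flow magnitude per frame.
    Returns the five statistics named in the description (mean, variance,
    standard deviation, skewness, kurtosis) computed over the per-window means.
    """
    x = np.asarray(flow_magnitudes, dtype=float)
    window_means = np.array([x[i:i + window].mean()
                             for i in range(0, len(x) - window + 1, step)])
    return np.array([window_means.mean(),
                     window_means.var(),
                     window_means.std(),
                     skew(window_means),
                     kurtosis(window_means, fisher=False)])  # fourth moment / sigma^4
```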
the machine learning classification module comprises the steps of feature fusion and machine learning building and training a classifier. The feature fusion is to obtain two groups of feature vectors related to eye movement and expression features from the above steps, combine the two groups of feature vectors into a complex vector space through a complex vector by using a parallel feature fusion mode, and then perform dimensionality reduction on the vector space by using Principal Component Analysis (PCA), and comprises the following steps:
(1) arrange the raw data by columns into a matrix X with n rows (feature dimensions) and m columns (samples);
(2) zero-mean each row of X, i.e. subtract the mean of that row;
(3) compute the covariance matrix;
(4) compute the eigenvalues of the covariance matrix and the corresponding eigenvectors;
(5) arrange the eigenvectors as rows, from top to bottom in decreasing order of the corresponding eigenvalues, and take the first k rows to form the matrix P; Y = PX is then the data reduced to k dimensions.
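A minimal sketch of the parallel (complex-vector) fusion followed by the five PCA steps, assuming the two feature sets are zero-padded to a common length before being combined as real and imaginary parts:

```python
import numpy as np

def parallel_fuse_and_reduce(eye_feats, expr_feats, k):
    """Parallel feature fusion via complex vectors, then PCA to k dimensions.

    eye_feats:  (n_samples, d1) eye movement features
    expr_feats: (n_samples, d2) expression features
    """
    n = eye_feats.shape[0]
    d = max(eye_feats.shape[1], expr_feats.shape[1])
    a = np.zeros((n, d))
    b = np.zeros((n, d))
    a[:, :eye_feats.shape[1]] = eye_feats
    b[:, :expr_feats.shape[1]] = expr_feats
    z = a + 1j * b                                   # combined complex vector space

    X = z.T                                          # (1) features as rows, samples as columns
    X = X - X.mean(axis=1, keepdims=True)            # (2) zero-mean each row
    C = X @ X.conj().T / X.shape[1]                  # (3) covariance matrix (Hermitian)
    eigvals, eigvecs = np.linalg.eigh(C)             # (4) eigenvalues and eigenvectors
    P = eigvecs[:, np.argsort(eigvals)[::-1][:k]].conj().T   # (5) top-k eigenvectors as rows
    return (P @ X).T                                 # reduced data, (n_samples, k), complex
```

Downstream classifiers expect real-valued inputs, so in practice the magnitude (or stacked real and imaginary parts) of the reduced complex vectors would be taken; that choice is not specified above.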
Classifier training: the collected eye movement and expression data are labelled according to whether the subject is a depressed patient; the data and labels are used as training data, the extracted effective feature matrix is classified with a decision tree, and a classifier model is built and trained to obtain a classification model with high accuracy.
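A corresponding training sketch with scikit-learn's decision tree; the hold-out split, stratification and tree depth are illustrative choices, not values given in the description.

```python
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

def train_depression_classifier(fused_features, labels):
    """Train the decision-tree classifier on fused features.

    fused_features: (n_samples, k) real-valued array (e.g. magnitudes of the
    PCA-reduced complex vectors); labels: 1 = clinically diagnosed depression,
    0 = control.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        fused_features, labels, test_size=0.2, stratify=labels, random_state=0)
    clf = DecisionTreeClassifier(max_depth=5, random_state=0)
    clf.fit(X_train, y_train)
    print("hold-out accuracy:", accuracy_score(y_test, clf.predict(X_test)))
    return clf
```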
The automatic evaluation module applies the classifier model obtained by the machine learning classification module. For a subject without a clinical diagnosis, eye movement and expression data are collected and features are extracted and fused in the same way as for the training data; the resulting effective features are input to the trained classifier, which evaluates the depression degree and outputs the classification result: normal or depressed.
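The evaluation step then reduces to applying the same feature pipeline and the trained model to a new subject, for example:

```python
import numpy as np

def evaluate_subject(eye_vec, expr_vec, fusion_fn, clf):
    """Run the automatic evaluation for one new subject.

    eye_vec / expr_vec: feature vectors extracted with the same pipeline as the
    training data; fusion_fn applies the already-fitted fusion/PCA transform
    (refitting it per subject would not match training).  Returns the two-class
    result described above: 'normal' or 'depressed'.
    """
    fused = fusion_fn(eye_vec.reshape(1, -1), expr_vec.reshape(1, -1))
    label = clf.predict(np.abs(fused))[0]
    return "depressed" if label == 1 else "normal"
```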

Claims (10)

1. A depression assessment system based on eye movement and facial expression, characterized by comprising an emotion stimulation module, an expression acquisition module, an eye movement acquisition module, an eye movement feature extraction module, an expression feature extraction module, a machine learning classification module and an automatic assessment module; wherein the expression acquisition module acquires expression information of the subject while viewing the different emotional stimulation pictures output by the emotion stimulation module; the eye movement acquisition module acquires eye movement information of the subject under the same stimulation; the eye movement feature extraction module extracts eye movement features from the acquired eye image information, and the expression feature extraction module extracts expression features from the acquired expression image information; the machine learning classification module performs feature fusion and machine learning classification; and the automatic assessment module evaluates the subject's degree of depression according to the machine learning classification result.
2. The depression assessment system based on eye movement and facial expression according to claim 1, wherein the eye movement acquisition module comprises a foreground camera and eye movement cameras; the foreground camera is arranged over the middle of the subject's forehead to capture the subject's field of view, with a resolution of 1080p and a sampling rate of 30 fps; the eye movement cameras are arranged near the left and right cheeks to capture pupil images of the left and right eyes, a high frame rate being required, with a resolution of 120x120 and a sampling rate of 200 fps.
3. The depression assessment system based on eye movement and facial expression according to claim 2, wherein the eye movement acquisition module comprises a spectacle frame holding the foreground camera and the eye movement cameras; the frame is 3D printed from polyurethane and comprises a foreground camera support, a lengthened nose support and eye movement camera supports; the foreground camera support sits above the eyebrows, with the foreground camera fixed at its centre, and rests on the nose through the lengthened nose support; the eye movement camera supports are connected to the left and right sides of the foreground camera support, an arc-shaped structure at each joint allowing the temples of ordinary spectacles to pass through; an eye movement camera is fixed at the tip of each eye movement camera support, each support being rotated outward by a certain angle so that the camera does not occlude the cheek and shoots the pupil obliquely upward, and the supports can be extended and rotated to fit subjects with different face shapes.
4. The depression assessment system based on eye movement and facial expression according to claim 1, wherein the expression acquisition module comprises an expression acquisition camera placed at a suitable position in front of the subject to capture the subject's complete facial area, a Logitech C1000e being used, with a resolution of 4096x2160 and a sampling rate of 60 fps.
5. The depression assessment system based on eye movement and facial expression according to claim 1, wherein the emotion stimulation module comprises picture materials that give the subject positive, neutral and negative emotional stimulation respectively, and audio materials with the same emotional valence as the picture stimuli.
6. The depression assessment system based on eye movement and facial expression according to any one of claims 1 to 5, wherein the eye movement feature extraction module extracts eye movement features from the acquired pupil images: a Canny edge detection operator and a Hough circle detection algorithm extract the pupil radius and pupil centre coordinates in each image, from which the eye movement trajectory and pupil size changes are calculated; the eye movement data are first denoised with Gaussian filtering, then the pupil centre and size are obtained with Canny edge detection and Hough circle detection, and the subject's gaze region and gaze duration features are calculated at the same time.
7. The depression assessment system based on eye movement and facial expression according to claim 6, wherein the expression feature extraction module extracts TOFS features from the acquired expression images: based on MDMO optical-flow-field features with an added sliding window, the video stream of the whole sequence is cut into several video segments and expression features are extracted to obtain a 41-dimensional feature vector; for the expression data, a CNN convolutional neural network first locates the face region, then 66 facial feature points and 36 regions of interest (ROIs) are computed, and finally the TOFS features are calculated.
8. The depression assessment system based on eye movement and facial expression according to claim 7, wherein the machine learning classification module comprises a feature fusion step: the two groups of feature vectors, the extracted eye movement features and expression features, undergo multi-modal parallel feature fusion, are combined into a complex vector space via complex vectors, and Principal Component Analysis (PCA) is then used to reduce the dimensionality of that space.
9. The depression assessment system based on eye movement and facial expression according to claim 8, wherein the machine learning classification module comprises a classifier training step: the collected eye movement and expression data are labelled according to whether the subject is a depressed patient, the data and labels are then used as training data, a decision tree performs the classification computation, and a classifier model is built and trained.
10. The depression assessment system based on eye movement and facial expression according to claim 9, wherein the automatic evaluation module comprises an automatic evaluation step: eye movement and expression data of a subject whose depression status is unknown are collected, features are extracted and fused, and the result is input to the trained classifier model; the classifier automatically evaluates whether a depressive tendency exists based on the input features and outputs a depression classification, the result being normal or depressed.
CN202010692613.1A 2020-07-17 2020-07-17 Depression evaluation system based on eye movement and facial expression Active CN111933275B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010692613.1A CN111933275B (en) 2020-07-17 2020-07-17 Depression evaluation system based on eye movement and facial expression

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010692613.1A CN111933275B (en) 2020-07-17 2020-07-17 Depression evaluation system based on eye movement and facial expression

Publications (2)

Publication Number Publication Date
CN111933275A true CN111933275A (en) 2020-11-13
CN111933275B CN111933275B (en) 2023-07-28

Family

ID=73313299

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010692613.1A Active CN111933275B (en) 2020-07-17 2020-07-17 Depression evaluation system based on eye movement and facial expression

Country Status (1)

Country Link
CN (1) CN111933275B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107212851A (en) * 2017-07-28 2017-09-29 温州市人民医院 A kind of wireless eye tracker
CN107609516A (en) * 2017-09-13 2018-01-19 重庆爱威视科技有限公司 Adaptive eye moves method for tracing
CN107773248A (en) * 2017-09-30 2018-03-09 优视眼动科技(北京)有限公司 Eye tracker and image processing method
CN109157231A (en) * 2018-10-24 2019-01-08 阿呆科技(北京)有限公司 Portable multi-channel Depression trend assessment system based on emotional distress task

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
YONG-JIN LIU等: "A Main Directional Mean Optical Flow Feature for Spontaneous Micro-Expression Recognition", 《IEEE TRANSACTIONS ON AFFECTIVE COMPUTING》, vol. 7, no. 4, 1 October 2015 (2015-10-01), pages 3 *
YONG-JIN LIU等: "A Main Directional Mean Optical Flow Feature for Spontaneous Micro-Expression Recognition", 《IEEE TRANSACTIONS ON AFFECTIVE COMPUTING》, vol. 7, no. 4, pages 299 - 310, XP011635112, DOI: 10.1109/TAFFC.2015.2485205 *
ZHOU XIAORONG: "Research on Gaze Tracking Algorithms Based on a Head-Mounted Eye Tracker", 《China Master's and Doctoral Dissertations Full-text Database (Master), Information Science and Technology》, no. 2, 15 February 2020 (2020-02-15), pages 2 *
ZHAO SHENGJIE: "Research on Real-Time Depression Monitoring Methods Based on EEG and Convolutional Neural Networks", 《China Master's Theses Full-text Database, Information Science and Technology》, no. 12, pages 140 - 178 *
HUANG CHENGWEI et al.: "Multimodal Emotion Recognition Based on Speech and ECG Signals", 《Journal of Southeast University (Natural Science Edition)》, vol. 40, no. 5, 20 September 2010 (2010-09-20), pages 3 - 4 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112614583A (en) * 2020-11-25 2021-04-06 平安医疗健康管理股份有限公司 Depression grade testing system
CN112603320A (en) * 2021-01-07 2021-04-06 岭南师范学院 Optical nondestructive special children detector based on facial expression analysis and detection method
CN113052064A (en) * 2021-03-23 2021-06-29 北京思图场景数据科技服务有限公司 Attention detection method based on face orientation, facial expression and pupil tracking
CN113052064B (en) * 2021-03-23 2024-04-02 北京思图场景数据科技服务有限公司 Attention detection method based on face orientation, facial expression and pupil tracking
CN113413154A (en) * 2021-05-14 2021-09-21 兰州大学 Wearable eye movement and facial expression synchronous acquisition system
CN113658697A (en) * 2021-07-29 2021-11-16 北京科技大学 Psychological assessment system based on video fixation difference
CN113658697B (en) * 2021-07-29 2023-01-31 北京科技大学 Psychological assessment system based on video fixation difference
CN113946217A (en) * 2021-10-20 2022-01-18 北京科技大学 Intelligent auxiliary evaluation system for enteroscope operation skills
CN114743680A (en) * 2022-06-09 2022-07-12 云天智能信息(深圳)有限公司 Method, device and storage medium for evaluating non-fault
CN115607159A (en) * 2022-12-14 2023-01-17 北京科技大学 Depression state identification method and device based on eye movement sequence space-time characteristic analysis

Also Published As

Publication number Publication date
CN111933275B (en) 2023-07-28

Similar Documents

Publication Publication Date Title
CN111933275B (en) Depression evaluation system based on eye movement and facial expression
Yiu et al. DeepVOG: Open-source pupil segmentation and gaze estimation in neuroscience using deep learning
Chen et al. Deepphys: Video-based physiological measurement using convolutional attention networks
CN108427503B (en) Human eye tracking method and human eye tracking device
US20210056360A1 (en) System and method using machine learning for iris tracking, measurement, and simulation
Li et al. Learning to predict gaze in egocentric video
CN109684915B (en) Pupil tracking image processing method
TWI694809B (en) Method for detecting eyeball movement, program thereof, storage media for the program and device for detecting eyeball movement
CN109712710B (en) Intelligent infant development disorder assessment method based on three-dimensional eye movement characteristics
CN112069986A (en) Machine vision tracking method and device for eye movements of old people
Chaudhary et al. Motion tracking of iris features to detect small eye movements
Leli et al. Near-infrared-to-visible vein imaging via convolutional neural networks and reinforcement learning
Zhao et al. Remote estimation of heart rate based on multi-scale facial rois
Arar et al. Towards convenient calibration for cross-ratio based gaze estimation
CN110096978A (en) The method of eccentricity cycles image procossing based on machine vision
Jayawardena et al. Automated filtering of eye gaze metrics from dynamic areas of interest
CN115512410A (en) Abnormal refraction state identification method and device based on abnormal eye posture
CN117918021A (en) Extracting signals from camera observations
EP4075385A1 (en) Method and system for anonymizing facial images
Parte et al. A survey on eye tracking and detection
Hong et al. Lightweight, low-cost, side-mounted mobile eye tracking system
Ni et al. A remote free-head pupillometry based on deep learning and binocular system
Pangestu et al. Electric Wheelchair Control Mechanism Using Eye-mark Key Point Detection.
CN113011286B (en) Squint discrimination method and system based on deep neural network regression model of video
Karaaslan et al. A new method based on deep learning and image processing for detection of strabismus with the Hirschberg test

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant