CN115607159B - Depression state identification method and device based on eye movement sequence space-time characteristic analysis

Info

Publication number
CN115607159B
Authority
CN
China
Prior art keywords
eye movement
sequence
feature
tester
positive
Prior art date
Legal status
Active
Application number
CN202211598004.5A
Other languages
Chinese (zh)
Other versions
CN115607159A
Inventor
马惠敏
林宇昕
邹博超
Current Assignee
University of Science and Technology Beijing USTB
Original Assignee
University of Science and Technology Beijing USTB
Priority date
Filing date
Publication date
Application filed by University of Science and Technology Beijing USTB
Priority to CN202211598004.5A
Publication of CN115607159A
Application granted
Publication of CN115607159B
Legal status: Active

Classifications

    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B 5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B 5/16 Devices for psychotechnics; Testing reaction times; Devices for evaluating the psychological state
    • A61B 5/165 Evaluating the state of mind, e.g. depression, anxiety
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B 3/00 Apparatus for testing the eyes; Instruments for examining the eyes
    • A61B 3/10 Objective types, i.e. instruments for examining the eyes independent of the patients' perceptions or reactions
    • A61B 3/113 Objective types, i.e. instruments for examining the eyes independent of the patients' perceptions or reactions for determining or recording eye movement
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B 5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B 5/72 Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B 5/7235 Details of waveform analysis
    • A61B 5/7264 Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
    • A61B 5/7267 Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems involving training the classification device

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Surgery (AREA)
  • Psychiatry (AREA)
  • Veterinary Medicine (AREA)
  • Public Health (AREA)
  • General Health & Medical Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Medical Informatics (AREA)
  • Pathology (AREA)
  • Hospice & Palliative Care (AREA)
  • Evolutionary Computation (AREA)
  • Developmental Disabilities (AREA)
  • Social Psychology (AREA)
  • Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Child & Adolescent Psychology (AREA)
  • Educational Technology (AREA)
  • Ophthalmology & Optometry (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physiology (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method and a device for identifying depression states based on eye movement sequence space-time characteristic analysis, and relates to the technical field of visual analysis. The method comprises the following steps: acquiring eye movement data of a tester to be identified while viewing positive and negative emotion images; inputting the eye movement data of the tester to be identified into a constructed depression state identification network based on eye movement sequence space-time characteristics; and obtaining a depression state identification result for the tester to be identified from the eye movement data and the depression state identification network. The invention designs a depression state identification network based on eye movement sequence space-time characteristics that extracts the temporal and spatial features of the visual saccade path and, combined with the semantics of the emotional images used as experimental stimuli, characterizes the psychological state of the tester, thereby classifying people with different psychological states.

Description

Depression state identification method and device based on eye movement sequence space-time characteristic analysis
Technical Field
The invention relates to the technical field of visual analysis, in particular to a method and a device for identifying depression states based on eye movement sequence space-time characteristic analysis.
Background
Depression differs from ordinary mood fluctuations in daily life or from a transient emotional state lasting a short period. Patients mainly show a long-lasting depressed mood, find life dull and uninteresting, and have a low sense of self-worth, which can lead to eating or sleeping disorders. A high degree of depression maintained over a long period constitutes a serious illness, with patients often feeling guilt, despair and restlessness, experiencing delusions or hallucinations, and even having suicidal ideation. According to the epidemiological survey of the prevalence of mental disorders in China published in 2019, the lifetime prevalence of depression in China is currently about 6.9%, and the statistics show that the prevalence of depression is quite high in every age group above 18 years. According to the American society of mental disorders, 7% of adults in the United States suffer from depression, yet only about 40% of them have received effective treatment, with 49% of the Caucasian population receiving effective treatment and as little as 25% of the Asian population. The factors affecting effective treatment of depression patients are various, including a lack of adequate medical resources, a lack of trained psychologists and widespread social discrimination against mental diseases, but the most important cause is the inability to recognize depression states accurately and effectively. Compared with physiological diseases, the detection of mental diseases carries a heavier burden, a longer detection period and a lower detection accuracy. Therefore, how to identify the depression state accurately and effectively is a problem that urgently needs to be solved.
The conventional depression state recognition methods include the following two. The first is clinical diagnosis. The clinician analyzes the extent of the subject's depressive state through verbal depression indicators, including monotonous pitch, decreased speech rate and decreased volume, and non-verbal depression indicators, including reduced gestures and an increased downward line of sight. If such symptoms last more than two weeks, the depressive state of the tester is considered to be relatively severe. However, this method has certain limitations: firstly, clinical diagnosis requires a large amount of medical resources and the diagnosis period is too long; secondly, clinical diagnosis involves the subjective judgment of the doctor and must be completed by a trained psychologist. As a result, many testers with severe depression states cannot be diagnosed and treated effectively in time.
The second is the structured rating scale method. A structured scale analyzes the degree of the tester's depressive state from multiple dimensions, and the scores of the dimensions are combined to obtain a final depression state evaluation result. Structured scales are divided into self-rating scales and observer-rating scales. A self-rating scale is completed by the tester himself according to the instructions of the scale; it is simple, easy to administer and can be finished independently in about 5 to 10 minutes, well-known examples being the SDS (Self-Rating Depression Scale) and the BDI (Beck Depression Inventory). An observer-rating scale is administered by a professional psychologist who interviews the tester and scores item by item according to the scoring standard of each item to obtain the evaluation result; its reliability is relatively higher, a well-known example being the HAMD (Hamilton Depression Rating Scale). However, this method also has limitations: firstly, the question format of the scale is fixed, so a tester can conceal his real psychological state and obtain a desired score; secondly, the scale cannot be used for the psychological assessment of subjects with reading disabilities.
Disclosure of Invention
The invention aims to solve the problem of how to accurately and effectively identify the depression state.
In order to solve the technical problems, the invention provides the following technical scheme:
in one aspect, the present invention provides a method for identifying a depressive state based on eye movement sequence spatiotemporal feature analysis, the method being implemented by an electronic device, the method comprising:
s1, eye movement data of a tester to be identified when the tester watches the combination of the positive emotion image and the negative emotion image are obtained.
S2, inputting the eye movement data of the testee to be identified into the constructed depression state identification network based on the eye movement sequence space-time characteristics.
And S3, obtaining a depression state identification result of the tester to be identified according to the eye movement data of the tester to be identified and the depression state identification network based on the eye movement sequence space-time characteristics.
Optionally, the construction process of the depression state identification network based on the eye movement sequence space-time characteristics in S2 includes:
s21, obtaining a characteristic diagram of each positive and negative emotion image combination in the multiple positive and negative emotion image combinations.
S22, acquiring eye movement data of a sample tester when the sample tester watches each group of positive and negative emotion image combination; wherein the sample testers include normal population and depressed population.
And S23, obtaining an eye movement sequence according to the eye movement data.
S24, constructing a characteristic sequence of the fixation point according to the eye movement sequence and the characteristic diagram; the characteristic sequence comprises a spatial characteristic sequence and a time characteristic sequence.
And S25, training the constructed transformer model by using the feature sequence and the classification label of the psychological state corresponding to the feature sequence, and coding and decoding the input feature sequence by using the trained transformer model to obtain an interacted feature sequence.
And S26, inputting the interacted feature sequence into a full connection layer to obtain a classification label of the psychological state of the interacted feature sequence.
And S27, obtaining a depression state identification result of the sample tester according to the classification label of the psychological state of the interacted feature sequence and a preset threshold value.
Optionally, the obtaining a feature map of each positive and negative emotion picture combination in the multiple sets of positive and negative emotion picture combinations in S21 includes:
based on an image encoder, feature extraction is carried out on each group of positive and negative emotion picture combination in the multiple groups of positive and negative emotion picture combinations, and feature graphs obtained by feature extraction are up-sampled to obtain feature graphs of each group of positive and negative emotion picture combinations.
Optionally, the obtaining an eye movement sequence according to the eye movement data in S23 includes:
s231, mapping the eye movement data to a coordinate system of the positive and negative emotion picture combination to obtain mapped coordinate point data.
S232, performing fixation point extraction on the mapped coordinate point data based on a speed threshold algorithm to obtain a horizontal and vertical coordinate of the fixation point and a time stamp formed by the fixation point, and determining the horizontal and vertical coordinate and the time stamp as an eye movement sequence.
Optionally, the step of performing fixation point extraction on the mapped coordinate point data based on a speed threshold algorithm in S232 to obtain a horizontal and vertical coordinate of the fixation point and a timestamp formed by the fixation point, and determining the horizontal and vertical coordinate and the timestamp as an eye movement sequence includes:
coordinate points are acquired in sequence from a set A consisting of the mapped coordinate point data, and the eye movement speed from the current coordinate point to the next coordinate point is calculated. If the eye movement speed is less than a preset speed threshold, the current coordinate point is added to a set B and the next coordinate point is acquired, until the speed at an acquired coordinate point is greater than or equal to the speed threshold; it is then judged whether the number of coordinate points in set B is greater than or equal to a preset number threshold. If so, set B constitutes one fixation and a fixation point is obtained; if not, set B does not constitute a fixation. The next coordinate point is then acquired, and the process continues until all coordinate points in set A have been acquired.
Optionally, the number threshold is the minimum number of recorded points obtained by converting the shortest time needed to form one fixation.
Optionally, the eye movement sequence comprises the abscissa and ordinate of the fixation point and the time stamp.
In S24, constructing a feature sequence of the gaze point according to the eye movement sequence and the feature map includes:
and S241, obtaining eye movement sequences of the fixation points with the preset number.
And S242, acquiring image characteristics of corresponding positions on the characteristic diagram according to the horizontal and vertical coordinates in the acquired eye movement sequence to obtain a spatial characteristic sequence.
And S243, converting the time stamp in the acquired eye movement sequence into relative time from the tester to start acquiring the eye movement data, and taking the relative time as position coding to obtain a time characteristic sequence.
In another aspect, the present invention provides a device for identifying a depressive state based on eye movement sequence spatiotemporal feature analysis, the device being applied to a method for identifying a depressive state based on eye movement sequence spatiotemporal feature analysis, the device comprising:
and the acquisition module is used for acquiring the eye movement data when the tester to be identified watches the positive and negative emotion image combination.
And the input module is used for inputting the eye movement data of the tester to be identified into the constructed depression state identification network based on the eye movement sequence space-time characteristics.
And the output module is used for obtaining a depression state identification result of the tester to be identified according to the eye movement data of the tester to be identified and the depression state identification network based on the eye movement sequence space-time characteristics.
Optionally, the input module is further configured to:
s21, obtaining a characteristic diagram of each positive and negative emotion image combination in the multiple positive and negative emotion image combinations.
S22, acquiring eye movement data of a sample tester when the sample tester watches each group of positive and negative emotion image combinations; wherein the sample testers include normal population and depressed population.
And S23, obtaining an eye movement sequence according to the eye movement data.
S24, constructing a characteristic sequence of the fixation point according to the eye movement sequence and the characteristic diagram; the characteristic sequence comprises a spatial characteristic sequence and a time characteristic sequence.
And S25, training the constructed transformer model by using the characteristic sequence and the classification label of the psychological state corresponding to the characteristic sequence, and coding and decoding the input characteristic sequence by using the trained transformer model to obtain an interacted characteristic sequence.
And S26, inputting the interacted feature sequences into a full connection layer to obtain the classification labels of the psychological states of the interacted feature sequences.
And S27, obtaining a depression state identification result of the sample tester according to the classification label of the psychological state of the interacted feature sequence and a preset threshold value.
Optionally, the input module is further configured to:
based on an image encoder, feature extraction is carried out on each group of positive and negative emotion picture combination in the multiple groups of positive and negative emotion picture combinations, and feature graphs obtained by feature extraction are up-sampled to obtain feature graphs of each group of positive and negative emotion picture combinations.
Optionally, the input module is further configured to:
s231, mapping the eye movement data to a coordinate system of the positive and negative emotion picture combination to obtain mapped coordinate point data.
S232, performing fixation point extraction on the mapped coordinate point data based on a speed threshold algorithm to obtain a horizontal and vertical coordinate of the fixation point and a time stamp formed by the fixation point, and determining the horizontal and vertical coordinate and the time stamp as an eye movement sequence.
Optionally, the input module is further configured to:
coordinate points are acquired in sequence from a set A consisting of the mapped coordinate point data, and the eye movement speed from the current coordinate point to the next coordinate point is calculated. If the eye movement speed is less than a preset speed threshold, the current coordinate point is added to a set B and the next coordinate point is acquired, until the speed at an acquired coordinate point is greater than or equal to the speed threshold; it is then judged whether the number of coordinate points in set B is greater than or equal to a preset number threshold. If so, set B constitutes one fixation and a fixation point is obtained; if not, set B does not constitute a fixation. The next coordinate point is then acquired, and the process continues until all coordinate points in set A have been acquired.
Optionally, the number threshold is the minimum number of recorded points obtained by converting the shortest time needed to form one fixation.
Optionally, the eye movement sequence comprises the abscissa and ordinate of the fixation point and the time stamp.
An input module further to:
and S241, obtaining eye movement sequences of the fixation points with the preset number.
And S242, acquiring image characteristics of corresponding positions on the characteristic diagram according to the horizontal and vertical coordinates in the acquired eye movement sequence to obtain a spatial characteristic sequence.
And S243, converting the time stamp in the acquired eye movement sequence into relative time from the tester to start acquiring the eye movement data, and taking the relative time as position coding to obtain a time characteristic sequence.
In one aspect, an electronic device is provided and includes a processor and a memory, where the memory stores at least one instruction that is loaded and executed by the processor to implement the method for identifying a depressive state based on eye movement sequence spatiotemporal feature analysis.
In one aspect, a computer-readable storage medium is provided, in which at least one instruction is stored, and the at least one instruction is loaded and executed by a processor to implement the above-mentioned method for discriminating a depression state based on eye movement sequence spatiotemporal feature analysis.
The technical scheme provided by the embodiment of the invention has the beneficial effects that at least:
in the scheme, an objective and convenient depression state detection method is realized, and positive and negative emotion images are used as stimuli to present and record eyeball movement data of a subject. The attention patterns of depressed and non-depressed individuals on positive and negative mood images differ and eye movements directly reflect the process of attention allocation. The scan path may be viewed as a representation of intrinsic features of the observer's brain. Scan path similarity may be considered as a sign of perceptual or attention bias. Deep learning is utilized to establish a connection between image processing and psychological state analysis of a subject, and a depression state identification method which is more efficient, objective and easy to obtain is realized. By using the method for identifying the depressive state based on the eye movement sequence space-time characteristic analysis, disclosed by the invention, the accurate and objective identification of the depressive state can be realized, the subjectivity of the traditional method for identifying the depressive state is overcome, the manpower and material resources are saved, and the large-scale and quick screening and quantitative evaluation of the depressive state can be realized in schools, hospitals and enterprises.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings required to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the description below are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of a method for identifying a depression state based on eye movement sequence spatiotemporal feature analysis, provided by an embodiment of the invention;
fig. 2 is a diagram of a network structure for discriminating a depressed state based on eye movement sequence spatiotemporal characteristics according to an embodiment of the present invention;
FIG. 3 is a flow chart of an algorithm based on a speed threshold provided by an embodiment of the invention;
FIG. 4 is a block diagram of a device for identifying a depression state based on eye movement sequence spatiotemporal feature analysis, provided by an embodiment of the invention;
fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the technical problems, technical solutions and advantages of the present invention more apparent, the following detailed description is given with reference to the accompanying drawings and specific embodiments.
As shown in fig. 1, an embodiment of the present invention provides a method for identifying a depression state based on eye movement sequence spatiotemporal feature analysis, which may be implemented by an electronic device. The processing flow of the method may include the following steps:
s1, eye movement data of a tester to be identified when the tester watches the combination of the positive emotion image and the negative emotion image are obtained.
In one possible embodiment, the eye movement data of the test person to be identified can be recorded on the basis of an eye tracker.
S2, inputting the eye movement data of the testee to be identified into the constructed depression state identification network based on the eye movement sequence space-time characteristics.
In one possible embodiment, the present invention contemplates a scan path similarity calculation method consisting of five separate measures that capture the similarity between different features of the scan path, i.e., vector, direction, length, position and duration. Before the scan path similarity can be calculated, the scan paths need to be simplified, since the raw eye movement signals cannot directly reflect the tester's gaze pattern: raw eye movement recording points are iteratively merged into a continuous gaze point if they lie within a given distance or within a given direction threshold of each other. This simplification reduces the complexity of the scan path while preserving its spatial and temporal structure. After simplification, the scan paths are aligned according to their shape using a dynamic programming method. The alignment is computed by optimizing the vector difference between the scan paths, which reduces the sensitivity of the comparison to small spatial variations of the scan path and allows the algorithm to find the best possible match between the two scan paths. All subsequent similarity measures are calculated on these simplified, aligned scan paths.
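A minimal Python sketch of the spatial half of that simplification, under stated assumptions: it performs a single-pass merge rather than the iterative merge described above, omits the direction-threshold merge and the dynamic-programming alignment, and the 50-pixel distance threshold is purely illustrative.

```python
import math

def simplify_scanpath(fixations, dist_thresh=50.0):
    """Single-pass spatial merge of a scan path.

    fixations -- list of (x, y, duration) tuples; consecutive fixations closer
    than dist_thresh pixels are merged into one duration-weighted point.
    """
    if not fixations:
        return []
    simplified = [list(fixations[0])]
    for x, y, d in fixations[1:]:
        px, py, pd = simplified[-1]
        if math.hypot(x - px, y - py) < dist_thresh:
            # Merge into the previous point: duration-weighted mean position.
            total = pd + d
            simplified[-1] = [(px * pd + x * d) / total,
                              (py * pd + y * d) / total,
                              total]
        else:
            simplified.append([x, y, d])
    return [tuple(p) for p in simplified]
```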
Here the vector similarity is calculated as the vector difference between aligned saccade pairs, normalized by the screen diagonal and averaged over the scan path. It does not rely on a predefined quantization and is sensitive to spatial differences in gaze locations; it is an index of the overall similarity in shape of two fixation-saccade sequences. The length similarity is calculated as the absolute difference of the amplitudes of the aligned saccade vectors, normalized by the screen diagonal and averaged over the scan path; this measure is sensitive only to saccade amplitude and not to direction, position or gaze duration. The direction similarity is calculated as the angular difference between aligned saccades, normalized by π and averaged over the scan path; this measure is sensitive only to saccade direction and not to amplitude or absolute gaze position. The position similarity is calculated as the Euclidean distance between aligned fixation points, normalized by the screen diagonal and averaged over the scan path; this measure is sensitive to both saccade amplitude and direction. The duration similarity is calculated as the absolute difference of the gaze times of aligned fixations, normalized by the maximum duration and averaged over the scan path; this measure is not affected by gaze location or saccade amplitude.
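A compact sketch of the five measures just described, assuming the two scan paths have already been simplified and aligned so that their fixations and saccades correspond pairwise; the returned values are normalized differences (lower means more similar), and the screen size is an illustrative parameter.

```python
import math

def scanpath_similarity(path_a, path_b, screen=(1920, 1080)):
    """Vector, length, direction, position and duration measures on aligned paths.

    Each path is a list of fixations (x, y, duration) of equal length (>= 2).
    """
    diag = math.hypot(*screen)
    max_dur = max(f[2] for f in path_a + path_b)
    # Saccade vectors between consecutive fixations of each path.
    sacc_a = [(x2 - x1, y2 - y1) for (x1, y1, _), (x2, y2, _) in zip(path_a, path_a[1:])]
    sacc_b = [(x2 - x1, y2 - y1) for (x1, y1, _), (x2, y2, _) in zip(path_b, path_b[1:])]

    def mean(values):
        values = list(values)
        return sum(values) / len(values)

    def ang_diff(u, v):
        d = abs(math.atan2(u[1], u[0]) - math.atan2(v[1], v[0])) % (2 * math.pi)
        return min(d, 2 * math.pi - d)  # wrap into [0, pi]

    return {
        # Vector difference of aligned saccade pairs, normalized by the screen diagonal.
        "vector": mean(math.hypot(ax - bx, ay - by) / diag
                       for (ax, ay), (bx, by) in zip(sacc_a, sacc_b)),
        # Absolute amplitude difference of aligned saccades.
        "length": mean(abs(math.hypot(*a) - math.hypot(*b)) / diag
                       for a, b in zip(sacc_a, sacc_b)),
        # Angular difference of aligned saccades, normalized by pi.
        "direction": mean(ang_diff(a, b) / math.pi for a, b in zip(sacc_a, sacc_b)),
        # Euclidean distance between aligned fixation points.
        "position": mean(math.hypot(ax - bx, ay - by) / diag
                         for (ax, ay, _), (bx, by, _) in zip(path_a, path_b)),
        # Absolute difference of fixation durations, normalized by the maximum duration.
        "duration": mean(abs(da - db) / max_dur
                         for (_, _, da), (_, _, db) in zip(path_a, path_b)),
    }
```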
Further, the present invention measures scan path similarity by comparing the scan paths of different members belonging to the same group (within-group similarity). Within-group scan path similarity helps to measure the commonality of group members in terms of viewing behavior, i.e., the features shared by the group's members. Higher within-group similarity means that the eye movement patterns of the group's members are more constrained; put simply, a member's gaze looks very similar to that of the other members of the group. The present invention finds that, for images of subjects such as humans, animals and objects, the scan path similarity of depressed subjects is significantly higher than that of the other group in the vector, length, position and duration measures, and the present invention recognizes that in this case scan path similarity can be interpreted as a reflection of an attention bias that constrains the eye movement pattern. If the eyes moved freely without any bias, exploring every region of the image with equal probability, repeated presentations of the image should produce completely different scan paths. But because viewers have many biases that limit their eye movement patterns, such as favoring salient or meaningful features of the image, an increase in scan path similarity can be interpreted as a sign of perceptual or attention bias toward a certain class of stimuli.
Further, scan path similarity is used as a new measure of attention bias or preference, which constrains eye movement patterns by directing attention to specific visual or semantic features of the image. Making use of the space-time characteristics of the eye movement sequence is therefore essential. On a pre-built experimental platform, the method presents multiple groups of positive and negative emotion picture combinations to the subject as experimental stimuli, records the subject's eye movements, and designs a depression state identification network based on eye movement sequence space-time characteristics; the structure of this network is shown in fig. 2.
Alternatively, the construction process of the depression state discrimination network based on eye movement sequence spatiotemporal features in S2 may include S21-S27:
and S21, based on an image encoder, performing feature extraction on each positive and negative emotion picture combination in the multiple positive and negative emotion picture combinations, and performing up-sampling on a feature map obtained by the feature extraction to obtain a feature map of each positive and negative emotion picture combination.
In a possible embodiment, the invention stitches a group of emotion pictures (positive + negative) presented at the same time into a single image of size 512 × 1340 with 3 channels, and ResNet-34 can be selected as the image encoder to extract features from the image. The resulting feature map is down-sampled by a factor of 32 and has 512 channels, which avoids repeatedly extracting the semantic features of the picture and reduces temporal and spatial redundancy. To reduce the size gap between the feature map and the original image, the feature map is up-sampled by a factor of 2.
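A hedged PyTorch sketch of this encoder step (stitched 512 × 1340 image → ResNet-34 trunk → 2× up-sampling); the exact layer split and the use of bilinear interpolation are assumptions not stated in the text.

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet34

class EmotionImageEncoder(torch.nn.Module):
    """Encode a stitched positive+negative emotion picture into a feature map."""

    def __init__(self):
        super().__init__()
        backbone = resnet34(weights=None)  # set weights for a pretrained backbone if desired
        # Keep everything up to the last residual stage: output stride 32, 512 channels.
        self.trunk = torch.nn.Sequential(*list(backbone.children())[:-2])

    def forward(self, img):                # img: (B, 3, 512, 1340)
        fmap = self.trunk(img)             # (B, 512, 16, 42) -- 32x down-sampled
        fmap = F.interpolate(fmap, scale_factor=2, mode="bilinear",
                             align_corners=False)  # 2x up-sampling
        return fmap                        # (B, 512, 32, 84)

encoder = EmotionImageEncoder()
feature_map = encoder(torch.randn(1, 3, 512, 1340))
```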
S22, eye movement data of the normal population and the depressed population when each group of positive and negative emotion image combination is watched is obtained.
And S23, obtaining an eye movement sequence according to the eye movement data.
Optionally, the step S23 may include the following steps S231 to S232:
s231, mapping the eye movement data to a coordinate system of the positive and negative emotion picture combination to obtain mapped coordinate point data.
In a possible embodiment, the data recorded by the eye tracker are in the frame of reference of the screen coordinate system, so the recorded raw data need to be mapped onto the coordinate system of the displayed two-dimensional image. The resolution of the display screen is denoted a × b and the image resolution is denoted a′ × b′. The coordinate data (x, y) recorded by the eye tracker are mapped to the two-dimensional image coordinate system as (x′, y′), as shown in formula (1):
[Formula (1) is rendered as an image in the original document and gives the mapping from the screen coordinates (x, y) to the image coordinates (x′, y′).]
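Since formula (1) is only available as an image in the original, the following is a minimal sketch of such a mapping under the assumption that the stimulus image is displayed centred on the screen, so the transform reduces to subtracting the screen margins; the function name and resolution values are illustrative.

```python
def screen_to_image(x, y, screen_res=(1920, 1080), image_res=(1340, 512)):
    """Map a gaze sample from screen coordinates to image coordinates.

    Assumes the stimulus image of resolution a' x b' is centred on a screen of
    resolution a x b, so the mapping is a pure translation.
    """
    a, b = screen_res          # screen resolution a x b
    a_img, b_img = image_res   # image resolution a' x b'
    x_img = x - (a - a_img) / 2
    y_img = y - (b - b_img) / 2
    return x_img, y_img
```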
S232, based on a speed threshold algorithm, fixation points are extracted from the mapped coordinate point data to obtain the horizontal and vertical coordinates of each fixation point and the timestamp at which it is formed, and these are determined as the eye movement sequence. Optionally, the step S232 may specifically be: coordinate points are taken in order from the set A consisting of the mapped coordinate point data, and the eye movement speed from the current coordinate point to the next coordinate point is calculated. If the eye movement speed is less than a preset speed threshold, the current coordinate point is added to a set B and the next coordinate point is taken, until the speed at a coordinate point is greater than or equal to the speed threshold; it is then judged whether the number of coordinate points in set B is greater than or equal to a preset number threshold. If so, set B constitutes one fixation and a fixation point is obtained; if not, set B does not constitute a fixation. The next coordinate point is then taken, and the process continues until all coordinate points in set A have been processed.
Optionally, the number threshold is: and forming the minimum recording point number obtained by converting the shortest time of one fixation.
In a feasible implementation mode, an algorithm for clustering coordinate points based on a speed threshold is designed to obtain horizontal and vertical coordinates and time stamps of a fixation point as an eye movement sequence. The algorithm flow based on the speed threshold is shown in fig. 3.
Further, traversing backward from the first coordinate point of the set A of eye movement data recorded in a single trial, the eye movement velocity v from the current coordinate point to the next coordinate point is calculated according to formula (2):

v = √((x_{i+1} − x_i)² + (y_{i+1} − y_i)²) / (t_{i+1} − t_i)  (2)

If the calculated velocity v is smaller than the velocity threshold v_t, the current point is added to set B and the backward traversal continues until the velocity exceeds v_t. If the number of recorded points in set B then exceeds the number threshold N_min, set B is considered to constitute one fixation. The number threshold N_min is the minimum number of recorded points corresponding to the shortest time t_min needed to form one fixation, and is calculated by formula (3), where f is the working frequency of the eye tracker:

N_min = t_min · f  (3)

By processing all the eye movement track data of the tester in a single experiment, the information of multiple fixation points of the tester can be obtained and used for subsequent feature extraction and analysis.
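A minimal Python sketch of this velocity-threshold clustering, following formulas (2) and (3); the numeric thresholds are illustrative, and representing each fixation by the mean position and onset timestamp of its cluster is an assumption.

```python
import math

def extract_fixations(points, v_thresh=30.0, t_min=0.06, f=120.0):
    """Cluster mapped gaze samples into fixations with a velocity threshold.

    points   -- list of (x, y, t) samples in image coordinates (set A)
    v_thresh -- velocity threshold v_t (pixels per second), illustrative value
    t_min    -- shortest duration of one fixation in seconds, illustrative value
    f        -- working frequency of the eye tracker in Hz
    Returns a list of fixations as (x_mean, y_mean, t_start) tuples.
    """
    n_min = math.ceil(t_min * f)          # formula (3): minimum number of recorded points
    fixations, cluster = [], []           # `cluster` plays the role of set B
    for (x1, y1, t1), (x2, y2, t2) in zip(points, points[1:]):
        v = math.hypot(x2 - x1, y2 - y1) / (t2 - t1)   # formula (2): sample-to-sample velocity
        if v < v_thresh:
            cluster.append((x1, y1, t1))
        else:
            if len(cluster) >= n_min:     # enough slow samples -> one fixation
                xs, ys, ts = zip(*cluster)
                fixations.append((sum(xs) / len(xs), sum(ys) / len(ys), ts[0]))
            cluster = []                  # otherwise discard and keep scanning
    if len(cluster) >= n_min:             # flush the trailing cluster
        xs, ys, ts = zip(*cluster)
        fixations.append((sum(xs) / len(xs), sum(ys) / len(ys), ts[0]))
    return fixations
```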
And S24, constructing a characteristic sequence of the fixation point according to the eye movement sequence and the characteristic diagram.
The characteristic sequence comprises a spatial characteristic sequence and a time characteristic sequence.
Alternatively, the step S24 may include the following steps S241 to S243:
and S241, obtaining eye movement sequences of the fixation points with the preset number.
And S242, acquiring image characteristics of corresponding positions on the characteristic diagram according to the horizontal and vertical coordinates in the acquired eye movement sequence to obtain a spatial characteristic sequence.
And S243, converting the time stamp in the acquired eye movement sequence into relative time from the tester to start acquiring the eye movement data, and taking the relative time as position coding to obtain a time characteristic sequence.
In one possible embodiment, the image features at the spatial positions (x, y) of the first 6 fixation points formed on each picture are extracted to obtain the spatial feature sequence. The timestamp information in the eye movement sequence is converted into the relative time from the beginning of each group of the experiment and embedded as the position encoding; this encoding represents the relative time at which each fixation point is formed and the time differences between fixation points. The core idea is to provide effective distance information to the attention mechanism during its computation, so that the temporal characteristics of the eye movement sequence are fully utilized.
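A hedged sketch of how the two sequences might be assembled from the fixations and the up-sampled feature map; the effective stride of 16 (32× down-sampling followed by 2× up-sampling) and the nearest-cell indexing are assumptions.

```python
import torch

def build_feature_sequences(fixations, feature_map, trial_start, stride=16, n_points=6):
    """Build the spatial feature sequence and relative-time position codes for one trial.

    fixations   -- list of (x, y, t) fixation points in image coordinates
    feature_map -- tensor (C, H, W): the 2x up-sampled encoder output
    trial_start -- timestamp at which this group of the experiment began
    stride      -- image pixels per feature-map cell (32x down, 2x up => 16)
    n_points    -- number of fixations used per picture (first 6 in the text)
    """
    spatial, temporal = [], []
    for x, y, t in fixations[:n_points]:
        row = min(int(y) // stride, feature_map.shape[1] - 1)
        col = min(int(x) // stride, feature_map.shape[2] - 1)
        spatial.append(feature_map[:, row, col])   # C-dim image feature at the gaze position
        temporal.append(t - trial_start)           # relative time, used as the position code
    return torch.stack(spatial), torch.tensor(temporal, dtype=torch.float32)
```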
And S25, training the constructed transformer model by using the characteristic sequence and the classification label of the psychological state corresponding to the characteristic sequence, and coding and decoding the input characteristic sequence by using the trained transformer model to obtain an interacted characteristic sequence.
And S26, inputting the interacted feature sequence into a full connection layer to obtain a classification label of the psychological state of the interacted feature sequence.
And S27, obtaining a depression state identification result of the sample tester according to the classification label of the psychological state of the interacted feature sequence and a preset threshold value.
In a possible implementation, the constructed transformer network is trained with the spatial features of the eye movement sequence obtained from the above processing. The attention mechanism in the transformer resembles human attention: the human visual system has a selective attention mechanism, a brain signal processing mechanism unique to human vision. Human vision first scans the global image rapidly to obtain the target region that deserves attention, commonly called the focus of attention, and then devotes more attention resources to that region to obtain more detailed information about the target while suppressing other useless information. This is a means of quickly screening high-value information from a large amount of information with limited attention resources, a survival mechanism formed during the long-term evolution of human beings, and it greatly improves the efficiency and accuracy of visual information processing. Meanwhile, the obtained position encoding information, serving as the temporal feature of the eye movement sequence, is also fed into the transformer for training; the purpose of the position encoding is to embed the position information into the input vector so that it is not lost during computation. Each input feature sequence and its position encoding have corresponding label data, divided into a normal group and a depressed group. The constructed transformer model is trained with the spatial feature sequence and temporal feature sequence constructed from the tester's eye movement sequence and the psychological state classification labels corresponding to the feature sequences; the transformer model encodes and decodes the input feature sequence to obtain the interacted feature sequence. The interacted feature sequence is input into a fully connected layer to obtain the classification label of its psychological state. Finally, the depression state identification result of the tester is obtained from the psychological state classification labels, produced by the encoding-decoding interaction of all of the tester's eye movement sequences through the transformer model, together with a preset threshold value.
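A condensed PyTorch sketch of the stage just described: per-fixation spatial features plus relative-time position codes pass through a transformer and a fully connected layer, and the per-trial outputs are thresholded at the subject level as in step S27. The encoder-only stack, the sinusoidal form of the time encoding, the mean pooling, and all layer sizes are assumptions rather than the patent's exact architecture.

```python
import math
import torch
import torch.nn as nn

class EyeMovementTransformer(nn.Module):
    """Classify one trial's fixation feature sequence as normal vs. depressed."""

    def __init__(self, feat_dim=512, d_model=128, n_heads=4, n_layers=2):
        super().__init__()
        self.proj = nn.Linear(feat_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.fc = nn.Linear(d_model, 2)                  # fully connected layer -> 2-class logits

    def time_encoding(self, rel_t):
        # Sinusoidal encoding of relative fixation times (an assumed concrete form).
        d = self.proj.out_features
        freqs = torch.exp(-math.log(10000.0) * torch.arange(d // 2) * 2 / d)
        angles = rel_t.unsqueeze(-1) * freqs             # (B, L, d/2)
        return torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)

    def forward(self, feats, rel_t):                     # feats: (B, 6, 512), rel_t: (B, 6)
        x = self.proj(feats) + self.time_encoding(rel_t)
        x = self.encoder(x)                              # the "interacted" feature sequence
        return self.fc(x.mean(dim=1))                    # pooled -> per-trial class logits

# Subject-level decision (step S27): average the depressed-class probability over
# all of the tester's trials and compare it with a preset threshold.
def identify_subject(model, trial_feats, trial_times, threshold=0.5):
    with torch.no_grad():
        probs = torch.softmax(model(trial_feats, trial_times), dim=-1)[:, 1]
    return bool(probs.mean() >= threshold)               # True -> depressed state
```

In training, each trial's logits would be compared with the trial's group label (normal or depressed) using a standard cross-entropy loss; this follows the text's description of per-sequence labels, while the specific loss is an assumption.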
Furthermore, the method uses multiple measures to calculate saccade path similarity, realizing a multi-dimensional analysis of the visual saccade path, fully mining and utilizing the space-time characteristics of the eye movement signals and quantifying the difference between the attention mechanisms of the two groups of people.
Other methods use only basic characteristics of the eye movement signal, such as the number of eye movement points and the length of the track. The present invention, for the first time, fully mines and utilizes the space-time characteristics of the eye movement signal: the spatial characteristics of the eye movements are combined with the semantics of the emotion pictures used as experimental stimuli, and the time dimension is embedded into the input vector as a relative position code, so that the space-time feature information of the eye movement sequence is fully utilized.
The invention realizes a reasonable classification algorithm design under limited sample data: each experimental group of each subject, together with that subject's label, is treated as one sample, which enlarges the data set to 80 times its original scale, 80 being the number of picture groups each subject views during an experiment. The network model can therefore be trained on a larger data set, improving its generalization capability.
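A small sketch of that per-trial sample construction, where every picture group viewed by every subject becomes one training sample carrying the subject's group label; the data layout is illustrative.

```python
from torch.utils.data import Dataset

class TrialDataset(Dataset):
    """Each of the ~80 picture groups per subject is treated as one sample."""

    def __init__(self, subjects):
        # subjects: list of {"label": 0 (normal) / 1 (depressed), "trials": [(feats, rel_t), ...]}
        self.samples = [(feats, rel_t, s["label"])
                        for s in subjects
                        for feats, rel_t in s["trials"]]

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        return self.samples[idx]
```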
In the embodiment of the invention, an objective and convenient depression state detection method is realized, positive and negative mood images are taken as stimuli to present and record the eyeball motion data of a subject. Depressed and non-depressed individuals differ in their attention patterns to positive and negative mood images and eye movements directly reflect the process of attention allocation. The scan path may be viewed as a representation of intrinsic features of the observer's brain. Scan path similarity may be considered as a sign of perceptual or attention bias. Deep learning is utilized to establish a connection between image processing and psychological state analysis of a subject, and a depression state identification method which is more efficient, objective and easy to obtain is realized. By using the method for identifying the depression state based on the eye movement sequence space-time characteristic analysis, disclosed by the invention, the accurate and objective identification of the depression state can be realized, the subjectivity of the traditional depression state identification method is overcome, the saving of manpower and material resources is realized, and the large-scale and quick screening and quantitative evaluation of the depression state can be realized in schools, hospitals and enterprises.
As shown in fig. 4, an embodiment of the present invention provides a device 400 for identifying a depression state based on eye movement sequence spatiotemporal feature analysis, where the device 400 is applied to implement a method for identifying a depression state based on eye movement sequence spatiotemporal feature analysis, and the device 400 includes:
the obtaining module 410 is configured to obtain eye movement data of the tester to be authenticated when the tester views the combination of the positive emotion image and the negative emotion image.
And the input module 420 is used for inputting the eye movement data of the tester to be identified into the constructed depression state identification network based on the eye movement sequence space-time characteristics.
And the output module 430 is used for obtaining a depression state identification result of the tester to be identified according to the eye movement data of the tester to be identified and the depression state identification network based on the eye movement sequence space-time characteristics.
Optionally, the input module 420 is further configured to:
s21, obtaining a characteristic diagram of each positive and negative emotion image combination in the multiple positive and negative emotion image combinations.
S22, acquiring eye movement data of a sample tester when the sample tester watches each group of positive and negative emotion image combinations; wherein the sample testers include normal population and depressed population.
And S23, obtaining an eye movement sequence according to the eye movement data.
S24, constructing a characteristic sequence of the fixation point according to the eye movement sequence and the characteristic diagram; the characteristic sequence comprises a spatial characteristic sequence and a time characteristic sequence.
And S25, training the constructed transformer model by using the characteristic sequence and the classification label of the psychological state corresponding to the characteristic sequence, and coding and decoding the input characteristic sequence by using the trained transformer model to obtain an interacted characteristic sequence.
And S26, inputting the interacted feature sequence into a full connection layer to obtain a classification label of the psychological state of the interacted feature sequence.
And S27, obtaining a depression state identification result of the sample tester according to the classification label of the psychological state of the interacted feature sequence and a preset threshold value.
Optionally, the input module 420 is further configured to:
based on an image encoder, feature extraction is carried out on each group of positive and negative emotion picture combination in the multiple groups of positive and negative emotion picture combinations, and feature graphs obtained by feature extraction are up-sampled to obtain feature graphs of each group of positive and negative emotion picture combinations.
Optionally, the input module 420 is further configured to:
s231, mapping the eye movement data to a coordinate system of the positive and negative emotion picture combination to obtain mapped coordinate point data.
S232, performing fixation point extraction on the mapped coordinate point data based on a speed threshold algorithm to obtain a horizontal and vertical coordinate of the fixation point and a time stamp formed by the fixation point, and determining the horizontal and vertical coordinate and the time stamp as an eye movement sequence.
Optionally, the input module 420 is further configured to:
coordinate points are acquired in sequence from a set A consisting of the mapped coordinate point data, and the eye movement speed from the current coordinate point to the next coordinate point is calculated. If the eye movement speed is less than a preset speed threshold, the current coordinate point is added to a set B and the next coordinate point is acquired, until the speed at an acquired coordinate point is greater than or equal to the speed threshold; it is then judged whether the number of coordinate points in set B is greater than or equal to a preset number threshold. If so, set B constitutes one fixation and a fixation point is obtained; if not, set B does not constitute a fixation. The next coordinate point is then acquired, and the process continues until all coordinate points in set A have been acquired.
Optionally, the number threshold is the minimum number of recorded points obtained by converting the shortest time needed to form one fixation.
Optionally, the eye movement sequence comprises the abscissa and ordinate of the fixation point and the time stamp.
An input module 420, further configured to:
and S241, obtaining eye movement sequences of the fixation points with the preset number.
And S242, acquiring image characteristics of corresponding positions on the characteristic diagram according to the horizontal and vertical coordinates in the acquired eye movement sequence to obtain a spatial characteristic sequence.
And S243, converting the time stamp in the acquired eye movement sequence into relative time from the tester to start acquiring the eye movement data, and taking the relative time as position coding to obtain a time characteristic sequence.
In the embodiment of the invention, an objective and convenient depression state detection method is realized, and positive and negative emotion images are used as stimuli to present and record eyeball motion data of a subject. The attention patterns of depressed and non-depressed individuals on positive and negative mood images differ and eye movements directly reflect the process of attention allocation. The scan path may be viewed as a representation of intrinsic characteristics of the observer's brain. Scan path similarity may be considered as a sign of perceptual or attention bias. Deep learning is utilized to establish a connection between image processing and psychological state analysis of a subject, and a depression state identification method which is more efficient, objective and easy to obtain is realized. By using the method for identifying the depressive state based on the eye movement sequence space-time characteristic analysis, disclosed by the invention, the accurate and objective identification of the depressive state can be realized, the subjectivity of the traditional method for identifying the depressive state is overcome, the manpower and material resources are saved, and the large-scale and quick screening and quantitative evaluation of the depressive state can be realized in schools, hospitals and enterprises.
Fig. 5 is a schematic structural diagram of an electronic device 500 according to an embodiment of the present invention, where the electronic device 500 may generate relatively large differences due to different configurations or performances, and may include one or more processors (CPUs) 501 and one or more memories 502, where the memory 502 stores at least one instruction, and the at least one instruction is loaded and executed by the processor 501 to implement the following depression state identification method based on eye movement sequence spatiotemporal feature analysis:
s1, eye movement data of a tester to be identified when the tester watches the combination of the positive emotion image and the negative emotion image are obtained.
S2, inputting the eye movement data of the testee to be identified into the constructed depression state identification network based on the eye movement sequence space-time characteristics.
And S3, obtaining a depression state identification result of the tester to be identified according to the eye movement data of the tester to be identified and the depression state identification network based on the eye movement sequence space-time characteristics.
In an exemplary embodiment, there is also provided a computer-readable storage medium, such as a memory including instructions executable by a processor in a terminal, to perform the above-described method for depression state discrimination based on eye movement sequence spatiotemporal feature analysis. For example, the computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the storage medium may be a read-only memory, a magnetic disk or an optical disk.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (6)

1. A method for identifying a depressive state based on temporal-spatial feature analysis of eye movement sequences, the method comprising:
s1, obtaining eye movement data of a tester to be identified when watching a positive emotion image and a negative emotion image;
s2, inputting the eye movement data of the tester to be identified into a constructed depression state identification network based on eye movement sequence space-time characteristics;
s3, obtaining a depression state identification result of the tester to be identified according to the eye movement data of the tester to be identified and a depression state identification network based on eye movement sequence space-time characteristics;
the construction process of the depression state identification network based on the eye movement sequence space-time characteristics in the S2 comprises the following steps:
s21, obtaining a characteristic diagram of each positive and negative emotion image combination in the multiple positive and negative emotion image combinations;
s22, acquiring eye movement data of a sample tester when the sample tester watches each group of positive and negative emotion image combinations; wherein the sample testers comprise a normal population and a depressed population;
s23, obtaining an eye movement sequence according to the eye movement data;
s24, constructing a characteristic sequence of the fixation point according to the eye movement sequence and the characteristic diagram; wherein the feature sequence comprises a spatial feature sequence and a temporal feature sequence;
s25, training the constructed transformer model by using the feature sequence and the classification label of the psychological state corresponding to the feature sequence, and coding and decoding the input feature sequence by using the trained transformer model to obtain an interacted feature sequence;
s26, inputting the interacted feature sequences into a full connection layer to obtain a classification label of the psychological state of the interacted feature sequences;
s27, obtaining a depression state identification result of the sample tester according to the classification label of the psychological state of the interacted feature sequence and a preset threshold value;
the eye movement sequence comprises a horizontal coordinate and a vertical coordinate of the fixation point and a time stamp;
in S24, constructing a feature sequence of the fixation point according to the eye movement sequence and the feature map, including:
s241, obtaining eye movement sequences of the fixation points with the preset number;
s242, acquiring image features of corresponding positions on the feature map according to horizontal and vertical coordinates in the acquired eye movement sequence to obtain a spatial feature sequence;
and S243, converting the time stamp in the acquired eye movement sequence into relative time from the tester to start acquiring the eye movement data, and using the relative time as a position code to obtain a time characteristic sequence.
2. The method according to claim 1, wherein the obtaining of the feature map of each positive and negative emotion picture combination in the multiple positive and negative emotion picture combinations in S21 includes:
based on an image encoder, feature extraction is carried out on each group of positive and negative emotion picture combination in a plurality of groups of positive and negative emotion picture combinations, and a feature graph obtained by feature extraction is up-sampled to obtain the feature graph of each group of positive and negative emotion picture combination.
3. The method according to claim 1, wherein the obtaining an eye movement sequence according to the eye movement data in S23 includes:
s231, mapping the eye movement data to a coordinate system of the positive and negative emotion picture combination to obtain mapped coordinate point data;
s232, performing fixation point extraction on the mapped coordinate point data based on a speed threshold algorithm to obtain a horizontal and vertical coordinate of a fixation point and a time stamp formed by the fixation point, and determining the horizontal and vertical coordinate and the time stamp as an eye movement sequence.
4. The method according to claim 3, wherein in S232, performing fixation point extraction on the mapped coordinate point data based on the velocity threshold algorithm to obtain the horizontal and vertical coordinates and the timestamp of each fixation point, and determining them as the eye movement sequence, comprises:
sequentially acquiring coordinate points from a set A composed of the mapped coordinate point data, and calculating the eye movement velocity from the acquired coordinate point to the next coordinate point; if the velocity is less than a preset velocity threshold, adding the acquired coordinate point to a set B and continuing with the next coordinate point; when the velocity of an acquired coordinate point is greater than or equal to the velocity threshold, judging whether the number of coordinate points in the set B is greater than or equal to a preset number threshold; if so, the set B constitutes one fixation and yields a fixation point; if not, the set B does not constitute a fixation; in either case, continuing with the next coordinate point until all coordinate points in the set A have been processed.
5. The method according to claim 4, wherein the number threshold is the minimum number of recorded points obtained by converting the minimum duration of a single fixation.
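The velocity threshold procedure of claims 4 and 5 can be sketched as follows; the velocity units, the centroid-based fixation coordinates, and the use of the first timestamp of set B as the fixation timestamp are illustrative assumptions, since the claims leave these details open.

    def extract_fixations(points, speed_threshold, min_fix_duration, sample_interval):
        # Velocity-threshold fixation extraction in the spirit of claim 4 (illustrative sketch).
        # points           : list of (x, y, t) mapped coordinate points (set A)
        # speed_threshold  : preset eye movement velocity threshold (assumed units: px per second)
        # min_fix_duration : minimum duration of one fixation, in seconds
        # sample_interval  : recording interval of the eye tracker, in seconds
        # Claim 5: convert the minimum fixation time into a minimum number of recorded points.
        count_threshold = int(round(min_fix_duration / sample_interval))

        fixations, group = [], []                      # `group` plays the role of set B
        for i in range(len(points) - 1):
            (x0, y0, t0), (x1, y1, t1) = points[i], points[i + 1]
            speed = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5 / max(t1 - t0, 1e-9)
            if speed < speed_threshold:
                group.append(points[i])                # slow movement: accumulate into set B
            else:
                if len(group) >= count_threshold:      # enough points: set B forms one fixation
                    cx = sum(p[0] for p in group) / len(group)
                    cy = sum(p[1] for p in group) / len(group)
                    fixations.append((cx, cy, group[0][2]))   # centroid + onset timestamp (assumed choice)
                group = []                             # otherwise discard and keep scanning
        if len(group) >= count_threshold:              # flush the last candidate group
            cx = sum(p[0] for p in group) / len(group)
            cy = sum(p[1] for p in group) / len(group)
            fixations.append((cx, cy, group[0][2]))
        return fixations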
6. A depression state identification apparatus based on eye movement sequence spatiotemporal feature analysis, the apparatus comprising:
an acquisition module, configured to acquire eye movement data while the tester to be identified watches the positive and negative emotion picture combinations;
an input module, configured to input the eye movement data of the tester to be identified into the constructed depression state identification network based on eye movement sequence spatiotemporal features;
an output module, configured to obtain a depression state identification result of the tester to be identified according to the eye movement data of the tester to be identified and the depression state identification network based on eye movement sequence spatiotemporal features;
wherein the construction process of the depression state identification network based on eye movement sequence spatiotemporal features comprises the following steps:
S21, obtaining a feature map of each positive and negative emotion picture combination in a plurality of groups of positive and negative emotion picture combinations;
S22, acquiring eye movement data of a sample tester while the sample tester watches each group of positive and negative emotion picture combinations; wherein the sample testers comprise both a normal population and a depressed population;
S23, obtaining an eye movement sequence according to the eye movement data;
S24, constructing a feature sequence of the fixation points according to the eye movement sequence and the feature map; wherein the feature sequence comprises a spatial feature sequence and a temporal feature sequence;
S25, training the constructed transformer model by using the feature sequence and the classification label of the psychological state corresponding to the feature sequence, and encoding and decoding the input feature sequence by using the trained transformer model to obtain an interacted feature sequence;
S26, inputting the interacted feature sequence into a fully connected layer to obtain a classification label of the psychological state of the interacted feature sequence;
S27, obtaining a depression state identification result of the sample tester according to the classification label of the psychological state of the interacted feature sequence and a preset threshold value;
wherein the eye movement sequence comprises the horizontal and vertical coordinates of the fixation point and a timestamp;
and in S24, constructing the feature sequence of the fixation point according to the eye movement sequence and the feature map comprises:
S241, obtaining eye movement sequences of a preset number of fixation points;
S242, acquiring the image features at the corresponding positions on the feature map according to the horizontal and vertical coordinates in the acquired eye movement sequence to obtain the spatial feature sequence;
and S243, converting the timestamp in the acquired eye movement sequence into the relative time elapsed since eye movement data acquisition began for the tester, and using the relative time as a position encoding to obtain the temporal feature sequence.
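For orientation only, a simplified encoder-only sketch of the transformer interaction and fully connected classification head of S25-S26 is shown below; the claim describes encoding and decoding, so this is a reduced stand-in, with all layer sizes and the sinusoidal time encoding assumed rather than taken from the patent. The S27 threshold step could then compare, for example, the proportion of picture groups classified as depressed against the preset threshold.

    import torch
    import torch.nn as nn

    class EyeMovementClassifier(nn.Module):
        # Simplified sketch of S25-S26: transformer interaction + fully connected head.
        # Dimensions and the encoder-only design are assumptions, not the patented architecture.
        def __init__(self, feat_dim=512, d_model=128, n_heads=4, n_layers=2, n_classes=2):
            super().__init__()
            self.proj = nn.Linear(feat_dim, d_model)                 # project the spatial feature sequence
            layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, n_layers)
            self.head = nn.Linear(d_model, n_classes)                # fully connected classification layer

        def forward(self, spatial_seq, relative_times):
            # spatial_seq: (B, L, feat_dim); relative_times: (B, L) relative timestamps used as position code
            x = self.proj(spatial_seq) + self._time_encoding(relative_times, self.proj.out_features)
            x = self.encoder(x)                                      # interacted feature sequence
            return self.head(x.mean(dim=1))                          # psychological-state classification logits

        @staticmethod
        def _time_encoding(times, dim):
            # Sinusoidal encoding of the relative timestamps (one assumed realisation of S243).
            freqs = torch.arange(dim // 2, device=times.device, dtype=torch.float32)
            angles = times.float().unsqueeze(-1) / (10000 ** (2 * freqs / dim))
            return torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)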
CN202211598004.5A 2022-12-14 2022-12-14 Depression state identification method and device based on eye movement sequence space-time characteristic analysis Active CN115607159B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211598004.5A CN115607159B (en) 2022-12-14 2022-12-14 Depression state identification method and device based on eye movement sequence space-time characteristic analysis

Publications (2)

Publication Number Publication Date
CN115607159A (en) 2023-01-17
CN115607159B (en) 2023-04-07

Family

ID=84880628

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211598004.5A Active CN115607159B (en) 2022-12-14 2022-12-14 Depression state identification method and device based on eye movement sequence space-time characteristic analysis

Country Status (1)

Country Link
CN (1) CN115607159B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016112690A1 (en) * 2015-01-14 2016-07-21 北京工业大学 Eye movement data based online user state recognition method and device
CN112674771A (en) * 2020-12-22 2021-04-20 北京科技大学 Depression crowd identification method and device based on image fixation difference
CN114209324A (en) * 2022-02-21 2022-03-22 北京科技大学 Psychological assessment data acquisition method based on image visual cognition and VR system

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111933275B (en) * 2020-07-17 2023-07-28 兰州大学 Depression evaluation system based on eye movement and facial expression
US20220211310A1 (en) * 2020-12-18 2022-07-07 Senseye, Inc. Ocular system for diagnosing and monitoring mental health
CN112674770B (en) * 2020-12-22 2021-09-21 北京科技大学 Depression crowd eye movement identification method based on image significance difference and emotion analysis

Also Published As

Publication number Publication date
CN115607159A (en) 2023-01-17

Similar Documents

Publication Publication Date Title
CN111528859B (en) Child ADHD screening and evaluating system based on multi-modal deep learning technology
Ulrich et al. Perceptual and memorial contributions to developmental prosopagnosia
Tang et al. Automatic identification of high-risk autism spectrum disorder: a feasibility study using video and audio data under the still-face paradigm
CN112890815A (en) Autism auxiliary evaluation system and method based on deep learning
Drimalla et al. Detecting autism by analyzing a simulated social interaction
US20240050006A1 (en) System and method for prediction and control of attention deficit hyperactivity (adhd) disorders
Kacur et al. An analysis of eye-tracking features and modelling methods for free-viewed standard stimulus: application for schizophrenia detection
Sun et al. A novel deep learning approach for diagnosing Alzheimer's disease based on eye-tracking data
Cook et al. Towards automatic screening of typical and atypical behaviors in children with autism
Bhatia et al. A video-based facial behaviour analysis approach to melancholia
CN113658697B (en) Psychological assessment system based on video fixation difference
Ishikawa et al. Handwriting features of multiple drawing tests for early detection of Alzheimer’s disease: a preliminary result
Lin et al. Looking at the body: Automatic analysis of body gestures and self-adaptors in psychological distress
Li et al. Automatic diagnosis of depression based on facial expression information and deep convolutional neural network
CN108962397B (en) Pen and voice-based cooperative task nervous system disease auxiliary diagnosis system
Rahman et al. Video minor stroke extraction using learning vector quantization
Zuo et al. Deep Learning-based Eye-Tracking Analysis for Diagnosis of Alzheimer's Disease Using 3D Comprehensive Visual Stimuli
Friedman et al. Factors affecting inter-rater agreement in human classification of eye movements: a comparison of three datasets
Hu et al. Detection of paroxysmal atrial fibrillation from dynamic ECG recordings based on a deep learning model
CN116452592B (en) Method, device and system for constructing brain vascular disease AI cognitive function evaluation model
Zhou et al. Gaze Patterns in Children With Autism Spectrum Disorder to Emotional Faces: Scanpath and Similarity
Eisenhauer et al. Context-based facilitation in visual word recognition: Evidence for visual and lexical but not pre-lexical contributions
CN115607159B (en) Depression state identification method and device based on eye movement sequence space-time characteristic analysis
Dupré et al. Emotion recognition in humans and machine using posed and spontaneous facial expression
Bhatia et al. A multimodal system to characterise melancholia: cascaded bag of words approach

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant