CN117373075A - Emotion recognition data set based on eye feature points and eye region segmentation results - Google Patents

Emotion recognition data set based on eye feature points and eye region segmentation results

Info

Publication number
CN117373075A
Authority
CN
China
Prior art keywords
eye
feature points
data set
area
emotion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311135038.5A
Other languages
Chinese (zh)
Inventor
张俊杰
黄荣怀
刘德建
李卓然
费程
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Normal University
Original Assignee
Beijing Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Normal University
Priority to CN202311135038.5A
Publication of CN117373075A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174 Facial expression recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18 Eye characteristics, e.g. of the iris
    • G06V40/193 Preprocessing; Feature extraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18 Eye characteristics, e.g. of the iris
    • G06V40/197 Matching; Classification
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Ophthalmology & Optometry (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Processing (AREA)

Abstract

The invention provides an emotion recognition data set based on eye feature points and eye region segmentation results. Acquisition of the data set comprises the following steps: step one, collecting facial images while a user browses network information and recording the user's emotional changes in real time; step two, performing face detection on the acquired images and segmenting the face region; step three, after obtaining the face region, detecting feature points within it and segmenting the left-eye and right-eye regions according to the relative positions of the feature points; step four, annotating eye feature points on the segmented eye region images with open-source data annotation software; step five, recognizing emotion from the relative positions of the eye feature points, the proportions of the segmented eye regions relative to one another, and differences in the sizes of the regions.

Description

Emotion recognition data set based on eye feature points and eye region segmentation results
Technical Field
The invention relates to the technical fields of image segmentation, feature point localization and emotion recognition within computer vision, and in particular to an emotion recognition data set based on eye feature points and eye region segmentation results.
Background
The eyes are the windows of the mind, and changes in pupil size can reflect changes in a user's health, mental state, emotional fluctuations and cognitive level. In the medical field, pupil status is important for the prevention and diagnosis of disease, for example determining the degree of strabismus by measuring a patient's pupil position, guiding cosmetic correction by locating eye feature points, or making a preliminary diagnosis of poisoning or of certain diseases by observing pupil size. At present, judging the pupil state relies on a doctor's experience-based diagnosis and is therefore highly subjective. In the field of education, analyzing changes in eye state during learning makes it possible to identify students' points of interest. With the development of online education in particular, traditional classrooms are gradually moving online. Online teaching is one-to-many: when there are too many students, a teacher cannot observe every student's learning state in real time. Instead, videos of students in class are collected and analyzed to judge their in-class state and how well they have mastered the material. At present, teachers judge a student's state from facial expressions, but expressions are easy to fake; even a student who cannot understand the material can put on a misleading expression, which affects the teacher's judgment. Compared with facial expressions, the eye state is much harder to disguise, so it allows a more accurate judgment of a student's in-class state. However, the eye region is small and occupies only a small part of the face, so glasses, occlusion and reflections in everyday life increase the difficulty of eye state analysis. In addition, in images captured by an ordinary camera the pupil is close in color to the iris, which further increases the difficulty of eye state analysis.
Although eye-state research has been going on for a considerable time, practical applications remain difficult to achieve. Existing pupil data sets differ greatly from images captured in everyday life; most were collected under ideal conditions. In CASIA.V1 and CASIA.V2, for example, noise factors affecting image quality were removed, the irises are evenly distributed in the images, the eye structure is clear, and image quality is high. In a real environment, however, many factors degrade image quality, so this kind of data cannot be applied in practice. To address this, CASIA.V3 increases data diversity by varying the illumination intensity, and CASIA.V4 considers more factors affecting image quality, such as capturing images while the user is moving, capturing users at different distances from the camera, and low-resolution images. The ICE2005 data set considers eyelash occlusion and strabismus, and uses an external light source during acquisition to brighten the eye region and thus improve image clarity. ICE2006 is an extension of the ICE2005 data set and is likewise assisted by an external light source during acquisition. The MMU1 data set requires the user to be 7 to 25 cm from the camera; restricting the distance in this way yields clear images. Compared with MMU1, MMU2 increases the user-to-camera distance to 47 to 53 cm while also considering the effect of eyelashes and of changes in eye position on data quality. The WVU data set considers still more factors affecting image quality, including strabismus, poor camera focus, ambient light reflections, image rotation and eyelash occlusion. All of these data sets were collected with manual guidance during image acquisition, so relatively few quality-degrading factors remain, and all were captured with infrared cameras. Unlike the infrared-camera data sets, the LPW data set was captured with a head-mounted camera, in both indoor and outdoor scenes, as video of the eyes in motion without light-source assistance; its images cover the effects of glasses, cosmetic contact lenses, make-up and so on. Whether a head-mounted camera or an infrared camera is used, the reliance on professional equipment limits practical application. To get as close as possible to real application scenarios, the UPOL data set uses visible light as an auxiliary source and acquires eye images at close range. The UBIRIS.V1 data set contains motion blur, focus blur, eyelid and eyelash occlusion, closed eyes and so on, and requires the eye-to-camera distance to be less than 50 cm during acquisition. UBIRIS.V2 is the data set closest to real application conditions: eye images are captured with an ordinary camera under auxiliary lighting at distances of 4 to 8 m, and with no constraints placed on the user, image diversity is increased by varying the user-to-camera distance and by capturing images while the user moves.
In 2021, researchers at the University of Tübingen in Germany released TEyeD, the largest public data set of human eye images. It was captured with seven types of eye trackers at different resolutions; the images are clear, but they were acquired with head-mounted cameras, which limits the scope of application.
Analysis of the existing data sets shows that, because of the characteristics of the eye structure, namely that the pupil is small and its color is close to that of the iris, collecting clear eye images for eye state analysis has required professional equipment at short range, and in particular the assistance of an infrared or ordinary light source. In real life, when a user browses information on a portable device such as a computer or mobile phone, the device carries only an ordinary camera, and adding an external light source harms the user experience. Furthermore, there is a lack of data sets that allow a user's emotional state to be analyzed from the eye state alone. Therefore, in view of the shortcomings of the existing data sets, how to analyze a user's emotion and state with an ordinary camera, without any auxiliary light source, purely from changes in the eye state has become a problem to be solved urgently. In addition, no existing data set provides both eye feature point calibration results and eye structure segmentation results for the same images. When training a deep learning model, the more closely the training data matches the data encountered in actual applications, the better the model performs in practice.
Therefore, there is a need to provide an emotion recognition dataset based on eye feature points and eye region segmentation results to solve the above-mentioned technical problems.
Disclosure of Invention
In view of the problems of existing data sets, namely that (1) image acquisition requires professional equipment such as infrared or head-mounted cameras; (2) the clarity of the acquired images is improved with an infrared or ordinary auxiliary light source; (3) a single data set provides only the eye structure segmentation result, without feature point annotations; and (4) there is no data set for emotion analysis based on the eye state alone, the present invention proposes an emotion recognition data set based on eye feature points and eye region segmentation results.
The invention provides an emotion recognition data set based on eye feature points and eye region segmentation results, and the acquisition of the data set comprises the following steps:
step one, collecting facial images while the user browses network information and recording the user's emotional changes in real time;
step two, performing face detection on the acquired images and segmenting the face region;
step three, after obtaining the face region, detecting feature points within it and segmenting the left-eye and right-eye regions according to the relative positions of the feature points;
step four, annotating eye feature points on the segmented eye region images with open-source data annotation software, the annotated regions comprising the pupil, sclera and iris;
step five, recognizing emotion from the relative positions of the eye feature points, the proportions of the segmented eye regions relative to one another, and differences in the sizes of the regions (see the sketch below);
step six, saving the eye structure segmentation results, the feature point annotation results and the emotion categories together to form a data set based on eye feature points and eye structure segmentation results.
Preferably, in step one, an ordinary camera may be used to collect the facial images while the user browses network information.
Preferably, in step two, the face region is detected in the raw acquired face images by a face detection algorithm; suitable algorithms include MTCNN, Dlib and Haar cascades.
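Step five derives emotion from geometric relations among the annotated points and regions. The patent does not fix a formula, so the following Python sketch only illustrates the kind of features that could be computed from the annotations, namely an eye-openness ratio from boundary landmarks and area ratios between the segmented pupil, iris and sclera; the six-point eye layout and the feature names are assumptions, not the invention's prescribed method:

```python
import numpy as np

def eye_aspect_ratio(pts):
    """Eye openness: vertical landmark distances over the horizontal one.
    Assumes six boundary points ordered corner, top, top, corner, bottom, bottom."""
    p = np.asarray(pts, dtype=float)
    v1 = np.linalg.norm(p[1] - p[5])
    v2 = np.linalg.norm(p[2] - p[4])
    h = np.linalg.norm(p[0] - p[3])
    return (v1 + v2) / (2.0 * h)

def polygon_area(poly):
    """Area of a closed polygon via the shoelace formula."""
    p = np.asarray(poly, dtype=float)
    x, y = p[:, 0], p[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, 1)) - np.dot(y, np.roll(x, 1)))

def eye_features(boundary_pts, pupil_poly, iris_poly, sclera_poly):
    """Illustrative feature vector for one eye; a classifier trained on the
    data set would map such features to one of the emotion categories."""
    pupil_a = polygon_area(pupil_poly)
    iris_a = polygon_area(iris_poly)
    sclera_a = polygon_area(sclera_poly)
    return {
        "openness": eye_aspect_ratio(boundary_pts),
        "pupil_to_iris": pupil_a / max(iris_a, 1e-6),
        "iris_to_sclera": iris_a / max(sclera_a, 1e-6),
    }
```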
Compared with the related art, the emotion recognition data set based on the eye feature points and the eye region segmentation result has the following beneficial effects:
according to the emotion recognition data set based on the eye feature points and the eye region segmentation result, the eye structure image is collected by using the low-cost common camera under the condition of no external light source assistance, and the image not only comprises the eye image of a healthy user, but also comprises the eye image of a patient suffering from eye diseases; in addition, the data set simultaneously provides segmentation data and characteristic point position data for the same eye image, and can simultaneously complete two tasks of eye structure segmentation and characteristic point positioning; finally, unlike facial expressions, the eye state is less manually controllable, so that true emotion recognition under low cost is realized based on the change of eye feature point positions and the change of eye structure sizes.
Drawings
FIG. 1 is a schematic diagram of the collection of the emotion recognition data set based on eye feature points and eye region segmentation results;
FIG. 2 is an original acquired image;
FIG. 3 is the face localization result;
FIG. 4 is the facial feature point detection result;
FIG. 5 is the left eye structure segmentation result;
FIG. 6 is the right eye structure segmentation result;
FIG. 7 is an example of feature point annotation;
FIG. 8 is the left eye segmentation result;
FIG. 9 is the right eye segmentation result.
Description of the embodiments
The invention will be further described with reference to the drawings and embodiments.
For the emotion recognition data set based on eye feature points and eye region segmentation results, the data acquisition set-up is shown in FIG. 1. A user sits in front of a computer while a camera collects facial images as the user browses web page information; the user's emotional state changes are recorded at the same time. The emotional states comprise six categories: happiness, anger, confusion, surprise, boredom and fatigue. No human intervention is applied during data acquisition. An acquired image is shown in FIG. 2.
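As a rough illustration of this acquisition step (not the patent's actual tooling), the following Python/OpenCV sketch grabs frames from an ordinary webcam and writes a time-stamped emotion log; the key bindings, file names and CSV layout are assumptions:

```python
import csv
import time
import cv2

# Assumed key bindings for the six recorded emotional states.
EMOTIONS = {"h": "happiness", "a": "anger", "c": "confusion",
            "s": "surprise", "b": "boredom", "f": "fatigue"}

cap = cv2.VideoCapture(0)                     # ordinary webcam, no external light source
log = open("emotion_log.csv", "w", newline="")
writer = csv.writer(log)
writer.writerow(["timestamp", "frame_file", "emotion"])

frame_id, current_emotion = 0, "unlabeled"
while True:
    ok, frame = cap.read()
    if not ok:
        break
    cv2.imshow("capture", frame)
    key = cv2.waitKey(1) & 0xFF
    if key == ord("q"):                        # quit
        break
    if chr(key) in EMOTIONS:                   # annotator marks an emotion change
        current_emotion = EMOTIONS[chr(key)]
    fname = f"frame_{frame_id:06d}.png"
    cv2.imwrite(fname, frame)                  # save the raw frame
    writer.writerow([time.time(), fname, current_emotion])
    frame_id += 1

cap.release()
log.close()
cv2.destroyAllWindows()
```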
After the images are acquired, face detection is performed. Current face detection algorithms are mature; available algorithms include MTCNN, Dlib and Haar cascades. A detected face image is shown in FIG. 3.
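A minimal face detection sketch using the Haar cascade bundled with OpenCV (MTCNN or Dlib could be substituted); the file names are placeholders:

```python
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

img = cv2.imread("frame_000000.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

for i, (x, y, w, h) in enumerate(faces):
    face = img[y:y + h, x:x + w]              # segmented face region (cf. FIG. 3)
    cv2.imwrite(f"face_{i}.png", face)
```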
Next, facial feature points are detected within the detected face region; the detection result is shown in FIG. 4. Because the detected facial feature points have fixed indices, the feature points representing the eye regions can be selected according to the relative positions of the feature points, and the eye regions can thereby be segmented. The eye region segmentation results are shown in FIG. 5 and FIG. 6 respectively.
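This step can be sketched with Dlib's pretrained 68-point facial landmark model (an assumption; any detector with fixed landmark indices would serve). In the 68-point convention, indices 36 to 41 outline one eye and 42 to 47 the other, which allows the two eye regions to be cropped from the face image:

```python
import cv2
import dlib

detector = dlib.get_frontal_face_detector()
# Pretrained 68-point model distributed with the Dlib examples (an assumption).
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

img = cv2.imread("face_0.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

for rect in detector(gray):
    shape = predictor(gray, rect)
    pts = [(shape.part(i).x, shape.part(i).y) for i in range(68)]
    # Crop each eye from the bounding box of its landmarks plus a small margin.
    for name, idx in (("left_eye", range(36, 42)), ("right_eye", range(42, 48))):
        xs = [pts[i][0] for i in idx]
        ys = [pts[i][1] for i in idx]
        m = 10
        crop = img[max(min(ys) - m, 0):max(ys) + m,
                   max(min(xs) - m, 0):max(xs) + m]
        cv2.imwrite(f"{name}.png", crop)
```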
After the eye regions are obtained, the eye feature points are finely calibrated with open-source data annotation software. A total of 41 eye feature points are annotated in each eye region: 16 feature points around the sclera, 12 feature points around the iris, and 13 feature points in the pupil region, the latter comprising 12 points around the pupil and 1 pupil center point. An example annotation result is shown in FIG. 7.
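Assuming the open-source tool produces labelme-style JSON (the tool choice and the label names "sclera", "iris", "pupil" and "pupil_center" are assumptions, not stated in the patent), the 41 points of one eye can be read back and grouped by region as follows:

```python
import json
from collections import defaultdict

def load_eye_points(json_path):
    """Group annotated points by their region label."""
    with open(json_path, encoding="utf-8") as f:
        ann = json.load(f)
    groups = defaultdict(list)
    for shape in ann["shapes"]:
        groups[shape["label"]].extend(shape["points"])
    return groups

groups = load_eye_points("left_eye.json")
for label, pts in groups.items():
    # Expected counts: 16 sclera, 12 iris, 12 pupil boundary, 1 pupil center.
    print(label, len(pts))
```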
After the eye feature points are annotated, the annotation result file is saved, and an eye region segmentation result is generated from the annotation file, with the pupil, iris and sclera regions rendered in different colors. Example eye region segmentation results are shown in FIG. 8 and FIG. 9.
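Generating the colored segmentation image from a saved annotation file can be sketched as below; the labelme-style keys and the particular colors are assumptions (the patent only requires that the pupil, iris and sclera are rendered in different colors):

```python
import json
import numpy as np
import cv2

COLORS = {"sclera": (255, 0, 0), "iris": (0, 255, 0), "pupil": (0, 0, 255)}
DRAW_ORDER = {"sclera": 0, "iris": 1, "pupil": 2}   # outer regions drawn first

def mask_from_annotation(json_path):
    with open(json_path, encoding="utf-8") as f:
        ann = json.load(f)
    mask = np.zeros((ann["imageHeight"], ann["imageWidth"], 3), dtype=np.uint8)
    shapes = sorted(ann["shapes"], key=lambda s: DRAW_ORDER.get(s["label"], 99))
    for shape in shapes:
        if shape["label"] not in COLORS:
            continue                                # skip point-only labels
        poly = np.array(shape["points"], dtype=np.int32)
        cv2.fillPoly(mask, [poly], COLORS[shape["label"]])
    return mask

cv2.imwrite("left_eye_mask.png", mask_from_annotation("left_eye.json"))
```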
Finally, the eye region segmentation results, the feature point annotation files and the emotion change annotation files are saved to form the data set. The data set comprises 4828 raw items, namely 2414 pictures and 2414 json files, together with 35 emotion change annotation files and 2414 eye segmentation results.
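One way to tie the stored items together (a purely illustrative layout; the patent does not prescribe directory or file names) is a simple index that pairs each picture with its json annotation and its segmentation result:

```python
import csv
from pathlib import Path

root = Path("eye_emotion_dataset")                 # assumed directory layout
with open(root / "index.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["image", "annotation_json", "segmentation_mask"])
    for img in sorted((root / "images").glob("*.png")):
        writer.writerow([
            str(img),
            str(root / "annotations" / f"{img.stem}.json"),
            str(root / "masks" / f"{img.stem}.png"),
        ])
```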
It should be noted that, because the eye region is small, emotion recognition accuracy depends on accurately segmenting the eye regions and locating the feature point positions; the collected images therefore contain various noise factors that affect feature point localization and eye structure segmentation, including glasses, cosmetic contact lenses and reflections.
Compared with the related art, the emotion recognition data set based on eye feature points and eye region segmentation results has the following beneficial effects:
Without any auxiliary light source, eye images are acquired with a low-cost ordinary camera while users browse content on devices such as computers, mobile phones and tablets at any time, in any place and in any manner, and the users' emotional states during acquisition are recorded. The images include not only the eyes of healthy users but also the eyes of patients with eye diseases. The finely calibrated data set based on eye feature point movement and eye region segmentation provides not only the users' eye feature points and eye structure segmentation results, but also the users' emotional states corresponding to those segmentation results and feature point changes. A deep model trained on this data set can therefore detect emotional states from the eye region at low cost, and can also provide auxiliary support for corrective eye surgery and for the diagnosis of eye diseases.
The foregoing description is only illustrative of the present invention and is not intended to limit the scope of the invention, and all equivalent structures or equivalent processes or direct or indirect application in other related technical fields are included in the scope of the present invention.

Claims (3)

1. An emotion recognition data set based on eye feature points and eye region segmentation results, wherein the acquisition of the data set comprises the following steps:
step one, collecting facial images while the user browses network information and recording the user's emotional changes in real time;
step two, performing face detection on the acquired images and segmenting the face region;
step three, after obtaining the face region, detecting feature points within it and segmenting the left-eye and right-eye regions according to the relative positions of the feature points;
step four, annotating eye feature points on the segmented eye region images with open-source data annotation software, the annotated regions comprising the pupil, sclera and iris;
step five, recognizing emotion from the relative positions of the eye feature points, the proportions of the segmented eye regions relative to one another, and differences in the sizes of the regions;
step six, saving the eye structure segmentation results, the feature point annotation results and the emotion categories together to form a data set based on eye feature points and eye structure segmentation results.
2. The emotion recognition data set based on eye feature points and eye region segmentation results according to claim 1, wherein in step one an ordinary camera is used to collect the facial images while the user browses network information.
3. The emotion recognition data set based on eye feature points and eye region segmentation results according to claim 1, wherein in step two the face region is detected in the raw acquired face images by a face detection algorithm, the available face detection algorithms including MTCNN, Dlib and Haar cascades.
CN202311135038.5A 2023-09-05 2023-09-05 Emotion recognition data set based on eye feature points and eye region segmentation results Pending CN117373075A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311135038.5A CN117373075A (en) 2023-09-05 2023-09-05 Emotion recognition data set based on eye feature points and eye region segmentation results

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311135038.5A CN117373075A (en) 2023-09-05 2023-09-05 Emotion recognition data set based on eye feature points and eye region segmentation results

Publications (1)

Publication Number Publication Date
CN117373075A true CN117373075A (en) 2024-01-09

Family

ID=89401178

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311135038.5A Pending CN117373075A (en) 2023-09-05 2023-09-05 Emotion recognition data set based on eye feature points and eye region segmentation results

Country Status (1)

Country Link
CN (1) CN117373075A (en)

Similar Documents

Publication Publication Date Title
Garbin et al. Openeds: Open eye dataset
Hammoud Passive eye monitoring: Algorithms, applications and experiments
Rothkopf et al. Task and context determine where you look
JP2022501718A (en) Human / computer interface with fast and accurate tracking of user interactions
CN111933275B (en) Depression evaluation system based on eye movement and facial expression
CN110335266B (en) Intelligent traditional Chinese medicine visual inspection image processing method and device
CN109712710B (en) Intelligent infant development disorder assessment method based on three-dimensional eye movement characteristics
WO2021135557A1 (en) Artificial intelligence multi-mode imaging analysis apparatus
Sharma et al. Eye gaze techniques for human computer interaction: A research survey
CN105095840B (en) Multi-direction upper nystagmus method for extracting signal based on nystagmus image
CN102567734A (en) Specific value based retina thin blood vessel segmentation method
CN110472546B (en) Infant non-contact eye movement feature extraction device and method
Jongerius et al. Eye-tracking glasses in face-to-face interactions: Manual versus automated assessment of areas-of-interest
Edughele et al. Eye-tracking assistive technologies for individuals with amyotrophic lateral sclerosis
Okada et al. Advertisement effectiveness estimation based on crowdsourced multimodal affective responses
Sangeetha A survey on deep learning based eye gaze estimation methods
Chen Cognitive load measurement from eye activity: acquisition, efficacy, and real-time system design
EP4325517A1 (en) Methods and devices in performing a vision testing procedure on a person
Albright et al. Visual neuroscience for architecture: Seeking a new evidence‐based approach to design
CN108495584B (en) Apparatus and method for determining eye movement through a haptic interface
CN117373075A (en) Emotion recognition data set based on eye feature points and eye region segmentation results
Chaudhary et al. : From real infrared eye-images to synthetic sequences of gaze behavior
Oyekoya Eye tracking: A perceptual interface for content based image retrieval
Barbieri et al. Realter: An Immersive Simulator to Support Low-Vision Rehabilitation
Kim et al. An Affective Situation Labeling System from Psychological Behaviors in Emotion Recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination