WO2021248814A1 - Robust visual supervision method and device for children's learning state at home - Google Patents

Robust visual supervision method and device for children's learning state at home

Info

Publication number
WO2021248814A1
WO2021248814A1 (PCT/CN2020/128882)
Authority
WO
WIPO (PCT)
Prior art keywords
facial
data
feature points
detection module
key
Prior art date
Application number
PCT/CN2020/128882
Other languages
English (en)
French (fr)
Inventor
李龙
宋恒
赵丹
崔修涛
Original Assignee
德派(嘉兴)医疗器械有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 德派(嘉兴)医疗器械有限公司 filed Critical 德派(嘉兴)医疗器械有限公司
Publication of WO2021248814A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/20 Education
    • G06Q50/205 Education administration or guidance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/30 Noise filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G06V40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification

Definitions

  • The present invention relates to the technical field of computer vision processing, and in particular to a robust method and device for visually supervising the learning state of children at home.
  • Chinese patent application CN110867105A discloses a home learning supervision method and system based on edge computing, proposing an approach that involves computer vision but without describing a concrete implementation of facial state and behavior analysis.
  • Chinese patent application CN110197169A discloses a non-contact learning state monitoring system and a learning state detection method, proposing a facial state analysis method based on the Dlib computer vision toolkit, which detects facial feature points with a cascaded decision-tree algorithm.
  • The facial posture and attention direction are then computed geometrically, so the accuracy depends entirely on the accuracy of feature-point detection; the method is easily affected by illumination and posture changes, and has poor robustness and limited practicality.
  • The object of the present invention is to provide a robust method and device for supervising children's learning state at home that processes quickly and performs stable, reliable facial behavior analysis under complex illumination and large-posture conditions.
  • A robust visual supervision method for children's learning state at home comprises the following steps:
  • The feature detection module judges from the key feature points whether the face belongs to the monitored subject;
  • if yes, the method proceeds to step S3, and deep-learning thermal detection is performed on the data of the ROI region to obtain facial thermal information;
  • In step S1, the face data is collected by edge-AI extraction, and the key feature points correspond to the eyes, nose tip, mouth, and facial contour.
  • In step S3, the video frames containing the key feature points are cropped, scaled, filtered, denoised, histogram-equalized and gray-balanced, and converted into normalized standard images;
  • the standard images are then segmented by facial organ region to obtain the facial key-point data.
  • In step S4, the ROI region in frame t+1 is obtained from the position coordinates of the facial key-point data in frame t.
  • In step S6, an attention mechanism is used to repeatedly compare the details of the recognized object, improving the precision of the comparison.
  • When the resolution of the facial key-point data and the facial thermal information is insufficient for effective comparison with the corresponding data in the standard feature database, their images may, before comparison, be reconstructed into high-resolution images on an end-to-end principle and then output.
  • An LSTM classifier is used to classify the detection data of the different facial parts.
  • A robust visual supervision device for children's learning state at home comprises a data acquisition module, a feature detection module, a feature-of-interest detection module, a thermal image detection module, an algorithm module, a quantitative analysis module, and a standard feature database;
  • the data acquisition module collects face data, extracts multiple key feature points, and submits the key feature points to the feature detection module in temporal order;
  • the feature detection module judges from the key feature points whether the face belongs to the monitored subject, and sends qualifying data to the feature-of-interest detection module and the thermal image detection module;
  • the feature-of-interest detection module performs separate detection for the different key feature points to obtain the facial key-point data of the monitored subject, and the algorithm module infers, from each separated key feature point, the ROI region associated with that key feature point in the next frame;
  • the algorithm module self-checks the ROI region to judge whether it is the face of the monitored subject; if so, it sends the ROI region to the feature-of-interest detection module to continue detection, and if not, it interrupts the separate detection of the feature-of-interest detection module;
  • the thermal image detection module performs thermal detection on the data of the ROI region to obtain facial thermal information;
  • the quantitative analysis module acquires the facial key-point data and the facial thermal information in real time, integrates and classifies them, and compares them with the corresponding data in the standard feature database to obtain a quantified evaluation of the learning state.
  • The present invention provides at least one of the following beneficial technical effects:
  • By pre-segmenting ROI regions from the captured images for fine-grained detection and recognition, the amount of input data is reduced on the one hand and the problem is simplified on the other, improving the efficiency and speed of the processing pipeline; combined with deep-learning heat-map detection, ROI tracking, and the associated filtering and denoising, this improves the system's robustness to illumination and posture changes and raises the accuracy of facial recognition.
  • Fig. 1 is a block diagram of the method according to an embodiment of the present invention;
  • Fig. 2 is a detailed processing flow diagram of an embodiment of the present invention.
  • Referring to Fig. 1, the present invention discloses a robust visual supervision method for children's learning state at home, comprising the following steps:
  • S1. Collect face data at a preset frequency, extract multiple key feature points, and submit the key feature points to the feature detection module in temporal order;
  • The feature detection module judges from the key feature points whether the face belongs to the monitored subject;
  • if yes, the method proceeds to step S3, and deep-learning thermal detection is performed on the data of the ROI region to obtain facial thermal information;
  • In this embodiment, a generative adversarial network is first trained on sample data, in four steps: acquiring sample data, preprocessing the training samples, illumination-adversarial training of the generative adversarial network, and pose-adversarial training of the generative adversarial network.
  • In the sample-data acquisition step, face images under various illuminations and angles are required as sample data.
  • The face images under 13 poses and 20 illumination conditions in CMU Multi-PIE are used as the training data set; to facilitate subsequent model training, each sample image is first normalized.
  • In the training-sample preprocessing step, this embodiment detects facial key points with the MTCNN method and selects the left eye, right eye, nose, left mouth corner, and right mouth corner as five key points; the key-point coordinates are saved to a text file together with the image path and label, and are used during training to generate the heatmaps of the corresponding key points for training and testing.
  • In the illumination-adversarial training step, an image and a target illumination label are chosen from the sample data as input to the illumination generator, which outputs a target-illumination image; the target-illumination image and the original illumination label are then fed into the illumination generator again to obtain a fake original-illumination image.
  • The discriminator feeds the error between the real image and the fake original-illumination image back to the illumination generator, and the identity classifier and illumination classifier respectively feed back the errors in identity information and illumination information between the target face image and the generated image; the illumination generator, discriminator, and classifiers are trained iteratively.
  • In step S1, the face data is collected by edge-AI extraction, and the key feature points correspond to the eyes, nose tip, mouth, and facial contour.
  • In step S3, the video frames containing the key feature points are cropped, scaled, filtered, denoised, histogram-equalized and gray-balanced, and converted into normalized standard images;
  • the standard images are then segmented by facial organ region to obtain the facial key-point data.
  • In step S4, the ROI region in frame t+1 is obtained from the position coordinates of the facial key-point data in frame t.
  • In step S6, an attention mechanism is used to repeatedly compare the details of the recognized object, improving the precision of the comparison.
  • When the resolution of the facial key-point data and the facial thermal information is insufficient for effective comparison with the corresponding data in the standard feature database, their images may, before comparison, be reconstructed into high-resolution images on an end-to-end principle and then output.
  • An LSTM classifier is used to classify the detection data of the different facial parts.
  • A robust visual supervision device for children's learning state at home comprises a data acquisition module, a feature detection module, a feature-of-interest detection module, a thermal image detection module, an algorithm module, a quantitative analysis module, and a standard feature database;
  • the data acquisition module collects face data, extracts multiple key feature points, and submits the key feature points to the feature detection module in temporal order;
  • the feature detection module judges from the key feature points whether the face belongs to the monitored subject, and sends qualifying data to the feature-of-interest detection module and the thermal image detection module;
  • the feature-of-interest detection module performs separate detection for the different key feature points to obtain the facial key-point data of the monitored subject, and the algorithm module infers, from each separated key feature point, the ROI region associated with that key feature point in the next frame;
  • the algorithm module self-checks the ROI region to judge whether it is the face of the monitored subject; if so, it sends the ROI region to the feature-of-interest detection module to continue detection, and if not, it interrupts the separate detection of the feature-of-interest detection module;
  • the thermal image detection module performs thermal detection on the data of the ROI region to obtain facial thermal information;
  • the quantitative analysis module acquires the facial key-point data and facial thermal information in real time, integrates and classifies them, and compares them with the corresponding data in the standard feature database to obtain a quantified evaluation of the learning state. During comparison, precise tuning for a specific individual can be added, for example in eyeball-state detection:
  • the eyeball model in the standard feature database can be reconstructed from the eyeball structure of the current monitored subject, improving the accuracy of eyeball-state detection.
  • The present invention provides at least one of the following beneficial technical effects:
  • By pre-segmenting ROI regions from the captured images for fine-grained detection and recognition, the amount of input data is reduced on the one hand and the problem is simplified on the other, improving the efficiency and speed of the processing pipeline; combined with deep-learning heat-map detection, ROI tracking, and the associated filtering and denoising, this improves the system's robustness to illumination and posture changes and raises the accuracy of facial recognition.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Multimedia (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Strategic Management (AREA)
  • Educational Administration (AREA)
  • Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Tourism & Hospitality (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Human Computer Interaction (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Educational Technology (AREA)
  • Development Economics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Business, Economics & Management (AREA)
  • Library & Information Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Primary Health Care (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

A robust visual supervision method for children's learning state at home. ROI regions are segmented from the captured images for fine-grained detection and recognition, which reduces the amount of input data and improves processing efficiency and speed; combined with deep learning, geometry-fused detection, ROI tracking, and filtering and denoising, this improves robustness against illumination and posture changes, thereby raising the accuracy of facial recognition.

Description

Robust visual supervision method and device for children's learning state at home
Technical Field
The present invention relates to the technical field of computer vision processing, and in particular to a robust visual supervision method and device for children's learning state at home.
Background Art
As China continues to develop, children's education receives growing attention from society, and learning has expanded from school-only study to online study at home, offline study at home, and other modes. However, children generally lack self-discipline, parents' energy is limited, and teachers can hardly supervise home study, so learning efficiency is often low.
There are two existing ways to supervise students in online teaching. One is contact-based: its detection is relatively accurate, but it requires sensors in direct contact with the child, which interferes with learning to some extent. The other is non-contact: the child's outward behavior and internal physiological changes are observed through a camera.
For example, Chinese patent application CN110867105A discloses a home learning supervision method and system based on edge computing, proposing an approach involving computer vision, but it does not describe a concrete implementation of facial state and behavior analysis. Chinese patent application CN110197169A discloses a non-contact learning state monitoring system and a learning state detection method, proposing a facial state analysis method based on the Dlib computer vision toolkit: a cascaded decision-tree algorithm detects facial feature points, and facial posture and attention direction are computed geometrically. Its accuracy depends entirely on the accuracy of feature-point detection, it is easily affected by illumination, posture changes and similar conditions, and it has poor robustness and limited practicality.
Summary of the Invention
In view of the shortcomings of the prior art, the object of the present invention is to provide a robust method and device for supervising children's learning state at home that processes quickly and performs stable, reliable facial behavior analysis under complex illumination and large-posture conditions.
The above object of the present invention is achieved by the following technical solution:
A robust visual supervision method for children's learning state at home, comprising the following steps:
S1. Collect face data at a preset frequency, extract multiple key feature points, and submit the key feature points to a feature detection module in temporal order;
S2. The feature detection module judges from the key feature points whether the face belongs to the monitored subject;
if yes, proceed to step S3;
if not, return to step S1;
S3. Separate the key feature points of the different facial regions according to the needs of facial recognition, obtaining multiple groups of facial key-point data;
S4. From the facial key-point data of the current frame, infer the region in the next frame where the corresponding key feature points will lie, and define this region as the ROI region;
S5. Self-check the ROI region to judge whether it is the face of the monitored subject;
if yes, proceed to step S3, and perform deep-learning thermal detection on the data of the ROI region to obtain facial thermal information;
if not, return to step S1;
S6. A quantitative analysis module acquires the facial key-point data and the facial thermal information in real time, integrates and classifies them, and compares them with the corresponding data in a standard feature database to obtain a quantified evaluation of the learning state.
In step S1, the face data is collected by edge-AI extraction, and the key feature points correspond to the eyes, nose tip, mouth, and facial contour.
In step S3, the video frames containing the key feature points are cropped, scaled, filtered, denoised, histogram-equalized and gray-balanced, and converted into normalized standard images;
the standard images are then segmented by facial organ region to obtain the facial key-point data.
In step S4, the ROI region in frame t+1 is obtained from the position coordinates of the facial key-point data in frame t.
In step S6, an attention mechanism is used to repeatedly compare the details of the recognized object, improving the precision of the comparison.
When the resolution of the facial key-point data and the facial thermal information is insufficient for effective comparison with the corresponding data in the standard feature database, their images may, before comparison, be reconstructed into high-resolution images on an end-to-end principle and then output.
An LSTM classifier is used to classify the detection data of the different facial parts.
A robust visual supervision device for children's learning state at home, comprising a data acquisition module, a feature detection module, a feature-of-interest detection module, a thermal image detection module, an algorithm module, a quantitative analysis module, and a standard feature database;
the data acquisition module collects face data, extracts multiple key feature points, and submits the key feature points to the feature detection module in temporal order;
the feature detection module judges from the key feature points whether the face belongs to the monitored subject, and sends qualifying data to the feature-of-interest detection module and the thermal image detection module;
the feature-of-interest detection module performs separate detection for the different key feature points to obtain the facial key-point data of the monitored subject, and the algorithm module infers, from each separated key feature point, the ROI region associated with that key feature point in the next frame;
the algorithm module self-checks the ROI region to judge whether it is the face of the monitored subject; if so, it sends the ROI region to the feature-of-interest detection module to continue detection, and if not, it interrupts the separate detection of the feature-of-interest detection module;
the thermal image detection module performs thermal detection on the data of the ROI region to obtain facial thermal information;
the quantitative analysis module acquires the facial key-point data and the facial thermal information in real time, integrates and classifies them, and compares them with the corresponding data in the standard feature database to obtain a quantified evaluation of the learning state.
In summary, the present invention provides at least one of the following beneficial technical effects:
By pre-segmenting ROI regions from the captured images for fine-grained detection and recognition, the amount of input data is reduced on the one hand and the problem is simplified on the other, improving the efficiency and speed of the processing pipeline; combined with deep-learning heat-map detection, ROI tracking, and the associated filtering and denoising, this improves the system's robustness to illumination and posture changes and raises the accuracy of facial recognition.
Brief Description of the Drawings
Fig. 1 is a block diagram of the method according to an embodiment of the present invention;
Fig. 2 is a detailed processing flow diagram of an embodiment of the present invention.
Detailed Description of the Embodiments
The present invention is described in further detail below with reference to the drawings.
Referring to Fig. 1, the present invention discloses a robust visual supervision method for children's learning state at home, comprising the following steps:
S1. Collect face data at a preset frequency, extract multiple key feature points, and submit the key feature points to the feature detection module in temporal order;
S2. The feature detection module judges from the key feature points whether the face belongs to the monitored subject;
if yes, proceed to step S3;
if not, return to step S1;
S3. Separate the key feature points of the different facial regions according to the needs of facial recognition, obtaining multiple groups of facial key-point data;
S4. From the facial key-point data of the current frame, infer the region in the next frame where the corresponding key feature points will lie, and define this region as the ROI region;
S5. Self-check the ROI region to judge whether it is the face of the monitored subject;
if yes, proceed to step S3, and perform deep-learning thermal detection on the data of the ROI region to obtain facial thermal information;
In this embodiment, a generative adversarial network must first be trained on sample data, in four steps: acquiring sample data, preprocessing the training samples, illumination-adversarial training of the generative adversarial network, and pose-adversarial training of the generative adversarial network.
In the sample-data acquisition step, face images under various illuminations and angles are required as sample data; this embodiment uses the face images under 13 poses and 20 illumination conditions from CMU Multi-PIE as the training data set. To facilitate subsequent model training, each sample image is first normalized.
In the training-sample preprocessing step, this embodiment detects facial key points with the MTCNN method and selects the left eye, right eye, nose, left mouth corner, and right mouth corner as five key points; the key-point coordinates are saved to a text file together with the image path and label, and are used during training to generate the heatmaps of the corresponding key points for training and testing.
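For concreteness, the heatmap step can be sketched as follows: each saved key-point coordinate is rendered as a small Gaussian peak on an otherwise empty map. This is a minimal illustration; the map size and the Gaussian width sigma are assumptions, not values given in this disclosure.

```python
import numpy as np

def keypoint_heatmaps(keypoints, height, width, sigma=3.0):
    """Render one Gaussian heatmap per (x, y) key point, e.g. for the five
    points named above (left eye, right eye, nose, left/right mouth corner).
    Returns an array of shape (num_keypoints, height, width)."""
    ys, xs = np.mgrid[0:height, 0:width]
    maps = []
    for (x, y) in keypoints:
        d2 = (xs - x) ** 2 + (ys - y) ** 2
        maps.append(np.exp(-d2 / (2.0 * sigma ** 2)))
    return np.stack(maps)

# Hypothetical usage with coordinates as they would be read from the text file:
heatmaps = keypoint_heatmaps([(30, 40), (70, 40), (50, 60), (35, 80), (65, 80)],
                             height=128, width=128)
```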
In the illumination-adversarial training step, an image and a target illumination label are chosen from the sample data as input to the illumination generator, which outputs a target-illumination image; the target-illumination image and the original illumination label are then fed into the illumination generator again to obtain a fake original-illumination image. The discriminator feeds the error between the real image and the fake original-illumination image back to the illumination generator, and the identity classifier and illumination classifier respectively feed back the errors in identity information and illumination information between the target face image and the generated image; the illumination generator, discriminator, and classifiers are trained iteratively.
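One iteration of this illumination-adversarial cycle can be sketched as below. All module names (G, D, C_id, C_light) are placeholders for the illumination generator, discriminator, identity classifier, and illumination classifier; the equal loss weighting is an assumption, not a detail of this disclosure.

```python
import torch
import torch.nn.functional as F

def illumination_step(G, D, C_id, C_light, x, light_src, light_tgt, identity,
                      opt_g, opt_d):
    """One training step: x -> target illumination -> back to the original
    illumination (the 'fake original-illumination image')."""
    fake_tgt = G(x, light_tgt)            # generated image under target light
    fake_src = G(fake_tgt, light_src)     # fake original-illumination image

    # Discriminator: real image vs. fake original-illumination image.
    opt_d.zero_grad()
    real_logit, fake_logit = D(x), D(fake_src.detach())
    d_loss = (F.binary_cross_entropy_with_logits(real_logit, torch.ones_like(real_logit))
              + F.binary_cross_entropy_with_logits(fake_logit, torch.zeros_like(fake_logit)))
    d_loss.backward()
    opt_d.step()

    # Generator: fool the discriminator, preserve identity, match the target
    # illumination, and reconstruct the original image.
    opt_g.zero_grad()
    adv_logit = D(fake_src)
    g_loss = (F.binary_cross_entropy_with_logits(adv_logit, torch.ones_like(adv_logit))
              + F.cross_entropy(C_id(fake_tgt), identity)
              + F.cross_entropy(C_light(fake_tgt), light_tgt)
              + F.l1_loss(fake_src, x))
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```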
If not, return to step S1;
S6. The quantitative analysis module acquires the facial key-point data and facial thermal information in real time, integrates and classifies them, and compares them with the corresponding data in the standard feature database to obtain a quantified evaluation of the learning state.
In step S1, the face data is collected by edge-AI extraction, and the key feature points correspond to the eyes, nose tip, mouth, and facial contour.
In step S3, the video frames containing the key feature points are cropped, scaled, filtered, denoised, histogram-equalized and gray-balanced, and converted into normalized standard images;
the standard images are then segmented by facial organ region to obtain the facial key-point data.
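A minimal sketch of this normalization chain using OpenCV is given below; the kernel size, target resolution, and specific function choices are assumptions about one plausible implementation, not parameters stated in this disclosure.

```python
import cv2
import numpy as np

def normalize_frame(frame, box, size=(128, 128)):
    """Crop a face box (x, y, w, h) from a BGR video frame and convert it
    into a normalized gray-scale 'standard image'."""
    x, y, w, h = box
    face = frame[y:y + h, x:x + w]                      # crop
    face = cv2.resize(face, size)                       # scale
    gray = cv2.cvtColor(face, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (3, 3), 0)            # filter / denoise
    gray = cv2.equalizeHist(gray)                       # histogram equalization
    gray = cv2.normalize(gray, None, 0, 255, cv2.NORM_MINMAX)  # gray-level balance
    return gray.astype(np.float32) / 255.0              # normalized standard image
```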
In step S4, the ROI region in frame t+1 is obtained from the position coordinates of the facial key-point data in frame t.
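One simple way to realize this propagation is to take the bounding box of the frame-t key points and expand it by a motion margin; the margin value below is an assumption.

```python
import numpy as np

def roi_for_next_frame(keypoints_t, frame_shape, margin=0.3):
    """Predict the ROI in frame t+1 from key-point coordinates in frame t:
    the bounding box of the points, expanded to tolerate inter-frame motion.
    Returns (x0, y0, x1, y1) clipped to the frame of shape (h, w)."""
    pts = np.asarray(keypoints_t, dtype=np.float32)
    x0, y0 = pts.min(axis=0)
    x1, y1 = pts.max(axis=0)
    mx, my = (x1 - x0) * margin, (y1 - y0) * margin
    h, w = frame_shape
    return (max(0, int(x0 - mx)), max(0, int(y0 - my)),
            min(w - 1, int(x1 + mx)), min(h - 1, int(y1 + my)))
```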
In step S6, an attention mechanism is used to repeatedly compare the details of the recognized object, improving the precision of the comparison.
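The disclosure does not spell out the attention mechanism; as one hedged reading, per-region similarities can be combined with attention weights so that the more discriminative regions dominate the comparison. The weights and the cosine similarity below are illustrative assumptions.

```python
import numpy as np

def attention_compare(query_feats, ref_feats, weights):
    """Compare per-region feature vectors (eyes, nose, mouth, ...) of the
    detected face against a standard-feature-database entry, weighting each
    region's cosine similarity by an attention weight."""
    sims = [float(np.dot(q, r) / (np.linalg.norm(q) * np.linalg.norm(r) + 1e-8))
            for q, r in zip(query_feats, ref_feats)]
    w = np.asarray(weights, dtype=np.float32)
    return float(np.dot(w / w.sum(), sims))
```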
When the resolution of the facial key-point data and facial thermal information is insufficient for effective comparison with the corresponding data in the standard feature database, their images may, before comparison, be reconstructed into high-resolution images on an end-to-end principle and then output.
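The end-to-end reconstruction can be illustrated with an SRCNN-style network: a low-resolution input goes in and a sharpened high-resolution image comes out of a single trainable mapping. The three-layer architecture is an illustrative assumption; the disclosure names only the end-to-end principle.

```python
import torch.nn as nn

class TinySR(nn.Module):
    """SRCNN-style sketch: maps a (bicubically pre-upscaled) single-channel
    low-resolution image directly to a sharpened high-resolution image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=9, padding=4), nn.ReLU(),
            nn.Conv2d(64, 32, kernel_size=1), nn.ReLU(),
            nn.Conv2d(32, 1, kernel_size=5, padding=2),
        )

    def forward(self, x):          # x: (batch, 1, H, W)
        return self.net(x)
```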
An LSTM classifier is used to classify the detection data of the different facial parts.
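A sketch of how per-frame detection data from different facial parts could feed an LSTM classifier follows; the feature dimension, hidden size, and number of state classes are assumptions.

```python
import torch
import torch.nn as nn

class FacePartLSTM(nn.Module):
    """Classifies a sequence of per-frame facial-part detection features
    into a small set of learning-state classes."""
    def __init__(self, feat_dim=32, hidden=64, num_classes=4):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, seq):               # seq: (batch, frames, feat_dim)
        out, _ = self.lstm(seq)
        return self.head(out[:, -1])      # classify from the last time step

# Hypothetical usage: a batch of 8 clips, 30 frames each, 32 features per frame.
logits = FacePartLSTM()(torch.randn(8, 30, 32))
```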
A robust visual supervision device for children's learning state at home comprises a data acquisition module, a feature detection module, a feature-of-interest detection module, a thermal image detection module, an algorithm module, a quantitative analysis module, and a standard feature database;
the data acquisition module collects face data, extracts multiple key feature points, and submits the key feature points to the feature detection module in temporal order;
the feature detection module judges from the key feature points whether the face belongs to the monitored subject, and sends qualifying data to the feature-of-interest detection module and the thermal image detection module;
the feature-of-interest detection module performs separate detection for the different key feature points to obtain the facial key-point data of the monitored subject, and the algorithm module infers, from each separated key feature point, the ROI region associated with that key feature point in the next frame;
the algorithm module self-checks the ROI region to judge whether it is the face of the monitored subject; if so, it sends the ROI region to the feature-of-interest detection module to continue detection, and if not, it interrupts the separate detection of the feature-of-interest detection module;
the thermal image detection module performs thermal detection on the data of the ROI region to obtain facial thermal information;
the quantitative analysis module acquires the facial key-point data and facial thermal information in real time, integrates and classifies them, and compares them with the corresponding data in the standard feature database to obtain a quantified evaluation of the learning state. During comparison, precise tuning for a specific individual can be added, for example in eyeball-state detection: the eyeball model in the standard feature database can be reconstructed from the eyeball structure of the current monitored subject, improving the accuracy of eyeball-state detection.
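To summarize the data flow between these modules, the following sketch wires them into one processing loop; every class and method name is a placeholder for the modules described above, not an interface defined by this disclosure.

```python
class SupervisionDevice:
    """Hypothetical composition of the seven components described above."""
    def __init__(self, acquisition, feature_det, interest_det,
                 thermal_det, algorithm, analysis, database):
        self.acq, self.fd, self.roi_det = acquisition, feature_det, interest_det
        self.thermal, self.algo = thermal_det, algorithm
        self.qa, self.db = analysis, database

    def step(self):
        keypoints = self.acq.collect()                   # data acquisition
        if not self.fd.is_monitored_subject(keypoints):  # feature detection
            return None
        kp_data = self.roi_det.detect(keypoints)         # per-region detection
        roi = self.algo.predict_roi(kp_data)             # next-frame ROI
        if not self.algo.self_check(roi):                # ROI self-inspection
            self.roi_det.interrupt()                     # stop separate detection
            return None
        heat = self.thermal.detect(roi)                  # facial thermal info
        return self.qa.evaluate(kp_data, heat, self.db)  # quantified evaluation
```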
In summary, the present invention provides at least one of the following beneficial technical effects:
By pre-segmenting ROI regions from the captured images for fine-grained detection and recognition, the amount of input data is reduced on the one hand and the problem is simplified on the other, improving the efficiency and speed of the processing pipeline; combined with deep-learning heat-map detection, ROI tracking, and the associated filtering and denoising, this improves the system's robustness to illumination and posture changes and raises the accuracy of facial recognition.
The embodiments described above are preferred embodiments of the present invention and do not limit its scope of protection; accordingly, all equivalent changes made according to the structure, shape, and principle of the present invention shall fall within the scope of protection of the present invention.

Claims (8)

  1. A robust visual supervision method for children's learning state at home, characterized by comprising the following steps:
    S1. Collect face data at a preset frequency, extract multiple key feature points, and submit the key feature points to a feature detection module in temporal order;
    S2. The feature detection module judges from the key feature points whether the face belongs to the monitored subject;
    if yes, proceed to step S3;
    if not, return to step S1;
    S3. Separate the key feature points of the different facial regions according to the needs of facial recognition, obtaining multiple groups of facial key-point data;
    S4. From the facial key-point data of the current frame, infer the region in the next frame where the corresponding key feature points will lie, and define this region as the ROI region;
    S5. Self-check the ROI region to judge whether it is the face of the monitored subject;
    if yes, proceed to step S3, and perform deep-learning thermal detection on the data of the ROI region to obtain facial thermal information;
    if not, return to step S1;
    S6. A quantitative analysis module acquires the facial key-point data and the facial thermal information in real time, integrates and classifies them, and compares them with the corresponding data in a standard feature database to obtain a quantified evaluation of the learning state.
  2. The robust visual supervision method for children's learning state at home according to claim 1, characterized in that: in step S1, the face data is collected by edge-AI extraction, and the key feature points correspond to the eyes, nose tip, mouth, and facial contour.
  3. The robust visual supervision method for children's learning state at home according to claim 1, characterized in that: in step S3, the video frames containing the key feature points are cropped, scaled, filtered, denoised, histogram-equalized and gray-balanced, and converted into normalized standard images;
    the standard images are then segmented by facial organ region to obtain the facial key-point data.
  4. The robust visual supervision method for children's learning state at home according to claim 3, characterized in that: in step S4, the ROI region in frame t+1 is obtained from the position coordinates of the facial key-point data in frame t.
  5. The robust visual supervision method for children's learning state at home according to claim 1, characterized in that: in step S6, an attention mechanism is used to repeatedly compare the details of the recognized object, improving the precision of the comparison.
  6. The robust visual supervision method for children's learning state at home according to claim 5, characterized in that: when the resolution of the facial key-point data and the facial thermal information is insufficient for effective comparison with the corresponding data in the standard feature database, the images of the facial key-point data and the facial thermal information may, before comparison, be reconstructed into high-resolution images on an end-to-end principle and then output.
  7. The robust visual supervision method for children's learning state at home according to claim 6, characterized in that: an LSTM classifier is used to classify the detection data of the different facial parts.
  8. A robust visual supervision device for children's learning state at home, characterized by comprising a data acquisition module, a feature detection module, a feature-of-interest detection module, a thermal image detection module, an algorithm module, a quantitative analysis module, and a standard feature database;
    the data acquisition module collects face data, extracts multiple key feature points, and submits the key feature points to the feature detection module in temporal order;
    the feature detection module judges from the key feature points whether the face belongs to the monitored subject, and sends qualifying data to the feature-of-interest detection module and the thermal image detection module;
    the feature-of-interest detection module performs separate detection for the different key feature points to obtain the facial key-point data of the monitored subject, and the algorithm module infers, from each separated key feature point, the ROI region associated with that key feature point in the next frame;
    the algorithm module self-checks the ROI region to judge whether it is the face of the monitored subject; if so, it sends the ROI region to the feature-of-interest detection module to continue detection, and if not, it interrupts the separate detection of the feature-of-interest detection module;
    the thermal image detection module performs thermal detection on the data of the ROI region to obtain facial thermal information;
    the quantitative analysis module acquires the facial key-point data and the facial thermal information in real time, integrates and classifies them, and compares them with the corresponding data in the standard feature database to obtain a quantified evaluation of the learning state.
PCT/CN2020/128882 2020-06-13 2020-11-15 Robust visual supervision method and device for children's learning state at home WO2021248814A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010538607.0A CN111694980A (zh) 2020-06-13 2020-06-13 Robust visual supervision method and device for children's learning state at home
CN202010538607.0 2020-06-13

Publications (1)

Publication Number Publication Date
WO2021248814A1 (zh)

Family

ID=72480855

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/128882 WO2021248814A1 (zh) 2020-06-13 2020-11-15 Robust visual supervision method and device for children's learning state at home

Country Status (2)

Country Link
CN (1) CN111694980A (zh)
WO (1) WO2021248814A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114373535A (zh) * 2022-01-13 2022-04-19 刘威 Internet-based novel doctor-patient mechanism system

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111694980A (zh) * 2020-06-13 2020-09-22 德沃康科技集团有限公司 一种鲁棒的家庭儿童学习状态视觉监督方法及装置

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1794265A (zh) * 2005-12-31 2006-06-28 北京中星微电子有限公司 Video-based facial expression recognition method and device
CN109299685A (zh) * 2018-09-14 2019-02-01 北京航空航天大学青岛研究院 Inference network for 3D coordinate estimation of human joints and method thereof
CN109472198A (zh) * 2018-09-28 2019-03-15 武汉工程大学 Pose-robust video smile recognition method
CN111046825A (zh) * 2019-12-19 2020-04-21 杭州晨鹰军泰科技有限公司 Human posture recognition method, device and system, and computer-readable storage medium
CN111160085A (zh) * 2019-11-19 2020-05-15 天津中科智能识别产业技术研究院有限公司 Key-point pose estimation method for human images
CN111694980A (zh) * 2020-06-13 2020-09-22 德沃康科技集团有限公司 Robust visual supervision method and device for children's learning state at home

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106355138A (zh) * 2016-08-18 2017-01-25 电子科技大学 Face recognition method based on deep learning and key-point feature extraction
CN107944415A (zh) * 2017-12-06 2018-04-20 董伟 Human-eye attention detection method based on a deep learning algorithm
CN109271848B (zh) * 2018-08-01 2022-04-15 深圳市天阿智能科技有限责任公司 Face detection method, face detection device, and storage medium
CN110287895B (zh) * 2019-04-17 2021-08-06 北京阳光易德科技股份有限公司 Method for emotion measurement based on a convolutional neural network
CN110197169B (zh) * 2019-06-05 2022-08-26 南京邮电大学 Non-contact learning state monitoring system and learning state detection method
CN110459304A (zh) * 2019-07-19 2019-11-15 汕头大学 Health state diagnosis system based on facial images

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1794265A (zh) * 2005-12-31 2006-06-28 北京中星微电子有限公司 Video-based facial expression recognition method and device
CN109299685A (zh) * 2018-09-14 2019-02-01 北京航空航天大学青岛研究院 Inference network for 3D coordinate estimation of human joints and method thereof
CN109472198A (zh) * 2018-09-28 2019-03-15 武汉工程大学 Pose-robust video smile recognition method
CN111160085A (zh) * 2019-11-19 2020-05-15 天津中科智能识别产业技术研究院有限公司 Key-point pose estimation method for human images
CN111046825A (zh) * 2019-12-19 2020-04-21 杭州晨鹰军泰科技有限公司 Human posture recognition method, device and system, and computer-readable storage medium
CN111694980A (zh) * 2020-06-13 2020-09-22 德沃康科技集团有限公司 Robust visual supervision method and device for children's learning state at home

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114373535A (zh) * 2022-01-13 2022-04-19 刘威 Internet-based novel doctor-patient mechanism system

Also Published As

Publication number Publication date
CN111694980A (zh) 2020-09-22

Similar Documents

Publication Publication Date Title
Konstantinidis et al. Sign language recognition based on hand and body skeletal data
Agarwal et al. Learning to detect objects in images via a sparse, part-based representation
CN105138954B (zh) Automatic image screening, query and recognition system
WO2021248815A1 (zh) High-precision method and device for detecting and correcting children's sitting posture
CN111563452B (zh) Multi-person posture detection and state discrimination method based on instance segmentation
US8977010B2 (en) Method for discriminating between a real face and a two-dimensional image of the face in a biometric detection process
CN109101865A (zh) 一种基于深度学习的行人重识别方法
CN103279768B (zh) 一种基于增量学习人脸分块视觉表征的视频人脸识别方法
US20100316263A1 (en) Iris and ocular recognition system using trace transforms
Rouhi et al. A review on feature extraction techniques in face recognition
WO2021248814A1 (zh) Robust visual supervision method and device for children's learning state at home
CN109544523A (zh) Face image quality evaluation method and device based on multi-attribute face comparison
US20230237694A1 (en) Method and system for detecting children's sitting posture based on face recognition of children
Phuong et al. An eye blink detection technique in video surveillance based on eye aspect ratio
Tang et al. Automatic facial expression analysis of students in teaching environments
Faria et al. Interface framework to drive an intelligent wheelchair using facial expressions
CN115830635A (zh) PVC glove recognition method based on key-point detection and object recognition
Mishra Persuasive boundary point based face detection using normalized edge detection in regular expression face morphing
Bora et al. ISL gesture recognition using multiple feature fusion
Huang et al. Research on learning state based on students’ attitude and emotion in class learning
Chen et al. Intelligent Recognition of Physical Education Teachers' Behaviors Using Kinect Sensors and Machine Learning.
Shamil et al. Detection of Iris localization in facial images using haar cascade circular hough transform
Abd et al. Automatic deception detection system based on hybrid feature extraction techniques
CN116894978B (zh) Online exam anti-cheating system fusing multiple features of facial emotion and behavior
Li et al. A method of depth image based human action recognition

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20940076

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20940076

Country of ref document: EP

Kind code of ref document: A1


32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 060723)
