CN113331839A - Network learning attention monitoring method and system based on multi-source information fusion - Google Patents
Info
- Publication number
- CN113331839A (application number CN202110592337.6A)
- Authority
- CN
- China
- Prior art keywords
- learning
- human eye
- electroencephalogram
- video data
- attention
- Prior art date: 2021-05-28
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000012544 monitoring process Methods 0.000 title claims abstract description 44
- 230000004927 fusion Effects 0.000 title claims abstract description 30
- 238000000034 method Methods 0.000 title claims abstract description 30
- 230000004438 eyesight Effects 0.000 claims abstract description 44
- 210000003128 head Anatomy 0.000 claims abstract description 44
- 210000004556 brain Anatomy 0.000 claims abstract description 26
- 238000007781 pre-processing Methods 0.000 claims abstract description 14
- 238000005070 sampling Methods 0.000 claims description 25
- 238000012937 correction Methods 0.000 claims description 6
- 230000000737 periodic effect Effects 0.000 claims description 5
- 238000012545 processing Methods 0.000 claims description 4
- 230000036544 posture Effects 0.000 description 25
- 230000008569 process Effects 0.000 description 8
- 238000010586 diagram Methods 0.000 description 5
- 230000000694 effects Effects 0.000 description 3
- 238000000605 extraction Methods 0.000 description 3
- 230000007246 mechanism Effects 0.000 description 3
- 230000007177 brain activity Effects 0.000 description 2
- 230000004913 activation Effects 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 230000006870 function Effects 0.000 description 1
- 238000002955 isolation Methods 0.000 description 1
- 230000007786 learning performance Effects 0.000 description 1
- 230000004807 localization Effects 0.000 description 1
- 238000013507 mapping Methods 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000011176 pooling Methods 0.000 description 1
- 230000003595 spectral effect Effects 0.000 description 1
Classifications
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/16—Devices for psychotechnics; Testing reaction times; Devices for evaluating the psychological state
- A61B5/168—Evaluating attention deficit, hyperactivity
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B3/00—Apparatus for testing the eyes; Instruments for examining the eyes
- A61B3/0016—Operational features thereof
- A61B3/0025—Operational features thereof characterised by electronic signal processing, e.g. eye models
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B3/00—Apparatus for testing the eyes; Instruments for examining the eyes
- A61B3/10—Objective types, i.e. instruments for examining the eyes independent of the patients' perceptions or reactions
- A61B3/113—Objective types, i.e. instruments for examining the eyes independent of the patients' perceptions or reactions for determining or recording eye movement
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B3/00—Apparatus for testing the eyes; Instruments for examining the eyes
- A61B3/10—Objective types, i.e. instruments for examining the eyes independent of the patients' perceptions or reactions
- A61B3/14—Arrangements specially adapted for eye photography
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B3/00—Apparatus for testing the eyes; Instruments for examining the eyes
- A61B3/10—Objective types, i.e. instruments for examining the eyes independent of the patients' perceptions or reactions
- A61B3/14—Arrangements specially adapted for eye photography
- A61B3/145—Arrangements specially adapted for eye photography by video means
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/103—Detecting, measuring or recording devices for testing the shape, pattern, colour, size or movement of the body or parts thereof, for diagnostic purposes
- A61B5/11—Measuring movement of the entire body or parts thereof, e.g. head or hand tremor, mobility of a limb
- A61B5/1116—Determining posture transitions
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/103—Detecting, measuring or recording devices for testing the shape, pattern, colour, size or movement of the body or parts thereof, for diagnostic purposes
- A61B5/11—Measuring movement of the entire body or parts thereof, e.g. head or hand tremor, mobility of a limb
- A61B5/1118—Determining activity level
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/103—Detecting, measuring or recording devices for testing the shape, pattern, colour, size or movement of the body or parts thereof, for diagnostic purposes
- A61B5/11—Measuring movement of the entire body or parts thereof, e.g. head or hand tremor, mobility of a limb
- A61B5/1121—Determining geometric values, e.g. centre of rotation or angular range of movement
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/103—Detecting, measuring or recording devices for testing the shape, pattern, colour, size or movement of the body or parts thereof, for diagnostic purposes
- A61B5/11—Measuring movement of the entire body or parts thereof, e.g. head or hand tremor, mobility of a limb
- A61B5/1126—Measuring movement of the entire body or parts thereof, e.g. head or hand tremor, mobility of a limb using a particular sensing technique
- A61B5/1128—Measuring movement of the entire body or parts thereof, e.g. head or hand tremor, mobility of a limb using a particular sensing technique using image analysis
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/16—Devices for psychotechnics; Testing reaction times; Devices for evaluating the psychological state
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/16—Devices for psychotechnics; Testing reaction times; Devices for evaluating the psychological state
- A61B5/163—Devices for psychotechnics; Testing reaction times; Devices for evaluating the psychological state by tracking eye movement, gaze, or pupil change
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/24—Detecting, measuring or recording bioelectric or biomagnetic signals of the body or parts thereof
- A61B5/316—Modalities, i.e. specific diagnostic methods
- A61B5/369—Electroencephalography [EEG]
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/24—Detecting, measuring or recording bioelectric or biomagnetic signals of the body or parts thereof
- A61B5/316—Modalities, i.e. specific diagnostic methods
- A61B5/369—Electroencephalography [EEG]
- A61B5/372—Analysis of electroencephalograms
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/72—Signal processing specially adapted for physiological signals or for diagnostic purposes
- A61B5/7235—Details of waveform analysis
- A61B5/7264—Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/72—Signal processing specially adapted for physiological signals or for diagnostic purposes
- A61B5/7235—Details of waveform analysis
- A61B5/7264—Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
- A61B5/7267—Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems involving training the classification device
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/72—Signal processing specially adapted for physiological signals or for diagnostic purposes
- A61B5/7271—Specific aspects of physiological measurement analysis
- A61B5/7275—Determining trends in physiological measurement data; Predicting development of a medical condition based on physiological measurements, e.g. determining a risk factor
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/72—Signal processing specially adapted for physiological signals or for diagnostic purposes
- A61B5/7271—Specific aspects of physiological measurement analysis
- A61B5/7285—Specific aspects of physiological measurement analysis for synchronising or triggering a physiological measurement or image acquisition with a physiological event or waveform, e.g. an ECG signal
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Public Health (AREA)
- Veterinary Medicine (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Heart & Thoracic Surgery (AREA)
- Medical Informatics (AREA)
- Molecular Biology (AREA)
- Surgery (AREA)
- Animal Behavior & Ethology (AREA)
- General Health & Medical Sciences (AREA)
- Pathology (AREA)
- Psychiatry (AREA)
- Physiology (AREA)
- Artificial Intelligence (AREA)
- Psychology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Signal Processing (AREA)
- Developmental Disabilities (AREA)
- Ophthalmology & Optometry (AREA)
- Dentistry (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Social Psychology (AREA)
- Child & Adolescent Psychology (AREA)
- Educational Technology (AREA)
- Hospice & Palliative Care (AREA)
- Mathematical Physics (AREA)
- Fuzzy Systems (AREA)
- Evolutionary Computation (AREA)
- Geometry (AREA)
- Radiology & Medical Imaging (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Measurement And Recording Of Electrical Phenomena And Electrical Characteristics Of The Living Body (AREA)
- Image Analysis (AREA)
Abstract
The invention relates to a network learning attention monitoring method and system based on multi-source information fusion, wherein the method comprises the following steps: S1: collecting learning video data and brain wave data; S2: preprocessing the learning video data and the brain wave data, estimating the human eye sight line from the preprocessed learning video data to obtain human eye sight line positioning information, and extracting electroencephalogram features from the preprocessed electroencephalogram data; S3: inputting the human eye sight line positioning information and the electroencephalogram features into a time sequence prediction model for learning attention monitoring. The method and the system overcome the limitation of the prior art, which judges only from head posture information whether the learner is staring at the screen and infers from this whether the learner is concentrating, and can accurately judge the learner's attention during internet learning.
Description
Technical Field
The invention belongs to the technical field of monitoring, and particularly relates to a network learning attention monitoring method and system based on multi-source information fusion.
Background
With the deepening of education informatization, online learning has gradually become one of the mainstream learning paradigms because it is not limited by physical space and time. During online learning, however, learners are easily distracted by external factors, and because of factors such as the teacher-to-student ratio and spatial isolation in online teaching, teachers find it difficult to notice a learner's loss of attention in time. As a result, online learning efficiency is low and learning outcomes are poor.
Existing network teaching attention recognition mainly analyzes the learner's head posture and similar cues in video data of the learner, judges whether the learner is gazing at the learning screen, and from this infers whether the learner is concentrating on the network learning content. Such methods, however, cannot reliably recognize the case where the learner gazes at the screen without actually focusing on the network learning content.
Disclosure of Invention
The invention mainly aims to overcome the defects of the prior art and provide a network learning attention monitoring method and system based on multi-source information fusion. First, the learner's human eye features and head posture information at each sampling moment are obtained from the learning image data, and from these the learner's human eye sight line positioning information is derived; at the same time, brain activity features at each sampling moment are obtained from the electroencephalogram data. A learning attention time sequence prediction model then comprehensively uses the electroencephalogram features and the human eye sight line positioning information to accurately judge the learner's attention during internet learning, and feedback prompts are given to the learning terminal according to the time sequence prediction results within a given time period, so as to solve the problem that learner attention cannot be accurately identified in existing online teaching scenarios.
According to one aspect of the invention, the invention provides a network learning attention monitoring method based on multi-source information fusion, which comprises the following steps:
S1: collecting learning video data and brain wave data;
S2: preprocessing the learning video data and the brain wave data; estimating the sight of human eyes according to the preprocessed learning video data to obtain positioning information of the sight of human eyes; extracting electroencephalogram features according to the preprocessed electroencephalogram data;
S3: inputting the human eye sight line positioning information and the electroencephalogram characteristics into a time sequence prediction model for learning attention monitoring.
Preferably, the preprocessing the learning video data and the brain wave data includes:
sampling the learning video data according to a sampling period T to obtain a learning sequence picture; and segmenting the waveform data of the electroencephalogram data according to the sampling period T.
Preferably, the estimating of the human eye sight line according to the preprocessed learning video data to obtain the human eye sight line positioning information includes:
face recognition and correction are carried out on the learning sequence pictures; the corrected pictures are passed through a deep CNN network H1, which obtains the learner's human eye features through multiple convolution operations; the corrected pictures are also passed through a deep CNN network H2, which obtains the positioning features of the learner's head posture through multiple convolution operations, and the head posture positioning features are sent to a classification regression network to obtain head posture information;
and fusing the human eye features and the head posture information by using a feature splicing mode to obtain human eye sight line positioning information.
Preferably, the time sequence prediction model is an LSTM prediction model based on line-of-sight-electroencephalogram mode gating; the performing learning attention monitoring includes:
performing modal gate operation on the electroencephalogram characteristics and the human eye sight positioning information input at each moment, giving different dynamic weights to the characteristics of different modes, realizing dynamic fusion of the electroencephalogram characteristics and the human eye sight positioning information, and inputting a dynamic fusion result into an LSTM prediction model;
the LSTM prediction model outputs corresponding prediction results according to application scenes of network learning attention monitoring, the application scenes comprise regression and classification, and the corresponding prediction results comprise learning attention values and learning attention categories.
Preferably, whether learning attention periodic feedback is performed through a learning terminal is determined according to a prediction result of the time sequence prediction model and a preset threshold value.
According to another aspect of the present invention, the present invention further provides a network learning attention monitoring system based on multi-source information fusion, the system comprising:
the acquisition module is used for acquiring learning video data and brain wave data;
the processing module is used for preprocessing the learning video data and the brain wave data; carrying out human eye sight estimation according to the preprocessed learning video data to obtain human eye sight positioning information; extracting electroencephalogram characteristics according to the preprocessed electroencephalogram data;
and the monitoring module is used for inputting the human eye sight line positioning information and the electroencephalogram characteristics into a time sequence prediction model to carry out learning attention monitoring.
Preferably, the preprocessing the learning video data and the brain wave data includes:
sampling the learning video data according to a sampling period T to obtain a learning sequence picture; and segmenting the waveform data of the electroencephalogram data according to the sampling period T.
Preferably, the estimating of the human eye sight line according to the preprocessed learning video data to obtain the human eye sight line positioning information includes:
face recognition and correction are carried out on the learning sequence pictures; the corrected pictures are passed through a deep CNN network H1, which obtains the learner's human eye features through multiple convolution operations; the corrected pictures are also passed through a deep CNN network H2, which obtains the positioning features of the learner's head posture through multiple convolution operations, and the head posture positioning features are sent to a classification regression network to obtain head posture information;
and fusing the human eye features and the head posture information by using a feature splicing mode to obtain human eye sight line positioning information.
Preferably, the time sequence prediction model is an LSTM prediction model based on line-of-sight-electroencephalogram mode gating; the performing learning attention monitoring includes:
performing modal gate operation on the electroencephalogram characteristics and the human eye sight positioning information input at each moment, giving different dynamic weights to the characteristics of different modes, realizing dynamic fusion of the electroencephalogram characteristics and the human eye sight positioning information, and inputting a dynamic fusion result into an LSTM prediction model;
the LSTM prediction model outputs corresponding prediction results according to application scenes of network learning attention monitoring, the application scenes comprise regression and classification, and the corresponding prediction results comprise learning attention values and learning attention categories.
Preferably, the system further includes a feedback module, configured to determine whether to perform learning attention periodic feedback through a learning terminal according to a prediction result of the time sequence prediction model and a preset threshold.
Advantageous effects: the invention solves the problems of the existing learning attention monitoring technology, namely that whether the learner's sight is on the screen cannot be accurately judged from head posture information alone, and that the case where the learner stares at the screen while distracted cannot be detected, thereby improving the accuracy of learning attention monitoring.
The features and advantages of the present invention will become apparent by reference to the following drawings and detailed description of specific embodiments of the invention.
Drawings
FIG. 1 is a flow chart of a network learning attention monitoring method based on multi-source information fusion;
FIG. 2 is a flow chart of another network learning attention monitoring method based on multi-source information fusion;
FIG. 3 is a schematic diagram of a human eye feature extraction method based on an attention mechanism;
FIG. 4 is a schematic diagram of a head pose location method based on a classification regression network;
FIG. 5 is a schematic diagram of a learning attention prediction model based on sight line-electroencephalogram modal gating;
FIG. 6 is a schematic diagram of a network learning attention monitoring system based on multi-source information fusion.
Detailed Description
The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1
FIG. 1 is a flow chart of a network learning attention monitoring method based on multi-source information fusion. As shown in fig. 1, the present invention provides a network learning attention monitoring method based on multi-source information fusion, which includes the following steps:
s1: learning video data and brain wave data are collected.
Specifically, the learner wears portable electroencephalogram equipment and carries out network learning facing a computer screen; the system comprises a camera positioned above the screen or in the same plane as the screen. The network learning video data collected by the camera and the brain wave data are sent to a student data warehouse. The student data warehouse stores the video data, the electroencephalogram data and the historical learning attention states of the learner during the learning process. The learning video data are collected by the camera on the learning terminal, and the electroencephalogram data are collected by the portable electroencephalograph. The initial value of the historical learning attention state is set by the teacher's subjective judgment of the learner's previous learning performance, and the prediction results of the network learning attention recognition engine are continuously stored as learning proceeds.
S2: preprocessing the learning video data and the brain wave data; estimating the sight of human eyes according to the preprocessed learning video data to obtain positioning information of the sight of human eyes; and extracting electroencephalogram characteristics according to the preprocessed electroencephalogram data.
Preferably, the preprocessing the learning video data and the brain wave data includes:
sampling the learning video data according to a sampling period T to obtain a learning sequence picture; and segmenting the waveform data of the electroencephalogram data according to the sampling period T.
Specifically, the learner's video data and electroencephalogram data have time sequence consistency during network learning, so this embodiment preprocesses the video data according to a fixed sampling period T; the learning sequence pictures at times T, 2T, ..., nT can then be represented as (I_1, I_2, ..., I_n). Considering that brain activity data changes faster than image data, this embodiment segments the brain wave data into waveform segments according to the same sampling period T, so that the brain wave data in each period of length T can be represented as (E_1, E_2, ..., E_n).
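By way of a non-limiting illustration, the following Python sketch shows this synchronized preprocessing step: one frame is kept per sampling period T and the EEG stream is cut into matching T-second segments. The array shapes, the 30 fps video rate and the 250 Hz EEG rate in the usage example are assumptions for illustration, not details fixed by this embodiment.

```python
import numpy as np

def preprocess(video_frames, video_fps, eeg_signal, eeg_rate, T):
    """Sample video at period T and cut the EEG stream into matching T-second
    segments, returning (I_1..I_n) and (E_1..E_n) of equal length n."""
    frame_step = int(round(T * video_fps))   # video frames per sampling period
    eeg_step = int(round(T * eeg_rate))      # EEG samples per sampling period

    images = video_frames[::frame_step]      # one frame per period
    n = min(len(images), len(eeg_signal) // eeg_step)

    images = images[:n]
    segments = [eeg_signal[j * eeg_step:(j + 1) * eeg_step] for j in range(n)]
    return images, segments

# Usage (assumed rates): 60 s of 30 fps video, 250 Hz 8-channel EEG, T = 2 s
video = np.zeros((60 * 30, 64, 64, 3))
eeg = np.zeros((60 * 250, 8))
frames, eeg_segments = preprocess(video, 30, eeg, 250, T=2.0)
print(len(frames), len(eeg_segments))        # 30 30
```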
Preferably, the estimating of the human eye sight line according to the preprocessed learning video data to obtain the human eye sight line positioning information includes:
face recognition and correction are carried out on the learning sequence pictures; the corrected pictures are passed through a deep CNN network H1, which obtains the learner's human eye features through multiple convolution operations; the corrected pictures are also passed through a deep CNN network H2, which obtains the positioning features of the learner's head posture through multiple convolution operations, and the head posture positioning features are sent to a classification regression network to obtain head posture information;
and fusing the human eye features and the head posture information by using a feature splicing mode to obtain human eye sight line positioning information.
Specifically, referring to fig. 2, after the learning video data is sampled, a series of pictures is obtained. Face detection is first performed on each learning image I to determine the position frame of the learner's face in the learning image. This embodiment uses the open-source MTCNN algorithm: the original learning image is repeatedly resized to obtain an image pyramid, and the face position is then located from coarse to fine by cascaded CNNs, yielding a corrected learner face image that supports the subsequent extraction of eye features and head posture positioning.
As shown in FIG. 3, the corrected learner image is passed through a deep CNN network H1, and the learner's human eye features are obtained through multiple convolution operations, denoted U. The invention proposes to further process the learner's human eye features U based on an attention mechanism. Assume U contains L channels, denoted {c_1, c_2, ..., c_k, ..., c_L}: a global average pooling operation is performed on each channel of U to obtain a vector V, and V is passed through a fully connected mapping to obtain the attention weight of each channel. If channel c_k has weight cw_k under the attention mechanism, the weights of all channels can be written as cw, and the human eye feature U_a under the attention mechanism is obtained by scaling each channel of U with its corresponding weight:

U_a = cw · U (each channel c_k of U multiplied by its weight cw_k)
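A minimal PyTorch sketch of this channel attention step follows, assuming a squeeze-and-excitation-style fully connected mapping for the weights cw; the reduction ratio, channel count and feature map size are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Channel attention over the eye feature map U as described above:
    per-channel global average pooling -> fully connected mapping -> weights
    cw -> U_a with every channel c_k scaled by cw_k."""

    def __init__(self, num_channels, reduction=4):  # reduction ratio is assumed
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(num_channels, num_channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(num_channels // reduction, num_channels),
            nn.Sigmoid(),
        )

    def forward(self, U):                    # U: (batch, L, H, W)
        V = U.mean(dim=(2, 3))               # global average pooling -> (batch, L)
        cw = self.fc(V)                      # attention weight per channel
        return U * cw[:, :, None, None]      # U_a = cw * U (channel-wise)

U = torch.randn(1, 64, 14, 14)               # eye features from CNN H1 (assumed shape)
U_a = ChannelAttention(num_channels=64)(U)
```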
as shown in FIG. 4, the rectified learning image passes through a deep CNN network H2The positioning characteristics of the head posture of the learner are obtained through multiple convolution operations, and then the positioning characteristics of the head posture are sent to 3 classification regression networks to position the head posture parameters. That is, each classification regression network will predict the head pose belonging to a classification result under a certain pose angle (for example: pitch) according to the head pose positioning feature, and the classification result indicates the rough range of the learner under the head pose angle; and finding a rough range under a certain head attitude angle based on the classification result, and then performing regression to realize accurate positioning of the head attitude angle. Therefore, based on the results of the three classification regression networks, the 3 parameters of the head pose pitch, yaw and roll of the learner corresponding to a certain learning image can be obtained and recorded as h.
During network learning, the learner's head posture and eyes both reflect whether the learner is paying attention to the screen, so this embodiment integrates the head posture information and the human eye features to estimate the eye sight line. First, U_a and h are fused by feature splicing; then, based on a human eye gaze estimation prediction algorithm H3, more accurate human eye sight line positioning information F_s is obtained as:

F_s = H3(U_a, h)
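A sketch of this fusion step is given below, assuming H3 is a small multilayer perceptron over the spliced features and that F_s is, for example, a 2-D on-screen gaze location; the embodiment does not fix the form of H3 or the dimensionality of F_s.

```python
import torch
import torch.nn as nn

class GazeEstimator(nn.Module):
    """H3: splice the attention-weighted eye features U_a with head posture h
    and regress the sight line positioning information F_s = H3(U_a, h).
    MLP sizes and the 2-D output are illustrative assumptions."""

    def __init__(self, eye_dim, pose_dim=3, out_dim=2):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(eye_dim + pose_dim, 128),
            nn.ReLU(inplace=True),
            nn.Linear(128, out_dim),
        )

    def forward(self, U_a, h):
        U_a = U_a.flatten(1)                           # feature splicing: flatten ...
        return self.mlp(torch.cat([U_a, h], dim=1))    # ... then concatenate with h

F_s = GazeEstimator(eye_dim=64 * 14 * 14)(torch.randn(2, 64, 14, 14), torch.randn(2, 3))
```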
Preferably, extracting electroencephalogram features according to the preprocessed electroencephalogram data includes:
based on the electroencephalogram data preprocessing mode, electroencephalogram characteristics of electroencephalogram data in each T time period can be extracted from the aspects of time domain, frequency domain and wavelet. The time domain features of the electroencephalogram data include: signal amplitude ratio, peak signal difference, peak time window, inter-peak slope, signal power, signal mean, kurtosis, mobility, and complexity; the frequency domain features of the electroencephalogram data include: power spectral density and band strength; the wavelet characteristics of the brain electrical data include: entropy and energy, therefore, the electroencephalogram data characteristics can be recorded as F based on the characteristic extraction methode。
S3: and inputting the human eye sight line positioning information and the electroencephalogram characteristics into a time sequence prediction model for learning attention monitoring.
Preferably, the time sequence prediction model is an LSTM prediction model based on line-of-sight-electroencephalogram mode gating; the performing learning attention monitoring includes:
performing modal gate operation on the electroencephalogram characteristics and the human eye sight positioning information input at each moment, giving different dynamic weights to the characteristics of different modes, realizing dynamic fusion of the electroencephalogram characteristics and the human eye sight positioning information, and inputting a dynamic fusion result into an LSTM prediction model;
the LSTM prediction model outputs corresponding prediction results according to application scenes of network learning attention monitoring, the application scenes comprise regression and classification, and the corresponding prediction results comprise learning attention values and learning attention categories.
Specifically, as shown in fig. 5, the features of learner i over the sampling sequence (T, 2T, ..., jT, ..., nT) can be characterized as ((F_e1, F_s1), (F_e2, F_s2), ..., (F_ej, F_sj), ..., (F_en, F_sn)), where F_ej and F_sj respectively denote the electroencephalogram features and the human eye sight line positioning features of learner i at time jT. In the LSTM prediction model based on sight line-electroencephalogram modal gating of this embodiment, the model first performs modal gating on the electroencephalogram features and the human eye sight line positioning information input at each time; that is, different dynamic weights are given to the features of the different modalities, realizing dynamic fusion of the electroencephalogram features and the human eye sight line, and the fusion result serves as the input of the LSTM network. Assume that at the jth sampling instant the original input is (F_ej, F_sj); after sight line-electroencephalogram modal gating, the integrated input x_j is obtained as:

x_j = concat( f(W_F F_ej + V_F F_sj + Q_F h_{j-1})[0] · F_ej , f(W_F F_ej + V_F F_sj + Q_F h_{j-1})[1] · F_sj )

where W_F, V_F and Q_F are the network parameters of the fusion gating layer corresponding to the electroencephalogram features, the human eye sight line and the hidden layer respectively, f is the sigmoid activation function, and h_{j-1} is the hidden layer state output by the LSTM network at sampling instant j-1. The output y_j of the LSTM network at the jth sampling instant is:

y_j = f(W_Y x_j + Q_Y h_{j-1})

where W_Y and Q_Y are the network parameters of the output layer corresponding to the integrated input and the hidden state respectively. Depending on the specific application scenario of network learning attention monitoring, attention monitoring can be formulated as a regression or a classification problem: for regression, the predicted output y_j is a learning attention value; for classification, y_j is a learning attention category.
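A PyTorch sketch of this gated fusion and prediction loop follows. The feature dimensions, hidden size and the sigmoid regression output are assumptions; the gate and output computations mirror the two formulas above.

```python
import torch
import torch.nn as nn

class GatedAttentionLSTM(nn.Module):
    """Sight line-EEG modal gating LSTM as described above: at each step a
    two-component gate f(W_F F_ej + V_F F_sj + Q_F h_{j-1}) weights the EEG and
    sight line features, the weighted features are concatenated into x_j, and
    y_j = f(W_Y x_j + Q_Y h_{j-1}). Dimensions are illustrative assumptions."""

    def __init__(self, eeg_dim, gaze_dim, hidden=64, out_dim=1):
        super().__init__()
        self.W_F = nn.Linear(eeg_dim, 2, bias=False)    # gate term from F_ej
        self.V_F = nn.Linear(gaze_dim, 2, bias=False)   # gate term from F_sj
        self.Q_F = nn.Linear(hidden, 2, bias=False)     # gate term from h_{j-1}
        self.W_Y = nn.Linear(eeg_dim + gaze_dim, out_dim, bias=False)
        self.Q_Y = nn.Linear(hidden, out_dim, bias=False)
        self.cell = nn.LSTMCell(eeg_dim + gaze_dim, hidden)

    def forward(self, F_e, F_s):    # (batch, n, eeg_dim), (batch, n, gaze_dim)
        batch, n, _ = F_e.shape
        h = F_e.new_zeros(batch, self.cell.hidden_size)
        c = torch.zeros_like(h)
        ys = []
        for j in range(n):
            g = torch.sigmoid(self.W_F(F_e[:, j]) + self.V_F(F_s[:, j]) + self.Q_F(h))
            x_j = torch.cat([g[:, 0:1] * F_e[:, j], g[:, 1:2] * F_s[:, j]], dim=1)
            ys.append(torch.sigmoid(self.W_Y(x_j) + self.Q_Y(h)))  # y_j uses h_{j-1}
            h, c = self.cell(x_j, (h, c))                          # update hidden state
        return torch.stack(ys, dim=1)                              # (batch, n, out_dim)

y = GatedAttentionLSTM(eeg_dim=14, gaze_dim=2)(torch.randn(4, 10, 14), torch.randn(4, 10, 2))
```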
Preferably, whether learning attention periodic feedback is performed through a learning terminal is determined according to a prediction result of the time sequence prediction model and a preset threshold value.
Specifically, since an online lesson is composed of multiple teaching knowledge points associated with multiple teaching activities, each teaching activity or knowledge point lasts only a few minutes. This embodiment therefore takes the attention results (y_1, y_2, ..., y_n, ...) predicted by the network learning attention recognition engine, intercepts them in segments of N minutes, and checks the sequence of prediction results within each segment. If m% of the results indicate focused attention and m ≥ threshold (where threshold is the cut-off for judging focused versus unfocused attention in the actual network teaching scenario), no feedback to the learning terminal is needed; otherwise the learner is judged inattentive in that period and must be reminded through the learning terminal (for example, by a pop-up box or voice prompt), prompting the learner to return to the learning state as soon as possible.
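A sketch of this periodic check is shown below; treating y_j ≥ 0.5 as "focused" and the 0.7 threshold are placeholder assumptions for the application-specific values named above.

```python
def needs_reminder(y_window, threshold=0.7):
    """Windowed feedback check: within one N-minute window of attention
    predictions, remind the learner only when the focused fraction m
    falls below the threshold (focus cut-off and threshold are assumed)."""
    focused = sum(1 for y in y_window if y >= 0.5)   # y_j >= 0.5 -> "focused"
    m = focused / max(len(y_window), 1)
    return m < threshold                             # True -> pop-up or voice prompt

predictions = [0.9, 0.8, 0.3, 0.2, 0.1, 0.4]         # one window of y_j values
if needs_reminder(predictions):
    print("Remind the learner via the learning terminal")
```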
Compared with existing attention monitoring technology, this embodiment makes full use of multi-source data from the network learning process: human eye features and head posture positioning information are learned from the learning image data and then fused to determine the human eye sight line positioning; at the same time, time domain, frequency domain and wavelet features are extracted from the electroencephalogram data. A learning attention time sequence prediction model based on sight line-electroencephalogram modal gating then comprehensively uses the electroencephalogram features and human eye sight line information to judge the learner's network learning attention, and the time sequence prediction results are analyzed per period to decide whether to remind the learner through the learning terminal at an appropriate time. This embodiment identifies the online learner's attention state more accurately, solves the prior art problem of merely inferring concentration from whether the learner stares at the screen according to head posture information, and has important teaching application value.
Example 2
FIG. 6 is a schematic diagram of a network learning attention monitoring system based on multi-source information fusion. As shown in fig. 6, the present invention further provides a network learning attention monitoring system based on multi-source information fusion, wherein the system includes:
the acquisition module is used for acquiring learning video data and brain wave data;
the processing module is used for preprocessing the learning video data and the brain wave data; carrying out human eye sight estimation according to the preprocessed learning video data to obtain human eye sight positioning information; extracting electroencephalogram characteristics according to the preprocessed electroencephalogram data;
and the monitoring module is used for inputting the human eye sight line positioning information and the electroencephalogram characteristics into a time sequence prediction model to carry out learning attention monitoring.
Preferably, the preprocessing the learning video data and the brain wave data includes:
sampling the learning video data according to a sampling period T to obtain a learning sequence picture; and segmenting the waveform data of the electroencephalogram data according to the sampling period T.
Preferably, the estimating of the human eye sight line according to the preprocessed learning video data to obtain the human eye sight line positioning information includes:
face recognition and correction are carried out on the learning sequence pictures; the corrected pictures are passed through a deep CNN network H1, which obtains the learner's human eye features through multiple convolution operations; the corrected pictures are also passed through a deep CNN network H2, which obtains the positioning features of the learner's head posture through multiple convolution operations, and the head posture positioning features are sent to a classification regression network to obtain head posture information;
and fusing the human eye features and the head posture information by using a feature splicing mode to obtain human eye sight line positioning information.
Preferably, the time sequence prediction model is an LSTM prediction model based on line-of-sight-electroencephalogram mode gating; the performing learning attention monitoring includes:
performing modal gate operation on the electroencephalogram characteristics and the human eye sight positioning information input at each moment, giving different dynamic weights to the characteristics of different modes, realizing dynamic fusion of the electroencephalogram characteristics and the human eye sight positioning information, and inputting a dynamic fusion result into an LSTM prediction model;
the LSTM prediction model outputs corresponding prediction results according to application scenes of network learning attention monitoring, the application scenes comprise regression and classification, and the corresponding prediction results comprise learning attention values and learning attention categories.
Preferably, the system further includes a feedback module, configured to determine whether to perform learning attention periodic feedback through a learning terminal according to a prediction result of the time sequence prediction model and a preset threshold.
The specific implementation process of the method steps executed by each module in embodiment 2 of the present invention is the same as the implementation process of each step in embodiment 1, and is not described herein again.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention, and all modifications and equivalents of the present invention, which are made by the contents of the present specification and the accompanying drawings, or directly/indirectly applied to other related technical fields, are included in the scope of the present invention.
Claims (10)
1. A network learning attention monitoring method based on multi-source information fusion is characterized by comprising the following steps:
S1: collecting learning video data and brain wave data;
S2: preprocessing the learning video data and the brain wave data; estimating the sight of human eyes according to the preprocessed learning video data to obtain positioning information of the sight of human eyes; extracting electroencephalogram characteristics according to the preprocessed electroencephalogram data;
S3: inputting the human eye sight line positioning information and the electroencephalogram characteristics into a time sequence prediction model for learning attention monitoring.
2. The method according to claim 1, wherein the preprocessing the learning video data and the brain wave data comprises:
sampling the learning video data according to a sampling period T to obtain a learning sequence picture; and segmenting the waveform data of the electroencephalogram data according to the sampling period T.
3. The method of claim 2, wherein the performing human eye gaze estimation based on the preprocessed learning video data to obtain human eye gaze location information comprises:
face recognition and correction are carried out on the learning sequence pictures; the corrected pictures are passed through a deep CNN network H1, which obtains the learner's human eye features through multiple convolution operations; the corrected pictures are also passed through a deep CNN network H2, which obtains the positioning features of the learner's head posture through multiple convolution operations, and the head posture positioning features are sent to a classification regression network to obtain head posture information;
and fusing the human eye features and the head posture information by using a feature splicing mode to obtain human eye sight positioning information.
4. The method of claim 1, wherein the timing prediction model is an LSTM prediction model based on line-of-sight-electroencephalogram modal gating; the performing learning attention monitoring includes:
performing modal gate operation on the electroencephalogram characteristics and the human eye sight positioning information input at each moment, giving different dynamic weights to the characteristics of different modes, realizing dynamic fusion of the electroencephalogram characteristics and the human eye sight positioning information, and inputting a dynamic fusion result into an LSTM prediction model;
the LSTM prediction model outputs corresponding prediction results according to application scenes of network learning attention monitoring, the application scenes comprise regression and classification, and the corresponding prediction results comprise learning attention values and learning attention categories.
5. The method according to claim 1, wherein whether learning attention periodic feedback is performed through a learning terminal is determined according to a prediction result of the time sequence prediction model and a preset threshold value.
6. A network learning attention monitoring system based on multi-source information fusion, characterized in that the system comprises:
the acquisition module is used for acquiring learning video data and brain wave data;
the processing module is used for preprocessing the learning video data and the brain wave data; estimating the sight of human eyes according to the preprocessed learning video data to obtain positioning information of the sight of human eyes; extracting electroencephalogram characteristics according to the preprocessed electroencephalogram data;
and the monitoring module is used for inputting the human eye sight line positioning information and the electroencephalogram characteristics into a time sequence prediction model to carry out learning attention monitoring.
7. The system according to claim 6, wherein the preprocessing of the learning video data and the brain wave data comprises:
sampling the learning video data according to a sampling period T to obtain a learning sequence picture; and segmenting the waveform data of the electroencephalogram data according to the sampling period T.
8. The system according to claim 7, wherein the estimating the human eye gaze according to the preprocessed learning video data to obtain the human eye gaze positioning information comprises:
face recognition and correction are carried out on the learning sequence pictures; the corrected pictures are passed through a deep CNN network H1, which obtains the learner's human eye features through multiple convolution operations; the corrected pictures are also passed through a deep CNN network H2, which obtains the positioning features of the learner's head posture through multiple convolution operations, and the head posture positioning features are sent to a classification regression network to obtain head posture information;
and fusing the human eye features and the head posture information by using a feature splicing mode to obtain human eye sight positioning information.
9. The system of claim 6, wherein the timing prediction model is an LSTM prediction model based on line-of-sight-electroencephalogram modality gating; the performing learning attention monitoring includes:
performing modal gate operation on the electroencephalogram characteristics and the human eye sight positioning information input at each moment, giving different dynamic weights to the characteristics of different modes, realizing dynamic fusion of the electroencephalogram characteristics and the human eye sight positioning information, and inputting a dynamic fusion result into an LSTM prediction model;
the LSTM prediction model outputs corresponding prediction results according to application scenes of network learning attention monitoring, the application scenes comprise regression and classification, and the corresponding prediction results comprise learning attention values and learning attention categories.
10. The system of claim 6, further comprising a feedback module configured to determine whether to perform learning attention periodic feedback through a learning terminal according to a prediction result of the time-series prediction model and a preset threshold.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110592337.6A CN113331839A (en) | 2021-05-28 | 2021-05-28 | Network learning attention monitoring method and system based on multi-source information fusion |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110592337.6A CN113331839A (en) | 2021-05-28 | 2021-05-28 | Network learning attention monitoring method and system based on multi-source information fusion |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113331839A true CN113331839A (en) | 2021-09-03 |
Family ID=77471974
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110592337.6A Pending CN113331839A (en) | 2021-05-28 | 2021-05-28 | Network learning attention monitoring method and system based on multi-source information fusion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113331839A (en) |
- 2021-05-28: Application CN202110592337.6A filed; published as CN113331839A (en), status: pending
Patent Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103340637A (en) * | 2013-06-06 | 2013-10-09 | 同济大学 | System and method for driver alertness intelligent monitoring based on fusion of eye movement and brain waves |
WO2019144542A1 (en) * | 2018-01-26 | 2019-08-01 | Institute Of Software Chinese Academy Of Sciences | Affective interaction systems, devices, and methods based on affective computing user interface |
CN109044363A (en) * | 2018-09-04 | 2018-12-21 | 华南师范大学 | Driver Fatigue Detection based on head pose and eye movement |
CN109793528A (en) * | 2019-01-28 | 2019-05-24 | 华南理工大学 | A kind of mood classification method based on dynamic brain function network |
CN109875568A (en) * | 2019-03-08 | 2019-06-14 | 北京联合大学 | A kind of head pose detection method for fatigue driving detection |
CN110101397A (en) * | 2019-03-29 | 2019-08-09 | 中国地质大学(武汉) | Focus detector based on TGAM |
WO2020204810A1 (en) * | 2019-03-29 | 2020-10-08 | Agency For Science, Technology And Research | Identifying and extracting electroencephalogram signals |
CN110610168A (en) * | 2019-09-20 | 2019-12-24 | 合肥工业大学 | Electroencephalogram emotion recognition method based on attention mechanism |
CN111046734A (en) * | 2019-11-12 | 2020-04-21 | 重庆邮电大学 | Multi-modal fusion sight line estimation method based on expansion convolution |
CN111160239A (en) * | 2019-12-27 | 2020-05-15 | 中国联合网络通信集团有限公司 | Concentration degree evaluation method and device |
CN112132058A (en) * | 2020-09-25 | 2020-12-25 | 山东大学 | Head posture estimation method based on multi-level image feature refining learning, implementation system and storage medium thereof |
CN112603336A (en) * | 2020-12-30 | 2021-04-06 | 国科易讯(北京)科技有限公司 | Attention analysis method and system based on brain waves |
Non-Patent Citations (1)
Title |
---|
Liu Jiamin, et al.: "Video-EEG interactive collaborative emotion recognition based on long short-term memory and information attention" *
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116226712A (en) * | 2023-03-03 | 2023-06-06 | 湖北商贸学院 | Online learner concentration monitoring method, system and readable storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109117731B (en) | Classroom teaching cognitive load measurement system | |
CN108399376B (en) | Intelligent analysis method and system for classroom learning interest of students | |
CN105516280B (en) | A kind of Multimodal Learning process state information packed record method | |
CN111046823A (en) | Student classroom participation degree analysis system based on classroom video | |
CN107392120B (en) | Attention intelligent supervision method based on sight line estimation | |
CN106599881A (en) | Student state determination method, device and system | |
Hu et al. | Research on abnormal behavior detection of online examination based on image information | |
CN106055894A (en) | Behavior analysis method and system based on artificial intelligence | |
CN110807585A (en) | Student classroom learning state online evaluation method and system | |
CN107909037B (en) | Information output method and device | |
CN113282840B (en) | Comprehensive training acquisition management platform | |
CN111695442A (en) | Online learning intelligent auxiliary system based on multi-mode fusion | |
CN113705349A (en) | Attention power analysis method and system based on sight estimation neural network | |
CN114120432A (en) | Online learning attention tracking method based on sight estimation and application thereof | |
CN113344479B (en) | Online classroom-oriented learning participation intelligent assessment method and device | |
CN114663734A (en) | Online classroom student concentration degree evaluation method and system based on multi-feature fusion | |
Bhamare et al. | Deep neural networks for lie detection with attention on bio-signals | |
CN116434341A (en) | Student classroom abnormal behavior identification method and system | |
Ashwin et al. | Unobtrusive students' engagement analysis in computer science laboratory using deep learning techniques | |
CN113331839A (en) | Network learning attention monitoring method and system based on multi-source information fusion | |
Duraisamy et al. | Classroom engagement evaluation using computer vision techniques | |
CN113591678A (en) | Classroom attention determination method, device, equipment, storage medium and program product | |
CN112818741A (en) | Behavior etiquette dimension evaluation method and device for intelligent interview | |
CN110852284A (en) | System for predicting user concentration degree based on virtual reality environment and implementation method | |
CN116052264A (en) | Sight estimation method and device based on nonlinear deviation calibration |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20210903 |