WO2022169424A1 - A device and a process for managing alertness and quality interaction in two-way audio-video communication platforms


Info

Publication number
WO2022169424A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
users
alertness
agent
sensors
Prior art date
Application number
PCT/SI2022/050004
Other languages
French (fr)
Inventor
Borut Likar
Denis TRCEK
Original Assignee
Univerza Na Primorskem
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Univerza Na Primorskem filed Critical Univerza Na Primorskem
Priority to EP22712098.7A priority Critical patent/EP4289115A1/en
Publication of WO2022169424A1 publication Critical patent/WO2022169424A1/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00Data switching networks
    • H04L12/02Details
    • H04L12/16Arrangements for providing special services to substations
    • H04L12/18Arrangements for providing special services to substations for broadcast or conference, e.g. multicast
    • H04L12/1813Arrangements for providing special services to substations for broadcast or conference, e.g. multicast for computer conferences, e.g. chat rooms
    • H04L12/1831Tracking arrangements for later retrieval, e.g. recording contents, participants activities or behavior, network status
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0639Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/10Office automation; Time management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/01Social networking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/20Education
    • GPHYSICS
    • G09EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09BEDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B5/00Electrically-operated educational appliances
    • G09B5/08Electrically-operated educational appliances providing for individual presentation of information to a plurality of student stations
    • G09B5/10Electrically-operated educational appliances providing for individual presentation of information to a plurality of student stations all student stations being capable of presenting the same information simultaneously
    • GPHYSICS
    • G09EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09BEDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B7/00Electrically-operated teaching apparatus or devices working with questions and answers
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/02User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail using automatic reactions or user delegation, e.g. automatic replies or chatbot-generated messages

Definitions

  • the present invention belongs to the field of audio-video communication platforms for distance education (i.e., lectures, teaching and communication) and on-line meetings for various types of listeners (pupils, students, university students, adults etc.).
  • the invention relates to a process for managing two-way interaction during use of audio video communication platforms with digital channels.
  • Audio-video communication platforms needed for distance learning, lecturing and online meetings offer the two-way transmission of audio and video, as well as additional communication channels, such as those for sending text, expression of the listeners’ reactions (raised finger, clapping, notices that the pacing is too quick, etc.), production and implementation of quizzes, additional video effects (e.g. background selection, virtual and augmented reality), screen sharing, use of breakout rooms, recording, etc.
  • One-way communication. This scenario discusses the problem of one-way communication. Due to frequent “invisible presence” or cases where the listeners have no video feeds, there is no feedback regarding problems, e.g., when a subject is not understood or when a listener (dis)agrees with a topic or a statement.
  • the listeners’ video feeds are also a problem, because - although the lecturers can see them - there are too many of them, due to the large number of listeners, for the lecturer to be able to observe them to a useful extent.
  • Some platforms partially solve this problem by allowing the listeners to give positive or negative feedback, e.g., in the form of a thumbs-up or a thumbs-down icon.
  • the solution is partially effective, because the lecturer can only see a small part of the listeners (or their profiles with video feeds) showing the (dis)agreement icon.
  • this solution requires the listeners to be ready to actively respond in the indicated ways. It is also problematic when the listeners are tasked with reading through a given study text, as the lecturer does not have any information about whether they have finished or not.
  • the technical problem solved by the present invention is how to design a suitable device and perform a process for providing feedback based on different parameters, which will allow a lecturer, a teacher or other participants an assessment of individual and group feedback.
  • US 2020/0085312 A1 discloses systems and methods for detecting a physiological response using different types of photoplethysmography sensors. Examples of physiological responses that may be detected include a stroke, a migraine, and certain emotional responses.
  • a camera captures images of a part of the skin on the subject’s head and correlates them with the photoplethysmogram signal on the neighboring skin segment. The solution remains limited to the mentioned detection and no further applications are covered.
  • US 2020/0210781 A1 discloses a system and a method that enable context-dependent detection of passive events (e.g. the locomotor system of the user) on a local device using commercially-available low-power sensors, e.g. on a mobile phone (to reduce the need for additional computing, e.g. cloud-based processing).
  • the solution remains limited to the mentioned detection and no further applications are covered.
  • Patent application WO 2020/061358 A1 relates to systems, devices, and methods for use in the implementation of a human-computer interface using high-speed tracking of user interactions.
  • the embodiments also include neural, motor, and electromyography signals to mediate user manipulation of machines and devices.
  • This solution aims at improvements of human-computer interfaces for manipulating machines and devices.
  • US 2020/0192478 A1 relates to systems, devices, and methods for use in the implementation of a brain-computer interface that integrates real-time eye-movement tracking with brain activity tracking to update a user interface at high speed.
  • the embodiments also relate to the implementation of a brain-computer interface that uses real-time eye tracking and real-time analysis of neural signals to mediate user manipulation of a device. This solution cannot be used meaningfully in (two-way) communication platforms such as the ones addressed in this application.
  • WO 2005/076241 A1 describes a method and apparatus for recording operations, which are transferred to a server so that the given operation can be performed at a remote location using guidance on the basis of the recording in question. This solution aims at training the user for performing particular operations and no other application is considered.
  • the patent US 10,692,606 B2 presents methods, computer programs, and systems that can obtain biometric data of a first user and transmit them to their own computer or to themselves, or to the computer of a second user or to a second user.
  • Biometric feedback is used for the current stress level classification and includes haptic feedback.
  • the patent aims at measuring stress level and not alertness, nor does it aim at improving interaction in terms of better education, acquiring knowledge and managing groups of pupils and students.
  • Patent application US 20200358627 describes a solution inside a computing system, which provides a meeting evaluation based on the parameters from meeting quality monitoring devices.
  • the quality parameters (each quantifying meeting conditions during one or more previous meetings) are used to determine an overall quality score for each of the meetings.
  • the meeting insight computing system relies on quality parameters received from a plurality of meeting quality monitoring devices, such as Internet-of-Things devices. These parameters enable an understanding of the real-world context in which the meetings take place and can be used by users or organizations to improve overall meeting quality through appropriate planning of these meetings. So, the patent aims at improving the quality of the meetings via context determination, like an appropriate structure of attendees, an appropriate room and time of day, etc. These parameters are controlled by an organization.
  • the presented invention aims to solve the disadvantages of prior art and inherently functions differently from the above-described known solutions. It is based on the deployment of an agent that gathers data and controls the alertness and quality of interactions. It does not necessarily need external devices but uses the meeting-providing devices that are available at the spots where the on-line session is taking place. It further uses data from the devices that attendees are wearing (like smart watches), and finally, it uses the data from the communication equipment of attendees of on-line meetings / lectures. These data come via at least one of four channels described in further detail below and are used by the agent to make an estimation of the quality of interactions, based on information from at least one channel, but preferably all four channels.
  • the device for providing feedback in two-way audio-video communication platforms comprises an agent arranged to receive, track and/or analyze data from listeners of online lectures and/or participants of online meetings and/or lecturer(s) and/or meeting manager(s) (leaders) based on data provided by the sensors embedded in deployed digital devices, including the devices performing the two-way communication.
  • the said digital devices and sensors form one or preferably more channels selected in the group consisting of:
  • - System-process channel for following, e.g., speech with the microphone on or off, mouse and/or tracking pad movement, the use of other applications, keyboard use, etc., which can be formed exclusively on the digital device on which the two-way communication platform is running, like a personal computer;
  • - Motor channel for following the movements of the participants, e.g., body, head or eye movements, obtained, for example, via a camera;
  • - Physiological channel for following physiological responses of the participants, like blood pressure, oxygenation or body temperature, obtained via, for example, smart watches;
  • - Neurological channel for following neuro-related data, focused on central or peripheral nervous system data, e.g., obtained via portable EEG devices or their lightweight variants, or any other device arranged to capture neurological signals.
  • the agent performs at least the following operations:
  • the agent may be designed as purely software-based, hardware-based or as a combination of software and hardware. It consists of a data capturing and aggregating part, a data analysis part (which can include statistical methods or machine learning methods), and a data presentation part. Further, the agent is arranged to provide general feedback for all participants. It can also allow the formation of bonus points, wherein the lecturer can award the involvement of particular participants on the basis of their activity during the lecture.
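The three-part agent structure described above can be sketched as follows; this is an illustrative sketch only, in which the class and method names, the [0, 1] score normalization and the 0.33/0.66 cut-offs are assumptions, not part of the disclosed device:

```python
from statistics import mean

class AlertnessAgent:
    """Illustrative three-part agent: data capture, analysis, presentation."""

    def __init__(self):
        self.samples = {}  # participant id -> list of aggregated scores in [0, 1]

    def capture(self, participant, channel_scores):
        # Data capturing and aggregating part: collect normalized scores
        # coming from any of the available channels.
        self.samples.setdefault(participant, []).append(mean(channel_scores))

    def analyze(self, participant):
        # Data analysis part: a plain average here; statistical or
        # machine-learning methods could be substituted.
        return mean(self.samples[participant])

    def present(self, participant):
        # Data presentation part: map the score onto a coarse alertness label.
        score = self.analyze(participant)
        return "high" if score > 0.66 else "average" if score > 0.33 else "low"

agent = AlertnessAgent()
agent.capture("listener-1", [0.9, 0.8])  # e.g., system-process and motor channels
agent.capture("listener-1", [0.7, 0.9])
print(agent.present("listener-1"))       # high
```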
  • - low alertness is characterized by at least one of the following options: frequent looking away from the screen or camera, frequent yawning, longer-lasting speaking without microphone use (also after automatic notification that the microphone is off), use of other programs or applications, eye movements in various directions, frequent whole-body movement, changes in physiological and/or neurological parameters, longer absence from the camera; and
  • - high alertness is a state of active attention, when, for example, users mostly look at the screen or the camera, promptly react to questions, messages and/or tasks, rarely or never yawn, never or rarely disappear from the camera view, and similar.
  • Average alertness is somewhere in between both extreme states, wherein the state may vary during a single meeting or lecture and can be followed by the above-described device comprising the agent.
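As a minimal sketch, the low/average/high distinction above could be implemented by counting low-alertness indicators observed within one time window; the indicator names and the cut-off of three indicators are illustrative assumptions:

```python
# Hypothetical indicator names; the cut-off of three indicators per
# window is an illustrative assumption.
LOW_ALERTNESS_INDICATORS = {
    "looked_away", "yawned", "spoke_with_mic_off",
    "used_other_app", "left_camera_view",
}

def classify_alertness(events):
    """Classify one observation window of events into low/average/high."""
    hits = sum(1 for e in events if e in LOW_ALERTNESS_INDICATORS)
    if hits >= 3:
        return "low"
    if hits == 0:
        return "high"
    return "average"

print(classify_alertness(["looked_away", "yawned", "used_other_app"]))  # low
print(classify_alertness(["answered_question"]))                        # high
```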
  • the device also allows the optional implementation of calibration questions at the beginning of the meeting or lecture and during the meeting or the lecture, wherein the question can be formulated in a manner that allows determination of a baseline (using, for example, heart rate, etc.). Further, a question during the lecture or meeting can be formulated specifically to allow the users to evaluate their own attention. In addition, other means, like specific short videos, can be used by the device.
  • the device for providing feedback according to the invention comprises at least the following components:
  • An audio-video communication platform arranged for interaction among one or more users
  • sensors may be selected in the group consisting of:
    o Camera for detecting body/part-of-the-body movements and face changes,
    o Camera and/or similar sensor for tracking eye movements,
    o Microphone for speech detection (including microphone on and/or off state),
    o Keypad and/or mouse and/or tracking pad sensors,
    o Heart rate sensor,
    o Blood oxygenation level sensor,
    o EEG,
    o Any other suitable sensor, wherein any device that provides biometry-specific data, like a keyboard, can also be considered a sensor,
  • An agent arranged to receive, track and analyze the data from the digital devices and/or sensors, in order to provide feedback on the state of alertness of the participants,
  • a memory device for storing the information from the sensors and for the analysis performed by the agent
  • a module for providing feedback information to the lecturer, wherein the lecturer is shown an individual and/or a group feedback for participants in any suitable representation, most preferably as numerical values or a color-coded scale. For example, a high numerical value is connected with high attention, or a green color means high attention, while a red color means low attention.
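The mapping from a numerical attention value to the color-coded scale can be sketched as follows; the 0-10 range and the 4/7 thresholds are illustrative assumptions:

```python
def feedback_color(score, low=4, high=7):
    """Map a 0-10 attention score onto the color-coded scale: green for
    high attention, yellow for average, red for low. The 4/7 thresholds
    are illustrative assumptions."""
    if not 0 <= score <= 10:
        raise ValueError("score must be in [0, 10]")
    if score >= high:
        return "green"
    if score >= low:
        return "yellow"
    return "red"

print(feedback_color(9))  # green
print(feedback_color(1))  # red
```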
  • the process for providing feedback about the state of alertness of users of two-way communication platforms comprises at least the following steps: a) Detection of the type of digital devices used by the users (listeners, participants) and connection with at least one detected device, wherein it is preferred that the user of the device gives consent to the agent to collect data from the detected device, b) Data collection from the communication device used by the users or from sensors on the listeners or in their surroundings, transmitted via communication protocols like WiFi, Bluetooth and so on, wherein the data is about the listeners and at least one item can be selected in the group of the following:
  • None of the existing solutions have a system for monitoring responses of listeners/participants (e.g., their concentration or alertness) or explicit digital conceptualizations using at least one of the four key channels: system-process, motor, physiological, and neurological. Furthermore, none of them provide an effective display method (a color display line with an appropriate scale, or a display pointer that moves along an appropriate scale) for relevant feedback according to each channel, a combination of these channels, or across all channels together, in real time or post-festum.
  • none of these methods ensure the privacy of the listeners while also appropriately informing the lecturer about the state of the listeners (auditorium), which is possible with the present solution, because the agent privacy-protects the data with methods like anonymization, differential privacy and so forth, and discards the original private data after the required functionality is provided.
  • Figure 1 shows a schematic view of the process and all the devices and participants involved in the process for providing feedback in two-way audio-video communication means.
  • the present invention uses digital devices 9, such as computers, smartphones and electronic tablets, and digital communications to establish sensory channels 1 to 4 from the listeners to the lecturer, through which system-process data about the basic processes on the used digital devices flow, including data from hardware elements, such as camera, microphone, keyboard and mouse.
  • Established channels 1 to 4 (motor, system, physiological, neurological) provide the data to the agent 5 for analysis, while the agent 5 gives then the lecturer feedback about the motor/physiological/neurological state of the listeners and their activities and responses.
  • this refers, for example, to the state of the listeners or the auditorium with regards to concentration and attention, agreement or disagreement, lecture-unrelated activities (making calls, playing games, etc.), and other important parameters that serve as vital information for the lecturer in providing a quality lecture.
  • Threshold values for motor/neurological/physiological states cannot be determined in a general manner; some tangible values can be experimentally defined, however, the variability of users has to be taken into account.
  • a possible method for the experimental definition of threshold values is the following:
    i. Define the tracked variables, like eye movement, hand or palm position, heart rate, mouse movements, etc.,
    ii. Perform an experimental phase with several lectures with different lecturers and listeners, with tracking of the defined variables, wherein the camera and other necessary sensors are continuously tracked and analyzed,
    iii. Each lecture is evaluated based on the opinion of the listeners, the lecturer and independent evaluators also attending the lectures,
    iv. Based on the analysis of the data and the opinion of all participants, the key values for each particular variable are determined in order to achieve different levels of alertness and/or lecture quality,
    v. Statistical analysis and averaging are used to define the final threshold values or ranges of values for each particular variable.
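The statistical averaging in the final step of this method can be sketched as follows; the function name and the heart-rate example values are illustrative assumptions, and the spread (sample standard deviation) is one simple way to turn averaged measurements into a range of threshold values:

```python
from statistics import mean, stdev

def derive_threshold(samples_by_lecture):
    """Derive a threshold (and its spread) for one tracked variable from
    several experimental lectures: each inner list holds values of the
    variable measured during a lecture judged as attentive by listeners,
    lecturer and independent evaluators."""
    per_lecture_means = [mean(s) for s in samples_by_lecture]
    return mean(per_lecture_means), stdev(per_lecture_means)

# e.g., heart-rate samples (illustrative numbers) from three test lectures
threshold, spread = derive_threshold([[70, 72], [68, 70], [74, 76]])
print(round(threshold, 1), round(spread, 1))  # 71.7 3.1
```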
  • Telemetric (remote) data capture via appropriate sensors. This is done through four basic sensory channels 1 to 4, which are of a logical nature and realized as: system-processes 1, motor 2, physiological 3, and neurological 4. They provide the lecturer 11 with real-time feedback on whether the listeners 10 are in a state of attention and actively following the education process or they are in an uncooperative state (e.g., too tired or in an insufficiently active state). Accordingly, the lecturer 11, who can be a teacher, a professor, an expert or any other person leading the lecture or the meeting, can adjust the course of teaching or lecturing. These channels, which combine the data according to their generic origin, are captured via sensors 14 that are already included on the used digital devices 9, for example computers, or are enabled via additional equipment (e.g., virtual reality headsets) connected to the audio-video platform 8.
  • the agent 5 functions as a proxy, analyzes the data collected from the listeners 10 and/or the lecturer 11, and ensures data privacy, for example by masking the personal identification parts, or ensures the privacy of the participants in some other way, for example by using differential privacy techniques.
  • the agent 5 manages the module for storing information 15 to store basic and analytically derived data and performs real-time and post-festum analytics.
  • the analytics includes both the processing of the data obtained from the individual listeners 10 and the lecturer 11, as well as privacy-ensuring processing of the data and the production of their aggregates.
  • the data analysis is based on methods such as pattern detection and statistical methods, where the received data can be analyzed in relation to an established data set for alertness, and machine learning, where the data from an alert person are used for training a neural network and the received data is then submitted to this network for evaluation.
  • the lecturer is also shown a general result of the processing of individual parameters based on the metrics used for the entire virtual meeting, a ranking compared to other virtual meetings and the possibility of additional analytics (e.g., pivot tables).
  • the agent 5 is also associated with an optional two-way subchannel realized via the audio-video platform 8 that is called the articulation subchannel 7, which the agent 5 uses to help the listeners 10 articulate their questions or capture the natural responses, such as confirmations by giving a thumbs-up 12 or nodding 13, which can be forwarded to the lecturer after they are processed.
  • This subchannel, which runs through the platform's central communication channel, can deploy various techniques to help the listeners (e.g., chat-bot techniques, which may be basic, like spelling and syntax checking, or advanced, like natural language processing technology).
  • the main element is generally a dynamic color-coded bar (e.g., going from green, which represents full attention, through yellow to red, which represents absence of attention) or a Likert scale (e.g., from 0 to 10, where 0 means absence of attention and 10 full attention) or an analogous method that shows other reactions or indicators, such as agreement, completion of page reading, etc.
  • listeners 10 can, for example, choose not to transmit their camera feeds to the audio-video communication platform 8 where the other participants would see them, because the other sensors 14 (in connection with the indicated four channels 1 to 4) make it possible to collect the data necessary for analytics.
  • the camera can also be active solely for the purposes of analytics, without showing the feed on the audio-video platform 8.
  • the present invention enables the detection of attention in a way that the privacy of the participants (listeners 10 and lecturer(s) 11) is ensured, whereby the data that is available to the lecturer (or a potential attacker) is insufficient to carry out an attack on the user's identity (i.e., their privacy is protected). It is desirable that the communication takes place over an encrypted channel, which is supported by most audio-video platforms 8.
  • Example 1: Talking to a wall
  • Telemetry is used to collect and analyze data about listeners 10 from their digital devices (e.g., a computer, mobile phone, etc.). In this way, lecturers 11 obtain feedback about whether the concentration level of the listener 10 has decreased. Feedback about the listeners’ concentration levels is presented to the lecturer via the module for providing information 6, e.g., a color scale or a bar, measuring line or other suitable representation of results that dynamically changes color from green (“attention present”) to red (“attention absent”).
  • This type of telemetry is based on the data captured via sensors associated with the indicated four channels.
  • the system detects the type of digital device the listener 10 is using (e.g., a computer with one or two monitors, a smartphone, an electronic tablet, etc.). Accordingly, the process then branches out as indicated below.
  • the following information may indicate lower concentration:
  • the agent 5 acts as a proxy, which may also be in the form of an avatar.
  • the agent 5 processes the received data, e.g., by means of processing the video feed and other signals using mathematical (statistical) procedures and artificial intelligence procedures, in particular machine learning.
  • In mathematical procedures, reference values are used and the received values are compared with these reference values to determine if attention thresholds are exceeded.
  • In machine learning, software like a neural network is trained with data from a subject that is in an alert state, and the received data is then fed into the network, which decides if alertness is present or not. For new kinds of data, correlations with known and already deployed kinds of data can be used.
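The reference-value comparison can be sketched as follows; this is a nearest-centroid stand-in for the trained network, and the variable names, reference values and deviation threshold are illustrative assumptions:

```python
from math import dist

# Reference values for an alert subject per tracked variable (e.g.,
# normalized eye-on-screen ratio, reaction promptness), as would be
# obtained in a training/calibration phase. All numbers are
# illustrative assumptions.
ALERT_REFERENCE = [0.9, 0.8]
MAX_DEVIATION = 0.4  # attention threshold on the distance to the reference

def is_alert(observation):
    """Compare received values with the reference values: a
    nearest-centroid stand-in for the trained-network evaluation."""
    return dist(observation, ALERT_REFERENCE) <= MAX_DEVIATION

print(is_alert([0.85, 0.75]))  # True
print(is_alert([0.2, 0.3]))    # False
```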
  • Aggregates and/or privacy-preservingly processed forms of the received data are sent by the agent 5 to the lecturer via the module for providing information 6, generally in the form of a dynamic graphic color display of attention. This is visible for each listener 10 separately - similar to lectures in person, where, due to being physically present, the lecturer 11 can observe each of the listeners 10.
  • the system can also display aggregate data for the entire auditorium of all listeners 10, which represent the average or weighted values of individual parameters.
  d) Storage of information
  • the agent 5 discards other data in real time or processes them appropriately for subsequent analysis, e.g., by anonymizing them or by deploying differential privacy techniques, whereby the agent 5 can also store all other data pursuant to the EU GDPR requirements within the module for storing information 15.
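The anonymization and differential-privacy processing mentioned above can be sketched as follows; the function names, the keyed-hash pseudonymization and the Laplace-noise construction are illustrative assumptions about one possible realization:

```python
import hashlib
import random

def anonymize(user_id, salt="per-session-secret"):
    # Replace the personal identifier with a keyed one-way hash so the
    # stored records cannot be directly linked back to the listener.
    return hashlib.sha256((salt + user_id).encode()).hexdigest()[:12]

def dp_average(scores, epsilon=1.0, rng=random):
    """Differentially private average of [0, 1] attention scores:
    Laplace noise (built as the difference of two exponentials) hides
    any single listener's contribution to the aggregate."""
    true_avg = sum(scores) / len(scores)
    sensitivity = 1.0 / len(scores)  # one score moves the average by <= this
    lam = epsilon / sensitivity
    noise = rng.expovariate(lam) - rng.expovariate(lam)
    return true_avg + noise

stored_id = anonymize("listener-42")  # pseudonymous record key
noisy_avg = dp_average([0.8, 0.6, 0.9])  # privacy-protected aggregate
```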
  • the lecturer 11 gives the listeners 10 an assignment or case study to examine, but then does not know who has finished reading (e.g., a case study or part of a page).
  • the agent 5 uses a camera or other sensors 14 to detect typical head movements 13 when reading, eye tracking (via the so-called heat maps) and sends to the lecturer 11 comprehensive information in real time about how many listeners 10 have finished reading.
  • the number of listeners may also be given as a share of users that finished the task. This means that gradual and/or repetitive (in the case of longer texts) downward movements of the head or eyes 13 are detected, which indicates the tracking of text or image.
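The detection of gradual downward head or eye movements and the share of finished readers can be sketched as follows; the normalized gaze coordinates, thresholds and function name are illustrative assumptions:

```python
def finished_reading(gaze_y, page_bottom=0.95, min_downward=0.8):
    """Infer reading completion from normalized vertical gaze/head
    positions (0 = top of the text, 1 = bottom). Gradual, mostly
    downward movement that reaches near the bottom counts as finished;
    the constants are illustrative assumptions."""
    if len(gaze_y) < 2:
        return False
    downward = sum(1 for a, b in zip(gaze_y, gaze_y[1:]) if b >= a)
    mostly_downward = downward / (len(gaze_y) - 1) >= min_downward
    return mostly_downward and max(gaze_y) >= page_bottom

# Share of listeners that finished, as reported to the lecturer
traces = {"u1": [0.1, 0.3, 0.5, 0.8, 0.97], "u2": [0.1, 0.2, 0.15, 0.3]}
share_done = sum(finished_reading(t) for t in traces.values()) / len(traces)
print(share_done)  # 0.5
```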
  • this solution also uses the following process structure: Data collection, Data analysis, Display of information.
  • This scenario refers to the lack of motivation of the listeners 10 to participate compared to classic methods of teaching or lecturing.
  • a possible solution is to use additional methods of motivation through a motivational scoring scale or collecting bonus points.
  • Listeners 10 who participate gain bonus points that are shown on the motivational scoring scale of each listener 10 and collected in an appropriate file or the module for storing information 15. They are obtained on the basis of different elements of participation, e.g., oral participation, written questions raised in a dialog box, participation in quizzes within the audio-video communication platform 8 and monitoring of reactions and active participation in other activities offered by the platform 8.
  • Bonus points are assigned via the agent 5 to the individual listeners 10 and sent to the lecturer 11, who can take them into account in assessment. The substantive quality of the participation can be weighted accordingly by the lecturer 11.
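The accumulation of bonus points from different elements of participation can be sketched as follows; the participation types, point weights and quality-weighting parameter are illustrative assumptions:

```python
# Illustrative point weights per type of participation (assumptions).
POINTS = {"oral": 3, "chat_question": 2, "quiz": 2, "reaction": 1}

def bonus_points(activities, weight=1.0):
    """Sum bonus points for one listener; `weight` lets the lecturer
    weight the substantive quality of the participation."""
    return weight * sum(POINTS.get(a, 0) for a in activities)

print(bonus_points(["oral", "quiz", "reaction"]))    # 6.0
print(bonus_points(["chat_question"], weight=1.5))   # 3.0
```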
  • the agent 5 presents the likelihood of the lecturer/group having a quality meeting by analyzing the parameters of the current and previous lectures.
  • the lecturer enters certain parameters via the module for entering the parameters of the current lecture 16, which affect the quality of a meeting.
  • These include the term of the lecture (date and time), listeners’ characteristics (e.g., age, motivation), structure of the lecture (e.g., an ex-cathedra lecture, discussion, case study, group work, text assignments and similar), and a subjective assessment of the attractiveness of the lecture topic (complexity or dryness of the topic, quality of the lecture).
  • the process includes determining for each of the indicated parameters whether it will have a positive or a negative impact (e.g., lower ratings are given when only an ex-cathedra lecture is held in the evening).
  • the estimation/final value can be calculated either using the standard statistical methods, e.g., weighted sum of influencing parameters, multiple regression etc. or using machine learning tools.
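The weighted-sum variant of this estimation can be sketched as follows; the parameter names, their [-1, 1] scoring and the weights are illustrative assumptions:

```python
def estimate_quality(params, weights):
    """Weighted-sum estimate of the likelihood of a quality meeting.
    Each parameter is scored in [-1, 1] (negative = unfavourable, e.g.,
    an evening ex-cathedra-only lecture); parameter names and weights
    are illustrative assumptions."""
    total = sum(weights[k] * params[k] for k in params)
    return total / sum(weights.values())  # normalized back to [-1, 1]

params = {"time_of_day": -0.5, "interactivity": 1.0, "topic_appeal": 0.2}
weights = {"time_of_day": 1.0, "interactivity": 2.0, "topic_appeal": 1.0}
print(round(estimate_quality(params, weights), 3))  # 0.425
```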
  • the quality of the rating can be further improved if the agent 5, when making the rating, also takes into account the data from the previous lectures (e.g., by extrapolation) for which comparable data are available about the term of the lecture, listeners' characteristics, structure of the lecture and subjective assessment of the attractiveness of the lecture topic.
  • a comparison is made according to individual parameters for the current and the previous lecture(s) and then used to additionally correct the already provided rating (for the current lecture).
  • a rating of previous lectures can be obtained in two ways - by directly rating the listeners’ experience and/or by monitoring data from the four key channels: system-process 1, motor 2, physiological 3, and neurological 4.
  • Examples include ex-cathedra lecturing without any interactive parts, with a complex and dry topic in the evening, to full-time listeners 10 who are generally less motivated than part-time listeners. In this case, the listeners’ estimated rating will be lower. The rating is lowered further by the fact that even more interesting lectures held during the day receive only average ratings. The estimated rating can be appropriately taken into account by the lecturer 11 to correct certain lecture parameters.
  • Stages of the agent’s process include: entry and analysis of the current lecture’s parameters via the module for entering the parameters of the lecture 16, extrapolation of the rating on the basis of previous lectures (optional), and display of information.
  • the solution relates to discussions in a virtual space (or lecture room), which are very important for a successful educational process.
  • Practice shows that the number of questions raised in distance teaching (or lectures) is relatively lower than in the case of direct contact.
  • listeners 10 often, especially when there is a large number of listeners 10, prefer to use a dialog box to type messages 19 instead of using a microphone 14, but this is a medium that requires the ability to formulate a question in a concise manner. Therefore, questions are often insufficiently articulated (partly because typing is a more complex process than verbal expression).
  • a possible solution for the listener is to first have an anonymous conversation with the artificial intelligent agent 5 through the articulation channel 7. The agent provides guidance on articulating the question properly and, once the goal has been achieved, the lecturer 11 receives a notification about the question, after which the question is submitted to them (in writing or orally).
  • the agent 5 can thus further function as an anonymizing proxy or a so-called chatbot, thereby ensuring privacy, while at the same time the obtained data offer insight into the question articulation process, providing additional feedback for the lecturer about the listeners’ knowledge and abilities.
  • Stages of the process include: entry of a question, its articulation with the agent’s support, display of the text (or a question, a remark, or other). This may be done by deployment of chat-bot technologies. For example, as long as a question is not well formulated for a chat-bot, it is not forwarded to the leader of the session.
  • a possible scheme of the agent acting as the chat-bot for articulating questions is the following: a) Entry of a text (question/remark/other), wherein the participant enters the text, which may not be optimal or well formulated due to parallel work during the lecture. b) Articulation with the agent - a formal checking phase performed by the chat-bot function of the agent:
  • The AI (artificial intelligence) checks the content using two or more different AI algorithms, similar to the matching approach used in machine translation. If only one algorithm/program provides a translation, there is a possibility of wrong understanding. If two or more independent translation algorithms/programs arrive at the same translation, the probability of a correct translation is significantly higher. The AI can also use natural language processing.
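The rating estimation described earlier in this section, a weighted sum of influencing parameters optionally corrected by ratings of comparable previous lectures, can be illustrated with a minimal sketch. The parameter names, weights, 0..1 scale and blending factor below are hypothetical and not part of the claimed process:

```python
# Minimal sketch of the rating estimation: a weighted sum of influencing
# parameters, optionally corrected by ratings of comparable previous lectures.
# All parameter names, weights and scales are hypothetical.

def estimate_rating(params, weights, previous_ratings=None, blend=0.3):
    """Return a lecture-quality estimate on a 0..1 scale.

    params  -- dict of parameter scores in [0, 1] (e.g. {'term': 0.4, ...})
    weights -- dict of non-negative weights for the same keys
    previous_ratings -- optional ratings of comparable past lectures
    blend   -- how strongly the historical average corrects the estimate
    """
    total_w = sum(weights[k] for k in params)
    base = sum(params[k] * weights[k] for k in params) / total_w
    if previous_ratings:
        history = sum(previous_ratings) / len(previous_ratings)
        base = (1 - blend) * base + blend * history
    return round(base, 3)

# Example: an evening ex-cathedra lecture on a dry topic scores low on most
# parameters, so the estimated rating comes out low.
current = {"term": 0.3, "listeners": 0.4, "structure": 0.2, "topic": 0.3}
w = {"term": 1.0, "listeners": 1.0, "structure": 2.0, "topic": 1.5}
print(estimate_rating(current, w, previous_ratings=[0.5, 0.55]))
```

With these illustrative inputs the estimate stays low, which the lecturer 11 could use to correct certain lecture parameters, as described above.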
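The agreement check by two or more independent algorithms, mentioned above in the context of machine translation, can likewise be sketched. The two toy text normalizers below stand in for real, independent translation or NLP models, which are assumed rather than specified here:

```python
# Sketch of the agreement check: a question is only forwarded to the session
# leader when at least `min_agree` independent algorithms interpret it the
# same way. The two toy "algorithms" below are placeholders for real
# translation/NLP models.

def algo_normalize_case(text):
    # toy interpreter 1: lower-case and collapse whitespace
    return " ".join(text.lower().split())

def algo_strip_punctuation(text):
    # toy interpreter 2: additionally drop punctuation
    cleaned = " ".join(text.lower().split())
    return "".join(c for c in cleaned if c.isalnum() or c == " ")

def ready_to_forward(question, algorithms, min_agree=2):
    """True if at least min_agree algorithms give identical interpretations."""
    results = [algo(question) for algo in algorithms]
    return max(results.count(r) for r in results) >= min_agree
```

Until `ready_to_forward` returns True, the chat-bot function of the agent 5 would keep guiding the listener toward a better-formulated question instead of forwarding it.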

Abstract

The device for providing feedback about the alertness of users in two-way audio-video communication platforms comprises an agent arranged to receive, track and/or analyze data from the users based on data provided by sensors and deployed digital devices forming one or more channels: − a system-processes channel for following speech, mouse and keyboard use, and use of other programs, − a motor channel for following locomotion data of users, − a physiological channel for following physiological responses of users such as blood pressure, oxygenation and body temperature, − a neurological channel for following neuro-related data obtained via portable EEG devices or similar. Low alertness indicators include: frequent looking away from the screen or camera, frequent yawning, longer-lasting speaking without microphone use, use of other programs or applications, eye movements in different directions, frequent whole-body movements, changes in physiological and/or neurological parameters, and longer absence from the camera. High alertness is a state of active attention in which, for example, users mostly look at the screen or camera, promptly react to questions, messages and/or tasks, rarely or never yawn, and never or rarely disappear from the camera.

Description

A device and a process for managing alertness and quality interaction in two-way audio-video communication platforms
Field of the invention
The present invention belongs to the field of audio-video communication platforms for distance education (i.e., lectures, teaching and communication) and on-line meetings for various types of listeners (pupils, students, university students, adults, etc.). The invention relates to a process for managing two-way interaction during use of audio-video communication platforms with digital channels.
Background of the invention and the technical problem
The use of digital media and digital devices is becoming the basis for a multitude of activities that are vital for societies, including education and distance learning. As such, there is a growing need for solutions that primarily enable distance education, while also aiming to make the conditions as equivalent as possible to teaching, lecturing and communication under normal, face-to-face conditions. There is also a need to use the new technologies to master aspects that represent a problem under normal conditions (or indeed cannot be included at all), but which can be very useful in teaching and are made possible by digital technology. It is therefore necessary to include these novel features as well, where one of the key goals is the personalization of the learning and study process. Audio-video communication platforms needed for distance learning, lecturing and online meetings offer the two-way transmission of audio and video, as well as additional communication channels, such as those for sending text, expression of the listeners’ reactions (raised finger, clapping, notices that the pacing is too quick, etc.), production and implementation of quizzes, additional video effects (e.g., background selection, virtual and augmented reality), screen sharing, use of breakout rooms, recording, etc. At the same time, we are presented with the challenge of how to ensure that the conditions in the digital environment of the audio-video communication platforms are as close as possible to ordinary meetings in person, in order to have a quality virtual meeting.
The whole problem will be presented through the following scenarios, which are the basis for the solutions in this patent application:
1. “Talking to a wall”. This scenario relates to the so-called effect of talking to a wall. In distance education environments, lecturers experience the feeling of talking into a void, because they are deprived of direct personal contact - feedback from listeners - which has lately been compounded by the GDPR (General Data Protection Regulation) requirements. This contact is important, because it allows the lecturer to receive the appropriate reactions. Practical experience shows that it is harder for listeners in these environments to maintain attention, as they engage with other communication channels (e.g., read text messages on their phones, surf the internet or watch streaming content, use social networks, etc.). The problem is often made worse by the multiplicity of attendees, because it is much more difficult for a lecturer in a distance education environment to keep track of tens (or even hundreds) of listeners over the network in order to obtain the relevant feedback. Furthermore, much of the feedback (such as body language or other elements of communication undetectable in virtual environments) is simply not there. Even the face itself is shown in a limited manner, as these applications lower the resolution so that they can transmit a solid enough video stream to a crowd of participants.
2. One-way communication. This scenario discusses the problem of one-way communication. Due to frequent “invisible presence”, or cases where the listeners have no video feeds, there is no feedback regarding problems, e.g., when a subject is not understood or when listeners (dis)agree with a topic or a statement. The listeners’ video feeds are also a problem because, although the lecturers can see them, there are too many of them due to the large number of listeners for the lecturer to be able to observe them to a useful extent. Some platforms partially solve this problem by allowing the listeners to give positive or negative feedback, e.g., in the form of a thumbs-up or a thumbs-down icon. In practice, the solution is only partially effective, because the lecturer can only see a small part of the listeners (or their profiles with video feeds) showing the (dis)agreement icon. At the same time, this solution requires the listeners to actively respond in the indicated ways. It is also problematic where the listeners are tasked with reading through a given study text, as the lecturer does not have any information about whether they have finished or not.
3. Motivation of the listeners. This scenario refers to the listeners’ lack of motivation for participation compared to classic methods of teaching or lecturing or on-line meetings. There are several reasons for this, one of the key ones being the impairment of the group dynamic, which is in normal environments achieved by being directly present in the same room.
4. “Bad days”. This scenario relates to the potential problem of the lecture being received poorly by the listeners in a given group of listeners and at a given time, duration and day, and with a given type of lecture relating to the structure of the lecture elements used (e.g., an ex-cathedra lecture, discussion, case study, group work, text assignments, etc.). Examples include ex-cathedra lectures, which are not the most effective when held in the evening.
5. Written question - please help! This scenario relates to discussions in the classroom or lecture room, which are very important for a successful educational process. Practice shows that the number of questions raised in distance teaching (or lecturing) is lower than in the case of direct contact. Moreover, listeners prefer to use a dialog box to type messages instead of using the microphone. However, this is a medium that requires the ability to formulate a question in a concise manner. Therefore, the questions are often carelessly written or insufficiently articulated, as writing is a more complex process than verbal expression.
The technical problem solved by the present invention is how to design a suitable device and perform a process for providing feedback based on different parameters, which will allow a lecturer, a teacher or other participants an assessment of individual and group feedback.

State of the art
The topic of the present invention is partially addressed by the patent applications and patents listed below:
US 2020/0085312 A1 discloses systems and methods for detecting a physiological response using different types of photoplethysmography sensors. Examples of physiological responses that may be detected include a stroke, a migraine, certain emotional responses. In one embodiment, a camera captures images of a part of the skin on the subject’s head and correlates them with the photoplethysmogram signal on the neighboring skin segment. The solution remains limited to the mentioned detection and no further applications are covered.
US 2020/0210781 A1 discloses a system and a method that enable context-dependent detection of passive events (e.g. the locomotor system of the user) on a local device using commercially-available low-power sensors, e.g. on a mobile phone (to reduce the need for additional computing, e.g. cloud-based processing). The solution remains limited to the mentioned detection and no further applications are covered.
Patent application WO 2020/061358 A1 relates to systems, devices, and methods for use in the implementation of a human-computer interface using high-speed tracking of user interactions. The embodiments also include neural, motor, and electromyography signals to mediate user manipulation of machines and devices. This solution aims at improvements of human-computer interfaces for manipulating machines and devices.
US 2020/0192478 A1 relates to systems, devices, and methods for use in the implementation of a brain-computer interface that integrates real-time eye-movement tracking with brain activity tracking to update a user interface at high speed. The embodiments also relate to the implementation of a brain-computer interface that uses real-time eye tracking and real-time analysis of neural signals to mediate user manipulation of a device. This solution cannot be used meaningfully in the (two-way) communication platforms such as the ones, addressed in this application.
WO 2005/076241 A1 describes a method and apparatus for recording operations, which are transferred to a server so that the given operation can be performed at a remote location using guidance on the basis of the recording in question. This solution aims at training the user for performing particular operations and no other application is considered.
The patent US 2020/10692606 B2 presents methods, computer programs, and systems that can obtain biometric data of a first user and transmit them to their computer or to themselves, or to the user of a second computer or to a second user. Biometric feedback is used for classifying the current stress level and includes haptic feedback. The patent aims at measuring stress level and not alertness, nor does it aim at improving interaction in terms of better education, acquiring knowledge and managing groups of pupils and students.
Patent application US 20200358627 describes a solution inside a computing system, which provides a meeting evaluation based on the parameters from meeting quality monitoring devices. The quality parameters (each quantifying meeting conditions during one or more previous meetings) are used to determine an overall quality score for each of the meetings. The meeting insight computing system relies on quality parameters received from a plurality of meeting quality monitoring devices, such as Internet of Things devices. These parameters enable an understanding of the real-world context in which the meetings take place and can be used by users or organizations to improve overall meeting quality by appropriate planning of these meetings. So, the patent aims at improving the quality of the meetings via context determination, such as an appropriate structure of attendees, an appropriate room and time of the day, etc. These parameters are controlled by an organization. There are several disadvantages of the mentioned patent application, the most notable of which is the absence of feedback for each individual person attending the meeting. In addition, the solution presented in this application focuses on current sessions and contexts, which cannot be changed or controlled. For example, the students’ population for a certain course that is conducted on-line is given as is. Further, the students’ private environments cannot be controlled or changed by university staff either. Further, the solution proposed in this application is focused on real-time event management, not its future planning. Further, the solution proposed in this document addresses privacy issues by design, while the mentioned patent application is not concerned with privacy at all.
Description of the invention
The presented invention aims to solve the disadvantages of the prior art and inherently functions differently from the above-described known solutions. It is based on the deployment of an agent that gathers data and controls the alertness and quality of interactions. It does not necessarily need external devices but uses meeting-providing devices that are available at the spots where the on-line session is taking place. It further uses data from the devices that attendees are wearing (like smart watches), and finally, it uses the data from the communication equipment of attendees of on-line meetings / lectures. These data come via at least one of four channels described in further detail below and are used by the agent to make an estimation of the quality of interactions, based on information from at least one channel, but preferably all four channels.
The essence of the invention is in that the device for providing feedback in two-way audio-video communication platforms comprises an agent arranged to receive, track and/or analyze data from listeners of online lectures and/or participants of online meetings and/or lecturer(s) and/or meeting manager(s) (leaders) based on data provided by the sensors embedded in deployed digital devices, including the devices performing the two-way communication. The said digital devices and sensors form one or preferably more channels selected in the group consisting of:
- System-process channel for following, e.g., speech with a microphone on or off, for following a mouse and/or tracking pad movement, for tracking the use of other applications, for keyboard use, etc., which can be formed exclusively in the used digital device on which the two-way communication platform is running, like a personal computer,
- Motor channel for following locomotion data of the participants, which are obtained from sensors that are placed on the participants, or are present in their environments, like cameras, smart glasses, acceleration sensors;
- Physiological channel for following physiological responses of the participants, like blood pressure, oxygenation, body temperature, obtained via, for example, smart watches;
- Neurological channel for following neuro-related data, focused on central or peripheral nervous system data, e.g., obtained via portable EEG devices or its lightweight variants or any other device arranged to capture neurological signals.
Although the preferred embodiment uses all four channels, any other combination of 1, 2 or 3 channels is possible. Feedback can be generated with only one channel of any of the above-mentioned types, while feedback generated from any combination of two or more channels is more reliable and more complete. The agent performs at least the following operations:
- Detection of the type of the digital devices used by users (listeners, participants) and connection with at least one detected device, wherein it is preferred that the user of the device gives consent to the agent in order to collect the data from the detected device.
- Data collection from the communication device used by users or from sensors placed on the listeners or in their surroundings, transmitted via communication protocols like WiFi, Bluetooth and so on, wherein the data is about listeners and at least one can be selected in the group of the following:
o Body/body parts movements,
o Eye movements,
o Face changes,
o Speech detection (including microphone on and/or off state),
o Heart rate,
o Blood oxygenation level,
o EEG patterns,
o Use of other applications on the used communication device,
o Mouse movements or tracking pad use,
o Keyboard typing (like time-based patterns),
o Other data from digital devices like cameras, smart watches, etc.,
- Analysis of the data collected in the previous step by the agent, wherein the analysis is based on the patterns identified as representing a specific motor, and/or physiological and/or neurological state of the listeners that express the state of alertness of the listeners, including the following ones:
• frequent and prolonged averting of the eyes away from the camera/monitor;
• frequent glances in multiple directions (up, down, left, right);
• frequent yawning, having a bored face;
  • looking downward for a longer period of time could indicate phone use - in this case, no typing is detected on the computer, which may mean that no notes are being taken;
  • nodding as a sign of agreement or as a sign of disagreement;
  • raising a hand and/or finger as a sign of agreement and the opposite - turning the finger down as a sign of disagreement;
  • raising a hand as a sign of positive voting;
• detection of speech when the microphone is off (on a video platform) - this could indicate a conversation outside the lectures;
• frequent use of the browser and other programs, which are generally not directly related to the lectures;
  • other signs transmitted from sensors via communication protocols like WiFi or Bluetooth, which may be attached to the listener or placed in their surroundings, such as, for example, changes in skin conductivity, decreased blood oxygenation level, increased heart rate, typical patterns in the electroencephalogram, dilated iris and similar, indicating certain motor-physiological-neurological states. These are, for example, boredom, decreased concentration, increased physical activity and others.
- Display of the information about the state of alertness of the listeners, wherein the results may be presented as color graphics, numbers or in any other suitable form of presentation, and wherein the results may be presented for each of the users independently and/or for the whole group,
- Forwarding the data analysis from the previous step to the module for providing information to the meeting host (leader) or lecturer, and
- optionally storage of data, deletion and/or further processing of collected data.
The agent may be designed as purely software-based, hardware-based or as a combination of software and hardware. It consists of a data capturing and aggregating part, a data analysis part (which can include statistical methods or machine learning methods), and a data presentation part. Further, the agent is arranged to provide general feedback for all participants. It can also allow the formation of bonus points, wherein the lecturer can reward the involvement of particular participants on the basis of their activity during the lecture.
State of alertness can be low, average or high or anything in between, wherein:
- low alertness is characterized by at least one of the following options: frequent looking away from the screen or camera, frequent yawning, longer lasting speaking without microphone use, also after automatic notification that the microphone is off, use of other programs or applications, eye movements in various directions, frequent whole-body movement, changes in physiological and/or neurological parameters, longer absence from the camera; and
- high alertness is a state of active attention, when for example users mostly look at the screen or the camera, promptly react to questions, messages and/or tasks, rarely or never yawn, never or rarely disappear from the camera, and similar.
Average alertness is somewhere in between both extreme states, wherein the state may vary during a single meeting or lecture and can be followed by the above-described device comprising the agent.
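As an illustration only, the mapping of observed low-alertness indicators onto the low / average / high scale described above could be sketched as follows; the indicator names and the hit-count thresholds are assumptions for the sketch, not claimed values:

```python
# Illustrative sketch (not the claimed method itself): count observed
# low-alertness indicators from the channels and map the count to the
# low / average / high scale described above.

LOW_ALERTNESS_INDICATORS = {
    "looking_away", "yawning", "speaking_mic_off", "other_apps",
    "eye_wandering", "whole_body_movement", "physio_change",
    "absent_from_camera",
}

def classify_alertness(observed):
    """Map a collection of observed indicators to 'high'/'average'/'low'."""
    hits = len(LOW_ALERTNESS_INDICATORS & set(observed))
    if hits == 0:
        return "high"      # active attention, no low-alertness signs
    if hits <= 2:
        return "average"   # a few signs, state in between the extremes
    return "low"           # many concurrent low-alertness signs
```

A user with no observed indicators would be classified as highly alert, while several concurrent indicators (e.g., yawning while using other applications and looking away) would yield a low classification.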
In order to facilitate the estimation of alertness, the device also allows the optional implementation of calibration questions at the beginning of the meeting or lecture and during the meeting or lecture, wherein the question can be formulated in a manner that allows determination of a baseline (using, for example, heart rate, etc.). Further, a question during the lecture or meeting can be formulated specifically to allow the users to evaluate their own attention. In addition, other means can be used by the device, like specific short videos, etc.
The device for providing feedback according to the invention comprises at least the following components:
- Communication means for connection with the digital devices of users (listeners, participants) and the lecturer(s),
- An audio-video communication platform arranged for interaction among one or more users,
- One or more transmitters to send the data from the sensors embedded in said digital devices, wherein sensors may be selected in the group consisting of:
o Camera for detecting body/part of the body movements, face changes,
o Camera and/or similar sensor for tracking eye movements,
o Microphone for speech detection (including microphone on and/or off state),
o Keypad and/or mouse and/or tracking pad sensors,
o Heart rate sensor,
o Blood oxygenation level sensor,
o EEG,
o Any other suitable sensor, wherein any device that provides biometry-specific data, like a keyboard, can also be considered to function as a sensor,
- An agent arranged to receive, track and analyze the data from the digital devices and/or sensors, in order to provide feedback on the state of alertness of the participants,
- A memory device for storing the information from the sensors and for the analysis performed by the agent,
- A module for providing feedback information to the lecturer, wherein the lecturer is shown either an individual and/or a group feedback for participants in any suitable representation, most preferably as numerical values or a color-coded scale. For example, a high numerical value is connected with high attention, or a green color means high attention, while a red color means low attention.
The process for providing feedback about the state of alertness of users of two-way communication platforms comprises at least the following steps:
a) Detection of the type of digital devices used by users (listeners, participants) and connection with at least one detected device, wherein it is preferred that the user of the device gives consent to the agent in order to collect data from the detected device,
b) Data collection from the communication device used by users or from sensors on the listeners or in their surroundings, transmitted via communication protocols like WiFi, Bluetooth and so on, wherein the data is about listeners and at least one can be selected in the group of the following:
- Body movements, such as nodding as a sign of agreement or as a sign of disagreement, rising hand and/or finger as a sign of agreement and opposite, turning finger down as a sign of disagreement, rising hand as a sign of positive or negative voting,
- Eye movements,
- Face changes,
- Speech detection (including microphone on and/or off state),
- Heart rate,
- Blood oxygenation level,
- EEG patterns,
- Use of other applications and/or sensors on the used communication device,
- Mouse movements or tracking pad use,
- Keyboard typing (like time-based patterns),
- Other data from digital devices like cameras, smart watches, etc.,
c) Analysis of the data collected in the previous step by the agent, wherein the analysis is based on the patterns identified as representing a specific motor and/or physiological and/or neurological state of the listeners that express the state of alertness of the listeners, including the following ones:
- frequent and prolonged averting of the eyes away from the camera/monitor;
- frequent glances in multiple directions (up, down, left, right);
- frequent yawning, having a bored face;
- looking downward for a longer period of time could indicate phone use - in this case, no typing is detected on the computer, which may mean that no notes are being taken;
- nodding as a sign of agreement or as a sign of disagreement;
- raising a hand and/or a finger as a sign of agreement and the opposite - turning the finger down as a sign of disagreement;
- raising a hand as a sign of positive voting;
- detection of speech when the microphone is off (on a video platform) - this could indicate a conversation outside the lectures;
- frequent use of the browser and other programs, which are generally not directly related to the lectures;
- other signs transmitted from sensors via communication protocols like WiFi or Bluetooth, which may be attached to the listener or placed in their surroundings, such as, for example, changes in skin conductivity, decreased blood oxygenation level, increased heart rate, typical patterns in the electroencephalogram (EEG), dilated iris and similar, indicating certain motor-physiological-neurological states;
d) Display of the information about the state of alertness of the listeners, wherein the results may be presented as color graphics, numbers or in any other suitable form of presentation, and wherein the results may be presented for each of the users independently and/or for the whole group;
e) Forwarding the data analysis from the previous step to the module for providing information to the meeting host (leader) or lecturer, and
f) optionally storage, deletion and/or further processing of collected data.
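The color-coded display of the alertness state, described above as green for high attention and red for low attention, can be sketched for both individual and group feedback. The 0..1 score scale and the two color thresholds below are illustrative assumptions:

```python
# Sketch of the display step: per-user and group alertness presented as a
# color-coded scale (green = high attention, red = low attention). The 0..1
# scoring scale and the 0.33/0.66 thresholds are assumptions for illustration.

def to_color(score):
    """Map an alertness score in [0, 1] to a traffic-light color."""
    if score >= 0.66:
        return "green"
    if score >= 0.33:
        return "yellow"
    return "red"

def display(scores):
    """scores: dict of user -> alertness in [0, 1].
    Returns per-user colors and a group aggregate (simple mean)."""
    per_user = {user: to_color(s) for user, s in scores.items()}
    group_score = sum(scores.values()) / len(scores)
    return per_user, to_color(group_score)
```

The group aggregate lets the lecturer see the state of the whole auditorium at a glance, while the per-user colors support the individual feedback described in step d).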
None of the existing solutions have a system for monitoring responses of listeners/participants (e.g., their concentration or alertness) or explicit digital conceptualizations using at least one of the four key channels: system-processes, motor, physiological, and neurological. Furthermore, none of them provide an effective display method (a color display line with an appropriate scale, or a display pointer that moves along an appropriate scale) for relevant feedback according to each channel, a combination of these channels, or across all channels together in real time or post-festum. Furthermore, none of these methods ensure the privacy of the listeners while also appropriately informing the lecturer about the state of the listeners (auditorium), which is possible with the present solution, because the agent privacy-protects the data with methods like anonymization, differential privacy and so forth, and discards the original private data after the required functionality is provided.
The invention will be further described on the basis of exemplary embodiments and figure 1, which shows a schematic view of the process and all the devices and participants involved in the process for providing feedback in two-way audio-video communication means.
The present invention uses digital devices 9, such as computers, smartphones, electronic tablets, and digital communications to establish sensory channels 1 to 4 from the listeners to the lecturer, through which system-processes data about the basic processes on the used digital devices flow, including hardware elements, such as the camera, microphone, keyboard and mouse. The established channels 1 to 4 (motor, system, physiological, neurological) provide the data to the agent 5 for analysis, while the agent 5 then gives the lecturer feedback about the motor/physiological/neurological state of the listeners and their activities and responses. In practice, this refers, for example, to the state of the listeners or the auditorium with regards to concentration and attention, agreement or disagreement, lecture-unrelated activities (making calls, playing games, etc.), and other important parameters that serve as vital information for the lecturer in providing a quality lecture.
The threshold values for motor/neurological/physiological states cannot be determined in a general manner; some tangible values could be experimentally defined, however, the variability of users has to be taken into account. A possible method for the experimental definition of threshold values is the following:
i. Define the tracked variables, like eye movement, hand or palm position, heart rate, mouse movements, etc.,
ii. Perform an experimental phase with several lectures with different lecturers and listeners, with tracking of the defined variables, wherein the camera and other necessary sensors are continuously tracked and analyzed,
iii. Each lecture is evaluated based on the opinion of the listeners, the lecturer and independent evaluators also attending the lectures,
iv. Based on the analysis of the data and the opinion of all participants, the key values for each particular variable are determined in order to distinguish different levels of alertness and/or lecture quality,
v. Statistical analysis and averaging are used to define the final threshold values or ranges of values for each particular variable.
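The statistical averaging in the last step can be condensed into a small illustrative sketch, under the simplifying assumption (not made by the document) that a usable threshold lies midway between the averages of the "alert" and "non-alert" measurement groups:

```python
# Illustrative threshold definition: average a tracked variable over lecture
# phases rated alert vs. non-alert, then place the threshold midway between
# the two group means. A real deployment would use the fuller statistical
# analysis described in steps i. to v.

from statistics import mean

def threshold(alert_values, non_alert_values):
    """Midpoint between the means of alert and non-alert samples."""
    return (mean(alert_values) + mean(non_alert_values)) / 2

# e.g. hypothetical heart-rate samples (beats per minute)
print(threshold([72, 75, 70], [85, 90, 88]))
```

Per-variable thresholds obtained this way would still need to be adjusted for the variability of users, as stated above.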
The preferred embodiment includes the following essential elements with the associated functionalities:
• Telemetric (remote) data capture via appropriate sensors. This is done through four basic sensory channels 1 to 4, which are of a logical nature and realized as: system-processes 1, motor 2, physiological 3, and neurological 4. They provide the lecturer 11 with real-time feedback on whether the listeners 10 are in a state of attention and actively following the education process, or whether they are in an uncooperative state (e.g., too tired or in an insufficiently active state). Accordingly, the lecturer 11, who can be a teacher, a professor, an expert or any other person leading the lecture or the meeting, can adjust the course of teaching or lecturing. These channels, which combine the data according to their generic origin, are captured via sensors 14 that are already included in the used digital devices 9, for example computers, or are enabled via additional equipment (e.g., virtual reality headsets), and are connected to the audio-video platform 8.
• The agent 5, which functions as a proxy, analyzes the data collected from the listeners 10 and/or the lecturer 11 and ensures data privacy, for example by masking the personal identification parts, or ensures the privacy of the participants in some other way, for example by using differential privacy techniques. The agent 5 manages the module for storing information 15 to store basic and analytically derived data, and performs real-time and post-festum analytics. The analytics includes both the processing of the data obtained from the individual listeners 10 and the lecturer 11, as well as privacy-ensuring processing of the data and the production of their aggregates. In this regard, detecting characteristic patterns in the video feeds or in the signals from the other sensors 14 is very important, e.g., the physiological and motor responses of non-concentration (yawning, having a bored face, having a conversation not intended for communication with the lecturer 11) or some other response of the listener 10 (e.g., nodding as a sign of agreement, shaking the head as a sign of disagreement, raising a hand and/or finger as a sign of agreement and, conversely, turning the finger down as a sign of disagreement, and/or raising a hand as a sign of positive voting). As a rule, these are shown only to the lecturer 11, not to the participants. The data analysis is based on methods such as pattern detection and statistical methods, where the received data can be analyzed in relation to an established data set for alertness, and machine learning, where the data from an alert person are used for training a neural network, and the received data are then submitted to this network for evaluation. After the session, the lecturer is also shown a general result of the processing of individual parameters based on the metrics used for the entire virtual meeting, a ranking compared to other virtual meetings, and the possibility of additional analytics (e.g., pivot tables).
• The agent 5 is also associated with an optional two-way subchannel realized via the audio-video platform 8, called the articulation subchannel 7, which the agent 5 uses to help the listeners 10 articulate their questions or to capture natural responses, such as confirmations given by a thumbs-up 12 or by nodding 13, which can be forwarded to the lecturer after they are processed. This subchannel, which runs through the platform's central communication channel, can deploy various techniques to help the listeners (e.g., chat-bot techniques, which may be basic, like spelling and syntax checking, or advanced, like natural language processing technology).
• The module for providing information 6, the output of which is sent and shown, typically in graphic form, to the lecturer 11 and/or the listeners 10 in an appropriate aggregate and/or anonymized form. The main element is generally a dynamic color-coded bar (e.g., going from green, which represents full attention, through yellow to red, which represents absence of attention), a Likert scale (e.g., from 0 to 10, where 0 means absence of attention and 10 full attention), or an analogous method that shows other reactions or indicators, such as agreement, completion of page reading, etc.
• The module for storing information 15 with data anonymized for subsequent analysis, whereby the agent can also store all other data pursuant to the EU GDPR requirements.
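The dynamic color-coded bar and the 0-10 Likert scale of the module for providing information 6 admit a very small mapping; the cutoff values of 4 and 7 below are illustrative assumptions only:

```python
def attention_color(score):
    """Map a 0-10 attention score to the dynamic color-coded bar:
    red for absence of attention, yellow for partial attention,
    green for full attention (assumed cutoffs at 4 and 7)."""
    if not 0 <= score <= 10:
        raise ValueError("score must be on the 0-10 Likert scale")
    if score < 4:
        return "red"
    if score < 7:
        return "yellow"
    return "green"
```

The same mapping can drive a per-listener display or an aggregate bar for the whole auditorium.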
With the present solution, listeners 10 can, for example, choose not to transmit their camera feeds to the audio-video communication platform 8, where the other participants would see them, because the other sensors 14 (in connection with the indicated four channels 1 to 4) make it possible to collect the data necessary for analytics. The camera can also be active solely for the purposes of analytics, without showing the feed on the audio-video platform 8.
The present invention enables the detection of attention in a way that the privacy of the participants (listeners 10 and lecturer(s) 11) is ensured, whereby the data that is available to the lecturer (or a potential attacker) is insufficient to carry out an attack on the user's identity (i.e., their privacy is protected). It is desirable that the communication takes place over an encrypted channel, which is supported by most audio-video platforms 8.
Examples
Example 1: Talking to a wall
Telemetry is used to collect and analyze data about listeners 10 from their digital devices (e.g., a computer, a mobile phone, etc.). In this way, lecturers 11 obtain feedback about whether the concentration level of a listener 10 has decreased. Feedback about the listeners' concentration levels is presented to the lecturer via the module for providing information 6, e.g., a color scale, a bar, a measuring line or another suitable representation of the results that dynamically changes color from green ("attention present") to red ("attention absent"). This type of telemetry is based on the data captured via the sensors associated with the indicated four channels. The stages of the process that take place in the agent are the following: a) Data collection and identification of characteristic patterns of concentration
At the first stage, the system detects the type of a digital device the listener 10 is using (e.g., a computer with one or two monitors, smartphone, electronic tablet, etc.). Accordingly, the process then branches out as is indicated below.
When using a computer with one monitor (or an electronic tablet), the following data may indicate lower concentration:
- frequent and prolonged averting of the eyes 13 away from the camera/monitor
- frequent glances 13 in multiple directions (up, down, left, right)
- frequent yawning, having a bored face 13
- looking downward 13 for a longer period of time, which could indicate phone use - in this case, no typing is detected on the computer, which may mean that no notes are being taken
- detection of speech when the microphone 14 is off (on a video platform) - this could indicate a conversation outside the lectures
- frequent use of a browser and other programs, which are generally not directly related to the lectures
- other signs transmitted from sensors 14 via communication protocols like WiFi or Bluetooth, which may be attached to the listener or placed in his surroundings, such as, for example, changes in skin conductivity, decreased blood oxygenation level, increased heart rate, typical patterns in the electroencephalogram (EEG), dilated iris and similar, indicating certain motor-physiological-neurological states, for example boredom, decreased concentration, increased physical activity and others.
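For one analysis window, the signs listed above can be combined into a simple lower-concentration flag; the indicator names and the two-hit rule are illustrative assumptions, not values prescribed by the invention:

```python
def lower_concentration(indicators, min_hits=2):
    """Flag lower concentration when at least `min_hits` of the telltale
    signs were observed in the current time window."""
    tracked = (
        "eyes_averted",            # prolonged averting of the eyes
        "frequent_glances",        # glances in multiple directions
        "yawning",                 # yawning or a bored face
        "looking_down_no_typing",  # possible phone use, no note taking
        "speech_with_mic_off",     # conversation outside the lecture
        "unrelated_programs",      # browser or other unrelated programs
    )
    return sum(bool(indicators.get(k)) for k in tracked) >= min_hits

window = {"yawning": True, "speech_with_mic_off": True}
flagged = lower_concentration(window)  # True: two signs observed
```

A weighted variant, or the statistical and machine-learning analysis described below, can replace this simple counting rule.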
When using a computer with two monitors, the following information may indicate lower concentration:
- frequent glances 13 in multiple directions (up, down, left, right)
- frequent yawning, having a bored face 13
- looking downward 13 for a longer period of time, which could indicate phone use - in this case, no typing is detected on the computer, which may mean that no notes are being taken
- detection of speech when the microphone 14 is off (on a video platform) - this could indicate a conversation outside the lectures
- frequent use of the browser and other programs, which are generally not directly connected to the lectures
- other signs transmitted from sensors 14 via communication protocols like WiFi or Bluetooth, which may be attached to the listener or placed in his surroundings, such as, for example, changes in skin conductivity, decreased blood oxygenation level, increased heart rate, typical patterns in the electroencephalogram (EEG), dilated iris and similar, indicating certain motor-physiological-neurological states, for example boredom, decreased concentration, increased physical activity and others.
When using a smartphone, the following information may indicate lower concentration:
- frequent and prolonged averting of the eyes 13 away from the camera/screen
- frequent glances 13 in multiple directions (up, down, left, right)
- frequent yawning, having a bored face 13
- detection of speech when the microphone 14 is off (on a video platform) - this could indicate a conversation outside the lectures
- frequent use of a browser and other programs, which are generally not directly related to the lectures
- other signs transmitted from sensors 14 via communication protocols like WiFi or Bluetooth, which may be attached to the listener or placed in his surroundings, such as, for example, changes in skin conductivity, decreased blood oxygenation level, increased heart rate, typical patterns in the electroencephalogram, dilated iris and similar, indicating certain motor-physiological-neurological states, for example boredom, decreased concentration, increased physical activity and others. b) Data analysis
All this data and/or information is provided to the agent 5 (a proxy, which may also be in the form of an avatar), which processes it, e.g., by means of processing the video feed and other signals using mathematical (statistical) procedures and artificial intelligence procedures, in particular machine learning. With mathematical procedures, reference values are used and the received values are compared with these reference values to determine whether attention thresholds are exceeded. With machine learning, software such as a neural network is trained with data from a subject that is in an alert state, and the received data are then fed into the network, which decides whether alertness is present or not. For new kinds of data, correlations with known and already deployed kinds of data can be used. c) Display of information
Aggregates and/or privacy-preserving processed information based on the received data are sent by the agent 5 to the lecturer via the module for providing information 6, generally in the form of a dynamic graphic color display of attention. This is visible for each listener 10 separately, similar to in-person lectures where, being physically present, the lecturer 11 can observe each of the listeners 10. The system can also display aggregate data for the entire auditorium of listeners 10, which represents the average or weighted values of the individual parameters. d) Storage of information
The agent 5 discards other data in real time or processes them appropriately for subsequent analysis, e.g., by anonymizing them or by deploying differential privacy techniques, whereby the agent 5 can also store all other data pursuant to the EU GDPR requirements within the module for storing information 15.
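The masking and differential-privacy processing mentioned for the storage stage can be sketched as follows; the salted hash and the Laplace-noise aggregate are generic stand-ins, not mechanisms prescribed by the invention, and all parameter values are assumptions:

```python
import hashlib
import math
import random

def pseudonymize(listener_id, salt):
    """Mask the personal identification part of a stored record with a salted hash."""
    return hashlib.sha256((salt + listener_id).encode()).hexdigest()[:12]

def noisy_average(scores, epsilon=1.0, rng=None):
    """Report only a differentially private group average: Laplace noise with
    scale sensitivity/epsilon is added, where for attention scores in 0..10
    the sensitivity of the average over n listeners is 10/n."""
    rng = rng or random.Random(0)  # fixed seed only to keep the sketch reproducible
    n = len(scores)
    avg = sum(scores) / n
    u = rng.random() - 0.5  # inverse-CDF sampling of a Laplace variate
    noise = -(10 / n / epsilon) * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))
    return avg + noise

masked = {pseudonymize(uid, "per-session-secret"): s
          for uid, s in {"listener_a": 8, "listener_b": 4}.items()}
stored_average = noisy_average(list(masked.values()))
```

In production, the salt would be a per-session secret and the noise would come from a vetted differential-privacy library rather than this hand-rolled sampler.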
Example 2: One-way communication
Here we discuss the problem of one-way communication, where there is no feedback about (dis)agreement with a topic or statement due to the absence of the video feeds of the listeners 10, or due to their overly large number (tens or even hundreds). A possible solution is that the listeners 10 nod or shake their heads 13; they can also lift a finger/thumb and/or raise the arm/palm 12, which the camera 14 detects, and the system then sends the lecturer 11 aggregate information about these or other methods of non-verbal communication. This solution is more intuitive than those that require a reaction from the listeners 10 by means of giving their opinion via an appropriate icon (e.g., I agree, I disagree).
Another example of one-way communication is where the lecturer 11 gives the listeners 10 an assignment or a case study to examine, but then does not know who has finished reading (e.g., a case study or part of a page). In this solution, the agent 5 uses a camera or other sensors 14 to detect typical head movements 13 when reading and to perform eye tracking (via so-called heat maps), and sends the lecturer 11 comprehensive real-time information about how many listeners 10 have finished reading. The number of listeners may also be given as the share of users that have finished the task. This means that gradual and/or repetitive (in the case of longer texts) downward movements of the head or eyes 13 are detected, which indicates the tracking of text or an image. Once it is recognized from the movement patterns of the head or eyes that the text/image is no longer being tracked, it can be concluded that the reading is finished.
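The detection of finished reading from movement patterns can be sketched as follows, assuming the camera 14 yields a series of head pitch samples (in degrees, larger values meaning looking further down); the window size and step threshold are illustrative assumptions:

```python
def reading_finished(head_pitch, window=3, step=2.0):
    """While text is being read, the head pitch drifts gradually downward;
    once the last `window` sample-to-sample changes show no downward drift
    of at least `step` degrees, the reading is taken as finished."""
    if len(head_pitch) < window + 1:
        return False  # not enough data to judge
    recent = head_pitch[-(window + 1):]
    drops = [b - a for a, b in zip(recent, recent[1:])]
    return all(d < step for d in drops)

def share_finished(series_by_listener):
    """Real-time aggregate for the lecturer 11: the share of listeners 10
    whose movement patterns indicate finished reading."""
    done = sum(reading_finished(s) for s in series_by_listener.values())
    return done / len(series_by_listener)
```

An eye-tracking heat-map variant would follow the same pattern, only with gaze coordinates instead of head pitch.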
Similar to the previously presented solution “Talking to a wall”, this solution also uses the following process structure: Data collection, Data analysis, Display of information.
Example 3: Motivation of the listeners
This scenario refers to the listeners' 10 lack of motivation to participate compared to classic methods of teaching or lecturing. A possible solution is to use additional methods of motivation through a motivational scoring scale or the collecting of bonus points. Listeners 10 who participate gain bonus points that are shown on the motivational scoring scale of each listener 10 and collected in an appropriate file or in the module for storing information 15. They are obtained on the basis of different elements of participation, e.g., oral participation, written questions raised in a dialog box, participation in quizzes within the audio-video communication platform 8, and monitoring of reactions and active participation in other activities offered by the platform 8. Bonus points are assigned via the agent 5 to the individual listeners 10 and sent to the lecturer 11, who can take them into account in assessment. The substantive quality of the participation can be weighted accordingly by the lecturer 11.
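The bonus-point mechanism can be sketched as a simple ledger; the participation categories and point weights below are assumptions, since the text leaves the substantive weighting to the lecturer 11:

```python
# Assumed point weights per element of participation:
BONUS = {"oral": 3, "written_question": 2, "quiz": 2, "reaction": 1}

def award_points(events):
    """Accumulate a listener's bonus points from logged participation events;
    unknown event types earn nothing."""
    return sum(BONUS.get(event, 0) for event in events)

# Motivational scoring scale kept per listener, as stored in module 15:
scale = {"listener_a": award_points(["oral", "quiz", "reaction"]),
         "listener_b": award_points(["written_question"])}
```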
Similar to the previously presented solution “Talking to a wall”, this also uses the following process structure: Data collection, Data analysis, Display of information, Storage of information.
Example 4: “Bad days”
In this solution, the agent 5 presents the likelihood of the lecturer/group having a quality meeting by analyzing the parameters of the current and previous lectures. To rate the estimated quality, the lecturer enters certain parameters via the module for entering the parameters of the current lecture 16, which affect the quality of a meeting. These include the term of the lecture (date and time), listeners’ characteristics (e.g., age, motivation), structure of the lecture (e.g., an ex-cathedra lecture, discussion, case study, group work, text assignments and similar), and a subjective assessment of the attractiveness of the lecture topic (complexity or dryness of the topic, quality of the lecture). The process includes determining for each of the indicated parameters whether it will have a positive or a negative impact (e.g., lower ratings are given when only an ex-cathedra lecture is held in the evening).
Based on the presented influencing parameters, the estimation/final value can be calculated either using standard statistical methods, e.g., a weighted sum of the influencing parameters, multiple regression, etc., or using machine learning tools.
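The weighted-sum variant can be sketched as follows; the parameter names, the -1..1 per-parameter scoring and the weights are illustrative assumptions, and the result is expressed on an assumed 0-10 rating scale:

```python
def estimate_quality(params, weights):
    """Weighted sum of the influencing parameters, each scored from -1
    (negative impact, e.g. an evening ex-cathedra slot) to 1 (positive
    impact), normalized onto a 0-10 rating scale."""
    raw = sum(weights[name] * params[name] for name in weights)
    rating = 5 + 5 * raw / sum(abs(w) for w in weights.values())
    return max(0.0, min(10.0, round(rating, 1)))

weights = {"term": 1.0, "audience": 1.0, "structure": 2.0, "topic": 2.0}
# Evening ex-cathedra lecture on a dry topic, neutral audience:
rating = estimate_quality(
    {"term": -1, "audience": 0, "structure": -1, "topic": -0.5}, weights)
```

A multiple-regression or machine-learning model fitted on past lectures would replace the hand-picked weights.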
The quality of the rating can be further improved if the agent 5, when making the rating, also takes into account the data from previous lectures (e.g., by extrapolation) for which comparable data are available about the term of the lecture, the listeners' characteristics, the structure of the lecture and the subjective assessment of the attractiveness of the lecture topic. A comparison is made according to individual parameters for the current and the previous lecture(s) and then used to additionally correct the already provided rating (for the current lecture). A rating of previous lectures can be obtained in two ways - by the listeners directly rating their experience and/or by monitoring data from the four key channels: system-processes 1, motor 2, physiological 3, and neurological 4.
An example is ex-cathedra lecturing without any interactive parts, on a complex and dry topic, in the evening, to full-time listeners 10, who are generally less motivated than part-time listeners. In this case, the listeners' estimated rating will be lower. The rating is lowered further if even the more interesting lectures held during the day received only average ratings. The estimated rating can be appropriately taken into account by the lecturer 11 to correct certain lecture parameters.
Stages of the agent’s process: entry and analysis of the current lecture’s parameters via the module for entering the parameters of the lecture, extrapolation of the rating on the basis of previous lectures (optional), display of information.
Example 5: Written question - please help!
The solution relates to discussions in a virtual space (or lecture room), which are very important for a successful educational process. Practice shows that the number of questions raised in distance teaching (or lectures) is relatively lower than in the case of direct contact. Moreover, listeners 10, especially when there is a large number of them, often prefer to use a dialog box to type messages 19 instead of using a microphone 14, but this is a medium that requires the ability to formulate a question in a concise manner. Therefore, questions are often insufficiently articulated (also because typing is a more complex process than verbal expression).
A possible solution is for the listener to first have an anonymous conversation with the artificial intelligence agent 5 through the articulation channel 7. The agent provides guidance on articulating the question properly, and once this goal has been achieved, the lecturer 11 receives a notification about the question, after which the question is submitted to them (in writing or orally). The agent 5 can thus further function as an anonymizing proxy or a so-called chat-bot, thereby ensuring privacy, while at the same time the obtained data offer insight into the question articulation process, providing additional feedback for the lecturer about the listeners' knowledge and abilities.
Stages of the process include: entry of a question, its articulation with the agent's support, and display of the text (or a question, a remark, or other). This may be done by the deployment of chat-bot technologies. For example, as long as the chat-bot assesses that a question is not well formulated, it is not forwarded to the leader of the session.
A possible scheme of the agent acting as the chat-bot for articulating questions is the following: a) Entry of a text (question/remark/other), wherein the participant enters the text, which may not be optimally formulated, e.g., due to parallel work during the lecture. b) Formal checking phase, performed by the chat-bot function of the agent:
- syntax check,
- grammar check,
- natural language processing check, wherein this phase is similar to the verification of text during writing, where the text editor verifies the syntax and grammar and suggests improvements. However, the process is upgraded with basic solutions/concrete suggestions for improvements, mainly regarding the syntax, which the agent offers the participant to speed up the writing process. This phase is important to avoid basic mistakes. In addition, it represents a quality preparation of the text for the next phase. c) Content checking phase, comprising the following sub-phases:
- AI (artificial intelligence) checks the content using two or more different AI algorithms, similarly to the approach used in matching translations: if one algorithm/program provides a translation, there is a possibility of wrong understanding, whereas if two or more independent translation algorithms/programs arrive at the same translation, the probability of a correct translation is significantly higher. The AI can also use natural language processing.
- Machine preparation of suggestions for improvement, wherein two possible scenarios may occur: o In certain cases, the AI will be able to "understand" the meaning of the text and propose improvements, or o In other cases, the AI will not be able to propose suggestions.
- Participant's action: o In the first case, the participant will be able to choose and accept a suggestion, o In the second case, the participant will be asked to improve the text and repeat the procedure, starting with the formal checking phase. d) Display of the text, wherein the question (remark/other) is presented on the platform to the lecturer and/or to all or individual participants.
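The phases a) to d) can be sketched as a small pipeline; the formal checks and the two agreeing "interpreters" of the content check are trivial stand-ins for the syntax/grammar/NLP and multi-algorithm AI checks described above, and the lecture vocabulary is an assumed input:

```python
LECTURE_TERMS = {"alertness", "sensor", "channel", "agent"}  # assumed lecture vocabulary

def formal_check(text):
    """Phase b): minimal formal checks standing in for syntax/grammar/NLP checks."""
    issues = []
    if len(text.split()) < 3:
        issues.append("too short to be a well-formed question")
    if not text.rstrip().endswith("?"):
        issues.append("missing question mark")
    return issues

def content_check(text):
    """Phase c): two independent checks must both succeed, standing in for
    two or more AI algorithms that have to agree on the meaning."""
    words = {w.lower().strip("?.,") for w in text.split()}
    topic_relevant = bool(words & LECTURE_TERMS)  # first "algorithm"
    enough_content = len(words) >= 4              # second "algorithm"
    return topic_relevant and enough_content

def articulate(text):
    """Phase d) gatekeeping: forward the question to the lecturer only
    once both the formal and the content checking phases pass."""
    if formal_check(text) or not content_check(text):
        return "returned to participant for improvement"
    return "forwarded to lecturer"
```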

Claims

1. A device for managing alertness and quality of interaction in two-way audio-video communication platforms, characterized in that the device comprises an agent arranged to receive, track and/or analyze data from at least one user based on data provided by sensors embedded in digital devices detected by the agent and used by the at least one user in order to form feedback, wherein the said digital devices and sensors form at least one or preferably more channels selected from the group consisting of:
- a system-processes channel arranged for following speech with the microphone on or off, for following mouse and/or tracking pad movement, for tracking the use of other programs, for keyboard use, and/or for other digital devices, which can be formed exclusively on the used digital device on which the two-way communication platform is running;
- a motor channel arranged for following locomotion data of the users, which are obtained from sensors that are placed on the users, or are present in their environments;
- a physiological channel arranged for following physiological responses of the users, like blood pressure, oxygenation, body temperature, obtained via, for example, smart watches; and
- a neurological channel arranged for following neuro-related data obtained via portable EEG devices, their lightweight variants or any other device arranged to capture neurological signals.
2. The device according to claim 1, wherein the feedback is generated from one channel or from any combination of two or more channels, preferably from all four channels.
3. The device according to claim 1 or claim 2, wherein the agent is arranged to perform at least one of the following operations:
- detection of the type of digital devices used by users and connection with at least one detected device, wherein it is preferred that the user of the device gives consent to the agent in order to collect data from the detected device,
- data collection from the communication device used by users or from sensors on the listeners or in their surroundings, transmitted via communication protocols like WiFi, Bluetooth and similar, wherein the use of applications and/or functional components of digital devices is also detected,
- analysis of the data collected in the previous step by the agent, wherein the analysis is based on the patterns identified as representing a specific motor, and/or physiological and/or neurological state of the users that express the state of alertness of the users,
- display of the information about the state of alertness of the users, wherein the results may be presented as color graphics, numbers or in any other suitable form of presentation, and wherein the results may be presented for each of the users independently and/or for the whole group,
- forwarding data analysis from the previous step to the module for providing information to the meeting host or lecturer, and
- optionally, storage of the data, deletion and/or further processing of collected data.
4. The device according to the preceding claim, wherein the data is about listeners and can be selected from the group of the following:
- body movements, such as nodding as a sign of agreement or as a sign of disagreement, raising a hand and/or finger as a sign of agreement and, conversely, turning the finger down as a sign of disagreement, raising a hand as a sign of positive or negative voting,
- eye movements,
- face changes,
- speech detection, including microphone on and/or off state,
- heart rate,
- blood oxygenation level,
- EEG patterns,
- use of other applications and/or sensors on the used communication device,
- mouse movements or tracking pad use,
- keyboard typing,
- any other suitable sensor, where any device that provides biometry-specific data, like a keyboard, can also be considered to perform the function of a sensor,
- other data from digital devices like cameras, smart watches, and similar.
5. The device according to claim 3 or claim 4, wherein the patterns identified as representing a specific activity via the system-processes channel, and/or a motor, and/or physiological and/or neurological state of the users that expresses the state of alertness of the users, comprise one or more of the following in any possible combination:
- frequent and prolonged averting of the eyes away from the camera/monitor;
- frequent glances in multiple directions, i.e., up, down, left, right;
- frequent yawning, having a bored face;
- looking downward for a longer period of time, which could indicate phone use - in this case, no typing is detected on the computer, which may mean that no notes are being taken;
- detection of speech when the microphone is off, indicating a conversation outside the lectures;
- frequent use of a browser or other programs, which are generally not directly related to the lectures;
- other signs transmitted from sensors, such as changes in skin conductivity, decreased blood oxygenation level, increased heart rate, typical patterns in the electroencephalogram (EEG), dilated iris and similar, indicating certain motor-physiological-neurological states.
6. The device according to any of the preceding claims, wherein the state of alertness can be low, average or high, or anything in between, wherein:
- low alertness is characterized by at least one of the following options: frequent looking away from the screen or camera, frequent yawning, longer-lasting speaking without microphone use, also after an automatic notification that the microphone is off, use of other programs or applications, eye movement in different directions, frequent whole-body movement, changes in physiological and/or neurological parameters, longer absence from the camera; and
- high alertness is a state of active attention, when, for example, users mostly look at the screen or camera, promptly react to questions, messages and/or tasks, rarely or never yawn, never or rarely disappear from the camera, and similar.
7. The device according to any of the preceding claims, wherein the device also allows the optional implementation of calibration questions at the beginning of the meeting or lecture and during the meeting or lecture, wherein the questions can be formulated in a manner that allows the determination of a baseline alertness level.
8. The device according to any of the preceding claims, wherein the agent further acts as a chat-bot for providing guidance on articulating a question formed by a user.
9. The device according to any of the preceding claims, wherein the agent further acts as an analysis module for sensing whether a particular task, such as reading a text, has been carried out by at least one user, wherein the agent is arranged to use the camera and/or other sensors to detect typical head movements when reading and to perform eye tracking, wherein, as long as gradual and/or repetitive downward movements of the head and/or eyes are detected, reading is being performed, and the absence of the previous downward movements of the head and/or eyes indicates finished reading, and wherein the agent is arranged to send the lecturer real-time information about the number or share of users who finished the task.
10. The device according to any of the preceding claims, wherein the agent is arranged to provide general feedback for all participants, which can include statistical methods or machine learning methods.
11. The device according to any of the preceding claims, wherein the agent may be designed as purely software-based, hardware-based, or as a combination of software and hardware.
12. The device according to the preceding claim, wherein the agent comprises a data capturing and aggregating part, a part for the anonymization of all obtained data, a data analysis part optionally comprising statistical methods or machine learning methods, and a data presentation part.
13. The device according to any of the preceding claims, wherein the device comprises at least the following components:
- Communication means for connection with the digital devices of users (listeners, participants) and the lecturer(s),
- An audio-video communication platform arranged for interaction among one or more users,
- One or more transmitters to send the data from the sensors embedded in said digital devices, wherein sensors may be selected in the group consisting of: o Camera for detecting body movements, face changes, o Camera and/or similar sensor for tracking eye movements, o Microphone for speech detection (including microphone on and/or off state), o Keypad and/or mouse and/or tracking pad sensors, o Heart rate sensor, o Blood oxygenation level sensor, o EEG, o Any other suitable sensor,
- the agent arranged to receive, track and analyze the data from the digital devices and/or sensors, in order to provide feedback on the state of alertness of the participants,
- A memory device for storing the information from the sensors and analysis performed by the agent,
- A module for providing feedback information to the lecturer, wherein the lecturer is shown either individual and/or group feedback for the users in any suitable representation, most preferably as numerical values or a color-coded scale.
14. A process for managing alertness and quality interaction in two-way audio-video communication platforms performed by the device according to any of the preceding claims, characterized in that the process comprises the following steps:
a) Detection of the type of digital devices used by users and connection with at least one detected device, wherein it is preferred that the user of the device gives consent to the agent in order to collect data from the detected device,
b) Data collection from the communication device detected in step a) and used by users, or from sensors on the users or in their surroundings, transmitted via communication protocols like WiFi, Bluetooth and similar,
c) Analysis of the data collected in the previous step by the agent, wherein the analysis is based on the patterns identified as representing a specific motor and/or physiological and/or neurological state of the users that expresses the state of alertness of the users,
d) Display of the information about the state of alertness of the users, wherein the results may be presented as color graphics, numbers or in any other suitable form of presentation, and wherein the results may be presented for each of the users independently and/or for the whole group,
e) Forwarding the data analysis from the previous step to the module for providing information to the meeting host or lecturer, and
f) Optionally, storage of the data, deletion and/or further processing of the collected data.
15. The process according to the preceding claim, wherein the data about users in step b) is selected from the group of the following:
- Body movements, such as nodding as a sign of agreement or as a sign of disagreement, rising hand and/or finger as a sign of agreement and opposite, turning finger down as a sign of disagreement, rising hand as sign of positive or negative voting,
- Eye movements,
- Face changes,
- Speech detection (including microphone on and/or off state),
- Heart rate,
- Blood oxygenation level,
- EEG patterns,
- Use of other applications and/or sensors on the used communication device,
- Mouse movements or tracking pad use,
- Keyboard typing (like time-based patterns),
- Any other suitable sensor, where any device that provides biometry-specific data, such as a keyboard, can also be considered to function as a sensor,
- Other data from digital devices such as cameras, smart watches, etc.

The process according to claim 14 or 15, wherein the patterns identified in step c) as representing a specific motor and/or physiological and/or neurological state of the users that expresses the state of alertness of the users comprise one or more of the following, in any possible combination:
- frequent and prolonged averting of the eyes away from the camera/monitor;
- frequent glances in multiple directions, i.e., up, down, left and right;
- frequent yawning, having a bored face;
- looking downward for a prolonged period, which could indicate phone use; in this case no typing is detected on the computer, although it may also mean the taking of notes;
- detection of speech while the microphone is off, indicating a conversation outside the lecture;
- frequent use of a browser or other programs, which are generally not directly related to the lectures;
- other signs transmitted from sensors, such as changes in skin conductivity, decreased blood oxygenation level, increased heart rate, typical patterns in the electroencephalogram (EEG), dilated irises and similar, indicating certain motor-physiological-neurological states.
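The pattern analysis of step c) and the display of step d) could, for instance, be sketched as follows. This is a minimal illustrative sketch only: the claims do not specify a concrete scoring method, so all field names, weights and thresholds below are assumptions chosen for illustration, not the patented implementation.

```python
from dataclasses import dataclass

@dataclass
class SensorSample:
    """Per-interval signals collected in step b); all fields are hypothetical."""
    gaze_aversions: int           # times the eyes left the camera/monitor
    yawns: int                    # yawns detected by face analysis
    heart_rate: float             # beats per minute
    off_topic_app_seconds: float  # time spent in unrelated applications

def alertness_score(s: SensorSample) -> float:
    """Map one sample to a 0.0-1.0 alertness estimate (1.0 = fully alert).

    Weights and caps are illustrative assumptions: each penalty mirrors
    one of the claimed patterns (averted gaze, yawning, low arousal,
    use of unrelated programs) and is clamped so no single signal
    dominates the estimate.
    """
    score = 1.0
    score -= min(s.gaze_aversions * 0.05, 0.4)        # frequent averted gaze
    score -= min(s.yawns * 0.10, 0.3)                 # frequent yawning
    if s.heart_rate < 55:                             # low-arousal proxy
        score -= 0.1
    score -= min(s.off_topic_app_seconds / 300, 0.2)  # browser/other apps
    return max(score, 0.0)

def color_code(score: float) -> str:
    """Step d): render the score on a simple color-coded scale."""
    if score >= 0.7:
        return "green"
    if score >= 0.4:
        return "yellow"
    return "red"
```

In step e), such per-user scores (or a group average) would be forwarded to the feedback module and shown to the lecturer, e.g. as the color codes above.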
PCT/SI2022/050004 2021-02-04 2022-02-04 A device and a process for managing alertness and quality interaction in two-way audio-video communication platforms WO2022169424A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP22712098.7A EP4289115A1 (en) 2021-02-04 2022-02-04 A device and a process for managing alertness and quality interaction in two-way audio-video communication platforms

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
SI202100017A SI26152A (en) 2021-02-04 2021-02-04 Procedure for managing two-way interaction when using audio-video communication platforms using digital channels
SIP-202100017 2021-02-04

Publications (1)

Publication Number Publication Date
WO2022169424A1 true WO2022169424A1 (en) 2022-08-11

Family

ID=80930090

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SI2022/050004 WO2022169424A1 (en) 2021-02-04 2022-02-04 A device and a process for managing alertness and quality interaction in two-way audio-video communication platforms

Country Status (3)

Country Link
EP (1) EP4289115A1 (en)
SI (1) SI26152A (en)
WO (1) WO2022169424A1 (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2005076241A1 (en) 2004-02-05 2005-08-18 Medcom Limited Internet based distance learning of medical procedures
US20160073054A1 (en) * 2014-09-05 2016-03-10 Avaya Inc. System and method for determining conference participation
US20190349212A1 (en) * 2018-05-09 2019-11-14 International Business Machines Corporation Real-time meeting effectiveness measurement based on audience analysis
US20200085312A1 (en) 2015-06-14 2020-03-19 Facense Ltd. Utilizing correlations between PPG signals and iPPG signals to improve detection of physiological responses
WO2020061358A1 (en) 2018-09-21 2020-03-26 Neurable Inc. Human-computer interface using high-speed and accurate tracking of user interactions
US20200192478A1 (en) 2017-08-23 2020-06-18 Neurable Inc. Brain-computer interface with high-speed eye tracking features
US10692606B2 (en) 2018-10-23 2020-06-23 International Business Machines Corporation Stress level reduction using haptic feedback
US20200210781A1 (en) 2017-09-15 2020-07-02 Tandemlaunch Inc. System and method for classifying passive human-device interactions through ongoing device context awareness
US20200228359A1 (en) * 2010-06-07 2020-07-16 Affectiva, Inc. Live streaming analytics within a shared digital environment
US20200358627A1 (en) 2018-05-04 2020-11-12 Microsoft Technology Licensing, Llc Meeting insight computing system


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
DEWAN M ALI AKBER ET AL: "A Deep Learning Approach to Detecting Engagement of Online Learners", 2018 IEEE SMARTWORLD, UBIQUITOUS INTELLIGENCE & COMPUTING, ADVANCED & TRUSTED COMPUTING, SCALABLE COMPUTING & COMMUNICATIONS, CLOUD & BIG DATA COMPUTING, INTERNET OF PEOPLE AND SMART CITY INNOVATION (SMARTWORLD/SCALCOM/UIC/ATC/CBDCOM/IOP/SCI), IEEE, 8 October 2018 (2018-10-08), pages 1895 - 1902, XP033462815, DOI: 10.1109/SMARTWORLD.2018.00318 *

Also Published As

Publication number Publication date
SI26152A (en) 2022-08-31
EP4289115A1 (en) 2023-12-13

Similar Documents

Publication Publication Date Title
Allison et al. 30+ years of P300 brain–computer interfaces
Oberman et al. Face to face: Blocking facial mimicry can selectively impair recognition of emotional expressions
US20160042648A1 (en) Emotion feedback based training and personalization system for aiding user performance in interactive presentations
Ahmed et al. A systematic survey on multimodal emotion recognition using learning algorithms
Carneiro et al. Multimodal behavioral analysis for non-invasive stress detection
US20220392625A1 (en) Method and system for an interface to provide activity recommendations
Burgoon et al. Patterns of nonverbal behavior associated with truth and deception: Illustrations from three experiments
Harrison The Emotiv mind: Investigating the accuracy of the Emotiv EPOC in identifying emotions and its use in an Intelligent Tutoring System
Pham et al. Predicting learners’ emotions in mobile MOOC learning via a multimodal intelligent tutor
Sharma et al. Sensing technologies and child–computer interaction: Opportunities, challenges and ethical considerations
Abate et al. Attention monitoring for synchronous distance learning
Xu et al. Accelerating Reinforcement Learning using EEG-based implicit human feedback
Tikadar et al. Detection of affective states of the students in a blended learning environment comprising of smartphones
Hung et al. Augmenting teacher-student interaction in digital learning through affective computing
Kutt et al. BIRAFFE2, a multimodal dataset for emotion-based personalization in rich affective game environments
KR102552220B1 (en) Contents providing method, system and computer program for performing adaptable diagnosis and treatment for mental health
Wiggins et al. Affect-based early prediction of player mental demand and engagement for educational games
Kokini et al. Quantification of trainee affective and cognitive state in real-time
WO2022169424A1 (en) A device and a process for managing alertness and quality interaction in two-way audio-video communication platforms
WO2023059620A1 (en) Mental health intervention using a virtual environment
WO2022168180A1 (en) Video session evaluation terminal, video session evaluation system, and video session evaluation program
Patel A Machine Learning based Eye Tracking Framework to detect Zoom Fatigue
El Kaliouby et al. iSET: interactive social-emotional toolkit for autism spectrum disorder
Banire et al. One size does not fit all: detecting attention in children with autism using machine learning
Hardjasa et al. An examination of gaze during conversation for designing culture-based robot behavior

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 22712098; Country of ref document: EP; Kind code of ref document: A1)

WWE Wipo information: entry into national phase (Ref document number: 2022712098; Country of ref document: EP)

NENP Non-entry into the national phase (Ref country code: DE)

ENP Entry into the national phase (Ref document number: 2022712098; Country of ref document: EP; Effective date: 20230904)