WO2023012818A1 - A non-invasive multimodal screening and assessment system for human health monitoring and a method thereof - Google Patents

Info

Publication number
WO2023012818A1
Authority
WO
WIPO (PCT)
Application number
PCT/IN2022/050687
Other languages
French (fr)
Inventor
Gopinath VARADHARAJAN
Pooja HEMMIGE SHWETHADRI
Vijaygopal Rengarajan
Original Assignee
Sparcolife Digital Healthcare Technologies Private Limited
Application filed by Sparcolife Digital Healthcare Technologies Private Limited
Publication of WO2023012818A1

Classifications

    • G16H 40/67: ICT specially adapted for the operation of medical equipment or devices for remote operation
    • A61B 5/0059: Measuring for diagnostic purposes or identification of persons using light, e.g. diagnosis by transillumination, diascopy, fluorescence
    • A61B 5/165: Evaluating the state of mind, e.g. depression, anxiety
    • G16H 10/20: ICT specially adapted for the handling or processing of patient-related medical or healthcare data for electronic clinical trials or questionnaires
    • G16H 15/00: ICT specially adapted for medical reports, e.g. generation or transmission thereof
    • G16H 30/20: ICT specially adapted for handling medical images, e.g. DICOM, HL7 or PACS
    • G16H 30/40: ICT specially adapted for processing medical images, e.g. editing
    • G16H 50/20: ICT specially adapted for computer-aided diagnosis, e.g. based on medical expert systems
    • G16H 80/00: ICT specially adapted for facilitating communication between medical practitioners or patients, e.g. for collaborative diagnosis, therapy or health monitoring



Abstract

The present invention provides a non-invasive multimodal screening and assessment system (100) and a method thereof for human health monitoring. It utilizes facial expression-emotion recognition, which serves as an indicator of health disorders including depression, anxiety and trauma; emotion recognition is used to quantify emotions and design better treatment programs for patients. The present invention employs a wired or wireless camera (101) comprising a wired/wireless HD camera, a hardware interface such as a workstation or a kiosk (102), and other parts (107, 108, 109, 110, 111). The method for assessment and screening involves capturing video during the questionnaire session, detecting human emotions, conducting facial analysis and speech analysis, and classifying speech to text to identify emotions. The present system is a reliable source for health monitoring with high speed, accuracy and affordable cost.

Description

Title
A NON-INVASIVE MULTIMODAL SCREENING AND ASSESSMENT SYSTEM FOR HUMAN HEALTH MONITORING AND A METHOD THEREOF
FIELD OF THE INVENTION
The present invention generally relates to telehealth-based diagnosis and treatment. More particularly, the present invention relates to a non-invasive multimodal screening and assessment system for human health monitoring. Even more particularly, this invention relates to a non-invasive multimodal screening and assessment system for mental health and physical health based on multiple factors such as stress and anxiety management.
BACKGROUND OF THE INVENTION AND DESCRIPTION OF THE PRIOR ART
With the rising adoption of telehealth-based diagnosis and treatment, there is a rapidly increasing number of errors in diagnosis, resulting in an accentuated deterioration of the actual quality of care. The sheer lack of awareness, and reluctance to seek medical/professional help for assessment of a person's mental condition, contributes to the overall low adoption of mental health care. There are multiple helplines and telepsychiatry consultation platforms for screening and assessing the mental condition of human beings; however, owing to the shortage of mental health professionals and related infrastructure, the large population of patients cannot be served well and the quality of care diminishes. The present solutions also do not always pay attention to the overall appearance, mood, facial expression, body language and speech of the person, and are dependent on Mental, Neurological and Substance Use (MNS) conditions during clinical assessment. The present methods of monitoring a health condition such as mental health are predominantly based on analyzing the responses provided by the patient to a specific type of multiple-choice questionnaire that curates the responses in a physical mode. However, there is a high chance that the data may not be rightly captured, since the patient's mental state at the time of attempting the questionnaire is not taken into consideration; some disorders and stress levels exhibited while answering the questionnaire may therefore go unnoticed, and treatment protocols designed from the analysis of those responses further deteriorate the actual quality of care.
Reference may be made to US Patent 7,999,857 B2, entitled “Voice, lip-reading, face and emotion stress analysis, fuzzy logic intelligent camera system”, which discloses an intelligent camera security monitoring, fuzzy logic analysis and information reporting system that includes a video/audio camera, an integrated local controller, an interfaced plurality of sensors, and input/output means, and that collects and analyses data and information observations from a viewed scene and communicates these to a central controller. The central controller with fuzzy logic processor receives and stores these observations, and conducts a plurality of computer analysis techniques and technologies, including face, voice, lip-reading, emotion, movement, pattern recognition and stress analysis, to determine responses and the potential threat of/by a person, crowd, animal, action, activity or thing. Possible applications of the system are recognition of terrorists, criminals, and enraged or dangerous persons, as well as a person's level of intoxication or impairment by alcohol or drugs, via a new “Visual Response Measure”.
Reference may be made to US Patent Application US 2010/0189313 A1, entitled “System and method for using three dimensional infrared imaging to identify individuals”, which discloses calibrated infrared and range imaging sensors that produce a true-metric three-dimensional (3D) surface model of any body region within the fields of view of both sensors. Curvilinear surface features in both modalities are caused by internal and external anatomical elements. They are extracted to form 3D feature maps that are projected onto the skin surface. Skeletonised feature maps define subpixel intersections that serve as anatomical landmarks to aggregate multiple images into models of larger regions of the body, and to transform images into precise standard poses. Features are classified by origin, location and characteristics to produce annotations that are recorded with the images and feature maps in reference image libraries. The system provides an enabling technology for searchable medical image libraries.
Reference may be made to Indian Patent Application 201717031179, entitled “Method and system for real time visualization of individual health condition on a mobile device”, which discloses a method and technology to display 3D graphical output for a user using body sensor data and personal medical data in real time. Embodiments of the application describe the system and method for real-time visualization of an individual's health condition on a mobile device or other devices. The mobile device displays a specific organ by gathering the vital signs, nutrition, activity level and medical data of the person who is wearing or using the device.
Hence, there is a need for the development of a multimodal recognition system for tracking data in multiple recognition modes at a time, a data acquisition unit for delivering the questionnaire in audio/visual format and capturing the responses, the capability of analyzing the physiological and emotional state of the human being, and the accommodation of precisely working classifiers which classify the signal based on the condition to form a complete picture.
Given the aforesaid disadvantages in existing health monitoring systems, the present invention is a non-invasive multimodal screening and assessment system and methods for multifarious human health conditions, which include a wired or wireless image acquisition module such as a non-mydriatic fundus camera, a hardware interface, and a system and method of monitoring, screening and assessing health conditions to minimize the diagnostic error rate. All methods and systems described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g. “such as”) provided with respect to certain embodiments herein is intended merely to better illuminate the invention and does not pose a limitation on the scope of the invention otherwise claimed. No language in the specification should be construed as indicating any non-claimed element essential to the practice of the invention.
OBJECTIVES OF THE INVENTION
Primary objective of the invention is to provide a non-invasive multimodal screening and assessment system for human health monitoring and a method thereof.
Another objective of the present invention is to provide a method and system for facial expression-emotion recognition.
Yet another objective of the present invention is to provide a video analytics module for screening and assessment for early diagnosis of multifarious health conditions by analyzing Facial Emotion Recognition, Micro-expression Analysis, Speech Emotion Recognition, Body Language Analysis, Pupillometry / Pupillary Responses as a Biomarker.
Still another objective of the present invention is to provide a computing device-based system and method to provide standardized and personalized assessments that can help determine one or more health conditions of a given user.
Another objective of the present invention is to provide hardware interfaces such as a workstation or a kiosk coupled with a computing device-based system and method to provide advanced diagnostics following the results generated by the computing device-based system.
Yet another objective of the present invention is to provide a system and method for retinal imaging to detect multifarious health conditions using wired/wireless non-mydriatic fundus cameras.
Still another objective of the present invention is to provide a cloud-based data analytics and data management system.
These and other objects and advantages of the present invention will become readily apparent from the following detailed description taken in conjunction with the accompanying drawings.
SUMMARY OF THE INVENTION
Accordingly, the present invention provides a non-invasive multimodal screening and assessment system for human health monitoring and a method thereof, used as a diagnostic tool utilizing video analytics as the primary modality for screening and assessment, retinal imaging-based biomarker identification as a secondary modality, and synchronized inputs from biofeedback and neurofeedback as an auxiliary modality. The said video analytics mainly comprises Facial Emotion Recognition, Micro-expression Analysis, Speech Emotion Recognition, Body Language Analysis and Pupillometry/Pupillary Responses as a Biomarker for determining anomalies in the mental and physical health condition. The above-mentioned modalities can be achieved by analyzing the data from the passive sensors together with artificial intelligence to identify the emotions and cognitive states of a person.
Facial expression-emotion recognition serves as an indicator of health disorders including depression, anxiety and trauma, and emotion recognition is used to quantify emotions and design better treatment programs for patients. The non-invasive multimodal screening and assessment system for human health monitoring comprises:
1. A wired or wireless camera, selected from a non-mydriatic fundus camera, a thermal camera and a high-resolution RGB camera, comprising a wired/wireless HD camera configured to accommodate:
An ocular eye scope,
An ophthalmic condensing lens,
A connectivity module,
Charging enclosure with light bar and
A charging port
2. A hardware interface such as a workstation or a kiosk comprising:
A touch screen panel
A video acquisition module for recording for facial expression/speech emotion recognition, with zoom, pan and tilt functionality
Connecting ports for Biofeedback and Neurofeedback Sensors selected from Brain Activity (EEG), Muscle Tension (EMG), Heart Rate (ECG), Respiration Rate, Pulse (BVP) and Pulse Oximetry, Skin Conductance (SC/GSR), Peripheral and Body Temperature, Eye Movement (EOG) and other biosensors.
The method for non-invasive multimodal screening and assessment comprises:
a. Capturing video while the person is answering the questionnaire. The system extracts image frames from the video and saves them in a testing image database.
b. Detecting human emotions such as anger, fear, disgust, happiness, sadness, surprise and contempt.
c. Performing object and scene detection, facial analysis with sentiment tracking, image moderation detecting explicit content, face comparison, face recognition and celebrity recognition.
d. Speech recognition to identify emotions using frequency characteristics (such as accent shape, average pitch and pitch range), time-related features such as speech rate, frequency and voice quality parameters, and energy descriptors such as breathiness, brilliance, loudness, pause and pitch discontinuity.
e. Classifying speech to text to identify emotions such as anger, fear, disgust, happiness, sadness, surprise and contempt.
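The frame-extraction part of step (a) is not detailed in the patent. As a minimal sketch, assuming a fixed-rate recording and a regular sampling interval (both assumptions), the frame numbers to save into the testing image database could be chosen like this:

```python
# Illustrative sketch (not from the patent): choose which video frames to
# extract for the testing image database, given a fixed sampling interval.
def frame_indices(duration_s: float, fps: float, every_s: float) -> list[int]:
    """Return the frame numbers to extract, one every `every_s` seconds."""
    total_frames = int(duration_s * fps)
    step = max(1, int(every_s * fps))
    return list(range(0, total_frames, step))

# e.g. a 10 s answer recorded at 30 fps, sampled every 2 s
print(frame_indices(10, 30, 2))  # [0, 60, 120, 180, 240]
```

A real implementation would pass these indices to a video decoder; only the sampling logic is shown here.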
In one aspect, the present invention discloses retinal image diagnostics using a wireless camera selected from a non-mydriatic fundus camera, a thermal camera and a high-resolution RGB camera, wherein the cloud-based retinal image processing engine explores the retinal neurovascular architecture and the retinal ganglion pathways linking to the Central Nervous System (CNS). The present invention also discloses an interface between ophthalmology, neurology and image processing which, with the help of retinal phenotyping, is able to detect and assess multiple candidate biomarkers including history of disease and disease progression.
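The patent does not describe the retinal image processing engine's algorithm. Vessel-related measures are among the commonly studied retinal biomarkers; purely as a deliberately naive illustration (the function name, threshold and toy image are all assumptions), such an engine could report the fraction of dark pixels in a grayscale fundus image as a crude vessel-density proxy:

```python
# Deliberately naive proxy, not the patent's method: retinal vessels appear
# darker than the surrounding fundus, so count the dark fraction of a
# grayscale pixel grid (values 0-255).
def vessel_density(pixels: list[list[int]], threshold: int = 80) -> float:
    """Fraction of pixels darker than `threshold`."""
    total = sum(len(row) for row in pixels)
    dark = sum(1 for row in pixels for p in row if p < threshold)
    return dark / total

# 3x3 toy "fundus image" with three dark (vessel-like) pixels
img = [[200, 60, 200],
       [ 70, 65, 200],
       [200, 200, 200]]
print(round(vessel_density(img), 2))  # 0.33
```

Production engines use segmentation networks and calibrated imaging; this only illustrates the kind of scalar biomarker that could be computed in the cloud.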
In an aspect of the present invention, the hardware interface such as a workstation or a kiosk transmits the data to the Cloud Server using an in-built GPRS Module/Ethernet Port/Wi-Fi.
In one aspect, the present invention provides a method and system for enabling improved adherence of drug intake during clinical trials, which also serves as a reliable real-time pharmacovigilance tool.
BRIEF DESCRIPTION OF THE ACCOMPANYING DRAWINGS
Reference will be made to embodiments of the invention, examples of which may be illustrated in accompanying figures. These figures are intended to be illustrative, not limiting. Although the invention is generally described in the context of these embodiments, it should be understood that it is not intended to limit the scope of the invention to these particular embodiments.
FIG. 1 represents system components for non-invasive multimodal screening and assessment for enabling human health monitoring
FIG. 2 represents a block diagram for the method of the non-invasive multimodal screening and assessment for human health monitoring
FIG. 3 represents a flowchart for a method of predicting health condition from emotional signs, relying on observation of the face, gestures or body posture
FIG. 4 represents a graph of facial expression analysis captured during the questionnaire session of the human being
Although the specific features of the present invention are shown in some drawings and not in others, this is done for convenience only, as each feature may be combined with any or all of the other features in accordance with the present invention.
Table No. 1: Legend and Legend Description
DETAILED DESCRIPTION OF THE INVENTION
In the following detailed description, a reference is made to the accompanying drawings that form a part hereof, and in which the specific embodiments that may be practiced is shown by way of illustration. These embodiments are described in sufficient detail to enable those skilled in the art to practice the embodiments and it is to be understood that other changes may be made without departing from the scope of the embodiments. The following detailed description is therefore not to be taken in a limiting sense.
According to one embodiment, the present invention provides a non-invasive multimodal screening and assessment method and system (100) thereof for human health monitoring, which is used as a diagnostic tool utilizing video analytics as the primary modality for screening and assessment, retinal imaging-based biomarker identification as a secondary modality, and synchronized inputs from biofeedback and neurofeedback as an auxiliary modality. The said video analytics mainly comprises Facial Emotion Recognition, Micro-expression Analysis, Speech Emotion Recognition, Body Language Analysis and Pupillometry/Pupillary Responses as a Biomarker for determining anomalies in the mental and physical health condition. The above-mentioned modalities can be achieved by analyzing the data from the passive sensors together with artificial intelligence to identify the emotions and cognitive states of a person. Facial expression-emotion recognition serves as an indicator of health disorders including depression, anxiety and trauma, and emotion recognition can be used to quantify emotions and design better treatment programs for patients.
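The patent names seven emotion classes but does not specify a classifier. As a hedged sketch of a typical final stage (the logits and their source are assumptions), per-class scores from any feature extractor can be normalized into probabilities with a softmax and the top class reported:

```python
import math

# Hypothetical final classification stage over the seven emotion classes
# named in the patent; the upstream feature extractor is not shown.
EMOTIONS = ["anger", "fear", "disgust", "happiness",
            "sadness", "surprise", "contempt"]

def softmax(logits: list[float]) -> list[float]:
    m = max(logits)                       # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def classify(logits: list[float]) -> str:
    """Return the emotion label with the highest probability."""
    probs = softmax(logits)
    return EMOTIONS[probs.index(max(probs))]

print(classify([0.1, 0.0, 0.2, 2.5, 0.3, 0.1, 0.0]))  # happiness
```

Any model producing per-class scores (a CNN over face crops, an audio model over speech features) could feed this stage.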
According to another aspect of the present invention, the non-invasive multimodal screening and assessment method and system (100) thereof for human health monitoring, represented in Fig 1, comprises: (a) a wired or wireless camera (101), selected from a non-mydriatic fundus camera, a thermal camera and a high-resolution RGB camera, comprising a wired/wireless HD camera configured to accommodate an ocular eye scope, an ophthalmic condensing lens (106) arranged coaxially to the light source, a connectivity module, a charging enclosure with light bar and a charging port; and (b) a hardware interface such as a workstation or a kiosk (102) characterized for data collection from the human for determining the health condition. The hardware interface such as a workstation or a kiosk (102) comprises a touch screen panel (103); a video and audio acquisition module (104) for recording for facial expression/speech emotion recognition, with zoom, pan and tilt functionality; and sensor ports (105) for biofeedback and neurofeedback sensors selected from Brain Activity (EEG), Muscle Tension (EMG), Heart Rate (ECG), Respiration Rate, Pulse (BVP) and Pulse Oximetry, Skin Conductance (SC/GSR), Peripheral and Body Temperature, Eye Movement (EOG) and other biosensors.
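The patent does not define a data model for these sensor channels. The structure below is an assumed illustration (field names are not the patent's) of one timestamped reading per biofeedback/neurofeedback channel, so the auxiliary modality can later be synchronized with the video stream:

```python
from dataclasses import dataclass
import time

# Assumed data model, not from the patent: one record per sensor channel
# listed above, timestamped for synchronization with the video modality.
@dataclass
class SensorReading:
    channel: str        # e.g. "EEG", "EMG", "ECG", "SC/GSR", "EOG"
    value: float
    unit: str
    timestamp: float    # seconds since the epoch

def make_reading(channel: str, value: float, unit: str) -> SensorReading:
    """Stamp a raw sensor value with the acquisition time."""
    return SensorReading(channel, value, unit, time.time())

r = make_reading("SC/GSR", 4.2, "microsiemens")
print(r.channel, r.value)
```

Keeping a common timestamped record type makes it straightforward to align all channels against the video frame timeline downstream.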
Another embodiment of the present invention discloses retinal image diagnostics using a wireless camera having the lens for retinal imaging, wherein the cloud-based retinal image processing engine explores the retinal neurovascular architecture and the retinal ganglion pathways linking to the Central Nervous System (CNS). The present invention also discloses an interface between ophthalmology, neurology and image processing which, with the help of retinal phenotyping, will be able to detect and assess multiple candidate biomarkers including history of disease and disease progression. In one embodiment of the present invention, the hardware interface (102) collects the data received from the data collection units (101, 103, 104, 105 and 106) and transmits it through the circuit board (107) to the data acquisition unit (108) for processing. The data is also transmitted to the Cloud Server (109, 110) by the workstation or kiosk (102) using the in-built GPRS Module/Ethernet Port/Wi-Fi. The results from the data acquisition unit can be displayed on a remotely working display device such as a mobile phone (111).
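The wire format for the kiosk-to-cloud transfer is not specified in the patent. As an assumption-laden sketch, the collected records could be wrapped in a JSON envelope with an integrity checksum before transmission over GPRS/Ethernet/Wi-Fi:

```python
import json
import hashlib

# Assumed envelope format (not the patent's): JSON body plus a SHA-256
# digest so the cloud server can detect corruption in transit.
def build_payload(session_id: str, records: list[dict]) -> str:
    body = json.dumps({"session": session_id, "records": records},
                      sort_keys=True)
    digest = hashlib.sha256(body.encode()).hexdigest()
    return json.dumps({"body": body, "sha256": digest})

def verify_payload(payload: str) -> bool:
    """Recompute the digest on arrival and compare."""
    msg = json.loads(payload)
    return hashlib.sha256(msg["body"].encode()).hexdigest() == msg["sha256"]

p = build_payload("S001", [{"channel": "ECG", "value": 72}])
print(verify_payload(p))  # True
```

The actual transport (HTTP, MQTT, etc.) is orthogonal; only the packaging and integrity check are sketched.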
In yet another embodiment of the present invention, the method for non-invasive multimodal screening and assessment, represented in Fig. 2, comprises: (a) capturing video while the person answers the questionnaire, wherein the system (100) extracts image frames from the video and saves them in a testing image database; (b) detecting human emotions such as anger, fear, disgust, happiness, sadness, surprise and contempt; (c) recognizing objects and scenes, performing facial analysis with sentiment tracking, moderating images by detecting explicit content, and performing face comparison, face recognition and celebrity recognition; (d) recognizing speech to identify emotions by using frequency characteristics (such as accent shape, average pitch and pitch range), time-related features (such as speech rate), and voice quality parameters and energy descriptors (such as breathiness, brilliance, loudness, pause and pitch discontinuity); and (e) classifying speech to text, characterized to identify emotions such as anger, fear, disgust, happiness, sadness, surprise and contempt.
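By way of illustration only, step (d) above uses frequency characteristics such as average pitch. The disclosure does not specify a pitch algorithm; the autocorrelation approach, search band and function name below are assumptions introduced purely to sketch how such a descriptor could be computed:

```python
import numpy as np

def estimate_pitch(signal, sr, fmin=80.0, fmax=400.0):
    """Rough fundamental-frequency estimate via autocorrelation.

    Searches for the dominant periodicity between fmin and fmax Hz,
    a typical range for adult speech.
    """
    sig = np.asarray(signal, dtype=float)
    sig = sig - sig.mean()                     # remove DC offset
    corr = np.correlate(sig, sig, mode="full")
    corr = corr[len(corr) // 2:]               # keep non-negative lags only
    lag_min = int(sr / fmax)                   # shortest period of interest
    lag_max = int(sr / fmin)                   # longest period of interest
    lag = lag_min + np.argmax(corr[lag_min:lag_max])
    return sr / lag
```

For example, a 220 Hz sine sampled at 16 kHz yields an estimate within a few hertz of 220. A real system would apply this per voiced frame and average over the utterance to obtain the "average pitch" descriptor.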
In one embodiment, the present invention provides a method and system (100) for enabling improved adherence to drug intake during clinical trials, and also serves as a reliable real-time pharmacovigilance tool.
In one aspect, the present invention describes the working and data processing of the system. For each couple there are sessions of a particular duration, and each of these sessions is recorded by the experts using the bispectral camera. Basic metadata such as age, location and occupation is collected initially. In one embodiment of the present invention, measuring facial cutaneous temperature and assessing both its topographic and temporal distribution can provide insights into the person's autonomic activity. The main approaches to objectively measure these emotional signs rely on the observation of the face, gestures or body posture. Thermal IR imaging-based affective computing enables monitoring human physiological parameters and Autonomic Nervous System (ANS) activity in a non-contact manner and without constraining the subject.
In one embodiment of the present invention, collection of data points is enabled. The data collected by the expert includes thermal data, collected using a thermal infrared sensor; RGB video, collected using an RGB camera; audio signals; speech-to-text; and the topics discussed with the couples in the form of metadata. Once the thermal and RGB video is collected, it is processed by the method and system (100) of the present invention to sync the frame rates with each other; in order to detect the changes, frame 1 is then compared with a plurality of other frames collected throughout the session. A series of Gaussian filters is applied to enhance the frame quality. The visual sample with the synced frame rate is passed through a Drop Filter and a Light Filter. The Drop Filter is characterized to filter out the unwanted frames in both the thermal and the RGB data, whereas the Light Filter is characterized to eliminate excess environment light and to provide an ambient light setting in order to produce quality images. The audio recorded during the interview is processed simultaneously. During the recording of the session, the patient may pause, think, or even remain silent for some time; this space in the audio is synced with the RGB/thermal data to enhance emotional cues. The processed signal is then passed through the noise filter in order to eliminate the background noise from the speech signal. The visual and auditory signals and the text data obtained after filtering are then compressed and uploaded to the cloud. In one embodiment of the present invention, a method is provided for preparing the base deep learning model. For RGB data, the method utilizes AffectNet, the Extended Cohn-Kanade dataset and similar datasets as a base model and prepares a deep neural network model. The system (100) and the method disclosed in the present invention use subjects in a controlled environment with typical interactive questions.
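The frame-rate synchronization and Gaussian enhancement described above can be sketched minimally as follows. The function names, the nearest-frame mapping strategy, and the kernel radius are assumptions chosen for illustration; the disclosure does not specify these details:

```python
import numpy as np

def sync_indices(n_rgb, fps_rgb, fps_thermal):
    """For each RGB frame, the index of the nearest thermal frame in time,
    so the two streams can be compared frame-by-frame."""
    t = np.arange(n_rgb) / fps_rgb             # timestamp of each RGB frame
    return np.round(t * fps_thermal).astype(int)

def gaussian_kernel1d(sigma, radius):
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-(x * x) / (2.0 * sigma * sigma))
    return k / k.sum()                          # normalize to unit gain

def gaussian_blur(frame, sigma=1.0):
    """Separable Gaussian smoothing: filter rows, then columns."""
    radius = int(3 * sigma)
    k = gaussian_kernel1d(sigma, radius)
    out = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, frame)
    out = np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, out)
    return out
```

In practice successive passes with increasing sigma would approximate the "series of Gaussian filters" the description mentions; the Drop and Light filters are proprietary steps not modeled here.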
The captured RGB data is used as a transfer learning model from the base. For thermal data, appropriate models for feature extraction and an appropriate artificial neural network model are prepared. The method also includes converting audio inputs to Wave2Vector representations and extracting Mel-frequency cepstral coefficient features, and using the appropriate database of speech to detect an emotion and prepare a deep neural network classifier. The method also includes converting audio inputs to text for content, context and sentiment analysis.
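For illustration, Mel-frequency cepstral coefficient (MFCC) extraction of the kind referred to above can be sketched in plain NumPy. The frame length, hop size, FFT size and filter counts below are conventional defaults assumed for the example, not values specified by the invention:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters evenly spaced on the mel scale."""
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for j in range(left, center):
            fb[i - 1, j] = (j - left) / max(center - left, 1)
        for j in range(center, right):
            fb[i - 1, j] = (right - j) / max(right - center, 1)
    return fb

def mfcc(signal, sr, n_filters=26, n_coeffs=13, frame_len=400, hop=160):
    """One MFCC vector per 25 ms frame (10 ms hop at 16 kHz)."""
    n_fft = 512
    window = np.hamming(frame_len)
    frames = [signal[s:s + frame_len] * window
              for s in range(0, len(signal) - frame_len + 1, hop)]
    power = np.abs(np.fft.rfft(frames, n=n_fft)) ** 2 / n_fft
    log_e = np.log(power @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    # DCT-II decorrelates the log filterbank energies into cepstral coefficients
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(np.arange(n_coeffs), 2 * n + 1) / (2 * n_filters))
    return log_e @ dct.T
```

A one-second 16 kHz signal yields a (98, 13) matrix of coefficients, which would feed the emotion classifier. Wave2Vector-style learned representations, by contrast, come from a pretrained neural encoder and are not reproduced here.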
In one embodiment of the present invention, a method is provided for transfer model preparation using subjects in a controlled environment. The method includes: preparing the base data model; feeding this base data to the base model for transfer learning; identifying emotional probabilities such as happy, sad, anger, disgust, fear, surprise and contempt; validating the data from the base data model against the predicted output and comparing with the expert's opinion; adjusting the back-propagation metrics based on expert/specialist feedback during these sessions; training until a model with an average of 70% accuracy across all modalities is prepared; and processing the input data. Once the input data is processed, the samples (visual, auditory and text) are sent to the prepared model in the cloud, and the output from each modality is provided as a percentage score for each mode.
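The final step above reports a percentage score per modality. A minimal sketch of combining per-modality emotion probabilities into such scores is shown below; the weighted-average fusion rule and function name are assumptions for illustration, as the disclosure does not state how modality outputs are combined:

```python
EMOTIONS = ["happy", "sad", "anger", "disgust", "fear", "surprise", "contempt"]

def fuse_modalities(per_modality_probs, weights=None):
    """Combine emotion probabilities from several modalities
    (e.g. visual, auditory, text) into one percentage score per emotion.

    per_modality_probs: {modality_name: {emotion: probability}}
    """
    if weights is None:
        weights = {m: 1.0 for m in per_modality_probs}  # equal weighting
    total_w = sum(weights.values())
    fused = {}
    for emotion in EMOTIONS:
        s = sum(weights[m] * probs.get(emotion, 0.0)
                for m, probs in per_modality_probs.items())
        fused[emotion] = round(100.0 * s / total_w, 1)  # percent score
    return fused
```

With equal weights, a visual score of 0.6 "happy" and an audio score of 0.2 "happy" fuse to 40.0%. Expert feedback during sessions could, in principle, be used to tune the per-modality weights.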
These and other aspects of the embodiments herein will be better appreciated and understood when considered in conjunction with the following description and the accompanying drawings. It should be understood, however, that the following descriptions, while indicating the preferred embodiments and numerous specific details thereof, are given by way of illustration and not of limitation. Many changes and modifications may be made within the scope of the embodiments herein without departing from the spirit thereof, and the embodiments herein include all such modifications.
In another embodiment of the present invention, Fig. 3 represents a flowchart for a method of predicting a health condition based on emotional signs, relying on the observation of the face, gestures or body posture.
In another embodiment of the present invention, Fig. 4 represents a graph of facial expression analysis captured during the questionnaire session of the human being. The analysis is done by converting the captured videos to frames, followed by frame analysis. For each frame, the emotional score is calculated and plotted on the graph against the duration of the questionnaire session. The highest-scoring expression is treated as the result of the facial expression analysis.
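The "highest-scoring expression over the session" rule just described can be sketched as follows; summing per-frame scores and taking the maximum is one straightforward reading of Fig. 4, and the function name is an assumption for illustration:

```python
from collections import defaultdict

def dominant_expression(frame_scores):
    """Given one {emotion: score} dict per video frame, return the
    expression with the highest total score across the session."""
    totals = defaultdict(float)
    for scores in frame_scores:
        for emotion, score in scores.items():
            totals[emotion] += score
    return max(totals, key=totals.get)
```

For example, frames scoring [happy 0.7, happy 0.4, happy 0.9] against lower sadness scores yield "happy" as the session result.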
ADVANTAGES OF THE INVENTION
• The system (100) and method of the present invention significantly reduce the need for human intervention in the process of screening, assessment and monitoring, while also improving the accuracy of the assessment itself.
• The speed and accuracy of screening, assessment and monitoring makes it an affordable and better alternative to the present modalities.
• The system (100) and method can be installed in offices to monitor the stress level of employees and provide support for their well-being, in an effort to increase productivity.
• The system (100) and method of the present invention can be employed in the pharmaceuticals industry to monitor the adherence of the drug intake by the patient during the clinical trials.
• The system (100) and method of the present invention will be useful for determining the state of a person to identify fatigue and stress, in order to avoid accidents in an industrial environment or the armed forces.
• The system (100) and method of the present invention can help in identifying the learner’s affective state and learning state by analyzing facial expressions, which the teacher can then use to analyze the learner’s pattern and receptive ability and formulate reasonable teaching plans.
• The system (100) and method of the present invention can be employed in the hospitals not only to monitor the patients but also the doctor’s state of mind before performing surgical procedures to avoid medical errors.

Claims

We Claim,
1. A system for non-invasive multimodal screening and assessment for enabling human health monitoring (100), comprising: a. A wired or wireless camera (101), selected from a non-mydriatic fundus camera, a thermal camera, or a high-resolution RGB camera, wherein the camera further comprises an ocular eye scope, an ophthalmic condensing lens (106) with coaxially placed light source, a connectivity module, a charging enclosure with light bar and a charging port; b. A hardware interface in the form of a workstation or a kiosk (102), characterized for data collection from the human during questionnaire sessions; c. A circuit board (107), characterized for containing the electric circuit to collect and transmit the data from the input/data collection unit; d. A data acquisition and processing unit (108), characterized for processing of the data collected from the wired or wireless camera (101) and the hardware interface (102); e. A cloud storage and data store system (109), characterized for storing, analysing and communicating the relevant data to end users and clinicians; f. A cloud machine learning system (110); and, g. A display device (111) with an integrated application working remotely, which includes mobile phones and tablets.
2. The system for non-invasive multimodal screening and assessment for enabling human health monitoring (100), as claimed in claim 1, wherein the hardware interface in the form of a workstation or a kiosk (102) comprises: a. A touch screen panel (103), characterized for the feeding of input and display of output of the system (100); b. A video and audio acquisition module (104), characterized for recording facial expression and speech emotion recognition with zoom, pan and tilt functionalities; and, c. A plurality of connecting sensor ports (105), characterized for connecting a plurality of sensors to the hardware interface, including biofeedback and neurofeedback sensors selected from brain activity, muscle tension, heart rate, respiration rate, pulse, pulse oximetry, skin conductance, peripheral and body temperature and eye movement.
3. The system for non-invasive multimodal screening and assessment for enabling human health monitoring (100), as claimed in claim 1 wherein, the hardware interface (102) is configured to transmit data acquired from the plurality of sensors to a remote or local cloud server module through wired or wireless means.
4. The system for non-invasive multimodal screening and assessment for enabling human health monitoring (100), as claimed in claim 1, wherein the system further comprises a cloud-based retinal image processing engine configured to analyze a retinal image captured by the camera and explore the retinal neurovascular architecture and the retinal ganglion pathways linking to the Central Nervous System, which enables studying an interface between ophthalmology, neurology and image processing which, with the help of retinal phenotyping, detects and assesses a plurality of candidate biomarkers including history of disease and disease progression.
5. A method for non-invasive multimodal screening and assessment for enabling human health monitoring, comprising: a. Capturing a video while the person answers a questionnaire, and capturing a plurality of other biomarkers using a plurality of sensors; b. Extracting a plurality of image frames from the video and storing them in a testing image database; c. Detecting human emotions such as anger, fear, disgust, happiness, sadness, surprise and contempt, as well as micro expressions, using image analysis; d. Recognizing objects and detecting scenes, performing facial analysis with sentiment tracking, moderating images by detecting explicit content, comparing faces, and performing facial recognition; e. Recognizing speech to identify the emotions by using frequency characteristics such as accent shape, average pitch and pitch range, time-related features such as speech rate and speech frequency, and voice quality parameters and energy descriptors such as breathiness, brilliance, loudness, pause and pitch discontinuity; and, f. Classifying speech to text, characterized to identify emotions such as anger, fear, disgust, happiness, sadness, surprise and contempt.
6. The method for non-invasive multimodal screening and assessment for enabling human health monitoring as claimed in claim 5, wherein measuring facial cutaneous temperature and assessing its topographic and temporal distribution provides insights about a person’s autonomic activity, and wherein, the main approaches to objectively measure these emotional signs rely on the observation of the face, gestures or body posture, and wherein, Thermal IR imaging-based affective computing enables monitoring human physiological parameters and Autonomic Nervous System (ANS) activity in a non-contact manner and without constraining the subject.
7. The method for non-invasive multimodal screening and assessment for enabling human health monitoring as claimed in claim 5, wherein the collection of a plurality of data points is enabled, which includes: a. Thermal data, wherein the thermal data is collected using a thermal infrared sensor and RGB video is collected using an RGB camera; and, b. Audio data, wherein the audio data includes audio signals, speech-to-text and audio recordings of the topics discussed with the human subjects in the form of metadata, wherein, the thermal data and the RGB video are processed to sync the frame rates with each other in order to detect the changes, and wherein, a series of Gaussian filters is applied to enhance the frame quality, and wherein, the visual sample with the synced frame rate is passed through a Drop Filter and a Light Filter, and wherein, the Drop Filter is characterized to filter out the unwanted frames in both the thermal and the RGB data, whereas the Light Filter is characterized to eliminate the excess environment light and to provide an ambient light setting in order to provide quality images, and wherein, the audio data is simultaneously processed and suitably synced with RGB or thermal data to enhance emotional cues, and wherein, the processed signal is then passed through the noise filter in order to eliminate the background noise from the speech signal, and wherein, the visual and auditory signals and the text data obtained after filtering are then compressed and uploaded to a cloud storage module.
8. The method for non-invasive multimodal screening and assessment for enabling human health monitoring as claimed in claim 5, wherein a method is provided for preparing a base deep learning model, and wherein, for RGB data, the method utilizes AffectNet, the Extended Cohn-Kanade dataset and similar datasets as a base model and prepares a deep neural network model, and wherein, the captured RGB data is used as a transfer learning model from the base, and wherein, for thermal data, appropriate models for feature extraction and an appropriate artificial neural net model are prepared, and wherein, the method also includes converting audio inputs to Wave2Vector and extracting Mel-frequency cepstral coefficient features and using the appropriate database of speech to detect an emotion and prepare a deep neural network classifier, and wherein, the method also includes converting audio inputs to text for content, context and sentiment analysis.
9. The method for non-invasive multimodal screening and assessment for enabling human health monitoring as claimed in claim 5, wherein the method is provided for transfer model preparation using subjects in a controlled environment, and wherein, the method includes: preparing the base data model; feeding this base data to the base model for transfer learning; identifying emotional probabilities such as happy, sad, anger, disgust, fear, surprise and contempt; validating the data from the base data model against the predicted output and comparing with the expert’s opinion; adjusting the back-propagation metrics based on expert/specialist feedback during the sessions; training the data till a model with average 70% accuracy across all modalities is prepared; processing the input data; and, uploading the information and the samples (visual, auditory and text) to the prepared model in the cloud computing module, and providing the output from each modality as a percentage score for each mode.
PCT/IN2022/050687 2021-07-31 2022-07-30 A non-invasive multimodal screening and assessment system for human health monitoring and a method thereof WO2023012818A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN202141034568 2021-07-31
IN202141034568 2021-07-31

Publications (1)

Publication Number Publication Date
WO2023012818A1 true WO2023012818A1 (en) 2023-02-09

Family

ID=85155358

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IN2022/050687 WO2023012818A1 (en) 2021-07-31 2022-07-30 A non-invasive multimodal screening and assessment system for human health monitoring and a method thereof

Country Status (1)

Country Link
WO (1) WO2023012818A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116636847A (en) * 2023-06-02 2023-08-25 南京航空航天大学 Emotion assessment method and system based on wrist wearable equipment
CN116994718A (en) * 2023-09-28 2023-11-03 南京元域绿洲科技有限公司 VR technology-based mental disorder auxiliary treatment method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016025323A1 (en) * 2014-08-10 2016-02-18 Autonomix Medical, Inc. Ans assessment systems, kits, and methods
CN108652648A (en) * 2018-03-16 2018-10-16 合肥数翼信息科技有限公司 Depression monitoring device for depression of old people
US20200066405A1 (en) * 2010-10-13 2020-02-27 Gholam A. Peyman Telemedicine System With Dynamic Imaging


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116636847A (en) * 2023-06-02 2023-08-25 南京航空航天大学 Emotion assessment method and system based on wrist wearable equipment
CN116636847B (en) * 2023-06-02 2024-07-30 南京航空航天大学 Emotion assessment method and system based on wrist wearable equipment
CN116994718A (en) * 2023-09-28 2023-11-03 南京元域绿洲科技有限公司 VR technology-based mental disorder auxiliary treatment method
CN116994718B (en) * 2023-09-28 2023-12-01 南京元域绿洲科技有限公司 VR technology-based mental disorder auxiliary treatment method


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22852510

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22852510

Country of ref document: EP

Kind code of ref document: A1