WO2020192231A1 - An assistive communication system based on surface EMG lip-language recognition - Google Patents

An assistive communication system based on surface EMG lip-language recognition

Info

Publication number
WO2020192231A1
WO2020192231A1 · PCT/CN2019/130814 · CN2019130814W
Authority
WO
WIPO (PCT)
Prior art keywords
module
lip
emg
lip language
signal
Prior art date
Application number
PCT/CN2019/130814
Other languages
English (en)
French (fr)
Inventor
陈世雄
朱明星
王小晨
李光林
杨子建
汪鑫
Original Assignee
中国科学院深圳先进技术研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国科学院深圳先进技术研究院
Priority to US16/960,496 (published as US20210217419A1)
Publication of WO2020192231A1

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/24 - Speech recognition using non-acoustical features
    • G10L15/25 - Speech recognition using non-acoustical features using position of the lips, movement of the lips or face analysis
    • A - HUMAN NECESSITIES
    • A61 - MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B - DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 - Measuring for diagnostic purposes; Identification of persons
    • A61B5/24 - Detecting, measuring or recording bioelectric or biomagnetic signals of the body or parts thereof
    • A61B5/316 - Modalities, i.e. specific diagnostic methods
    • A61B5/389 - Electromyography [EMG]
    • A - HUMAN NECESSITIES
    • A61 - MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B - DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 - Measuring for diagnostic purposes; Identification of persons
    • A61B5/72 - Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7225 - Details of analog processing, e.g. isolation amplifier, gain or sensitivity adjustment, filtering, baseline or drift compensation
    • A - HUMAN NECESSITIES
    • A61 - MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B - DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 - Measuring for diagnostic purposes; Identification of persons
    • A61B5/72 - Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7235 - Details of waveform analysis
    • A61B5/7264 - Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
    • A61B5/7267 - Classification of physiological signals or data involving training the classification device
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 - Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/015 - Input arrangements based on nervous system activity detection, e.g. brain waves [EEG] detection, electromyograms [EMG] detection, electrodermal response detection
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 - Movements or behaviour, e.g. gesture recognition
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/005 - Language recognition
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 - Speech or voice analysis techniques specially adapted for particular use
    • G10L25/51 - Speech or voice analysis techniques specially adapted for comparison or discrimination
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 - Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/02 - Preprocessing
    • G06F2218/04 - Denoising
    • G06F2218/06 - Denoising by applying a scale-space analysis, e.g. using wavelet analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 - Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/08 - Feature extraction
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 - Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/12 - Classification; Matching
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 - Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/12 - Classification; Matching
    • G06F2218/16 - Classification; Matching by matching signal segments

Definitions

  • The invention belongs to the technical field of speech-recognition-based assistive communication, and in particular relates to an assistive communication system based on surface electromyographic (sEMG) lip-language recognition.
  • Pronunciation is the basis of language expression. It is a highly complex process in which the central nervous system controls coordinated muscle movement, the result of multiple organs and muscle groups cooperating with one another. During pronunciation, the facial and neck muscles move accordingly, and different sounds correspond to different movement patterns of those muscles. Therefore, surface electrical signals from the muscles of the face and neck can be collected and, through feature extraction and classification, different pronunciations can be mapped to the electrophysiological changes of different muscle groups, thereby identifying pronunciation information and helping patients communicate with others.
  • The surface EMG signal is a one-dimensional voltage time series obtained by guiding, amplifying, displaying and recording, via surface electrodes, the bioelectric changes produced by the muscular system during voluntary and involuntary activity. It reflects the bioelectric activity of motor neurons, being formed as the temporal and spatial summation of many peripheral motor-unit potentials. Because it correlates strongly with muscle activity and, to a certain extent, reflects the activity level of the related muscles, analyzing the surface EMG makes it possible to observe the movement of those muscles.
  • As an objective, quantitative technique, surface EMG is non-invasive, simple to operate and low in cost, and it supports both quantitative and qualitative analysis; it is therefore widely used in medical research, human-computer interaction and other fields.
  • Existing EMG acquisition often places only a few electrodes on several known articulation muscles. The number and positions of the electrodes are chosen subjectively, so the selected electrode and channel counts are not necessarily optimal; this limitation keeps lip-language recognition accuracy low.
  • The embodiment of the present invention provides an assistive communication system based on surface EMG lip-language recognition for patients who have difficulty vocalizing but can express themselves through mouth shapes and lip movements, to solve the prior-art problem that subjectively chosen electrode numbers and positions rarely reach an optimal configuration and that speech-signal recognition accuracy is low.
  • The training subsystem is used to collect facial and neck EMG signals during lip-language movements through high-density array electrodes, improve signal quality through signal preprocessing algorithms, classify lip-language movement types through classification algorithms, select the optimal number and positions of electrodes through a channel-selection algorithm, establish the optimal matching template between the EMG signal and the lip-language information, and upload it to the network terminal for storage;
  • The detection subsystem is used to collect, based on the optimal electrode number and positions selected by the training subsystem, the EMG signal during lip-language movements at the optimal positions, call the optimal matching template, classify and decode the EMG signal, recognize the lip-language information, convert it into corresponding voice and image information, and display it in real time to realize lip-language recognition.
  • the training subsystem may include a lower computer of the training subsystem and an upper computer of the training subsystem, and the lower computer of the training subsystem may include:
  • The high-density array electrodes are used to acquire high-density EMG signals from the user's articulation muscles during lip-language movement by being attached to the pronunciation muscles of the face and neck;
  • the EMG acquisition module is used to amplify, filter, and convert the signals collected by the high-density array electrodes, and transmit them to the upper computer of the training subsystem.
  • The upper computer of the training subsystem may include a user interaction module and a signal classification and correction-matching feedback training module, and the user interaction module may include:
  • an EMG signal display sub-module, used to display the collected EMG signal in real time;
  • a lip-language training scene display sub-module, used to provide lip-language scene pictures and text;
  • a channel-selection positioning chart display sub-module, used to show the distribution of electrode positions on the face and neck.
  • The signal classification and correction-matching feedback training module may include:
  • a signal processing sub-module, used to remove power-frequency interference and baseline drift with filters, and to remove interference noise from the EMG signal with wavelet-transform and template-matching algorithms;
  • a classification sub-module, used to extract the EMG signal related to the pronunciation of a specified short sentence, extract feature values, establish the correspondence between the EMG signal and the specified short sentence, and classify the collected lip-language content based on the EMG information;
  • the channel selection sub-module is used to select the best matching template, establish a personal training set, and transmit it to the network terminal.
  • the detection subsystem may include a detection subsystem lower computer and a detection subsystem upper computer
  • the detection subsystem lower computer may include:
  • Patch-type flexible electrodes used to collect the EMG signal during the lip language movement at the optimal position
  • the wireless EMG acquisition module is used to wirelessly transmit the EMG information collected by the patch-type flexible electrode to the upper computer of the detection subsystem.
  • the upper computer of the detection subsystem may include:
  • the personal training set download module is used to call the personal training set from the network sharing port of the training subsystem through the connection to the network, and store it in the APP client;
  • The lip-language information recognition and decoding module is used to denoise and filter the signal, match the features of the EMG signal against the personal training set, decode and recognize the lip-language information, convert the lip-language content corresponding to the classification result into text information, and further convert it into voice and pictures for real-time transmission and display;
  • the APP display interaction module is used to display the optimal data set for channel selection, real-time display of electrode position, real-time display of EMG signal, real-time display of classification results, and/or display of voice picture translation.
  • the lip language information recognition and decoding module is also used to transmit the recognition result to an emergency contact set by the system.
  • the high-density array electrode may include 130 single electrodes, and the single electrodes are arranged in a high-density form with a center spacing of 1 cm.
  • The lower computer of the training subsystem may also include electrode placement orifice plates.
  • the EMG acquisition module may include a microcontroller, an analog-to-digital converter, an independent synchronous clock, a pre-signal filter amplifier and a low-noise power supply.
  • Compared with the prior art, the embodiment of the present invention has the following beneficial effects. The training subsystem collects facial and neck EMG signals during lip-language movements through high-density array electrodes, improves signal quality through signal preprocessing algorithms, classifies lip-language movement types with a classification algorithm, selects the optimal number and positions of electrodes with a channel-selection algorithm, establishes the optimal matching template between the EMG signal and the lip-language information, and uploads it to the network terminal for storage.
  • On this basis, the detection subsystem collects, based on the optimal electrode number and positions selected by the training subsystem, the EMG signal during lip-language movements at the optimal positions, calls the optimal matching template, classifies and decodes the EMG signal, recognizes the lip-language information, converts it into corresponding voice and image information, and displays it in real time to realize lip-language recognition.
  • High-density array electrodes acquire complete EMG signals in real time during pronunciation; after processing and analysis, the electrodes contributing most to the lip-language action are screened out and the optimal electrode number and positions are determined. This achieves objective positioning in electrode selection for lip-language recognition and greatly improves recognition accuracy.
  • FIG. 1 is a structural block diagram of an auxiliary communication system based on surface electromyography lip language recognition provided by an embodiment of the present invention.
  • FIG. 1 shows a structural block diagram of an auxiliary communication system based on surface electromyography lip language recognition provided by an embodiment of the present invention. For ease of description, only the parts related to this embodiment are shown.
  • an auxiliary communication system based on surface electromyography lip language recognition may include a training subsystem and a detection subsystem.
  • The training subsystem is used to collect facial and neck EMG signals during lip-language movements through high-density array electrodes, improve signal quality through signal preprocessing algorithms, classify lip-language movement types with a classification algorithm, select the optimal number and positions of electrodes with the channel-selection algorithm, establish the optimal matching template between the EMG signal and the lip-language information, and upload it to the network terminal for storage.
  • The detection subsystem is used to collect, based on the optimal electrode number and positions selected by the training subsystem, the EMG signal during lip-language movements at the optimal positions, call the optimal matching template, classify and decode the EMG signal, recognize the lip-language information, convert it into corresponding voice and image information, and display it in real time to realize lip-language recognition.
  • the training subsystem may include two parts: a lower computer and an upper computer, that is, the lower computer of the training subsystem and the upper computer of the training subsystem.
  • the lower computer of the training subsystem may include a high-density array electrode and an EMG acquisition module.
  • the high-density array electrode is used to obtain the high-density EMG signal of the pronunciation muscles of the user's lip language by sticking on the facial and neck pronunciation muscles.
  • High-density array electrodes are needed first because personal habits and pronunciation styles differ: the muscles each person engages when speaking are not exactly the same, muscle activity during pronunciation varies between individuals, and the characteristic electrode positions differ accordingly. Placing electrodes at identical muscle positions for different people is therefore unreasonable, so this embodiment uses high-density array electrodes to collect comprehensive EMG signals.
  • the high-density array electrode can be composed of a large number of single electrodes.
  • The specific number of single electrodes and the spacing between them can be customized according to the size of the user's face and neck, the governing criterion being that EMG signals can be collected from the complete set of articulation muscle groups.
  • the high-density array electrode may include 130 single electrodes, and the single electrodes are arranged in a high-density form with a center-to-center spacing of 1 cm.
  • The EMG acquisition module may be a 130-channel EMG acquisition module comprising a microcontroller (Micro Controller Unit, MCU), an analog-to-digital converter, an independent synchronous clock, a pre-signal filter amplifier and a low-noise power supply. It is used to amplify, filter and analog-to-digital convert the signals collected by the high-density array electrodes and to transmit them to the upper computer of the training subsystem over USB or another transmission path.
  • the lower computer of the training subsystem may also include electrode placement orifice plates, and each orifice plate is provided with corresponding electrode hole positions, wherein the hole spacing is about 1 cm to ensure that the electrode distance is small enough.
  • The orifice plates come in four specifications of 20, 25, 40 and 48 holes, allowing 20, 25, 40 or 48 electrodes to be placed at a time, which reduces the workload and makes the operation more convenient.
  • The upper computer of the training subsystem may be a desktop computer, a notebook computer, a tablet computer, etc., and includes a user interaction module and a signal classification and correction-matching feedback training module.
  • the user interaction module may include an electromyographic signal display submodule, a lip language training scene display submodule, and a channel selection positioning chart display submodule.
  • The EMG signal display sub-module is used for real-time display of the collected EMG signal and also provides single-channel signal selection, so that the signal quality of all channels can be observed in real time and signal reliability ensured.
  • The lip-language training scene display sub-module is used to provide the lip-language scene pictures and text needed in daily life, giving the user a personalized training set. Through fixed-scene training, EMG signals are collected and stored as an EMG database for lip-language analysis. In addition, this sub-module provides task prompts such as "read again" and "next scene" for friendly interaction during repeated training and subsequent steps.
  • the channel selection positioning chart display sub-module is used to provide the position distribution of the electrodes on the face and neck, and through training classification, real-time display of the number and specific positions of the selected effective channels.
  • the signal classification, correction matching feedback training module may include a signal processing sub-module, a classification sub-module, and a channel selection sub-module.
  • The signal processing sub-module uses IIR band-pass filters and optimization-based filters to initially remove power-frequency interference and baseline drift, and then applies wavelet-transform and template-matching algorithms to further remove interference such as motion artifacts and ECG from the EMG signal; this preprocessing improves signal quality and reliability.
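The filtering chain described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the 1000 Hz sampling rate, 20-450 Hz pass band, 50 Hz notch frequency and filter orders are all assumed values, since the patent names the filter types but not their parameters.

```python
import numpy as np
from scipy import signal

FS = 1000.0  # assumed sampling rate (Hz)

def preprocess_emg(x, fs=FS):
    """Band-pass plus notch filtering of one EMG channel."""
    # IIR band-pass (20-450 Hz): removes baseline drift / low-frequency artifact.
    sos = signal.butter(4, [20, 450], btype="bandpass", fs=fs, output="sos")
    x = signal.sosfiltfilt(sos, x)
    # IIR notch at 50 Hz: removes residual power-frequency interference.
    b, a = signal.iirnotch(50.0, Q=30.0, fs=fs)
    return signal.filtfilt(b, a, x)

# Synthetic raw signal: EMG-like noise + slow baseline drift + 50 Hz hum.
rng = np.random.default_rng(0)
t = np.arange(0, 2.0, 1 / FS)
raw = (rng.normal(size=t.size)
       + 0.5 * np.sin(2 * np.pi * 0.5 * t)    # baseline drift
       + 2.0 * np.sin(2 * np.pi * 50.0 * t))  # mains interference
clean = preprocess_emg(raw)
```

Zero-phase filtering (`sosfiltfilt`/`filtfilt`) is used here so the filtered waveform stays time-aligned with the raw recording; wavelet and template-matching denoising would follow as further stages.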
  • The classification sub-module applies algorithms such as normalization and blind source separation to the signal to extract the EMG signal related to the pronunciation of the specified short sentence, extracts feature values, uses linear classifiers, neural networks and support vector machine techniques to establish the correspondence between the EMG signal and the specified short sentence, and classifies the collected lip-language content based on the EMG information.
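As a rough illustration of this classification step, the sketch below computes common time-domain EMG features (mean absolute value, RMS, waveform length; assumed choices, since the patent does not list the feature values it uses) for windowed multi-channel data and trains a linear support vector machine, one of the classifier families the text names, on synthetic two-class data.

```python
import numpy as np
from sklearn.svm import SVC

def td_features(window):
    """Time-domain features for one (channels x samples) EMG window."""
    mav = np.mean(np.abs(window), axis=1)                  # mean absolute value
    rms = np.sqrt(np.mean(window ** 2, axis=1))            # root mean square
    wl = np.sum(np.abs(np.diff(window, axis=1)), axis=1)   # waveform length
    return np.concatenate([mav, rms, wl])

# Synthetic stand-in for two lip-language classes whose EMG amplitude differs.
rng = np.random.default_rng(1)
def make_window(scale):
    return rng.normal(scale=scale, size=(8, 200))  # 8 channels, 200 samples

X = np.array([td_features(make_window(s)) for s in [1.0] * 40 + [3.0] * 40])
y = np.array([0] * 40 + [1] * 40)

clf = SVC(kernel="linear").fit(X, y)  # linear SVM, one of the named options
```

In a real system each class would correspond to one trained short sentence, and the window stream would come from the preprocessed multi-channel recording.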
  • After multiple rounds of calibration and matching, the channel selection sub-module selects the EMG template with the fewest channels and the best classification accuracy, stores the best matching template between the EMG signal and the lip-language information to establish the personal training set, and transmits the optimal template data set to the network terminal.
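The patent does not specify the channel-selection algorithm, so the sketch below uses greedy forward selection, one common approach: repeatedly add the channel whose features most improve cross-validated classification accuracy, and stop when accuracy no longer improves. The per-channel feature layout and the synthetic data are illustrative assumptions.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def select_channels(X, y, n_channels, feats_per_ch):
    """Greedy forward selection over channel-grouped feature columns."""
    selected, best_acc = [], 0.0
    remaining = list(range(n_channels))
    while remaining:
        scored = []
        for ch in remaining:
            cols = [c * feats_per_ch + k
                    for c in selected + [ch] for k in range(feats_per_ch)]
            acc = cross_val_score(SVC(kernel="linear"),
                                  X[:, cols], y, cv=3).mean()
            scored.append((acc, ch))
        acc, ch = max(scored)
        if acc <= best_acc:  # stop when no additional channel helps
            break
        best_acc = acc
        selected.append(ch)
        remaining.remove(ch)
    return selected, best_acc

# Synthetic data: 6 channels x 2 features; only channels 0 and 3 carry
# class information (their features are shifted by the class label).
rng = np.random.default_rng(2)
n, n_ch, fpc = 120, 6, 2
X = rng.normal(size=(n, n_ch * fpc))
y = rng.integers(0, 2, size=n)
for ch in (0, 3):
    X[:, ch * fpc:(ch + 1) * fpc] += 2.0 * y[:, None]

sel, acc = select_channels(X, y, n_ch, fpc)
```

Run on the 130-channel training recordings, this kind of procedure would yield both the reduced electrode count and the positions at which the detection subsystem's eight flexible electrodes are later placed.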
  • the detection subsystem may include two parts: a lower computer and an upper computer, that is, a lower computer of the detection subsystem and an upper computer of the detection subsystem.
  • the lower computer of the detection subsystem includes a patch type flexible electrode and a wireless EMG acquisition module.
  • the patch type flexible electrode is used to collect the electromyographic signal during the lip language action at the optimal position.
  • Existing rigid-board EMG electrodes adhere to the skin only to a limited degree, and stretching and deformation of the skin readily introduces noise into the EMG data. The patch-type flexible electrode instead combines several FPC (flexible printed circuit) soft-board single electrodes into a bendable, custom-made flexible electrode sheet that conforms tightly to the skin.
  • The specific number of single electrodes can be set according to the actual situation; preferably, it may be set to 8.
  • the user selects the number of flexible electrodes to be used and the placement position of the electrodes on the face and neck according to the calculation results of the training subsystem.
  • The degree of personalization is high: the sheet fits closely to the skin and follows its micro-deformation, so the acquired EMG information is more stable and reliable.
  • The wireless EMG acquisition module integrates 8-channel EMG acquisition and wireless transmission, using a microcontroller with integrated WiFi, a pre-amplification circuit and an analog-to-digital conversion circuit to collect the EMG information from the patch-type flexible electrodes and transmit it wirelessly over WiFi to the upper computer of the detection subsystem.
  • Wireless transmission is more convenient than traditional wired electrodes, is simple to wear, and reduces the influence of entanglement between wired electrode wires. WIFI transmission does not lose data, ensuring data integrity.
  • Transmitting multi-channel EMG information wirelessly at the same time compensates for the insufficient information provided by the few electrode channels of traditional methods.
  • the upper computer of the detection subsystem may be a mobile phone, a tablet computer, etc., including a personal training set download module, a lip language information recognition and decoding module, and an APP display interaction module.
  • the personal training set downloading module is used to call the personal training set from the network shared port of the training subsystem by connecting to the network, and store it in the APP client.
  • The lip-language information recognition and decoding module includes functional sub-modules for data preprocessing, online EMG classification, and voice conversion of the classification results. It denoises and filters the signal using IIR filters, wavelet transforms and the like, matches the features against the personal training set, decodes the lip-language information with the classification algorithm, recognizes the lip-language content, converts the content corresponding to the classification result into text, and, by calling voice and picture templates, converts it into voice and pictures for real-time transmission and display. It is also used to transmit the recognition result, via the APP, to the emergency contact configured in the system.
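The online decoding step described above might look like the following sketch. A nearest-centroid matcher stands in for the unspecified classification algorithm, and `phrase_table`, `centroids` and the stored-training-set format are hypothetical names and structures invented for illustration, not from the patent.

```python
import numpy as np

# Hypothetical phrase lookup: class label -> text of the trained short sentence.
phrase_table = {0: "I need water", 1: "Please call the nurse"}

def decode_window(features, centroids):
    """Nearest-centroid match of one feature vector against the training set."""
    dists = np.linalg.norm(centroids - features, axis=1)
    return int(np.argmin(dists))

# Assumed "personal training set" format: one feature centroid per phrase class.
centroids = np.array([[0.0, 0.0],
                      [5.0, 5.0]])

label = decode_window(np.array([4.8, 5.2]), centroids)
text = phrase_table[label]  # would then be rendered as text, voice and picture
```

In the described system the centroids (or a richer template) would be downloaded from the network terminal by the personal-training-set download module, and the recognized text would feed the voice/picture conversion and the emergency-contact notification.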
  • the APP display interaction module is used to display the optimal data set for channel selection, real-time display of electrode positions, real-time display of electromyographic signals, real-time display of classification results, and/or display of voice picture translation.
  • The above describes the collection and analysis of EMG information from the articulation muscles of the face and neck. It should be noted that other muscles related to the pronunciation function, such as those of the abdomen, also carry some pronunciation-movement information and can likewise serve as EMG sources for this embodiment's pronunciation-information recognition.
  • the core content of this embodiment is lip language recognition based on high-density EMG.
  • Lip-language recognition can be used not only by people with speech impairments but can also be extended to other settings where speaking is inconvenient or ambient noise is strong, such as underwater operations and noisy factories, and thus has great room for development.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Acoustics & Sound (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Theoretical Computer Science (AREA)
  • Veterinary Medicine (AREA)
  • Surgery (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Public Health (AREA)
  • Animal Behavior & Ethology (AREA)
  • Biophysics (AREA)
  • Pathology (AREA)
  • Psychiatry (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Medical Informatics (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physiology (AREA)
  • Neurosurgery (AREA)
  • Dermatology (AREA)
  • Neurology (AREA)
  • Evolutionary Computation (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Power Engineering (AREA)
  • Social Psychology (AREA)
  • Measurement And Recording Of Electrical Phenomena And Electrical Characteristics Of The Living Body (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

An assistive communication system based on surface EMG lip-language recognition, comprising: a training subsystem for collecting facial and neck EMG signals during lip-language movements through high-density array electrodes, improving signal quality through signal preprocessing algorithms, classifying lip-language movement types with a classification algorithm, selecting the optimal number and positions of electrodes with a channel-selection algorithm, establishing the optimal matching template between the EMG signals and the lip-language information, and uploading it to a network terminal for storage; and a detection subsystem for collecting, based on the optimal electrode number and positions selected by the training subsystem, the EMG signals during lip-language movements at the optimal positions, calling the optimal matching template, classifying and decoding the EMG signals, recognizing the lip-language information, converting it into corresponding voice and image information, and displaying it in real time, thereby realizing lip-language recognition and greatly improving its accuracy.

Description

An assistive communication system based on surface EMG lip-language recognition
Technical Field
The invention belongs to the technical field of speech-recognition-based assistive communication, and in particular relates to an assistive communication system based on surface EMG lip-language recognition.
Background Art
Language is a uniquely human capability for expressing emotion, conveying information and participating in social interaction, and pronunciation is the basis of language expression. Pronunciation is a highly complex process in which the central nervous system controls coordinated muscle movement; it is the result of multiple organs and muscle groups cooperating with one another. During pronunciation the facial and neck muscles move accordingly, and different sounds correspond to different facial and neck muscle movement patterns. Therefore, by collecting surface EMG signals from the face and neck and applying feature extraction and classification, different pronunciations can be mapped to the electrophysiological changes of different muscle groups, thereby identifying pronunciation information and helping patients communicate with others.
According to the 2006 second national sample survey of persons with disabilities, China has 82.96 million persons with disabilities, including 1.27 million with speech disabilities, accounting for 1.53% of that total. Voice disorders severely reduce their quality of life, affect their daily communication, cause inconvenience in interaction, and place a heavy burden on their families and on society. Since the clinical diagnosis and treatment of speech disorders is still immature, these patients urgently need assistive communication products to help them express themselves and communicate.
The surface EMG signal is a one-dimensional voltage time series obtained by guiding, amplifying, displaying and recording, via surface electrodes, the bioelectric changes produced by the muscular system during voluntary and involuntary activity. It reflects the bioelectric activity of motor neurons, formed as the temporal and spatial summation of many peripheral motor-unit potentials; it correlates strongly with muscle activity and, to a certain extent, reflects the activity level of the related muscles, so analyzing the surface EMG makes it possible to observe the movement of those muscles. As an objective, quantitative technique, surface EMG is non-invasive, simple to operate and low in cost, and supports both quantitative and qualitative analysis; it is therefore widely used in medical research, human-computer interaction and other fields.
In recent years, some studies have used EMG for speech-recognition assistive communication, but existing EMG acquisition typically places only a few electrodes on several known articulation muscles. The number and positions of the electrodes are chosen subjectively, so the selected electrode and channel counts are not necessarily optimal; this limitation keeps lip-language recognition accuracy low.
Technical Problem
In view of this, embodiments of the present invention provide, for patients who have difficulty vocalizing but can express themselves through mouth shapes and lip movements, an assisted communication system based on surface-EMG lip-language recognition, to solve the prior-art problem that the number and positions of electrodes are chosen subjectively, an optimal configuration is difficult to obtain, and speech-signal recognition accuracy is low.
Technical Solution
An assisted communication system based on surface-EMG lip-language recognition provided by an embodiment of the present invention may include:
a training subsystem, configured to acquire facial and neck EMG signals during lip movements through high-density array electrodes, improve signal quality with a signal preprocessing algorithm, classify the lip-movement types with a classification algorithm, select the optimal number and positions of electrodes with a channel-selection algorithm, build an optimal matching template between the EMG signals and the lip-language information, and upload it to a network terminal for storage;
a detection subsystem, configured to, based on the optimal electrode number and positions selected by the training subsystem, acquire EMG signals during lip movements at the optimal positions, invoke the optimal matching template, classify and decode the EMG signals, recognize the lip-language information, convert it into corresponding speech and image information, and display it in real time, realizing lip-language recognition.
Further, the training subsystem may include a training-subsystem lower computer and a training-subsystem upper computer; the training-subsystem lower computer may include:
high-density array electrodes, configured to be attached to the articulatory muscles of the face and neck to acquire high-density EMG signals of those muscles while the user performs lip movements;
an EMG acquisition module, configured to amplify, filter, and analog-to-digital convert the signals acquired by the high-density array electrodes and transmit them to the training-subsystem upper computer.
Further, the training-subsystem upper computer may include a user-interaction module and a signal-classification and correction-matching feedback-training module; the user-interaction module may include:
an EMG-signal display submodule, configured to display the acquired EMG signals in real time;
a lip-language training-scenario display submodule, configured to provide lip-language scenario pictures and text;
a channel-selection positioning-chart display submodule, configured to show the distribution of electrode positions on the face and neck.
Further, the signal-classification and correction-matching feedback-training module may include:
a signal-processing submodule, configured to filter out power-line interference and baseline drift with filters, and to remove interference noise from the EMG signals using wavelet-transform and template-matching algorithms;
a classification submodule, configured to extract the EMG signals related to the pronunciation of a specified short phrase, extract feature values, establish the correspondence between the EMG signals and the specified phrase, and classify the acquired lip-language content based on the EMG information;
a channel-selection submodule, configured to select the optimal matching template, build a personal training set, and transmit it to the network terminal.
Further, the detection subsystem may include a detection-subsystem lower computer and a detection-subsystem upper computer; the detection-subsystem lower computer may include:
patch-type flexible electrodes, configured to acquire EMG signals during lip movements at the optimal positions;
a wireless EMG acquisition module, configured to transmit the EMG information acquired by the patch-type flexible electrodes wirelessly to the detection-subsystem upper computer.
Further, the detection-subsystem upper computer may include:
a personal-training-set download module, configured to connect to the network, retrieve the personal training set from the training subsystem's network sharing port, and store it in the APP client;
a lip-language information recognition and decoding module, configured to denoise and filter the signals, match the features of the EMG signals against the personal training set, decode the lip-language information with a classification algorithm, recognize the lip-language content, convert the content corresponding to the classification result into text, and convert it into speech and pictures for real-time transmission and display;
an APP display and interaction module, configured to display the optimal channel-selection data set, the electrode positions in real time, the EMG signals in real time, the classification results in real time, and/or the speech-and-picture translation.
Further, the lip-language information recognition and decoding module is also configured to send the recognition result to an emergency contact configured in the system.
Further, the high-density array electrodes may include 130 single electrodes arranged in a high-density pattern with a center-to-center spacing of 1 cm.
Further, the training-subsystem lower computer may also include an electrode-placement hole plate.
Further, the EMG acquisition module may include a microcontroller, an analog-to-digital converter, an independent synchronization clock, a front-end signal filtering amplifier, and a low-noise power supply.
Beneficial Effects
Compared with the prior art, embodiments of the present invention have the following beneficial effects. A training subsystem acquires facial and neck EMG signals during lip movements through high-density array electrodes, improves signal quality with a signal preprocessing algorithm, classifies the lip-movement types with a classification algorithm, selects the optimal number and positions of electrodes with a channel-selection algorithm, builds an optimal matching template between the EMG signals and the lip-language information, and uploads it to a network terminal for storage. On this basis, a detection subsystem, using the optimal electrode number and positions selected by the training subsystem, acquires EMG signals during lip movements at the optimal positions, invokes the optimal matching template, classifies and decodes the EMG signals, recognizes the lip-language information, converts it into corresponding speech and image information, and displays it in real time, realizing lip-language recognition. Through this global-first, local-second strategy, high-density array electrodes acquire the EMG signals of the pronunciation process completely and in real time; after processing and analysis, the electrodes whose muscle activity contributes most to the lip movements are screened out and the optimal number and positions of electrodes are determined, so that electrode selection for lip-language recognition is positioned objectively, greatly improving the accuracy of lip-language recognition.
Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a structural block diagram of an assisted communication system based on surface-EMG lip-language recognition provided by an embodiment of the present invention.
Embodiments of the Present Invention
To make the objectives, features, and advantages of the present invention more apparent and comprehensible, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the embodiments described below are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Fig. 1 shows a structural block diagram of an assisted communication system based on surface-EMG lip-language recognition provided by an embodiment of the present invention; for ease of description, only the parts relevant to this embodiment are shown.
Referring to Fig. 1, an assisted communication system based on surface-EMG lip-language recognition provided by an embodiment of the present invention may include a training subsystem and a detection subsystem.
The training subsystem is configured to acquire facial and neck EMG signals during lip movements through high-density array electrodes, improve signal quality with a signal preprocessing algorithm, classify the lip-movement types with a classification algorithm, select the optimal number and positions of electrodes with a channel-selection algorithm, build an optimal matching template between the EMG signals and the lip-language information, and upload it to a network terminal for storage.
The detection subsystem is configured to, based on the optimal electrode number and positions selected by the training subsystem, acquire EMG signals during lip movements at the optimal positions, invoke the optimal matching template, classify and decode the EMG signals, recognize the lip-language information, convert it into corresponding speech and image information, and display it in real time, realizing lip-language recognition.
The training subsystem may consist of two parts, a lower computer and an upper computer, i.e., the training-subsystem lower computer and the training-subsystem upper computer.
The training-subsystem lower computer may include the high-density array electrodes and the EMG acquisition module.
The high-density array electrodes are configured to be attached to the articulatory muscles of the face and neck to acquire high-density EMG signals of those muscles while the user performs lip movements. The EMG signals must first be acquired through the high-density array electrodes because personal habits and pronunciation styles differ: the muscles each person engages during pronunciation are not identical, muscle activity during pronunciation varies from person to person, and the characteristic locations of that activity differ as well, so placing electrodes at the same few muscle positions for different users is highly unreasonable. Therefore, in this embodiment, comprehensive EMG signals are first acquired through the high-density array electrodes.
The high-density array electrodes may consist of many single electrodes; the specific number of single electrodes and the spacing between them can both be customized to the size of the user's face and neck, the criterion being that comprehensive EMG signals of the articulatory muscles can be acquired. Preferably, the high-density array electrodes may include 130 single electrodes arranged in a high-density pattern with a center-to-center spacing of 1 cm.
The EMG acquisition module may be a 130-channel EMG acquisition module, including a microcontroller (Micro Controller Unit, MCU), an analog-to-digital converter, an independent synchronization clock, a front-end signal filtering amplifier, and a low-noise power supply, and is configured to amplify, filter, and analog-to-digital convert the signals acquired by the high-density array electrodes and transmit them to the training-subsystem upper computer via USB or another transmission path.
Preferably, the training-subsystem lower computer may also include electrode-placement hole plates, each providing corresponding electrode holes with a hole spacing of about 1 cm to ensure that the electrode spacing is sufficiently small. The hole plates come in four sizes (20, 25, 40, and 48 holes), simultaneously holding 20, 25, 40, and 48 electrodes respectively, which reduces workload and makes operation more convenient.
The training-subsystem upper computer may be a desktop computer, laptop, tablet, or similar device, and includes the user-interaction module and the signal-classification and correction-matching feedback-training module.
The user-interaction module may include the EMG-signal display submodule, the lip-language training-scenario display submodule, and the channel-selection positioning-chart display submodule.
The EMG-signal display submodule is configured to display the acquired EMG signals in real time; it also provides single-channel signal selection, so that the signal quality of every channel can be observed in real time to ensure signal reliability.
The lip-language training-scenario display submodule is configured to provide lip-language scenario pictures and text needed in daily life, giving the user a personalized training set. Through training in fixed scenario modes, EMG signals are acquired and stored as an EMG database for lip-language analysis. The submodule also provides task prompts such as "read again" and "next scenario", offering friendly interaction for repeated training and the next operation.
The channel-selection positioning-chart display submodule is configured to show the distribution of electrode positions on the face and neck and, through training and classification, to display in real time the number and specific positions of the selected effective channels.
The signal-classification and correction-matching feedback-training module may include the signal-processing submodule, the classification submodule, and the channel-selection submodule.
The signal-processing submodule is configured to preliminarily filter out power-line interference and baseline drift using an IIR band-pass filter and an optimization-based filter, and then to further remove artifacts, ECG, and other interference noise from the EMG signals using wavelet-transform, template-matching, and other algorithms, preprocessing the signals to improve their quality and reliability.
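As a rough illustration of this preprocessing stage, the band-pass and power-line notch steps might be sketched as follows. The 1 kHz sampling rate, the 20-450 Hz pass band, and the 50 Hz mains frequency are illustrative assumptions, not values given in the patent, and the wavelet and template-matching stages are omitted:

```python
import numpy as np
from scipy.signal import butter, iirnotch, filtfilt

FS = 1000  # assumed sampling rate in Hz (not specified in the patent)

def preprocess_emg(raw, fs=FS):
    """Remove baseline drift with a 20-450 Hz band-pass, then notch out
    50 Hz power-line interference; filtfilt gives zero-phase filtering."""
    b_bp, a_bp = butter(4, [20, 450], btype="bandpass", fs=fs)
    x = filtfilt(b_bp, a_bp, raw, axis=-1)
    b_n, a_n = iirnotch(50.0, Q=30.0, fs=fs)
    return filtfilt(b_n, a_n, x, axis=-1)
```

The narrow notch (Q=30) removes only a band of roughly 1.7 Hz around the mains frequency, leaving the EMG content of the pass band largely intact.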
The classification submodule is configured to process the signals with normalization, blind source separation, and other algorithms to extract the EMG signals related to the pronunciation of a specified short phrase, extract feature values, and, using linear classifiers, neural networks, and support-vector-machine techniques, establish the correspondence between the EMG signals and the specified phrase, classifying the acquired lip-language content based on the EMG information.
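A minimal sketch of such a feature-extraction and classification step, using three time-domain EMG features that are common in the literature (mean absolute value, root mean square, waveform length) and a linear SVM on synthetic data; the feature set, window size, and 8-channel layout are illustrative assumptions rather than the patent's specification:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def emg_features(window):
    """window: (channels, samples) array -> per-channel MAV, RMS, and
    waveform length, concatenated into a single feature vector."""
    mav = np.mean(np.abs(window), axis=1)
    rms = np.sqrt(np.mean(window ** 2, axis=1))
    wl = np.sum(np.abs(np.diff(window, axis=1)), axis=1)
    return np.concatenate([mav, rms, wl])

# Synthetic stand-in for two trained phrases: class 1 has stronger activity.
rng = np.random.default_rng(0)
X = np.stack([emg_features(rng.normal(scale=1.0 + label, size=(8, 200)))
              for label in (0, 1) for _ in range(30)])
y = np.array([0] * 30 + [1] * 30)
clf = make_pipeline(StandardScaler(), SVC(kernel="linear")).fit(X, y)
```

In the real system, the feature vectors would be computed from the preprocessed EMG windows of each training phrase rather than from random data.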
The channel-selection submodule is configured to select, after several rounds of correction and matching, the EMG template with the fewest channels and the best classification accuracy, store the optimal matching template between the EMG signals and the lip-language information, build a personal training set, and transmit this optimal template data set to the network terminal.
Because personal habits and pronunciation styles differ, the muscles each person engages during pronunciation are not identical, muscle activity during pronunciation varies from person to person, and the characteristic locations of that activity differ as well. To recognize lip-language information accurately, it is therefore necessary to have the user perform multiple pronunciation-training sessions, build a personal training set, store the correspondence between the EMG signals and the specified phrases, and determine a personalized optimal electrode configuration.
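One plausible way to realize such a channel-selection step (the patent does not specify the algorithm) is a greedy forward search that repeatedly adds the channel giving the largest cross-validated accuracy gain, up to a channel budget; the classifier and the per-channel feature grouping below are assumptions for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def greedy_channel_selection(X, y, n_channels, max_keep=8):
    """X: (trials, n_channels * n_feat) with features grouped per channel.
    Returns the chosen channel indices and the CV accuracy after each add."""
    n_feat = X.shape[1] // n_channels
    chosen, history = [], []
    remaining = list(range(n_channels))
    while remaining and len(chosen) < max_keep:
        scores = []
        for ch in remaining:
            # columns of the candidate channel set: already chosen + this one
            cols = [c * n_feat + j for c in chosen + [ch] for j in range(n_feat)]
            acc = cross_val_score(LogisticRegression(max_iter=1000),
                                  X[:, cols], y, cv=3).mean()
            scores.append(acc)
        best = int(np.argmax(scores))
        history.append(scores[best])
        chosen.append(remaining.pop(best))
    return chosen, history
```

Stopping once accuracy plateaus would yield the "fewest channels with the best classification accuracy" trade-off described above.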
The detection subsystem may consist of two parts, a lower computer and an upper computer, i.e., the detection-subsystem lower computer and the detection-subsystem upper computer.
The detection-subsystem lower computer includes the patch-type flexible electrodes and the wireless EMG acquisition module.
The patch-type flexible electrodes are configured to acquire EMG signals during lip movements at the optimal positions. Existing rigid-board EMG electrodes conform to the skin only to a limited degree, and the stretching and deformation of the skin easily introduce substantial noise into the EMG data. The patch-type flexible electrodes, by contrast, consist of several single electrodes made from flexible printed-circuit (FPC) boards, forming a bendable, custom flexible electrode patch that adheres closely to the skin; the specific number of single electrodes can be set as needed and is preferably eight. The user selects the number of flexible electrodes and their placement on the face and neck according to the results computed by the training subsystem. The electrodes are highly personalized, fit the skin closely, and follow its micro-deformation, so the acquired EMG information is more stable and reliable.
The wireless EMG acquisition module integrates 8-channel EMG acquisition with wireless transmission, using a WiFi-enabled microcontroller, a preamplifier circuit, an analog-to-digital conversion circuit, and the like, and is configured to transmit the EMG information acquired by the patch-type flexible electrodes to the detection-subsystem upper computer over WiFi or another wireless link. Wireless transmission is more convenient than traditional wired electrodes: it is simple to wear and avoids the problems caused by tangled electrode leads. WiFi transmission loses no data, guaranteeing data integrity, and the simultaneous wireless transmission of multiple EMG channels makes up for the limited channel count and incomplete information of traditional methods.
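The patent does not define a transmission format. As a purely hypothetical illustration, each 8-channel sample frame could be packed into a fixed-size binary payload before being sent over the wireless link, which keeps framing trivial on the receiving side:

```python
import struct

# Hypothetical frame layout: 8 channels, 16-bit signed, little-endian.
FRAME_FMT = "<8h"
FRAME_SIZE = struct.calcsize(FRAME_FMT)  # 16 bytes per frame

def pack_frame(samples):
    """samples: iterable of 8 ADC readings -> bytes payload."""
    return struct.pack(FRAME_FMT, *samples)

def unpack_frame(payload):
    """bytes payload -> list of 8 ADC readings."""
    return list(struct.unpack(FRAME_FMT, payload))
```

A fixed frame size lets the upper computer slice an incoming byte stream into frames without delimiters; a real device would add sequence numbers or timestamps for loss detection.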
The detection-subsystem upper computer may be a mobile phone, tablet, or similar device, and includes the personal-training-set download module, the lip-language information recognition and decoding module, and the APP display and interaction module.
The personal-training-set download module is configured to connect to the network, retrieve the personal training set from the training subsystem's network sharing port, and store it in the APP client.
The lip-language information recognition and decoding module includes functional modules for data preprocessing, online EMG classification, and speech conversion of classification results. It is configured to denoise and filter the signals using IIR filters, wavelet transforms, and the like, match the features of the EMG signals against the personal training set, decode the lip-language information with a classification algorithm, recognize the lip-language content, convert the content corresponding to the classification result into text, invoke speech and picture templates to convert it into speech and pictures for real-time transmission and display, and also send the recognition result through the APP to an emergency contact configured in the system.
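Online per-window classification results are usually smoothed before being spoken or displayed. One common approach (assumed here for illustration; the patent does not specify a smoothing scheme) is a sliding majority vote that emits a phrase label only once it dominates the recent windows:

```python
from collections import Counter, deque

class MajorityVoteDecoder:
    """Buffer recent per-window labels and emit a phrase label only when
    one label holds at least `threshold` of a full window of votes."""

    def __init__(self, window=5, threshold=0.6):
        self.buf = deque(maxlen=window)
        self.threshold = threshold

    def push(self, label):
        self.buf.append(label)
        if len(self.buf) < self.buf.maxlen:
            return None  # not enough evidence yet
        winner, count = Counter(self.buf).most_common(1)[0]
        return winner if count / len(self.buf) >= self.threshold else None
```

The emitted label would then be looked up in the user's phrase templates and passed to the speech and picture output.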
Most current assisted communication systems require the interlocutor to be face to face with, or close to, the patient. In daily life, however, patients also need to communicate when they are alone, for example when seeking help at home by themselves. Using wireless transmission, this embodiment, after recognizing the patient's lip-language information, on the one hand converts the recognition result into speech and pictures played and displayed through the APP, and on the other hand automatically sends it through the user link to the mobile APP of a preset emergency contact, so that others can obtain the patient's lip-language information immediately and remotely.
The APP display and interaction module is configured to display the optimal channel-selection data set, the electrode positions in real time, the EMG signals in real time, the classification results in real time, and/or the speech-and-picture translation.
All of the above concerns the acquisition and analysis of EMG information from the articulatory muscles of the face and neck. In addition, muscles in other regions related to the pronunciation function, such as the abdomen, also carry some pronunciation-movement information and can likewise serve as EMG sources for this embodiment to recognize pronunciation information.
The core of this embodiment is lip-language recognition based on high-density EMG. Lip-language recognition is useful not only for people with speech disorders but can also be extended to other settings where speaking is inconvenient or ambient noise is strong, such as underwater work and noisy factories, and thus has great room for development.
In summary, embodiments of the present invention use a training subsystem to acquire facial and neck EMG signals during lip movements through high-density array electrodes, improve signal quality with a signal preprocessing algorithm, classify the lip-movement types with a classification algorithm, select the optimal number and positions of electrodes with a channel-selection algorithm, build an optimal matching template between the EMG signals and the lip-language information, and upload it to a network terminal for storage. On this basis, a detection subsystem, using the optimal electrode number and positions selected by the training subsystem, acquires EMG signals during lip movements at the optimal positions, invokes the optimal matching template, classifies and decodes the EMG signals, recognizes the lip-language information, converts it into corresponding speech and image information, and displays it in real time, realizing lip-language recognition. Through this global-first, local-second strategy, high-density array electrodes acquire the EMG signals of the pronunciation process completely and in real time; after processing and analysis, the electrodes whose muscle activity contributes most to the lip movements are screened out and the optimal number and positions of electrodes are determined, so that electrode selection for lip-language recognition is positioned objectively, greatly improving the accuracy of lip-language recognition.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the division into the functional systems and modules above is only exemplary; in practical applications, the functions above may be assigned as needed to different functional systems and modules to complete all or part of the functions described. The functional systems and modules in the embodiments may be integrated into one processing unit, may exist physically as separate units, or two or more units may be integrated into one unit; the integrated units may be implemented in hardware or as software functional units. In addition, the specific names of the functional systems and modules serve only to distinguish them from one another and do not limit the protection scope of this application.
The embodiments above are intended only to illustrate, not to limit, the technical solutions of the present invention. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions recorded in the foregoing embodiments or make equivalent substitutions for some of their technical features; such modifications and substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all fall within the protection scope of the present invention.

Claims (10)

  1. An assisted communication system based on surface-EMG lip-language recognition, characterized by comprising:
    a training subsystem, configured to acquire facial and neck EMG signals during lip movements through high-density array electrodes, improve signal quality with a signal preprocessing algorithm, classify the lip-movement types with a classification algorithm, select the optimal number and positions of electrodes with a channel-selection algorithm, build an optimal matching template between the EMG signals and the lip-language information, and upload it to a network terminal for storage;
    a detection subsystem, configured to, based on the optimal electrode number and positions selected by the training subsystem, acquire EMG signals during lip movements at the optimal positions, invoke the optimal matching template, classify and decode the EMG signals, recognize the lip-language information, convert it into corresponding speech and image information, and display it in real time, realizing lip-language recognition.
  2. The system of claim 1, characterized in that the training subsystem comprises a training-subsystem lower computer and a training-subsystem upper computer, the training-subsystem lower computer comprising:
    high-density array electrodes, configured to be attached to the articulatory muscles of the face and neck to acquire high-density EMG signals of those muscles while the user performs lip movements;
    an EMG acquisition module, configured to amplify, filter, and analog-to-digital convert the signals acquired by the high-density array electrodes and transmit them to the training-subsystem upper computer.
  3. The system of claim 2, characterized in that the training-subsystem upper computer comprises a user-interaction module and a signal-classification and correction-matching feedback-training module, the user-interaction module comprising:
    an EMG-signal display submodule, configured to display the acquired EMG signals in real time;
    a lip-language training-scenario display submodule, configured to provide lip-language scenario pictures and text;
    a channel-selection positioning-chart display submodule, configured to show the distribution of electrode positions on the face and neck.
  4. The system of claim 3, characterized in that the signal-classification and correction-matching feedback-training module comprises:
    a signal-processing submodule, configured to filter out power-line interference and baseline drift with filters, and to remove interference noise from the EMG signals using wavelet-transform and template-matching algorithms;
    a classification submodule, configured to extract the EMG signals related to the pronunciation of a specified short phrase, extract feature values, establish the correspondence between the EMG signals and the specified phrase, and classify the acquired lip-language content based on the EMG information;
    a channel-selection submodule, configured to select the optimal matching template, build a personal training set, and transmit it to the network terminal.
  5. The system of claim 1, characterized in that the detection subsystem comprises a detection-subsystem lower computer and a detection-subsystem upper computer, the detection-subsystem lower computer comprising:
    patch-type flexible electrodes, configured to acquire EMG signals during lip movements at the optimal positions;
    a wireless EMG acquisition module, configured to transmit the EMG information acquired by the patch-type flexible electrodes wirelessly to the detection-subsystem upper computer.
  6. The system of claim 5, characterized in that the detection-subsystem upper computer comprises:
    a personal-training-set download module, configured to connect to the network, retrieve the personal training set from the training subsystem's network sharing port, and store it in the APP client;
    a lip-language information recognition and decoding module, configured to denoise and filter the signals, match the features of the EMG signals against the personal training set, decode the lip-language information with a classification algorithm, recognize the lip-language content, convert the content corresponding to the classification result into text, and convert it into speech and pictures for real-time transmission and display;
    an APP display and interaction module, configured to display the optimal channel-selection data set, the electrode positions in real time, the EMG signals in real time, the classification results in real time, and/or the speech-and-picture translation.
  7. The system of claim 6, characterized in that the lip-language information recognition and decoding module is further configured to send the recognition result to an emergency contact configured in the system.
  8. The system of claim 1, characterized in that the high-density array electrodes comprise 130 single electrodes arranged in a high-density pattern with a center-to-center spacing of 1 cm.
  9. The system of claim 2, characterized in that the training-subsystem lower computer further comprises an electrode-placement hole plate.
  10. The system of claim 2, characterized in that the EMG acquisition module comprises a microcontroller, an analog-to-digital converter, an independent synchronization clock, a front-end signal filtering amplifier, and a low-noise power supply.
PCT/CN2019/130814 2019-03-25 2019-12-31 一种基于表面肌电唇语识别的辅助沟通系统 WO2020192231A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/960,496 US20210217419A1 (en) 2019-03-25 2019-12-31 Lip-language recognition aac system based on surface electromyography

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910228442.4 2019-03-25
CN201910228442.4A CN110059575A (zh) 2019-03-25 2019-03-25 一种基于表面肌电唇语识别的辅助沟通系统

Publications (1)

Publication Number Publication Date
WO2020192231A1 true WO2020192231A1 (zh) 2020-10-01

Family

ID=67317373

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/130814 WO2020192231A1 (zh) 2019-03-25 2019-12-31 一种基于表面肌电唇语识别的辅助沟通系统

Country Status (3)

Country Link
US (1) US20210217419A1 (zh)
CN (1) CN110059575A (zh)
WO (1) WO2020192231A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112330713A (zh) * 2020-11-26 2021-02-05 南京工程学院 基于唇语识别的重度听障患者言语理解度的改进方法
CN113887339A (zh) * 2021-09-15 2022-01-04 天津大学 融合表面肌电信号与唇部图像的无声语音识别系统及方法

Families Citing this family (11)

Publication number Priority date Publication date Assignee Title
US10943100B2 (en) 2017-01-19 2021-03-09 Mindmaze Holding Sa Systems, methods, devices and apparatuses for detecting facial expression
CN110892408A (zh) 2017-02-07 2020-03-17 迈恩德玛泽控股股份有限公司 用于立体视觉和跟踪的系统、方法和装置
CN110059575A (zh) * 2019-03-25 2019-07-26 中国科学院深圳先进技术研究院 一种基于表面肌电唇语识别的辅助沟通系统
CN110865705B (zh) * 2019-10-24 2023-09-19 中国人民解放军军事科学院国防科技创新研究院 多模态融合的通讯方法、装置、头戴设备及存储介质
CN111190484B (zh) * 2019-12-25 2023-07-21 中国人民解放军军事科学院国防科技创新研究院 一种多模态交互系统和方法
CN111419230A (zh) * 2020-04-17 2020-07-17 上海交通大学 一种用于运动单元解码的表面肌电信号采集系统
CN111832412B (zh) * 2020-06-09 2024-04-09 北方工业大学 一种发声训练矫正方法及系统
CN112349182A (zh) * 2020-11-10 2021-02-09 中国人民解放军海军航空大学 一种聋哑人交谈辅助系统
CN112741619A (zh) * 2020-12-23 2021-05-04 清华大学 一种自驱动唇语动作捕捉装置
CN112927704A (zh) * 2021-01-20 2021-06-08 中国人民解放军海军航空大学 一种沉默式全天候单兵通信系统
CN113627401A (zh) * 2021-10-12 2021-11-09 四川大学 融合双注意力机制的特征金字塔网络的肌电手势识别方法

Citations (4)

Publication number Priority date Publication date Assignee Title
US20060129400A1 (en) * 2004-12-10 2006-06-15 Microsoft Corporation Method and system for converting text to lip-synchronized speech in real time
WO2018113649A1 (zh) * 2016-12-21 2018-06-28 深圳市掌网科技股份有限公司 一种虚拟现实语言交互系统与方法
CN108319912A (zh) * 2018-01-30 2018-07-24 歌尔科技有限公司 一种唇语识别方法、装置、系统和智能眼镜
CN110059575A (zh) * 2019-03-25 2019-07-26 中国科学院深圳先进技术研究院 一种基于表面肌电唇语识别的辅助沟通系统

Family Cites Families (7)

Publication number Priority date Publication date Assignee Title
CN102169690A (zh) * 2011-04-08 2011-08-31 哈尔滨理工大学 基于表面肌电信号的语音信号识别系统和识别方法
CN102999154B (zh) * 2011-09-09 2015-07-08 中国科学院声学研究所 一种基于肌电信号的辅助发声方法及装置
CN203252647U (zh) * 2012-09-29 2013-10-30 艾利佛公司 用于判定生理特征的可佩带的设备
CA2918594A1 (en) * 2013-05-20 2014-11-27 Aliphcom Combination speaker and light source responsive to state(s) of an organism based on sensor data
KR20150104345A (ko) * 2014-03-05 2015-09-15 삼성전자주식회사 음성 합성 장치 및 음성 합성 방법
CN103948388B (zh) * 2014-04-23 2018-10-30 深圳先进技术研究院 一种肌电采集装置
US9789306B2 (en) * 2014-12-03 2017-10-17 Neurohabilitation Corporation Systems and methods for providing non-invasive neurorehabilitation of a patient


Cited By (3)

Publication number Priority date Publication date Assignee Title
CN112330713A (zh) * 2020-11-26 2021-02-05 南京工程学院 基于唇语识别的重度听障患者言语理解度的改进方法
CN112330713B (zh) * 2020-11-26 2023-12-19 南京工程学院 基于唇语识别的重度听障患者言语理解度的改进方法
CN113887339A (zh) * 2021-09-15 2022-01-04 天津大学 融合表面肌电信号与唇部图像的无声语音识别系统及方法

Also Published As

Publication number Publication date
US20210217419A1 (en) 2021-07-15
CN110059575A (zh) 2019-07-26

Similar Documents

Publication Publication Date Title
WO2020192231A1 (zh) 一种基于表面肌电唇语识别的辅助沟通系统
Panicker et al. A survey of machine learning techniques in physiology based mental stress detection systems
Khezri et al. Reliable emotion recognition system based on dynamic adaptive fusion of forehead biopotentials and physiological signals
Brumberg et al. Brain–computer interfaces for speech communication
WO2017193497A1 (zh) 基于融合模型的智能化健康管理服务器、系统及其控制方法
JP2019528104A (ja) 生体信号のモニタリングのための耳内感知システムおよび方法
Chen et al. Eyebrow emotional expression recognition using surface EMG signals
US7963931B2 (en) Methods and devices of multi-functional operating system for care-taking machine
US20210128049A1 (en) Pronunciation function evaluation system based on array high-density surface electromyography
CN109065162A (zh) 一种综合性智能化诊断系统
Rattanyu et al. Emotion monitoring from physiological signals for service robots in the living space
CN108814565A (zh) 一种基于多传感器信息融合和深度学习的智能中医健康检测梳妆台
CN111513735A (zh) 基于脑机接口和深度学习的重度抑郁症辨识系统及应用
CN109124655A (zh) 精神状态分析方法、装置、设备、计算机介质及多功能椅
US20220208194A1 (en) Devices, systems, and methods for personal speech recognition and replacement
Smith et al. Detection of simulated vocal dysfunctions using complex sEMG patterns
Ntalampiras Model ensemble for predicting heart and respiration rate from speech
Hinduja et al. Investigation into recognizing context over time using physiological signals
CN112669963A (zh) 智能健康机、健康数据生成方法以及健康数据管理系统
CN215017589U (zh) 一种基于微表情技术的养老服务用心理评估系统及设备
Yi et al. Mordo: Silent command recognition through lightweight around-ear biosensors
Sandhya et al. Analysis of speech imagery using brain connectivity estimators on consonant-vowel-consonant words
Smith et al. Non-invasive ambulatory monitoring of complex sEMG patterns and its potential application in the detection of vocal dysfunctions
Ghosh et al. Classification of silent speech in english and bengali languages using stacked autoencoder
Tan et al. Extracting spatial muscle activation patterns in facial and neck muscles for silent speech recognition using high-density sEMG

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19921660

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19921660

Country of ref document: EP

Kind code of ref document: A1


32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 21.03.2022)
