WO2020151155A1 - Method and device for building an Alzheimer's disease detection model - Google Patents

Method and device for building an Alzheimer's disease detection model

Info

Publication number
WO2020151155A1
Authority
WO
WIPO (PCT)
Prior art keywords
speech
sample
alzheimer
segment
disease detection
Prior art date
Application number
PCT/CN2019/089829
Other languages
English (en)
Chinese (zh)
Inventor
刘博卿
贾雪丽
王健宗
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2020151155A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
    • G10L25/30 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/66 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems

Definitions

  • the most common diagnosis method is to monitor the patient's medical history and perform cognitive tests (such as picture description tasks), mental state tests, and emotional tests on the patient. In medicine, this diagnosis process can last for several weeks and is very difficult to carry out. This method of diagnosing patients based on the experience of medical experts is inefficient in predicting symptoms.
  • the method further includes: acquiring a target speech segment of the target user, the target speech segment including a spoken description of the target image by the target user; performing feature extraction on the target speech segment to obtain the audio features of the target speech segment; and determining the Alzheimer's disease detection result of the target user according to the audio features of the target speech segment and the Alzheimer's disease detection model.
  • the first extraction unit is configured to perform feature extraction on each sample speech segment to obtain the audio features of each sample speech segment;
  • the establishing unit is specifically configured to establish the Alzheimer's disease detection model according to the audio features of each sample speech segment, the Alzheimer's disease detection result of the sample user, the linear threshold, and the convolutional neural network model.
  • the Alzheimer's disease detection model is established according to the audio features of each sample speech segment, the Alzheimer's disease detection result of the sample user, and the convolutional neural network model, and the Alzheimer's disease detection model is used to represent the mapping relationship between audio features and Alzheimer's disease detection results.
  • the multiple sample speech segments in S110 can be obtained through steps (1) to (4):
  • the normalized audio data is divided into the multiple sample speech segments.
  • the sample speech segment may be a speech segment that has undergone fade-in and/or fade-out processing.
  • openSMILE can be used to extract features; it provides different configuration files for extracting different feature sets. Because patients with Alzheimer's disease are prone to depression, anxiety, and sadness, and find it difficult to express emotion through the rhythm of their speech, the following feature groups (openSMILE configuration names) are selected: IS09, IS10, IS11, IS12.
  • LLDs are low-level descriptors, including pcm_RMSenergy (root-mean-square signal frame energy), mfcc (mel-frequency cepstral coefficients 1-12), pcm_zcr (frame-based zero-crossing rate of the time signal), voiceProb (voicing probability calculated from the autocorrelation function, ACF), F0 (fundamental frequency calculated from the cepstrum), etc.
  • the Alzheimer's disease detection model is used to represent the mapping relationship between audio features and Alzheimer's disease detection results.
  • the Alzheimer's disease detection model is obtained by training a convolutional neural network on sample data, and it can be used to predict whether the speaker in the audio data has Alzheimer's disease.
  • a maximum pooling layer can be added.
  • the output of the last convolutional layer is flattened from a two-dimensional array into a one-dimensional feature vector.
  • a fully connected layer can be added after the pooling layer; the one-dimensional feature vector output by the pooling layer is used as the input of the fully connected layer, which uses the rectified linear unit (ReLU) as its activation function.
  • This fully connected layer contains 256 neurons.
  • batch normalization and random weight initialization are also used for one layer.
  • the output layer consists of 1 neuron with the sigmoid function as its activation function.
  • the loss function is cross entropy, and the optimization algorithm is Adam. When training the network, the maximum number of iterations is set to 200.
  • $V_{c,d}$ represents the $(c, d)$ element of $V$
  • $V \in \mathbb{R}^{K \times F}$ is the weight matrix of the convolution kernel of the linear threshold
  • $e \in \mathbb{R}$ is the bias of the linear threshold.
  • the method for establishing the Alzheimer's disease detection model and the method for detecting Alzheimer's disease provided by the embodiments of the present application are described above with reference to FIGS. 1 and 2. The device for establishing an Alzheimer's disease detection model and the Alzheimer's disease detection device provided by the embodiments of the present application are described below in conjunction with FIGS. 3 to 6.
  • the first acquiring unit is further configured to acquire a target speech segment of the target user, the target speech segment including a spoken description of the target image by the target user; the first extraction unit is further configured to perform feature extraction on the target speech segment to obtain the audio features of the target speech segment; and the first determining unit is configured to determine the Alzheimer's disease detection result of the target user according to the audio features of the target speech segment and the Alzheimer's disease detection model.
  • FIG. 5 only shows a simplified design of the device 500.
  • the device 500 may also include other necessary components, including but not limited to any number of communication interfaces, processors, controllers, memories, etc., and all devices that can implement the application fall within the protection scope of the application.
  • FIG. 6 only shows a simplified design of the device 600.
  • the device 600 may also include other necessary elements, including but not limited to any number of communication interfaces, processors, controllers, memories, etc., and all devices that can implement the application fall within the protection scope of the application.
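
The excerpts above on normalization, segmentation, fade-in/fade-out processing, and low-level descriptors can be illustrated with a minimal sketch. It is not the applicant's implementation and does not use openSMILE. The function names, peak normalization, fixed-length non-overlapping segments, and linear fades are all assumptions chosen for illustration. The two descriptor functions only show what RMS energy and zero-crossing rate measure, not how openSMILE computes them.

```python
import math


def peak_normalize(samples):
    """Scale samples so the largest absolute value is 1.0 (assumed normalization)."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0:
        return list(samples)  # silent input: nothing to scale
    return [s / peak for s in samples]


def split_segments(samples, segment_len):
    """Split into consecutive non-overlapping segments, dropping a short tail."""
    last_start = len(samples) - segment_len
    return [samples[i:i + segment_len] for i in range(0, last_start + 1, segment_len)]


def apply_fades(segment, fade_len):
    """Apply a linear fade-in at the start and a linear fade-out at the end."""
    out = list(segment)
    n = len(out)
    fade_len = min(fade_len, n // 2)
    for i in range(fade_len):
        gain = i / fade_len  # ramps from 0.0 towards 1.0
        out[i] *= gain
        out[n - 1 - i] *= gain
    return out


def preprocess(samples, segment_len, fade_len):
    """Normalize, segment, and fade: one possible reading of the steps above."""
    normalized = peak_normalize(samples)
    return [apply_fades(seg, fade_len) for seg in split_segments(normalized, segment_len)]


def rms_energy(frame):
    """Root-mean-square energy of one frame."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0)
    return crossings / (len(frame) - 1)
```

In practice the segment length and fade length would be expressed in samples derived from the recording's sample rate; the source excerpts do not state these values.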

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Public Health (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Signal Processing (AREA)
  • Epidemiology (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Primary Health Care (AREA)
  • Pathology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a method and a device for building an Alzheimer's disease detection model. The method comprises: acquiring multiple sample speech segments of sample users and an Alzheimer's disease detection result for each sample user, each of the multiple sample speech segments comprising a spoken description of a sample image given by the sample user, and the Alzheimer's disease detection result comprising a positive result or a negative result (S110); performing feature extraction on each sample speech segment to obtain an audio feature thereof (S120); and building an Alzheimer's disease detection model according to the audio features of the sample speech segments, the Alzheimer's disease detection results of the sample users, and a convolutional neural network model, the Alzheimer's disease detection model indicating a mapping relationship between audio features and an Alzheimer's disease detection result (S130).
PCT/CN2019/089829 2019-01-22 2019-06-03 Method and device for building an Alzheimer's disease detection model WO2020151155A1 (fr)
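
The mapping from audio features to a detection result in step S130 can be sketched as a forward pass through a small convolutional network. This is a hedged illustration in plain Python, not the patented model. The 256-neuron ReLU layer, the single sigmoid output neuron, and the cross-entropy loss follow the description. The single 1-D convolution kernel, the pooling size of 2, the weight range, and every function name are assumptions. Training (Adam, up to 200 iterations), batch normalization, and the "linear threshold" gating are omitted.

```python
import math
import random


def relu(x):
    return max(0.0, x)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def conv1d(signal, kernel, bias):
    """Valid 1-D convolution (cross-correlation) followed by ReLU."""
    k = len(kernel)
    return [relu(sum(signal[i + j] * kernel[j] for j in range(k)) + bias)
            for i in range(len(signal) - k + 1)]


def max_pool(values, size):
    """Non-overlapping max pooling; an incomplete final window is dropped."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


def dense(inputs, weights, biases, activation):
    """Fully connected layer: one weight row and one bias per output neuron."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def binary_cross_entropy(p, y, eps=1e-12):
    """Cross-entropy loss for one predicted probability p and label y (0 or 1)."""
    p = min(max(p, eps), 1 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def init_params(n_features, kernel_size=3, hidden=256, seed=0):
    """Random initial weights; the sizes other than hidden=256 are assumptions."""
    rng = random.Random(seed)
    pooled = (n_features - kernel_size + 1) // 2  # length after conv + pooling

    def rand(n):
        return [rng.uniform(-0.1, 0.1) for _ in range(n)]

    return {
        "kernel": rand(kernel_size),
        "kernel_bias": 0.0,
        "w_hidden": [rand(pooled) for _ in range(hidden)],
        "b_hidden": [0.0] * hidden,
        "w_out": [rand(hidden)],
        "b_out": [0.0],
    }


def forward(features, params):
    """Return the predicted probability of a positive detection result."""
    h = conv1d(features, params["kernel"], params["kernel_bias"])
    h = max_pool(h, 2)
    h = dense(h, params["w_hidden"], params["b_hidden"], relu)
    return dense(h, params["w_out"], params["b_out"], sigmoid)[0]
```

With untrained weights the output is only a placeholder probability; a real model would fit the parameters to the labelled sample segments before being used for detection.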

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910059489.2A CN109754822A (zh) 2019-01-22 2019-01-22 建立阿兹海默症检测模型的方法和装置
CN201910059489.2 2019-01-22

Publications (1)

Publication Number Publication Date
WO2020151155A1 (fr) 2020-07-30

Family

ID=66406096

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/089829 WO2020151155A1 (fr) Method and device for building an Alzheimer's disease detection model

Country Status (2)

Country Link
CN (1) CN109754822A (fr)
WO (1) WO2020151155A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI811097B (zh) * 2021-09-09 2023-08-01 南韓商智聰醫治股份有限公司 用於確定用戶癡呆程度的方法及裝置

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109754822A (zh) * 2019-01-22 2019-05-14 平安科技(深圳)有限公司 建立阿兹海默症检测模型的方法和装置
CN110674773A (zh) * 2019-09-29 2020-01-10 燧人(上海)医疗科技有限公司 一种痴呆症的识别系统、装置及存储介质
CN114596960B (zh) * 2022-03-01 2023-08-08 中山大学 基于神经网络和自然对话的阿尔兹海默症风险预估方法

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100174533A1 (en) * 2009-01-06 2010-07-08 Regents Of The University Of Minnesota Automatic measurement of speech fluency
JP2017156402A (ja) * 2016-02-29 2017-09-07 国立大学法人 奈良先端科学技術大学院大学 診断装置、診断方法、及び診断プログラム
JP6263308B1 (ja) * 2017-11-09 2018-01-17 パナソニックヘルスケアホールディングス株式会社 認知症診断装置、認知症診断方法、及び認知症診断プログラム
CN108320734A (zh) * 2017-12-29 2018-07-24 安徽科大讯飞医疗信息技术有限公司 语音信号处理方法及装置、存储介质、电子设备
CN109754822A (zh) * 2019-01-22 2019-05-14 平安科技(深圳)有限公司 建立阿兹海默症检测模型的方法和装置

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101751919B (zh) * 2008-12-03 2012-05-23 中国科学院自动化研究所 一种汉语口语重音自动检测方法
CN108073576A (zh) * 2016-11-09 2018-05-25 上海诺悦智能科技有限公司 智能搜索方法、搜索装置以及搜索引擎系统
CN107578775B (zh) * 2017-09-07 2021-02-12 四川大学 一种基于深度神经网络的多分类语音方法
CN108109633A (zh) * 2017-12-20 2018-06-01 北京声智科技有限公司 无人值守的云端语音库采集与智能产品测试的系统与方法
CN108460334A (zh) * 2018-01-23 2018-08-28 北京易智能科技有限公司 一种基于声纹和人脸图像特征融合的年龄预测系统及方法
CN108198576A (zh) * 2018-02-11 2018-06-22 华南理工大学 一种基于语音特征非负矩阵分解的阿尔茨海默症初筛方法

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
CHIEN YI-WEI; HONG SHENG-YI; CHEAH WEN-TING; FU LI-CHEN; CHANG YU-LING: "An Assessment System for Alzheimer's Disease Based on Speech Using a Novel Feature Sequence Design and Recurrent Neural Network", 2018 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 10 October 2018 (2018-10-10), pages 3289 - 3294, XP033503128, DOI: 10.1109/SMC.2018.00557 *

Also Published As

Publication number Publication date
CN109754822A (zh) 2019-05-14

Similar Documents

Publication Publication Date Title
WO2020151155A1 (fr) Method and device for building an Alzheimer's disease detection model
US11545173B2 (en) Automatic speech-based longitudinal emotion and mood recognition for mental health treatment
Vaiciukynas et al. Detecting Parkinson’s disease from sustained phonation and speech signals
JP6958723B2 (ja) 信号処理システム、信号処理装置、信号処理方法、およびプログラム
WO2019102884A1 (fr) Label generation device, model learning device, emotion recognition device, and method, program, and storage medium for said devices
WO2018204934A1 (fr) Selecting speech features for building models for detecting medical conditions
WO2021177730A1 (fr) Apparatus for diagnosing a disease causing voice and swallowing disorders, and diagnosis method thereof
CN109147826B (zh) 音乐情感识别方法、装置、计算机设备及计算机存储介质
WO2021082420A1 (fr) Voiceprint authentication method and device, medium, and electronic device
KR102298330B1 (ko) 음성인식과 자연어 처리 알고리즘을 통해 의료 상담 요약문과 전자 의무 기록을 생성하는 시스템
JPH0361959B2 (fr)
CN109087670A (zh) 情绪分析方法、系统、服务器及存储介质
WO2021073263A1 (fr) Method and device for predicting the risk of suffering from a disease
WO2022127042A1 (fr) Method and apparatus for recognizing exam cheating based on speech recognition, and computer device
CN110717067B (zh) 视频中音频聚类的处理方法和装置
WO2019233361A1 (fr) Method and device for adjusting music volume
KR20200143940A (ko) 인공지능 기반의 언어 능력 평가 방법 및 시스템
Usman et al. Heart rate detection and classification from speech spectral features using machine learning
Sakthive et al. Integrated platform and response system for healthcare using Alexa
JP6699825B2 (ja) 診断装置、診断装置の制御方法、及び診断プログラム
CN108847251B (zh) 一种语音去重方法、装置、服务器及存储介质
Makiuchi et al. Speech paralinguistic approach for detecting dementia using gated convolutional neural network
CN113539243A (zh) 语音分类模型的训练方法、语音分类方法及相关装置
JP7062966B2 (ja) 音声解析装置、音声解析システム、及びプログラム
WO2021132289A1 (fr) Pathological condition analysis system, pathological condition analysis device, pathological condition analysis method, and pathological condition analysis program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application Ref document number: 19911045; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase Ref country code: DE
122 Ep: pct application non-entry in european phase Ref document number: 19911045; Country of ref document: EP; Kind code of ref document: A1