WO2019080502A1 - Voice-based disease prediction method, application server and computer-readable storage medium - Google Patents

Voice-based disease prediction method, application server and computer-readable storage medium

Info

Publication number
WO2019080502A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice
patient
category
neural network
voice data
Prior art date
Application number
PCT/CN2018/089428
Other languages
English (en)
Chinese (zh)
Inventor
梁浩
王健宗
肖京
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2019080502A1

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
      • G10L15/02: Feature extraction for speech recognition; selection of recognition unit
      • G10L15/063: Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
      • G10L25/30: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00, characterised by the analysis technique using neural networks
      • G10L25/66: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00, specially adapted for particular use, for comparison or discrimination, for extracting parameters related to health condition

Definitions

  • The present application relates to the field of disease prediction, and in particular to a method for predicting disease using voice, an application server, and a computer-readable storage medium.
  • The present application provides a method for predicting disease using voice, an application server, and a computer-readable storage medium, which can conveniently perform a preliminary diagnosis of a patient through the patient's voice before the patient undergoes formal treatment, thereby providing a certain amount of data support and reference for the follow-up doctor's formal diagnosis, which greatly facilitates doctors and patients.
  • A first aspect of the present application provides a method for predicting disease using voice, the method being applied to an application server and comprising:
  • Training a deep neural network model with training data, the training data having a specific speech category, the deep neural network model having an input layer and an output layer, the output layer being able to output a state of the speech category;
  • Acquiring real-time patient voice data;
  • Performing data processing on the patient voice data;
  • Sending the processed patient voice data to the input layer of the trained deep neural network model;
  • Acquiring an output state of the output layer of the deep neural network model; and
  • Determining, according to the acquired output state, the category to which the patient voice data belongs.
  • A second aspect of the present application provides an application server, where the application server includes a memory, a processor, and a program for performing disease prediction using voice that can be run on the processor; when the program is executed by the processor, the following steps are implemented:
  • Training a deep neural network model with training data, the training data having a specific speech category, the deep neural network model having an input layer and an output layer, the output layer being able to output a state of the speech category;
  • Acquiring real-time patient voice data;
  • Performing data processing on the patient voice data;
  • Sending the processed patient voice data to the input layer of the trained deep neural network model;
  • Acquiring an output state of the output layer of the deep neural network model; and
  • Determining, according to the acquired output state, the category to which the patient voice data belongs.
  • A third aspect of the present application provides a computer-readable storage medium storing a program for performing disease prediction using voice, the program being executable by at least one processor to cause the at least one processor to perform the following steps:
  • Training a deep neural network model with training data, the training data having a specific speech category, the deep neural network model having an input layer and an output layer, the output layer being able to output a state of the speech category;
  • Acquiring real-time patient voice data;
  • Performing data processing on the patient voice data;
  • Sending the processed patient voice data to the input layer of the trained deep neural network model;
  • Acquiring an output state of the output layer of the deep neural network model; and
  • Determining, according to the acquired output state, the category to which the patient voice data belongs.
  • According to the method for predicting disease using voice, the application server, and the computer-readable storage medium proposed by the present application: firstly, a deep neural network model is trained using training data, the training data having a specific voice category, the deep neural network model having an input layer and an output layer, the output layer being able to output a state of the voice category; secondly, real-time patient voice data is acquired; then, data processing is performed on the patient voice data; next, the processed patient voice data is sent to the input layer of the trained deep neural network model; in addition, the output state of the output layer of the deep neural network model is acquired; finally, the category to which the patient voice data belongs is determined according to the acquired output state.
  • FIG. 1 is a schematic diagram of an optional hardware architecture of an application server;
  • FIG. 2 is a block diagram of a first embodiment of a program for predicting disease using speech according to the present application;
  • FIG. 3 is a structural diagram of a deep neural network model in a preferred embodiment of the present application;
  • FIG. 4 is a flowchart of a first embodiment of a method for disease prediction using speech;
  • FIG. 5 is a flowchart of a second embodiment of a method for disease prediction using voice.
  • Reference numerals: application server 1; memory 11; processor 12; network interface 13; program for disease prediction using speech 200; training module 20; acquisition module 21; data processing module 22; input module 23; judgment module 24.
  • Referring to FIG. 1, there is shown a schematic diagram of an optional hardware architecture of the application server 1.
  • The application server 1 may be a computing device such as a rack server, a blade server, a tower server, or a cabinet server.
  • The application server 1 may be a stand-alone server or a server cluster composed of multiple servers.
  • The application server 1 may include, but is not limited to, a memory 11, a processor 12, and a network interface 13, which are communicably connected to one another through a system bus.
  • The application server 1 connects to the network through the network interface 13 to obtain information.
  • The network may be a wireless or wired network such as an intranet, the Internet, a Global System for Mobile communications (GSM) network, a Wideband Code Division Multiple Access (WCDMA) network, a 4G network, a 5G network, Bluetooth, Wi-Fi, or a telephone network.
  • Figure 1 only shows the application server 1 with components 11-13, but it should be understood that not all illustrated components may be implemented, and more or fewer components may be implemented instead.
  • The memory 11 includes at least one type of readable storage medium, including a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory), a random access memory (RAM), a static random access memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like.
  • The memory 11 may be an internal storage unit of the application server 1, such as a hard disk or memory of the application server 1.
  • The memory 11 may also be an external storage device of the application server 1, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the application server 1.
  • The memory 11 can also include both the internal storage unit of the application server 1 and its external storage device.
  • The memory 11 is generally used to store the operating system installed in the application server 1 and various types of application software, such as the program code of the program 200 for performing disease prediction using voice. Further, the memory 11 can also be used to temporarily store various types of data that have been output or are to be output.
  • the processor 12 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments.
  • the processor 12 is typically used to control the overall operation of the application server 1, such as performing data interaction or communication related control and processing, and the like.
  • the processor 12 is configured to run program code or process data stored in the memory 11, such as running the program 200 for performing disease prediction using voice.
  • the network interface 13 may comprise a wireless network interface or a wired network interface, which is typically used to establish a communication connection between the application server 1 and other electronic devices.
  • a program 200 for performing disease prediction using voice is installed and run in the application server 1.
  • The application server 1 trains a deep neural network model by using training data, the training data having a specific speech category, the deep neural network model having an input layer and an output layer, the output layer being able to output a state of the voice category; acquires real-time patient voice data; performs data processing on the patient voice data; sends the processed patient voice data to the input layer of the trained deep neural network model; obtains the output state of the output layer of the deep neural network model; and determines, according to the acquired output state, the category to which the patient's voice data belongs.
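  • A minimal sketch of this end-to-end flow in Python follows; every name, shape, and stand-in implementation below is an illustrative assumption, not part of the patent.

```python
import numpy as np

# The five speech categories named later in the description.
CATEGORIES = ["severe cold", "mild cold", "severe cough", "mild cough", "non-disease"]

def preprocess(raw_voice: np.ndarray) -> np.ndarray:
    """Stand-in for noise reduction, endpoint detection, and feature extraction."""
    return raw_voice[:40]  # pretend the first 40 values form the feature vector

def predict_category(model_forward, raw_voice: np.ndarray) -> str:
    """Run processed voice data through a trained model and pick a category."""
    output_state = model_forward(preprocess(raw_voice))  # output-layer state
    return CATEGORIES[int(np.argmax(output_state))]
```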
  • the present application proposes a procedure 200 for predicting disease using speech.
  • The program 200 for performing disease prediction using voice includes a series of computer program instructions stored in the memory 11; when these computer program instructions are executed by the processor 12, the disease prediction operations of the embodiments of the present application can be implemented.
  • The program 200 for predicting disease using speech may be divided into one or more modules based on the particular operations implemented by the various portions of the computer program instructions. For example, in FIG. 2, the program 200 for predicting disease using speech is divided into a training module 20, an acquisition module 21, a data processing module 22, an input module 23, and a determination module 24, wherein:
  • the training module 20 is configured to train the deep neural network model with the training data.
  • The training data refers to the voice sample data used for training the deep neural network model, and the amount of voice sample data extracted is determined according to actual needs.
  • the number of the voice sample data is not specifically limited in this embodiment.
  • the training data has a particular speech category including a severe cold, a mild cold, a severe cough, a mild cough, and a non-disease, the state of the speech category being the probability of occurrence of the speech category.
  • the deep neural network model has an input layer and an output layer. Further, the deep neural network model further has a hidden layer. The output layer can output the status of the voice class.
  • The deep neural network includes an input layer 201, a plurality of hidden layers 202, and an output layer 203.
  • The input layer 201 is configured to calculate, according to the voice feature data input to the deep neural network, the output values passed to the lowest hidden layer.
  • The voice feature data refers to the feature data extracted from the voice in the training data.
  • The hidden layer 202 is configured to perform a weighted summation of the input values from the layer below according to the weighting values of the current layer, and to calculate the output values passed to the layer above.
  • The output layer 203 is configured to perform a weighted summation of the output values from the topmost hidden layer according to the weighting values of the layer, and to calculate an output probability from the result of the weighted summation.
  • The output probability is the output probability corresponding to the training data of the respective voice category.
  • Training data such as severe cold, mild cold, severe cough, mild cough, and non-disease are introduced into the basic deep neural network model to calculate the output probability corresponding to the training data of various speech categories.
  • y_j = w · x_j, where y_j represents the output value of the j-th training data at the current layer, w represents the weighting value of the current layer, and x_j represents the input value of the j-th training data at the current layer.
  • the output function of the output layer is calculated by using a softmax function.
  • The softmax function is as follows: p_j = e^{x_j} / Σ_k e^{x_k}, where p_j represents the output probability of the j-th training data in the output layer, and x_j represents the weighted summation result of the j-th training data in the output layer.
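  • A minimal NumPy sketch of the weighted summation and softmax just described; the layer sizes and the ReLU activation are assumptions, since the text specifies neither.

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    """p_j = exp(x_j) / sum_k exp(x_k); shifted by max(x) for numerical stability."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

# One transition per layer, per the text: y_j = w * x_j (no bias term is mentioned).
rng = np.random.default_rng(0)
features = rng.standard_normal(40)             # assumed 40-dimensional voice feature vector
w_hidden = rng.standard_normal((64, 40))       # assumed single hidden layer of width 64
w_output = rng.standard_normal((5, 64))        # five speech categories

hidden = np.maximum(0.0, w_hidden @ features)  # assumed ReLU activation
probabilities = softmax(w_output @ hidden)     # output state: one probability per category
print(probabilities.sum())                     # 1.0
```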
  • After the training module 20 determines the structure of the deep neural network, it is necessary to determine the weighting values of the layers of the deep neural network.
  • Specifically, the training module 20 inputs all the voice feature data into the deep neural network through its input layer, obtains the output probability of the deep neural network, calculates the error between the output probability and the expected output probability, and adjusts the weighting values of the hidden layers of the deep neural network according to that error.
  • After the adjusted weighting values of the layers are obtained, the trained deep neural network model is obtained.
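  • A hedged sketch of this weight-adjustment step: cross-entropy is used as the error measure and plain gradient descent as the update rule, both of which are assumptions since the text names neither.

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - np.max(x))
    return e / e.sum()

def train_step(w_hidden, w_out, features, expected, lr=0.01):
    """One gradient-descent update on a single sample.

    `expected` is the expected output probability (here one-hot) for the
    sample's speech category; loss and optimizer are illustrative choices.
    """
    hidden = np.maximum(0.0, w_hidden @ features)  # forward pass through the hidden layer
    p = softmax(w_out @ hidden)                    # output probability of the network
    err = p - expected                             # gradient of cross-entropy w.r.t. logits
    back = (w_out.T @ err) * (hidden > 0)          # error propagated back through the ReLU
    w_out -= lr * np.outer(err, hidden)            # adjust output-layer weighting values
    w_hidden -= lr * np.outer(back, features)      # adjust hidden-layer weighting values
    return w_hidden, w_out
```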
  • the obtaining module 21 is configured to acquire real-time patient voice data. Specifically, the obtaining module 21 records the telephone voice entered by the patient through the recording device of the call center, and stores the telephone voice with the telephone number as an identifier to obtain real-time patient voice data.
  • The call center may be, but is not limited to, a hospital telephone recording platform or a remote server connected via a mobile phone app.
  • The obtaining module 21 can also actively collect patient voice data. For example, in a hospital, a nurse can use a special recording device to collect voice data from a patient and store it with the patient name (or other attribute data representing the patient's identity information, such as an ID number or social security card number) as an identifier.
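  • A small sketch of the storage step, keying each recording by telephone number or patient identifier; the file layout is an assumption, as the text only requires that the identifier be attached to the stored voice data.

```python
from pathlib import Path

def store_recording(audio_bytes: bytes, identifier: str, root: str = "recordings") -> Path:
    """Store one recording named by its identifier (phone number, patient ID, ...)."""
    out_dir = Path(root)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{identifier}.wav"
    path.write_bytes(audio_bytes)  # audio_bytes is assumed to be complete WAV data
    return path
```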
  • the data processing module 22 is configured to perform data processing on the patient voice data. Specifically, the data processing module 22 performs front-end processing on the acquired patient voice data, where the front-end processing includes noise reduction and endpoint detection. Further, the data processing module 22 further performs feature value extraction and selection of the speech signal on the patient speech data processed in the previous stage.
  • the endpoint detection is used to determine whether the patient voice data to be processed is valid voice. If it is not valid voice, the voice data is not processed, thereby improving the efficiency of the overall system.
  • The feature values that the data processing module 22 needs to extract include time domain feature parameters and frequency domain feature parameters. The time domain feature parameters include short-time average energy, short-time average amplitude, short-time average zero-crossing rate, formants, fundamental frequency, and the like; the frequency domain feature parameters include linear prediction coefficients (LPC), linear prediction cepstral coefficients (LPCC), Mel frequency cepstrum coefficients (MFCC), and the like.
  • The fundamental frequency reflects the glottal excitation characteristics;
  • the formant reflects the characteristics of the vocal tract response;
  • LPC and LPCC simultaneously reflect the characteristics of glottal excitation and vocal tract response;
  • MFCC simulates human auditory characteristics.
  • Voices of different diseases (and different degrees of disease) will have different characteristic parameter values; therefore, the degree of the patient's disease can be initially reflected through the extraction of these feature values.
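  • A sketch of extracting a few of these parameters with the librosa library; formants and LPCC are omitted for brevity, and the frame settings and LPC order are assumptions.

```python
import librosa
import numpy as np

def extract_features(wav_path: str) -> np.ndarray:
    """Collect a handful of the time- and frequency-domain parameters named above."""
    y, sr = librosa.load(wav_path, sr=None)
    energy = librosa.feature.rms(y=y)                   # short-time energy (RMS stand-in)
    zcr = librosa.feature.zero_crossing_rate(y)         # short-time average zero-crossing rate
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # Mel frequency cepstrum coefficients
    lpc = librosa.lpc(y, order=12)                      # linear prediction coefficients
    return np.concatenate(
        [energy.mean(axis=1), zcr.mean(axis=1), mfcc.mean(axis=1), lpc]
    )
```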
  • the input module 23 sends the processed patient voice data to the input layer of the trained deep neural network model.
  • The obtaining module 21 is further configured to obtain the output state of the output layer of the deep neural network model after the processed patient voice data has been sent to the input layer of the trained deep neural network model.
  • the determining module 24 determines the category to which the patient voice data belongs according to the acquired output state.
  • In order to clearly and intuitively obtain the category to which the patient voice data belongs, the training module 20 is further configured to establish a mapping relationship table between each voice category and the expected state that each voice category outputs in the trained deep neural network model. In this way, the determining module 24 matches the acquired output state with the expected states in the mapping relationship table, obtains from the mapping relationship table the voice category corresponding to the matched expected state, and can thereby determine that the patient corresponding to the patient voice data belongs to that voice category.
  • The expected state output by each voice category in the deep neural network model is the expected probability that the voice category outputs in the trained deep neural network model. For example, if the output state obtained by inputting patient voice data into the trained deep neural network model matches the expected probability of the 'severe cold' voice category, the patient can be judged to have a severe cold, thereby providing certain data support for the follow-up doctor's diagnosis.
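  • The mapping relationship table can be as simple as a lookup from category to expected state; the one-hot vectors below are illustrative assumptions, since the text does not fix a representation.

```python
import numpy as np

# Assumed one-hot expected states for the five voice categories.
EXPECTED_STATES = {
    "severe cold":  np.array([1.0, 0.0, 0.0, 0.0, 0.0]),
    "mild cold":    np.array([0.0, 1.0, 0.0, 0.0, 0.0]),
    "severe cough": np.array([0.0, 0.0, 1.0, 0.0, 0.0]),
    "mild cough":   np.array([0.0, 0.0, 0.0, 1.0, 0.0]),
    "non-disease":  np.array([0.0, 0.0, 0.0, 0.0, 1.0]),
}

def match_category(output_state: np.ndarray) -> str:
    """Return the category whose expected state lies closest to the output state."""
    return min(EXPECTED_STATES,
               key=lambda c: float(np.linalg.norm(output_state - EXPECTED_STATES[c])))
```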
  • According to the program 200 for predicting diseases using speech proposed by the present application: firstly, a deep neural network model is trained using training data, the training data having a specific speech category, the deep neural network model having an input layer and an output layer, the output layer being able to output a state of the voice category; secondly, real-time patient voice data is acquired; then, data processing is performed on the patient voice data; next, the processed patient voice data is sent to the input layer of the trained deep neural network model; in addition, the output state of the output layer of the deep neural network model is acquired; finally, the category to which the patient voice data belongs is determined according to the acquired output state.
  • the present application also proposes a method for predicting disease using speech.
  • FIG. 4 is a flowchart of a first embodiment of the method for predicting disease using speech according to the present application.
  • the order of execution of the steps in the flowchart shown in FIG. 4 may be changed according to different requirements, and some steps may be omitted.
  • Step S401 training the deep neural network model with the training data.
  • The training data refers to the voice sample data used for training the deep neural network model, and the amount of voice sample data extracted is determined according to actual needs.
  • the number of the voice sample data is not specifically limited in this embodiment.
  • the training data has a particular speech category including a severe cold, a mild cold, a severe cough, a mild cough, and a non-disease, the state of the speech category being the probability of occurrence of the speech category.
  • the deep neural network model has an input layer and an output layer. Further, the deep neural network model further has a hidden layer. The output layer can output the status of the voice class.
  • The deep neural network includes an input layer 201, a plurality of hidden layers 202, and an output layer 203.
  • The input layer 201 is configured to calculate, according to the voice feature data input to the deep neural network, the output values passed to the lowest hidden layer.
  • The voice feature data refers to the feature data extracted from the voice in the training data.
  • The hidden layer 202 is configured to perform a weighted summation of the input values from the layer below according to the weighting values of the current layer, and to calculate the output values passed to the layer above.
  • The output layer 203 is configured to perform a weighted summation of the output values from the topmost hidden layer according to the weighting values of the layer, and to calculate an output probability from the result of the weighted summation.
  • The output probability is the output probability corresponding to the training data of the respective voice category.
  • Training data such as severe cold, mild cold, severe cough, mild cough, and non-disease are introduced into the basic deep neural network model to calculate the output probability corresponding to the training data of various speech categories.
  • y_j = w · x_j, where y_j represents the output value of the j-th training data at the current layer, w represents the weighting value of the current layer, and x_j represents the input value of the j-th training data at the current layer.
  • the application server 1 calculates a weighted summation result of the output layer by using the weighting value of the output layer 203, and then calculates an output function of the output layer by using a softmax function.
  • The softmax function is as follows: p_j = e^{x_j} / Σ_k e^{x_k}, where p_j represents the output probability of the j-th training data in the output layer, and x_j represents the weighted summation result of the j-th training data in the output layer.
  • After determining the structure of the deep neural network, the application server 1 needs to determine the weighting values of the layers of the deep neural network.
  • Specifically, the application server 1 inputs all the voice feature data into the deep neural network through its input layer, obtains the output probability of the deep neural network, calculates the error between the output probability and the expected output probability, and adjusts the weighting values of the hidden layers of the deep neural network according to that error.
  • After obtaining the adjusted weighting values of the layers, the trained deep neural network model is obtained.
  • Step S402, real-time patient voice data is acquired.
  • Specifically, the application server 1 records the telephone voice entered by the patient through the recording device of the call center, and stores the telephone voice with the telephone number as an identifier to obtain real-time patient voice data.
  • The call center may be, but is not limited to, a hospital telephone recording platform or a remote server connected via a mobile phone app.
  • The application server 1 can also actively collect patient voice data. For example, in a hospital, a nurse can use a special recording device to collect voice data from a patient and store it with the patient name (or other attribute data representing the patient's identity information, such as an ID number or social security card number) as an identifier.
  • Step S403 performing data processing on the patient voice data. Specifically, the step of performing data processing on the patient voice data is described in detail in the second embodiment (see FIG. 5) of the method for predicting disease using voice in the present application.
  • Step S404 the processed patient voice data is sent to the input layer of the trained deep neural network model.
  • Step S405 acquiring an output state of an output layer of the deep neural network model.
  • the expected state output by the respective voice categories in the deep neural network model is a desired probability that each voice category outputs in the trained deep neural network model.
  • Step S406 determining, according to the acquired output state, a category to which the patient voice data belongs.
  • In order to clearly and intuitively obtain the category to which the patient voice data belongs, before determining that category according to the acquired output state, the application server 1 also establishes a mapping relationship table between each voice category and the expected state that each voice category outputs in the trained deep neural network model. In this way, the application server 1 matches the acquired output state with the expected states in the mapping relationship table, obtains from the mapping relationship table the voice category corresponding to the matched expected state, and can thereby determine that the patient corresponding to the patient voice data belongs to that voice category.
  • For example, if the output state obtained by inputting patient speech data into the trained deep neural network model matches the expected probability of the 'severe cold' voice category in the trained deep neural network model, the patient may be determined to have a severe cold, which further provides some data support for the follow-up doctor's diagnosis.
  • According to the method for predicting disease using speech proposed by the present application: firstly, a deep neural network model is trained using training data, the training data having a specific speech category, the deep neural network model having an input layer and an output layer, the output layer being able to output a state of the voice category; secondly, real-time patient voice data is acquired; then, data processing is performed on the patient voice data; next, the processed patient voice data is sent to the input layer of the trained deep neural network model; in addition, the output state of the output layer of the deep neural network model is acquired; finally, the category to which the patient voice data belongs is determined according to the acquired output state.
  • Referring to FIG. 5, the step of performing data processing on the patient voice data includes:
  • Step S501 performing front end processing on the acquired patient voice data.
  • The front-end processing includes noise reduction and endpoint detection.
  • the endpoint detection is used to determine whether the patient voice data to be processed is valid voice. If it is not valid voice, the voice data is not processed, thereby improving the efficiency of the overall system.
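  • A hedged sketch of one common form of endpoint detection, based on short-time energy; the frame length and threshold are assumptions, and a clip with no voiced frames is reported as invalid voice and skipped.

```python
import numpy as np

def detect_endpoints(y: np.ndarray, frame_len: int = 400, threshold: float = 1e-4):
    """Return (start, end) sample indices of the voiced region, or None if the
    clip contains no valid voice and should not be processed further."""
    n_frames = len(y) // frame_len
    frames = y[: n_frames * frame_len].reshape(n_frames, frame_len)
    energy = (frames ** 2).mean(axis=1)         # short-time average energy per frame
    voiced = np.flatnonzero(energy > threshold)
    if voiced.size == 0:
        return None                             # not valid voice: skip this clip
    return voiced[0] * frame_len, (voiced[-1] + 1) * frame_len
```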
  • Step S502 performing feature value extraction and selection of the speech signal on the patient speech data processed in the previous stage.
  • The feature values that the application server 1 needs to extract include time domain feature parameters and frequency domain feature parameters. The time domain feature parameters include short-time average energy, short-time average amplitude, short-time average zero-crossing rate, formants, fundamental frequency, and the like; the frequency domain feature parameters include linear prediction coefficients (LPC), linear prediction cepstral coefficients (LPCC), Mel frequency cepstrum coefficients (MFCC), and the like.
  • The fundamental frequency reflects the glottal excitation characteristics; the formant reflects the characteristics of the vocal tract response; LPC and LPCC simultaneously reflect the characteristics of glottal excitation and vocal tract response; and MFCC simulates human auditory characteristics. Voices of different diseases (and different degrees of disease) will have different characteristic parameter values; therefore, the degree of the patient's disease can be initially reflected through the extraction of these feature values.
  • The method for predicting disease using voice proposed by the present application improves the efficiency of the overall system by performing front-end processing on the acquired patient voice data, and initially reflects the patient's degree of disease through the extraction and selection of feature values of the speech signal from the front-end-processed patient voice data.
  • The methods of the foregoing embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course can also be implemented by hardware, but in many cases the former is the better implementation.
  • Based on such an understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc), which includes a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to perform the methods described in the various embodiments of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Signal Processing (AREA)
  • Artificial Intelligence (AREA)
  • Epidemiology (AREA)
  • Public Health (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The present invention relates to a voice-based disease prediction method. The method comprises: training a deep neural network model with training data, the training data having a specific voice category and the deep neural network model comprising an input layer and an output layer; acquiring real-time patient voice data; performing data processing on the patient voice data; sending the processed patient voice data to the input layer of the trained deep neural network model; acquiring an output state of the output layer of the deep neural network model; and determining the category of the patient voice data according to the acquired output state. The present invention further relates to an application server. According to the voice-based disease prediction method and the application server provided by the present invention, a preliminary diagnosis can be made quickly for a patient on the basis of the patient's voice, which provides certain data support and references for the doctor's subsequent formal diagnosis and greatly helps doctors and patients.
PCT/CN2018/089428 2017-10-23 2018-06-01 Voice-based disease prediction method, application server and computer-readable storage medium WO2019080502A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710995691.7 2017-10-23
CN201710995691.7A CN108053841A (zh) Method for disease prediction using voice and application server

Publications (1)

Publication Number Publication Date
WO2019080502A1 (fr)

Family

ID=62119669

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/089428 WO2019080502A1 (fr) 2017-10-23 2018-06-01 Voice-based disease prediction method, application server and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN108053841A (fr)
WO (1) WO2019080502A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022167243A1 (fr) * 2021-02-05 2022-08-11 Novoic Ltd Speech processing method for identifying data representations for use in monitoring or diagnosing a health condition

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108053841A (zh) * 2017-10-23 2018-05-18 平安科技(深圳)有限公司 Method for disease prediction using voice and application server
CN108518817A (zh) * 2018-04-10 2018-09-11 珠海格力电器股份有限公司 Autonomous adjustment control method and device, and air conditioning system
CN109431507A (zh) * 2018-10-26 2019-03-08 平安科技(深圳)有限公司 Cough disease identification method and device based on deep learning
MX2021014721A (es) * 2019-05-30 2022-04-06 Insurance Services Office Inc Systems and methods for machine learning of voice attributes
CN110473616B (zh) * 2019-08-16 2022-08-23 北京声智科技有限公司 Voice signal processing method, apparatus and system
CN112259126B (zh) * 2020-09-24 2023-06-20 广州大学 Autism voice feature aided recognition robot and method
CN116530944B (zh) * 2023-07-06 2023-10-20 荣耀终端有限公司 Sound processing method and electronic device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102342858A (zh) * 2010-08-06 2012-02-08 上海中医药大学 Traditional Chinese medicine sound diagnosis collection and analysis system
WO2016192612A1 (fr) * 2015-06-02 2016-12-08 陈宽 Deep learning-based method for analyzing medical treatment data, and associated intelligent analyzer
CN106710599A (zh) * 2016-12-02 2017-05-24 深圳撒哈拉数据科技有限公司 Specific sound source detection method and system based on deep neural network
CN106709254A (zh) * 2016-12-29 2017-05-24 天津中科智能识别产业技术研究院有限公司 Medical diagnosis robot system
CN108053841A (zh) * 2017-10-23 2018-05-18 平安科技(深圳)有限公司 Method for disease prediction using voice and application server

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101739869B (zh) * 2008-11-19 2012-03-28 中国科学院自动化研究所 Pronunciation evaluation and diagnosis system based on prior knowledge
CN110353685B (zh) * 2012-03-29 2022-03-04 昆士兰大学 Method and apparatus for processing patient sounds
CN103578470B (zh) * 2012-08-09 2019-10-18 科大讯飞股份有限公司 Telephone recording data processing method and system
CN104347066B (zh) * 2013-08-09 2019-11-12 上海掌门科技有限公司 Infant cry recognition method and system based on deep neural network
US9687208B2 (en) * 2015-06-03 2017-06-27 iMEDI PLUS Inc. Method and system for recognizing physiological sound
CN105869658B (zh) * 2016-04-01 2019-08-27 金陵科技学院 Voice endpoint detection method using nonlinear features
CN105869627A (zh) * 2016-04-28 2016-08-17 成都之达科技有限公司 Voice processing method based on Internet of Vehicles
CN106778014B (zh) * 2016-12-29 2020-06-16 浙江大学 Disease risk prediction modeling method based on recurrent neural network
CN107068167A (zh) * 2017-03-13 2017-08-18 广东顺德中山大学卡内基梅隆大学国际联合研究院 Speaker cold symptom recognition method fusing multiple end-to-end neural network structures


Also Published As

Publication number Publication date
CN108053841A (zh) 2018-05-18

Similar Documents

Publication Publication Date Title
WO2019080502A1 (fr) Voice-based disease prediction method, application server and computer-readable storage medium
WO2018149077A1 (fr) Voiceprint recognition method, device, storage medium and background server
US20180261236A1 (en) Speaker recognition method and apparatus, computer device and computer-readable medium
US20200380957A1 (en) Systems and Methods for Machine Learning of Voice Attributes
EP2810277B1 (fr) Vérification du locuteur dans un système de surveillance sanitaire
KR20190022432A (ko) Electronic device, identity verification method, system, and computer-readable storage medium
US10270736B2 (en) Account adding method, terminal, server, and computer storage medium
WO2019136909A1 (fr) Deep learning-based voice liveness detection method, server and storage medium
US20090326937A1 (en) Using personalized health information to improve speech recognition
CN107038336A (zh) Method and device for automatically generating electronic medical records
CN110457432A (zh) Interview scoring method, apparatus, device and storage medium
CN112562691A (zh) Voiceprint recognition method and apparatus, computer device and storage medium
CN109299227B (zh) Information query method and apparatus based on voice recognition
WO2021159755A1 (fr) Intelligent diagnosis and treatment data processing method, device, apparatus and storage medium
US20230377602A1 (en) Health-related information generation and storage
CN111933291A (zh) Medical information recommendation apparatus, method, system, device and readable storage medium
AU2021333916A1 (en) Computerized decision support tool and medical device for respiratory condition monitoring and care
WO2020233381A1 (fr) Voice recognition-based service query method and apparatus, and computer device
WO2022205249A1 (fr) Audio feature compensation method, audio recognition method and related product
CN110767282A (zh) Health record generation method and apparatus, and computer-readable storage medium
CN114141251A (zh) Voice recognition method, voice recognition apparatus and electronic device
CN112614584A (zh) Depression auxiliary diagnosis method, system and medium based on voice and text transcription
CN111967235A (zh) Form processing method and apparatus, computer device and storage medium
CN114201580A (zh) Data processing method and apparatus, electronic device and computer-readable storage medium
CN112927413A (zh) Medical registration method, apparatus, device and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18871385

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 24.09.2020)

122 Ep: pct application non-entry in european phase

Ref document number: 18871385

Country of ref document: EP

Kind code of ref document: A1