WO2019080502A1 - Voice-based disease prediction method, application server, and computer-readable storage medium - Google Patents
Voice-based disease prediction method, application server, and computer-readable storage medium
- Publication number
- WO2019080502A1 (application PCT/CN2018/089428)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- voice
- patient
- category
- neural network
- voice data
- Prior art date
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/30—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
Definitions
- the present application relates to the field of disease prediction, and in particular, to a method for predicting disease using voice, an application server, and a computer readable storage medium.
- the present application provides a method for predicting disease using voice, an application server, and a computer-readable storage medium, which can conveniently perform a preliminary diagnosis of a patient through the patient's voice before the patient receives formal treatment, thereby providing a certain amount of data support and reference for the doctor's subsequent formal diagnosis, which greatly facilitates doctors and patients.
- a first aspect of the present application provides a method for predicting disease using voice, the method being applied to an application server, the method comprising:
- training a deep neural network model with training data, the training data having a specific speech category, the deep neural network model having an input layer and an output layer, the output layer outputting a state of the speech category; acquiring real-time patient voice data; performing data processing on the patient voice data; sending the processed patient voice data to the input layer of the trained deep neural network model; acquiring an output state of the output layer of the deep neural network model; and determining, according to the acquired output state, the category to which the patient voice data belongs.
- a second aspect of the present application provides an application server, where the application server includes a memory, a processor, and a program, runnable on the processor, for performing disease prediction using voice; when the program is executed by the processor, the following steps are implemented:
- training a deep neural network model with training data, the training data having a specific speech category, the deep neural network model having an input layer and an output layer, the output layer outputting a state of the speech category; acquiring real-time patient voice data; performing data processing on the patient voice data; sending the processed patient voice data to the input layer of the trained deep neural network model; acquiring an output state of the output layer of the deep neural network model; and determining, according to the acquired output state, the category to which the patient voice data belongs.
- a third aspect of the present application provides a computer-readable storage medium storing a program for performing disease prediction using voice, the program being executable by at least one processor to cause the at least one processor to perform the following steps:
- training a deep neural network model with training data, the training data having a specific speech category, the deep neural network model having an input layer and an output layer, the output layer outputting a state of the speech category; acquiring real-time patient voice data; performing data processing on the patient voice data; sending the processed patient voice data to the input layer of the trained deep neural network model; acquiring an output state of the output layer of the deep neural network model; and determining, according to the acquired output state, the category to which the patient voice data belongs.
- according to the method for predicting disease using voice, the application server, and the computer-readable storage medium proposed by the present application: first, a deep neural network model is trained with training data, the training data having a specific voice category, the deep neural network model having an input layer and an output layer, the output layer outputting a state of the voice category; second, real-time patient voice data is acquired; then, data processing is performed on the patient voice data; next, the processed patient voice data is sent to the input layer of the trained deep neural network model; further, the output state of the output layer of the deep neural network model is acquired; finally, the category to which the patient voice data belongs is determined according to the acquired output state.
- FIG. 1 is a schematic diagram of an optional hardware architecture of an application server;
- FIG. 2 is a program block diagram of a first embodiment of a program for predicting disease using speech according to the present application;
- FIG. 3 is a structural diagram of a deep neural network model in a preferred embodiment of the present application;
- FIG. 4 is a flowchart of a first embodiment of a method for predicting disease using speech;
- FIG. 5 is a flowchart of a second embodiment of a method for predicting disease using voice.
- Reference numerals: application server 1; memory 11; processor 12; network interface 13; program 200 for disease prediction using speech; training module 20; acquisition module 21; data processing module 22; input module 23; judgment module 24.
- referring to FIG. 1, it is a schematic diagram of an optional hardware architecture of the application server 1.
- the application server 1 may be a computing device such as a rack server, a blade server, a tower server, or a cabinet server.
- the application server 1 may be a stand-alone server or a server cluster composed of multiple servers.
- the application server 1 may include, but is not limited to, the memory 11, the processor 12, and the network interface 13, which are communicably connected to each other through a system bus.
- the application server 1 connects to the network through the network interface 13 to obtain information.
- the network may be a wireless or wired network such as an intranet, the Internet, a Global System for Mobile communications (GSM) network, a Wideband Code Division Multiple Access (WCDMA) network, a 4G network, a 5G network, Bluetooth, Wi-Fi, or a telephone network.
- Figure 1 only shows the application server 1 with components 11-13, but it should be understood that not all illustrated components may be implemented, and more or fewer components may be implemented instead.
- the memory 11 includes at least one type of readable storage medium, including a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory), a random access memory (RAM), a static random access memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disc, and the like.
- the memory 11 may be an internal storage unit of the application server 1, such as a hard disk or memory of the application server 1.
- the memory 11 may also be an external storage device of the application server 1, such as a plug-in hard disk equipped on the application server 1, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, or the like.
- the memory 11 can also include both the internal storage unit of the application server 1 and its external storage device.
- the memory 11 is generally used to store an operating system installed in the application server 1 and various types of application software, such as program codes of the program 200 for performing disease prediction using voice. Further, the memory 11 can also be used to temporarily store various types of data that have been output or are to be output.
- the processor 12 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments.
- the processor 12 is typically used to control the overall operation of the application server 1, such as performing data interaction or communication related control and processing, and the like.
- the processor 12 is configured to run program code or process data stored in the memory 11, such as running the program 200 for performing disease prediction using voice.
- the network interface 13 may comprise a wireless network interface or a wired network interface, which is typically used to establish a communication connection between the application server 1 and other electronic devices.
- a program 200 for performing disease prediction using voice is installed and run in the application server 1.
- the application server 1 trains a deep neural network model by using training data, the training data having a specific speech category, the deep neural network model having an input layer and an output layer, the output layer outputting a state of the voice category; acquires real-time patient voice data; performs data processing on the patient voice data; sends the processed patient voice data to the input layer of the trained deep neural network model; acquires the output state of the output layer of the deep neural network model; and determines, according to the acquired output state, the category to which the patient voice data belongs.
- the present application proposes a procedure 200 for predicting disease using speech.
- the program 200 for performing disease prediction using voice includes a series of computer program instructions stored in the memory 11; when the computer program instructions are executed by the processor 12, the disease prediction operations of the embodiments of the present application can be implemented.
- the program 200 for predicting disease using speech may be divided into one or more modules based on the particular operations implemented by the various portions of the computer program instructions. For example, in FIG. 2, the program 200 for predicting disease using speech may be divided into a training module 20, an acquisition module 21, a data processing module 22, an input module 23, and a determination module 24, where:
- the training module 20 is configured to train the deep neural network model with the training data.
- the training data refers to the voice sample data used for training the deep neural network model; the number of voice samples is chosen according to actual needs and is not specifically limited in this embodiment.
- the training data has specific speech categories, including severe cold, mild cold, severe cough, mild cough, and non-disease; the state of a speech category is the probability of occurrence of that category.
- the deep neural network model has an input layer and an output layer. Further, the deep neural network model further has a hidden layer. The output layer can output the status of the voice class.
- the deep neural network includes an input layer 201, a plurality of hidden layers 202, and an output layer 203.
- the input layer 201 is configured to calculate, according to the voice feature data input to the deep neural network, the input values of the lowest hidden layer units.
- the voice feature data refers to voice data extracted from the training data.
- the hidden layer 202 is configured to perform a weighted summation on the input values from the layer below it, according to the weighting value of the current layer, and to calculate the output value passed to the layer above it.
- the output layer 203 is configured to perform a weighted summation on the output values from the uppermost hidden layer, according to the weighting value of the output layer, and to calculate an output probability from the result of the weighted summation.
- the output probability is an output probability corresponding to the training data of the voice category.
- Training data such as severe cold, mild cold, severe cough, mild cough, and non-disease are introduced into the basic deep neural network model to calculate the output probability corresponding to the training data of various speech categories.
- y_j = w·x_j, where y_j represents the output value of the jth training data at the current layer, w represents the weighting value of the current layer, and x_j represents the input value of the jth training data at the current layer.
- the output function of the output layer is calculated by using a softmax function.
- the softmax function is as follows: p_j = exp(x_j) / Σ_k exp(x_k), where p_j represents the output probability of the jth training data in the output layer and x_j represents the weighted summation result of the jth training data in the output layer.
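The per-layer weighted summation y_j = w·x_j and the softmax output described above can be sketched as follows (a minimal illustration: the scores for the five voice categories and the layer weighting value are made-up assumptions, since the application does not specify the model's weights or sizes):

```python
import math

def layer_forward(w, inputs):
    # y_j = w * x_j: weighted value for each unit, with w the layer's weighting value
    return [w * x for x in inputs]

def softmax(xs):
    # p_j = exp(x_j) / sum_k exp(x_k); subtract the max for numerical stability
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical weighted-summation results at the output layer for the five
# voice categories (severe cold, mild cold, severe cough, mild cough, non-disease)
scores = layer_forward(0.5, [4.0, 2.0, 1.0, 0.4, 0.2])
probs = softmax(scores)  # output probabilities summing to 1
```

The softmax guarantees that the five output states form a probability distribution, matching the patent's description of the output state as a probability of each speech category.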
- after the training module 20 determines the structure of the deep neural network, it is necessary to determine the weighting values of the layers of the deep neural network.
- the training module 20 inputs all the voice feature data into the deep neural network through its input layer, obtains the output probability of the deep neural network, calculates the error between that output probability and the expected output probability, and adjusts the weighting values of the hidden layers of the deep neural network according to this error.
- after the adjusted weighting values of the layers are obtained, the trained deep neural network model is obtained.
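The error-driven weight adjustment just described can be illustrated with a toy one-weight trainer. This is a sketch only: the application does not specify the update rule, so a simple least-mean-squares step is assumed, and the samples are hypothetical.

```python
def train_weight(samples, lr=0.1, epochs=200):
    """Toy single-weight trainer: nudge w so that the output y = w * x
    approaches the expected output, mirroring the patent's idea of
    adjusting weighting values according to the output error."""
    w = 0.0
    for _ in range(epochs):
        for x, expected in samples:
            y = w * x
            error = expected - y   # error between output and expected output
            w += lr * error * x    # adjust the weighting value toward the target
    return w

# Hypothetical samples whose underlying relation is y = 2x
w = train_weight([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
```

After training, `w` converges close to 2.0; a real deep model would apply the same error-driven idea via backpropagation across all hidden layers.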
- the obtaining module 21 is configured to acquire real-time patient voice data. Specifically, the obtaining module 21 records the telephone voice entered by the patient through the recording device of the call center, and stores the telephone voice with the telephone number as an identifier to obtain real-time patient voice data.
- the call center can be, but is not limited to, a telephone recording platform of a hospital and a remote server connected by a mobile phone app.
- the obtaining module 21 can also actively collect patient voice data. For example, in a hospital, a nurse can use a dedicated recording device to collect voice data from a patient and store it with the patient name (or other attribute data representing the patient's identity, such as an ID number or social security card number) as an identifier.
- the data processing module 22 is configured to perform data processing on the patient voice data. Specifically, the data processing module 22 performs front-end processing on the acquired patient voice data, where the front-end processing includes noise reduction and endpoint detection. Further, the data processing module 22 further performs feature value extraction and selection of the speech signal on the patient speech data processed in the previous stage.
- the endpoint detection is used to determine whether the patient voice data to be processed is valid voice. If it is not valid voice, the voice data is not processed, thereby improving the efficiency of the overall system.
- the feature values that the data processing module 22 needs to extract include time-domain feature parameters and frequency-domain feature parameters. The time-domain feature parameters include the short-time average energy, the short-time average amplitude, the short-time average zero-crossing rate, the formant, and the fundamental frequency; the frequency-domain feature parameters include the linear prediction coefficient (LPC), the linear prediction cepstral coefficient (LPCC), the Mel-frequency cepstral coefficient (MFCC), and the like.
- the fundamental frequency reflects the glottal excitation characteristics;
- the formant reflects the characteristics of the vocal tract response;
- LPC and LPCC reflect the characteristics of both the glottal excitation and the vocal tract response;
- MFCC models the characteristics of human hearing.
- Voices of different diseases (degrees) will have different characteristic parameter values. Therefore, the degree of disease of the patient can be initially reflected by the extraction of the eigenvalues.
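Two of the time-domain feature parameters named above, the short-time average energy and the short-time average zero-crossing rate, can be computed per frame roughly as follows (the frame length and the synthetic test tone are illustrative assumptions, not values from the application):

```python
import math

def short_time_energy(frame):
    # Short-time average energy of one speech frame
    return sum(s * s for s in frame) / len(frame)

def zero_crossing_rate(frame):
    # Fraction of adjacent sample pairs whose signs differ
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

# A hypothetical 200-sample frame of a 100 Hz tone sampled at 8 kHz
frame = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(200)]
energy = short_time_energy(frame)
zcr = zero_crossing_rate(frame)
```

A hoarse or coughing voice would shift these parameter values relative to a healthy voice, which is why the patent uses them as inputs to the disease-prediction model.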
- the input module 23 sends the processed patient voice data to the input layer of the trained deep neural network model.
- the obtaining module 21 is further configured to obtain the output state of the output layer of the deep neural network model after the processed patient voice data has been sent to the input layer of the trained deep neural network model.
- the determining module 24 determines the category to which the patient voice data belongs according to the acquired output state.
- the training module 20 is further configured to establish a mapping relationship table between each voice category and the expected state that the voice category outputs in the trained deep neural network model. In this way, the determining module 24 matches the acquired output state against the expected states in the mapping relationship table, obtains the corresponding voice category from the table, and can thereby determine that the patient corresponding to the patient voice data belongs to the voice category corresponding to the matched expected state.
- the expected state output by each voice category in the deep neural network model is the desired probability that the voice category outputs in the trained model. For example, if the output state obtained by inputting patient voice data into the trained deep neural network model matches the expected probability of the "severe cold" voice category in the trained model, the patient can be judged to have a severe cold, thereby providing certain data support for the subsequent doctor's diagnosis.
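The matching of an acquired output state against the expected states in the mapping relationship table can be sketched as below. The one-hot expected states and the category names are illustrative assumptions; the application does not fix the exact representation of the mapping table.

```python
CATEGORIES = ["severe cold", "mild cold", "severe cough", "mild cough", "non-disease"]

# Hypothetical mapping table: each category's expected (one-hot) output state
# in the trained model
MAPPING = {c: [1.0 if i == j else 0.0 for j in range(len(CATEGORIES))]
           for i, c in enumerate(CATEGORIES)}

def match_category(output_state):
    """Return the voice category whose expected state is nearest the
    acquired output state (squared-distance match against the table)."""
    def dist(expected):
        return sum((o - e) ** 2 for o, e in zip(output_state, expected))
    return min(MAPPING, key=lambda c: dist(MAPPING[c]))

# An acquired output state dominated by the first category
result = match_category([0.85, 0.05, 0.04, 0.03, 0.03])
```

With one-hot expected states this reduces to picking the highest-probability category, but a distance match also covers mapping tables whose expected states are arbitrary probability vectors.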
- according to the program 200 for predicting diseases using speech proposed by the present application: first, a deep neural network model is trained using training data, the training data having a specific speech category, the deep neural network model having an input layer and an output layer, the output layer outputting a state of the voice category; second, real-time patient voice data is acquired; then, data processing is performed on the patient voice data; next, the processed patient voice data is sent to the input layer of the trained deep neural network model; further, the output state of the output layer of the deep neural network model is acquired; finally, the category to which the patient voice data belongs is determined according to the acquired output state.
- the present application also proposes a method for predicting disease using speech.
- referring to FIG. 4, it is a flowchart of a first embodiment of a method for predicting disease using speech according to the present application.
- the order of execution of the steps in the flowchart shown in FIG. 4 may be changed according to different requirements, and some steps may be omitted.
- Step S401 training the deep neural network model with the training data.
- the training data refers to the voice sample data used for training the deep neural network model; the number of voice samples is chosen according to actual needs and is not specifically limited in this embodiment.
- the training data has specific speech categories, including severe cold, mild cold, severe cough, mild cough, and non-disease; the state of a speech category is the probability of occurrence of that category.
- the deep neural network model has an input layer and an output layer. Further, the deep neural network model further has a hidden layer. The output layer can output the status of the voice class.
- the deep neural network includes an input layer 201, a plurality of hidden layers 202, and an output layer 203.
- the input layer 201 is configured to calculate, according to the voice feature data input to the deep neural network, the input values of the lowest hidden layer units.
- the voice feature data refers to voice data extracted from the training data.
- the hidden layer 202 is configured to perform a weighted summation on the input values from the layer below it, according to the weighting value of the current layer, and to calculate the output value passed to the layer above it.
- the output layer 203 is configured to perform a weighted summation on the output values from the uppermost hidden layer, according to the weighting value of the output layer, and to calculate an output probability from the result of the weighted summation.
- the output probability is an output probability corresponding to the training data of the voice category.
- Training data such as severe cold, mild cold, severe cough, mild cough, and non-disease are introduced into the basic deep neural network model to calculate the output probability corresponding to the training data of various speech categories.
- y_j = w·x_j, where y_j represents the output value of the jth training data at the current layer, w represents the weighting value of the current layer, and x_j represents the input value of the jth training data at the current layer.
- the application server 1 calculates a weighted summation result of the output layer by using the weighting value of the output layer 203, and then calculates an output function of the output layer by using a softmax function.
- the softmax function is as follows: p_j = exp(x_j) / Σ_k exp(x_k), where p_j represents the output probability of the jth training data in the output layer and x_j represents the weighted summation result of the jth training data in the output layer.
- after determining the structure of the deep neural network, the application server 1 needs to determine the weighting values of the layers of the deep neural network.
- the application server 1 inputs all the voice feature data into the deep neural network through its input layer, obtains the output probability of the deep neural network, calculates the error between that output probability and the expected output probability, and adjusts the weighting values of the hidden layers of the deep neural network according to this error.
- after obtaining the adjusted weighting values of the layers, the trained deep neural network model is obtained.
- step S402 real-time patient voice data is acquired.
- the application server 1 records the telephone voice entered by the patient through the recording device of the call center, and stores the telephone voice with the telephone number as an identifier to obtain real-time patient voice data.
- the call center can be, but is not limited to, a telephone recording platform of a hospital and a remote server connected by a mobile phone app.
- the application server 1 can also actively collect patient voice data. For example, in a hospital, a nurse can use a dedicated recording device to collect voice data from a patient and store it with the patient name (or other attribute data representing the patient's identity, such as an ID number or social security card number) as an identifier.
- Step S403 performing data processing on the patient voice data. Specifically, the step of performing data processing on the patient voice data is described in detail in the second embodiment (see FIG. 5) of the method for predicting disease using voice in the present application.
- Step S404 the processed patient voice data is sent to the input layer of the trained deep neural network model.
- Step S405 acquiring an output state of an output layer of the deep neural network model.
- the expected state output by the respective voice categories in the deep neural network model is a desired probability that each voice category outputs in the trained deep neural network model.
- Step S406 determining, according to the acquired output state, a category to which the patient voice data belongs.
- in order to clearly and intuitively obtain the category to which the patient voice data belongs, before determining that category according to the acquired output state, the application server 1 also establishes a mapping relationship table between each voice category and the expected state that the voice category outputs in the trained deep neural network model. In this way, the application server 1 matches the acquired output state against the expected states in the mapping relationship table, obtains from the table the voice category corresponding to the matched expected state, and can thereby determine that the patient corresponding to the patient voice data belongs to that voice category.
- if the output state obtained by inputting patient voice data into the trained deep neural network model matches the expected probability of the "severe cold" voice category in the trained deep neural network model, the patient may be determined to have a severe cold, which further provides some data support for the subsequent doctor's diagnosis.
- according to the method for predicting disease using speech proposed by the present application: first, a deep neural network model is trained using training data, the training data having a specific speech category, the deep neural network model having an input layer and an output layer, the output layer outputting a state of the voice category; second, real-time patient voice data is acquired; then, data processing is performed on the patient voice data; next, the processed patient voice data is sent to the input layer of the trained deep neural network model; further, the output state of the output layer of the deep neural network model is acquired; finally, the category to which the patient voice data belongs is determined according to the acquired output state.
- the step of performing data processing on the patient voice data includes:
- Step S501 performing front end processing on the acquired patient voice data.
- the front-end processing includes noise reduction and endpoint detection.
- the endpoint detection is used to determine whether the patient voice data to be processed is valid voice. If it is not valid voice, the voice data is not processed, thereby improving the efficiency of the overall system.
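A minimal energy-based endpoint detector in the spirit of the step above might look like this (the frame length, energy threshold, and synthetic signal are illustrative assumptions; the application does not specify the detection algorithm):

```python
import math

def detect_endpoints(samples, frame_len=160, threshold=0.01):
    """Simple energy-based endpoint detection: return the (start, end)
    frame indices of the voiced region, or None when no frame exceeds
    the energy threshold (i.e. the recording carries no valid voice)."""
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    energies = [sum(s * s for s in f) / frame_len for f in frames]
    voiced = [i for i, e in enumerate(energies) if e > threshold]
    if not voiced:
        return None  # not valid voice: skip further processing
    return voiced[0], voiced[-1]

# Hypothetical recording: silence, then a 440 Hz tone, then silence (8 kHz)
silence = [0.0] * 1600
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(1600)]
span = detect_endpoints(silence + tone + silence)
```

Returning `None` for an all-silence input mirrors the patent's point: invalid voice is simply not processed, which improves the efficiency of the overall system.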
- Step S502 performing feature value extraction and selection of the speech signal on the patient speech data processed in the previous stage.
- the feature values that the application server 1 needs to extract include time-domain feature parameters and frequency-domain feature parameters. The time-domain feature parameters include the short-time average energy, the short-time average amplitude, the short-time average zero-crossing rate, the formant, and the fundamental frequency; the frequency-domain feature parameters include the linear prediction coefficient (LPC), the linear prediction cepstral coefficient (LPCC), the Mel-frequency cepstral coefficient (MFCC), and the like.
- the fundamental frequency reflects the glottal excitation characteristics;
- the formant reflects the characteristics of the vocal tract response;
- LPC and LPCC reflect the characteristics of both the glottal excitation and the vocal tract response;
- MFCC models the characteristics of human hearing. The voices of different diseases (and different degrees of disease) have different characteristic parameter values; therefore, the degree of the patient's disease can be initially reflected by the extraction of the eigenvalues.
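The fundamental frequency mentioned above can be estimated from a voiced frame, for example, by autocorrelation peak-picking. This is one common estimator, offered as a sketch; the application does not prescribe a particular method, and the pitch-lag search range below is an assumption.

```python
import math

def fundamental_frequency(frame, sample_rate, f_min=50, f_max=500):
    """Estimate the fundamental frequency (glottal excitation rate) of a
    voiced frame by picking the autocorrelation peak within a plausible
    pitch-lag range."""
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    best_lag, best_corr = lag_min, float("-inf")
    for lag in range(lag_min, min(lag_max, len(frame) - 1) + 1):
        # Autocorrelation of the frame with a lagged copy of itself
        corr = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if corr > best_corr:
            best_corr, best_lag = corr, lag
    return sample_rate / best_lag

# Hypothetical 400-sample frame of a 200 Hz tone sampled at 8 kHz
sr = 8000
frame = [math.sin(2 * math.pi * 200 * n / sr) for n in range(400)]
f0 = fundamental_frequency(frame, sr)
```

For this synthetic tone the estimator recovers 200 Hz; for real patient speech, a hoarse voice would yield an irregular or shifted fundamental frequency, which is exactly the kind of deviation the extracted features are meant to capture.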
- the method for predicting disease using voice proposed by the present application improves the efficiency of the overall system by performing front-end processing on the acquired patient voice data, and initially reflects the patient's degree of disease through the extraction and selection of the eigenvalues of the speech signal from the front-end-processed patient voice data.
- the foregoing embodiment method can be implemented by means of software plus a necessary general hardware platform, and of course can also be implemented through hardware, but in many cases the former is the better implementation. Based on such understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc), which includes a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the methods described in the various embodiments of the present application.
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Multimedia (AREA)
- Computational Linguistics (AREA)
- Acoustics & Sound (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Signal Processing (AREA)
- Artificial Intelligence (AREA)
- Epidemiology (AREA)
- Public Health (AREA)
- General Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Computation (AREA)
- Medical Treatment And Welfare Office Work (AREA)
Abstract
The present invention relates to a voice-based disease prediction method. The method comprises: training a deep neural network model with training data, the training data carrying a specific voice category and the deep neural network model comprising an input layer and an output layer; acquiring patient voice data in real time; performing data processing on the patient voice data; sending the processed patient voice data to the input layer of the trained deep neural network model; obtaining an output state from the output layer of the deep neural network model; and determining the category of the patient voice data according to the obtained output state. The present invention further relates to an application server. With the voice-based disease prediction method and application server provided by the present invention, an initial diagnosis can be made quickly for a patient on the basis of the patient's voice, which provides data support and reference for the doctor's subsequent formal diagnosis and is of great help to both doctors and patients.
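The inference path the abstract describes — processed voice data fed to the input layer, an output state read from the output layer, then a mapping from output state to voice category — can be sketched as a single forward pass. The layer sizes, the random weights, and the category names below are illustrative assumptions, not details from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    """Numerically stable softmax over the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Toy network: 13 acoustic features -> 32 hidden units -> 3 output states.
W1, b1 = rng.normal(size=(13, 32)), np.zeros(32)
W2, b2 = rng.normal(size=(32, 3)), np.zeros(3)

# Assumed mapping from output state (argmax index) to voice category.
CATEGORIES = {0: "normal", 1: "cough", 2: "hoarse"}

def predict_category(features):
    h = np.tanh(features @ W1 + b1)           # input layer -> hidden layer
    probs = softmax(h @ W2 + b2)              # hidden layer -> output state
    return CATEGORIES[int(np.argmax(probs))]  # output state -> voice category

label = predict_category(rng.normal(size=13))
```

In the patent's scheme the weights would come from training on voice data labelled with specific voice categories; here they are random, so the predicted label is arbitrary, but the data flow matches the description.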
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710995691.7 | 2017-10-23 | ||
CN201710995691.7A CN108053841A (zh) | 2017-10-23 | 2017-10-23 | Method for disease prediction using voice and application server
Publications (1)
Publication Number | Publication Date |
---|---|
WO2019080502A1 true WO2019080502A1 (fr) | 2019-05-02 |
Family
ID=62119669
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2018/089428 WO2019080502A1 (fr) | 2018-06-01 | Voice-based disease prediction method, application server, and computer-readable storage medium
Country Status (2)
Country | Link |
---|---|
CN (1) | CN108053841A (fr) |
WO (1) | WO2019080502A1 (fr) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022167243A1 (fr) * | 2021-02-05 | 2022-08-11 | Novoic Ltd. | Speech processing method for identifying data representations for use in monitoring or diagnosing a health condition |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108053841A (zh) * | 2017-10-23 | 2018-05-18 | 平安科技(深圳)有限公司 | Method for disease prediction using voice and application server |
CN108518817A (zh) * | 2018-04-10 | 2018-09-11 | 珠海格力电器股份有限公司 | Autonomous adjustment control method and device, and air conditioning system |
CN109431507A (zh) * | 2018-10-26 | 2019-03-08 | 平安科技(深圳)有限公司 | Deep-learning-based cough disease recognition method and device |
MX2021014721A (es) * | 2019-05-30 | 2022-04-06 | Insurance Services Office Inc | Systems and methods for machine learning of voice attributes |
CN110473616B (zh) * | 2019-08-16 | 2022-08-23 | 北京声智科技有限公司 | Voice signal processing method, device, and system |
CN112259126B (zh) * | 2020-09-24 | 2023-06-20 | 广州大学 | Autism voice feature auxiliary recognition robot and method |
CN116530944B (zh) * | 2023-07-06 | 2023-10-20 | 荣耀终端有限公司 | Sound processing method and electronic device |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102342858A (zh) * | 2010-08-06 | 2012-02-08 | 上海中医药大学 | Traditional Chinese medicine sound diagnosis collection and analysis system |
WO2016192612A1 (fr) * | 2015-06-02 | 2016-12-08 | 陈宽 | Deep-learning-based medical treatment data analysis method and intelligent analyzer thereof |
CN106710599A (zh) * | 2016-12-02 | 2017-05-24 | 深圳撒哈拉数据科技有限公司 | Deep-neural-network-based specific sound source detection method and system |
CN106709254A (zh) * | 2016-12-29 | 2017-05-24 | 天津中科智能识别产业技术研究院有限公司 | Medical diagnosis robot system |
CN108053841A (zh) * | 2017-10-23 | 2018-05-18 | 平安科技(深圳)有限公司 | Method for disease prediction using voice and application server |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101739869B (zh) * | 2008-11-19 | 2012-03-28 | 中国科学院自动化研究所 | Prior-knowledge-based pronunciation evaluation and diagnosis system |
CN110353685B (zh) * | 2012-03-29 | 2022-03-04 | 昆士兰大学 | Method and apparatus for processing patient sounds |
CN103578470B (zh) * | 2012-08-09 | 2019-10-18 | 科大讯飞股份有限公司 | Method and system for processing telephone recording data |
CN104347066B (zh) * | 2013-08-09 | 2019-11-12 | 上海掌门科技有限公司 | Deep-neural-network-based infant cry recognition method and system |
US9687208B2 (en) * | 2015-06-03 | 2017-06-27 | iMEDI PLUS Inc. | Method and system for recognizing physiological sound |
CN105869658B (zh) * | 2016-04-01 | 2019-08-27 | 金陵科技学院 | Voice endpoint detection method using nonlinear features |
CN105869627A (zh) * | 2016-04-28 | 2016-08-17 | 成都之达科技有限公司 | Voice processing method based on the Internet of Vehicles |
CN106778014B (zh) * | 2016-12-29 | 2020-06-16 | 浙江大学 | Disease risk prediction modeling method based on a recurrent neural network |
CN107068167A (zh) * | 2017-03-13 | 2017-08-18 | 广东顺德中山大学卡内基梅隆大学国际联合研究院 | Speaker cold-symptom recognition method fusing multiple end-to-end neural network structures |
- 2017-10-23: CN application CN201710995691.7A filed; patent CN108053841A (zh), status Pending
- 2018-06-01: WO application PCT/CN2018/089428 filed; publication WO2019080502A1 (fr), Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN108053841A (zh) | 2018-05-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2019080502A1 (fr) | Voice-based disease prediction method, application server, and computer-readable storage medium | |
WO2018149077A1 (fr) | Voiceprint recognition method, device, storage medium, and background server | |
US20180261236A1 (en) | Speaker recognition method and apparatus, computer device and computer-readable medium | |
US20200380957A1 (en) | Systems and Methods for Machine Learning of Voice Attributes | |
EP2810277B1 (fr) | Speaker verification in a health monitoring system | |
KR20190022432A (ko) | Electronic device, identity verification method, system, and computer-readable storage medium | |
US10270736B2 (en) | Account adding method, terminal, server, and computer storage medium | |
WO2019136909A1 (fr) | Deep-learning-based voice liveness detection method, server, and storage medium | |
US20090326937A1 (en) | Using personalized health information to improve speech recognition | |
CN107038336A (zh) | Method and device for automatically generating electronic medical records | |
CN110457432A (zh) | Interview scoring method, apparatus, device, and storage medium | |
CN112562691A (zh) | Voiceprint recognition method and apparatus, computer device, and storage medium | |
CN109299227B (zh) | Speech-recognition-based information query method and device | |
WO2021159755A1 (fr) | Intelligent diagnosis and treatment data processing method, device, apparatus, and storage medium | |
US20230377602A1 (en) | Health-related information generation and storage | |
CN111933291A (zh) | Medical information recommendation device, method, system, equipment, and readable storage medium | |
AU2021333916A1 (en) | Computerized decision support tool and medical device for respiratory condition monitoring and care | |
WO2020233381A1 (fr) | Speech-recognition-based service query method and apparatus, and computer device | |
WO2022205249A1 (fr) | Audio feature compensation method, audio recognition method, and related product | |
CN110767282A (zh) | Health record generation method and device, and computer-readable storage medium | |
CN114141251A (zh) | Sound recognition method, sound recognition device, and electronic device | |
CN112614584A (zh) | Auxiliary depression diagnosis method, system, and medium based on speech and text transcription | |
CN111967235A (zh) | Form processing method and apparatus, computer device, and storage medium | |
CN114201580A(zh) | Data processing method and apparatus, electronic device, and computer-readable storage medium | |
CN112927413A(zh) | Medical registration method, apparatus, device, and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 18871385 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
32PN | Ep: public notification in the ep bulletin as address of the addressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 24.09.2020) |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 18871385 Country of ref document: EP Kind code of ref document: A1 |