CN114863911A - Parkinson prediction method and device based on voice signals

Info

Publication number: CN114863911A
Application number: CN202210596295.8A
Authority: CN
Inventors: 张峪涵; 文鹏程; 文贵华
Original assignee: Dongguan Songshanhu Central Hospital (Dongguan Shilong People's Hospital, Dongguan Third People's Hospital, Dongguan Cardiovascular Disease Research Institute)
Current assignee: Dongguan Songshanhu Central Hospital (Dongguan Shilong People's Hospital, Dongguan Third People's Hospital, Dongguan Cardiovascular Disease Research Institute)
Priority date: 2022-05-27
Filing date: 2022-05-27
Publication date: 2022-08-05

Abstract

The invention discloses a Parkinson prediction method based on a voice signal, comprising the following steps: collecting a voice signal of a tester and extracting a voice feature vector from the voice signal; constructing a classifier set model comprising at least three different candidate classifiers, training the classifier set model and obtaining a machine learning model, wherein the machine learning model selects a suitable classifier from the candidate classifiers according to an input voice feature vector; and inputting the voice feature vector into the trained machine learning model and predicting the probability that the tester has Parkinson's disease. Because the invention adopts at least three different risk prediction classifiers as candidate classifiers, it is both accurate and fast. The method lets ordinary people screen themselves at any time and seek diagnosis and treatment as soon as possible when a high risk of Parkinson's disease is found.

Description

Parkinson prediction method and device based on voice signals

Technical Field

The invention relates to the technical field of medical data acquisition, in particular to a Parkinson prediction method and device based on voice signals.

Background

As society ages, the number of elderly people is increasing, and so is the probability that elderly people suffer from Parkinson's disease (PD). At present no therapy is known that reverses Parkinson's disease. Although medication can markedly relieve its symptoms, the onset and severity of the disease are difficult to predict by simple means at an early stage: evaluation takes a long time and the evaluation process is complex, which affects accuracy. Moreover, evaluating people with a tendency toward Parkinson's disease in hospital is inconvenient, consumes medical resources and increases medical costs. It is therefore necessary to develop an intelligent portable device with which people can conveniently screen themselves anytime and anywhere, and go to hospital for diagnosis and treatment only when the predicted risk of Parkinson's disease is high.

Parkinson's disease can show some symptoms before its onset, and some speech-based machine learning prediction methods are already available. Their problem is that machine learning requires a large number of training samples, whereas labeled samples for predicting PD tendency are scarce, because labeling requires professional knowledge and a great deal of labor and time.

Existing machine learning methods model with a fixed habit of thinking, applying the same model to every sample, and are therefore prone to misclassifying test samples. Humans, by contrast, dynamically change their method according to the current test sample rather than identifying all test samples in the same way. The voice production system of each person has its own particularities, and even the same person sounds different at different times. The training speech samples used for machine learning may therefore differ significantly from the tester's speech at prediction time, since they may come from different people. As a result, a machine learning method tends to perform very differently across different groups of people, which leads to poor prediction performance. Although there are many good methods of speech acquisition and feature extraction, they are still affected by many unpredictable factors, such as the gender of the subject, a widely varying acoustic environment, and the physical condition and characteristics of the subject. Furthermore, the acquisition and measurement methods used for the training speech may differ from those used for the subject under test, and these methods are affected by the factors above to different degrees and differ in robustness. The predictive power of these methods therefore differs even for the same test sample. Each prediction method corresponds to a classifier, and many classifiers have different capabilities and complement one another. In experiments we found that a classifier works well for some test samples but is often wrong for others. In particular, when two classifiers are used to classify the same test samples, their classification results may be completely opposite.

Therefore, it is reasonable to select different classifiers according to the specific situation of a tester in order to achieve personalized prediction. The invention accordingly provides a personalized prediction method and portable equipment for Parkinson's disease risk based on voice signals.

Disclosure of Invention

The invention aims to provide a Parkinson prediction method and equipment based on voice signals, which are high in accuracy and high in speed.

In order to achieve the above object, in a first aspect, the present invention provides a Parkinson prediction method based on a speech signal, including the steps of: collecting a voice signal of a tester and extracting a voice feature vector from the voice signal; constructing a classifier set model comprising at least three different candidate classifiers, training the classifier set model and obtaining a machine learning model, wherein the machine learning model selects a suitable classifier from the candidate classifiers according to an input voice feature vector; and inputting the voice feature vector into the trained machine learning model and predicting the probability that the tester has Parkinson's disease.

Preferably, the step of collecting the voice signal of the tester and extracting the voice feature vector of the voice signal comprises: collecting the sound of a tester in a quiet environment to obtain a voice signal; preprocessing the collected voice signals; and extracting the voice characteristic vector of the preprocessed voice signal.

Preferably, the at least three different candidate classifiers include decision trees, random forests, and neural networks.

Preferably, the step of constructing a classifier set model including at least three different candidate classifiers, training the classifier set model and obtaining a machine learning model, wherein the machine learning model selects a suitable classifier from the candidate classifiers according to the input speech feature vector, includes: setting a class label for each voice feature vector, wherein the value of the class label is 1 when the tester who produced the voice signal is a patient with Parkinson's disease and -1 otherwise, and constructing a voice data set SD for training; training each candidate classifier with the speech data set SD to obtain candidate classifier models, which together form the classifier set model; selecting a suitable candidate classifier model for each speech feature vector in the speech data set SD and attaching that candidate classifier model to the speech feature vector as its label, thereby constructing a new classifier model data set CD; training a machine learning method on the classifier model data set CD to obtain a machine learning model; and classifying the voice feature vector of an input voice signal with the trained machine learning model to obtain the label of the candidate classifier model for that voice feature vector.

Preferably, the machine learning method comprises a support vector machine, a random forest and a neural network.

Preferably, selecting a suitable candidate classifier model for each speech feature vector in the speech data set SD, attaching that candidate classifier model to the speech feature vector as its label, and thereby constructing a new classifier model data set CD includes: dividing the voice data set SD into k folds for cross-validation, selecting each fold in turn, taking the remaining (k-1) folds as the training set, and taking the whole data set SD as the test set; selecting a classifier model from the classifier set model, training it on the training set, and classifying the test set; obtaining, through k rounds of testing, the average classification accuracy of the classifier model on the voice feature vector samples; calculating, according to Bayes' theorem, the selection probability with which each voice feature vector selects the classifier model; and, for each speech feature vector, selecting the classifier model with the largest selection probability as its label to construct the new classifier model data set CD.

Preferably, the step of inputting the speech feature vector into the trained machine learning model and predicting the probability that the tester has Parkinson's disease comprises: classifying the voice feature vector of the input voice signal with the trained machine learning model to obtain the risk probabilities that the voice feature vector does and does not belong to Parkinson's disease.

Preferably, the step of inputting the speech feature vector into the trained machine learning model and predicting the probability that the tester has Parkinson's disease is followed by: reminding the tester to seek medical diagnosis when the risk probability is greater than a set threshold.

In a second aspect, the present invention further provides a Parkinson prediction device based on a voice signal, the device comprising a voice signal acquisition device, a display touch screen, a processor, and software code implementing the Parkinson prediction method based on a voice signal of the first aspect; the display touch screen is used for feeding back the probability that the tester has Parkinson's disease.

Preferably, the device is a smartphone or a tablet computer.

Compared with the prior art, the method adopts at least three different risk prediction classifiers as candidate classifiers and is therefore both accurate and fast. It lets ordinary people screen themselves at any time and seek diagnosis and treatment as soon as possible when a high risk of Parkinson's disease is found.

Drawings

Fig. 1 is a flowchart of a Parkinson prediction method based on a speech signal according to an embodiment of the present invention.

Detailed Description

In order to explain technical contents, structural features, and effects achieved by the present invention in detail, the following detailed description is given with reference to the embodiments and the accompanying drawings.

The embodiment of the invention provides a Parkinson prediction method based on a voice signal, which comprises the following steps:

S1, collecting the voice signal of the tester and extracting the voice feature vector of the voice signal;

S2, constructing a classifier set model containing at least three different candidate classifiers, training the classifier set model and obtaining a machine learning model, wherein the machine learning model selects a suitable classifier from the candidate classifiers according to the input speech feature vector;

S3, inputting the voice feature vector into the trained machine learning model and predicting the probability that the tester has Parkinson's disease.

The embodiment of the invention adopts at least three different risk prediction classifiers as the candidate classifiers, thereby not only having high accuracy, but also having high speed. The method can be convenient for common people to analyze themselves at any time, and can diagnose and treat the Parkinson's disease as soon as possible when higher risk is found to exist.

In the embodiment of the present invention, as shown in Fig. 1, step S1 of collecting the voice signal of the tester and extracting the voice feature vector of the voice signal includes:

S11, collecting the voice signal of the tester. Possible speech material includes doctor-patient dialogue, reading of a designated passage, and pronunciation; when pronunciation is chosen, vowels are selected, because different sounds are produced by different mechanisms. Voice data can be collected with a variety of devices; to meet the requirements of a portable device, ordinary tablet computers and smartphones are used for collection. The collected tester information includes: the voice signal, the serial number of the tester, whether the tester has a confirmed diagnosis of Parkinson's disease, whether other diseases that cause speech disorders are present, the duration of illness, the UPDRS (motor) score, the UPDRS (total) score, and the collection date and time. The tester's voice is collected in a quiet environment to obtain the voice signal. Specifically, in this embodiment the fable "The North Wind and the Sun", commonly used in phonetics research, is selected as the test corpus, and the tester reads it aloud in Cantonese or Mandarin at a natural speed and appropriate loudness; before the formal test, the tester may read the corpus silently to become familiar with the short text. Collection is completed in a quiet environment, with ambient noise kept below 45 dB. The patient's reading of the corpus is recorded with a smartphone and associated with the patient's health record, and the tester's identity card number, name, whether Parkinson's disease was previously diagnosed, whether other diseases that cause speech disorders are present, the duration of illness, the UPDRS (motor) score, the UPDRS (total) score, and the collection date and time are recorded.
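The information gathered in step S11 can be pictured as one record per recording. The following is only a sketch of such a record; the class name, field names and the noise-level field are assumptions made for illustration and do not appear in the patent.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class TesterRecord:
    """One recording plus the metadata listed in step S11 (hypothetical field names)."""
    tester_id: str                            # serial number of the tester
    audio_path: str                           # location of the recorded voice signal
    diagnosed_with_pd: bool                   # confirmed Parkinson's disease diagnosis
    other_speech_disorder: bool               # other disease that causes speech disorders
    illness_duration_years: Optional[float]   # None when not applicable
    updrs_motor: Optional[float]
    updrs_total: Optional[float]
    collected_at: datetime
    ambient_noise_db: float                   # assumed to be measured at recording time

    def is_quiet_enough(self, limit_db: float = 45.0) -> bool:
        """The embodiment keeps ambient noise below 45 dB."""
        return self.ambient_noise_db < limit_db

    def class_label(self) -> int:
        """Class label used later for training: 1 for a PD patient, -1 otherwise."""
        return 1 if self.diagnosed_with_pd else -1
```

A record that fails `is_quiet_enough()` could simply be re-recorded before it enters the training set; that policy is likewise an assumption.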

S12, preprocessing the collected voice signal. Specifically, preprocessing includes format conversion, sampling-frequency conversion, pre-emphasis, windowing and framing, and removal of silent portions; voiced data (with vocal cord vibration) and unvoiced data (without vocal cord vibration) are also separated, and the data are normalized and outliers removed to improve the quality of the patient's voice recording. The preprocessing may be performed, for example, using the XAudioPro tool.
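Three of the preprocessing operations named in S12 (pre-emphasis, framing with windowing, and removal of silent portions) can be sketched in a few lines. This is a minimal illustration and not the tool used in the embodiment: the pre-emphasis coefficient 0.97, the Hamming window, the frame length of 400 samples with a hop of 160 (25 ms and 10 ms at an assumed 16 kHz sampling rate) and the energy threshold are all assumed values.

```python
import math
from typing import List


def pre_emphasis(signal: List[float], alpha: float = 0.97) -> List[float]:
    """Boost high frequencies: y[n] = x[n] - alpha * x[n-1]."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))]


def frame_signal(signal: List[float], frame_len: int, hop: int) -> List[List[float]]:
    """Split into overlapping frames; a trailing partial frame is dropped."""
    return [signal[s:s + frame_len] for s in range(0, len(signal) - frame_len + 1, hop)]


def hamming(frame_len: int) -> List[float]:
    """Hamming window coefficients for one frame."""
    if frame_len == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1)) for n in range(frame_len)]


def drop_silent_frames(frames: List[List[float]], threshold: float) -> List[List[float]]:
    """Keep only frames whose mean squared amplitude exceeds the threshold."""
    return [f for f in frames if sum(x * x for x in f) / len(f) > threshold]


def preprocess(signal: List[float], frame_len: int = 400, hop: int = 160,
               energy_threshold: float = 1e-4) -> List[List[float]]:
    """Pre-emphasis, framing, silence removal, then windowing of the kept frames."""
    frames = frame_signal(pre_emphasis(signal), frame_len, hop)
    kept = drop_silent_frames(frames, energy_threshold)
    window = hamming(frame_len)
    return [[x * w for x, w in zip(frame, window)] for frame in kept]
```

Format conversion, resampling, voiced/unvoiced separation, normalization and outlier removal are left out here; they would sit before or after these steps in a complete pipeline.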

S13, extracting the voice feature vector of the preprocessed voice signal. Specifically, features can be extracted in two ways: the first is manual (hand-crafted) extraction, which covers the commonly used amplitude, pulse, frequency, voicing, pitch and harmonicity parameters; the second uses deep learning. The embodiment of the invention uses the openSMILE speech feature extraction tool, which is widely applied in speech recognition, affective computing, music information retrieval and other fields. In this implementation, openSMILE is used to extract features including: frame energy, frame intensity/loudness (approximation), critical band spectra, Mel-frequency cepstral coefficients (MFCC), auditory spectra, loudness approximated from auditory spectra, linear predictive coefficients (LPC), line spectral pairs (LSP), further spectral and frequency features, psychoacoustic sharpness, and spectral harmonicity.
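To make the idea of a voice feature vector concrete, the sketch below computes one of the listed descriptors, frame energy, for every frame and summarises it into a fixed-length vector. It does not call openSMILE and does not reproduce its output; the choice of log energy and of the four summary statistics is an assumption made for illustration.

```python
import math
from statistics import mean, pstdev
from typing import List


def frame_log_energy(frame: List[float], floor: float = 1e-12) -> float:
    """Natural log of the mean squared amplitude of one frame (floored to avoid log(0))."""
    energy = sum(x * x for x in frame) / len(frame)
    return math.log(max(energy, floor))


def energy_feature_vector(frames: List[List[float]]) -> List[float]:
    """Summarise per-frame log energy as [mean, standard deviation, minimum, maximum]."""
    energies = [frame_log_energy(f) for f in frames]
    return [mean(energies), pstdev(energies), min(energies), max(energies)]
```

A full feature vector would concatenate such summaries for many descriptors, so that recordings of different lengths map to vectors of the same size.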

In the embodiment of the invention, the at least three different candidate classifiers are decision trees, random forests and neural networks. Using different candidate classifiers allows them to complement one another and preserves diversity. Other classifiers may optionally be included in other embodiments.

In the embodiment of the present invention, step S2 of constructing a classifier set model including at least three different candidate classifiers, training the classifier set model and obtaining a machine learning model, wherein the machine learning model selects a suitable classifier from the candidate classifiers according to an input speech feature vector, includes:

S21, setting a class label for each voice feature vector, wherein the value of the class label is 1 when the tester who produced the voice signal is a patient with Parkinson's disease and -1 otherwise, and constructing a voice data set SD for training.

S22, training each candidate classifier with the voice data set SD to obtain candidate classifier models, which together form the classifier set model.

S23, selecting a suitable candidate classifier model for each speech feature vector in the speech data set SD, and attaching that candidate classifier model to the speech feature vector as its label to construct a new classifier model data set CD.

S24, training a machine learning method on the classifier model data set CD to obtain a machine learning model. Specifically, the machine learning method may be a support vector machine, a random forest or a neural network; the embodiment of the invention uses a support vector machine.

S25, classifying the speech feature vector of the input speech signal with the trained machine learning model to obtain the label of the candidate classifier model for that speech feature vector, namely the classifier model most suitable for classifying it.
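The flow of steps S21 to S25 can be sketched as follows. Everything concrete in this block is an assumption chosen to keep the example small and dependency-free: the toy classes `NearestCentroid` and `FeatureStump` stand in for the decision tree, random forest and neural network candidates, a 1-nearest-neighbour rule stands in for the support vector machine selector, and the labelling rule in step S23 (take the first candidate that classifies the sample correctly) is a placeholder for the k-fold and Bayes procedure of steps S231 to S235.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    return sum((p - q) ** 2 for p, q in zip(a, b))


class NearestCentroid:
    """Toy candidate: predicts the label whose class mean is closest."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict(self, x):
        return min(self.centroids, key=lambda c: sq_dist(x, self.centroids[c]))


class FeatureStump:
    """Toy candidate: thresholds one feature midway between the two class means."""

    def __init__(self, index: int):
        self.index = index

    def fit(self, X, y):
        pos = [x[self.index] for x, t in zip(X, y) if t == 1]
        neg = [x[self.index] for x, t in zip(X, y) if t == -1]
        mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
        self.threshold = (mean_pos + mean_neg) / 2
        self.high_side = 1 if mean_pos >= mean_neg else -1
        return self

    def predict(self, x):
        return self.high_side if x[self.index] >= self.threshold else -self.high_side


class OneNearestNeighbour:
    """Toy selector: copies the label of the closest training vector."""

    def fit(self, X, labels):
        self.X, self.labels = list(X), list(labels)
        return self

    def predict(self, x):
        nearest = min(range(len(self.X)), key=lambda j: sq_dist(x, self.X[j]))
        return self.labels[nearest]


def train_selector(X: List[Vector], y: List[int],
                   candidates: Dict[str, object]) -> Tuple[OneNearestNeighbour, List[str]]:
    # S22: train every candidate classifier on the labelled data set SD (X, y).
    for model in candidates.values():
        model.fit(X, y)
    # S23 (placeholder rule): label each vector with a candidate that gets it right.
    names = list(candidates)
    cd_labels = []
    for x, t in zip(X, y):
        correct = [n for n in names if candidates[n].predict(x) == t]
        cd_labels.append(correct[0] if correct else names[0])
    # S24: train the selector on the classifier model data set CD (X, cd_labels).
    return OneNearestNeighbour().fit(X, cd_labels), cd_labels


def select_classifier(selector: OneNearestNeighbour, x: Vector) -> str:
    # S25: the selector returns the name of the candidate to use for x.
    return selector.predict(x)
```

The point of the two-level design is that the data set CD reuses the same feature vectors as SD but replaces the disease label with the name of a classifier, so the selector learns which regions of feature space each candidate handles well.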

In this embodiment of the present invention, step S23 is to select a suitable candidate classifier model for each speech feature vector in the speech data set SD, attach the suitable candidate classifier model to the speech feature vector as a label, and further construct a new classifier model data set CD, including:

S231, dividing the speech data set SD into k folds for cross-validation, selecting each fold in turn, taking the remaining (k-1) folds as the training set and the whole data set SD as the test set; k is a parameter, for example 10.

S232, selecting one classifier model from the classifier set model, training it on the training set, and classifying the test set.

S233, obtaining, through k rounds of testing, the average classification accuracy of the classifier model on the voice feature vector samples.

S234, calculating, according to Bayes' theorem, the selection probability with which each voice feature vector selects the classifier model.

S235, for each voice feature vector, selecting the classifier model with the maximum selection probability as the label of the voice feature vector to construct the new classifier model data set CD.
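The patent text does not spell out the Bayes formula used in step S234, so the sketch below shows only one possible reading, stated here as an assumption: for sample x and candidate c, the likelihood P(correct | c, x) is the fraction of the k rounds in which c classifies x correctly, the prior P(c) is proportional to the overall average accuracy of c, and the selection probability is the normalised product of the two. The toy `FeatureStump` classifier and the fold assignment by `i % k` are likewise illustrative choices.

```python
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


class FeatureStump:
    """Toy candidate: thresholds one feature midway between the two class means."""

    def __init__(self, index: int):
        self.index = index

    def fit(self, X, y):
        pos = [x[self.index] for x, t in zip(X, y) if t == 1]
        neg = [x[self.index] for x, t in zip(X, y) if t == -1]
        mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
        self.threshold = (mean_pos + mean_neg) / 2
        self.high_side = 1 if mean_pos >= mean_neg else -1
        return self

    def predict(self, x):
        return self.high_side if x[self.index] >= self.threshold else -self.high_side


def k_fold_indices(n: int, k: int) -> List[List[int]]:
    """Assign sample i to fold i % k (no shuffling, so runs are reproducible)."""
    return [[i for i in range(n) if i % k == fold] for fold in range(k)]


def per_sample_accuracy(make_model: Callable[[], object], X: List[Vector],
                        y: List[int], k: int) -> List[float]:
    """S231-S233: fraction of the k rounds in which each sample of SD is classified correctly."""
    n = len(X)
    hits = [0] * n
    for held_out in k_fold_indices(n, k):
        skip = set(held_out)
        train_X = [X[i] for i in range(n) if i not in skip]
        train_y = [y[i] for i in range(n) if i not in skip]
        model = make_model().fit(train_X, train_y)
        for i in range(n):  # the whole data set SD serves as the test set
            if model.predict(X[i]) == y[i]:
                hits[i] += 1
    return [h / k for h in hits]


def selection_labels(factories: Dict[str, Callable[[], object]], X: List[Vector],
                     y: List[int], k: int = 10) -> List[str]:
    """S234-S235 under the assumed reading: posterior is proportional to likelihood times prior."""
    acc = {name: per_sample_accuracy(f, X, y, k) for name, f in factories.items()}
    mean_acc = {name: sum(a) / len(a) for name, a in acc.items()}
    total = sum(mean_acc.values())
    prior = {name: (m / total if total else 1 / len(factories)) for name, m in mean_acc.items()}
    labels = []
    for i in range(len(X)):
        joint = {name: acc[name][i] * prior[name] for name in factories}
        evidence = sum(joint.values())
        posterior = {n: j / evidence for n, j in joint.items()} if evidence else prior
        labels.append(max(posterior, key=posterior.get))
    return labels
```

Each entry of `factories` is a zero-argument callable that returns a fresh, untrained model, so every round starts from scratch. Other priors, or a likelihood estimated from neighbouring samples, would fit the wording of S234 equally well.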

In the embodiment of the invention, step S3 of inputting the speech feature vector into the trained machine learning model and predicting the probability that the tester has Parkinson's disease comprises:

S31, classifying the voice feature vector of the input voice signal with the trained machine learning model to obtain the risk probabilities that the voice feature vector does and does not belong to Parkinson's disease. After the speech feature vector of the speech signal is input into the machine learning model, the machine learning model selects a suitable classifier model, and the selected classifier model is used to predict the probability that the tester has Parkinson's disease.
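The routing described in S31 (the selector picks a candidate, and the picked candidate produces the risk probability) can be sketched as below. The function name `predict_risk` and the toy selector and candidate models are hypothetical stand-ins; in practice both would come out of the training steps.

```python
from typing import Callable, Dict, Sequence, Tuple

Vector = Sequence[float]
ProbabilityModel = Callable[[Vector], float]  # returns P(Parkinson's disease | x)


def predict_risk(x: Vector, selector: Callable[[Vector], str],
                 candidates: Dict[str, ProbabilityModel]) -> Tuple[str, float, float]:
    """Return (chosen classifier, P(PD), P(not PD)) for one feature vector."""
    name = selector(x)
    p_pd = candidates[name](x)
    if not 0.0 <= p_pd <= 1.0:
        raise ValueError(f"{name} returned an invalid probability: {p_pd}")
    return name, p_pd, 1.0 - p_pd


# Hypothetical stand-ins so that the sketch runs end to end.
def toy_selector(x: Vector) -> str:
    return "decision_tree" if x[0] > x[1] else "neural_network"


toy_candidates: Dict[str, ProbabilityModel] = {
    "decision_tree": lambda x: 0.9 if x[0] > 0.5 else 0.1,
    "neural_network": lambda x: min(1.0, max(0.0, x[1])),
}

name, p_pd, p_not_pd = predict_risk((0.8, 0.1), toy_selector, toy_candidates)
print(name, p_pd, p_not_pd)
```

Only one candidate is evaluated per input, which is consistent with the speed advantage the description claims over running every classifier.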

After step S3, the embodiment of the present invention further includes: S4, when the risk probability is greater than the set threshold, reminding the tester to seek medical diagnosis. The threshold for the Parkinson's disease risk probability may be set to 50%; when the test result shows a risk probability greater than or equal to 50%, the tester is reminded to go to a hospital for diagnosis and treatment.
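Step S4 reduces to a single comparison. The text says "greater than" in one sentence and "greater than or equal to 50%" in the example; the sketch below follows the example and uses greater-than-or-equal, which is a choice made here and not something the patent settles. The function names and the message wording are hypothetical.

```python
def needs_medical_follow_up(risk_probability: float, threshold: float = 0.5) -> bool:
    """True when the predicted risk reaches the reminder threshold (50% by default)."""
    if not 0.0 <= risk_probability <= 1.0:
        raise ValueError("risk_probability must lie in [0, 1]")
    return risk_probability >= threshold


def reminder_message(risk_probability: float, threshold: float = 0.5) -> str:
    """Text that a device could show on its display touch screen."""
    percent = round(risk_probability * 100)
    if needs_medical_follow_up(risk_probability, threshold):
        return f"Estimated risk {percent}%: please visit a hospital for diagnosis."
    return f"Estimated risk {percent}%: no reminder triggered."
```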

The embodiment of the invention also provides a Parkinson prediction device based on the voice signal, which comprises a voice signal acquisition device, a display touch screen, a processor, a loudspeaker and a software code containing the Parkinson prediction method based on the voice signal, wherein the display touch screen is used for feeding back the probability of the Parkinson disease of a tester.

In the embodiment of the invention, the equipment is portable equipment such as a smartphone or a tablet computer; when the risk probability is greater than a given threshold, the result is shown on the display touch screen to remind the tester to go to a hospital for diagnosis and treatment.

The Parkinson prediction method based on the voice signals and the portable equipment in the embodiment of the invention have the advantages of high prediction accuracy and high speed. The portable equipment is convenient for the masses to analyze anytime and anywhere, and finds the risk of diseases in time, so as to diagnose and treat the diseases as soon as possible.

The above disclosure is only a preferred embodiment of the present invention, and certainly should not be taken as limiting the scope of the present invention, which is therefore intended to cover all equivalent changes and modifications within the scope of the present invention.

Claims

1. A Parkinson prediction method based on a voice signal is characterized by comprising the following steps:

collecting a voice signal of a tester and extracting a voice feature vector of the voice signal;

constructing a classifier set model comprising at least three different candidate classifiers, training the classifier set model and obtaining a machine learning model, wherein the machine learning model selects a proper classifier from the candidate classifiers according to an input voice feature vector;

and inputting the voice feature vector into the trained machine learning model and predicting the probability that the tester has Parkinson's disease.

2. The method of parkinson's prediction based on speech signal of claim 1, wherein the step of collecting the speech signal of the tester and extracting the speech feature vector of the speech signal comprises:

collecting the sound of a tester in a quiet environment to obtain a voice signal;

preprocessing the collected voice signals;

and extracting the voice characteristic vector of the preprocessed voice signal.

3. The speech signal-based Parkinson's prediction method of claim 1, wherein the at least three different candidate classifiers comprise decision trees, random forests and neural networks.

4. The method of speech signal-based parkinson's prediction according to claim 1, wherein the step of constructing a classifier ensemble model comprising at least three different candidate classifiers, training said classifier ensemble model and obtaining a machine learning model, said machine learning model selecting a suitable classifier among the candidate classifiers according to the input speech feature vectors comprises:

setting a class label for each voice feature vector, wherein the value of the class label is 1 when the tester who produced the voice signal is a patient with Parkinson's disease and -1 otherwise, and constructing a voice data set SD for training;

training each candidate classifier by using the speech data set SD to obtain a candidate classifier model so as to construct a classifier set model;

selecting a suitable candidate classifier model for each speech feature vector in the speech data set SD, attaching that candidate classifier model to the speech feature vector as its label, and thereby constructing a new classifier model data set CD;

training a machine learning method by adopting a classifier model data set CD to obtain a machine learning model;

and classifying the voice feature vector of the input voice signal with the trained machine learning model to obtain the label of the candidate classifier model for that voice feature vector.

5. The speech signal-based parkinson's prediction method of claim 4, wherein the machine learning method comprises a support vector machine, a random forest and a neural network.

6. The method of claim 4, wherein the selecting a suitable candidate classifier model for each speech feature vector in the speech data set SD, labeling the speech feature vector with the suitable candidate classifier model, and constructing a new classifier model data set CD comprises:

dividing the voice data set SD into k folds for cross-validation, selecting each fold in turn, taking the remaining (k-1) folds as the training set, and taking the whole data set SD as the test set;

selecting a classifier model from the classifier set models, training on a training set, and classifying a test set;

through k times of tests, the average classification accuracy of the classifier model on the voice feature vector samples is obtained;

according to Bayes' theorem, calculating the selection probability of each voice feature vector for selecting the classifier model;

for each speech feature vector, the classifier model with the largest probability of selection is selected as its label to construct a new classifier model data set CD.

7. The speech signal-based parkinson's disease prediction method of claim 1, wherein the step of inputting the speech feature vectors into a trained machine learning model and predicting the probability of parkinson's disease for the test subject comprises:

classifying the voice feature vector of the input voice signal with the trained machine learning model to obtain the risk probabilities that the voice feature vector does and does not belong to Parkinson's disease.

8. The method of speech signal-based parkinson's disease prediction according to claim 1, wherein said step of inputting speech feature vectors into a trained machine learning model and predicting the probability of parkinson's disease of said subject is followed by the steps of:

reminding the tester to seek medical diagnosis when the risk probability is greater than a set threshold.

9. A speech signal based parkinson's prediction device, said device comprising a speech signal acquisition device, a display touch screen, a processor and software code embodying the speech signal based parkinson's prediction method of any of claims 1-8; the display touch screen is used for feeding back the probability of the Parkinson disease of the tester.

10. The speech signal-based parkinson's prediction device of claim 9, wherein the device is a smartphone or a tablet computer.