CN110415824B - Cerebral apoplexy disease risk assessment device and equipment

Cerebral apoplexy disease risk assessment device and equipment

Publication number
CN110415824B
Authority
CN
China
Prior art keywords
spectrogram
voice signal
classifier
risk assessment
voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910695069.3A
Other languages
Chinese (zh)
Other versions
CN110415824A (en
Inventor
叶武剑
李琪
刘怡俊
牟志伟
李学易
张子文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology
Priority to CN201910695069.3A
Publication of CN110415824A
Application granted
Publication of CN110415824B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/18 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/21 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being power information
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
    • G10L25/30 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/66 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/02 Preprocessing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/08 Feature extraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/12 Classification; Matching
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Abstract

The invention discloses a device for assessing the risk of cerebral apoplexy (stroke), comprising: a signal acquisition module for acquiring a voice signal to be evaluated and editing it to obtain a specific monosyllabic voice signal; a signal processing module for preprocessing the monosyllabic voice signal and applying a fast Fourier transform to obtain the power spectral density of the voice signal, and for converting the power spectral density into the energy density of the voice signal; a spectrogram generation module for generating a spectrogram from the energy density of the voice signal; and a risk assessment module for inputting the spectrogram into a pre-created classifier to obtain a risk assessment result. The device uses voice information as the basis for assessing a user's risk of illness, which reduces the difficulty of information acquisition, and uses a classifier to analyse the spectrogram of the voice information, which improves the accuracy of the assessment result. The invention also provides cerebral apoplexy disease risk assessment equipment having the same beneficial effects.

Description

Cerebral apoplexy disease risk assessment device and equipment
Technical Field
The invention relates to the technical field of stroke risk assessment, in particular to a device and equipment for assessing the risk of stroke.
Background
Cerebral apoplexy, also known as stroke, is an acute cerebrovascular disease: a group of conditions in which brain tissue is damaged because a cerebral blood vessel suddenly ruptures or because a blocked vessel prevents blood from reaching the brain. After onset, some patients produce slurred speech, some can manage only brief utterances that others cannot understand at all, and some suffer aphasia or involuntary salivation.
At present, treatments for stroke are limited and their efficacy is not ideal, so stroke can only be effectively prevented and treated by strengthening public education on its risk factors and warning symptoms. Existing stroke prediction systems use evaluation software to assess stroke risk indicators from collected physiological data, and then perform data processing and result prediction.
Because the precursor phase of a stroke is often very short, and the indicator information used by existing evaluation software takes a long time to collect across a large amount of case data, the prediction cycle of such systems is too long. A patient may even miss the optimal treatment window, so that the condition worsens. Current evaluation software therefore falls short of the required efficiency and does not support timely treatment of stroke.
Disclosure of Invention
The invention aims to provide a device and equipment for evaluating the risk of cerebral apoplexy and a computer readable storage medium, which solve the problems of low evaluation efficiency and high cost of cerebral apoplexy diseases at present.
In order to solve the above technical problems, the present invention provides a risk assessment device for stroke, comprising:
the signal acquisition module is used for acquiring a voice signal to be evaluated, editing the voice signal to be evaluated to acquire a specific monosyllabic voice signal;
the signal processing module is used for preprocessing the monosyllabic voice signal and performing fast Fourier transform to obtain the power spectrum density of the voice signal, and converting the power spectrum density of the voice signal into the energy density of the voice signal;
the spectrogram generation module is used for generating a spectrogram according to the voice signal energy density, wherein the spectrogram comprises voice time information, voice frequency information and voice energy information;
and the risk assessment module is used for inputting the spectrogram into a pre-created classifier to obtain a risk assessment result.
The system further comprises a classifier creation module, wherein the classifier creation module specifically comprises:
the spectrogram acquisition unit is used for acquiring voice signal samples and obtaining spectrogram samples according to the voice signal samples;
the sample dividing unit is used for dividing the spectrogram sample into at least a training spectrogram set and a verification spectrogram set according to a proportion;
the training learning unit is used for inputting the training spectrogram set into a convolutional neural network, extracting deep voice characteristic parameters in the spectrogram through deep convolution kernels of different layers, and obtaining a classification model;
the model adjusting unit is used for inputting the verification spectrogram set, adjusting parameters of the classification model, controlling fitting capacity of the classification model and obtaining the classifier.
The sample dividing unit is specifically configured to: dividing the spectrogram sample into a training spectrogram set, a verification spectrogram set and a test spectrogram set according to the proportion;
the classifier creation module further comprises a test unit, wherein the test spectrogram set is input into the classifier after the classifier is obtained, so that the risk assessment accuracy of the classifier is obtained; and judging whether the risk assessment accuracy reaches a preset accuracy, and if so, completing the creation of the classifier.
The signal processing module is specifically configured to input the power spectral density of the voice signal into the formula $S = 10\log_{10}(Y(m, n))$, where Y(m, n) is the power spectral density of the voice signal, m is the number of frames of voice information, and n is the frame length of voice information.
The spectrogram generation module is specifically configured to generate, based on the imagesc function, a two-dimensional spectrogram whose horizontal axis represents time, whose vertical axis represents frequency, and whose value at each coordinate point is the voice-data energy.
The invention also provides equipment for evaluating the risk of cerebral apoplexy, which comprises a processor, a recording device connected with the processor and a memory connected with the processor;
the recording device is used for recording the voice signal to be evaluated and sending the voice signal to be evaluated to the processor;
the memory is used for storing a computer program;
the processor is configured to execute the computer program according to the to-be-evaluated voice signal, so as to implement an operation step of evaluating a risk of suffering from a stroke, where the operation step includes:
collecting and obtaining a voice signal to be evaluated, and editing the voice signal to be evaluated to obtain a specific monosyllabic voice signal;
preprocessing and fast Fourier transforming the monosyllabic voice signal to obtain the power spectrum density of the voice signal, and converting the power spectrum density of the voice signal into the energy density of the voice signal;
generating a spectrogram according to the voice signal energy density, wherein the spectrogram comprises voice time information, voice frequency information and voice energy information;
and inputting the spectrogram into a pre-created classifier to obtain a risk assessment result.
Wherein the processor is further configured to create the classifier;
wherein the process of creating the classifier by the processor comprises:
collecting a voice signal sample, and obtaining a spectrogram sample according to the voice signal sample;
dividing the spectrogram sample into a training spectrogram set and a verification spectrogram set according to a proportion;
inputting the training spectrogram set into a convolutional neural network, and extracting deep voice characteristic parameters in the spectrogram through deep convolution kernels of different layers to obtain a classification model;
and inputting the verification spectrogram set, adjusting parameters of the classification model, and controlling fitting capacity of the classification model to obtain the classifier.
The processor is specifically configured to divide the spectrogram sample into a training spectrogram set, a verification spectrogram set and a test spectrogram set according to a proportion;
after obtaining the classifier, further comprising:
inputting the test spectrogram set into the classifier to obtain the risk assessment accuracy of the classifier;
and judging whether the risk assessment accuracy reaches a preset accuracy, and if so, completing the creation of the classifier.
The processor is specifically configured to input the power spectral density of the voice signal into the formula $S = 10\log_{10}(Y(m, n))$, where Y(m, n) is the power spectral density of the voice signal, m is the number of frames of voice information, and n is the frame length of voice information.
The processor is specifically configured to generate, based on the imagesc function, a two-dimensional spectrogram whose horizontal axis represents time, whose vertical axis represents frequency, and whose value at each coordinate point is the voice-data energy.
The invention provides a cerebral apoplexy disease risk assessment device, which comprises: a signal acquisition module for acquiring a voice signal to be evaluated and editing it to obtain a specific monosyllabic voice signal; a signal processing module for preprocessing the monosyllabic voice signal and performing a fast Fourier transform to obtain the power spectral density of the voice signal, and for converting the power spectral density into the energy density of the voice signal; a spectrogram generation module for generating a spectrogram according to the energy density of the voice signal, wherein the spectrogram comprises voice time information, voice frequency information and voice energy information; and a risk assessment module for inputting the spectrogram into a pre-created classifier to obtain a risk assessment result.
The risk assessment device collects the user's voice information as the basis for assessing the risk of cerebral apoplexy, which reduces the difficulty of data collection to a certain extent and shortens the time spent on it. The characteristics of the voice information are converted into a spectrogram in image form, and a classifier analyses the spectrogram; compared with analysing the voice information directly, this yields a more accurate analysis and therefore a more accurate risk assessment result. The risk assessment device can thus complete the risk assessment efficiently and with higher accuracy, which helps patients prevent cerebral apoplexy and receive timely treatment.
The invention also provides cerebral apoplexy disease risk assessment equipment, which has the same beneficial effects.
Drawings
For a clearer description of embodiments of the invention or of the prior art, the drawings that are used in the description of the embodiments or of the prior art will be briefly described, it being apparent that the drawings in the description below are only some embodiments of the invention, and that other drawings can be obtained from them without inventive effort for a person skilled in the art.
Fig. 1 is a schematic structural diagram of a device for assessing the risk of stroke according to an embodiment of the present invention;
Fig. 2 is a spectrogram of voice information provided by an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a device for assessing the risk of stroke according to another embodiment of the present invention;
Fig. 4 is a schematic structural diagram of the classifier creation module of Fig. 3;
Fig. 5 is a schematic structural diagram of equipment for assessing the risk of stroke according to an embodiment of the present invention.
Detailed Description
In order to better understand the aspects of the present invention, the present invention will be described in further detail with reference to the accompanying drawings and detailed description. It will be apparent that the described embodiments are only some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Fig. 1 is a structural block diagram of a device for assessing the risk of stroke according to an embodiment of the present invention. Referring to Fig. 1, the device may include:
the signal acquisition module 100 is used for acquiring a voice signal to be evaluated, and editing the voice signal to be evaluated to acquire a specific monosyllabic voice signal;
Specifically, a quiet treatment room may be chosen, and a recording device used to collect the specified voice information from the patient who needs a stroke risk assessment. The recorded content is a long passage of speech that includes the special syllables used clinically to diagnose the slurred articulation caused by stroke.
The voice signal is clipped with speech processing software into multiple short monosyllabic segments, such as xiong, xia, ri and ta, which are the syllables used to diagnose abnormal articulation in stroke. Care should be taken that all clipped segments have the same duration.
The signal processing module 200 is used for preprocessing and fast fourier transforming the monosyllabic voice signal to obtain the power spectral density of the voice signal, and converting the power spectral density of the voice signal into the energy density of the voice signal;
specifically, the pre-processing of the monosyllabic speech signal may specifically include the steps of:
(1) Pre-emphasis: pre-emphasis is a signal processing technique that compensates, at the transmitting end, the high-frequency components of the input monosyllabic voice signal. The signal is passed through a high-pass filter, which boosts the high-frequency part and flattens the spectrum, so that the spectrum can be obtained with the same signal-to-noise ratio over the whole band from low to high frequency. Pre-emphasis also counteracts the effects of the vocal cords and lips during phonation, which reduces information loss and increases the high-frequency resolution of the monosyllabic voice signal. The high-pass filter typically has the transfer function $H(z) = 1 - u z^{-1}$, where u is the pre-emphasis coefficient, between 0.9 and 1.0.
(2) Framing: when the recorded voice signal to be evaluated is processed, every N sampling points are grouped into one observation unit, called a frame. N is typically 256 or 512, covering about 20-30 ms. To avoid excessive change between two adjacent frames, adjacent frames share an overlap region of M sampling points, where M is typically about 1/2 or 1/3 of N. Speech recognition typically samples the voice signal at 8 kHz or 16 kHz, which satisfies the Nyquist sampling theorem and keeps the continuous voice signal information from being lost.
Conventional framing discards the final segment when it is shorter than one frame. The short syllables used in the invention, however, carry a large amount of the required feature information, so to preserve the integrity of the voice signal and avoid losing information during framing, a zero-padding operation is performed. Specifically, every earlier frame has the selected frame length, and the last frame, which is shorter than that length, is padded with zeros. This can be done with the function enframe in MATLAB; the zero-padding operation is f = zeros(nf, len), where len is the frame length and nf is the number of frames.
(3) Windowing: windowing reduces the discontinuity of the monosyllabic voice signal at the start and end of each frame; after windowing, a voice signal that was not originally periodic also shows some characteristics of a periodic function. Directly truncating a monosyllabic voice signal (that is, applying a rectangular window) causes spectral leakage. To reduce it, this embodiment uses a Hamming window, whose amplitude-frequency characteristic has strong side-lobe attenuation: the attenuation from the main-lobe peak to the first side-lobe peak can reach 40 dB.
The monosyllabic voice signal contained in each frame is multiplied by the Hamming window. Let the framed monosyllabic voice signal be S(n), n = 0, 1, …, N-1, where N is the frame length in sampling points. After multiplication by the Hamming window, H(n) = S(n) × w(n), where w(n) has the form:
$$w(n, a) = (1 - a) - a\cos\left[\frac{2\pi n}{N - 1}\right], \quad 0 \le n \le N - 1$$
Typically a is 0.46, and n is the index of the sampling point within the frame.
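The three preprocessing steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the MATLAB implementation the embodiment refers to: the function names, the default coefficient u = 0.97, and the default frame length and overlap are hypothetical choices taken from the ranges quoted in the text, and the window is written in the standard Hamming form with 2π in the cosine argument.

```python
import math


def pre_emphasis(signal, u=0.97):
    """Apply H(z) = 1 - u*z^-1, i.e. y[n] = x[n] - u*x[n-1]."""
    if not signal:
        return []
    out = [signal[0]]
    for i in range(1, len(signal)):
        out.append(signal[i] - u * signal[i - 1])
    return out


def frame_with_padding(signal, frame_len=256, overlap=128):
    """Split into overlapping frames and zero-pad the last partial frame."""
    hop = frame_len - overlap
    if len(signal) <= frame_len:
        n_frames = 1
    else:
        n_frames = 1 + math.ceil((len(signal) - frame_len) / hop)
    # Start from an all-zero nf-by-len matrix, like zeros(nf, len).
    frames = [[0.0] * frame_len for _ in range(n_frames)]
    for m in range(n_frames):
        chunk = signal[m * hop:m * hop + frame_len]
        frames[m][:len(chunk)] = chunk  # the tail of a short frame stays zero
    return frames


def hamming(frame_len, a=0.46):
    """w(n) = (1 - a) - a*cos(2*pi*n / (N - 1)); requires frame_len > 1."""
    return [(1 - a) - a * math.cos(2 * math.pi * n / (frame_len - 1))
            for n in range(frame_len)]


def preprocess(signal, frame_len=256, overlap=128, u=0.97, a=0.46):
    """Pre-emphasis, framing with zero padding, then Hamming windowing."""
    frames = frame_with_padding(pre_emphasis(signal, u), frame_len, overlap)
    window = hamming(frame_len, a)
    return [[s * w for s, w in zip(frame, window)] for frame in frames]
```

Calling `preprocess(samples)` on a list of samples returns one windowed list per frame. Because of the zero padding, a syllable whose length is not a multiple of the hop size still contributes its final samples.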
After preprocessing of the monosyllabic voice signal is complete, the voice signal is Fourier transformed. Because the characteristics of a signal are generally hard to see from its time-domain waveform, each frame is transformed to the frequency domain and observed as an energy distribution. A fast Fourier transform is applied to each framed and windowed frame of the monosyllabic voice signal to obtain that frame's spectrum. The fast Fourier transform formula is as follows:
$$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi k n / N}, \quad 0 \le k \le N - 1$$
where x(n) is the input monosyllabic voice signal of each frame and N is the number of points of the Fourier transform; N may be 512 or 1024. After the spectrum of each frame is obtained, its squared modulus gives the power spectrum of the voice signal. In practice, the MATLAB function specgram is called to perform the short-time Fourier analysis, as follows:
specgram(x,N,fs,window,overlap);
where x is the monosyllabic voice signal to be processed, in WAV format in this embodiment; N is the number of short-time Fourier transform points, generally 1024 or 512; fs is the sampling frequency, generally 44100 Hz, the sampling rate of CD-quality audio, which makes the experimental results more accurate; and overlap is the number of overlapping points between adjacent frames, half of N. The power spectral density of the voice signal is obtained from the result of this function.
The short-time Fourier transform thus yields the power spectral density of the input monosyllabic voice signal. Substituting it into the following formula converts it to decibels (dB), giving the energy density of the voice signal:
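As a rough illustration of what the short-time analysis computes, the following sketch takes already framed and windowed samples, applies a radix-2 FFT to each frame, and returns the squared modulus of each frequency bin. It is a plain-Python stand-in, not the MATLAB specgram routine: the function names are hypothetical, the FFT size is assumed to be a power of two, and no normalisation is applied, because the source does not state a scaling convention for the power spectral density.

```python
import cmath


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def power_spectrum(frames, n_fft=512):
    """Return |X(k)|^2 for k = 0 .. n_fft/2 for each frame."""
    spectra = []
    for frame in frames:
        # Truncate or zero-pad the frame to exactly n_fft points.
        padded = list(frame[:n_fft]) + [0.0] * (n_fft - len(frame))
        spectrum = fft(padded)
        # A real input has a symmetric spectrum, so keep the first half only.
        spectra.append([abs(c) ** 2 for c in spectrum[:n_fft // 2 + 1]])
    return spectra
```

The result is indexed as `spectra[m][k]`, frame m and frequency bin k, where bin k corresponds to the frequency k × fs / n_fft.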
S=10*log10(Y(m,n));
where Y(m, n) is the power spectral density obtained by Fourier transform analysis, m is the number of frames, and n is the frame length, that is, nf and len in the framing and windowing steps. Without this conversion the power spectrum is on a linear scale (unit "1"); with it, the energy density of the voice signal is expressed in dB.
A power spectrum is easier to inspect and more revealing when expressed in dB, and the spectrogram drawn from it afterwards contains more information and is clearer. The purpose of this transformation is to raise the lower-amplitude components relative to the higher-amplitude ones, so that periodic signals masked by low-amplitude noise can be observed.
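The decibel conversion S = 10*log10(Y(m, n)) is short enough to show directly. In the sketch below the function name and the `floor` parameter are assumptions that do not appear in the source; the floor only guards against taking the logarithm of zero, which can occur in zero-padded frames.

```python
import math


def to_decibels(power, floor=1e-12):
    """Convert a matrix of linear power values Y(m, n) to dB."""
    return [[10.0 * math.log10(max(p, floor)) for p in row] for row in power]
```

For example, two components whose powers differ by a factor of one million end up only 60 dB apart, which is why weak components remain visible next to strong ones once the spectrogram is drawn.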
The spectrogram generation module 300 is configured to generate a spectrogram according to the energy density of the voice signal, where the spectrogram includes voice time information, voice frequency information and voice energy information;
The MATLAB function imagesc is called to draw the three quantities m, n and 10 log10(Y(m, n)) as a two-dimensional image, namely the spectrogram. The generated spectrogram is shown in Fig. 2, which is a spectrogram of voice information provided by the embodiment of the invention. In Fig. 2 the horizontal axis represents time, the vertical axis represents frequency, and the value at each coordinate point is the voice-data energy. Because three-dimensional information is expressed on a two-dimensional plane, the color depth of each pixel reflects the signal energy density at the corresponding frequency and time.
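The mapping from energy values to pixel intensities can be sketched as a linear rescaling of the dB matrix onto a fixed range, which is the general idea behind a scaled-image display. The function below is a hypothetical plain-Python illustration, not the imagesc routine itself: it assumes an 8-bit grayscale output and places the highest frequency in the top row so that frequency increases upward.

```python
def spectrogram_image(s_db):
    """Map s_db[m][k] (frame m, frequency bin k) to pixels[row][col] in 0..255.

    Columns follow time (frame index); rows follow frequency, top row highest.
    """
    lo = min(min(row) for row in s_db)
    hi = max(max(row) for row in s_db)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant matrix
    n_frames, n_bins = len(s_db), len(s_db[0])
    image = []
    for k in reversed(range(n_bins)):
        image.append([round(255 * (s_db[m][k] - lo) / span)
                      for m in range(n_frames)])
    return image
```

Each pixel then encodes the three quantities the text describes: its column is the time, its row is the frequency, and its intensity is the energy.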
The risk assessment module 400 is configured to input the spectrogram into a pre-created classifier to obtain a risk assessment result.
Specifically, the voice signal to be evaluated is converted into a spectrogram in the form of an image, and the voice spectrogram is analyzed by a classifier capable of analyzing the image, so that a risk evaluation result is obtained.
The risk assessment result is derived from the voice signal of the user who needs a stroke risk assessment, which requires deep learning and analysis of that signal. In current deep learning technology, however, image analysis is more mature, whereas deep learning applied directly to voice information is relatively immature; performing the risk assessment directly on the voice information would greatly reduce its accuracy.
Therefore, the device in this application processes the voice information to be evaluated through the steps above to generate the corresponding spectrogram, and performs deep learning and analysis on the spectrogram to obtain the risk assessment result. This improves, to a certain extent, the accuracy of the disease risk assessment obtained from the voice signal, and helps patients receive timely treatment.
According to experimental results, the accuracy of the disease risk assessment of healthy people is up to 95%, and the accuracy of the disease risk assessment of the healthy people is up to 90%. Therefore, the cerebral apoplexy risk assessment device has higher accuracy.
In summary, the invention uses characteristic parameters extracted from voice signals as the stroke prediction information. A potential stroke patient can record the specified voice signals at any time and place, so the information required for prediction is easier to obtain, conveniently and quickly and with small error. Traditional stroke prediction information, by contrast, must mostly be obtained with expensive medical equipment, so the cost of disease risk assessment is reduced. In addition, because image processing based on deep learning is more mature than pattern recognition technology, the risk probability of stroke can be obtained more quickly and accurately. Therefore, the cerebral apoplexy risk assessment device provided by the invention has the characteristics of low cost, high efficiency and high accuracy.
Based on the above embodiment, in another embodiment of the present invention, the device for assessing the risk of stroke further includes a classifier creation module 500, as shown in Fig. 3 and Fig. 4. Fig. 3 is a schematic structural diagram of the device according to this embodiment, and Fig. 4 is a schematic structural diagram of the classifier creation module in Fig. 3. The classifier creation module 500 specifically includes:
a spectrogram acquisition unit 501, configured to acquire a speech signal sample, and obtain a spectrogram sample according to the speech signal sample;
It should be noted that, in this embodiment, the process of obtaining the spectrogram sample from the voice signal sample is the same as the steps of generating the spectrogram from the voice information to be evaluated in the above embodiment, so it is not described again in detail here.
However, the collected voice signal samples should include both stroke patients (men and women) and ordinary people without the disease (men and women); the number of people in each group should be kept as equal as possible, and the recorded voice content should be the same as the voice content of the voice information to be evaluated.
a sample dividing unit 502, configured to divide the spectrogram samples proportionally into at least a training spectrogram set and a verification spectrogram set;
The generated spectrogram samples are divided proportionally into a training set and a verification set for constructing and subsequently training the classifier, and each set should contain voice signal samples of patients and healthy people in roughly equal proportion.
a training learning unit 503, configured to input the training spectrogram set into a convolutional neural network and extract deep voice feature parameters from the spectrograms through deep convolution kernels of different layers, to obtain a classification model;
a model adjustment unit 504, configured to input the verification spectrogram set, adjust the parameters of the classification model, and control the fitting ability of the classification model, to obtain the classifier.
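The proportional division performed by the sample dividing unit can be sketched as follows. This is only an illustrative sketch: the function name `stratified_split`, the 80/20 ratio, the file names and the 0/1 labels are assumptions made for the example and do not come from the patent. The split is done per label so that each subset keeps roughly the same proportion of patients and healthy people.

```python
import random
from collections import defaultdict


def stratified_split(samples, ratios=(0.8, 0.2), seed=0):
    """Split (item, label) pairs into len(ratios) subsets, keeping the
    label proportions roughly equal in every subset."""
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")

    # Group the samples by label (hypothetical labels: 1 = patient, 0 = healthy).
    by_label = defaultdict(list)
    for item, label in samples:
        by_label[label].append((item, label))

    rng = random.Random(seed)
    subsets = [[] for _ in ratios]
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        start = 0
        for i, ratio in enumerate(ratios):
            # The last subset takes whatever remains, so no sample is dropped.
            if i == len(ratios) - 1:
                end = len(group)
            else:
                end = start + round(ratio * len(group))
            subsets[i].extend(group[start:end])
            start = end
    return subsets


# Hypothetical data: 50 patient and 50 healthy spectrogram file names.
samples = [(f"patient_{i}.png", 1) for i in range(50)] + \
          [(f"healthy_{i}.png", 0) for i in range(50)]
train_set, val_set = stratified_split(samples, ratios=(0.8, 0.2))
```

Passing three ratios, for example `(0.7, 0.15, 0.15)`, yields training, verification and test subsets with the same function.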
A classification model is built by training a convolutional neural network (CNN) based on deep learning technology. The training set and the verification set of spectrograms obtained in the preceding division are used as the input of the neural network, and deep voice feature parameters in the spectrograms are extracted through deep convolution kernels of different layers. The convolutional neural network can use different convolution kernels, pooling layers and the finally output feature parameters to control the fitting ability of the overall model, extracting and analyzing the features in the spectrogram step by step. Finally, a fully connected layer and a logistic regression algorithm are added to obtain the accuracy of the prediction result. Because the invention aims to distinguish healthy people from patients, the activation function adopted in the fully connected layer is the sigmoid, which is often used as a threshold function of neural networks and maps a variable to between 0 and 1; it is defined by the following formula:
$$f(x) = \frac{1}{1 + e^{-x}}$$
After training is finished, the parameters are adjusted continually and the experiments are repeated so that the stroke risk prediction becomes more accurate; the classifier finally obtained is used for predicting stroke.
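To illustrate how a sigmoid output can separate the two classes, the following minimal sketch evaluates the standard logistic function and applies a decision threshold. The function names, the 0.5 threshold and the class labels are assumptions chosen for the example; the patent does not specify them.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function 1 / (1 + exp(-x)), written so that large
    negative inputs do not overflow math.exp."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def classify(logit: float, threshold: float = 0.5) -> str:
    """Turn a scalar network output into a two-class decision.
    The threshold and the label strings are hypothetical."""
    return "at risk" if sigmoid(logit) >= threshold else "healthy"


print(sigmoid(0.0))    # 0.5
print(classify(2.0))   # at risk
print(classify(-2.0))  # healthy
```

Because the sigmoid is monotonic, thresholding its output at 0.5 is equivalent to checking the sign of the input; other thresholds trade sensitivity against specificity.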
Optionally, to further improve the accuracy of the classifier in disease risk assessment, the sample dividing unit 502 in the present invention is specifically configured to divide the spectrogram samples proportionally into a training spectrogram set, a verification spectrogram set and a test spectrogram set;
the classifier creation module further includes a test unit 505, configured to, after the classifier is obtained, input the test spectrogram set into the classifier to obtain the risk assessment accuracy of the classifier, and to judge whether the risk assessment accuracy reaches a preset accuracy; if so, the creation of the classifier is complete.
In this embodiment, after the spectrogram samples are obtained, they are divided into a training spectrogram set, a verification spectrogram set and a test spectrogram set. The training spectrogram set and the verification spectrogram set are used for deep learning to obtain the classifier, and the test spectrogram set is used for testing the classifier. Only when the accuracy measured in the test meets the requirement is the classifier considered reliable and effective and put into use; otherwise, the root cause of the unsuccessful creation must be sought in the collected samples and the deep learning algorithm. This ensures the effectiveness and reliability of the classifier to a certain extent.
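The acceptance check carried out by the test unit can be sketched as below. This is a hedged illustration: `evaluate_accuracy`, `accept_classifier`, the toy classifier and the preset accuracy values are hypothetical, since the patent only speaks of "a preset accuracy" without giving a number.

```python
def evaluate_accuracy(classifier, test_set):
    """Fraction of (input, label) pairs the classifier labels correctly."""
    pairs = list(test_set)
    if not pairs:
        raise ValueError("test set is empty")
    correct = sum(1 for x, label in pairs if classifier(x) == label)
    return correct / len(pairs)


def accept_classifier(classifier, test_set, preset_accuracy=0.9):
    """Return (accepted, accuracy); preset_accuracy is an assumed value."""
    accuracy = evaluate_accuracy(classifier, test_set)
    return accuracy >= preset_accuracy, accuracy


# Toy stand-in for a trained classifier: thresholds a single score.
def toy_classifier(score):
    return int(score > 0.5)


# Hypothetical test set of (score, true label) pairs.
toy_test_set = [(0.9, 1), (0.8, 1), (0.2, 0), (0.6, 0)]

accepted, accuracy = accept_classifier(toy_classifier, toy_test_set)
print(accepted, accuracy)  # False 0.75
```

When the check fails, the text above calls for revisiting the collected samples and the learning algorithm rather than lowering the threshold.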
The following describes stroke risk assessment equipment provided by an embodiment of the present invention; the stroke risk assessment equipment described below and the stroke risk assessment device described above may be referred to in correspondence with each other.
As shown in Fig. 5, which is a schematic structural diagram of stroke risk assessment equipment according to an embodiment of the present invention, the equipment includes a processor 11, a recording device 12 connected to the processor, and a memory 13 connected to the processor;
wherein the recording device 12 is used for recording a voice signal to be evaluated and sending the voice signal to be evaluated to the processor;
the memory 13 is used for storing a computer program;
the processor 11 is configured to execute the computer program according to the voice signal to be evaluated, so as to implement the operation steps of assessing the risk of suffering from cerebral stroke, where the operation steps may include:
Step S11: collect the voice signal to be evaluated, and edit it to obtain a specific monosyllabic voice signal.
Step S12: preprocess the monosyllabic voice signal and apply a fast Fourier transform to obtain the power spectral density of the voice signal, and convert the power spectral density into the energy density of the voice signal.
Step S13: generate a spectrogram from the energy density of the voice signal.
Step S14: input the spectrogram into a pre-created classifier to obtain a risk assessment result.
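The signal processing in step S12 can be sketched in pure Python as follows. Several points are assumptions made for the example rather than details from the patent: the preprocessing is reduced to framing and Hamming windowing, the frame length and hop are arbitrary, the power is normalized as |X|²/N, and a naive discrete Fourier transform stands in for the fast Fourier transform (it produces the same values, only more slowly). All function names are hypothetical.

```python
import cmath
import math


def frame_signal(signal, frame_len, hop):
    """Cut a 1-D sample sequence into overlapping frames."""
    return [signal[s:s + frame_len]
            for s in range(0, len(signal) - frame_len + 1, hop)]


def hamming(n_points):
    """Hamming window coefficients."""
    if n_points < 2:
        return [1.0] * n_points
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n_points - 1))
            for k in range(n_points)]


def power_spectrum(frame):
    """Squared DFT magnitude for bins 0..N/2 (naive DFT, not an FFT)."""
    n_points = len(frame)
    spectrum = []
    for k in range(n_points // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n_points)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2 / n_points)
    return spectrum


def short_time_power(signal, frame_len=64, hop=32):
    """Return a frames-by-bins matrix Y of short-time power values."""
    window = hamming(frame_len)
    return [power_spectrum([x * w for x, w in zip(frame, window)])
            for frame in frame_signal(signal, frame_len, hop)]


# Synthetic stand-in for a recorded syllable: a 1 kHz tone sampled at 8 kHz.
fs = 8000
tone = [math.sin(2 * math.pi * 1000 * t / fs) for t in range(512)]
Y = short_time_power(tone)
print(len(Y), len(Y[0]))  # 15 33
```

With 64-sample frames at 8 kHz each bin spans 125 Hz, so the 1 kHz tone peaks in bin 8 of every frame.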
Optionally, in another specific embodiment of the present invention, the processor is further configured to create the classifier;
the process by which the processor creates the classifier may specifically include:
Step S21: collect voice signal samples, and obtain spectrogram samples from the voice signal samples.
Step S22: divide the spectrogram samples proportionally into at least a training spectrogram set and a verification spectrogram set.
Step S23: input the training spectrogram set into a convolutional neural network, and extract deep voice feature parameters from the spectrograms through deep convolution kernels of different layers to obtain a classification model.
Step S24: input the verification spectrogram set, adjust the parameters of the classification model, and control the fitting ability of the classification model to obtain the classifier.
Optionally, to further improve the accuracy of the classifier in disease risk assessment, in another embodiment of the present invention, the process by which the processor 11 creates the classifier may include the following operation steps:
Step S31: collect voice signal samples, and obtain spectrogram samples from the voice signal samples.
Step S32: divide the spectrogram samples proportionally into a training spectrogram set, a verification spectrogram set and a test spectrogram set.
Step S33: input the training spectrogram set into a convolutional neural network, and extract deep voice feature parameters from the spectrograms through deep convolution kernels of different layers to obtain a classification model.
Step S34: input the verification spectrogram set, adjust the parameters of the classification model, and control the fitting ability of the classification model to obtain the classifier.
Step S35: input the test spectrogram set into the classifier to obtain the risk assessment accuracy of the classifier.
Step S36: judge whether the risk assessment accuracy reaches a preset accuracy; if so, the creation of the classifier is complete, and if not, the creation of the classifier has failed.
Optionally, in another specific embodiment of the present invention, the processor 11 is specifically configured to input the power spectral density of the voice signal into the formula $S = 10\log_{10}(Y(m, n))$ to obtain the energy density of the voice signal, where Y(m, n) is the power spectral density of the voice signal, m is the number of frames of the voice information, and n is the frame length of the voice information.
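As a small worked example of the conversion $S = 10\log_{10}(Y(m, n))$, the sketch below applies it element by element to a frames-by-bins matrix. The function name `to_decibels` and the `floor` guard against taking the logarithm of zero are assumptions added for the example; the patent states only the formula.

```python
import math


def to_decibels(power, floor=1e-12):
    """Apply S = 10 * log10(Y) to every entry of a frames-by-bins matrix.
    Values below `floor` are clamped so that log10(0) is never evaluated."""
    return [[10.0 * math.log10(max(value, floor)) for value in row]
            for row in power]


# Hypothetical power values: 2 frames, 3 frequency bins each.
Y = [[1.0, 10.0, 100.0],
     [0.1, 0.0, 1000.0]]
S = to_decibels(Y)
```

Each factor of ten in power becomes a step of 10 dB, which compresses the wide dynamic range of speech power into a range that is easier to render as an image.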
Optionally, in another embodiment of the present invention, the processor 11 is specifically configured to generate, based on an imagesc function, a two-dimensional spectrogram in which the horizontal axis represents time, the vertical axis represents frequency, and the value at each coordinate point represents the voice data energy.
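The text names the `imagesc` function, which in MATLAB displays a matrix as an image with its values scaled to the full range of the colormap. The following pure-Python sketch imitates only that scaling step and the axis layout; it is a stand-in written for illustration, not MATLAB's implementation. The function name, the 256 gray levels and the choice to place low frequencies at the bottom of the image are assumptions.

```python
def scale_like_imagesc(S, levels=256):
    """Map a frames-by-bins matrix to integer gray levels 0..levels-1.

    The result has one row per frequency bin and one column per frame,
    so time runs along the horizontal axis and frequency along the
    vertical axis.
    """
    lo = min(min(row) for row in S)
    hi = max(max(row) for row in S)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant matrix
    n_frames, n_bins = len(S), len(S[0])
    image = []
    # The highest bin goes first so that low frequencies end up at the bottom.
    for b in range(n_bins - 1, -1, -1):
        image.append([int((S[f][b] - lo) / span * (levels - 1) + 0.5)
                      for f in range(n_frames)])
    return image


# Hypothetical energy values in dB: 2 frames, 3 frequency bins each.
S = [[0.0, 10.0, 20.0],
     [-10.0, 5.0, 30.0]]
img = scale_like_imagesc(S)
```

The smallest entry maps to gray level 0 and the largest to 255, so every spectrogram uses the full intensity range regardless of recording volume.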
The stroke risk assessment equipment in this embodiment is used to implement the foregoing stroke risk assessment device, so its specific implementation can be found in the embodiments of the device described above. For example, the signal acquisition module 100, the signal processing module 200, the spectrogram generating module 300, and the risk assessment module 400 are all software program modules built into the processor 11; their specific implementation is described in the corresponding embodiments and is not repeated here.
In addition, the memory in this embodiment may be random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
In this specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. Those skilled in the art will further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

Claims (8)

1. A risk assessment device for stroke, comprising:
the signal acquisition module is used for acquiring a voice signal to be evaluated, editing the voice signal to be evaluated to acquire a specific monosyllabic voice signal;
the signal processing module is used for preprocessing the monosyllabic voice signal and performing fast Fourier transform to obtain the power spectrum density of the voice signal, and converting the power spectrum density of the voice signal into the energy density of the voice signal;
the spectrogram generation module is used for generating a spectrogram according to the voice signal energy density, wherein the spectrogram comprises voice time information, voice frequency information and voice energy information;
the risk assessment module is used for inputting the spectrogram into a pre-established classifier to obtain a risk assessment result;
the signal processing module is specifically configured to input the power spectral density of the speech signal into a formula: $S = 10\log_{10}(Y(m, n))$, where Y(m, n) is the power spectral density of the speech signal, m is the number of frames of speech information, and n is the frame length of speech information.
2. The stroke risk assessment device according to claim 1, further comprising a classifier creation module, said classifier creation module specifically comprising:
the voice signal acquisition unit is used for acquiring voice signal samples and obtaining spectrogram samples according to the voice signal samples;
the sample dividing unit is used for dividing the spectrogram sample into at least a training spectrogram set and a verification spectrogram set according to a proportion;
the training learning unit is used for inputting the training spectrogram set into a convolutional neural network, extracting deep voice characteristic parameters in the spectrogram through deep convolution kernels of different layers, and obtaining a classification model;
the model adjusting unit is used for inputting the verification spectrogram set, adjusting parameters of the classification model, controlling fitting capacity of the classification model and obtaining the classifier.
3. The stroke risk assessment device according to claim 2, wherein said sample dividing unit is specifically configured to: dividing the spectrogram sample into a training spectrogram set, a verification spectrogram set and a test spectrogram set according to the proportion;
the classifier creation module further comprises a test unit, wherein the test spectrogram set is input into the classifier after the classifier is obtained, so that the risk assessment accuracy of the classifier is obtained; and judging whether the risk assessment accuracy reaches a preset accuracy, and if so, completing the creation of the classifier.
4. A stroke risk assessment device according to any one of claims 1 to 3, wherein said spectrogram generation module is specifically configured to generate, based on an imagesc function, said spectrogram as a two-dimensional image in which a horizontal axis represents time, a vertical axis represents frequency, and the value at each coordinate point represents speech data energy.
5. A risk assessment device for stroke, comprising a processor, a recording device connected with the processor, and a memory connected with the processor;
the recording device is used for recording the voice signal to be evaluated and sending the voice signal to be evaluated to the processor;
the memory is used for storing a computer program;
the processor is configured to execute the computer program according to the to-be-evaluated voice signal, so as to implement an operation step of evaluating a risk of suffering from a stroke, where the operation step includes:
collecting and obtaining a voice signal to be evaluated, and editing the voice signal to be evaluated to obtain a specific monosyllabic voice signal;
preprocessing and fast Fourier transforming the monosyllabic voice signal to obtain the power spectrum density of the voice signal, and converting the power spectrum density of the voice signal into the energy density of the voice signal;
generating a spectrogram according to the voice signal energy density, wherein the spectrogram comprises voice time information, voice frequency information and voice energy information;
inputting the spectrogram into a pre-created classifier to obtain a risk assessment result;
the processor is specifically configured to input the power spectral density of the speech signal into the formula: $S = 10\log_{10}(Y(m, n))$, where Y(m, n) is the power spectral density of the speech signal, m is the number of frames of speech information, and n is the frame length of speech information.
6. The stroke risk assessment device according to claim 5, wherein said processor is further configured to create said classifier;
wherein the process of creating the classifier by the processor comprises:
collecting a voice signal sample, and obtaining a spectrogram sample according to the voice signal sample;
dividing the spectrogram sample into a training spectrogram set and a verification spectrogram set according to a proportion;
inputting the training spectrogram set into a convolutional neural network, and extracting deep voice characteristic parameters in the spectrogram through deep convolution kernels of different layers to obtain a classification model;
and inputting the verification spectrogram set, adjusting parameters of the classification model, and controlling fitting capacity of the classification model to obtain the classifier.
7. The stroke risk assessment device according to claim 6, wherein said processor is specifically configured to divide said spectrogram samples proportionally into a training spectrogram set, a verification spectrogram set and a test spectrogram set; after obtaining the classifier, further comprising: inputting the test spectrogram set into the classifier to obtain the risk assessment accuracy of the classifier; and judging whether the risk assessment accuracy reaches a preset accuracy, and if so, completing the creation of the classifier.
8. The stroke risk assessment device according to any one of claims 5 to 7, wherein said processor is specifically configured to generate, based on an imagesc function, said spectrogram as a two-dimensional image in which a horizontal axis represents time, a vertical axis represents frequency, and the value at each coordinate point represents speech data energy.
CN201910695069.3A 2019-07-30 2019-07-30 Cerebral apoplexy disease risk assessment device and equipment Active CN110415824B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910695069.3A CN110415824B (en) 2019-07-30 2019-07-30 Cerebral apoplexy disease risk assessment device and equipment

Publications (2)

Publication Number Publication Date
CN110415824A CN110415824A (en) 2019-11-05
CN110415824B true CN110415824B (en) 2023-05-09

Family

ID=68364207

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910695069.3A Active CN110415824B (en) 2019-07-30 2019-07-30 Cerebral apoplexy disease risk assessment device and equipment

Country Status (1)

Country Link
CN (1) CN110415824B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113450913A (en) * 2020-08-06 2021-09-28 心医国际数字医疗系统(大连)有限公司 Data processing device and method and electronic equipment
CN113270196B (en) * 2021-05-25 2023-07-14 郑州大学 Cerebral apoplexy recurrence risk perception and behavior decision model construction system and method
CN114171162B (en) * 2021-12-03 2022-10-11 广州穗海新峰医疗设备制造股份有限公司 Mirror neuron rehabilitation training method and system based on big data analysis

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107220506A (en) * 2017-06-05 2017-09-29 东华大学 Breast cancer risk assessment analysis system based on depth convolutional neural networks
CN109065171A (en) * 2018-11-05 2018-12-21 苏州贝斯派生物科技有限公司 The construction method and system of Kawasaki disease risk evaluation model based on integrated study

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109545299A (en) * 2018-11-14 2019-03-29 严洋 Cranial vascular disease risk based on artificial intelligence quickly identifies aid prompting system and method
CN109559761A (en) * 2018-12-21 2019-04-02 广东工业大学 A kind of risk of stroke prediction technique based on depth phonetic feature

Also Published As

Publication number Publication date
CN110415824A (en) 2019-11-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant