WO2020013296A1 - Apparatus for estimating mental/neurological diseases - Google Patents
Apparatus for estimating mental/neurological diseases
- Publication number
- WO2020013296A1 (PCT/JP2019/027587)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- disease
- subject
- estimation
- reference range
- unit
- Prior art date
Links
- 208000012902 Nervous system disease Diseases 0.000 title claims abstract description 14
- 208000020016 psychiatric disease Diseases 0.000 title claims abstract description 14
- 230000003340 mental effect Effects 0.000 title claims abstract description 6
- 208000025966 Neurological disease Diseases 0.000 title abstract 3
- 201000010099 disease Diseases 0.000 claims abstract description 64
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 claims abstract description 64
- 238000004364 calculation method Methods 0.000 claims abstract description 51
- 238000001514 detection method Methods 0.000 claims abstract description 34
- 238000000034 method Methods 0.000 claims description 46
- 230000036541 health Effects 0.000 claims description 40
- 208000024714 major depressive disease Diseases 0.000 claims description 10
- 208000024827 Alzheimer disease Diseases 0.000 claims description 4
- 208000009829 Lewy Body Disease Diseases 0.000 claims description 4
- 201000002832 Lewy body dementia Diseases 0.000 claims description 4
- 208000018737 Parkinson disease Diseases 0.000 claims description 4
- 208000020925 Bipolar disease Diseases 0.000 claims description 3
- 208000025748 atypical depressive disease Diseases 0.000 claims 1
- 238000004891 communication Methods 0.000 description 34
- 230000008569 process Effects 0.000 description 34
- 230000008859 change Effects 0.000 description 19
- 238000010586 diagram Methods 0.000 description 16
- 230000006870 function Effects 0.000 description 11
- 238000001228 spectrum Methods 0.000 description 9
- 230000006866 deterioration Effects 0.000 description 8
- 238000000611 regression analysis Methods 0.000 description 6
- 238000005070 sampling Methods 0.000 description 6
- 230000006872 improvement Effects 0.000 description 5
- 230000002996 emotional effect Effects 0.000 description 4
- 238000013473 artificial intelligence Methods 0.000 description 3
- 230000003247 decreasing effect Effects 0.000 description 3
- 238000010801 machine learning Methods 0.000 description 3
- 230000005236 sound signal Effects 0.000 description 3
- 230000002123 temporal effect Effects 0.000 description 3
- 238000007477 logistic regression Methods 0.000 description 2
- 238000000926 separation method Methods 0.000 description 2
- 208000020401 Depressive disease Diseases 0.000 description 1
- 206010054089 Depressive symptom Diseases 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- 238000013528 artificial neural network Methods 0.000 description 1
- 238000004422 calculation algorithm Methods 0.000 description 1
- 238000007635 classification algorithm Methods 0.000 description 1
- 238000007621 cluster analysis Methods 0.000 description 1
- 230000002596 correlated effect Effects 0.000 description 1
- 238000003066 decision tree Methods 0.000 description 1
- 238000013135 deep learning Methods 0.000 description 1
- 230000001419 dependent effect Effects 0.000 description 1
- 238000003745 diagnosis Methods 0.000 description 1
- 230000008034 disappearance Effects 0.000 description 1
- 238000005401 electroluminescence Methods 0.000 description 1
- 230000008451 emotion Effects 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 230000002068 genetic effect Effects 0.000 description 1
- 230000003862 health status Effects 0.000 description 1
- 239000004973 liquid crystal related substance Substances 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 230000002787 reinforcement Effects 0.000 description 1
- 230000035945 sensitivity Effects 0.000 description 1
- 230000009885 systemic effect Effects 0.000 description 1
- 210000001260 vocal cord Anatomy 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B10/00—Other methods or instruments for diagnosis, e.g. instruments for taking a cell sample, for biopsy, for vaccination diagnosis; Sex determination; Ovulation-period determination; Throat striking implements
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/16—Devices for psychotechnics; Testing reaction times ; Devices for evaluating the psychological state
- A61B5/165—Evaluating the state of mind, e.g. depression, anxiety
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/40—Detecting, measuring or recording for evaluating the nervous system
- A61B5/4076—Diagnosing or monitoring particular conditions of the nervous system
- A61B5/4082—Diagnosing or monitoring movement diseases, e.g. Parkinson, Huntington or Tourette
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/40—Detecting, measuring or recording for evaluating the nervous system
- A61B5/4076—Diagnosing or monitoring particular conditions of the nervous system
- A61B5/4088—Diagnosing of monitoring cognitive diseases, e.g. Alzheimer, prion diseases or dementia
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/48—Other medical applications
- A61B5/4803—Speech analysis specially adapted for diagnostic purposes
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/10—Speech classification or search using distance or distortion measures between unknown speech and reference templates
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/30—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H10/00—ICT specially adapted for the handling or processing of patient-related medical or healthcare data
- G16H10/40—ICT specially adapted for the handling or processing of patient-related medical or healthcare data for data related to laboratory analysis, e.g. patient specimen analysis
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
Definitions
- The present invention relates to an apparatus for estimating psychiatric/nervous-system diseases.
- Patent Literature 1 discloses a technique in which a subject's voice is converted into a frequency spectrum, an autocorrelation waveform is obtained while being shifted on a frequency axis, and a pitch frequency is calculated therefrom to estimate an emotional state.
- However, the above technique can only estimate a person's emotional state, such as anger, joy, tension, sadness, or depressive symptoms, and its accuracy in estimating a disease was not high.
- The present invention has been made in view of such circumstances, and its object is to provide a medical device that estimates a psychiatric/nervous-system disease with high accuracy.
- The present invention is an apparatus for estimating a psychiatric/nervous-system disease from voice data uttered by a subject, comprising an arithmetic processing device and an estimation program that causes the arithmetic processing device to execute processing.
- A calculation unit calculates a first acoustic parameter from the audio data obtained from the subject and calculates a feature amount based on a second acoustic parameter associated with a disease in advance, thereby calculating the subject's score.
- A detection unit sets a reference range based on the feature amount and detects a disease for which the score exceeds the reference range.
- An estimation unit estimates the psychiatric/nervous-system disease when the detection unit detects one or more diseases.
- According to the present invention, it is possible to provide a medical device that estimates a psychiatric/nervous-system disease with high accuracy.
- FIGS. 3 to 8 are explanatory diagrams of the second acoustic parameters.
- FIG. 9 is a diagram showing an example of scoring.
- FIGS. 10 to 12 are flowcharts of the present invention.
- FIG. 13 is an ROC curve showing the estimation accuracy of the present invention.
- FIGS. 15 and 16 are diagrams of the regression analysis of the present invention.
- FIG. 1 shows a configuration diagram of an estimation device 100 of the present invention.
- The arithmetic processing device 110 includes the functional units of a calculation unit 111, a detection unit 112, and an estimation unit 113.
- The estimation device 100 is connected to the communication terminal 200 via a wire or wirelessly.
- The communication terminal 200 includes an audio input unit 201, such as a microphone, and a video output unit 202 for displaying an estimation result. Note that the calculation unit 111, the detection unit 112, and the estimation unit 113 may be realized by hardware.
- FIG. 2 shows an embodiment in which the estimation device 100 is realized via the network NW.
- The estimation device 100 is realized by a server A, which has an arithmetic processing function and a recording function for recording the estimation program, and a database (DB) server B in which audio data classified by disease is stored.
- The server A may also perform the processing of the database (DB) server B by itself.
- The communication terminal 200 is connected to the server A via the network NW, and the server A is further connected to the database (DB) server B via a wire or wirelessly.
- Alternatively, the estimation device 100 may be realized by the communication terminal 200.
- In that case, the estimation program stored in the server A is downloaded via the network NW and recorded in the recording device 120 of the communication terminal 200.
- The communication terminal 200 may then function as the calculation unit 111, the detection unit 112, and the estimation unit 113 when the CPU of the communication terminal 200 executes the application recorded in its recording device 120.
- The estimation program may also be recorded on an optical disc such as a DVD or on a portable recording medium such as a USB memory and distributed.
- The communication terminal 200 is a device including an audio input unit 201 and a video output unit 202.
- It is, for example, a smartphone, a tablet terminal, or a notebook or desktop personal computer equipped with a microphone.
- The communication terminal 200 obtains the audio signal spoken by the subject via its microphone and generates digital audio data by sampling the audio signal at a predetermined sampling frequency (for example, 11 kHz).
- The generated audio data is transmitted to the estimation device 100.
- The communication terminal 200 also displays the result estimated by the estimation device 100 on a display serving as the video output unit 202.
- The display is, for example, an organic EL (organic electro-luminescence) display or a liquid crystal display.
- Alternatively, the microphone may be directly connected to the estimation device 100 via a wire or wirelessly.
- In that case, the estimation device 100 may sample the audio signal from the microphone at a predetermined sampling frequency to obtain digital audio data.
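As a concrete illustration of this acquisition step, the following is a minimal Python sketch (the use of numpy/scipy, the file name, and the resampling approach are assumptions, not part of the patent) of loading a recorded utterance and bringing it to the 11 kHz rate given above as an example.

```python
import numpy as np
from scipy.io import wavfile
from scipy.signal import resample_poly

TARGET_FS = 11_000  # example sampling frequency from the text (11 kHz)

def load_voice(path: str):
    """Read a WAV file, convert it to mono float, and resample to TARGET_FS."""
    fs, data = wavfile.read(path)              # original rate and PCM samples
    if data.ndim > 1:                          # average channels to mono
        data = data.mean(axis=1)
    data = data.astype(np.float64)
    peak = np.max(np.abs(data))
    if peak > 0:
        data /= peak                           # normalize sound pressure
    data = resample_poly(data, TARGET_FS, fs)  # polyphase resampling fs -> TARGET_FS
    return TARGET_FS, data

# usage (hypothetical file name):
# fs, x = load_voice("subject_utterance.wav")
```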
- FIG. 10 illustrates an example of the estimation process in the estimation device 100 illustrated in FIG. 1.
- The processing illustrated in FIG. 10 is realized by the arithmetic processing device 110 of the estimation device 100 executing the estimation program recorded in the recording device 120 of the estimation device 100.
- Each function of the calculation unit 111, the detection unit 112, and the estimation unit 113 of the arithmetic processing device 110 will be described with reference to FIG. 10.
- When the processing starts, in step S101 the calculation unit 111 determines whether or not audio data has been acquired.
- There are two types of audio data: one is the first audio data obtained from the target subject.
- The other is the second audio data obtained from the database (DB) server B or the like in FIG. 2.
- The second audio data is associated with each disease in advance.
- The second audio data may be recorded in advance in the recording device 120 of the estimation device 100 together with the estimation program.
- If the audio data has been acquired, the process proceeds to step S103. If the audio data has not yet been acquired, it is acquired via the communication terminal 200 and the database (DB) server B in step S102.
- In step S103, the calculation unit 111 calculates a first acoustic parameter and a second acoustic parameter from the two types of audio data obtained.
- An acoustic parameter is obtained by parameterizing a characteristic of the transmitted sound, and is used as a variable f(n) of the feature amount described later.
- The first acoustic parameter is calculated from the first audio data of the subject whose disease is to be estimated.
- The second acoustic parameter is calculated from the second audio data acquired from the database (DB) server B or the like. Since the second audio data is associated with each disease in advance, the calculated second acoustic parameters are likewise associated with each disease.
- The second acoustic parameter may be recorded in advance in the recording device 120 of the estimation device 100 together with the estimation program.
- The disease group that can be estimated using the estimation device 100, that is, the disease group associated in advance with the second audio data, includes Lewy body dementia, Alzheimer's dementia, Parkinson's disease, major depression, bipolar disorder, and atypical depression.
- However, the disease group is not limited to these.
- The acoustic parameters include the following items.
- One or more acoustic parameters to be used as the variables f(n) are selected from the above acoustic parameter items, and a coefficient is applied to each selected parameter to create the feature amount F(a). The acoustic parameters used are selected so as to be correlated with the particular disease to be estimated. After the user selects the variables f(n) and their coefficients, the estimation program may improve the quality of the feature amount by machine learning from information stored in a database or the like.
- The acoustic parameters may be normalized because their numerical values differ greatly in scale.
- The feature amount may be divided into two or more feature amounts.
- In step S104, the calculation unit 111 determines whether a linear model specific to the disease has been created. If a linear model has already been created, the process proceeds to step S106. If not, in step S105 a linear model is created based on the second acoustic parameters associated with each disease.
- A feature amount is then created based on the created linear model.
- The feature amount can be represented by the following equation F(a).
- The subject's score, used by the detection unit 112 in the next stage, is calculated from the first acoustic parameters based on the feature amount F(a).
- Here, f(n) denotes one or more second acoustic parameters arbitrarily selected from the above acoustic parameter items (1) to (11).
- xn is a disease-specific coefficient.
- f(n) and xn may be recorded in advance in the recording device 120 together with the estimation program. Further, the feature amount may be improved in the course of machine learning by the estimation program.
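Putting the above together, the feature amount and the scoring step can be sketched as follows in Python. This is a minimal illustration assuming, from the description of f(n) and xn, that F(a) is a weighted sum of selected acoustic parameters; the patent's exact expression for F(a) is not reproduced in this text, and the parameter names and coefficient values below are hypothetical.

```python
# Hypothetical disease-specific coefficients xn for selected acoustic
# parameters f(n); in practice they would be derived from the second
# audio data associated with each disease.
major_depression_coeffs = {
    "zero_crossing_rate": -1.8,
    "hurst_exponent": 2.3,
    "jitter": 0.7,
}
intercept = -0.5  # hypothetical constant term

def subject_score(first_acoustic_params: dict, coeffs: dict, bias: float) -> float:
    """Score of a subject: the disease-specific feature amount F(a),
    here a weighted sum, evaluated on the first acoustic parameters."""
    return bias + sum(w * first_acoustic_params[name] for name, w in coeffs.items())

# usage with hypothetical subject parameters:
params = {"zero_crossing_rate": 0.24, "hurst_exponent": 0.41, "jitter": 0.012}
score = subject_score(params, major_depression_coeffs, intercept)
```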
- The estimation program has a learning function using artificial intelligence and performs the estimation processing using that learning function.
- Neural-network-based deep learning may be used, reinforcement learning that selectively strengthens part of the learning field may be used, and other techniques such as genetic algorithms, cluster analysis, self-organizing maps, or ensemble learning may also be used.
- Other technologies related to artificial intelligence may also be used.
- In ensemble learning, a classification algorithm may be created by a technique combining boosting and decision trees.
- The feature amount may be divided into two or more feature amounts. For example, the following division is possible.
- FIG. 3 is an explanatory diagram relating to the volume envelope.
- The horizontal axis indicates time t, and the vertical axis indicates the normalized power spectrum density.
- The volume envelope consists of an attack time, a decay time, a sustain level, and a release time.
- The attack time ("Attack") is the time from the start of the sound until the volume reaches its maximum.
- The decay time ("Decay") is the time from when the sound is generated until it settles to a certain fixed volume (the sustain level).
- The release time ("Release") is the time it takes for the sound to disappear completely.
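As an illustration of how one element of the envelope could be measured, here is a minimal Python sketch (not from the patent; the frame length, the onset threshold, and the reading of "attack time" as onset-to-peak are assumptions) that estimates the attack time from a short-time power envelope.

```python
import numpy as np

def attack_time(x: np.ndarray, fs: int, frame: int = 256) -> float:
    """Rough attack time in seconds: from the first non-silent frame to the
    frame of maximum short-time power of the normalized envelope."""
    n_frames = len(x) // frame
    power = np.array([np.mean(x[i * frame:(i + 1) * frame] ** 2)
                      for i in range(n_frames)])
    if power.size == 0 or power.max() == 0:
        return 0.0
    power /= power.max()                    # normalized power envelope
    onset = int(np.argmax(power > 0.05))    # first frame above a small threshold
    peak = int(np.argmax(power))            # frame of maximum volume
    return max(peak - onset, 0) * frame / fs
```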
- FIG. 4 is an explanatory diagram relating to the wave information of the waveform.
- The horizontal axis indicates time t, and the vertical axis indicates sound pressure.
- The wave information of the waveform includes jitter and shimmer. Jitter indicates the disturbance of the period on the time axis, where Ti is the duration of one period, and can be described by the following equation.
- Shimmer indicates the disturbance of the amplitude of the sound pressure, where Ai is the sound pressure of one amplitude, and can be described by the following equation.
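The equations referred to above are not reproduced in this text. The commonly used period-perturbation and amplitude-perturbation definitions, to which the description appears to correspond, can be written as follows; these standard forms are an assumption and not necessarily the patent's exact expressions.

$$
\mathrm{Jitter} = \frac{\tfrac{1}{N-1}\sum_{i=1}^{N-1}\lvert T_i - T_{i+1}\rvert}{\tfrac{1}{N}\sum_{i=1}^{N} T_i},
\qquad
\mathrm{Shimmer} = \frac{\tfrac{1}{N-1}\sum_{i=1}^{N-1}\lvert A_i - A_{i+1}\rvert}{\tfrac{1}{N}\sum_{i=1}^{N} A_i}
$$

Here Ti is the duration of the i-th period, Ai is the peak sound pressure of the i-th period, and N is the number of periods in the analysis segment.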
- FIG. 5 is an explanatory diagram regarding the zero-crossing rate.
- The zero-crossing rate is the number of times per unit time that the sound-pressure waveform of the voice crosses the reference pressure, calculated as a measure of how rapidly the waveform of the voice changes. The zero-crossing rate is described later in detail.
- FIG. 6 is an explanatory diagram related to the Hurst exponent.
- The Hurst exponent indicates the correlation of changes in the audio waveform.
- The Hurst exponent is described later in detail.
- FIG. 7 is an explanatory diagram related to VOT (Voice Onset Time).
- VOT means the time from the start of air flow (Start of Voicing) until the vocal cords start vibrating (Stop Release), that is, the voice onset time.
- In FIG. 7, the horizontal axis indicates time t, and the vertical axis indicates sound pressure.
- FIG. 8 illustrates various statistics of the utterance data.
- The upper graph shows the sound intensity of each frequency component, with the horizontal axis representing time t and the vertical axis representing frequency.
- The level of the sound intensity is indicated by the shading of the color.
- The frequency region to be processed is trimmed from the upper graph, and the frequency spectrum at each point of the trimmed region is shown in the middle graph.
- Because the middle graph shows the frequency spectrum at each point on the time axis of the upper graph, darker parts of the upper graph correspond to higher sound intensity and lighter parts to lower sound intensity.
- The lower graph is obtained by analyzing the spectrum of the middle-stage frequency spectra, with the power spectrum density on the vertical axis and time on the horizontal axis.
- Statistics of the within-utterance distribution of a given mel-frequency cepstral coefficient (first quartile, median, third quartile, 95th percentile, arithmetic mean, geometric mean, difference between the third quartile and the median)
- Statistics of the within-utterance distribution of the rate of change of the frequency spectrum (first quartile, median, third quartile, 95th percentile, arithmetic mean, geometric mean, difference between the third quartile and the median)
- Statistics of the within-utterance distribution of the temporal change of a given mel-frequency cepstral coefficient (first quartile, median, third quartile, 95th percentile, arithmetic mean, geometric mean, difference between the third quartile and the median)
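These distribution statistics can be computed as in the following Python sketch (the use of the third-party librosa library, the number of coefficients, and the choice of coefficient index are assumptions for illustration).

```python
import numpy as np
import librosa  # third-party library; its use here is an assumption

def mfcc_distribution_stats(x: np.ndarray, fs: int, coef: int = 2) -> dict:
    """Within-utterance distribution statistics of one mel-frequency
    cepstral coefficient, as listed above."""
    series = librosa.feature.mfcc(y=x, sr=fs, n_mfcc=13)[coef]  # one MFCC over time
    q1, med, q3, p95 = np.percentile(series, [25, 50, 75, 95])
    stats = {
        "first_quartile": q1,
        "median": med,
        "third_quartile": q3,
        "95th_percentile": p95,
        "arithmetic_mean": float(np.mean(series)),
        "q3_minus_median": q3 - med,
    }
    # the geometric mean is only defined for positive values, so it is
    # reported only when every value of the series is positive
    if np.all(series > 0):
        stats["geometric_mean"] = float(np.exp(np.mean(np.log(series))))
    return stats
```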
- In step S106 in FIG. 10, the subject is scored after the feature amount has been set. Scoring is the process of calculating the subject's score based on the disease-specific feature amount F(a) and the first acoustic parameters.
- The subject's score obtained by scoring is transmitted to the detection unit 112 and the estimation unit 113.
- Next, the detection unit 112 determines whether or not a health reference range created based on the feature amount has been set.
- The health reference range is a region that distinguishes healthy subjects from subjects having an individual disease, based on a regression line created from the feature amount F(a).
- If the detection unit 112 determines in step S107 that the health reference range has been set, the process proceeds to step S109. If it determines that the health reference range has not been set, a health reference range is set based on the feature amount in step S108. The information on the reference range is transmitted to the estimation unit 113.
- In step S109, the detection unit 112 detects, from the subject's score calculated by the calculation unit 111, any disease for which the score exceeds the health reference range.
- In step S110, the detection unit 112 determines whether a plurality of diseases have been detected. If no disease, or only one disease, has been detected, the process proceeds to step S112.
- If two or more diseases have been detected, in step S111 the common terms and coefficients of the feature amounts of the detected diseases are compared to improve the feature amounts.
- The result of this feature-amount improvement may be output to the database (DB) server B or to the recording device 120 that records the estimation program, for use in machine learning.
- The improvement of the feature amounts may be compared and verified until a significant difference arises between the plurality of feature amounts. If the feature amounts of the detected diseases have a common term, the differences in the common terms may be compared first, and then the individual feature amounts may be compared.
- In addition to comparison by multiplication, the comparison method may be comparison by range calculation.
- The disease-specific feature amount may also be improved by comparing the disease-specific feature amounts and selecting the maximum value, or by adding them.
- When a plurality of detected diseases are confirmed to have a sufficient difference from the health reference range, the plurality of diseases may be detected as final candidates. Further, the improvement of the feature amount may be adjusted manually by the user.
- After the feature amount has been improved, the subject's score obtained in step S106 is recalculated if necessary. The improved feature amount and the recalculated score are transmitted to the estimation unit 113. After all processes in the detection unit 112 are completed, the process proceeds to step S112.
- In step S112, the estimation unit 113 estimates a disease from the feature amounts obtained by the calculation unit 111 and the detection unit 112 and from the subject's score based on those feature amounts.
- In step S113, the estimation unit 113 outputs the estimation result to the communication terminal 200.
- For the estimation, the disease with the largest difference between the subject's score and the health reference range may be selected.
- Alternatively, the scores for a plurality of diseases may be presented as shown in FIG. 9, and the final determination may be left to the user.
- The estimation unit 113 may also estimate the degree of health of the subject according to the distance between the subject's score calculated in step S106 and the boundary of the reference range set in step S108. The estimation unit 113 may then output information indicating the estimated health condition and degree of health of the subject to the communication terminal 200.
- The estimation device 100 then ends the estimation processing.
- The estimation device 100 repeatedly executes the processing from step S101 to step S113 each time it receives the subject's voice data from the communication terminal 200.
- When the information on the reference range is determined in advance by the estimation device 100 or an external computer device and recorded in the recording device 120 of the estimation device 100, steps S104, S105, S107, and S108 may be omitted.
- In this case, the calculation unit 111 calculates the subject's score based on the feature amount using the subject's voice data obtained from the communication terminal 200.
- The estimation unit 113 then estimates the health condition or disease of the subject based on a comparison between the calculated score of the subject and the reference range set by the detection unit 112.
- FIG. 13 shows an example of the results estimated in the above steps S101 to S113.
- FIG. 13 shows ROC curves indicating the performance in separating healthy persons, or persons with a specific disease, from all others.
- The horizontal axis indicates specificity, and the vertical axis indicates sensitivity; in other words, the horizontal axis corresponds to the false positive rate and the vertical axis to the true positive rate.
- The ROC curves in FIG. 13 all showed a high true positive rate at low false positive rates.
- The AUC (area under the ROC curve) also showed high values.
- Thus, the estimation device 100 can estimate a specific disease among a plurality of psychiatric and nervous-system diseases with high accuracy.
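The separation performance summarized by these curves can be quantified as in the following sketch (scikit-learn is an assumed third-party library, and the label and score arrays are purely hypothetical illustration data).

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# hypothetical data: label 1 = specific disease, label 0 = healthy / other
labels = np.array([1, 1, 0, 1, 0, 0, 1, 0])
scores = np.array([0.91, 0.74, 0.32, 0.66, 0.41, 0.18, 0.83, 0.52])

fpr, tpr, thresholds = roc_curve(labels, scores)  # false/true positive rate pairs
auc = roc_auc_score(labels, scores)               # area under the ROC curve
print(f"AUC = {auc:.2f}")
```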
- The calculation unit 111 calculates the zero-crossing rate as a measure of how rapidly the waveform of the voice changes. In addition, the calculation unit 111 calculates the Hurst exponent, which indicates the correlation between changes in the waveform of the voice. The calculation unit 111 outputs the calculated zero-crossing rate and Hurst exponent of the subject to the detection unit 112 and the estimation unit 113.
- In order to estimate the subject's health state from the zero-crossing rate and the Hurst exponent calculated by the calculation unit 111, the detection unit 112 sets a reference range for the health state indicating a healthy state free from a disease such as depression.
- The calculation unit 111 reads, from the database or from the recording device 120 of the estimation device 100, voice data of a plurality of persons whose health status, that is, whether or not they suffer from a disease such as depression, is known.
- From the read voice data, it calculates the second acoustic parameters, including the zero-crossing rate and the Hurst exponent of each person.
- The calculation unit 111 then performs a linear classification process, such as a linear discriminant or a logistic regression analysis, on the distribution of the zero-crossing rates and Hurst exponents of the plurality of persons in the two-dimensional space of zero-crossing rate and Hurst exponent, and creates a feature amount based on the resulting linear model.
- Based on the feature amount created by the calculation unit 111, the detection unit 112 sets a boundary line that separates the region of persons suffering from depression or the like from the reference range of healthy persons not suffering from depression or the like.
- The detection unit 112 outputs information indicating the reference range, including the determined boundary line, to the estimation unit 113.
- Note that the detection unit 112 may be omitted.
- The estimation unit 113 estimates the health state of the subject (for example, whether the subject is in a depressed state) based on the subject's zero-crossing rate and Hurst exponent score calculated by the calculation unit 111 and the reference range set by the detection unit 112. The estimation unit 113 then outputs information indicating the estimated health state to the communication terminal 200.
- FIG. 14 shows an example of the audio data obtained via the communication terminal 200 shown in FIG. 1.
- FIG. 14 shows the temporal change in the sound pressure of the voice uttered by the subject, obtained via the communication terminal 200.
- The horizontal axis in FIG. 14 indicates time t, and the vertical axis indicates sound pressure.
- FIG. 14 shows the data of the utterance unit in which "Arigatou" ("Thank you") was uttered, taken from the subject's voice data.
- Times t0, t1, t2, t3, and t4 indicate the start times at which the syllables "A", "RI", "GA", "TO", and "U" included in the utterance unit are uttered.
- The calculation process performed by the calculation unit 111 on the voice data of the syllable "RI" in the utterance unit "Arigatou" will be described; the same or similar calculation process is executed for the other syllables of the utterance unit.
- The calculation unit 111 calculates the zero-crossing rate and the Hurst exponent for each window WD of, for example, 512 samples, using the audio data acquired from the communication terminal 200. As shown in FIG. 14, the sound pressure changes greatly within the utterance of each syllable; therefore, to calculate the zero-crossing rate, the calculation unit 111 uses a window WD1 with a smaller number of samples than the window WD, such as 30, calculates the average sound pressure in each window WD1, and uses that average as the reference pressure of that window WD1. The calculation unit 111 counts the number of times the subject's sound pressure crosses the calculated reference pressure (average value) in each window WD1 and calculates the zero-crossing rate.
- The calculation unit 111 then calculates the average of the zero-crossing rates calculated in the windows WD1 as the zero-crossing rate ZCR of the window WD.
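The windowed zero-crossing-rate computation just described can be sketched in Python as follows. The window sizes 512 and 30 and the quarter-window shift (described a few paragraphs below) come from the text; the handling of partial windows and the sign convention are assumptions.

```python
import numpy as np

def zcr_of_window(window: np.ndarray, wd1: int = 30) -> float:
    """Zero-crossing rate of one analysis window WD: in each sub-window WD1
    the mean sound pressure is used as the reference pressure, crossings of
    that reference are counted, and the sub-window rates are averaged."""
    rates = []
    for start in range(0, len(window) - wd1 + 1, wd1):
        seg = window[start:start + wd1]
        ref = seg.mean()                                   # reference pressure of this WD1
        crossings = np.count_nonzero(np.diff(np.sign(seg - ref)))
        rates.append(crossings / wd1)
    return float(np.mean(rates)) if rates else 0.0

def subject_zcr(x: np.ndarray, wd: int = 512) -> float:
    """Average ZCR over windows WD shifted by a quarter of their width."""
    step = wd // 4
    values = [zcr_of_window(x[i:i + wd]) for i in range(0, len(x) - wd + 1, step)]
    return float(np.mean(values)) if values else 0.0
```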
- The standard deviation σ(τ) of the difference between the sound pressure x(t) at time t and the sound pressure x(t + τ) at a time separated from t by τ is related as shown in Expression (1). It is known that there is a power-law relationship, as shown in Expression (2), between the time interval τ and the standard deviation σ(τ); H in Expression (2) is the Hurst exponent.
- For white noise, the Hurst exponent H is "0" because there is no temporal correlation in the audio data. As the audio data changes from white noise toward pink noise or brown noise, that is, as the audio waveform acquires temporal correlation, the Hurst exponent H takes a value larger than "0".
- For brown noise, the Hurst exponent H is 0.5. As the audio data becomes more strongly correlated than brown noise, that is, as it depends more strongly on its past state, the Hurst exponent H takes a value between 0.5 and 1.
- The calculation unit 111 obtains the standard deviation σ(τ) of the audio data for each time interval τ from 1 to 15 and calculates the Hurst exponent H by performing a regression analysis on the obtained standard deviations σ(τ).
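This power-law fit can be sketched as follows in Python: σ(τ) is computed for τ = 1 to 15 and H is obtained as the slope of a log–log least-squares regression. Expressions (1) and (2) themselves are not reproduced in this text, so the exact regression form is an assumption.

```python
import numpy as np

def hurst_exponent(window: np.ndarray, max_tau: int = 15) -> float:
    """Hurst exponent H of one analysis window: sigma(tau) is the standard
    deviation of x(t + tau) - x(t) for tau = 1..max_tau, and H is the slope
    of log(sigma) versus log(tau) under the power law sigma(tau) ~ tau**H."""
    taus = np.arange(1, max_tau + 1)
    sigmas = np.array([np.std(window[tau:] - window[:-tau]) for tau in taus])
    sigmas = np.where(sigmas > 0, sigmas, 1e-12)   # guard against log(0)
    slope, _intercept = np.polyfit(np.log(taus), np.log(sigmas), 1)
    return float(slope)
```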
- The calculation unit 111 moves the window WD at a predetermined interval, such as a quarter of the width of the window WD, and calculates the zero-crossing rate ZCR and the Hurst exponent H in each window WD. The calculation unit 111 then averages the zero-crossing rates ZCR and Hurst exponents H of all the windows WD, and outputs the averaged zero-crossing rate ZCR and Hurst exponent H to the estimation unit 113 as the zero-crossing rate and the Hurst exponent of the subject PA.
- FIG. 15 shows an example of the distribution of the zero-crossing rate ZCR and the Hurst exponent H of a plurality of persons, calculated by the calculation unit 111 shown in FIG. 1.
- The vertical axis indicates the zero-crossing rate ZCR, and the horizontal axis indicates the Hurst exponent H.
- The zero-crossing rate ZCR and the Hurst exponent H of persons suffering from a disease such as depression are indicated by crosses, and those of healthy persons are indicated by circles.
- The distribution of the zero-crossing rate ZCR and the Hurst exponent H shown in FIG. 15 was generated using the voice data of 1,218 persons; of these, 697 have a disease such as depression and 521 are healthy.
- The calculation unit 111 executes a linear classification process, such as a linear discriminant or a logistic regression analysis, on the distribution of the zero-crossing rates ZCR and Hurst exponents H of the plurality of persons shown in FIG. 15.
- Based on the result, the detection unit 112 determines the boundary line, indicated by a broken line, that separates persons suffering from a disease such as depression from healthy persons.
- The detection unit 112 outputs information on the reference range, including the determined boundary line, to the estimation unit 113, taking the region below the broken boundary line as the reference range, and sets the reference range in the estimation unit 113.
- In FIG. 15, the vertical axis of the zero-crossing rate ZCR and the horizontal axis of the Hurst exponent H are linear axes. When the boundary indicated by the broken line is represented by an exponential function or a power function, it is preferable to use logarithmic axes so that the boundary line appears as a straight line.
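The linear classification that produces such a boundary can be sketched as follows (scikit-learn is an assumed third-party library, and the training arrays below are hypothetical illustration data, not the 1,218-person data set mentioned above).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# hypothetical training data: one row per person = (Hurst exponent H, ZCR)
X = np.array([[0.31, 0.22], [0.45, 0.15], [0.62, 0.09],   # disease such as depression (1)
              [0.28, 0.12], [0.50, 0.06], [0.70, 0.04]])  # healthy (0)
y = np.array([1, 1, 1, 0, 0, 0])

clf = LogisticRegression().fit(X, y)

# the fitted boundary is the line w1*H + w2*ZCR + b = 0; with this labeling,
# points with a negative decision function fall on the healthy side
w1, w2 = clf.coef_[0]
b = clf.intercept_[0]

def within_health_reference(hurst: float, zcr: float) -> bool:
    return clf.decision_function([[hurst, zcr]])[0] < 0
```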
- FIG. 16 shows an example of how the distribution of the zero-crossing rate ZCR and the Hurst exponent H depends on the environment in which the audio data is acquired.
- As in FIG. 15, the vertical axis indicates the zero-crossing rate ZCR and the horizontal axis indicates the Hurst exponent H.
- FIG. 16 also shows the boundary line determined by the detection unit 112 from the distribution of the zero-crossing rate ZCR and the Hurst exponent H shown in FIG. 15.
- FIG. 16 shows, as black triangles, the distribution of the zero-crossing rate ZCR and the Hurst exponent H calculated from voice data obtained by the communication terminal 200 sampling the subject's voice at a sampling frequency of 11 kHz.
- The communication terminal 200 then downsamples the voice data of the subject PA, sampled at 11 kHz, to a sampling frequency of 8 kHz.
- FIG. 16 shows, as white rectangles, the distribution of the zero-crossing rate ZCR and the Hurst exponent H calculated from the audio data downsampled to 8 kHz.
- The subject PA's zero-crossing rate ZCR and Hurst exponent H are affected by the deterioration of sound quality (increase in noise) caused by downsampling. That is, the zero-crossing rate ZCR of the downsampled audio data shows a larger value than that of the audio data sampled at 11 kHz, because the added noise increases the number of times the sound pressure of the voice crosses the reference pressure.
- The Hurst exponent H of the downsampled audio shows a smaller value than that of the audio data sampled at 11 kHz, because the increase in noise makes the audio data closer to white noise.
- Although the zero-crossing rate ZCR and the Hurst exponent H are thus affected by downsampling, they do not change independently of each other but change in a related manner. That is, as shown in FIG. 16, with respect to the deterioration of sound quality caused by downsampling or the like, the zero-crossing rate ZCR and the Hurst exponent H change along the boundary indicated by the broken line while remaining correlated with each other.
- Therefore, the estimation device 100 can estimate the health condition of the subject with higher accuracy than before, regardless of the environment in which the voice data is acquired.
- FIG. 11 shows an example of the estimation process in the estimation device 100 shown in FIG. 1.
- The processing illustrated in FIG. 11 is realized by the arithmetic processing device 110 of the estimation device 100 executing the estimation program recorded in the recording device 120 of the estimation device 100.
- In step S201, the calculation unit 111 determines whether or not audio data has been acquired.
- There are two types of audio data: one is the first audio data obtained from the target subject.
- The other is the second audio data obtained from the database (DB) server B or the like in FIG. 2.
- The second audio data is associated with major depression in advance.
- The second audio data may be recorded in advance in the recording device 120 of the estimation device 100 together with the estimation program.
- If the voice data has been acquired, the process proceeds to step S203. If the audio data has not yet been acquired, it is acquired via the communication terminal 200 and the database (DB) server B in step S202.
- In step S203, the calculation unit 111 calculates the first acoustic parameters and the second acoustic parameters, that is, the zero-crossing rate ZCR and the Hurst exponent H, from the two types of acquired audio data.
- The second acoustic parameters may be recorded in advance in the recording device 120 of the estimation device 100 together with the estimation program.
- In step S204, the calculation unit 111 determines whether or not a feature amount specific to the disease has been created. If the feature amount has already been created, the process proceeds to step S206. If not, in step S205 a feature amount is created based on the zero-crossing rate ZCR and the Hurst exponent H associated with major depression; specifically, a linear classification process such as a linear discriminant or a logistic regression analysis is performed on the distribution of the zero-crossing rate ZCR and the Hurst exponent H.
- In step S206, scoring of the subject is performed. Scoring is the process of calculating the subject's score based on the disease-specific feature amount and the first acoustic parameters.
- The subject's score obtained by scoring is transmitted to the detection unit 112 and the estimation unit 113.
- In step S207, the detection unit 112 determines whether or not a health reference range created based on the feature amount has been set.
- If the detection unit 112 determines in step S207 that the health reference range has been set, the process proceeds to step S209. If it determines that the health reference range has not been set, a health reference range is set based on the feature amount in step S208.
- In step S209, the detection unit 112 detects whether or not the subject's score relating to the zero-crossing rate ZCR and the Hurst exponent H, calculated by the calculation unit 111, is within the health reference range.
- In step S212, the estimation unit 113 estimates that the subject has major depression when the detection unit 112 finds that the subject's score exceeds the reference range.
- Otherwise, the estimation unit 113 estimates that the subject is healthy.
- The estimation unit 113 then outputs information indicating the estimated health condition of the subject to the communication terminal 200.
- The estimation unit 113 may also estimate the degree of health of the subject, for example, according to the distance between the subject's score relating to the zero-crossing rate ZCR and the Hurst exponent H obtained in step S206 and the boundary of the reference range set in step S208. The estimation unit 113 may then output information indicating the estimated health condition and degree of health of the subject to the communication terminal 200.
- The estimation device 100 then ends the estimation process.
- The estimation device 100 repeatedly executes the processing from step S201 to step S213 each time it receives the subject's voice data from the communication terminal 200.
- When the information on the reference range is determined in advance by the estimation device 100 or an external computer device and recorded in the recording device 120 of the estimation device 100, steps S204, S205, S207, and S208 may be omitted.
- In this case, the calculation unit 111 calculates the score of the feature amount relating to the subject's zero-crossing rate ZCR and Hurst exponent H using the subject's voice data obtained from the communication terminal 200.
- The estimation unit 113 then estimates the health state of the subject based on a comparison between the calculated position of the subject's zero-crossing rate ZCR and Hurst exponent H and the reference range set by the detection unit 112.
- As described above, the zero-crossing rate ZCR and the Hurst exponent H are affected by sound-quality deterioration due to downsampling or the like, but they do not change independently of each other; they change in a related manner. For this reason, sound-quality deterioration due to downsampling or the like does not affect the operation of the estimation unit 113, which determines whether or not the score relating to the subject's zero-crossing rate ZCR and Hurst exponent H falls within the reference range. That is, the estimation device 100 can estimate the health state of the subject with higher accuracy than before, regardless of the environment in which the voice data is acquired.
- Furthermore, the estimation device 100 can obtain the zero-crossing rate ZCR and the Hurst exponent H from voice data of a subject suffering from major depression or the like, from voice data containing long vowels, and so on. For this reason, the estimation device 100 can estimate the health state of the subject more accurately than the related art, which uses information indicating a correspondence between voice parameters and emotional states.
- The calculation unit 111 can, for example, use the voice waveform model represented by Expression (3) to create a feature amount based on the relationship between the zero-crossing rate ZCR and the Hurst exponent H, which change according to the proportion of noise contained in the voice, and set a boundary line of the reference range from that relationship.
- In Expression (3), x(t−1), x(t), and x(t+1) denote the audio data sampled at times t−1, t, and t+1.
- ε indicates the degree to which the audio data x(t) depends on the past state. For example, when ε is 0, the audio data x(t) takes independent values without depending on the past state, which corresponds to white noise.
- rand1 and rand2 denote uniform random numbers between 0 and 1.
- scale adjusts the amount of change in the waveform of the audio data x(t) according to the uniform random number rand1, and is set to a value such as 0.1 or 0.2.
- SIGN is the function shown in Expression (4), and determines the direction of change of the audio data x(t).
- λ adjusts the fluctuation of the audio data x(t) according to the uniform random number rand2 via the function SIGN. For example, when ε is set to 1 and λ is set to 0.5, the audio data x(t) reproduces a waveform similar to brown noise.
- The speech waveform model shown in Expression (3) is only an example and may be expressed using other functions.
- The calculation unit 111 varies ε from 0 to 1 in the voice waveform model of Expression (3), with the other parameter fixed at 1, generates the audio data x(t) for each value of ε, and calculates the corresponding zero-crossing rate ZCR and Hurst exponent H.
- The calculation unit 111 then performs a regression analysis process, such as the least squares method, on the distribution of the zero-crossing rate ZCR and the Hurst exponent H obtained at each value of ε.
- The calculation unit 111 determines the straight line passing through the zero-crossing rate ZCR and the Hurst exponent H of each value of ε as the boundary line.
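The least-squares determination of the boundary can be sketched as follows in Python. The sketch assumes that the per-ε pairs of Hurst exponent and zero-crossing rate have already been computed from the waveform model of Expression (3), which is not reproduced in this text; the numerical values below are purely hypothetical.

```python
import numpy as np

# hypothetical (Hurst exponent, ZCR) pairs, one per value of the parameter
# varied from 0 to 1 in the waveform model of Expression (3)
hurst = np.array([0.05, 0.15, 0.25, 0.35, 0.45, 0.55])
zcr = np.array([0.42, 0.33, 0.26, 0.20, 0.15, 0.11])

# least-squares straight line ZCR = a * H + b used as the boundary line
a, b = np.polyfit(hurst, zcr, 1)

def exceeds_reference(subject_hurst: float, subject_zcr: float) -> bool:
    """True if the subject's point lies above the boundary line, i.e. outside
    the health reference range below the line."""
    return subject_zcr > a * subject_hurst + b
```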
- In this way, the estimation device 100 can easily set the boundary of the reference range without having to acquire the voice data of a plurality of persons in order to determine it.
- The calculation unit 111 outputs information on the reference range, including the determined boundary line, to the estimation unit 113, and sets the reference range in the estimation unit 113.
- The calculation unit 111 may be omitted.
- FIG. 12 shows an example of the estimation process in the estimation device 100 shown in FIG. 1.
- The process shown in FIG. 12 is realized by the arithmetic processing device 110 of the estimation device 100 executing the estimation program recorded in the recording device 120 of the estimation device 100. That is, the process illustrated in FIG. 12 illustrates another embodiment of the estimation method and the estimation program.
- In step S301, the calculation unit 111 determines whether or not audio data has been acquired. If the audio data has been acquired, the process proceeds to step S303. If not, the audio data is acquired via the communication terminal 200 or the like in step S302.
- In step S303, the calculation unit 111 calculates the first acoustic parameters, that is, the zero-crossing rate ZCR and the Hurst exponent H, from the acquired audio data.
- In step S307, the calculation unit 111 determines whether a health reference range has been set. If the health reference range has been set, the calculation unit 111 proceeds to step S308a. If the reference range has not been set, the calculation unit 111 proceeds to step S308.
- In step S308, the calculation unit 111 varies ε from 0 to 1 in the voice waveform model of Expression (3), with the other parameter fixed at 1, generates the audio data x(t) for each value of ε, and calculates the corresponding zero-crossing rate ZCR and Hurst exponent H.
- The detection unit 112 then performs a regression analysis process, such as the least squares method, on the distribution of the zero-crossing rate ZCR and the Hurst exponent H obtained at each value of ε, and sets the straight line passing through the zero-crossing rate ZCR and the Hurst exponent H of each value of ε as the boundary line.
- In step S308a, the detection unit 112 outputs information on the reference range, including the boundary line set in step S308, to the estimation unit 113, and sets the reference range.
- Also in step S308a, the subject is scored.
- The scoring in the third embodiment uses the subject's first acoustic parameters, that is, the subject's zero-crossing rate ZCR and Hurst exponent H.
- The result of the scoring is output to the detection unit 112 and the estimation unit 113.
- In step S309, the detection unit 112 detects whether or not the subject's zero-crossing rate ZCR and Hurst exponent H calculated in step S308a are within the reference range set in step S308.
- In step S312, the estimation unit 113 estimates that the subject has major depression when the detection unit 112 finds that the subject's score exceeds the reference range.
- Otherwise, the estimation unit 113 estimates that the subject is healthy.
- The estimation unit 113 then outputs information indicating the estimated health condition of the subject to the communication terminal 200.
- The estimation unit 113 may also estimate the degree of health of the subject, for example, according to the distance between the subject's score relating to the zero-crossing rate ZCR and the Hurst exponent H calculated in step S308a and the boundary line of the reference range set in step S308. The estimation unit 113 may then output information indicating the estimated health condition and degree of health of the subject to the communication terminal 200.
- The estimation device 100 then ends the estimation process.
- The estimation device 100 repeatedly executes the processing from step S301 to step S313 each time the subject's voice data is received from the communication terminal 200.
- The calculation unit 111 calculates the subject's zero-crossing rate ZCR and Hurst exponent H using the subject's voice data obtained via the communication terminal 200.
- The estimation unit 113 estimates the health condition of the subject PA based on a comparison between the calculated position of the subject's zero-crossing rate ZCR and Hurst exponent H and the reference range set by the detection unit 112.
- As a result, the estimation device 100 can estimate the health state of the subject with higher accuracy than before, regardless of the environment in which the voice data is acquired.
- Moreover, the estimation device 100 can obtain the zero-crossing rate ZCR and the Hurst exponent H from voice data of a subject suffering from major depression or the like, from voice data containing long vowels, and so on. For this reason, the estimation device 100 can estimate the health state of the subject more accurately than the related art, which uses information indicating a correspondence between voice parameters and emotional states.
- The estimation device may be applied to, for example, robots, artificial intelligence, automobiles, call centers, the Internet, applications and services for mobile terminal devices such as smartphones and tablet terminals, and search systems. The estimation device may also be applied to a diagnosis device, an automatic medical interview device, disaster triage, and the like.
- Although the estimation device has mainly been described so far, the present invention may also be implemented as a method of operating a medical device that includes the estimation device as described above, as an estimation program that causes a computer to perform the same processing as the medical device, or as a non-transitory computer-readable recording medium on which the estimation program is recorded.
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Engineering & Computer Science (AREA)
- Public Health (AREA)
- Medical Informatics (AREA)
- Biomedical Technology (AREA)
- General Health & Medical Sciences (AREA)
- Pathology (AREA)
- Neurology (AREA)
- Veterinary Medicine (AREA)
- Heart & Thoracic Surgery (AREA)
- Physics & Mathematics (AREA)
- Molecular Biology (AREA)
- Surgery (AREA)
- Animal Behavior & Ethology (AREA)
- Biophysics (AREA)
- Psychiatry (AREA)
- Neurosurgery (AREA)
- Developmental Disabilities (AREA)
- Hospice & Palliative Care (AREA)
- Physiology (AREA)
- Psychology (AREA)
- Child & Adolescent Psychology (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Data Mining & Analysis (AREA)
- Primary Health Care (AREA)
- Epidemiology (AREA)
- Databases & Information Systems (AREA)
- Social Psychology (AREA)
- Educational Technology (AREA)
- Multimedia (AREA)
- Acoustics & Sound (AREA)
- Human Computer Interaction (AREA)
- Computational Linguistics (AREA)
- Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)
- Medical Treatment And Welfare Office Work (AREA)
Abstract
Description
This application claims priority from Japanese Patent Application No. 2018-133333 filed on July 13, 2018.
FIG. 10 shows an example of the estimation process in the estimation device 100 shown in FIG. 1. The process shown in FIG. 10 is realized by the arithmetic processing device 110 of the estimation device 100 executing the estimation program recorded in the recording device 120 of the estimation device 100. Each function of the calculation unit 111, the detection unit 112, and the estimation unit 113 of the arithmetic processing device 110 will be described with reference to FIG. 10.
When the process starts, in step S101 the calculation unit 111 determines whether or not audio data has been acquired. There are two types of audio data: one is the first audio data acquired from the target subject. The other is the second audio data acquired from the database (DB) server B or the like in FIG. 2. The second audio data is associated with each disease in advance. The second audio data may be recorded in advance in the recording device 120 of the estimation device 100 together with the estimation program.
(1. Volume envelope)
FIG. 4 is an explanatory diagram relating to the wave information of the waveform. The horizontal axis indicates time t, and the vertical axis indicates sound pressure.
FIG. 5 is an explanatory diagram regarding the zero-crossing rate. The zero-crossing rate is the number of times per unit time that the sound-pressure waveform of the voice crosses the reference pressure, calculated as a measure of how rapidly the waveform of the voice changes. The zero-crossing rate is described later in detail.
FIG. 6 is an explanatory diagram relating to the Hurst exponent. The Hurst exponent indicates the correlation of changes in the voice waveform. The Hurst exponent is described later in detail.
FIG. 7 is an explanatory diagram relating to VOT (Voice Onset Time). VOT means the time from the start of air flow (Start of Voicing) until the vocal cords start vibrating (Stop Release), that is, the voice onset time. In FIG. 7, the horizontal axis indicates time t, and the vertical axis indicates sound pressure.
FIG. 8 is a set of explanatory diagrams of statistics within the utterance data. The upper graph shows the sound intensity of each frequency component, with the horizontal axis representing time t and the vertical axis representing frequency. In the upper graph, the level of the sound intensity is indicated by the shading of the color. The frequency region to be processed is trimmed from the upper graph, and the frequency spectrum at each point of the trimmed region is shown in the middle graph.
Next, in step S107, the detection unit 112 determines whether or not a health reference range created based on the feature amount has been set. The health reference range is a region that distinguishes healthy subjects from subjects having an individual disease, based on the regression line created from the feature amount F(a).
Next, in step S112, the estimation unit 113 estimates a disease from the feature amounts obtained by the calculation unit 111 and the detection unit 112 and from the subject's score based on those feature amounts.
Next, an embodiment in which the zero-crossing rate and the Hurst exponent are selected as the second acoustic parameters is described in detail.
In the estimation device 100 shown in FIG. 1, the calculation unit 111 can, for example, use the voice waveform model represented by Expression (3) to create a feature amount based on the relationship between the zero-crossing rate ZCR and the Hurst exponent H, which change according to the proportion of noise contained in the voice, and set a boundary line of the reference range.
112 … Detection unit
113 … Estimation unit
100 … Estimation device
200 … Communication terminal
Claims (5)
- 1. An apparatus for estimating a psychiatric/nervous-system disease from voice data uttered by a subject, comprising an arithmetic processing device and a recording device that records an estimation program for causing the arithmetic processing device to execute processing, the apparatus comprising:
a calculation unit that calculates a first acoustic parameter from the voice data acquired from the subject, calculates a feature amount from a second acoustic parameter associated in advance with a disease, and thereby calculates a score of the subject;
a detection unit that sets a reference range based on the feature amount and detects a disease for which the score exceeds the reference range; and
an estimation unit that estimates the psychiatric/nervous-system disease when the detection unit detects one or more diseases.
- 2. The apparatus according to claim 1, wherein one or more candidates for the psychiatric/nervous-system disease are selected from the group consisting of Alzheimer's dementia, Lewy body dementia, Parkinson's disease, major depression, atypical depression, and bipolar disorder, and the second acoustic parameter is correlated with the selected disease candidates.
- 3. The apparatus according to claim 1 or 2, wherein, when one or fewer diseases are detected as exceeding the reference range, the detection operation is terminated, and when two or more diseases are detected as exceeding the reference range, the feature amounts of the detected diseases are compared with one another to improve the feature amounts.
- 4. A recording medium on which an estimation program for operating the medical device according to any one of claims 1 to 3 is recorded.
- 5. A method of operating a medical device for estimating a psychiatric/nervous-system disease from voice data uttered by a subject, the medical device comprising an arithmetic processing device and a recording device that records an estimation program for causing the arithmetic processing device to execute processing, the method comprising:
a step in which a calculation unit of the arithmetic processing device calculates a first acoustic parameter from the voice data acquired from the subject, calculates a feature amount based on a second acoustic parameter associated in advance with a disease, and calculates a score of the subject;
a step in which a detection unit of the arithmetic processing device sets a health reference range based on the feature amount and detects a disease for which the score exceeds the reference range; and
a step in which an estimation unit of the arithmetic processing device estimates the psychiatric/nervous-system disease when the detection unit detects one or more diseases.
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB2100152.4A GB2590201B (en) | 2018-07-13 | 2019-07-11 | Apparatus for estimating mental/neurological disease |
SG11202100147VA SG11202100147VA (en) | 2018-07-13 | 2019-07-11 | Apparatus for estimating mental/neurological disease |
JP2020530269A JP7389421B2 (ja) | 2018-07-13 | 2019-07-11 | 精神・神経系疾患を推定する装置 |
US17/258,948 US12029579B2 (en) | 2018-07-13 | 2019-07-11 | Apparatus for estimating mental/neurological disease |
JP2023190849A JP7563683B2 (ja) | 2018-07-13 | 2023-11-08 | 精神・神経系疾患を推定する装置 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2018-133333 | 2018-07-13 | ||
JP2018133333 | 2018-07-13 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020013296A1 true WO2020013296A1 (ja) | 2020-01-16 |
Family
ID=69143045
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2019/027587 WO2020013296A1 (ja) | 2018-07-13 | 2019-07-11 | 精神・神経系疾患を推定する装置 |
Country Status (5)
Country | Link |
---|---|
US (1) | US12029579B2 (ja) |
JP (2) | JP7389421B2 (ja) |
GB (1) | GB2590201B (ja) |
SG (1) | SG11202100147VA (ja) |
WO (1) | WO2020013296A1 (ja) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6854554B1 (ja) * | 2020-06-11 | 2021-04-07 | Pst株式会社 | 情報処理装置、情報処理方法、情報処理システム、及び情報処理プログラム |
WO2021220646A1 (ja) * | 2020-04-28 | 2021-11-04 | Pst株式会社 | 情報処理装置、方法、及びプログラム |
JP2023533331A (ja) * | 2020-07-10 | 2023-08-02 | イモコグ カンパニー リミテッド | 音声特性に基づくアルツハイマー病予測方法及び装置 |
WO2024116254A1 (ja) * | 2022-11-28 | 2024-06-06 | Pst株式会社 | 情報処理装置、情報処理方法、情報処理システム、及び情報処理プログラム |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101958188B1 (ko) * | 2018-10-12 | 2019-03-14 | 신성대학 산학협력단 | 음성 분석을 기반으로 하는 뇌졸중 판단 시스템 및 그 방법 |
WO2020128542A1 (en) * | 2018-12-18 | 2020-06-25 | Szegedi Tudományegyetem | Automatic detection of neurocognitive impairment based on a speech sample |
WO2020163645A1 (en) * | 2019-02-06 | 2020-08-13 | Daniel Glasner | Biomarker identification |
US11232570B2 (en) | 2020-02-13 | 2022-01-25 | Olympus Corporation | System and method for diagnosing severity of gastritis |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030078768A1 (en) * | 2000-10-06 | 2003-04-24 | Silverman Stephen E. | Method for analysis of vocal jitter for near-term suicidal risk assessment |
JP2011255106A (ja) * | 2010-06-11 | 2011-12-22 | Nagoya Institute Of Technology | 認知機能障害危険度算出装置、認知機能障害危険度算出システム、及びプログラム |
WO2015168606A1 (en) * | 2014-05-02 | 2015-11-05 | The Regents Of The University Of Michigan | Mood monitoring of bipolar disorder using speech analysis |
WO2017138376A1 (ja) * | 2016-02-09 | 2017-08-17 | Pst株式会社 | 推定方法、推定プログラム、推定装置および推定システム |
JP2017532082A (ja) * | 2014-08-22 | 2017-11-02 | エスアールアイ インターナショナルSRI International | 患者の精神状態のスピーチベース評価のためのシステム |
US20170354363A1 (en) * | 2011-08-02 | 2017-12-14 | Massachusetts Institute Of Technology | Phonologically-based biomarkers for major depressive disorder |
JP6337362B1 (ja) * | 2017-11-02 | 2018-06-06 | パナソニックIpマネジメント株式会社 | 認知機能評価装置、及び、認知機能評価システム |
JP2018121749A (ja) * | 2017-01-30 | 2018-08-09 | 株式会社リコー | 診断装置、プログラム及び診断システム |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS56114912A (en) * | 1980-02-18 | 1981-09-09 | Nippon Telegr & Teleph Corp <Ntt> | Manufacture of connector for optical fiber |
JPS6337362U (ja) | 1986-08-28 | 1988-03-10 | ||
CN101199002B (zh) | 2005-06-09 | 2011-09-07 | 株式会社A.G.I. | 检测音调频率的语音分析器和语音分析方法 |
CA2689848A1 (en) * | 2006-02-28 | 2007-09-07 | Phenomenome Discoveries Inc. | Methods for the diagnosis of dementia and other neurological disorders |
AU2010357179A1 (en) * | 2010-07-06 | 2013-02-14 | Rmit University | Emotional and/or psychiatric state detection |
US10276260B2 (en) * | 2012-08-16 | 2019-04-30 | Ginger.io, Inc. | Method for providing therapy to an individual |
US10293160B2 (en) * | 2013-01-15 | 2019-05-21 | Electrocore, Inc. | Mobile phone for treating a patient with dementia |
KR20230161532A (ko) * | 2015-11-24 | 2023-11-27 | 메사추세츠 인스티튜트 오브 테크놀로지 | 치매를 예방, 경감 및/또는 치료하기 위한 시스템 및 방법 |
US11504038B2 (en) * | 2016-02-12 | 2022-11-22 | Newton Howard | Early detection of neurodegenerative disease |
EP3711680A4 (en) * | 2017-11-14 | 2021-08-18 | Osaka University | COGNITIVE MALFUNCTION DIAGNOSIS AND COGNITIVE MALFUNCTION DIAGNOSIS PROGRAM |
JP6667907B2 (ja) * | 2018-06-28 | 2020-03-18 | 株式会社アルム | 認知症診断装置、および認知症診断システム |
EP3821815A4 (en) * | 2018-07-13 | 2021-12-29 | Life Science Institute, Inc. | Mental/nervous system disorder estimation system, estimation program, and estimation method |
KR20220009954A (ko) * | 2019-04-17 | 2022-01-25 | 컴퍼스 패쓰파인더 리미티드 | 신경인지 장애, 만성 통증을 치료하고 염증을 감소시키는 방법 |
-
2019
- 2019-07-11 WO PCT/JP2019/027587 patent/WO2020013296A1/ja active Application Filing
- 2019-07-11 JP JP2020530269A patent/JP7389421B2/ja active Active
- 2019-07-11 SG SG11202100147VA patent/SG11202100147VA/en unknown
- 2019-07-11 US US17/258,948 patent/US12029579B2/en active Active
- 2019-07-11 GB GB2100152.4A patent/GB2590201B/en active Active
-
2023
- 2023-11-08 JP JP2023190849A patent/JP7563683B2/ja active Active
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030078768A1 (en) * | 2000-10-06 | 2003-04-24 | Silverman Stephen E. | Method for analysis of vocal jitter for near-term suicidal risk assessment |
JP2011255106A (ja) * | 2010-06-11 | 2011-12-22 | Nagoya Institute Of Technology | 認知機能障害危険度算出装置、認知機能障害危険度算出システム、及びプログラム |
US20170354363A1 (en) * | 2011-08-02 | 2017-12-14 | Massachusetts Institute Of Technology | Phonologically-based biomarkers for major depressive disorder |
WO2015168606A1 (en) * | 2014-05-02 | 2015-11-05 | The Regents Of The University Of Michigan | Mood monitoring of bipolar disorder using speech analysis |
JP2017532082A (ja) * | 2014-08-22 | 2017-11-02 | エスアールアイ インターナショナルSRI International | 患者の精神状態のスピーチベース評価のためのシステム |
WO2017138376A1 (ja) * | 2016-02-09 | 2017-08-17 | Pst株式会社 | 推定方法、推定プログラム、推定装置および推定システム |
JP2018121749A (ja) * | 2017-01-30 | 2018-08-09 | 株式会社リコー | 診断装置、プログラム及び診断システム |
JP6337362B1 (ja) * | 2017-11-02 | 2018-06-06 | パナソニックIpマネジメント株式会社 | 認知機能評価装置、及び、認知機能評価システム |
Non-Patent Citations (2)
Title |
---|
2018 CBEES-BBS BALI, INDONESIA CONFERENCE ABSTRACT, 23 April 2018 (2018-04-23), pages 1 - 7 , 16, 48, Retrieved from the Internet <URL:http://www.icpps.org/ICPPS2018-program.pdf> [retrieved on 20190917] * |
HIGUCHI, M. ET AL.: "CLASSIFICATION OF BIPOLAR DISORDER, MAJOR DEPRESSIVE DISORDER, AND HEALTHY STATE USING VOICE", ASIAN JOURNAL OF PHARMACEUTICAL AND CLINICAL RESEARCH, vol. 11, no. 15, October 2018 (2018-10-01), pages 89 - 93, XP055674862, Retrieved from the Internet <URL:http://dx.doi.org/10.22159/ajpcr.2018.vlls3.30042> * |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021220646A1 (ja) * | 2020-04-28 | 2021-11-04 | Pst株式会社 | 情報処理装置、方法、及びプログラム |
JPWO2021220646A1 (ja) * | 2020-04-28 | 2021-11-04 | ||
EP4144302A1 (en) * | 2020-04-28 | 2023-03-08 | PST Inc. | Information processing device, method, and program |
JP7466131B2 (ja) | 2020-04-28 | 2024-04-12 | Pst株式会社 | 情報処理装置、方法、及びプログラム |
EP4144302A4 (en) * | 2020-04-28 | 2024-05-29 | PST Inc. | INFORMATION PROCESSING DEVICE, METHOD AND PROGRAM |
JP6854554B1 (ja) * | 2020-06-11 | 2021-04-07 | Pst株式会社 | 情報処理装置、情報処理方法、情報処理システム、及び情報処理プログラム |
WO2021250854A1 (ja) | 2020-06-11 | 2021-12-16 | Pst株式会社 | 情報処理装置、情報処理方法、情報処理システム、及び情報処理プログラム |
JP2021194527A (ja) * | 2020-06-11 | 2021-12-27 | Pst株式会社 | 情報処理装置、情報処理方法、情報処理システム、及び情報処理プログラム |
JP7430398B2 (ja) | 2020-06-11 | 2024-02-13 | Pst株式会社 | 情報処理装置、情報処理方法、情報処理システム、及び情報処理プログラム |
JP2023533331A (ja) * | 2020-07-10 | 2023-08-02 | イモコグ カンパニー リミテッド | 音声特性に基づくアルツハイマー病予測方法及び装置 |
WO2024116254A1 (ja) * | 2022-11-28 | 2024-06-06 | Pst株式会社 | 情報処理装置、情報処理方法、情報処理システム、及び情報処理プログラム |
Also Published As
Publication number | Publication date |
---|---|
GB2590201A8 (en) | 2021-07-28 |
JPWO2020013296A1 (ja) | 2021-08-05 |
JP7389421B2 (ja) | 2023-11-30 |
JP7563683B2 (ja) | 2024-10-08 |
US20210121125A1 (en) | 2021-04-29 |
GB2590201A (en) | 2021-06-23 |
US12029579B2 (en) | 2024-07-09 |
JP2024020321A (ja) | 2024-02-14 |
SG11202100147VA (en) | 2021-02-25 |
GB2590201B (en) | 2022-09-21 |
GB202100152D0 (en) | 2021-02-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2020013296A1 (ja) | 精神・神経系疾患を推定する装置 | |
US9536525B2 (en) | Speaker indexing device and speaker indexing method | |
JP6755304B2 (ja) | 情報処理装置 | |
JP5024154B2 (ja) | 関連付け装置、関連付け方法及びコンピュータプログラム | |
JP3584458B2 (ja) | パターン認識装置およびパターン認識方法 | |
US20160086622A1 (en) | Speech processing device, speech processing method, and computer program product | |
US20190279644A1 (en) | Speech processing device, speech processing method, and recording medium | |
JP3298858B2 (ja) | 低複雑性スピーチ認識器の区分ベースの類似性方法 | |
CN110914897B (zh) | 语音识别系统和语音识别装置 | |
CN114127849A (zh) | 语音情感识别方法和装置 | |
JP7160095B2 (ja) | 属性識別装置、属性識別方法、およびプログラム | |
JP5803125B2 (ja) | 音声による抑圧状態検出装置およびプログラム | |
TW201721631A (zh) | 聲音辨識裝置、聲音強調裝置、聲音辨識方法、聲音強調方法以及導航系統 | |
CN111862946B (zh) | 一种订单处理方法、装置、电子设备及存储介质 | |
CN114155882B (zh) | 一种基于语音识别的“路怒”情绪判断方法和装置 | |
JP7307507B2 (ja) | 病態解析システム、病態解析装置、病態解析方法、及び病態解析プログラム | |
JP7159655B2 (ja) | 感情推定システムおよびプログラム | |
JP6933335B2 (ja) | 推定方法、推定プログラムおよび推定装置 | |
Perera et al. | Automatic Evaluation Software for Contact Centre Agents’ voice Handling Performance | |
JP2022114906A (ja) | 心理状態管理装置 | |
CN110364182B (zh) | 一种声音信号处理方法及装置 | |
US20240071409A1 (en) | Aerosol quantity estimation method, aerosol quantity estimation device, and recording medium | |
CN117352008A (zh) | 语音监测方法、装置、电子设备及可读存储介质 | |
Chapaneri et al. | Emotion recognition from speech using Teager based DSCC features | |
JP4970371B2 (ja) | 情報処理装置 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 19834426 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 202100152 Country of ref document: GB Kind code of ref document: A Free format text: PCT FILING DATE = 20190711 |
|
ENP | Entry into the national phase |
Ref document number: 2020530269 Country of ref document: JP Kind code of ref document: A |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 19834426 Country of ref document: EP Kind code of ref document: A1 |