CN112998709A - Depression degree detection method using audio data - Google Patents
- Publication number
- CN112998709A (application CN202110212777.4A)
- Authority
- CN
- China
- Prior art keywords
- audio data
- network model
- layer
- degree
- sample
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/16—Devices for psychotechnics; Testing reaction times ; Devices for evaluating the psychological state
- A61B5/165—Evaluating the state of mind, e.g. depression, anxiety
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/48—Other medical applications
- A61B5/4803—Speech analysis specially adapted for diagnostic purposes
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/72—Signal processing specially adapted for physiological signals or for diagnostic purposes
- A61B5/7235—Details of waveform analysis
- A61B5/7264—Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
Abstract
The invention discloses a depression degree detection method using audio data, which comprises the following steps: 1) acquiring a plurality of audio data samples; 2) extracting the features of each audio data sample; 3) obtaining global features from the features extracted in step 2), then segmenting each audio data sample and obtaining the global features of each resulting segment; 4) training a deep convolutional network model on the per-segment global features obtained in step 3), and then detecting the depression degree of the subject using the trained deep convolutional network model.
Description
Technical Field
The invention relates to a depression degree detection method, and in particular to a depression degree detection method using audio data.
Background
At present, the clinical diagnostic standards for depression include the diagnostic criteria of the WHO's International Statistical Classification of Diseases and Related Health Problems, 10th edition (ICD-10), the American Psychiatric Association's Diagnostic and Statistical Manual of Mental Disorders, 4th edition (DSM-IV), the Chinese Classification of Mental Disorders, 3rd edition (CCMD-3), and Chinese standards for integrated traditional Chinese and Western medicine syndrome differentiation and typing of mental diseases. Most current medical diagnosis of depression is performed by professional physicians according to these medically established diagnostic criteria, through interviews with suspected patients and health questionnaires. However, this diagnostic approach is highly subjective and inflexible, and its accuracy in grading the degree of depression is low.
Disclosure of Invention
The present invention is directed to overcoming the above-mentioned disadvantages of the prior art by providing a depression degree detection method using audio data that can detect the degree of depression more accurately.
In order to achieve the above object, the method for detecting a degree of depression using audio data according to the present invention comprises the steps of:
1) acquiring a plurality of audio data samples;
2) extracting the features of each audio data sample;
3) obtaining global features from the features extracted in step 2), then segmenting each audio data sample and obtaining the global features of each resulting segment;
4) training a deep convolutional network model on the per-segment global features obtained in step 3), and then detecting the depression degree of the subject using the trained deep convolutional network model.
The specific operation of step 2) is as follows:
feature extraction is performed on each audio data sample using COVAREP, with one feature vector acquired every 10 milliseconds; the extracted features comprise F0, VUV, NAQ, QOQ, H1H2, PSP, MDQ, peakSlope, Rd_conf, MCEP_0-24, HMPDM_0-24 and HMPDD_0-12.
The specific operation of step 3) is as follows:
the 10 ms frame-level values of all features of each audio data sample are averaged, and the averaged result is taken as the global feature of that sample; each audio data sample is then divided into 100 parts, and the global feature of each segment is obtained in the same way.
The deep convolutional network model comprises a male deep convolutional network model and a female deep convolutional network model.
The male deep convolutional network model comprises a first convolutional layer, a first pooling layer, a second convolutional layer, a second pooling layer, a third convolutional layer, a third pooling layer and a fully-connected layer.
The female deep convolutional network model comprises a first convolutional layer, a first pooling layer, a second convolutional layer, a second pooling layer and a fully-connected layer.
The invention has the following beneficial effects:
In operation, the method for detecting the degree of depression using audio data only needs to acquire the audio data of the subject; the subject's audio data is then input into the trained deep convolutional network model, which judges the subject's degree of depression.
Drawings
FIG. 1 is a schematic diagram of a deep convolutional network model for males;
FIG. 2 is a schematic diagram of the female deep convolutional network model.
Detailed Description
The invention is described in further detail below with reference to the accompanying drawings:
the depression degree detection method using audio data according to the present invention includes the steps of:
1) Acquiring a plurality of audio data samples: audio data of a conversation with the user is acquired first; the acquired audio data is then denoised and the interviewer's speech spectrum is removed, so that only the user's audio remains, and the user's audio data is used as an audio data sample.
2) Extracting the features of each audio data sample.
Specifically, COVAREP is used to extract features from each audio data sample, with one feature vector acquired every 10 milliseconds; the extracted features comprise F0, VUV, NAQ, QOQ, H1H2, PSP, MDQ, peakSlope, Rd_conf, MCEP_0-24, HMPDM_0-24 and HMPDD_0-12, for a total of 73 feature dimensions per frame.
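The frame bookkeeping implied above can be sketched as follows. This is a minimal illustration only: the 10 ms hop and the 73-dimensional feature vector come from the text, while the 60-second recording length is an assumed example value.

```python
# Sketch: how many 10 ms feature frames a recording yields.
# Assumption (not from the patent): a 60-second example recording.

HOP_MS = 10          # one feature vector every 10 milliseconds (from the text)
N_FEATURES = 73      # F0, VUV, ..., HMPDD_0-12 -> 73 dimensions (from the text)

def n_frames(duration_s: float, hop_ms: int = HOP_MS) -> int:
    """Number of feature frames for a recording of the given duration."""
    return int(duration_s * 1000) // hop_ms

frames = n_frames(60.0)                     # hypothetical 60 s sample
feature_matrix_shape = (frames, N_FEATURES)
print(feature_matrix_shape)                 # (6000, 73)
```

Each sample is therefore represented as a frames-by-73 matrix before the averaging and segmentation of step 3).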
3) Obtaining global features from the features extracted in step 2), then segmenting each audio data sample and obtaining the global features of each resulting segment.
Specifically, the 10 ms frame-level values of all features of each audio data sample are averaged, and the averaged result is taken as the global feature of that sample. Because the audio duration of each sample differs, the amount of audio data per sample also differs; therefore, to obtain a uniform data format and to collect more training data, each audio data sample is divided into 100 parts, and the global feature of each segment is obtained.
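The averaging and 100-way segmentation described above can be sketched with NumPy. This is an illustrative reading of the text, not the patent's own code; the frame count (6000, i.e. a hypothetical 60 s sample) and the random values are example assumptions.

```python
import numpy as np

# Frame-level feature matrix: one 73-dim vector per 10 ms frame.
# Assumed example: 6000 frames (a hypothetical 60 s sample) x 73 features.
rng = np.random.default_rng(0)
frames = rng.normal(size=(6000, 73))

# Global feature of the whole sample: per-feature mean over all frames.
global_feature = frames.mean(axis=0)        # shape (73,)

# Split the sample into 100 parts and take each part's per-feature mean,
# giving one segment-level global feature per part.
segments = np.array_split(frames, 100, axis=0)
segment_features = np.stack([s.mean(axis=0) for s in segments])  # (100, 73)

print(global_feature.shape, segment_features.shape)
```

Because the segments here are equally sized, the mean of the 100 segment features recovers the whole-sample global feature, which makes the uniform 100-by-73 format a lossless-on-average summary of samples of any duration.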
4) Training the deep convolutional network model on the per-segment global features obtained in step 3), and then detecting the depression degree of the subject using the trained deep convolutional network model.
Because male and female voices differ greatly in pitch, timbre and other audio properties, applying the same model to both would introduce large errors. Separate networks are therefore used to model men and women, and the depression degrees of men and women are predicted separately; that is, the deep convolutional network models comprise a male deep convolutional network model and a female deep convolutional network model.
The male deep convolutional network model comprises a first convolutional layer, a first pooling layer, a second convolutional layer, a second pooling layer, a third convolutional layer, a third pooling layer and a fully-connected layer.
The female deep convolutional network model comprises a first convolutional layer, a first pooling layer, a second convolutional layer, a second pooling layer and a fully-connected layer.
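The patent gives each model's layer sequence but not its kernel sizes, strides or channel counts. The sketch below walks a 100-segment input (the per-segment global features of step 3) through the two layer stacks under assumed hyperparameters, purely to illustrate the difference in depth between the male (three conv+pool blocks) and female (two conv+pool blocks) models; the kernel size of 3, stride of 1, no padding and 2x pooling are illustrative assumptions, not values from the patent.

```python
# Sequence-length walkthrough for the male (3x conv+pool) and female
# (2x conv+pool) models before their fully-connected layers.
# Assumed hyperparameters: 1-D convolution over the 100 segments,
# kernel size 3, stride 1, no padding, then non-overlapping 2x pooling.

def conv1d_len(n: int, kernel: int = 3) -> int:
    """Output length of a 1-D convolution (stride 1, no padding)."""
    return n - kernel + 1

def pool_len(n: int, size: int = 2) -> int:
    """Output length of non-overlapping pooling."""
    return n // size

def model_output_len(n: int, conv_pool_blocks: int) -> int:
    """Sequence length after the given number of conv+pool blocks,
    i.e. the length fed to the fully-connected layer."""
    for _ in range(conv_pool_blocks):
        n = pool_len(conv1d_len(n))
    return n

male_len = model_output_len(100, conv_pool_blocks=3)    # male: 3 blocks
female_len = model_output_len(100, conv_pool_blocks=2)  # female: 2 blocks
print(male_len, female_len)
```

Under these assumptions the male model compresses the 100 segments to a length of 10 and the female model to 23 before the fully-connected layer; actual values depend on the unstated hyperparameters.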
Claims (6)
1. A depression degree detection method using audio data, comprising the steps of:
1) acquiring a plurality of audio data samples;
2) extracting the features of each audio data sample;
3) obtaining global features from the features extracted in step 2), then segmenting each audio data sample and obtaining the global features of each resulting segment;
4) training a deep convolutional network model on the per-segment global features obtained in step 3), and then detecting the depression degree of the subject using the trained deep convolutional network model.
2. The method for detecting a degree of depression using audio data according to claim 1, wherein the specific operation of step 2) is:
performing feature extraction on each audio data sample using COVAREP, with one feature vector acquired every 10 milliseconds, the extracted features comprising F0, VUV, NAQ, QOQ, H1H2, PSP, MDQ, peakSlope, Rd_conf, MCEP_0-24, HMPDM_0-24 and HMPDD_0-12.
3. The method for detecting a degree of depression using audio data according to claim 1, wherein the specific operation of step 3) is:
averaging the 10 ms frame-level values of all features of each audio data sample, taking the averaged result as the global feature of that sample, then dividing each audio data sample into 100 parts and obtaining the global feature of each segment.
4. The method of detecting a degree of depression using audio data according to claim 1, wherein the deep convolutional network model includes a deep convolutional network model for males and a deep convolutional network model for females.
5. The method for detecting a degree of depression using audio data according to claim 1, wherein the male deep convolutional network model comprises a first convolutional layer, a first pooling layer, a second convolutional layer, a second pooling layer, a third convolutional layer, a third pooling layer and a fully-connected layer.
6. The method for detecting a degree of depression using audio data according to claim 1, wherein the female deep convolutional network model comprises a first convolutional layer, a first pooling layer, a second convolutional layer, a second pooling layer and a fully-connected layer.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110212777.4A CN112998709A (en) | 2021-02-25 | 2021-02-25 | Depression degree detection method using audio data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112998709A (en) | 2021-06-22 |
Family
ID=76386008
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110212777.4A Pending CN112998709A (en) | 2021-02-25 | 2021-02-25 | Depression degree detection method using audio data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112998709A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2012003523A1 (en) * | 2010-07-06 | 2012-01-12 | Rmit University | Emotional and/or psychiatric state detection |
CN107704549A (en) * | 2017-09-26 | 2018-02-16 | 百度在线网络技术(北京)有限公司 | Voice search method, device and computer equipment |
CN109599129A (en) * | 2018-11-13 | 2019-04-09 | 杭州电子科技大学 | Voice depression recognition methods based on attention mechanism and convolutional neural networks |
Non-Patent Citations (2)
Title |
---|
Li Jinming et al.: "Audio-based depression recognition using deep learning", Computer Applications and Software (《计算机应用与软件》) * |
码农家园 (blog): "The DAIC-WOZ dataset" * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210622 |