CN112998709A - Depression degree detection method using audio data - Google Patents

Depression degree detection method using audio data

Info

Publication number
CN112998709A
Authority
CN
China
Prior art keywords
audio data
network model
layer
degree
sample
Prior art date
Legal status
Pending
Application number
CN202110212777.4A
Other languages
Chinese (zh)
Inventor
乔亚男
杨帆
罗丹
王珊
薄钧戈
黄程
黄鑫
房琛琛
Current Assignee
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date
Filing date
Publication date
Application filed by Xian Jiaotong University
Priority to CN202110212777.4A
Publication of CN112998709A
Legal status: Pending

Classifications

    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B: DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B 5/00: Measuring for diagnostic purposes; identification of persons
    • A61B 5/16: Devices for psychotechnics; testing reaction times; devices for evaluating the psychological state
    • A61B 5/165: Evaluating the state of mind, e.g. depression, anxiety
    • A61B 5/48: Other medical applications
    • A61B 5/4803: Speech analysis specially adapted for diagnostic purposes
    • A61B 5/72: Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B 5/7235: Details of waveform analysis
    • A61B 5/7264: Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems

Abstract

The invention discloses a depression degree detection method using audio data, comprising the following steps: 1) acquiring a plurality of audio data samples; 2) extracting features from each audio data sample; 3) computing global features from the features extracted in step 2), then segmenting each audio data sample and computing the global features of each segment; 4) training a deep convolutional network model on the per-segment global features obtained in step 3), and then detecting the depression degree of a subject using the trained deep convolutional network model.

Description

Depression degree detection method using audio data
Technical Field
The invention relates to a depression degree detection method, in particular to a depression degree detection method using audio data.
Background
At present, clinical diagnostic standards for depression include the diagnostic criteria of the WHO's International Statistical Classification of Diseases and Related Health Problems, 10th edition (ICD-10); the American Psychiatric Association's Diagnostic and Statistical Manual of Mental Disorders, 4th edition (DSM-IV); the Chinese Classification of Mental Disorders, 3rd edition (CCMD-3); and the diagnostic standards for syndrome differentiation and typing of mental diseases in integrated traditional Chinese and Western medicine. Most current medical diagnosis of depression is performed by professional physicians who follow these published diagnostic criteria, interview the suspected patient, and administer health questionnaires. However, this diagnostic approach is highly subjective and inflexible, and its accuracy in grading the degree of depression is low.
Disclosure of Invention
The present invention is directed to overcoming the above-mentioned disadvantages of the prior art by providing a depression degree detection method using audio data, which can detect the degree of depression more accurately.
In order to achieve the above object, the method for detecting a degree of depression using audio data according to the present invention comprises the steps of:
1) acquiring a plurality of audio data samples;
2) extracting the characteristics of each audio data sample;
3) acquiring global features according to the features extracted in the step 2), and then segmenting each audio data sample to acquire the global features of each segment of samples obtained by segmentation;
4) training the deep convolutional network model on the per-segment global features obtained in step 3), and then detecting the depression degree of the subject to be detected using the trained deep convolutional network model.
The specific operation of the step 2) is as follows:
performing feature extraction on each audio data sample using COVAREP, with each feature sampled once every 10 milliseconds, wherein the extracted features comprise F0, VUV, NAQ, QOQ, H1H2, PSP, MDQ, peakSlope, Rd, Rd_conf, MCEP_0-24, HMPDM_0-24 and HMPDD_0-12.
The specific operation of the step 3) is as follows:
averaging the 10 ms frame values of all features of each audio data sample and taking the averaged result as the global feature of that audio data sample, then dividing each audio data sample into 100 parts and obtaining the global feature of each segment in the same way.
The deep convolutional network model comprises a male deep convolutional network model and a female deep convolutional network model.
The male deep convolutional network model comprises a first convolutional layer, a first pooling layer, a second convolutional layer, a second pooling layer, a third convolutional layer, a third pooling layer and a fully-connected layer.
The female deep convolutional network model comprises a first convolutional layer, a first pooling layer, a second convolutional layer, a second pooling layer and a fully-connected layer.
The invention has the following beneficial effects:
In operation, the method for detecting the degree of depression using audio data only needs to acquire audio data from the subject; the audio data is then fed into the trained deep convolutional network model, which judges the subject's degree of depression.
Drawings
FIG. 1 is a schematic diagram of the male deep convolutional network model;
FIG. 2 is a schematic diagram of the female deep convolutional network model.
Detailed Description
The invention is described in further detail below with reference to the accompanying drawings:
the depression degree detection method using audio data according to the present invention includes the steps of:
1) Acquiring a plurality of audio data samples: audio of a conversation with the user is recorded first; the recorded audio is then denoised and the interviewer's spectrum is removed, so that only the user's own audio remains, and the user's audio is taken as the audio data sample.
2) Extracting the characteristics of each audio data sample;
Specifically, COVAREP is used to extract features from each audio data sample, with each feature sampled once every 10 milliseconds. The extracted features comprise F0, VUV, NAQ, QOQ, H1H2, PSP, MDQ, peakSlope, Rd, Rd_conf, MCEP_0-24, HMPDM_0-24 and HMPDD_0-12, i.e. 13 feature types with 73 feature dimensions in total.
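As a sanity check on the counts above, the 13 feature types can be expanded into their per-frame dimensions. The dimension table below is an assumption based on the standard COVAREP feature layout (with Rd listed alongside Rd_conf), not stated verbatim in the patent:

```python
# Sketch: expand the COVAREP-style feature types from step 2) into their
# per-frame dimensions. The counts are assumptions based on the standard
# COVAREP feature layout, not specified by the patent itself.
FEATURE_DIMS = {
    "F0": 1, "VUV": 1, "NAQ": 1, "QOQ": 1, "H1H2": 1, "PSP": 1,
    "MDQ": 1, "peakSlope": 1, "Rd": 1, "Rd_conf": 1,
    "MCEP_0-24": 25,   # mel cepstral coefficients 0..24
    "HMPDM_0-24": 25,  # harmonic model phase distortion means 0..24
    "HMPDD_0-12": 13,  # harmonic model phase distortion deviations 0..12
}

def total_dims(dims: dict) -> int:
    """Total feature dimensions extracted per 10 ms frame."""
    return sum(dims.values())

print(len(FEATURE_DIMS), total_dims(FEATURE_DIMS))  # 13 feature types, 73 dims
```

Under these assumed per-type dimensions, the 13 types do sum to the 73 dimensions claimed in the text.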
3) Acquiring global features according to the features extracted in the step 2), and then segmenting each audio data sample to acquire the global features of each segment of samples obtained by segmentation;
Specifically, the 10 ms frame values of all features of each audio data sample are averaged, and the averaged result is taken as the global feature of that sample. Because the audio duration of each sample differs, the amount of audio data per sample also differs; to satisfy a uniform data format and to obtain more training data, each audio data sample is divided into 100 parts and the global feature of each segment is computed.
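The averaging and 100-way segmentation described above can be sketched as follows; the array shapes, frame count, and function name are illustrative assumptions, since the patent does not give an implementation:

```python
import numpy as np

def segment_global_features(frames: np.ndarray, n_segments: int = 100):
    """Sketch of step 3): `frames` is a (num_frames, 73) array of 10 ms
    feature frames for one recording (shape assumed, not specified in
    the patent). Returns the whole-sample global feature vector and one
    global feature vector per segment."""
    # Whole-sample global feature: mean of every feature over all frames.
    global_feat = frames.mean(axis=0)
    # Split along time into roughly equal segments (np.array_split
    # tolerates lengths not divisible by n_segments), then average each.
    segments = np.array_split(frames, n_segments, axis=0)
    segment_feats = np.stack([seg.mean(axis=0) for seg in segments])
    return global_feat, segment_feats

# Example: a 60-second recording -> 6000 frames at 10 ms each.
frames = np.random.rand(6000, 73)
g, s = segment_global_features(frames)
print(g.shape, s.shape)  # (73,) (100, 73)
```

This yields a uniform (100, 73) representation per sample regardless of recording length, which is what lets recordings of different durations share one network input format.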
4) Training the deep convolutional network model on the per-segment global features obtained in step 3), and then detecting the depression degree of the subject to be detected using the trained deep convolutional network model.
Because male and female voices differ greatly in pitch, timbre and other respects, applying a single strategy to both would introduce large errors. Separate networks are therefore used to model men and women, and the degree of depression is predicted separately for each; that is, the deep convolutional network model comprises a male deep convolutional network model and a female deep convolutional network model.
The male deep convolutional network model comprises a first convolutional layer, a first pooling layer, a second convolutional layer, a second pooling layer, a third convolutional layer, a third pooling layer and a fully-connected layer.
The female deep convolutional network model comprises a first convolutional layer, a first pooling layer, a second convolutional layer, a second pooling layer and a fully-connected layer.
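The patent specifies only the layer order of the two models, not kernel sizes, channel counts, strides, or the input length. The sketch below encodes the two stacks as data and walks a 1-D feature-map length through them, assuming same-padded convolutions, stride-2 pooling, and the 100-segment input from step 3); all of those numeric choices are illustrative assumptions:

```python
# Layer order comes from the patent; everything numeric here
# (input length 100, pool factor 2, same-padded convs) is assumed.
MALE_MODEL = ["conv", "pool", "conv", "pool", "conv", "pool", "fc"]
FEMALE_MODEL = ["conv", "pool", "conv", "pool", "fc"]

def output_length(layers, length=100, pool=2):
    """Walk a 1-D feature-map length through the stack: same-padded
    convolutions keep the length, each pooling layer halves it, and the
    fully-connected layer consumes the flattened map to emit one score."""
    for layer in layers:
        if layer == "pool":
            length //= pool
    return length

print(output_length(MALE_MODEL))    # 100 -> 50 -> 25 -> 12 before the fc layer
print(output_length(FEMALE_MODEL))  # 100 -> 50 -> 25 before the fc layer
```

The extra conv/pool pair in the male model gives it one more halving of the temporal dimension; the patent does not explain the size difference beyond the male/female audio differences noted above.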

Claims (6)

1. A depression degree detection method using audio data, comprising the steps of:
1) acquiring a plurality of audio data samples;
2) extracting the characteristics of each audio data sample;
3) acquiring global features according to the features extracted in the step 2), and then segmenting each audio data sample to acquire the global features of each segment of samples obtained by segmentation;
4) training the deep convolutional network model on the per-segment global features obtained in step 3), and then detecting the depression degree of the subject to be detected using the trained deep convolutional network model.
2. The method for detecting a degree of depression using audio data according to claim 1, wherein the specific operation of step 2) is:
performing feature extraction on each audio data sample using COVAREP, with each feature sampled once every 10 milliseconds, wherein the extracted features comprise F0, VUV, NAQ, QOQ, H1H2, PSP, MDQ, peakSlope, Rd, Rd_conf, MCEP_0-24, HMPDM_0-24 and HMPDD_0-12.
3. The method of detecting a degree of depression using audio data according to claim 1, wherein the specific operation of step 3) is:
averaging the 10 ms data of all features of each audio data sample, taking the averaged result as the global feature of the audio data sample, and dividing each audio data sample into 100 parts to obtain the global feature of each segment.
4. The method of detecting a degree of depression using audio data according to claim 1, wherein the deep convolutional network model includes a deep convolutional network model for males and a deep convolutional network model for females.
5. The method of detecting depression degree using audio data according to claim 1, wherein the male deep convolutional network model includes a first convolutional layer, a first pooling layer, a second convolutional layer, a second pooling layer, a third convolutional layer, a third pooling layer, and a fully-connected layer.
6. The method of detecting depression degree using audio data according to claim 1, wherein the female deep convolutional network model includes a first convolutional layer, a first pooling layer, a second convolutional layer, a second pooling layer, and a fully-connected layer.
CN202110212777.4A 2021-02-25 2021-02-25 Depression degree detection method using audio data Pending CN112998709A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110212777.4A CN112998709A (en) 2021-02-25 2021-02-25 Depression degree detection method using audio data

Publications (1)

Publication Number Publication Date
CN112998709A 2021-06-22

Family

ID=76386008

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110212777.4A (CN112998709A, Pending) 2021-02-25 2021-02-25 Depression degree detection method using audio data

Country Status (1)

Country Link
CN (1) CN112998709A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012003523A1 (en) * 2010-07-06 2012-01-12 Rmit University Emotional and/or psychiatric state detection
CN107704549A (en) * 2017-09-26 2018-02-16 百度在线网络技术(北京)有限公司 Voice search method, device and computer equipment
CN109599129A (en) * 2018-11-13 2019-04-09 杭州电子科技大学 Voice depression recognition methods based on attention mechanism and convolutional neural networks


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
李金鸣 et al.: "Audio depression recognition based on deep learning" (基于深度学习的音频抑郁症识别), 《计算机应用与软件》 (Computer Applications and Software) *
码农家园: "DAIC-WOZ dataset" (DAIC-WOZ数据集), 《码农家园》 (blog) *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210622