CN112908435A - Depression cognitive behavior training system and voice data processing method

Info

Publication number
CN112908435A
Authority
CN
China
Prior art keywords
training
user
cognitive behavior
stage
training stage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110119287.XA
Other languages
Chinese (zh)
Other versions
CN112908435B (en)
Inventor
张锡哲
王菲
张然
方翰铮
王洋
谷迁乔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Brain Hospital
Original Assignee
Nanjing Brain Hospital
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Brain Hospital
Priority to CN202110119287.XA
Publication of CN112908435A
Application granted
Publication of CN112908435B
Active legal status
Anticipated expiration

Classifications

    • G - PHYSICS
        • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
            • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
                • G10L 25/00 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
                    • G10L 25/03 - characterised by the type of extracted parameters
                        • G10L 25/24 - the extracted parameters being the cepstrum
                    • G10L 25/27 - characterised by the analysis technique
                        • G10L 25/30 - using neural networks
                    • G10L 25/48 - specially adapted for particular use
                        • G10L 25/51 - for comparison or discrimination
                            • G10L 25/63 - for estimating an emotional state
        • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
            • G16H - HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
                • G16H 10/00 - ICT specially adapted for the handling or processing of patient-related medical or healthcare data
                    • G16H 10/20 - for electronic clinical trials or questionnaires
                • G16H 20/00 - ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
                    • G16H 20/70 - relating to mental therapies, e.g. psychological therapy or autogenous training
                • G16H 50/00 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
                    • G16H 50/20 - for computer-aided diagnosis, e.g. based on medical expert systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Primary Health Care (AREA)
  • Signal Processing (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • Epidemiology (AREA)
  • Child & Adolescent Psychology (AREA)
  • Hospice & Palliative Care (AREA)
  • Psychiatry (AREA)
  • Biomedical Technology (AREA)
  • Developmental Disabilities (AREA)
  • Psychology (AREA)
  • Social Psychology (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The invention provides a depression cognitive behavior training system comprising: a login module for acquiring a user's login request information and logging the user in; a cognitive behavior training module for carrying out the user's cognitive behavior training in a training interface, after a successful login, according to the training stage the user is currently in and the training content corresponding to that stage; and an emotion recording module for entering an audio recording page before each training stage starts and after it finishes, and analyzing the voice data recorded by the user to obtain the user's depression scale score. The data fed back by the user are collected comprehensively during training, with strong objectivity and high accuracy. The invention also provides a voice data processing method that can accurately identify changes in the user's emotional state.

Description

Depression cognitive behavior training system and voice data processing method
Technical Field
The invention relates to the technical field of medical instruments, and in particular to a depression cognitive behavior training system and a voice data processing method.
Background
With the rapid development of digital services and the internet, psychotherapy has become more accessible both geographically and economically.
Computerized cognitive behavioral therapy delivers its interventional content to a user through an application or a network. Many computerized cognitive behavioral therapy programs have been developed at home and abroad, and clinical trials have been conducted for various mental and physical disorders; most of these programs deliver cognitive behavioral therapy over a network or on a computer. It is now generally accepted that computerized cognitive behavioral therapy is equivalent to face-to-face psychotherapy in alleviating symptoms of depression and anxiety. Because it demands less of a therapist's time and energy, can treat multiple people simultaneously, is not constrained by geography, and has efficacy confirmed by clinical trials, computerized cognitive behavioral therapy has become one of the established intervention modes in the field of psychotherapy.
The core symptoms of depression are depressed mood, loss of pleasure, and lack of interest. These also alter motor control and produce psychomotor retardation, which affects the characteristics of the sound source and the vocal tract and manifests in marked external acoustic features such as reduced volume, slowed speech, and noticeably prolonged pauses. Studies have further shown that as depressive symptoms worsen, the range of vocal variation shrinks and the pitch trajectory becomes smoother. Analysis of acoustic features can therefore identify changes in an individual's emotional state and support an objective evaluation of mental health and treatment efficacy.
Most existing computerized cognitive behavior intervention systems rely mainly on narration and questionnaires; with such single-mode interaction, the data fed back by the user during treatment are not comprehensive. In the registration and treatment stage, existing computerized cognitive behavioral therapy typically opens a treatment account so the user can log in to a treatment webpage; after login the user enters the treatment pages provided by the system to complete the corresponding treatment; and after each session, or weekly, the therapist collects feedback by interview or email. In the efficacy evaluation stage, the therapist obtains treatment feedback mainly through rating scales or patient statements; this feedback is single in form and highly subjective.
Disclosure of Invention
Technical problem to be solved
The present invention aims to address, at least in part, the above problems in the art. Accordingly, a first object of the invention is to provide a depression cognitive behavior training system in which the data fed back by the user during training are comprehensive, strongly objective, and highly accurate.
A second object of the present invention is to provide a speech data processing method capable of accurately recognizing a change in emotion of a user.
(II) technical scheme
In order to achieve the above objects, a first aspect of the present invention provides a depression cognitive behavior training system, including:
a login module for acquiring a user's login request information and logging the user in;
a cognitive behavior training module for carrying out the user's cognitive behavior training in a training interface, after a successful login, according to the training stage the user is currently in and the training content corresponding to that stage;
and an emotion recording module for entering an audio recording page before each training stage starts and after it finishes, and analyzing the voice data recorded by the user to obtain the user's depression scale score.
Further, the depression cognitive behavior training system further includes:
a selection module for entering, after login, a page introducing the depression cognitive behavior training so the user can select it;
a verification module for verifying, after the depression cognitive behavior training is selected, whether the user has been granted access to the cognitive behavior training module;
correspondingly, the cognitive behavior training module is used for carrying out the user's cognitive behavior training in the training interface, after successful verification, according to the training stage the user is currently in and the corresponding training content.
Further, the cognitive behavior training module is specifically configured to:
when the user is currently in the first training stage, carry out the user's cognitive behavior training in a training interface according to the training content for understanding depression, and advance to the second training stage on completion;
when the user is currently in the second training stage, carry out the training according to the training content for understanding computerized cognitive behavior training, and advance to the third training stage on completion;
when the user is currently in the third training stage, carry out the training according to the first training content for recognizing automatic thoughts, and advance to the fourth training stage on completion;
when the user is currently in the fourth training stage, carry out the training according to the second training content for recognizing automatic thoughts, and advance to the fifth training stage on completion;
when the user is currently in the fifth training stage, carry out the training according to the first training content for correcting automatic thoughts, and advance to the sixth training stage on completion;
when the user is currently in the sixth training stage, carry out the training according to the second training content for correcting automatic thoughts, and advance to the seventh training stage on completion;
when the user is currently in the seventh training stage, carry out the training in a question-answer form based on the contents of the first six stages, and advance to the eighth training stage on completion;
when the user is currently in the eighth training stage, carry out the training according to the training content for recognizing and correcting intermediate beliefs, and advance to the ninth training stage on completion;
when the user is currently in the ninth training stage, carry out the training according to the training content of the problem-solving method, and advance to the tenth training stage on completion;
when the user is currently in the tenth training stage, carry out the training according to the first training content for recognizing and correcting core beliefs, and advance to the eleventh training stage on completion;
when the user is currently in the eleventh training stage, carry out the training according to the second training content for recognizing and correcting core beliefs, and advance to the twelfth training stage on completion;
and when the user is currently in the twelfth training stage, carry out the training according to the training content for consolidation and relapse prevention.
Further, the emotion recording module is specifically configured to:
acquire the voice data recorded by the user and preprocess it;
extract features from the preprocessed voice data at preset sampling intervals with an open source speech algorithm library to obtain a first low-level feature set; calculate the resonance frequencies, local amplitude root mean square, and first and second derivatives of the Mel cepstrum coefficients of the voice data to obtain a second low-level feature set;
calculate the maximum, minimum, median, mean, variance, kurtosis, skewness, linear regression slope, linear regression intercept, and linear regression goodness of fit of each low-level feature to obtain a high-level feature set;
perform a Pearson correlation test between the user's depression scale score and the high-level feature set to screen out the relevant high-level features;
and input the relevant high-level features into a pre-trained neural network model to obtain the user's depression scale score.
Further, the emotion recording module is also configured to analyze the change in the user's emotional state from the depression scale scores obtained before the current training stage began and after it finished, and to determine whether the user proceeds to the next training stage.
Further, the depression cognitive behavior training system further includes: a homework module for entering a homework page after a preset training stage is completed, and recording and storing the homework content completed by the user.
A second aspect of the present invention provides a voice data processing method, including:
S1, acquiring the voice data recorded by a user and preprocessing it;
S2, extracting features from the preprocessed voice data at preset sampling intervals with an open source speech algorithm library to obtain a first low-level feature set; calculating the resonance frequencies, local amplitude root mean square, and first and second derivatives of the Mel cepstrum coefficients of the voice data to obtain a second low-level feature set;
calculating the maximum, minimum, median, mean, variance, kurtosis, skewness, linear regression slope, linear regression intercept, and linear regression goodness of fit of each low-level feature to obtain a high-level feature set;
performing a Pearson correlation test between the user's depression scale score and the high-level feature set to screen out the relevant high-level features;
and S3, inputting the relevant high-level features into a pre-trained neural network model to obtain the user's depression scale score.
Further, in S1, preprocessing the voice data includes: transcoding the voice data into a mono wav-format voice file with a sampling frequency of 16 kHz.
Further, in S2, performing the Pearson correlation test between the user's depression scale score and the high-level feature set to screen out the relevant high-level features includes: performing a Pearson correlation test between the user's depression scale scores and the high-level feature set to obtain a correlation coefficient between each high-level feature and the scores; performing a statistical test on the set of correlation coefficients to obtain a P value for each high-level feature; and screening out the high-level features whose P value is below a preset value as the relevant high-level features.
Further, in S2, calculating the resonance frequencies of the voice data includes: calculating the resonance frequencies of the voice data with a linear predictive coding acoustic model;
the linear predictive coding acoustic model is:
$$s(k) = \sum_{i=1}^{p} a_i \, s(k-i) + e(k)$$
where $s(k)$ is the speech signal, $p$ is the order of the filter, $a_i$ are the coefficients of the filter, and $e(k)$ is the random noise of the unvoiced excitation.
(III) advantageous effects
The invention has the following beneficial effects:
1. The depression cognitive behavior training system provided by the invention evaluates the user's depressive state from three aspects (acoustic features, scale feedback, and homework) through the emotion recording module and the homework module. These rich evaluation means yield more comprehensive user feedback data during training, with strong objectivity and high accuracy, and avoid the subjectivity of manual evaluation.
2. In the voice data processing method provided by the embodiment of the invention, the standard cepstral parameters (MFCC) reflect the static characteristics of the speech parameters, while their difference spectra (the first and second derivatives of the MFCC) reflect the dynamic characteristics. Combining the dynamic and static characteristics as the input of the prediction model markedly improves the accuracy of recognizing changes in an individual's emotional state.
Drawings
The invention is described with the aid of the following figures:
fig. 1 is a schematic structural diagram of a cognitive behavior training system for depression according to an embodiment of the present invention;
fig. 2 is a flowchart of a cognitive behavior training module for cognitive behavior training according to an embodiment of the present invention;
FIG. 3 is a flow chart of a method of processing voice data according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a neural network model according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a verification scatter plot according to an embodiment of the present invention.
[Description of reference numerals]
1: login module;
2: selection module;
3: verification module;
4: cognitive behavior training module;
5: emotion recording module;
6: homework module.
Detailed Description
For the purpose of better explaining the present invention and to facilitate understanding, the present invention will be described in detail by way of specific embodiments with reference to the accompanying drawings.
The depression cognitive behavior training system provided by the embodiment of the invention includes an emotion recording module that acquires the voice data recorded by the user before each training stage starts and after it finishes, and analyzes it to obtain the user's depression scale score. The data fed back by the user are thus collected comprehensively during training, with strong objectivity and high accuracy.
In order to better understand the above technical solutions, exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
A depressive cognitive behavior training system proposed according to an embodiment of the present invention will be described with reference to the accompanying drawings.
Fig. 1 is a schematic structural diagram of a cognitive behavior training system for depression according to an embodiment of the present invention.
As shown in fig. 1, the cognitive behavior training system for depression comprises a login module 1, a selection module 2, a verification module 3, a cognitive behavior training module 4 and an emotion recording module 5.
The login module 1 is used for acquiring a user's login request information and logging the user in. The selection module 2 is used for entering, after login, a page introducing the depression cognitive behavior training so the user can select it. The verification module 3 is used for verifying, after the depression cognitive behavior training is selected, whether the user has been granted access to the cognitive behavior training module. The cognitive behavior training module 4 is used for carrying out the user's cognitive behavior training in the training interface, after successful verification, according to the training stage the user is currently in and the corresponding training content. The emotion recording module 5 is used for entering an audio recording page before each training stage starts and after it finishes, and analyzing the voice data recorded by the user to obtain the user's depression scale score.
Before the depression cognitive behavior training system provided by the embodiment of the invention is used, the user either registers autonomously through the client or has an account created by a therapist in the system background.
Further, as shown in fig. 2, the cognitive behavior training module 4 is specifically configured to:
when the training stage in which the user is currently positioned is the first training stage, the cognitive behavior training of the user is realized in the training interface according to the training content for identifying the depression, and the user enters the second training stage after the cognitive behavior training is completed. In the first training phase, the user is made aware of what is depression, which is characteristic of depression. Specifically, according to the training content for identifying the depression, cognitive behavior training for the user is realized in a training interface, and the method comprises the following steps: and recording and storing the depression characteristic expression selected by the user in the training interface according to the condition of the user.
When the training stage in which the user is currently located is a second training stage, the cognitive behavior training of the user is realized in a training interface according to the training content of cognitive Computerized Cognitive Behavior Training (CCBT), and the training stage enters a third training stage after the cognitive behavior training is completed. In the second training phase, the user is made aware of what the CCBT is, how it functions.
When the training stage in which the user is currently positioned is a third training stage, according to the first training content for recognizing the automatic thinking, the cognitive behavior training of the user is realized in the training interface, and the training stage enters a fourth training stage after the cognitive behavior training is completed. In a third training phase, the user is made to target treatment, learn what the automatic thinking is, and experience the relationship between automatic thinking and emotion. Specifically, according to a first training content for recognizing automatic thinking, the cognitive behavior training of a user is realized in a training interface, and the method comprises the following steps: and recording and storing the training targets input by the user through the training interface.
And when the training stage in which the user is currently positioned is a fourth training stage, according to the second training content for recognizing the automatic thinking, the cognitive behavior training of the user is realized in the training interface, and the training stage enters a fifth training stage after the cognitive behavior training is completed. In the fourth training phase, the user is made aware of the ten classical automatic thoughts, understands the meaning of each automatic thought, and looks at the automatic thoughts in another perspective.
When the training stage in which the user is currently positioned is a fifth training stage, according to the first training content for correcting the automatic thinking, the cognitive behavior training of the user is realized in the training interface, and the training stage enters a sixth training stage after the cognitive behavior training is completed. In the fifth training phase, the user is made to learn the use scenario of the mind replacement method, master the Scotta question technique, and learn to use skills in the real scenario. Specifically, according to a first training content for correcting automatic thinking, the cognitive behavior training of a user is realized in a training interface, and the method comprises the following steps: and listing a preset number of different scenes in the training interface, and recording and storing scene selections made by the user according to the emotion degree. Specifically, according to a first training content for correcting automatic thinking, the method for training the cognitive behavior of the user in the training interface further comprises the following steps: the method comprises the steps of firstly obtaining questions puzzling the user per se recorded by a preset number of users through a training interface, then obtaining answers of corresponding questions recorded by the user in a continuous self-questioning mode, and finally obtaining the current idea after the user answers the questions puzzling the user per se.
And when the training stage in which the user is currently positioned is a sixth training stage, according to the second training content for correcting the automatic thinking, the cognitive behavior training of the user is realized in the training interface, and the seventh training stage is started after the cognitive behavior training is finished. In the sixth training phase, the user is asked to learn the idea of using behavior to verify himself, to use arrow-down techniques to mine his mind, and to learn skilled skills. Specifically, according to the second training content for correcting the automatic thinking, the cognitive behavior training of the user is realized in the training interface, and the method comprises the following steps: according to the second training content for correcting the automatic thinking, the practice of replacing thinking and the practice of using the technology to resist negative thinking for the user are realized in the training interface; wherein the confrontational exercise totals 15 statements that the user needs to judge are ideas or facts. Specifically, according to the second training content for correcting the automatic thinking, the cognitive behavior training of the user is realized in the training interface, and the method further comprises the following steps: at the end of the sixth stage, according to the second training content for correcting the automatic thinking, the training of the automatic thinking example of the user is realized in the training interface; by: the method comprises the following steps of automatic thinking, potential evaluation and potential evaluation, and in such a form, the real ideas of the user are recorded by an arrow downward method.
And when the training stage in which the user is currently positioned is the seventh training stage, the cognitive behavior training of the user is realized in the training interface according to the training content of strengthening and behavior activation, and the eighth training stage is started after the cognitive behavior training is completed. Specifically, according to the reinforced training content, the cognitive behavior training of the user is realized in the training interface, and the method comprises the following steps: and according to the contents of the first six stages, the cognitive behavior training of the user is realized in a training interface in a question-answer mode. Specifically, according to the training content activated by the behavior, the cognitive behavior training of the user is realized in the training interface, and the method comprises the following steps: and acquiring at least one interesting activity event finished by the user through the user interface, wherein the interesting activity event comprises the time, the content, the current emotional information and the evaluation information of the event.
And when the current training stage of the user is the eighth training stage, according to the training content for recognizing and correcting the intermediate beliefs, the cognitive behavior training of the user is realized in the training interface, and the ninth training stage is started after the cognitive behavior training is finished. In the eighth training phase, the user is made aware of the meaning of the intermediary beliefs, recognizes the classic format of the intermediary beliefs, and learns to recognize their own intermediary beliefs.
And when the current training stage of the user is the ninth training stage, according to the training content of the problem solving method, the cognitive behavior training of the user is realized in the training interface, and the tenth training stage is started after the cognitive behavior training is finished. In the ninth training phase, the user is made to learn the steps of learning, and using the problem solving method.
When the training stage in which the user is currently located is the tenth training stage, according to the first training content for recognizing and correcting the core beliefs, the cognitive behavior training of the user is realized in the training interface, and the eleventh training stage is started after the cognitive behavior training is completed. In the tenth training stage, the user is led to learn to know the core beliefs, how to change the core beliefs, and the user actively faces the life by establishing good core beliefs.
And when the current training stage of the user is the eleventh training stage, according to the second training content for recognizing and correcting the core beliefs, the cognitive behavior training of the user is realized in the training interface, and the twelfth training stage is started after the cognitive behavior training is finished. Specifically, according to the second training content for recognizing and correcting the core belief, the cognitive behavior training of the user is realized in the training interface, and the method comprises the following steps: and recording and storing the core beliefs determined by the user.
And when the training stage in which the user is currently positioned is the twelfth training stage, the cognitive behavior training of the user is realized in the training interface according to the training content for consolidating and preventing relapse. In the twelfth training phase, the user is made to review the entire training process, consolidate learned methodology and skills, and learn to become his therapist. Specifically, according to the training content for consolidating and preventing relapse, the cognitive behavior training of the user is realized in the training interface, which comprises the following steps: according to the contents of the first eleven stages, the cognitive behavior training of the user is realized in a training interface in a question-answer mode.
Further, the emotion recording module 5 is specifically configured to: acquire the voice data recorded by the user and preprocess it; extract features from the preprocessed voice data at preset sampling intervals with an open source speech algorithm library to obtain a first low-level feature set; calculate the resonance frequencies, local amplitude root mean square, and first and second derivatives of the Mel cepstrum coefficients of the voice data to obtain a second low-level feature set; calculate the maximum, minimum, median, mean, variance, kurtosis, skewness, linear regression slope, linear regression intercept, and linear regression goodness of fit of each low-level feature to obtain a high-level feature set; perform a Pearson correlation test between the user's depression scale score and the high-level feature set to screen out the relevant high-level features; and input the relevant high-level features into a pre-trained neural network model to obtain the user's depression scale score. In this way the change in the user's emotional state can be accurately recognized.
Further, the emotion recording module 5 is also configured to analyze the change in the user's emotional state from the depression scale scores obtained before the current training stage began and after it finished, and to determine whether the user proceeds to the next training stage.
Further, the depression cognitive behavior training system provided by the embodiment of the invention further includes a homework module 6 for entering a homework page after a preset training stage is completed, and recording and storing the homework content completed by the user. Specifically, the homework module 6 enters a homework page after each of the first, third, fourth, fifth, sixth, seventh, eighth, ninth, tenth, eleventh, and twelfth training stages is completed, and records and stores the homework content completed by the user.
In summary, the depression cognitive behavior training system provided by the embodiment of the invention evaluates the user's depressive state from three aspects (acoustic features, scale feedback, and homework) through the emotion recording module and the homework module. These rich evaluation means yield more comprehensive user feedback data during training, with strong objectivity and high accuracy, and avoid the subjectivity of manual evaluation.
The present invention also provides a voice data processing method, as shown in fig. 3, including the following steps:
and step S1, acquiring voice data input by the user, and preprocessing the voice data.
Step S2, performing feature extraction on the preprocessed voice data at preset sampling intervals by adopting an open source voice algorithm library to obtain a first low-level feature set; calculating resonance frequency Formants, local amplitude root mean square Peak2RMS, first-order derivatives MFCC-deltas of Mel cepstrum coefficients and second-order derivatives MFCC-delta-deltas of the Mel cepstrum coefficients of voice data to obtain a second low-level feature set; calculating the maximum value, the minimum value, the median, the mean value, the variance, the kurtosis, the skewness, the linear regression slope, the linear regression intercept and the linear regression fitting degree of each low-level feature to obtain a high-level feature set; and (4) carrying out Pearson correlation test according to the depression scale score and the high-grade feature set of the user, and screening out the related high-grade features.
And step S3, inputting the related advanced features into a pre-trained neural network model to obtain the depression scale score of the user.
In the voice data processing method provided by the embodiment of the invention, the standard cepstral parameters (MFCC) reflect the static characteristics of the speech parameters, while their difference spectra (the first and second derivatives of the MFCC) reflect the dynamic characteristics. Combining the dynamic and static characteristics as the input of the prediction model markedly improves the accuracy of recognizing changes in an individual's emotional state.
Further, in step S1, preprocessing the voice data includes: transcoding the voice data into a mono wav-format voice file with a sampling frequency of 16 kHz. Specifically, the voice data is transcoded into wav format with the FFmpeg tool, down-sampled to 16 kHz with a Python script, and finally saved as a mono wav voice file.
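By way of illustration only (not part of the claimed method), the two transcoding steps above can also be collapsed into a single FFmpeg invocation driven from Python; the file names here are hypothetical:

```python
import subprocess
from pathlib import Path

def to_mono_16k(src: Path, dst: Path) -> None:
    """Transcode a recording to a mono, 16 kHz wav file in one FFmpeg call."""
    subprocess.run(
        ["ffmpeg", "-y", "-i", str(src),
         "-ac", "1",        # one audio channel (mono / single track)
         "-ar", "16000",    # 16 kHz sampling frequency
         str(dst)],
        check=True,
    )

to_mono_16k(Path("recording.m4a"), Path("recording_16k.wav"))
```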
Specifically, in step S2, extracting features from the preprocessed voice data at preset sampling intervals with an open source speech algorithm library to obtain the first low-level feature set includes: extracting the Low-Level Descriptors (LLDs) of the audio file with the open source speech algorithm library at a sampling interval of 0.01 s, yielding 74 LLDs, each of which is a feature sequence ordered in time.
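A minimal sketch of this extraction step, assuming the unnamed open source library is openSMILE and using its Python bindings; the ComParE 2016 feature set, whose LLDs are sampled every 0.01 s, is our assumption, as the patent identifies neither the library nor the set:

```python
import opensmile

smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.ComParE_2016,
    feature_level=opensmile.FeatureLevel.LowLevelDescriptors,
)
llds = smile.process_file("recording_16k.wav")  # one row per 10 ms frame
print(llds.shape)                               # (n_frames, n_lld_columns)
```

Each column of the returned DataFrame is one LLD time series of the kind described above.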
Specifically, in step S2, calculating the resonance frequencies (Formants) of the voice data includes: calculating the resonance frequencies of the voice data with a linear predictive coding acoustic model. The linear predictive coding acoustic model is:
$$s(k) = \sum_{i=1}^{p} a_i \, s(k-i) + e(k)$$
where $s(k)$ is the speech signal, $p$ is the order of the filter, $a_i$ are the coefficients of the filter, and $e(k)$ is the random noise of the unvoiced excitation.
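One conventional way to realize this calculation, sketched here under the assumption of an LPC order of 12 (the patent does not state the order), is to factor the LPC polynomial and convert the angles of its complex roots into frequencies:

```python
import numpy as np
import librosa

def lpc_formants(y: np.ndarray, sr: int, order: int = 12) -> np.ndarray:
    """Estimate formant (resonance) frequencies from the roots of the LPC model."""
    a = librosa.lpc(y, order=order)        # coefficients a_i of the model above
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0]      # keep one root per conjugate pair
    freqs = np.angle(roots) * sr / (2 * np.pi)
    return np.sort(freqs)                  # candidate formant frequencies in Hz
```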
Specifically, in step S2, calculating the local amplitude root mean square (Peak2RMS) includes: segmenting the waveform of the voice data with a window size of 20 ms and a window shift of 10 ms, computing the ratio of the peak amplitude to the root mean square of the audio signal within each segment, and concatenating the per-segment values in segment order to obtain the local amplitude root mean square Peak2RMS.
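An illustrative NumPy implementation of this windowed computation, under our reading of the description as a per-window peak-to-RMS ratio:

```python
import numpy as np

def peak2rms(y: np.ndarray, sr: int) -> np.ndarray:
    """Peak-to-RMS ratio over 20 ms windows with a 10 ms shift."""
    win, hop = int(0.020 * sr), int(0.010 * sr)
    values = []
    for start in range(0, len(y) - win + 1, hop):
        seg = y[start:start + win]
        rms = np.sqrt(np.mean(seg ** 2)) + 1e-12  # guard against silent frames
        values.append(np.max(np.abs(seg)) / rms)
    return np.asarray(values)
```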
Specifically, in step S2, calculating the first derivatives of the Mel cepstrum coefficients (MFCC-deltas) and the second derivatives of the Mel cepstrum coefficients (MFCC-delta-deltas) includes: performing pre-emphasis, framing, windowing, fast Fourier transform, Mel filter-bank filtering, and cepstral analysis on the audio signal to obtain the Mel cepstrum coefficients (MFCC), and computing from them the difference coefficients (first derivatives) and the acceleration coefficients (second derivatives).
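For illustration, the same static and dynamic coefficients can be computed with librosa; the choice of 13 coefficients is an assumption, as the patent does not state how many are kept:

```python
import librosa

y, sr = librosa.load("recording_16k.wav", sr=16000)
# mfcc() internally performs the framing, windowing, fast Fourier transform,
# Mel filter bank and cepstral analysis described above.
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # static coefficients
mfcc_delta = librosa.feature.delta(mfcc)             # difference (first derivative)
mfcc_delta2 = librosa.feature.delta(mfcc, order=2)   # acceleration (second derivative)
```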
Further, in step S2, performing the Pearson correlation test between the user's depression scale score and the high-level feature set to screen out the relevant high-level features includes: performing a Pearson correlation test between the user's depression scale scores and the set of high-level statistical functions (HSFs) to obtain a correlation coefficient between each HSF and the scores; performing a statistical test on the set of correlation coefficients to obtain a P value for each HSF; and screening out the HSFs whose P value is below a preset value, specifically 0.01, as the relevant high-level features.
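A compact sketch of the ten statistics and of this screening step using SciPy; the layout of the HSF matrix (one row per recording, one column per statistic) is our assumption:

```python
import numpy as np
from scipy import stats

def hsfs(lld: np.ndarray) -> list:
    """The ten statistics named above, computed over one LLD time series."""
    reg = stats.linregress(np.arange(len(lld)), lld)
    return [lld.max(), lld.min(), np.median(lld), lld.mean(), lld.var(),
            stats.kurtosis(lld), stats.skew(lld),
            reg.slope, reg.intercept, reg.rvalue ** 2]   # slope, intercept, R^2

def screen_features(hsf_matrix: np.ndarray, scores: np.ndarray,
                    p_threshold: float = 0.01) -> np.ndarray:
    """Indices of HSF columns whose Pearson p-value against the scale score
    is below the preset threshold (0.01 in the text)."""
    return np.array([j for j in range(hsf_matrix.shape[1])
                     if stats.pearsonr(hsf_matrix[:, j], scores)[1] < p_threshold])
```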
Specifically, the relevant high-level features screened out include: MCEP_0_kurt, MFCC_deltas_10_intercept, MCEP_17_R2, MCEP_0_skew, Peak2RMS_kurt, MFCC_delta_deltas_11_max, MFCC_deltas_11_std, MFCC_delta_deltas_4_kurt, MFCC_deltas_12_kurt, MCEP_0_R2, MFCC_deltas_5_std, MFCC_delta_deltas_18_R2, crop_std, MFCC_delta_deltas_1_kurt, MFCC_delta_deltas_5_max, Peak2RMS_skew, HMD_2_skew, MCEP_3_alph, MFCC_phtals_9_kurt, MFCC_deltas_5_max, MFCC_2_deltas_5_max, MFCC_deltas_kurt_368_MFCC_deltas_5_mfd, MFCC_deltas_5_kurt_12_mfd_MFCC_deltas_5_kurt_mfd, MFCC_deltas_3_kurt_3_phstd, MFCC_9_kurt_10_mfd_3_mfd_3_phd_3_mfd_3_mfd. Here MCEP denotes the Mel cepstrum, MFCC_deltas the first derivative of the Mel cepstrum coefficients, MFCC_delta_deltas the second derivative of the Mel cepstrum coefficients, Peak2RMS the local amplitude root mean square, shear the glottal microphonic, HMPDD the deviation (variance) of the harmonic phase error, HMPDM the mean of the harmonic phase error, skew the skewness, kurt the kurtosis, intercept the intercept of a linear regression, R2 the goodness of fit of a linear regression, alph the slope of a linear regression, std the standard deviation, max the maximum, and min the minimum.
Specifically, in step S3, the neural network model is structured as shown in fig. 4. The network consists of 5 layers: 4 fully-connected hidden layers of 32 nodes each, with the softplus function as the activation function for the input and hidden layers, and 1 fully-connected output layer that outputs the model's prediction. Specifically, in step S3, the pre-trained neural network model is obtained by: A1, acquiring voice data samples as a training set; A2, applying the operations of steps S1 and S2 to the training set to obtain the relevant high-level feature set; A3, feeding the relevant high-level feature set into the neural network and repeating the forward- and back-propagation algorithms until the mean square error between the predicted and true values reaches a preset value, yielding the trained neural network model. The trained neural network model can then predict the patient's scale score.
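A sketch of the described 5-layer network in PyTorch; the patent names no framework, and the feature count, optimizer, and learning rate below are placeholders:

```python
import torch
import torch.nn as nn

class ScaleScoreNet(nn.Module):
    """Four fully-connected hidden layers of 32 nodes with softplus activations,
    plus one fully-connected linear output layer (the structure of fig. 4)."""
    def __init__(self, n_features: int):
        super().__init__()
        layers, width = [], n_features
        for _ in range(4):
            layers += [nn.Linear(width, 32), nn.Softplus()]
            width = 32
        layers.append(nn.Linear(32, 1))   # predicted depression scale score
        self.net = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

model = ScaleScoreNet(n_features=40)   # placeholder feature count
loss_fn = nn.MSELoss()                 # mean square error, as in step A3
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```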
We recruited 47 volunteers, scored them on the depression scale, and recorded voice files of each volunteer reciting "life like summer flowers". After the voice data were collected, a prediction model was trained with the voice data processing method provided by the invention in combination with scale feedback, and the validity of the model was verified with leave-one-out validation. The model achieved a mean absolute error of 3.11 points, and the correlation coefficient between the predicted scores and the scale scores reached 0.68. In the leave-one-out evaluation, one sample is selected as the validation set during training, 80% of the remaining samples form the training set, and the remaining 20% form the test set. The training set trains the neural network model, the validation set monitors the model's convergence during training, and the test set is used for prediction after training to evaluate the model's predictive ability. The training and validation process is repeated over the whole data set so that every sample appears in the validation set, giving an estimate of the model's predictive ability on the data set; the resulting validation scatter plot is shown in fig. 5.
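Read literally, the splitting protocol above can be generated as follows; this sketch reflects only our interpretation of the described procedure:

```python
import numpy as np

def leave_one_out_splits(n_samples: int, seed: int = 0):
    """Yield (train, test, validation) index sets: one sample held out as the
    validation set, the remainder split 80/20 into training and test sets."""
    rng = np.random.default_rng(seed)
    for i in range(n_samples):
        rest = np.delete(np.arange(n_samples), i)
        rng.shuffle(rest)
        cut = int(0.8 * len(rest))
        yield rest[:cut], rest[cut:], np.array([i])
```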
The treatment authority of the depression cognitive behavior training system was then opened for the 47 volunteers in the background, and the volunteers were arranged to receive the CCBT intervention. After the intervention ended, the volunteers were evaluated with the voice data processing method provided by the invention in combination with scale feedback. The evaluation showed that the depressive state of the 47 volunteers improved significantly, and a difference analysis of the volunteers' acoustic features after CCBT treatment likewise demonstrated the improvement of depressive symptoms.
It should be understood that the above description of specific embodiments is intended only to illustrate the technical solutions and features of the invention and to enable those skilled in the art to understand and implement it; the invention is not limited to the above specific embodiments. All changes and modifications falling within the scope of the appended claims are intended to be embraced therein.

Claims (10)

1. A depression cognitive behavior training system, comprising:
a login module (1) for acquiring a user's login request information and logging the user in;
a cognitive behavior training module (4) for carrying out the user's cognitive behavior training in a training interface, after a successful login, according to the training stage the user is currently in and the training content corresponding to that stage;
and an emotion recording module (5) for entering an audio recording page before each training stage starts and after it finishes, and analyzing the voice data recorded by the user to obtain the user's depression scale score.
2. The system of claim 1, further comprising:
a selection module (2) for entering, after login, a page introducing the depression cognitive behavior training so the user can select it;
a verification module (3) for verifying, after the depression cognitive behavior training is selected, whether the user has been granted access to the cognitive behavior training module;
and, correspondingly,
the cognitive behavior training module (4) is used for carrying out the user's cognitive behavior training in a training interface, after successful verification, according to the training stage the user is currently in and the training content corresponding to that stage.
3. The system according to claim 1, characterized in that the cognitive behavior training module (4) is specifically configured to:
when the user is currently in the first training stage, carry out the user's cognitive behavior training in a training interface according to the training content for understanding depression, and advance to the second training stage on completion;
when the user is currently in the second training stage, carry out the training according to the training content for understanding computerized cognitive behavior training, and advance to the third training stage on completion;
when the user is currently in the third training stage, carry out the training according to the first training content for recognizing automatic thoughts, and advance to the fourth training stage on completion;
when the user is currently in the fourth training stage, carry out the training according to the second training content for recognizing automatic thoughts, and advance to the fifth training stage on completion;
when the user is currently in the fifth training stage, carry out the training according to the first training content for correcting automatic thoughts, and advance to the sixth training stage on completion;
when the user is currently in the sixth training stage, carry out the training according to the second training content for correcting automatic thoughts, and advance to the seventh training stage on completion;
when the user is currently in the seventh training stage, carry out the training in a question-answer form based on the contents of the first six stages, and advance to the eighth training stage on completion;
when the user is currently in the eighth training stage, carry out the training according to the training content for recognizing and correcting intermediate beliefs, and advance to the ninth training stage on completion;
when the user is currently in the ninth training stage, carry out the training according to the training content of the problem-solving method, and advance to the tenth training stage on completion;
when the user is currently in the tenth training stage, carry out the training according to the first training content for recognizing and correcting core beliefs, and advance to the eleventh training stage on completion;
when the user is currently in the eleventh training stage, carry out the training according to the second training content for recognizing and correcting core beliefs, and advance to the twelfth training stage on completion;
and when the user is currently in the twelfth training stage, carry out the training according to the training content for consolidation and relapse prevention.
4. The system according to claim 1, characterized in that the emotion recording module (5) is specifically configured to:
acquire the voice data recorded by the user and preprocess it;
extract features from the preprocessed voice data at preset sampling intervals with an open source speech algorithm library to obtain a first low-level feature set; calculate the resonance frequencies, local amplitude root mean square, and first and second derivatives of the Mel cepstrum coefficients of the voice data to obtain a second low-level feature set;
calculate the maximum, minimum, median, mean, variance, kurtosis, skewness, linear regression slope, linear regression intercept, and linear regression goodness of fit of each low-level feature to obtain a high-level feature set;
perform a Pearson correlation test between the user's depression scale score and the high-level feature set to screen out the relevant high-level features;
and input the relevant high-level features into a pre-trained neural network model to obtain the user's depression scale score.
5. The system according to claim 1, characterized in that the emotion recording module (5) is further configured to analyze the change in the user's emotional state from the depression scale scores obtained before the current training stage began and after it finished, and to determine whether the user proceeds to the next training stage.
6. The system of claim 1, further comprising:
a homework module (6) for entering a homework page after a preset training stage is completed, and recording and storing the homework content completed by the user.
7. A method for processing voice data, comprising:
S1, acquiring voice data input by a user, and preprocessing the voice data;
S2, performing feature extraction on the preprocessed voice data at preset sampling intervals using an open-source speech algorithm library to obtain a first low-level feature set; calculating the resonance (formant) frequencies, the local amplitude root mean square, and the first and second derivatives of the Mel cepstrum coefficients of the voice data to obtain a second low-level feature set;
calculating the maximum, minimum, median, mean, variance, kurtosis, skewness, linear regression slope, linear regression intercept and linear regression goodness of fit of each low-level feature to obtain a high-level feature set;
performing a Pearson correlation test between the user's depression scale score and the high-level feature set, and screening out the correlated high-level features;
and S3, inputting the correlated high-level features into a pre-trained neural network model to obtain the user's depression scale score.
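The claims leave the neural network model unspecified; purely as a hedged illustration, step S3 could be realized with a small feed-forward regressor such as scikit-learn's MLPRegressor. All data below is random placeholder data, and the 0-27 range merely mimics a PHQ-9-style scale; neither is taken from the patent.

import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 24))    # screened high-level features (placeholder)
y_train = rng.uniform(0, 27, size=200)  # depression scale scores (placeholder)

model = MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000, random_state=0)
model.fit(X_train, y_train)             # the "pre-training" step of claim 7
predicted_score = model.predict(rng.normal(size=(1, 24)))[0]  # score for a new recording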
8. The method according to claim 7, wherein preprocessing the voice data in S1 comprises:
transcoding the voice data into a mono (single-track) wav-format voice file with a sampling frequency of 16 kHz.
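The claim does not name a transcoding tool; one common way to perform this step, assuming ffmpeg is available on the system path, is:

import subprocess

def to_mono_16k_wav(src_path: str, dst_path: str) -> None:
    # -ac 1: single (mono) audio channel; -ar 16000: 16 kHz sampling frequency
    subprocess.run(
        ["ffmpeg", "-y", "-i", src_path, "-ac", "1", "-ar", "16000", dst_path],
        check=True,
    )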
9. The method according to claim 7, wherein screening out the correlated high-level features in S2 by performing a Pearson correlation test on the user's depression scale score and the high-level feature set comprises:
performing the Pearson correlation test between the user's depression scale scores and the high-level feature set to obtain a correlation coefficient between each high-level feature and the user's depression scale scores;
and performing a statistical test on the set of correlation coefficients to obtain a P value for each high-level feature, and screening out the high-level features whose P values are smaller than a preset value as the correlated high-level features.
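A minimal sketch of this screening step, assuming scipy: pearsonr returns both the correlation coefficient and the P value of its significance test. The 0.05 threshold below is illustrative, since the claim only requires "a preset value".

import numpy as np
from scipy.stats import pearsonr

def screen_features(X, scores, p_threshold=0.05):
    """X: (n_users, n_features) high-level features; scores: (n_users,) scale scores."""
    selected = []
    for j in range(X.shape[1]):
        r, p = pearsonr(X[:, j], scores)  # correlation coefficient and its test P value
        if p < p_threshold:               # keep only significantly correlated features
            selected.append(j)
    return selected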
10. The method according to claim 7, wherein calculating the resonance frequency of the voice data in S2 comprises: calculating the resonance frequency of the voice data using a linear predictive coding (LPC) sound model;
the linear predictive coding sound model is:
$s(k) = \sum_{i=1}^{p} a_i \, s(k-i) + e(k)$
where s(k) is the speech data, p is the order of the filter, a_i (i = 1, ..., p) are the coefficients of the filter, and e(k) is the random noise of unvoiced sound.
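For illustration only, the resonance frequencies can be estimated from the angles of the LPC polynomial's roots; librosa.lpc is an assumed stand-in for the patent's own solver, and order=12 is a conventional rather than a claimed choice.

import numpy as np
import librosa

def formant_frequencies(y, sr=16000, order=12):
    a = librosa.lpc(y, order=order)              # LPC filter coefficients [1, a_1, ..., a_p]
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0]            # keep one root of each complex-conjugate pair
    freqs = np.angle(roots) * sr / (2 * np.pi)   # pole angles -> resonance frequencies in Hz
    return np.sort(freqs)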
CN202110119287.XA 2021-01-28 2021-01-28 Depression cognitive behavior training system and voice data processing method Active CN112908435B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110119287.XA CN112908435B (en) 2021-01-28 2021-01-28 Depression cognitive behavior training system and voice data processing method

Publications (2)

Publication Number Publication Date
CN112908435A true CN112908435A (en) 2021-06-04
CN112908435B CN112908435B (en) 2024-05-31

Family

ID=76119684

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110119287.XA Active CN112908435B (en) 2021-01-28 2021-01-28 Depression cognitive behavior training system and voice data processing method

Country Status (1)

Country Link
CN (1) CN112908435B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20180084332A (en) * 2017-01-16 2018-07-25 어빌리티기술융합협동조합 Evaluation and training system of mild cognitive impairment
CN107169272A (en) * 2017-05-04 2017-09-15 北京心海导航教育科技股份有限公司 cognitive behavior training method and system
CN108550375A (en) * 2018-03-14 2018-09-18 鲁东大学 A kind of emotion identification method, device and computer equipment based on voice signal
CN112006697A (en) * 2020-06-02 2020-12-01 东南大学 Gradient boosting decision tree depression recognition method based on voice signals
CN111951824A (en) * 2020-08-14 2020-11-17 苏州国岭技研智能科技有限公司 Detection method for distinguishing depression based on sound

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115116475A (en) * 2022-06-13 2022-09-27 北京邮电大学 Voice depression automatic detection method and device based on time delay neural network
CN115116475B (en) * 2022-06-13 2024-02-02 北京邮电大学 Voice depression automatic detection method and device based on time delay neural network

Also Published As

Publication number Publication date
CN112908435B (en) 2024-05-31

Similar Documents

Publication Publication Date Title
Cen et al. A real-time speech emotion recognition system and its application in online learning
CN101201980B (en) Remote Chinese language teaching system based on voice affection identification
Ramakrishnan Recognition of emotion from speech: A review
CN112006697B (en) Voice signal-based gradient lifting decision tree depression degree recognition system
CN108564942A (en) One kind being based on the adjustable speech-emotion recognition method of susceptibility and system
Xiao et al. Analyzing speech rate entrainment and its relation to therapist empathy in drug addiction counseling
CN116343824B (en) Comprehensive evaluation and solution method, system, device and medium for talent expression capability
Caponetti et al. Biologically inspired emotion recognition from speech
CN106073706A (en) A kind of customized information towards Mini-mental Status Examination and audio data analysis method and system
US20200160881A1 (en) Language disorder diagnosis/screening
Borrie et al. The role of somatosensory information in speech perception: Imitation improves recognition of disordered speech
Li et al. Global-local-feature-fused driver speech emotion detection for intelligent cockpit in automated driving
Hamsa et al. An enhanced emotion recognition algorithm using pitch correlogram, deep sparse matrix representation and random forest classifier
Wang et al. ECAPA-TDNN Based Depression Detection from Clinical Speech.
CN112908435B (en) Depression cognitive behavior training system and voice data processing method
Matsuura et al. Refinement of utterance fluency feature extraction and automated scoring of L2 oral fluency with dialogic features
CN117666790A (en) Immersive talent expression training system based on brain-computer interface technology
Jia et al. Two-level discriminative speech emotion recognition model with wave field dynamics: A personalized speech emotion recognition method
Wang Detecting pronunciation errors in spoken English tests based on multifeature fusion algorithm
Trabelsi et al. Discrete and continuous emotion recognition using sequence kernels
Krishnamachari et al. Developing neural representations for robust child-adult diarization
CN114242115A (en) Intelligent home-school system
Sahoo et al. Detection of speech-based physical load using transfer learning approach
Gomes Implementation of i-vector algorithm in speech emotion recognition by using two different classifiers: Gaussian mixture model and support vector machine
US20240185861A1 (en) Method and system of verifying the identity of a participant during a clinical assessment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant