CN118213068A - Depressive disorder detection algorithm based on cascading of large language model and small model - Google Patents


Info

Publication number
CN118213068A
CN118213068A (application number CN202410397717.8A)
Authority
CN
China
Prior art keywords: depression, small model, large language model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410397717.8A
Other languages
Chinese (zh)
Inventor
郑通
李腾
郭艳蓉
洪日昌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei University of Technology
Anhui University
Institute of Artificial Intelligence of Hefei Comprehensive National Science Center
Original Assignee
Hefei University of Technology
Anhui University
Institute of Artificial Intelligence of Hefei Comprehensive National Science Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei University of Technology, Anhui University, Institute of Artificial Intelligence of Hefei Comprehensive National Science Center filed Critical Hefei University of Technology
Priority to CN202410397717.8A
Publication of CN118213068A
Legal status: Pending

Landscapes

  • Machine Translation (AREA)

Abstract

The invention discloses a depressive disorder detection algorithm based on the cascading of a large language model and a small model, which belongs to the technical field of pattern recognition and can perform depressive disorder detection using data from online social network platforms. Specifically, the algorithm uses in-context learning to bring out the depression detection capability of ChatGPT (GPT-3.5): GPT-3.5 extracts features from user microblog text data screened under certain conditions, and a small-scale neural network then processes these features to identify and detect the degree of depression of the microblog user. On one hand, the invention effectively exploits the strong natural language processing capability of GPT-3.5 to analyze a user's microblog content accurately and efficiently; on the other hand, it uses a downstream small model to process the output text of the large language model, correcting errors of the large language model so that the judgment result is more accurate.

Description

Depressive disorder detection algorithm based on cascading of large language model and small model
Technical Field
The invention belongs to the technical field of pattern recognition, and particularly relates to a depressive disorder detection algorithm based on cascading of a large language model and a small model.
Background
Depression is a common mental disorder. The traditional depression assessment methods currently in use mainly rely on self-report and physician observation, but manual assessment suffers from subjective error, and problems such as long evaluation times and a shortage of clinical staff affect the results. It is increasingly recognized that an objective and effective depression detection method is needed. With the advent of artificial intelligence (AI), analyzing large amounts of data for clinical psychiatry has become increasingly viable, and breakthroughs have been made in the early prediction, screening, and diagnosis of depression. Many machine learning architectures have been tried on datasets spanning multiple modalities, including facial expressions, EEG signals, text, audio, and video, and are increasingly being combined with mental health care.
Analyzing depression in these modalities requires a dataset of the corresponding modality; however, such depression datasets suffer from long collection times, difficult data acquisition, small sample totals, and imbalanced positive and negative samples, which hampers related experiments. In recent years, more and more depression patients post on online social network platforms to express their emotions, so how to use artificial intelligence to accurately identify the emotional tendencies and depression degree of online health-community users, and thereby assist the effective treatment of depression patients, has become a hot topic of current academic and industry attention.
Disclosure of Invention
The invention aims to provide a depressive disorder detection algorithm based on the cascading of a large language model and a small model, which addresses the poor accuracy of automatic depressive disorder detection in the prior art.
The aim of the invention can be achieved by the following technical scheme:
A depressive disorder detection algorithm based on cascading of a large language model and a small model, comprising the steps of:
firstly, arranging text contents;
comprehensively analyzing a plurality of depression scales, and editing a preliminary Prompt suitable for a large language model using the key information contained in the depression scales;
carefully selecting blog posts, and consolidating the remaining valid information into a single text;
secondly, writing a Prompt in Few-Shot form;
analyzing a plurality of user blogs using the preliminarily edited Prompt, and writing the analysis process into the Prompt to form In-Context Learning in Few-Shot-Learning form, obtaining the complete Prompt, so that the large language model can judge a plurality of items of data for a single sample; the main content concerns how often the user corresponding to the sample is in a number of states, where the frequency of each state is expressed as an integer from 0 to 3;
Thirdly, training a small-scale neural network;
training the small-scale neural network to make the final judgment, where the input data are the feature values extracted by the large language model, and the output value is the result of judging the user as depressed or healthy.
Further, the method for carefully selecting the blog text comprises the following steps:
All n blog posts of sample x_i are judged individually; if a post is forwarded lottery content, too-short content, or too-long content, it is removed, and the remaining posts are concatenated into one passage of text, which serves as the data of the single sample for subsequent judgment and is denoted x'_i;
where m is the number of blog posts selected to participate in the experiment, m ≤ n and m ≤ 30.
Further, the depression scales comprise: the PHQ-8 depression scale, DSM-5 diagnostic criteria, the Hamilton Depression Rating Scale, the SDS self-rating depression scale, and the Burns Depression Checklist.
Further, the method for writing Few-shot form Prompt comprises the following steps:
Manual diagnostic analysis is performed on J samples with real labels; the J samples are denoted $\hat{x}_j$, with corresponding real labels $\hat{y}_j$, and the analysis result for sample $\hat{x}_j$ is arranged into a format convenient for programmatic batch statistics, denoted $\widehat{Answer}_j$;
The Prompt in Few-shot form is denoted Prompt_FS; it is formed by appending the J manually analyzed example pairs $(\hat{x}_j, \widehat{Answer}_j)$ to the preliminary Prompt_ZS, where J denotes the number of $(\hat{x}_j, \widehat{Answer}_j)$ pairs used;
The complete Prompt_FS containing the selected blog text x'_i of the ith sample is input into the GPT-3.5 model, and the output of GPT-3.5 is obtained and denoted Answer_i;
16 items of data are extracted from Answer_i, i.e., the 16-dimensional depression-related features of the ith sample user.
Further, J = 2, i.e., j ∈ {1,2}.
Further, the method for training the small-scale neural network comprises the following steps:
A small-scale neural network is established to convert the features extracted by the large language model into confidences of depression and health; the model input is 16-dimensional, corresponding to the 16 items of data sorted from Answer_i, and the output is 2-dimensional, corresponding to non-depression and depression.
The final prediction can be expressed as pred_i = argmax(output_i), i.e., the index of the larger of the two output confidences; sample x_i is judged as depression if the prediction result is 1 and as non-depression if the prediction result is 0.
Further, the Loss is the negative log-likelihood loss, computed in the manner of NLLLoss in Pytorch:

Loss = -(1/N) Σ_{i=1}^{N} y_i[ŷ_i]

where y_i represents the predicted log-probability vector for the ith sample and ŷ_i represents the true label of the ith sample.
The invention has the beneficial effects that:
the invention discloses a depressive disorder detection system based on cascading of a large language model and a small model, which commonly uses a Prompt, the large language model and a small-scale neural network for depressive disorder recognition work. In addition, the invention introduces an In-context Learning scheme into experiments to obtain an interpretable depression state evaluation system, on one hand, the powerful natural language processing capability of GPT-3.5 is effectively utilized to accurately and efficiently analyze the microblog text content of a user, and on the other hand, the invention processes the output text of a large language model by using a rear small model to correct errors of the large language model, so that the judgment result is more accurate.
Drawings
The invention is further described below with reference to the accompanying drawings.
FIG. 1 is a frame diagram of a depressive disorder detection algorithm in accordance with the present invention;
FIG. 2 is a schematic representation of the Few-shot form of Prompt.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
A depressive disorder detection algorithm based on the cascading of a large language model and a small model, as shown in FIG. 1, is divided into three steps, as follows:
firstly, arranging text contents;
Firstly, a plurality of depression scales are comprehensively analyzed, and the key information they contain is used to edit a preliminary Prompt suitable for a large language model through prompt engineering;
Meanwhile, blog posts are carefully selected: forwarded lottery posts that easily interfere with the experiment, too-short posts that cannot support a judgment, and too-long posts that may push the text length past the ChatGPT limit are eliminated, and the remaining valid information is consolidated into a single text;
secondly, writing a Prompt in Few-Shot form;
Analyzing a plurality of user blogs using the previously edited preliminary Prompt, and writing the analysis process into the Prompt to form In-Context Learning (ICL) in Few-Shot-Learning form, obtaining the complete Prompt, so that the large language model can judge 16 items of data for a single sample;
Thirdly, training a small-scale neural network;
training a small-scale (relative to the large language model) neural network to make the final judgment, where the input data are the feature values extracted by the large language model, and the output value is the result of judging the user as depressed or healthy.
For ease of understanding, the following description will now be given of a specific method of text content arrangement in the first step:
All n blog posts of sample x_i are judged individually; if a post belongs to one of contents ①-③ below, it is excluded, and the remaining posts are concatenated into one passage of text, which serves as the data of the single sample for subsequent judgment and is denoted x'_i.
① Forwarding lottery drawing content;
Forwarded lottery content seriously interferes with the judgment because it is fixed content written in advance by a specific crowd and is often accompanied by strongly positive emotion. The invention screens the blog posts and excludes lottery information according to keywords such as "red envelope", "follow to enter the draw", and "forward to enter the draw";
② Excessively short blog content;
Too-short blog content cannot express a specific emotion and is insufficient to support a judgment; since the number of posts each user can contribute to the judgment is limited, content that is hard to judge is eliminated so that more normal blog content can be selected for judgment;
③ Lengthy blog content;
Considering that GPT-3.5-Turbo limits the total length of the input text, the invention applies a corresponding constraint to longer blog posts;
The consolidated text content may be represented as:

x'_i = b_1 ⊕ b_2 ⊕ … ⊕ b_m

where ⊕ denotes text concatenation of the selected posts, and m is the number of blog posts selected to participate in the experiment, m ≤ n and m ≤ 30.
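For illustration, the selection and concatenation of the first step can be sketched in Python as follows (a minimal sketch: the keyword list, the length bounds, and the helper name are assumptions, since the patent specifies only the three exclusion categories and the cap m ≤ 30):

```python
# Hypothetical sketch of the first-step blog selection. The lottery keywords
# follow the patent's examples; MIN_LEN and MAX_LEN are assumed bounds.
LOTTERY_KEYWORDS = ["红包", "抽奖", "转发抽奖"]  # "red envelope", "lottery", "forward to enter the draw"
MIN_LEN, MAX_LEN, MAX_POSTS = 5, 500, 30       # length bounds assumed; m <= 30 per the patent

def select_and_concatenate(posts):
    """Remove lottery/too-short/too-long posts, keep at most MAX_POSTS, join the rest."""
    kept = []
    for post in posts:
        if any(k in post for k in LOTTERY_KEYWORDS):
            continue                      # ① forwarded lottery content
        if len(post) < MIN_LEN:
            continue                      # ② too short to express emotion
        if len(post) > MAX_LEN:
            continue                      # ③ too long for the GPT-3.5 context limit
        kept.append(post)
        if len(kept) == MAX_POSTS:        # enforce m <= 30
            break
    return " ".join(kept)                 # consolidated text x'_i

posts = ["转发抽奖 iPhone!", "好", "今晚又失眠了，什么都提不起兴趣。", "x" * 600]
print(select_and_concatenate(posts))      # only the third post survives
```

The returned string plays the role of x'_i for the sample in the later steps.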
The core of the Prompt is derived from a variety of depression scales, including but not limited to: the PHQ-8 depression scale, DSM-5 diagnostic criteria, the Hamilton Depression Rating Scale (HAMD), the Self-Rating Depression Scale (SDS), and the Burns Depression Checklist (BDC);
These depression scales are filled in by indicating how often the respondent is in a particular state (e.g., "never", "occasionally", "often", and "always").
The invention aims to make ChatGPT correctly answer, after reading a user's blog posts, how often the user is in each of these states, in an answer format convenient for programmatic batch statistics. To this end, the invention refers to a plurality of depression scales, organizes 16 key questions, follows the scales' format and filling requirements as a template for writing ChatGPT Prompts, and writes a preliminary Prompt in Zero-Shot form, denoted Prompt_ZS.
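For illustration, assembling a Zero-Shot prompt of this kind can be sketched as follows (the 16 question texts and the "k: v" answer format are illustrative assumptions; the patent derives its 16 questions from the scales but does not list them):

```python
# Hypothetical sketch of Prompt_ZS assembly. The question texts are
# stand-ins for the patent's 16 scale-derived key questions.
QUESTIONS = [
    "Depressed mood", "Loss of interest", "Sleep problems", "Fatigue",
    "Appetite changes", "Feelings of worthlessness", "Concentration problems",
    "Psychomotor changes", "Suicidal ideation", "Anxiety", "Irritability",
    "Social withdrawal", "Crying spells", "Hopelessness", "Guilt", "Indecisiveness",
]

def build_prompt_zs(blog_text):
    """Ask the model to rate how often each state appears, 0 (never) to 3 (always)."""
    lines = [
        "Read the following blog posts and rate how often the author shows each state.",
        "Answer every item with a single integer from 0 (never) to 3 (always),",
        "one per line, in the form 'k: v' so the answers can be parsed in batch.",
        "",
        "Blog posts: " + blog_text,
        "",
    ]
    lines += [f"{k}. {q}" for k, q in enumerate(QUESTIONS, 1)]
    return "\n".join(lines)

prompt = build_prompt_zs("I can't sleep and nothing feels worth doing.")
print(prompt.splitlines()[-1])
```

The fixed "k: v" answer format is what later makes programmatic batch statistics possible.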
For ease of understanding, a specific method for writing Few-shot form Prompt in the second step will now be described as follows, as shown in FIG. 2:
The present invention designs a Prompt in combination with In-Context Learning and adds the analysis processes of several user blogs to the Prompt; this is also specifically called Few-Shot-Learning. The analysis process of a user blog is the diagnosis process of a single sample. The invention performs manual diagnostic analysis on J samples together with their real labels; the J samples are denoted $\hat{x}_j$, with corresponding real labels $\hat{y}_j$, and the analysis result for sample $\hat{x}_j$ is arranged into a format convenient for programmatic batch statistics, denoted $\widehat{Answer}_j$.
To obtain the best subsequent experimental results, the invention tests a plurality of combinations of Prompt_ZS, $\hat{x}_j$, and $\widehat{Answer}_j$, and finds the best Prompt through multiple experiments. The finally determined Prompt in Few-shot form, denoted Prompt_FS, is formed by appending the J manually analyzed example pairs $(\hat{x}_j, \widehat{Answer}_j)$ to Prompt_ZS, where J denotes the number of $(\hat{x}_j, \widehat{Answer}_j)$ pairs used; typically J = 2, i.e., j ∈ {1,2}, in which case Few-shot becomes Two-shot.
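For illustration, forming the Few-Shot prompt from the Zero-Shot prompt and the J worked examples can be sketched as follows (the function name, separator strings, and example contents are hypothetical; only the structure — Prompt_ZS followed by J analyzed pairs — comes from the description):

```python
# Hypothetical sketch of Prompt_FS assembly (In-Context Learning in
# Few-Shot form). J = len(examples); J = 2 gives the Two-Shot case.
def build_prompt_fs(prompt_zs, examples):
    """examples: list of (blog_text, analysed_answer) pairs taken from labeled samples."""
    parts = [prompt_zs]
    for j, (x_hat, answer_hat) in enumerate(examples, 1):
        parts.append(f"Example {j} input: {x_hat}")
        parts.append(f"Example {j} analysis: {answer_hat}")
    parts.append("Now analyse the following user in the same format.")
    return "\n\n".join(parts)

examples = [
    ("blog text of sample 1", "1: 3\n2: 2 ..."),
    ("blog text of sample 2", "1: 0\n2: 1 ..."),
]
prompt_fs = build_prompt_fs("(Prompt_ZS here)", examples)  # J = 2 -> Two-Shot
print(prompt_fs.count("Example"))
```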
The complete Prompt_FS containing the culled and collated blog text x'_i of the ith sample is input into the GPT-3.5 model, which yields the output of GPT-3.5, denoted Answer_i, in a format the same as or similar to the $\widehat{Answer}_j$ examples.
16 items of data can be extracted from Answer_i, i.e., the 16-dimensional depression-related features of the ith sample user. The main content concerns how often the user corresponding to the sample is in certain states, where the frequency of each state is expressed as an integer from 0 to 3.
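For illustration, extracting the 16 items of data from Answer_i can be sketched as follows (this assumes the prompt forced a parseable "k: v" line format with values 0-3; the patent requires such a format only implicitly, via "convenient for programming batch statistics"):

```python
import re

# Hypothetical parser turning the GPT-3.5 output Answer_i into the
# 16-dimensional feature vector of 0-3 frequency values.
def parse_answer(answer_text, dim=16):
    feats = [0] * dim                      # default 0 ("never") for missing items
    for k, v in re.findall(r"(\d+)\s*[:：]\s*([0-3])", answer_text):
        idx = int(k) - 1
        if 0 <= idx < dim:
            feats[idx] = int(v)
    return feats

answer_i = "1: 2\n2: 3\n3: 0\n16: 1"
features = parse_answer(answer_i)          # 16 values, input to the small model
print(features[0], features[1], features[15])
```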
For ease of understanding, a specific method of training the small-scale neural network in the third step will now be described:
The 16-dimensional features extracted by the large language model are finally processed in a traditional deep learning manner. The invention establishes a small-scale (relative to the large language model) neural network to convert the features extracted by the large language model into confidences of depression and health. The model input is 16-dimensional, corresponding to the 16 items of data sorted from Answer_i; the output is 2-dimensional, corresponding to non-depression and depression.
The final prediction can be expressed as pred_i = argmax(output_i), i.e., the index of the larger of the two output confidences; sample x_i is judged as depression if the prediction result is 1 and as non-depression if the prediction result is 0.
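For illustration, the small-scale network and the argmax prediction can be sketched as follows (the hidden size, random weights, and ReLU/softmax choices are illustrative stand-ins; the patent does not disclose the architecture beyond the 16-dimensional input and 2-dimensional output):

```python
import math, random

# Hypothetical sketch of the cascaded small model: a 16 -> H -> 2 network
# whose 2-D output is turned into confidences; prediction 1 = depression.
random.seed(0)
H = 8
W1 = [[random.uniform(-0.5, 0.5) for _ in range(16)] for _ in range(H)]
W2 = [[random.uniform(-0.5, 0.5) for _ in range(H)] for _ in range(2)]

def forward(x):
    h = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W1]  # ReLU hidden layer
    z = [sum(w * hi for w, hi in zip(row, h)) for row in W2]            # 2-D logits
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [v / s for v in exps]   # softmax confidences (non-depression, depression)

def predict(x):
    probs = forward(x)
    return max(range(2), key=lambda c: probs[c])   # argmax: 1 -> depression

x_i = [2, 3, 1, 0, 2, 3, 1, 2, 0, 1, 2, 3, 1, 2, 3, 1]  # 16 LLM-extracted values
print(predict(x_i))
```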
The Loss is the negative log-likelihood loss (Negative Log Likelihood Loss, NLLLoss), computed in the manner of NLLLoss in Pytorch:

Loss = -(1/N) Σ_{i=1}^{N} y_i[ŷ_i]

where y_i represents the predicted log-probability vector for the ith sample and ŷ_i represents the true label of the ith sample.
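For illustration, the negative log-likelihood loss as computed by NLLLoss in Pytorch can be reproduced in pure Python as follows (the loss is the mean of the negative log-probability assigned to each sample's true class):

```python
import math

# Pure-Python sketch of PyTorch-style NLLLoss with mean reduction:
# given per-sample log-probabilities, average -log p[true class].
def nll_loss(log_probs, targets):
    """log_probs: list of [log p(class 0), log p(class 1)]; targets: true labels."""
    return sum(-lp[t] for lp, t in zip(log_probs, targets)) / len(targets)

log_probs = [[math.log(0.9), math.log(0.1)],   # sample 0, true label 0
             [math.log(0.2), math.log(0.8)]]   # sample 1, true label 1
loss = nll_loss(log_probs, [0, 1])
print(round(loss, 4))
```

A confident, correct model (high probability on the true class) drives this loss toward zero.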
ChatGPT is a popular chat application based on a large language model; acting as a conversation agent, it demonstrates the ability to understand text and respond accordingly. Although ChatGPT and large language models in general are considered to have some limitations, these models have proven to be good general-purpose models across different applications. Despite their great potential, even the most powerful pre-trained LLMs may not directly meet specific needs when customized output is required, experiments lack critical context, or specialized vocabulary must be handled. To make an LLM better fit its purpose, researchers usually apply some kind of "tuning"; the main current tuning approaches include full fine-tuning, parameter-efficient fine-tuning (PEFT), prompt engineering (Prompt), and retrieval-augmented generation (RAG). Prompt engineering includes In-context Learning (ICL), Instruction Tuning, Chain of Thought, and other specific methods.
The invention discloses a depressive disorder detection system based on the cascading of a large language model and a small model, which jointly uses a Prompt, the large language model, and a small-scale neural network for depressive disorder recognition. In addition, the invention introduces an In-context Learning scheme into the experiments to obtain an interpretable depression state assessment system.
The foregoing is merely illustrative and explanatory of the invention; those skilled in the art may make various modifications, additions, or similar substitutions to the described embodiments without departing from the scope of the invention as defined in the claims.

Claims (7)

1. A depressive disorder detection algorithm based on cascading of a large language model and a small model, comprising the steps of:
firstly, arranging text contents;
comprehensively analyzing a plurality of depression scales, and editing a preliminary Prompt suitable for a large language model using the key information contained in the depression scales;
carefully selecting blog posts, and consolidating the remaining valid information into a single text;
secondly, writing a Prompt in Few-Shot form;
analyzing a plurality of user blogs using the preliminarily edited Prompt, and writing the analysis process into the Prompt to form In-Context Learning in Few-Shot-Learning form, obtaining the complete Prompt, so that the large language model can judge a plurality of items of data for a single sample; the main content concerns how often the user corresponding to the sample is in a number of states, where the frequency of each state is expressed as an integer from 0 to 3;
Thirdly, training a small-scale neural network;
training the small-scale neural network to make the final judgment, where the input data are the feature values extracted by the large language model, and the output value is the result of judging the user as depressed or healthy.
2. The depressive disorder detection algorithm based on cascading large language models and small models according to claim 1, wherein the method for carefully selecting the blog comprises the following steps:
All n blog posts of sample x_i are judged individually; if a post is forwarded lottery content, too-short content, or too-long content, it is removed, and the remaining posts are concatenated into one passage of text, which serves as the data of the single sample for subsequent judgment and is denoted x'_i;
where m is the number of blog posts selected to participate in the experiment, m ≤ n and m ≤ 30.
3. The depressive disorder detection algorithm based on the cascading of a large language model and a small model according to claim 1, wherein the depression scales comprise: the PHQ-8 depression scale, DSM-5 diagnostic criteria, the Hamilton Depression Rating Scale, the SDS self-rating depression scale, and the Burns Depression Checklist.
4. The depressive disorder detection algorithm based on cascading large language model and small model according to claim 1, wherein the method for writing Few-shot form promt is as follows:
Manual diagnostic analysis is performed on J samples with real labels; the J samples are denoted $\hat{x}_j$, with corresponding real labels $\hat{y}_j$, and the analysis result for sample $\hat{x}_j$ is arranged into a format convenient for programmatic batch statistics, denoted $\widehat{Answer}_j$;
The Prompt in Few-shot form is denoted Prompt_FS; it is formed by appending the J manually analyzed example pairs $(\hat{x}_j, \widehat{Answer}_j)$ to the preliminary Prompt_ZS, where J denotes the number of $(\hat{x}_j, \widehat{Answer}_j)$ pairs used;
The complete Prompt_FS containing the selected blog text x'_i of the ith sample is input into the GPT-3.5 model, and the output of GPT-3.5 is obtained and denoted Answer_i;
16 items of data are extracted from Answer_i, i.e., the 16-dimensional depression-related features of the ith sample user.
5. The depressive disorder detection algorithm based on the cascading of a large language model and a small model according to claim 4, wherein J = 2, i.e., j ∈ {1,2}.
6. The depressive disorder detection algorithm based on cascading large language models and small models according to claim 4, wherein the training method of the small-scale neural network is as follows:
A small-scale neural network is established to convert the features extracted by the large language model into confidences of depression and health; the model input is 16-dimensional, corresponding to the 16 items of data sorted from Answer_i, and the output is 2-dimensional, corresponding to non-depression and depression.
The final prediction can be expressed as pred_i = argmax(output_i), i.e., the index of the larger of the two output confidences; sample x_i is judged as depression if the prediction result is 1 and as non-depression if the prediction result is 0.
7. The depressive disorder detection algorithm based on the cascading of a large language model and a small model according to claim 6, wherein the selected Loss is the negative log-likelihood loss, computed in the manner of NLLLoss in Pytorch:

Loss = -(1/N) Σ_{i=1}^{N} y_i[ŷ_i]

where y_i represents the predicted log-probability vector for the ith sample and ŷ_i represents the true label of the ith sample.
CN202410397717.8A 2024-04-02 2024-04-02 Depressive disorder detection algorithm based on cascading of large language model and small model Pending CN118213068A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410397717.8A CN118213068A (en) 2024-04-02 2024-04-02 Depressive disorder detection algorithm based on cascading of large language model and small model


Publications (1)

Publication Number Publication Date
CN118213068A 2024-06-18

Family

ID=91448079



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination