CN115935075A - Social network user depression detection method integrating tweet information and behavior characteristics - Google Patents

Info

Publication number
CN115935075A
Authority
CN
China
Prior art keywords
user
layer
text
depression
historical
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310045687.XA
Other languages
Chinese (zh)
Other versions
CN115935075B (en
Inventor
王李冬
曹世华
胡克用
李文娟
安康
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Dayu Chuangfu Technology Co ltd
Original Assignee
Qianjiang College of Hangzhou Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qianjiang College of Hangzhou Normal University filed Critical Qianjiang College of Hangzhou Normal University
Priority to CN202310045687.XA priority Critical patent/CN115935075B/en
Publication of CN115935075A publication Critical patent/CN115935075A/en
Application granted granted Critical
Publication of CN115935075B publication Critical patent/CN115935075B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a method for detecting depression in social network users that integrates tweet information and behavior characteristics. First, a user data set is crawled from Sina Weibo, the text is cleaned, and a depressed-user data set and a non-depressed-user data set are generated through manual labeling. Next, a multichannel CNN is combined with an attention-based BiGRU to analyze the emotional tendency of each of the user's tweets; a portion of the positive-emotion tweets is filtered out, and the remaining tweets form the user's historical text. Then, features such as posting time, forwarding behavior, and image publishing behavior are extracted to form a user behavior feature vector. Finally, a depression detection model that fuses the user's historical text and behavior is built and trained with the Adam optimization method; after training, the model is used to detect users under test. The method effectively integrates tweet information and user behavior characteristics to automatically detect a user's depression state, and offers low detection cost and convenient operation.

Description

Social network user depression detection method integrating tweet information and behavior characteristics
Technical Field
The invention relates to the field of automatic depression detection, and in particular to a technique for detecting depression in social network users based on tweet information and user behavior characteristics.
Background
Depression is a serious mental disorder that affects patients' physical and mental health. According to World Health Organization statistics, the number of people with depression worldwide has reached 322 million. Accurate diagnosis is a prerequisite for treatment, but a diagnosis is only possible if the patient actively contacts mental health professionals and seeks medical advice. However, because most people lack the relevant medical knowledge and do not recognize the risk of the disease, or because of factors such as shame, more than 70% of patients with early-stage depression do not receive effective treatment. An automatic depression screening technique that requires no in-person consultation is therefore urgently needed to identify potential patients and to reduce the harm depression causes to individuals and society, by providing automatic early warning or auxiliary diagnosis to medical institutions and similar bodies.
Current automatic depression detection methods mainly rely on voice or video features. For example, Srimadhur et al. proposed a spectrogram-based convolutional neural network for processing voice signals, which achieves an accuracy of about 60%. Negi et al. used attributes such as voice, pitch, and rhythm to build a depression detection model. Melo et al. proposed a prediction method based on distribution learning that analyzes facial expressions, explores the relationship between facial images and depression levels, and is robust to noisy data and uncertain labels. These methods share a limitation: most of them analyze voice, facial image, and video data collected during diagnosis and treatment, and acquiring such data requires the user to actively seek medical advice.
With the popularity of social networks, more and more users share their emotions and feelings on social media such as Twitter and Facebook, and a growing number of researchers have found that social media can serve as a window onto a user's mental health. For example, Shen et al. studied the Twitter platform and found that depressed and non-depressed users behave differently on the platform. Chiu et al. predicted a composite depression score for each Instagram post using features such as images and text, and took the time interval between posts into account. Zogan et al. focused on text, encoding text semantics with a multi-layer attention mechanism and predicting the probability that a user is depressed with a neural network. However, these methods still have several problems:
1) A user's tweets contain a large amount of irrelevant noise text, which interferes with the detection of depressed users and reduces detection accuracy. Most algorithms nevertheless analyze all of a user's historical tweets and therefore cannot achieve a satisfactory detection result.
2) Most existing methods ignore the behavioral attributes of a user's posts, such as the posting time, whether a tweet contains an image, and whether a tweet is forwarded.
Disclosure of Invention
The invention aims to overcome the shortcomings of the prior art and provides a depression detection method that fuses tweet text and user behavior characteristics.
The technical scheme adopted by the invention comprises the following steps:
Step 1. Crawl a user data set from Sina Weibo, clean the text, and generate a depressed-user data set and a non-depressed-user data set through manual labeling (the data acquisition module).
Step 2. Combine a multichannel CNN with an attention-based BiGRU to analyze the emotional tendency of each of the user's tweets and obtain its emotional-tendency probability value. Randomly remove a proportion p of the positive-emotion tweets from each user's historical tweets, and concatenate the remaining tweets into the user historical text T (the tweet emotional tendency analysis module).
Step 3. Extract features such as the user's posting time, forwarding behavior, and image publishing behavior to form a user behavior feature vector (the user behavior feature module).
Step 4. Build a depression detection model that fuses the user historical text T and the user behavior feature vector. T is input into a BiGRU layer and a feed-forward attention layer to obtain the feature vector of each user's historical text; this feature vector is fused with the user behavior feature vector, and the result is input into a fully connected layer and a softmax layer.
Step 5. Train the depression detection model with the Adam optimization method and, after training, detect users' depression states on a test set.
Further, step 1 is implemented as follows:
1-1. Collect a data set of candidate depressed users from Sina Weibo. Several depression-related topics, such as "depression" and "juvenile depression", are randomly selected, and candidate depressed users are crawled from each topic. The historical data of these candidate users is then crawled, including historical tweets, posting times, whether each tweet is forwarded, and whether each tweet contains an image.
1-2. From the candidate data set, select as depressed users those who mention a history of depression diagnosis in their tweets. In addition, a user whose tweets contain depression-related symptom words such as "depression" and related therapeutic drugs such as "sertraline" or "fluoxetine" is also labeled as a depressed user.
1-3. Randomly select users from topics unrelated to depression, such as "happy today", "food", and "travel", and crawl their historical data, including historical tweets, posting times, whether each tweet is forwarded, and whether each tweet contains an image, to form the non-depressed-user data set.
1-4. Clean each user's text data through word segmentation and data filtering. Word segmentation is performed with the "Jieba" word segmentation package. Data filtering mainly removes "#" topics, URL information, irregular characters, stop words, and official-account users, and converts emoticons into text.
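The filtering part of step 1-4 can be illustrated with a minimal sketch. The regular expressions, the bracketed emoticon format, the stop-word list, and the function name `clean_tweet` are all assumptions for illustration; the patent does not specify them. Word segmentation is passed in as a callable (whitespace splitting by default) so that a Chinese segmenter such as Jieba could be substituted; removing official-account users is a user-level step and is not shown.

```python
import re

# Hypothetical patterns; the source does not give the exact filtering rules.
URL_RE = re.compile(r"https?://\S+")          # URL information
TOPIC_RE = re.compile(r"#[^#]+#")             # "#topic#" style topic markers
EMOTICON_RE = re.compile(r"\[([^\[\]]+)\]")   # assumed emoticon codes such as "[cry]"
IRREGULAR_RE = re.compile(r"[^\w\s]")         # anything that is not a word character or whitespace

STOP_WORDS = {"the", "a", "的", "了"}          # placeholder stop-word list


def clean_tweet(text, tokenize=str.split):
    """Return the cleaned token list for one tweet."""
    text = URL_RE.sub(" ", text)
    text = TOPIC_RE.sub(" ", text)
    text = EMOTICON_RE.sub(r" \1 ", text)     # keep the emoticon name as plain text
    text = IRREGULAR_RE.sub(" ", text)
    return [tok for tok in tokenize(text) if tok not in STOP_WORDS]


tokens = clean_tweet("#depression# I feel so tired today [cry] http://t.cn/abc !!")
print(tokens)
```

Passing a different `tokenize` callable is the only change needed to switch from whitespace splitting to a real word segmenter.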
Further, step 2 is implemented as follows:
2-1. Pre-train a CBOW model on a large-scale Chinese Wikipedia data set to obtain embedding vectors of Chinese words. After passing through the CBOW model, each historical tweet $t_i$ of a user is represented as a matrix $S \in \mathbb{R}^{n \times d}$, where $n$ is the number of words in the tweet and $d$ is the embedding dimension of each word.
2-2. As shown in FIG. 2, the matrix $S$ is input into a multichannel CNN, which contains a convolutional layer and a pooling layer. In the convolutional layer, let the convolution kernel be $W \in \mathbb{R}^{h \times d}$, where $h \in \{2, 3, 4\}$ is the kernel size. The kernel $W$ produces the feature vector $a = [a_0, a_1, \ldots, a_{n-h}] \in \mathbb{R}^{n-h+1}$ with $a_i = \sigma(W \cdot S_{i:i+h-1} + b)$, where $\sigma$ is a non-linear function, $b$ is a bias term, and $S_{i:i+h-1}$ denotes rows $i$ through $i+h-1$ of the matrix $S$. In the pooling layer, the convolutional outputs under the different kernels are pooled to extract the most important feature $O$ of fixed dimension.
2-3. Each tweet $t_i$ is input into the attention-based BiGRU model. The first layer is a BiGRU layer with a forward GRU and a backward GRU; the hidden-layer outputs from the two directions are concatenated as the final output of the BiGRU layer. The second layer is a feed-forward attention layer that produces a fixed-length representation vector:
$$c_i = \tanh(W_i h_i + b_i)$$
$$\alpha_i = \frac{\exp(c_i)}{\sum_{k} \exp(c_k)}$$
$$h = \sum_{i} \alpha_i h_i$$
where $h_i$ is the output vector of word $s_i$ at the BiGRU layer, $c_i$ is the output of the fully connected layer, $W_i \in \mathbb{R}^{1 \times d}$ and $b_i \in \mathbb{R}$ are the weight and bias used in the attention calculation, $\alpha_i$ is the attention coefficient of word $s_i$, and $h$ is the output of the attention layer.
2-4. As shown in FIG. 2, the feature $O$ output by step 2-2 and the attention-layer output $h$ of step 2-3 are concatenated into the vector $V = [O, h]$. $V$ is input into a fully connected layer, followed by a dropout layer to prevent overfitting. A softmax layer after the dropout layer outputs the positive and negative emotional-tendency probabilities of the user's tweet $t_i$, $p(y_i = \text{'positive'})$ and $p(y_i = \text{'negative'})$: the former is the probability that the tweet has a positive emotional tendency, and the latter is the probability that it has a negative emotional tendency.
2-5. The model is trained with an Adam optimizer.
2-6. A proportion $p$ of the positive-emotion tweets is randomly removed from each user's historical tweets, and the remaining tweets are concatenated into the historical text $T$.
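The filtering in step 2-6 can be sketched as follows. This is only an illustration under stated assumptions: a tweet is treated as positive when its predicted positive probability exceeds a threshold of 0.5, the remaining tweets are joined with spaces, and the function name `build_history_text` and the fixed seed are hypothetical. The source specifies only that a proportion p of positive tweets is removed at random.

```python
import random


def build_history_text(tweets, p_positive, p=0.5, threshold=0.5, seed=0):
    """Drop a proportion p of the positive tweets and join the rest into one text.

    tweets      -- list of cleaned tweet strings for one user
    p_positive  -- predicted probability of positive sentiment for each tweet
    threshold   -- assumed cut-off for calling a tweet positive (not given in the source)
    """
    rng = random.Random(seed)
    positive_idx = [i for i, prob in enumerate(p_positive) if prob > threshold]
    n_remove = int(len(positive_idx) * p)
    removed = set(rng.sample(positive_idx, n_remove))
    kept = [t for i, t in enumerate(tweets) if i not in removed]
    return " ".join(kept)


tweets = ["a", "b", "c", "d", "e", "f"]
probs = [0.9, 0.8, 0.1, 0.7, 0.2, 0.95]
history = build_history_text(tweets, probs, p=0.5)
print(history)
```

Negative tweets are never removed, so only the share of positive text in T shrinks.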
Further, step 3 is implemented as follows:
3-1. To extract a user's posting-time features, the proportion of tweets the user publishes in each hour of the week is computed. In a specific implementation, the proportion for a given hour is calculated from the number of tweets published in that hour:
$$\text{proportion} = \frac{\text{number of tweets published in the given hour}}{\text{total number of tweets}}$$
The posting times within one day form a 24-dimensional feature, and those within one week form a 168-dimensional feature, denoted $f_t$.
3-2. To extract a user's forwarding-behavior features, the forwarding labels of the user's first 150 historical tweets are taken as the forwarding-behavior feature vector. If a tweet is forwarded from another user's tweet, its forwarding label is set to 1; otherwise it is set to 0. If a user has fewer than 150 historical tweets, the vector is padded with 1s. The resulting forwarding-behavior feature vector is denoted $f_r$.
3-3. To extract a user's image-publishing features, the image-publishing labels of the user's first 150 historical tweets are taken to form a feature vector. If a tweet published by the user contains image information, its image-publishing label is set to 1; otherwise it is set to 0. If a user has fewer than 150 historical tweets, the vector is padded with 0s. The resulting image-publishing feature vector is denoted $f_g$.
3-4. Because different features have different value ranges, the feature $f_t$ is normalized to $[0, 1]$ by min-max normalization to give $f'_t$. Then $f'_t$, $f_r$, and $f_g$ are concatenated to obtain $f$, the user's behavior feature vector of dimension 468.
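Steps 3-2 to 3-4 amount to padding two label vectors to length 150, normalizing the 168-dimensional time feature, and concatenating the three parts (168 + 150 + 150 = 468). The sketch below shows this under stated assumptions: the helper names are hypothetical, the "first 150" tweets are taken in list order, and a constant time feature is mapped to all zeros, a case the source does not address.

```python
def pad_labels(labels, length, fill):
    """Truncate to `length` items, then pad with `fill` up to `length`."""
    labels = list(labels)[:length]
    return labels + [fill] * (length - len(labels))


def min_max(values):
    """Min-max normalize to [0, 1]; a constant vector maps to zeros (assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def behavior_vector(f_t, forward_labels, image_labels, n_tweets=150):
    """Concatenate f'_t (168), f_r (150) and f_g (150) into f (468)."""
    assert len(f_t) == 168
    f_r = pad_labels(forward_labels, n_tweets, fill=1)  # padded with 1s, as in step 3-2
    f_g = pad_labels(image_labels, n_tweets, fill=0)    # padded with 0s, as in step 3-3
    return min_max(f_t) + f_r + f_g


f_t = [0.0] * 168
f_t[5], f_t[8] = 0.1, 0.05
f = behavior_vector(f_t, forward_labels=[1, 0, 1], image_labels=[0, 1])
print(len(f))
```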
Further, step 4 is implemented as follows:
4-1. Using the CBOW model trained in step 2-1, obtain the embedding vector of each word in a user's historical text $T$, forming the historical tweet sequence $S' \in \mathbb{R}^{m \times d}$, where $m$ is the total number of words in the historical tweet sequence and $d = 300$ is the embedding dimension of each word.
4-2. As shown in FIG. 3, the historical tweet sequence $S'$ of each user is input into the attention-based BiGRU model. The first layer is a BiGRU layer with a forward GRU and a backward GRU; the hidden-layer outputs from the two directions are concatenated as the final output of the BiGRU. The second layer is a feed-forward attention layer that produces a fixed-length representation vector:
$$c_i' = \tanh(W_i' h_i' + b_i')$$
$$\alpha_i' = \frac{\exp(c_i')}{\sum_{k} \exp(c_k')}$$
$$h' = \sum_{i} \alpha_i' h_i'$$
where $h_i'$ is the output vector at the BiGRU of word $s_i'$ in the historical tweet sequence $S'$, $c_i'$ is the output of the fully connected layer, $W_i' \in \mathbb{R}^{1 \times d}$ and $b_i' \in \mathbb{R}$ are the weight and bias used in the attention calculation, $\alpha_i'$ is the attention coefficient of word $s_i'$, and $h'$ is the output of the attention layer, i.e., the feature vector of the user's historical text.
4-3. As shown in FIG. 3, the feature vector $h'$ of the user's historical text and the user behavior feature vector $f$ are concatenated and input into a fully connected layer followed by a sigmoid layer, which outputs the user's depression probability value
Figure SMS_6
Figure SMS_7
where
Figure SMS_8
represents the output of the fully connected layer, and $W_f$ and $b_f$ denote its weights and bias. The cross-entropy loss function is defined as:
Figure SMS_9
where $K$ is the number of training samples.
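Since the equation images for step 4-3 are not available in this text, the following is only a sketch of one common form of such an output head, not the patent's exact formulas. It assumes a fully connected layer with a single output unit applied to the concatenation of h' and f, a sigmoid on top, and binary cross-entropy averaged over K samples; the function names and the clipping constant `eps` are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_depression(h_text, f_behavior, w_f, b_f):
    """Sigmoid of a single-unit fully connected layer over the concatenation [h'; f]."""
    x = list(h_text) + list(f_behavior)
    z = sum(w * v for w, v in zip(w_f, x)) + b_f
    return sigmoid(z)


def cross_entropy(y_true, y_prob, eps=1e-12):
    """Binary cross-entropy averaged over the K training samples."""
    k = len(y_true)
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)   # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / k


prob = predict_depression([0.0, 0.0], [0.0], w_f=[1.0, 1.0, 1.0], b_f=0.0)
loss = cross_entropy([1, 0], [0.5, 0.5])
print(prob, loss)
```

With all-zero inputs and zero bias the predicted probability is 0.5, and the loss of an uninformative 0.5 prediction is ln 2.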
Further, step 5 is implemented as follows:
5-1. The model in FIG. 3 is trained on the training set with an Adam optimizer.
5-2. After training, the test set is input into the tweet emotion judgment model, a proportion p of the positive-emotion tweets is filtered out, and each user's remaining tweets form that user's historical text. The user's behavior feature vector is extracted according to step 3. The behavior feature vector and the historical text are then input into the trained automatic detection model, which outputs the probability that the user suffers from depression.
The invention has the following beneficial effects:
The method focuses on how to effectively fuse a user's tweet information and behavior characteristics to automatically detect the user's depression state. Based on public social platform data, it can track a user's psychological and behavioral condition at any time, can be used for automatic detection of depressed users, and can serve as an early automatic screening technique for depressed users in a social network, with low detection cost and convenient operation. The method extracts features such as posting time, forwarding behavior, and image publishing behavior to form a user behavior feature vector, and then builds a depression detection model that fuses the user's historical text and behavior. It can automatically predict potentially depressed users of a social network and provides a useful technical means for auxiliary diagnosis of depression in hospitals, early warning and tracking of psychological problems among college students, entry assessment of employees of enterprises and public institutions, and similar applications.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a diagram of the tweet emotion judgment model that integrates a multichannel CNN and an attention-based BiGRU;
FIG. 3 is a diagram of the automatic depression detection model that fuses the historical text sequence and user behavior.
Detailed Description
The invention will be further described with reference to the accompanying drawings.
As shown in FIG. 1, the method for detecting depression in social network users by fusing tweet information and behavior characteristics includes the following steps:
Step 1. Crawl a user data set from Sina Weibo, clean the text, and generate a depressed-user data set and a non-depressed-user data set through manual labeling.
Step 2. Combine a multichannel CNN with an attention-based BiGRU to analyze the emotional tendency of each of the user's tweets and obtain its emotional-tendency probability value. Randomly remove a proportion p of the positive-emotion tweets from each user's historical tweets, and concatenate the remaining tweets into the user historical text T.
Step 3. Extract features such as the user's posting time, forwarding behavior, and image publishing behavior to form a user behavior feature vector.
Step 4. Build a depression detection model that fuses the user's historical text and behavior. T is input into a BiGRU layer and a feed-forward attention layer to obtain the feature vector of each user's historical text; this feature vector is fused with the user behavior feature vector, and the result is input into a fully connected layer and a softmax layer.
Step 5. Train the detection model with the Adam optimization method and, after training, detect users' depression states on a test set.
Further, step 1 is implemented as follows:
1-5. Collect a data set of candidate depressed users from Sina Weibo. Several depression-related topics, such as "depression" and "juvenile depression", are selected, and candidate depressed users are crawled from each topic. The historical data of these users is crawled, including historical tweets, posting times, whether each tweet is forwarded, and whether each tweet contains an image.
1-6. From the candidate data set, select as depressed users those who mention a history of depression diagnosis in their tweets. In addition, a user whose tweets contain depression-related symptom words such as "depression" and related therapeutic drugs such as "sertraline" or "fluoxetine" is also labeled as a depressed user.
1-7. Randomly select users from topics such as "happy today", "food", and "travel", and crawl their historical data, including historical tweets, posting times, whether each tweet is forwarded, and whether each tweet contains an image, to form the non-depressed-user data set.
Following these steps, data was crawled from the domestic social platform Sina Weibo to generate a large data set of depressed and non-depressed users. The data set contains 6423 depressed users and 8617 normal users, as detailed below:
TABLE 1 Sina Weibo user data set details
                      Number of users    Number of tweets
Depressed users       6423               207322
Non-depressed users   8617               496327
Total                 15040              703649
1-8. Clean each user's text data through word segmentation and data filtering. Word segmentation is performed with the "Jieba" word segmentation package. Data filtering mainly removes "#" topics, URL information, irregular characters, stop words, and official-account users, and converts emoticons into text.
Further, step 2 is implemented as follows:
2-1. Pre-train a CBOW model on a large-scale Chinese Wikipedia data set to obtain embedding vectors of Chinese words. The invention sets the word-vector size to 300. After passing through the CBOW model, each historical tweet $t_i$ of a user is represented as a matrix $S \in \mathbb{R}^{n \times d}$, where $n$ is the number of words in the tweet and $d$ is the embedding dimension of each word.
2-2. As shown in FIG. 2, $S$ is input into a multichannel CNN, which contains a convolutional layer and a pooling layer. In the convolutional layer, let the convolution kernel be $W \in \mathbb{R}^{h \times d}$, where $h \in \{2, 3, 4\}$ is the kernel size. The kernel $W$ produces the feature vector $a = [a_0, a_1, \ldots, a_{n-h}] \in \mathbb{R}^{n-h+1}$ with $a_i = \sigma(W \cdot S_{i:i+h-1} + b)$, where $\sigma$ is a non-linear function, $b$ is a bias term, and $S_{i:i+h-1}$ denotes rows $i$ through $i+h-1$ of the matrix $S$. The invention uses 128 kernels of each size with a stride of 1. In the pooling layer, the convolutional outputs under the different kernels are pooled to extract the most important feature $O$ of fixed dimension 128 × 3.
2-3. Each tweet is input into the attention-based BiGRU model. The first layer is a BiGRU layer with a forward GRU and a backward GRU, and the invention sets the hidden-layer dimension to 128. In this layer, the hidden-layer outputs from the two directions are concatenated as the final output of the BiGRU. The second layer is a feed-forward attention layer that produces a fixed-length representation vector:
$$c_i = \tanh(W_i h_i + b_i)$$
$$\alpha_i = \frac{\exp(c_i)}{\sum_{k} \exp(c_k)}$$
$$h = \sum_{i} \alpha_i h_i$$
where $h_i$ is the output vector of word $s_i$ at the BiGRU, $c_i$ is the output of the fully connected layer, $W_i \in \mathbb{R}^{1 \times d}$ and $b_i \in \mathbb{R}$ are the weight and bias used in the attention calculation, $\alpha_i$ is the attention coefficient of word $s_i$, and $h$ is the output of the attention layer, with a fixed length of 128.
2-4. As shown in FIG. 2, the output $O$ of step 2-2 and the output $h$ of step 2-3 are concatenated into the vector $V = [O, h]$ of dimension 512. $V$ is input into a fully connected layer, followed by a dropout layer to prevent overfitting. A softmax layer after the dropout layer outputs the positive and negative emotional-tendency probabilities of the user's tweet $t_i$, $p(y_i = \text{'positive'})$ and $p(y_i = \text{'negative'})$: the former is the probability that the tweet has a positive emotional tendency, and the latter is the probability that it has a negative emotional tendency.
2-5. The model in FIG. 2 is trained with an Adam optimizer. Specifically, the mini-batch size is set to 100, the learning rate to 0.001, the number of epochs to 50, and the dropout rate to 0.5.
2-6. A proportion $p$ of the positive-emotion tweets is randomly removed from each user's historical tweets, and the remaining tweets are concatenated into the historical text $T$. In a specific implementation, $p = 0.5$.
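The multichannel convolution and pooling of step 2-2 can be illustrated with a tiny pure-Python sketch. Two choices here are assumptions not stated in the source: ReLU stands in for the non-linear function σ, and the pooling layer is implemented as max-over-time pooling. The function names are hypothetical, and real kernels would be learned rather than set to ones.

```python
def conv_feature_map(S, W, b):
    """Slide an h x d kernel W over the n x d matrix S; returns n-h+1 activations."""
    n, h, d = len(S), len(W), len(W[0])
    out = []
    for i in range(n - h + 1):
        z = b + sum(W[r][c] * S[i + r][c] for r in range(h) for c in range(d))
        out.append(max(0.0, z))   # ReLU as sigma (assumption)
    return out


def multichannel_features(S, kernels):
    """One max-pooled value per (W, b) kernel; max pooling is an assumption."""
    return [max(conv_feature_map(S, W, b)) for W, b in kernels]


# Toy input: n = 4 words, d = 2 embedding dimensions.
S = [[1, 0], [0, 1], [1, 1], [0, 0]]
ones = lambda h: [[1, 1] for _ in range(h)]
kernels = [(ones(2), 0.0), (ones(3), 0.0), (ones(4), 0.0)]   # kernel sizes h = 2, 3, 4
O = multichannel_features(S, kernels)
print(O)
```

With 128 kernels for each of the three sizes, the pooled vector has 128 × 3 = 384 entries, and concatenating it with the 128-dimensional attention output gives the 512-dimensional V of step 2-4.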
Further, step 3 is implemented as follows:
3-1. To extract a user's posting-time features, the proportion of tweets the user publishes in each hour of the week is computed. Specifically, the proportion for a given hour is calculated from the number of tweets published in that hour:
$$\text{proportion} = \frac{\text{number of tweets published in the given hour}}{\text{total number of tweets}}$$
The posting times within one day form a 24-dimensional feature, and those within one week form a 168-dimensional feature, denoted $f_t$. For example, if a user publishes 20 tweets in total and the tweet counts for hours 0 to 23 of a given day are [0,0,0,0,0,2,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0], then the 24-dimensional feature for that day is [0,0,0,0,0,0.1,0,0,0.05,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0].
3-2. To extract a user's forwarding-behavior features, the forwarding labels of the user's first 150 historical tweets are taken as the forwarding-behavior feature vector. If a tweet is forwarded from another user's tweet, its forwarding label is set to 1; otherwise it is set to 0. If a user has fewer than 150 historical tweets, the vector is padded with 1s. The resulting forwarding-behavior feature vector is denoted $f_r$.
3-3. To extract a user's image-publishing features, the image-publishing labels of the user's first 150 historical tweets are taken to form a feature vector. If a tweet published by the user contains image information, its image-publishing label is set to 1; otherwise it is set to 0. If a user has fewer than 150 historical tweets, the vector is padded with 0s. The resulting image-publishing feature vector is denoted $f_g$.
3-4. Because different features have different value ranges, the feature $f_t$ is normalized to $[0, 1]$ by min-max normalization to give $f'_t$. Then $f'_t$, $f_r$, and $f_g$ are concatenated to obtain $f$, the user's behavior feature vector of dimension 468.
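The posting-time feature of step 3-1 can be sketched as a simple histogram over 7 × 24 hour slots divided by the user's total tweet count. The input format (a list of `(weekday, hour)` pairs), the slot ordering, and the function name are assumptions made for illustration.

```python
def posting_time_features(timestamps):
    """Return the 168-dimensional vector of per-hour tweet proportions.

    timestamps -- list of (weekday, hour) pairs, weekday in 0..6 and hour in 0..23
    """
    counts = [0] * (7 * 24)
    for weekday, hour in timestamps:
        counts[weekday * 24 + hour] += 1
    total = len(timestamps)
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]


# 20 tweets in total: on day 0, two at hour 5 and one at hour 8; the rest on another day.
posts = [(0, 5)] * 2 + [(0, 8)] + [(3, 22)] * 17
f_t = posting_time_features(posts)
print(f_t[:24])
```

The first 24 entries have 0.1 at hour 5 and 0.05 at hour 8, which matches a day with two tweets at hour 5 and one at hour 8 out of 20 tweets.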
Further, step 4 is implemented as follows:
4-1. Using the CBOW model trained in step 2-1, obtain the embedding vector of each word in a user's historical text $T$, forming the matrix $S' \in \mathbb{R}^{m \times d}$, where $m$ is the total number of words in the historical tweet sequence and $d = 300$ is the embedding dimension of each word.
4-2. As shown in FIG. 3, the historical tweet sequence $S'$ of each user is input into the attention-based BiGRU model. The first layer is a BiGRU layer with a forward GRU and a backward GRU, and the invention sets the hidden-layer dimension to 128. In this layer, the hidden-layer outputs from the two directions are concatenated as the final output of the BiGRU. The second layer is a feed-forward attention layer that produces a fixed-length representation vector:
$$c_i' = \tanh(W_i' h_i' + b_i')$$
$$\alpha_i' = \frac{\exp(c_i')}{\sum_{k} \exp(c_k')}$$
$$h' = \sum_{i} \alpha_i' h_i'$$
where $h_i'$ is the output vector at the BiGRU of word $s_i'$ in the historical tweet sequence $S'$, $c_i'$ is the output of the fully connected layer, $W_i' \in \mathbb{R}^{1 \times d}$ and $b_i' \in \mathbb{R}$ are the weight and bias used in the attention calculation, $\alpha_i'$ is the attention coefficient of word $s_i'$, and $h'$ is the output of the attention layer. Specifically, $h'$ has a fixed length of 128.
4-3. As shown in FIG. 3, the feature vector $h'$ of the user's historical text and the user behavior feature vector $f$ are concatenated and input into a fully connected layer followed by a sigmoid layer, which outputs the user's depression probability value
Figure SMS_15
Figure SMS_16
where
Figure SMS_17
represents the output of the fully connected layer, and $W_f$ and $b_f$ denote its weights and bias. The cross-entropy loss function is defined as:
Figure SMS_18
where $K$ is the number of training samples.
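The feed-forward attention layer of step 4-2 scores each BiGRU output with a tanh-activated linear map, normalizes the scores with a softmax, and returns the weighted sum. The sketch below is a minimal illustration with one stated simplification: a single weight vector `w` and bias `b` are shared across all positions, whereas the source writes them with a position index. The function name is hypothetical.

```python
import math


def attention_pool(H, w, b):
    """Feed-forward attention over a sequence of hidden vectors.

    H -- list of m hidden vectors, each of length d
    w -- weight vector of length d (shared across positions, an assumption)
    b -- scalar bias
    Returns (pooled vector of length d, attention coefficients).
    """
    scores = [math.tanh(sum(wi * hi for wi, hi in zip(w, h)) + b) for h in H]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]   # subtract the max for numerical stability
    norm = sum(exps)
    alphas = [e / norm for e in exps]
    d = len(H[0])
    pooled = [sum(a * h[k] for a, h in zip(alphas, H)) for k in range(d)]
    return pooled, alphas


H = [[1.0, 0.0], [0.0, 1.0]]
pooled, alphas = attention_pool(H, w=[0.0, 0.0], b=0.0)
print(pooled, alphas)
```

With zero weights every position gets the same score, so the coefficients are uniform and the pooled vector is the plain average of the hidden vectors; the output length always equals d regardless of the sequence length m.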
Further, step 5 is implemented as follows:
5-1. The model in FIG. 3 is trained on the training set with an Adam optimizer. Specifically, the mini-batch size is set to 200, the learning rate to 0.001, the number of epochs to 100, and the dropout rate to 0.5.
5-2. After training, the test set is input into the tweet emotion judgment model, a proportion p of the positive-emotion tweets is filtered out, and each user's remaining tweets form that user's historical text. The user's behavior feature vector is extracted according to step 3. The behavior feature vector and the historical text are then input into the trained automatic detection model, which outputs the probability that the user suffers from depression. The crawled Sina Weibo user data set is divided into a training set and a test set at a ratio of 7:3. The evaluation metrics are F1-score, Recall, and Precision; the test results are shown in Table 2.
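A 7:3 split of the user data set can be sketched as a shuffled partition of user identifiers. Whether the original split was random, stratified by label, or seeded is not stated in the source; the seed, the function name, and the use of integer identifiers here are assumptions.

```python
import random


def split_users(user_ids, train_parts=7, test_parts=3, seed=42):
    """Shuffle user ids and split them train:test = train_parts:test_parts."""
    ids = list(user_ids)
    random.Random(seed).shuffle(ids)
    cut = len(ids) * train_parts // (train_parts + test_parts)   # integer arithmetic
    return ids[:cut], ids[cut:]


# 15040 users in total, as reported in Table 1.
train_ids, test_ids = split_users(range(15040))
print(len(train_ids), len(test_ids))
```

Splitting by user rather than by tweet keeps all tweets of one user on the same side of the split.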
TABLE 2 Test results
Method | Precision | Recall | F1-Score
TBF | 0.8581 | 0.7258 | 0.7864
EHLM | 0.8723 | 0.7896 | 0.8289
This patent | 0.8887 | 0.8749 | 0.8823
In addition, the present invention is compared with the TBF (Chiong et al.) and EHLM (Ansari et al.) methods. The results in Table 2 show that the present invention outperforms both methods on Precision, Recall and F1-Score, with the largest gains on Recall (0.1491 over TBF and 0.0853 over EHLM). On F1-Score, the invention improves on the TBF method by 0.0959 and on the EHLM method by 0.0534.
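The F1 comparison can be checked with a short script. The sketch below recomputes F1 as the harmonic mean of the Precision and Recall values copied from Table 2 and prints the gap between the reported F1 of this patent and that of each baseline. The harmonic-mean definition is the standard one and is an assumption about how the reported column was produced; recomputed values can differ from the reported ones in the last digits, for instance if scores were averaged per class or rounded differently, which the text does not state.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (precision, recall, reported F1) copied from Table 2.
table2 = {
    "TBF": (0.8581, 0.7258, 0.7864),
    "EHLM": (0.8723, 0.7896, 0.8289),
    "This patent": (0.8887, 0.8749, 0.8823),
}

ours = table2["This patent"][2]
for name, (p, r, reported) in table2.items():
    gap = ours - reported    # reported F1 of this patent minus reported F1 of the method
    print(f"{name}: recomputed F1={f1_score(p, r):.4f}, "
          f"reported={reported:.4f}, gap={gap:.4f}")
```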

Claims (6)

1. A social network user depression detection method fusing tweet information and behavior characteristics, characterized by comprising the following steps:
step 1, crawling a user data set from Sina Weibo, cleaning the text, and generating a depressed user data set and a non-depressed user data set through manual labeling;
step 2, fusing a multichannel CNN and an attention-based BiGRU to analyze the emotional tendency of each tweet of the user and obtain the emotional tendency probability value of the tweet; randomly removing a certain proportion p of the positive-emotion tweets from the historical tweets of each user, and concatenating the remaining tweets into a user historical text T;
step 3, extracting features of the user such as posting time, forwarding behavior and image publishing behavior to form a user behavior feature vector;
step 4, building a depression detection model that fuses the user historical text T and the user behavior feature vector: inputting T into a BiGRU layer and a feed-forward attention layer to obtain the feature vector of each user's historical text, fusing this feature vector with the user behavior feature vector, and then inputting the fused feature vector into a fully connected layer and a softmax layer;
step 5, training the depression detection model using the Adam optimization method, and detecting the depression state of the user using the test set after training is finished.
2. The method for detecting depression of social network users based on fusion of tweet information and behavior features as claimed in claim 1, wherein the step 1 is implemented as follows:
1-1, collecting a data set of candidate depressed users from Sina Weibo: randomly selecting several depression-related topics, and then crawling candidate depressed users from each of these topics; crawling the historical data of the candidate depressed users, wherein the historical data comprises historical tweets, posting time, whether each tweet is a forwarded tweet, and whether it publishes image information;
1-2, from the candidate depressed user data set, selecting users who mention a history of depression diagnosis in their tweets as depressed users; in addition, if a user's tweets contain depression-related symptom words, including "suicide" and "depression", and the names of some related therapeutic drugs, including "sertraline" and "fluoxetine", the user is likewise set as a depressed user;
1-3, randomly selecting users from topics unrelated to depression and crawling their historical data, wherein the historical data comprises historical tweets, posting time, whether each tweet is a forwarded tweet, and whether it publishes image information, to form a non-depressed user data set;
1-4, cleaning the text data of each user through word segmentation and data filtering; text word segmentation is performed using the "Jieba" word segmentation package; data filtering mainly comprises removing # topics, URL information, irregular characters, stop words and official account users, and converting emoticons into text information.
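The data filtering of step 1-4 might look roughly like the sketch below, which uses only Python's `re` module. The regular expressions, the Weibo-style `#topic#` form, the set of punctuation kept, and the function names are illustrative assumptions rather than the patent's actual rules. Word segmentation (done with Jieba in the claim), emoticon conversion and official-account filtering are left out; `drop_stop_words` simply takes tokens that a segmenter is assumed to have produced.

```python
import re

TOPIC_RE = re.compile(r"#[^#\s]+#")           # Weibo-style #topic# tags (assumed form)
URL_RE = re.compile(r"https?://\S+")          # http/https links
IRREGULAR_RE = re.compile(r"[^\w\s,.!?，。！？]")  # anything but word chars and basic punctuation

def clean_tweet(text):
    """Remove topic tags, URLs and irregular characters, then collapse spaces."""
    text = TOPIC_RE.sub(" ", text)
    text = URL_RE.sub(" ", text)
    text = IRREGULAR_RE.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip()

def drop_stop_words(tokens, stop_words):
    """Filter stop words from an already segmented token list."""
    return [t for t in tokens if t not in stop_words]

raw = "#depression# feeling low today~~ http://t.cn/abc"
print(clean_tweet(raw))
print(drop_stop_words(["feeling", "so", "low"], {"so"}))
```

The topic and URL patterns run before the irregular-character pattern so that `#`, `:` and `/` are still present when those patterns are matched.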
3. The method for detecting depression of social network users based on fusion of tweet information and behavior features as claimed in claim 2, wherein the step 2 is implemented as follows:
2-1, pre-training a CBOW model on a large-scale Chinese Wikipedia data set to obtain embedded vectors of Chinese words; passing each historical tweet t_i of a user through the CBOW model yields a matrix S ∈ R^{n×d}, where n denotes the number of words in the tweet and d denotes the embedding dimension of each word;
2-2, inputting the matrix S into a multichannel CNN, wherein the multichannel CNN comprises a convolution layer and a pooling layer; in the convolution layer, assume a convolution kernel W ∈ R^{h×d}, where h ∈ {2, 3, 4} is the size of the convolution kernel; applying the convolution kernel W yields the feature vector a = [a_0, a_1, ..., a_{n-h}] ∈ R^{n-h+1}, with a_i = σ(W·S_{i:i+h-1} + b), where σ denotes a non-linear function, b denotes a bias term, and S_{i:i+h-1} denotes rows i to i+h-1 of the matrix S; in the pooling layer, the outputs of the convolution layer under the different convolution kernels are input into the pooling layer, and the most important feature O of fixed dimension is extracted;
2-3, inputting each tweet into an attention-based BiGRU model; the first layer is designed as a BiGRU layer with a forward GRU structure and a backward GRU structure; in the first layer, the outputs of the hidden layers in both directions are concatenated as the final output of the BiGRU layer; the second layer is designed as a feed-forward attention layer to obtain a fixed-length representative vector:
c_i = tanh(W_i h_i + b_i)
Figure FDA0004055315870000021
Figure FDA0004055315870000022
where h_i denotes the output vector of the word s_i at the BiGRU layer, c_i denotes the output of the fully connected layer, W_i ∈ R^{1×d} and b_i ∈ R are the weight and bias used in the attention calculation, α_i denotes the attention distribution coefficient of the word s_i, and h denotes the output of the attention layer;
2-4, concatenating the output feature O of step 2-2 and the attention-layer output h of step 2-3 to obtain a vector V = [O, h]; inputting V into a fully connected layer, followed by a dropout layer to prevent overfitting; a softmax layer after the dropout layer outputs the positive and negative emotional tendency probability values p(y_i = 'positive') and p(y_i = 'negative') of the user's tweet t_i, where p(y_i = 'positive') is the probability that the tweet is inferred to have a positive emotional tendency and p(y_i = 'negative') is the probability that it is inferred to have a negative emotional tendency;
2-5, training the model using the Adam optimizer;
2-6, randomly removing a certain proportion p of the positive-emotion tweets from the historical tweets of each user, and concatenating the remaining tweets into a historical text T.
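Step 2-6 can be illustrated with a small sketch that randomly drops a proportion p of a user's positive-emotion tweets and joins the rest into the historical text T. It assumes that a tweet counts as positive when its predicted p(y_i = 'positive') exceeds 0.5, that the number of removed tweets is p times the number of positive tweets rounded to an integer, and that tweets are joined with a space; none of these details is given in the claim, and the function name is hypothetical.

```python
import random

def filter_positive_tweets(tweets, positive_probs, p, threshold=0.5, seed=None):
    """Randomly drop a proportion p of the tweets judged positive.

    A tweet is treated as positive when its probability exceeds `threshold`
    (an assumed cut-off); the remaining tweets keep their original order.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    rng = random.Random(seed)
    positive_idx = [i for i, prob in enumerate(positive_probs) if prob > threshold]
    n_remove = int(round(p * len(positive_idx)))
    removed = set(rng.sample(positive_idx, n_remove))
    return [t for i, t in enumerate(tweets) if i not in removed]

tweets = ["great day", "so tired", "love this", "cannot sleep"]
probs = [0.9, 0.2, 0.8, 0.1]           # made-up p(positive) per tweet
kept = filter_positive_tweets(tweets, probs, p=0.5, seed=0)
history_text = " ".join(kept)          # the user's historical text T
print(history_text)
```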
4. The method for detecting depression of social network users based on fusion of tweet information and behavior features as claimed in claim 3, wherein the step 3 is implemented as follows:
3-1, to extract a user's posting-time features, extracting the proportion of tweets the user posts in each hour of a week; the proportion is calculated from the number of tweets posted in the given time slot:
Figure FDA0004055315870000031
the tweet posting times within one day form a 24-dimensional feature, and those within one week form a 168-dimensional posting-time feature, denoted f_t;
3-2, to extract a user's forwarding behavior features, extracting the forwarding labels of the user's first 150 historical tweets as the forwarding behavior feature vector; if a tweet is forwarded from another person's tweet, its forwarding label is set to 1, otherwise to 0; if the user has fewer than 150 historical tweets, the vector is padded with 1; the resulting user forwarding behavior feature vector is denoted f_r;
3-3, to extract a user's image publishing features, extracting the image publishing labels of the user's first 150 historical tweets to form a feature vector; if a tweet published by the user contains image information, its image publishing label is set to 1, otherwise to 0; if the user has fewer than 150 historical tweets, the vector is padded with 0; the resulting image publishing feature vector is denoted f_g;
3-4, because different features have different value ranges, normalizing the feature f_t to [0, 1] by the min-max normalization method to give f'_t, and then concatenating f'_t, f_r and f_g to obtain f, the behavior feature vector of the user, of dimension 468.
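The construction of the 468-dimensional behavior vector (168 + 150 + 150) can be sketched as follows. The sketch assumes that the proportion for an hourly slot is the number of tweets in that slot divided by the user's total number of tweets (the claim's formula is only available as a figure), that slots are ordered Monday-first, and that min-max normalization is applied within each user's own 168 values; the claim does not settle these points. Padding values follow the claim: 1 for the forwarding vector and 0 for the image vector. All function names are hypothetical.

```python
from datetime import datetime

def posting_time_feature(timestamps):
    """168 hourly slots (7 days x 24 hours), each the share of tweets in that slot."""
    counts = [0] * 168
    for ts in timestamps:
        counts[ts.weekday() * 24 + ts.hour] += 1
    total = len(timestamps)
    return [c / total for c in counts] if total else [0.0] * 168

def min_max(values):
    """Scale values to [0, 1]; a constant vector maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

def padded_labels(labels, pad, length=150):
    """Keep the first `length` labels and pad shorter histories with `pad`."""
    head = list(labels[:length])
    return head + [pad] * (length - len(head))

def behavior_vector(timestamps, forwarded, has_image):
    f_t = min_max(posting_time_feature(timestamps))
    f_r = padded_labels(forwarded, pad=1)   # claim pads forwarding labels with 1
    f_g = padded_labels(has_image, pad=0)   # claim pads image labels with 0
    return f_t + f_r + f_g

stamps = [datetime(2023, 1, 30, 9, 0), datetime(2023, 1, 30, 9, 30),
          datetime(2023, 1, 31, 23, 5)]
f = behavior_vector(stamps, forwarded=[1, 0, 0], has_image=[0, 1, 0])
print(len(f))
```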
5. The method for detecting depression of social network users based on fusion of tweet information and behavior features as claimed in claim 4, wherein the step 4 is implemented as follows:
4-1, using the CBOW model trained in step 2-1 to obtain the embedded vector of each word in the historical text T of a user, forming a historical tweet sequence S' ∈ R^{m×d}, where m denotes the total number of words in the historical tweet sequence and d denotes the embedding dimension of each word, d = 300;
4-2, inputting the historical tweet sequence S' of each user into an attention-based BiGRU model; the first layer is designed as a BiGRU layer with a forward GRU structure and a backward GRU structure; in this layer, the outputs of the hidden layers in both directions are concatenated as the final output of the BiGRU; the second layer is designed as a feed-forward attention layer to obtain a fixed-length representative vector:
c'_i = tanh(W'_i h'_i + b'_i)
Figure FDA0004055315870000041
Figure FDA0004055315870000042
where h'_i denotes the output vector at the BiGRU layer of the word s'_i in the historical tweet sequence S', c'_i denotes the output of the fully connected layer, W'_i ∈ R^{1×d} and b'_i ∈ R are the weight and bias used in the attention calculation, α'_i denotes the attention distribution coefficient of the word s'_i, and h' denotes the output of the attention layer, i.e. the feature vector of the user's historical text;
4-3, concatenating the feature vector h' of the user's historical text and the user behavior feature vector f, inputting the result into a fully connected layer followed by a sigmoid layer, and outputting the depression probability value of the user
Figure FDA0004055315870000043
Figure FDA0004055315870000044
where
Figure FDA0004055315870000045
represents the output of the fully connected layer, and W_f and b_f denote its weight and bias; the cross-entropy loss function is defined as:
Figure FDA0004055315870000046
where K denotes the number of samples in the training set.
6. The method for detecting depression of social network users based on fusion of tweet information and behavior features as claimed in claim 5, wherein the step 5 is implemented as follows:
after the training of the depression detection model is finished, inputting the test set into the depression detection model, filtering out a certain proportion p of the positive-emotion tweets, forming the remaining tweets of each user into the user's historical tweets, extracting the behavior feature vector of the user according to step 3, inputting the behavior feature vector and the historical tweets into the trained depression detection model, and outputting the probability that the user suffers from depression.
CN202310045687.XA 2023-01-30 2023-01-30 Social network user depression detection method integrating text information and behavior characteristics Active CN115935075B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310045687.XA CN115935075B (en) 2023-01-30 2023-01-30 Social network user depression detection method integrating text information and behavior characteristics

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310045687.XA CN115935075B (en) 2023-01-30 2023-01-30 Social network user depression detection method integrating text information and behavior characteristics

Publications (2)

Publication Number Publication Date
CN115935075A true CN115935075A (en) 2023-04-07
CN115935075B CN115935075B (en) 2023-08-18

Family

ID=86654491

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310045687.XA Active CN115935075B (en) 2023-01-30 2023-01-30 Social network user depression detection method integrating text information and behavior characteristics

Country Status (1)

Country Link
CN (1) CN115935075B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110807320A (en) * 2019-11-11 2020-02-18 北京工商大学 Short text emotion analysis method based on CNN bidirectional GRU attention mechanism
CN112417098A (en) * 2020-11-20 2021-02-26 南京邮电大学 Short text emotion classification method based on CNN-BiMGU model
WO2021133157A1 (en) * 2019-12-23 2021-07-01 Mimos Berhad System and method for automatically detecting depression symptoms of a social media user
CN113220825A (en) * 2021-03-23 2021-08-06 上海交通大学 Modeling method and system of topic emotion tendency prediction model for personal tweet
CN113435192A (en) * 2021-06-15 2021-09-24 王丽亚 Chinese text emotion analysis method based on changing neural network channel cardinality
CN114628008A (en) * 2022-03-22 2022-06-14 广东工业大学 Social user depression tendency detection method based on heterogeneous graph attention network
CN115080725A (en) * 2022-06-02 2022-09-20 四川大学 Multi-element time series feature extraction method for depression symptoms of social network users
CN115392260A (en) * 2022-10-31 2022-11-25 暨南大学 Social media tweet emotion analysis method facing specific target

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
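Fang Zhenyu: "Method for detecting psychological disorders in social networks based on a depression lexicon", Computer Knowledge and Technology, no. 07, pages 44 - 47 *
Yi Shunming; Yi Hao; Zhou Guodong: "Research on a Twitter sentiment classification method using sentiment feature vectors", Journal of Chinese Computer Systems, no. 11, pages 2454 - 2458 *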

Also Published As

Publication number Publication date
CN115935075B (en) 2023-08-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240206

Address after: Room 801, 85 Kefeng Road, Huangpu District, Guangzhou City, Guangdong Province

Patentee after: Guangzhou Dayu Chuangfu Technology Co.,Ltd.

Country or region after: China

Address before: Hangzhou City, Zhejiang province 310036 Xiasha Higher Education Park forest Street No. 16

Patentee before: HANGZHOU NORMAL UNIVERSITY QIANJIANG College

Country or region before: China