CN115935075A - Social network user depression detection method integrating tweet information and behavior characteristics - Google Patents
- Publication number
- CN115935075A (application number CN202310045687.XA)
- Authority
- CN
- China
- Prior art keywords
- user
- layer
- text
- depression
- historical
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Machine Translation (AREA)
Abstract
The invention discloses a method for detecting depression in social network users that integrates tweet information and behavior characteristics. First, a user data set is crawled from Sina Weibo, the text is cleaned, and a depressed-user data set and a non-depressed-user data set are generated through manual labeling. Next, a multichannel CNN is combined with an attention-based BiGRU to analyze the emotional tendency of each tweet, a portion of the positive-emotion tweets is filtered out, and the remaining tweets form the user's historical text. Then, features such as posting time, forwarding behavior and image publishing behavior are extracted to form a user behavior feature vector. Finally, a depression detection model that fuses the user's historical text with the user's behavior is built and trained with the Adam optimization method; once trained, the model is used to detect users under test. The method effectively integrates tweet information and behavior characteristics to detect a user's depression state automatically, with low detection cost and convenient operation.
Description
Technical Field
The invention relates to the field of automatic depression detection, and in particular to a technique for detecting depression in social network users based on tweet information and user behavior characteristics.
Background
Depression is a serious mental disorder that affects patients' physical and mental health. According to World Health Organization statistics, the number of people with depression worldwide is as high as 322 million. Accurate diagnosis is a prerequisite for treatment, but a patient can only be diagnosed after actively contacting a mental health professional and seeking medical advice. However, because most people lack medical knowledge and do not recognize the risk of the disease, or because of factors such as shame, more than 70% of patients with early-stage depression do not receive effective treatment. An automatic depression screening technology that does not require an in-person consultation is therefore urgently needed, so that potential depression patients can be identified and the harm depression causes to individuals and society can be reduced through automatic early warning or auxiliary diagnosis provided to medical institutions.
Current automatic depression detection methods rely mainly on voice or video features. For example, Srimadhur et al. proposed a spectrogram-based convolutional neural network to process voice signals, achieving an accuracy of about 60%. Negi et al. used voice, pitch and rhythm attributes to build a depression detection model. Melo et al. proposed a prediction method based on distribution learning that builds on facial expression analysis, explores the relationship between facial images and depression levels, and is robust to noisy data and uncertain labels. These methods share one characteristic: most of them require voice, facial image or video data collected during diagnosis and treatment, and acquiring these data requires the user to actively seek medical advice.
With the popularity of social networks, more and more users share their emotions and feelings on social media such as Twitter and Facebook, and a growing number of researchers have found that social media can serve as a window onto a user's mental health. For example, Shen et al. studied the Twitter platform and found that depressed and non-depressed users behave differently on the social platform. Chiu et al. predicted a composite depression score for each Instagram post using features such as images and text, taking full account of the time interval between posts. Zogan et al. focused on text, encoding text semantics with a multi-layer attention mechanism and predicting a user's depression probability with a neural network. However, these methods still have several problems:
1) A user's historical tweets contain many irrelevant, noisy tweets, which interfere with the detection of depressed users and reduce accuracy. Most algorithms nevertheless analyze all of a user's historical tweets, so a satisfactory detection result cannot be achieved.
2) Most existing methods ignore the behavior attributes associated with publishing a tweet, such as the publishing time, whether the tweet contains an image, and whether the tweet is forwarded.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a depression detection method that fuses tweet text and user behavior characteristics.
The technical scheme adopted by the invention to solve the technical problem comprises the following steps:
Step 1, crawling a user data set from Sina Weibo, cleaning the text, and generating a depressed-user data set and a non-depressed-user data set through manual labeling; this is the data acquisition module.
Step 2, analyzing the emotional tendency of each tweet of a user by combining a multichannel CNN with an attention-based BiGRU to obtain the emotional tendency probability value of each tweet. A proportion p of the positive-emotion tweets is randomly removed from each user's historical tweets, and the remaining tweets are concatenated into the user historical text T; this is the tweet emotional tendency analysis module.
Step 3, extracting features such as the user's posting time, forwarding behavior and image publishing behavior to form a user behavior feature vector; this is the user behavior feature module.
Step 4, building a depression detection model that fuses the user historical text T and the user behavior feature vector: T is input into a BiGRU layer and a feed-forward attention layer to obtain the feature vector of the user's historical text, which is then fused with the user behavior feature vector and input into a fully connected layer and a softmax layer.
Step 5, training the depression detection model with the Adam optimization method and, after training, detecting users' depression state on a test set.
Further, the step 1 is specifically realized as follows:
1-1. A data set of candidate depressed users is collected from Sina Weibo. Several topics related to depression, such as "depression" and "juvenile depression", are randomly selected, and candidate depressed users are crawled from each topic. The historical data of these candidates is then crawled, including historical tweets, publishing time, whether each tweet is forwarded, and whether it contains an image.
1-2. From the candidate data set, users who mention a history of depression diagnosis in their tweets are selected as depressed users. In addition, users whose tweets contain depression-related symptom words such as "depression", or names of related therapeutic drugs such as "sertraline" and "fluoxetine", are likewise labeled as depressed users.
1-3. Users are randomly selected from topics unrelated to depression, such as "happy today", "food" and "travel", and their historical data is crawled, including historical tweets, publishing time, whether each tweet is forwarded, and whether it contains an image; these users form the non-depressed-user data set.
1-4. The text data of each user is cleaned through word segmentation and data filtering. Word segmentation uses the "Jieba" word segmentation package. Data filtering mainly removes "#" topic tags, URL information, irregular characters, stop words and official-account users, and converts emoticons into text.
Further, the step 2 is specifically realized as follows:
2-1. A CBOW model is pre-trained on a large-scale Chinese Wikipedia data set to obtain embedding vectors for Chinese words. Passing each historical tweet $t_i$ of a user through the CBOW model yields a matrix $S \in \mathbb{R}^{n \times d}$, where $n$ is the number of words in the tweet and $d$ is the embedding dimension of each word.
2-2. As shown in FIG. 2, the matrix $S$ is input into a multichannel CNN that contains convolutional and pooling layers. In the convolutional layer, let the convolution kernel be $W \in \mathbb{R}^{h \times d}$, where $h \in \{2, 3, 4\}$ is the kernel size. Applying $W$ yields the feature vector $a = [a_0, a_1, \ldots, a_{n-h}] \in \mathbb{R}^{n-h+1}$ with $a_i = \sigma(W \cdot S_{i:i+h-1} + b)$, where $\sigma$ is a non-linear function, $b$ is a bias term, and $S_{i:i+h-1}$ denotes rows $i$ through $i+h-1$ of $S$. In the pooling layer, the convolutional outputs for the different kernels are pooled to extract the most important feature $O$ of fixed dimension.
2-3. Each tweet $t_i$ is input into the attention-based BiGRU model. The first layer is a BiGRU layer with a forward GRU and a backward GRU; the hidden-layer outputs from both directions are combined as the final output of the BiGRU layer. The second layer is a feed-forward attention layer that produces a fixed-length representation vector:
$c_i = \tanh(W_i h_i + b_i)$
where $h_i$ is the BiGRU output vector for word $s_i$, $c_i$ is the output of the fully connected layer, $W_i \in \mathbb{R}^{1 \times d}$ and $b_i \in \mathbb{R}$ are the weight and bias used in the attention calculation, $\alpha_i$ is the attention coefficient of word $s_i$, and $h$ is the output of the attention layer.
2-4. As shown in FIG. 2, the feature $O$ output by step 2-2 and the attention-layer output $h$ of step 2-3 are concatenated into a vector $V = [O, h]$. $V$ is input into a fully connected layer, followed by a dropout layer to prevent overfitting. A softmax layer after the dropout layer outputs the positive and negative emotional tendency probabilities of the tweet $t_i$, $p(y_i = \text{'positive'})$ and $p(y_i = \text{'negative'})$, where $p(y_i = \text{'positive'})$ is the probability that the tweet has a positive emotional tendency and $p(y_i = \text{'negative'})$ is the probability that it has a negative emotional tendency.
2-5. The model is trained using an Adam optimizer.
2-6. A proportion p of the positive-emotion tweets is randomly removed from each user's historical tweets, and the remaining tweets are concatenated into the historical text T.
Further, the step 3 is realized as follows:
3-1. To extract a user's posting-time features, the proportion of tweets the user publishes in each hour of the week is computed. Specifically, the proportion for a given hour is the number of tweets published in that hour divided by the user's total number of tweets. The posting times within one day form a 24-dimensional feature, and those within one week form a 168-dimensional feature, denoted $f_t$.
3-2. To extract a user's forwarding behavior features, the forwarding labels of the user's first 150 historical tweets are taken as the forwarding behavior feature vector. If a tweet is forwarded from another user's tweet, its forwarding label is set to 1; otherwise it is set to 0. If a user has fewer than 150 historical tweets, the vector is padded with 1s. The resulting forwarding behavior feature vector is denoted $f_r$.
3-3. To extract a user's image publishing features, the image publishing labels of the user's first 150 historical tweets are taken as a feature vector. If a tweet contains image information, its image publishing label is set to 1; otherwise it is set to 0. If a user has fewer than 150 historical tweets, the vector is padded with 0s. The resulting image publishing feature vector is denoted $f_g$.
3-4. Because different features have different value ranges, $f_t$ is normalized to $[0, 1]$ by min-max normalization to give $f'_t$; then $f'_t$, $f_r$ and $f_g$ are concatenated to obtain $f$, the user's behavior feature vector of dimension 468 (168 + 150 + 150).
Further, the step 4 is implemented as follows:
4-1. Using the CBOW model trained in step 2-1, the embedding vector of each word in a user's historical text T is obtained, forming the historical tweet sequence $S' \in \mathbb{R}^{m \times d}$, where $m$ is the total number of words in the historical tweet sequence and $d = 300$ is the embedding dimension of each word.
4-2. As shown in FIG. 3, the historical tweet sequence $S'$ of each user is input into the attention-based BiGRU model. The first layer is a BiGRU layer with a forward GRU and a backward GRU; the hidden-layer outputs from both directions are combined as the final output of the BiGRU. The second layer is a feed-forward attention layer that produces a fixed-length representation vector:
$c'_i = \tanh(W'_i h'_i + b'_i)$
where $h'_i$ is the BiGRU output vector for word $s'_i$ in the historical tweet sequence $S'$, $c'_i$ is the output of the fully connected layer, $W'_i \in \mathbb{R}^{1 \times d}$ and $b'_i \in \mathbb{R}$ are the weight and bias used in the attention calculation, $\alpha'_i$ is the attention coefficient of word $s'_i$, and $h'$ is the output of the attention layer, i.e. the feature vector of the user's historical text.
4-3. As shown in FIG. 3, the feature vector $h'$ of the user's historical text and the user behavior feature vector $f$ are concatenated and input into a fully connected layer with weights $W_f$ and bias $b_f$, followed by a sigmoid layer that outputs the user's depression probability value. The model is trained with a cross-entropy loss function computed over the training set, where $K$ denotes the number of training samples.
Further, the step 5 is implemented as follows:
5-1. The model in FIG. 3 is trained on the training set using an Adam optimizer.
5-2. After training, the test set is input into the tweet sentiment model and a proportion p of the positive-emotion tweets is filtered out; the remaining tweets of each user form that user's historical text. The user's behavior feature vector is extracted according to step 3, the behavior feature vector and the historical text are input into the trained detection model, and the model outputs the probability that the user suffers from depression.
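The inference flow of step 5-2 can be sketched as a small pipeline. This is only an illustration of the order of operations: `sentiment_fn`, `filter_fn` and `detector_fn` are hypothetical callables standing in for the trained sentiment model, the positive-tweet filter and the trained detection model, none of which are specified at code level in the source.

```python
def detect_depression(tweets, behavior_vec, sentiment_fn, filter_fn, detector_fn):
    """Run the step 5-2 flow for one user and return a depression probability.

    All three callables are placeholders (assumptions) for trained components.
    """
    # 1) Score every tweet with the sentiment model: P(positive) per tweet.
    positive_probs = [sentiment_fn(t) for t in tweets]
    # 2) Drop a share of the positive tweets and join the rest into text T.
    history_text = filter_fn(tweets, positive_probs)
    # 3) Fuse text T with the behavior vector inside the detection model.
    return detector_fn(history_text, behavior_vec)
```

Keeping the three stages as separate callables mirrors the module split described in steps 2 to 4 and lets each stage be replaced independently.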
The invention has the following beneficial effects:
The method focuses on how to effectively fuse a user's tweet information with the user's behavior characteristics in order to detect the user's depression state automatically. Based on publicly available social platform data, it can track a user's psychological and behavioral condition at any time, can be used for the automatic detection of depressed users, and can serve as an early automatic screening technology for depressed users in social networks, with low detection cost and convenient operation. A user behavior feature vector is formed by extracting features such as posting time, forwarding behavior and image publishing behavior, and a depression detection model that fuses the user's historical text with the user's behavior is then built. The method can automatically predict potentially depressed social network users and provides a useful technical means for auxiliary diagnosis of depression in hospitals, early warning and tracking of psychological problems among college students, onboarding assessment of employees in enterprises and public institutions, and the like.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a diagram of the tweet emotion classification model that combines a multichannel CNN and an attention-based BiGRU;
FIG. 3 is a diagram of the automatic depression detection model that fuses historical text sequences and user behavior.
Detailed Description
The invention will be further described with reference to the accompanying drawings.
As shown in FIG. 1, the method for detecting depression in social network users by fusing tweet information and behavior characteristics includes the following steps:
Step 1, crawling a user data set from Sina Weibo, cleaning the text, and generating a depressed-user data set and a non-depressed-user data set through manual labeling.
Step 2, analyzing the emotional tendency of each tweet of a user by combining a multichannel CNN with an attention-based BiGRU to obtain the emotional tendency probability value of each tweet. A proportion p of the positive-emotion tweets is randomly removed from each user's historical tweets, and the remaining tweets are concatenated into the user historical text T.
Step 3, extracting features such as the user's posting time, forwarding behavior and image publishing behavior to form a user behavior feature vector.
Step 4, building a depression detection model that fuses the user historical text and the user behavior: T is input into a BiGRU layer and a feed-forward attention layer to obtain the feature vector of the user's historical text, which is then fused with the user behavior feature vector and input into a fully connected layer and a softmax layer.
Step 5, training the detection model with the Adam optimization method and, after training, detecting users' depression state on a test set.
Further, the step 1 is specifically realized as follows:
1-5. A data set of candidate depressed users is collected from Sina Weibo. Several topics related to depression, such as "depression" and "juvenile depression", are selected, and candidate depressed users are crawled from each topic. The historical data of these users is then crawled, including historical tweets, publishing time, whether each tweet is forwarded, and whether it contains an image.
1-6. From the candidate data set, users who mention a history of depression diagnosis in their tweets are selected as depressed users. In addition, users whose tweets contain depression-related symptom words such as "depression", or names of related therapeutic drugs such as "sertraline" and "fluoxetine", are likewise labeled as depressed users.
1-7. Users are randomly selected from topics such as "happy today", "food" and "travel", and their historical data is crawled, including historical tweets, publishing time, whether each tweet is forwarded, and whether it contains an image; these users form the non-depressed-user data set.
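The keyword rule used to label candidate depressed users can be illustrated with a simple pre-filter. This is a minimal sketch, not the patent's labeling procedure: the term lists below contain only the example words quoted in the text (in English, whereas real Weibo data would be Chinese), the diagnosis phrase is a made-up placeholder, and the source states that the final data sets are produced by manual labeling.

```python
# Placeholder phrase list (assumption): the source gives no exact wording.
DIAGNOSIS_PHRASES = ["diagnosed with depression"]
# Example terms quoted in the text; a real list would be longer.
SYMPTOM_OR_DRUG_TERMS = ["depression", "sertraline", "fluoxetine"]

def is_candidate_depressed(tweets):
    """Return True if any tweet mentions a diagnosis phrase or a listed term."""
    for tweet in tweets:
        lowered = tweet.lower()
        if any(phrase in lowered for phrase in DIAGNOSIS_PHRASES):
            return True
        if any(term in lowered for term in SYMPTOM_OR_DRUG_TERMS):
            return True
    return False
```

A substring match like this over-selects (for example, users discussing someone else's illness), which is one reason a manual review step remains necessary.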
Following these steps, data were crawled from the Chinese social platform Sina Weibo to generate a large data set of depressed and non-depressed users. The data set contains 6423 depressed users and 8617 normal users, as detailed below:
TABLE 1 Sina Weibo user data set
| | Number of users | Number of tweets |
| Depressed users | 6423 | 207322 |
| Non-depressed users | 8617 | 496327 |
| Total | 15040 | 703649 |
1-8. The text data of each user is cleaned through word segmentation and data filtering. Word segmentation uses the "Jieba" word segmentation package. Data filtering mainly removes "#" topic tags, URL information, irregular characters, stop words and official-account users, and converts emoticons into text.
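The tweet-level part of this cleaning step might look like the following sketch. Several details are assumptions rather than facts from the source: topics are taken to be wrapped as `#topic#`, "irregular characters" is read as anything that is not a word character or whitespace, the emoticon table is a two-entry stand-in, and the tokenizer defaults to whitespace splitting so the example runs without third-party packages (the source uses the Jieba package, which could be passed in as `tokenize`). Removing official-account users happens at the user level and is not shown.

```python
import re

URL_RE = re.compile(r"https?://\S+")
TOPIC_RE = re.compile(r"#[^#]+#")        # assumed "#topic#" marker format
IRREGULAR_RE = re.compile(r"[^\w\s]")    # assumed meaning of "irregular characters"
# Stand-in emoticon-to-text table (assumption); a real table would be larger.
EMOTICONS = {"😊": " smile ", "😢": " cry "}

def clean_tweet(text, stop_words, tokenize=str.split):
    """Strip URLs and topic tags, map emoticons to words, drop stop words."""
    text = URL_RE.sub(" ", text)
    text = TOPIC_RE.sub(" ", text)
    for symbol, word in EMOTICONS.items():
        text = text.replace(symbol, word)
    text = IRREGULAR_RE.sub(" ", text)
    return [tok for tok in tokenize(text) if tok not in stop_words]
```

Emoticons are converted before the irregular-character filter runs; otherwise the filter would delete them and their emotional signal would be lost.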
Further, the step 2 is specifically realized as follows:
2-1. A CBOW model is pre-trained on a large-scale Chinese Wikipedia data set to obtain embedding vectors for Chinese words. The invention sets the word vector size to 300. Passing each historical tweet $t_i$ of a user through the CBOW model yields a matrix $S \in \mathbb{R}^{n \times d}$, where $n$ is the number of words in the tweet and $d$ is the embedding dimension of each word.
2-2. As shown in FIG. 2, $S$ is input into a multichannel CNN that contains convolutional and pooling layers. In the convolutional layer, let the convolution kernel be $W \in \mathbb{R}^{h \times d}$, where $h \in \{2, 3, 4\}$ is the kernel size. Applying $W$ yields the feature vector $a = [a_0, a_1, \ldots, a_{n-h}] \in \mathbb{R}^{n-h+1}$ with $a_i = \sigma(W \cdot S_{i:i+h-1} + b)$, where $\sigma$ is a non-linear function, $b$ is a bias term, and $S_{i:i+h-1}$ denotes rows $i$ through $i+h-1$ of $S$. The invention uses 128 kernels of each size with a stride of 1. In the pooling layer, the convolutional outputs for the different kernels are pooled to extract the most important feature $O$ of fixed dimension 128 × 3 = 384.
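The convolution and pooling in step 2-2 can be written out directly. The sketch below is a plain-Python illustration, assuming ReLU for the non-linear function and max pooling for "extracting the most important feature"; neither choice is stated in the source. Each `(W, b)` pair is one kernel, so 128 kernels for each of the three sizes would give a 384-dimensional `O`.

```python
def conv_channel(S, W, b):
    """Slide an h x d kernel W over the n x d matrix S with stride 1.

    Returns [a_0, ..., a_{n-h}], where a_i = relu(W . S[i:i+h] + b).
    ReLU is an assumption; the source only says "non-linear function".
    """
    n, h = len(S), len(W)
    features = []
    for i in range(n - h + 1):
        total = b
        for r in range(h):
            total += sum(w * s for w, s in zip(W[r], S[i + r]))
        features.append(max(0.0, total))
    return features

def multichannel_features(S, kernels):
    """Max-pool each kernel's feature map and concatenate the results into O."""
    return [max(conv_channel(S, W, b)) for W, b in kernels]
```

Max pooling over positions is what makes `O` independent of the tweet length n, so tweets of different lengths map to vectors of the same size.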
2-3. Each tweet is input into the attention-based BiGRU model. The first layer is a BiGRU layer with a forward GRU and a backward GRU, and the invention sets the hidden-layer dimension to 128. In this layer, the hidden-layer outputs from both directions are combined as the final output of the BiGRU. The second layer is a feed-forward attention layer that produces a fixed-length representation vector:
$c_i = \tanh(W_i h_i + b_i)$
where $h_i$ is the BiGRU output vector for word $s_i$, $c_i$ is the output of the fully connected layer, $W_i \in \mathbb{R}^{1 \times d}$ and $b_i \in \mathbb{R}$ are the weight and bias used in the attention calculation, $\alpha_i$ is the attention coefficient of word $s_i$, and $h$ is the output of the attention layer, with a fixed length of 128.
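Only the formula for $c_i$ is given in the text; the expressions for $\alpha_i$ and $h$ are not. In the usual feed-forward attention formulation they would be $\alpha_i = \exp(c_i) / \sum_j \exp(c_j)$ and $h = \sum_i \alpha_i h_i$, and the sketch below assumes exactly that. It also assumes a single weight vector `w` and bias `b` shared across words, whereas the text writes them with an index $i$.

```python
import math

def feed_forward_attention(H, w, b):
    """H: list of BiGRU output vectors h_i; w: weight vector; b: scalar bias."""
    # c_i = tanh(w . h_i + b), matching the formula given in the text
    scores = [math.tanh(sum(wj * hj for wj, hj in zip(w, h)) + b) for h in H]
    # alpha_i = softmax over c_i (assumed; not present in the text)
    exps = [math.exp(c) for c in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    # h = sum_i alpha_i * h_i (assumed; not present in the text)
    dim = len(H[0])
    pooled = [sum(a * h[k] for a, h in zip(alphas, H)) for k in range(dim)]
    return pooled, alphas
```

Because the output is a weighted average of the $h_i$, its length equals the BiGRU output size regardless of how many words the tweet contains.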
2-4. As shown in FIG. 2, the output $O$ of step 2-2 and the output $h$ of step 2-3 are concatenated into a vector $V = [O, h]$ of dimension 512. $V$ is input into a fully connected layer, followed by a dropout layer to prevent overfitting. A softmax layer after the dropout layer outputs the positive and negative emotional tendency probabilities of the tweet $t_i$, $p(y_i = \text{'positive'})$ and $p(y_i = \text{'negative'})$, where $p(y_i = \text{'positive'})$ is the probability that the tweet has a positive emotional tendency and $p(y_i = \text{'negative'})$ is the probability that it has a negative emotional tendency.
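The dimension in step 2-4 follows from the earlier settings: 3 × 128 pooled CNN features plus a 128-dimensional attention output give 512. A minimal sketch of the classification head is shown below; it omits dropout (which is only active during training), and the ordering of the two output classes and the names `W` and `b` are assumptions made for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sentiment_probs(O, h, W, b):
    """Concatenate O and h into V, apply a linear layer, then softmax.

    W has one row per class; row 0 is taken to be 'positive' (assumption).
    """
    V = list(O) + list(h)
    logits = [sum(wk * vk for wk, vk in zip(row, V)) + bk for row, bk in zip(W, b)]
    p_positive, p_negative = softmax(logits)
    return {"positive": p_positive, "negative": p_negative}
```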
2-5. The model in FIG. 2 is trained using an Adam optimizer. Specifically, the mini-batch size is set to 100, the learning rate to 0.001, the number of epochs to 50, and the dropout rate to 0.5.
2-6. A proportion p of the positive-emotion tweets is randomly removed from each user's historical tweets, and the remaining tweets are concatenated into the historical text T. In a specific implementation, p = 0.5.
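The filtering in step 2-6 can be sketched as follows. Two details are assumptions: a tweet is counted as positive when its predicted positive probability exceeds 0.5 (`threshold`), and the kept tweets are joined with a space. The function name and parameters are illustrative only.

```python
import random

def build_history_text(tweets, p_positive, p=0.5, threshold=0.5, seed=None):
    """Randomly drop a share p of the positive tweets and join the rest.

    tweets: list of str; p_positive: matching list of P(y_i = 'positive').
    """
    rng = random.Random(seed)
    positive_idx = [i for i, prob in enumerate(p_positive) if prob > threshold]
    n_remove = int(len(positive_idx) * p)
    removed = set(rng.sample(positive_idx, n_remove))
    kept = [t for i, t in enumerate(tweets) if i not in removed]
    return " ".join(kept)
```

Negative tweets are never removed, so the retained text keeps all of the signal the detection model is meant to pick up while shrinking the positive noise.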
Further, the step 3 is realized as follows:
3-1. To extract a user's posting-time features, the proportion of tweets the user publishes in each hour of the week is computed. Specifically, the proportion for a given hour is the number of tweets published in that hour divided by the user's total number of tweets. The posting times within one day form a 24-dimensional feature, and those within one week form a 168-dimensional feature, denoted $f_t$. For example, if a user has 20 tweets and the numbers published in hours 0 to 23 of a given day are [0,0,0,0,0,2,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0], the 24-dimensional feature for that day is [0,0,0,0,0,0.1,0,0,0.05,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0].
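A sketch of the posting-time feature is given below. It assumes that tweets are bucketed by weekday and hour (7 × 24 = 168 slots) and that each slot holds the share of the user's total tweets; the function name is illustrative.

```python
from datetime import datetime

def posting_time_feature(timestamps):
    """Return the 168-dim vector f_t: share of tweets per (weekday, hour) slot."""
    counts = [0] * (7 * 24)
    for ts in timestamps:
        # weekday() is 0 for Monday, so Monday 05:00 maps to slot 5
        counts[ts.weekday() * 24 + ts.hour] += 1
    total = len(timestamps)
    if total == 0:
        return [0.0] * (7 * 24)
    return [c / total for c in counts]
```

With 20 tweets, two of them in hour 5 and one in hour 8 of the same day, the corresponding slots come out as 0.1 and 0.05, matching the worked numbers in the text.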
3-2. To extract a user's forwarding behavior features, the forwarding labels of the user's first 150 historical tweets are taken as the forwarding behavior feature vector. If a tweet is forwarded from another user's tweet, its forwarding label is set to 1; otherwise it is set to 0. If a user has fewer than 150 historical tweets, the vector is padded with 1s. The resulting forwarding behavior feature vector is denoted $f_r$.
3-3. To extract a user's image publishing features, the image publishing labels of the user's first 150 historical tweets are taken as a feature vector. If a tweet contains image information, its image publishing label is set to 1; otherwise it is set to 0. If a user has fewer than 150 historical tweets, the vector is padded with 0s. The resulting image publishing feature vector is denoted $f_g$.
3-4. Because different features have different value ranges, $f_t$ is normalized to $[0, 1]$ by min-max normalization to give $f'_t$; then $f'_t$, $f_r$ and $f_g$ are concatenated to obtain $f$, the user's behavior feature vector of dimension 468 (168 + 150 + 150).
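Steps 3-2 to 3-4 can be combined into one small routine. The sketch below assumes min-max normalization is applied within a single user's $f_t$ vector (the source does not say whether the minimum and maximum are taken per user or across users) and maps a constant vector to zeros to avoid division by zero; the helper names are illustrative.

```python
def pad_flags(flags, length, fill):
    """Keep the first `length` flags, padding with `fill` if there are fewer."""
    flags = list(flags[:length])
    return flags + [fill] * (length - len(flags))

def min_max(values):
    """Scale values to [0, 1]; a constant vector maps to all zeros (assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def behavior_vector(f_t, forward_flags, image_flags, n=150):
    """Concatenate f'_t, f_r and f_g into the behavior feature vector f."""
    f_r = pad_flags(forward_flags, n, fill=1)  # the text pads forwarding labels with 1
    f_g = pad_flags(image_flags, n, fill=0)    # the text pads image labels with 0
    return min_max(f_t) + f_r + f_g
```

With a 168-dimensional `f_t` and n = 150, the result has 168 + 150 + 150 = 468 entries.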
Further, the step 4 is implemented as follows:
4-1. Using the CBOW model trained in step 2-1, an embedding vector is obtained for each word in a user's historical text T, forming a matrix $S' \in R^{m \times d}$, where m is the total number of words in the historical tweet sequence and d is the embedding vector dimension of each word, with d = 300.
4-2. As shown in FIG. 3, the historical tweet sequence $S'$ of each user is input into the attention-based BiGRU model. The first layer is a BiGRU layer with a forward GRU and a backward GRU, and the present invention sets the dimension of the hidden layer to 128. In this layer, the hidden-layer outputs from both directions are concatenated as the final output of the BiGRU. The second layer is a feed-forward attention layer that produces a fixed-length representative vector:
$c_i' = \tanh(W_i' h_i' + b_i')$
where $h_i'$ denotes the BiGRU output vector for the word $s_i'$ in the historical tweet sequence $S'$, $c_i'$ denotes the output of the fully connected layer, $W_i' \in R^{1 \times d}$ and $b_i' \in R$ are the weight and bias in the attention calculation, $\alpha_i'$ represents the attention distribution coefficient of the word $s_i'$, and $h'$ represents the output of the attention layer. In particular, $h'$ has a fixed length of 128.
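A feed-forward attention layer of the kind described in step 4-2 can be sketched in plain Python. Only the scoring formula with tanh comes from the text; turning the scores into coefficients with a softmax, taking the output as the weighted sum of the BiGRU outputs, and sharing one weight vector `w` and bias `b` across positions are assumptions based on the usual form of feed-forward attention. The function name is hypothetical.

```python
import math

def feed_forward_attention(H, w, b):
    """H: list of m BiGRU output vectors; w: weight vector; b: scalar bias.

    Returns (h, alpha): the pooled vector and the attention coefficients.
    """
    # Scores c_i = tanh(w . h_i + b), one scalar per word
    scores = [math.tanh(sum(wj * hj for wj, hj in zip(w, h_i)) + b) for h_i in H]
    # Assumed normalisation: softmax over the scores (shifted for stability)
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    alpha = [e / total for e in exps]
    # Assumed output: attention-weighted sum of the BiGRU outputs
    dim = len(H[0])
    h = [sum(a * h_i[j] for a, h_i in zip(alpha, H)) for j in range(dim)]
    return h, alpha

# Toy example with three "words" and 2-dimensional BiGRU outputs
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
h, alpha = feed_forward_attention(H, w=[0.5, -0.5], b=0.0)
print(h, alpha)
```

Because the output is a weighted sum over words, its length depends only on the BiGRU output size and not on the number of words m, which is what gives the layer its fixed-length output.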
4-3. As shown in FIG. 3, the feature vector $h'$ of the user's historical text and the user behavior feature vector $f$ are concatenated and input into a fully connected layer, followed by a sigmoid layer, which outputs the user's depression probability value.
Here $W_f$ and $b_f$ represent the weights and bias of the fully connected layer, and a cross-entropy loss function is defined on this output,
where K represents the number of samples in the training set.
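The output layer and loss of step 4-3 can be sketched as follows. The formulas themselves are not reproduced in this text, so the sketch assumes the standard form: a single-output fully connected layer, a sigmoid, and binary cross-entropy averaged over K samples. The function names and the averaging by K are assumptions.

```python
import math

def depression_probability(h_text, f_behavior, W_f, b_f):
    """Concatenate text and behavior vectors, apply one linear unit and a sigmoid."""
    x = list(h_text) + list(f_behavior)
    z = sum(w * v for w, v in zip(W_f, x)) + b_f
    # Numerically stable sigmoid
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def cross_entropy(y_true, y_prob, eps=1e-12):
    """Binary cross-entropy averaged over K samples (assumed form)."""
    K = len(y_true)
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / K

# With all-zero weights the model is maximally uncertain
p = depression_probability([0.0] * 128, [0.0] * 468, [0.0] * 596, 0.0)
loss = cross_entropy([1, 0], [p, p])
print(p, loss)
```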
Further, the step 5 is implemented as follows:
5-1. The model in FIG. 3 is trained on the training set using the Adam optimizer. Specifically, the mini-batch size is set to 200, the learning rate to 0.001, the number of epochs to 100, and the dropout rate to 0.5.
5-2. After training, the test set is input into the tweet emotion judgment model, a certain proportion p of positive-emotion tweets is filtered out, and the remaining tweets of each user form that user's historical tweets. The user's behavior feature vector is extracted according to step 3; the behavior feature vector and the historical tweets are then input into the trained automatic detection model, which outputs the probability that the user suffers from depression. The crawled Sina Weibo user data set is divided into a training set and a test set at a ratio of 7:3. The evaluation metrics are F1-Score, Recall and Precision, and the test results are shown in Table 2.
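The filtering and splitting in step 5-2 can be sketched as below. The names `filter_positive` and `split_users`, the 0.5 threshold used to decide that a tweet is positive, rounding the number of removed tweets down, and splitting at user level are all assumptions made for illustration.

```python
import random

def filter_positive(tweets, p_positive, p, threshold=0.5, seed=0):
    """Randomly drop a proportion p of the tweets judged positive."""
    rng = random.Random(seed)
    positive_idx = [i for i, prob in enumerate(p_positive) if prob > threshold]
    n_remove = int(len(positive_idx) * p)  # rounded down (assumption)
    removed = set(rng.sample(positive_idx, n_remove))
    return [t for i, t in enumerate(tweets) if i not in removed]

def split_users(users, train_ratio=0.7, seed=0):
    """Shuffle users and split them into training and test sets (7:3 by default)."""
    rng = random.Random(seed)
    shuffled = list(users)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

tweets = ["a", "b", "c", "d", "e"]
probs = [0.9, 0.8, 0.1, 0.7, 0.95]   # hypothetical p(y = 'positive') per tweet
kept = filter_positive(tweets, probs, p=0.5)
history_text = " ".join(kept)        # remaining tweets spliced into one text
train, test = split_users(range(10))
print(kept, len(train), len(test))
```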
TABLE 2 Test results
Method | Precision | Recall | F1_score |
---|---|---|---|
TBF | 0.8581 | 0.7258 | 0.7864 |
EHLM | 0.8723 | 0.7896 | 0.8289 |
This patent | 0.8887 | 0.8749 | 0.8823 |
In addition, the present invention is compared with the TBF (Chiong et al.) and EHLM (Ansari et al.) methods. The results in Table 2 show that the present invention outperforms both methods, with higher Precision and higher Recall. Compared with the TBF method, the invention improves F1_score by 0.0959; compared with the EHLM method, it improves F1_score by 0.0534.
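As a quick arithmetic check on Table 2, F1 can be recomputed as the harmonic mean of Precision and Recall. This is only a sketch: the text does not state how the reported F1_score was averaged, so recomputed values may differ slightly from the reported ones.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# (precision, recall, reported F1_score) copied from Table 2
table_2 = {
    "TBF": (0.8581, 0.7258, 0.7864),
    "EHLM": (0.8723, 0.7896, 0.8289),
    "This patent": (0.8887, 0.8749, 0.8823),
}

for name, (p, r, reported) in table_2.items():
    print(name, round(f1_score(p, r), 4), reported)

# F1_score gains quoted in the text, from the reported values
gain_over_tbf = round(0.8823 - 0.7864, 4)
gain_over_ehlm = round(0.8823 - 0.8289, 4)
print(gain_over_tbf, gain_over_ehlm)
```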
Claims (6)
1. The social network user depression detection method fusing tweet information and behavior characteristics is characterized by comprising the following steps of:
step 1, crawling a user data set from Sina Weibo, cleaning the texts, and generating a depressed user data set and a non-depressed user data set through manual labeling;
step 2, integrating the multichannel CNN and the attention-based BiGRU to analyze the emotional tendency of each tweet of the user, and obtaining the emotional tendency probability value of each tweet; randomly removing a certain proportion p of positive-emotion tweets from the historical tweets of each user, and splicing the remaining tweets into a user historical text T;
step 3, extracting characteristics of the user such as posting time, forwarding behavior and image publishing behavior to form a user behavior characteristic vector;
step 4, building a depression detection model fusing the user historical text T and the user behavior feature vectors, inputting the T into a BiGRU layer and a feedforward attention layer to obtain the feature vectors of each user historical text, fusing the feature vectors of the user historical texts and the user behavior feature vectors, and then inputting the fused feature vectors into a full connection layer and a softmax layer;
step 5, training the depression detection model by using the Adam optimization method, and detecting the depression state of the user by using a test set after the training is finished.
2. The method for detecting depression of social network users based on fusion of tweet information and behavior features as claimed in claim 1, wherein the step 1 is implemented as follows:
1-1, collecting a data set of candidate depressed users from Sina Weibo; randomly selecting several topics relevant to depression, and then crawling candidate depressed users from each relevant topic; crawling the historical data of the candidate depressed users, wherein the historical data comprises historical tweets, publication time, whether each tweet is a forwarded tweet, and whether image information is published;
1-2, from the candidate depressed user data set, selecting users who mention a history of depression diagnosis in their tweets as depressed users; in addition, if a user's tweets contain symptom words related to the field of depression, including "suicide" and "depression", and related therapeutic drugs, including "sertraline" and "fluoxetine", the user is likewise labeled as a depressed user;
1-3, randomly selecting users from topics not related to depression, and crawling the historical data of these users, wherein the historical data comprises historical tweets, publication time, whether each tweet is a forwarded tweet, and whether image information is published, thereby forming a non-depressed user data set;
1-4, cleaning the text data of each user through word segmentation and data filtering; performing text word segmentation by using the "Jieba" word segmentation package; the data filtering mainly comprises removing # topics, URL information, irregular characters, stop words and official account users, and converting emoticons into text information.
3. The method for detecting depression of social network users based on fusion of tweet information and behavior features as claimed in claim 2, wherein the step 2 is implemented as follows:
2-1, pre-training a CBOW model on a large-scale Chinese Wikipedia data set so as to obtain embedding vectors of Chinese words; after each historical tweet $t_i$ of a user passes through the CBOW model, a matrix $S \in R^{n \times d}$ is obtained, where n represents the number of words in the tweet and d represents the embedding vector dimension of each word;
2-2, inputting the matrix S into a multichannel CNN, wherein the multichannel CNN comprises a convolution layer and a pooling layer; in the convolution layer, let the convolution kernel be $W \in R^{h \times d}$, where $h \in \{2,3,4\}$ is the size of the convolution kernel; the convolution kernel W yields the feature vector $a = [a_0, a_1, ..., a_{n-h}] \in R^{n-h+1}$, with $a_i = \sigma(W \cdot S_{i:i+h-1} + b)$, where $\sigma$ denotes a non-linear function, b denotes a bias term, and $S_{i:i+h-1}$ represents rows i to i+h-1 of the matrix S; in the pooling layer, the outputs of the convolution layer under the different convolution kernels are input into the pooling layer, and the most important feature O of fixed dimension is extracted;
2-3, inputting each tweet into an attention-based BiGRU model; designing the first layer as a BiGRU layer with a forward GRU structure and a backward GRU structure; in the first layer, the hidden-layer outputs from both directions are concatenated as the final output of the BiGRU layer; the second layer is designed as a feed-forward attention layer to obtain a fixed-length representative vector:
$c_i = \tanh(W_i h_i + b_i)$
wherein $h_i$ represents the output vector of the word $s_i$ at the BiGRU layer, $c_i$ represents the output of the fully connected layer, $W_i \in R^{1 \times d}$ and $b_i \in R$ are the weight and bias in the attention calculation process, h represents the output of the attention layer, and $\alpha_i$ represents the attention distribution coefficient of the word $s_i$;
2-4, splicing the output feature O of step 2-2 and the attention-layer output h of step 2-3 to obtain a vector $V = [O, h]$; inputting V into a fully connected layer, and adding a dropout layer after the fully connected layer to prevent overfitting; adding a softmax layer after the dropout layer, which outputs the positive and negative emotional tendency probability values $p(y_i = \text{'positive'})$ and $p(y_i = \text{'negative'})$ of the user's tweet $t_i$; $p(y_i = \text{'positive'})$ represents the probability that the tweet is inferred to have a positive emotional tendency, and $p(y_i = \text{'negative'})$ represents the probability that it is inferred to have a negative emotional tendency;
2-5, training the model by using an Adam optimizer;
2-6, randomly removing a certain proportion p of positive-emotion tweets from the historical tweets of each user, and splicing the remaining tweets into a historical text T.
4. The method for detecting depression of social network users based on fusion of tweet information and behavior features as claimed in claim 3, wherein the step 3 is implemented as follows:
3-1, in order to extract the publication-time features of a user, extracting the proportion of tweets published by the user in each hour of a week; the proportion is calculated from the number of tweets published in a given hour; the publication times within one day form a 24-dimensional feature, and those within one week form a 168-dimensional publication-time feature, denoted $f_t$;
3-2, in order to extract the forwarding-behavior features of a user, extracting the forwarding labels of the user's previous 150 historical tweets as the forwarding-behavior feature vector; if a tweet is forwarded from another user's tweet, its forwarding label is set to 1, otherwise it is set to 0; if the user has fewer than 150 historical tweets, the vector is padded with 1s; the resulting forwarding-behavior feature vector is denoted $f_r$;
3-3, in order to extract the image-publication features of a user, extracting the image-publication labels of the user's previous 150 historical tweets to form a feature vector; if a tweet published by the user contains image information, its image-publication label is set to 1, otherwise it is set to 0; if the user has fewer than 150 historical tweets, the vector is padded with 0s; the resulting image-publication feature vector is denoted $f_g$;
3-4, because different features have different value ranges, normalizing the feature $f_t$ to [0,1] by min-max normalization to give $f'_t$, and then concatenating $f'_t$, $f_r$ and $f_g$ to obtain $f$; $f$ is the behavior feature vector of the user, with dimension 468.
5. The method for detecting depression of social network users based on fusion of tweet information and behavior features as claimed in claim 4, wherein the step 4 is implemented as follows:
4-1, obtaining an embedding vector for each word in the historical text T of a user by using the CBOW model trained in step 2-1, and forming a historical tweet sequence $S' \in R^{m \times d}$, where m represents the total number of words in the historical tweet sequence and d represents the embedding vector dimension of each word, d = 300;
4-2, inputting the historical tweet sequence $S'$ of each user into an attention-based BiGRU model; designing the first layer as a BiGRU layer with a forward GRU structure and a backward GRU structure; in this layer, the hidden-layer outputs from both directions are concatenated as the final output of the BiGRU; the second layer is designed as a feed-forward attention layer to obtain a fixed-length representative vector:
$c_i' = \tanh(W_i' h_i' + b_i')$
wherein $h_i'$ represents the BiGRU output vector of the word $s_i'$ in the historical tweet sequence $S'$, $c_i'$ denotes the output of the fully connected layer, $W_i' \in R^{1 \times d}$ and $b_i' \in R$ are the weight and bias in the attention calculation process, $\alpha_i'$ represents the attention distribution coefficient of the word $s_i'$, and $h'$ represents the output of the attention layer, i.e. the feature vector of the user's historical text;
4-3, splicing the feature vector $h'$ of the user's historical text and the user behavior feature vector $f$, inputting the result into a fully connected layer followed by a sigmoid layer, and outputting the depression probability value of the user;
wherein $W_f$ and $b_f$ represent the weights and bias of the fully connected layer, and a cross-entropy loss function is defined on this output,
wherein K represents the number of samples in the training set.
6. The method for detecting depression of social network users based on fusion of tweet information and behavior features as claimed in claim 5, wherein the step 5 is implemented as follows:
after the training of the depression detection model is finished, inputting the test set into the depression detection model, filtering out a certain proportion p of positive-emotion tweets, forming the remaining tweets of each user into the historical tweets of the user, extracting the behavior feature vector of the user according to step 3, inputting the behavior feature vector and the historical tweets into the trained depression detection model, and outputting the probability value that the user suffers from depression.
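The convolution and pooling recited in step 2-2 of claim 3 can be illustrated with a small pure-Python sketch. It uses one kernel per height h in {2, 3, 4}, ReLU as the non-linear function, and max pooling to pick the most important feature per kernel; the choice of ReLU, of max pooling, and the function names are assumptions, since the text only speaks of a non-linear function and of extracting the most important feature.

```python
def conv_feature_map(S, W, b):
    """Slide an h x d kernel W over the n x d matrix S; returns n - h + 1 values."""
    h, d, n = len(W), len(W[0]), len(S)
    feature_map = []
    for i in range(n - h + 1):
        window = S[i:i + h]  # rows i .. i+h-1 of S
        z = sum(W[r][c] * window[r][c] for r in range(h) for c in range(d)) + b
        feature_map.append(max(0.0, z))  # ReLU as the non-linearity (assumption)
    return feature_map

def multichannel_features(S, kernels):
    """Max-pool each kernel's feature map into one value; returns the vector O."""
    return [max(conv_feature_map(S, W, b)) for W, b in kernels]

# Toy tweet matrix: n = 5 words, d = 2 embedding dimensions
S = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0], [2.0, 1.0]]
# One all-ones kernel per height h in {2, 3, 4}, zero bias
kernels = [([[1.0, 1.0] for _ in range(h)], 0.0) for h in (2, 3, 4)]
O = multichannel_features(S, kernels)
print(O)
```

Because each kernel is reduced to a single pooled value, the length of O depends only on the number of kernels, not on the number of words in the tweet.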
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310045687.XA CN115935075B (en) | 2023-01-30 | 2023-01-30 | Social network user depression detection method integrating text information and behavior characteristics |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115935075A true CN115935075A (en) | 2023-04-07 |
CN115935075B CN115935075B (en) | 2023-08-18 |
Family
ID=86654491
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310045687.XA Active CN115935075B (en) | 2023-01-30 | 2023-01-30 | Social network user depression detection method integrating text information and behavior characteristics |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115935075B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110807320A (en) * | 2019-11-11 | 2020-02-18 | 北京工商大学 | Short text emotion analysis method based on CNN bidirectional GRU attention mechanism |
CN112417098A (en) * | 2020-11-20 | 2021-02-26 | 南京邮电大学 | Short text emotion classification method based on CNN-BiMGU model |
WO2021133157A1 (en) * | 2019-12-23 | 2021-07-01 | Mimos Berhad | System and method for automatically detecting depression symptoms of a social media user |
CN113220825A (en) * | 2021-03-23 | 2021-08-06 | 上海交通大学 | Modeling method and system of topic emotion tendency prediction model for personal tweet |
CN113435192A (en) * | 2021-06-15 | 2021-09-24 | 王丽亚 | Chinese text emotion analysis method based on changing neural network channel cardinality |
CN114628008A (en) * | 2022-03-22 | 2022-06-14 | 广东工业大学 | Social user depression tendency detection method based on heterogeneous graph attention network |
CN115080725A (en) * | 2022-06-02 | 2022-09-20 | 四川大学 | Multi-element time series feature extraction method for depression symptoms of social network users |
CN115392260A (en) * | 2022-10-31 | 2022-11-25 | 暨南大学 | Social media tweet emotion analysis method facing specific target |
Non-Patent Citations (2)
Title |
---|
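方振宇: "Method for detecting psychological disorders in social networks based on a depression dictionary", Computer Knowledge and Technology (电脑知识与技术), no. 07, pages 44 - 47 * |
易顺明; 易昊; 周国栋: "Research on a Twitter sentiment classification method using sentiment feature vectors", Journal of Chinese Computer Systems (小型微型计算机系统), no. 11, pages 2454 - 2458 * |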
Also Published As
Publication number | Publication date |
---|---|
CN115935075B (en) | 2023-08-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Almars | Attention-Based Bi-LSTM Model for Arabic Depression Classification. | |
Kabir et al. | DEPTWEET: A typology for social media texts to detect depression severities | |
Mao et al. | Prediction of depression severity based on the prosodic and semantic features with bidirectional LSTM and time distributed CNN | |
CN112256866A (en) | Text fine-grained emotion analysis method based on deep learning | |
Asghar et al. | Detection and classification of psychopathic personality trait from social media text using deep learning model | |
Figuerêdo et al. | Early depression detection in social media based on deep learning and underlying emotions | |
Kawintiranon et al. | PoliBERTweet: a pre-trained language model for analyzing political content on twitter | |
Kholodna et al. | A Machine Learning Model for Automatic Emotion Detection from Speech. | |
Tseng et al. | Approaching Human Performance in Behavior Estimation in Couples Therapy Using Deep Sentence Embeddings. | |
CN116245110A (en) | Multi-dimensional information fusion user standing detection method based on graph attention network | |
Cao et al. | Category-aware chronic stress detection on microblogs | |
CN114628008A (en) | Social user depression tendency detection method based on heterogeneous graph attention network | |
CN117393163A (en) | Social network user depression detection method and system based on multi-mode information fusion | |
CN115935075B (en) | Social network user depression detection method integrating text information and behavior characteristics | |
Marerngsit et al. | A two-stage text-to-emotion depressive disorder screening assistance based on contents from online community | |
Tadisetty et al. | Anonymous prediction of mental illness in social media | |
Wu et al. | Development of Internet suicide message identification and the Monitoring-Tracking-Rescuing model in Taiwan | |
Bell et al. | Detecting diabetes risk from social media activity | |
Kanaan et al. | Detecting mental disorders through social media content | |
Uddin | Depression Detection in Text Using Long Short-Term Memory-Based Neural Structured Learning | |
Cao et al. | News detection for recurrent neural network approach | |
Sudha et al. | Depression detection using machine learning | |
Agbesi et al. | Multichannel 2D-CNN Attention-Based BiLSTM Method for Low-Resource Ewe Sentiment Analysis | |
Mehta et al. | Sentiment Analysis on Covid-19 Using Deep Learning | |
CN113535948B (en) | LSTM-Attention text classification method introducing essential point information |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | Effective date of registration: 20240206. Patentee after: Guangzhou Dayu Chuangfu Technology Co.,Ltd. (address after: Room 801, 85 Kefeng Road, Huangpu District, Guangzhou City, Guangdong Province; country or region after: China). Patentee before: HANGZHOU NORMAL UNIVERSITY QIANJIANG College (address before: Hangzhou City, Zhejiang province 310036 Xiasha Higher Education Park forest Street No. 16; country or region before: China) |