CN109977231A - A depressive emotion analysis method based on an emotion decay factor - Google Patents
- Publication number
- CN109977231A, CN201910285499.8A
- Authority
- CN
- China
- Prior art keywords
- emotion
- depression
- microblogging
- depressive
- text
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
- G06F40/242—Dictionaries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/30—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
Abstract
The invention discloses a depressive emotion analysis method based on an emotion decay factor. The method comprises the following steps: collecting and labeling the microblog texts of a specific group of people; preprocessing the microblog texts; designing an emotion classifier; establishing a depression index learning model and characterizing an individual's degree of depression with the depression index; and introducing an emotion decay factor to portray the fluctuation of the individual's depressive emotion. The method builds a specialized depressive emotion dictionary from existing sentiment dictionaries and the features of Internet slang, so that depressive emotion feature words are identified more accurately and recognition accuracy improves. The invention also proposes a mathematical model that introduces an emotion decay factor to calculate depressive emotion values, portraying the course of an individual's depressive emotion realistically and effectively, which is of positive significance for the prevention and treatment of depression. The invention thereby identifies and analyzes the depressive emotion of an individual accurately.
Description
Technical field
The invention belongs to the fields of text sentiment analysis and social networks, involves computer science and psychology, and specifically relates to a depressive emotion analysis method based on an emotion decay factor.
Background art
Depression is the fourth-largest disease in the world and is expected to become the second-largest by 2020. In China, however, the medical prevention and treatment of depression still suffers from a low recognition rate: hospitals at or above the prefecture-city level recognize fewer than 20% of cases, and fewer than 10% of patients receive relevant drug therapy. Meanwhile, the onset of depression (and suicide) has begun to trend toward younger groups (university students, and even primary and secondary school students). In short, public education, prevention, and treatment of depression urgently need attention, and depression prevention and treatment has been made a national mental health priority.
Studies have found that depression is the disease with the highest suicide rate; the suicide rate of patients with depression is 20 times higher than that of the general population. Depression has become the leading disease worldwide in terms of the burden it places on humanity, and the suffering it causes patients and their families and the losses it causes society are unmatched by other diseases. Because suicide occurs only once the disease has developed to a certain severity, some people who died by suicide for unknown reasons may have suffered from severe depression that was simply not detected in time. Early detection and early treatment are therefore extremely important for patients with depression.
Depressive emotion, which is very close to mild depression, is an unhealthy mood that people often encounter in daily life. Today's society is increasingly competitive and almost everyone operates under overload, so depressive emotion of varying degrees arises easily; it is a very common emotional component. When people encounter stress, setbacks in life, painful circumstances, aging, illness and death, or natural and man-made disasters, depressive emotion arises naturally. If depressive emotion is not effectively released and relieved, its long-term accumulation easily leads to mild depression and then turns into severe depression, so detecting depressive emotion in time is very important.
A 10-year follow-up study of patients with depression found that 75% to 80% of patients relapse multiple times, so patients with depression need preventive treatment. One method of treating depression is self-help psychological training, which first requires perceiving bad moods and negative thoughts, that is, depressive emotion.
In an era of highly developed social networks, the penetration of social networking tools such as microblogs among college students exceeds 90%. Large numbers of students express their views, opinions, and emotions through social platforms such as microblogs, and these platforms also give many individuals with depressive emotion a place to vent their feelings.
In view of the above, the invention provides a depressive emotion analysis method based on an emotion decay factor. By mining and analyzing microblog texts, the depressive emotion of an individual can be detected and analyzed in a timely and effective manner, which is of positive significance for the prevention and treatment of depression.
Summary of the invention
The purpose of the invention is to provide a depressive emotion analysis method based on an emotion decay factor. A specialized depressive emotion dictionary is built from existing sentiment dictionaries and the features of Internet slang, so that depressive emotion feature words are identified more accurately and recognition accuracy improves. In addition, a depressive emotion classifier is built on a support vector machine (SVM) to classify each of an individual's microblog texts; once the classification results are obtained, the individual's degree of depression is measured by a depression index, and finally the continuity of the individual's depressive emotion fluctuation is taken into account.
To achieve the above purpose, the invention is realized through the following technical solution:
A depressive emotion analysis method based on an emotion decay factor, comprising the following steps:
S1. Mobilize a group of people to fill in a self-rating depression scale online, obtain each individual's depression scale score and microblog nickname, collect the individual's microblog texts, and have the microblog content labeled by an expert system;
S2. Preprocess the microblog texts, including word segmentation, stop-word removal, and construction of the depressive emotion dictionary;
S3. Perform feature selection and feature weighting, construct a term vector space from their results, and construct a text classifier that classifies the microblog texts to be classified, obtaining the depressive emotion state of each microblog text;
S4. Calculate the depression index, perform a Pearson correlation test between the depression index and the individual's depression scale score, establish the relationship between the degree of depression and the depression index according to the test result, and characterize the individual's degree of depression with the depression index;
S5. Introduce the emotion decay factor to obtain the depressive emotion value corresponding to each microblog, judge the individual's depression status, and portray the fluctuation of the individual's depressive emotion.
Preferably, the construction of the depressive emotion dictionary further comprises the following steps: step S231, collecting popular Internet words that express depressive emotion; step S232, extracting commonly used depressive emotion symbols from microblogs; step S233, adding the collected Internet slang and depressive emotion symbols to an existing sentiment dictionary to construct the depressive emotion dictionary, and restoring the depressive words that were split apart during word segmentation.
Preferably, the feature selection further comprises the following steps:
Feature selection is performed on the text using the CHI (chi-square) method, calculated as:
$$\chi^2(t, c_i) = \frac{N\,(AD - BC)^2}{(A + C)(B + D)(A + B)(C + D)}$$
where $t$ is a feature, $c_i$ is a class, $N$ is the total number of documents, $A$ is the number of documents that belong to class $c_i$ and contain feature $t$, $B$ is the number of documents that do not belong to class $c_i$ but contain feature $t$, $C$ is the number of documents that belong to class $c_i$ but do not contain feature $t$, and $D$ is the number of documents that neither belong to class $c_i$ nor contain feature $t$.
Finally, the maximum value over the classes is selected as the global CHI statistic of feature $t$:
$$\chi^2_{\max}(t) = \max_{i} \chi^2(t, c_i)$$
Preferably, the feature weighting further comprises the following steps:
Feature weighting is performed on the text using the term frequency-inverse document frequency (TF-IDF) method:
$$W_{ik} = tf_{ik} \cdot idf_{ik}$$
where $tf_{ik}$ is the number of times feature word $t_i$ appears in text $d_k$, and $idf_{ik}$ is the inverse document frequency of feature word $t_i$, computed from $D_k$, the total number of texts in the text set, and $d_{ik}$, the number of texts in the text set that contain feature $t_i$.
Preferably, the construction of the term vector space further comprises the following step: the term vector space is constructed from the results of the feature selection and the feature weighting, and each microblog is represented in the form (L T:W), where L is the label of the microblog, T is a feature item, and W is the weight of that feature item.
Preferably, in the depressive emotion analysis method based on an emotion decay factor, after the text has been preprocessed and the term vector space obtained, a text classifier is constructed to classify the microblog texts to be classified and obtain the depressive emotion state of each microblog text. The term vector space serves as the input of the classifier, and the final output is a label of 0 or 1, from which the number of depressive microblogs in the classification result is obtained.
Preferably, step S4 further comprises:
The depression index is calculated as:
$$DI = \frac{N_d}{N_t}$$
where $N_d$ is the number of depressive microblogs in the classification result of step S3 and $N_t$ is the total number of microblogs.
The relationship between the depression index DI and the degree of depression E(DI) is as follows: when the depression index is less than 0.1, the individual has no depressive emotion; otherwise the individual has depressive emotion, and the larger the depression index, the more severe the degree of depression.
Preferably, step S5 further comprises:
The emotion decay factor is introduced and an emotion decay formula is constructed to obtain the depressive emotion value corresponding to each microblog:
$$f(t) = f(t-1) + (-1)^{n} e^{-\lambda t}$$
where the time $t$ is defined as the time interval between two adjacent microblogs and takes the values t = 0, 1, 2, ..., n, and the initial state of any individual is f(t=0) = 0; $f(t)$ is the depressive emotion value corresponding to the microblog at time $t$; $f(t-1)$ is the depressive emotion value corresponding to the microblog text at the previous time; and $\lambda$ is the emotion decay factor, which represents the decay rate of the emotion. Assuming that depressive emotion follows a half-life law, $\lambda = 0.5$ is taken. The value of $n$ depends on the microblog states at two adjacent time points.
Preferably, in step S5, the value of $n$ is calculated from the microblog state $c$. When two or more consecutive 0 states occur and the state at the next time $t_i$ is c = 1, then $t_i = 1$. When two or more consecutive 1 states occur and the state at the next time $t_i$ is c = 0, the value of $t_i$ is not reset to 1 but continues to increase from the previous time. While the two states alternate, the value of $f(t-1)$ remains unchanged and is still the depressive emotion value of the previous time.
Preferably, in step S5, judging the individual's depression status further comprises the following steps:
After the depressive emotion value of each microblog has been calculated, the mean depressive emotion value is calculated as:
$$Avg = \frac{1}{n - i + 1} \sum_{t=i}^{n} f(t)$$
where t = i means that the examination starts from the i-th microblog, f(t=i) is the depressive emotion value of the i-th microblog, and Avg is the mean depressive emotion value from the i-th microblog to the n-th microblog.
The individual's depression status is judged from the mean depressive emotion value: if the mean lies in the interval [-1.6, 0.2), the individual's mood is normal; if the mean lies in the interval [0.2, 2], the individual has a depressive tendency.
Compared with the prior art, the invention has the following beneficial effects. The method builds a specialized depressive emotion dictionary from existing sentiment dictionaries and the features of Internet slang, so that depressive emotion feature words are identified more accurately and recognition accuracy improves. The invention builds a depressive emotion classifier on a support vector machine to classify each of an individual's microblog texts; once the classification results are obtained, the individual's degree of depression is measured by the depression index, and finally the continuity of the individual's depressive emotion fluctuation is taken into account. The invention also proposes a mathematical model that introduces an emotion decay factor to calculate depressive emotion values, portraying the course of an individual's depressive emotion realistically and effectively, which is of positive significance for the prevention and treatment of depression.
Brief description of the drawings
Fig. 1 is the overall flow chart of the depressive emotion analysis method based on an emotion decay factor according to the invention;
Fig. 2 is a scatter plot of the relationship between the depression index DI and the individual's depression scale score (Score) according to the invention;
Fig. 3 is the depressive emotion trend graph of a preferred embodiment of the invention.
Detailed description of the embodiments
The features, objects, and advantages of the invention will become more apparent from the detailed description of non-limiting embodiments made with reference to Figs. 1 to 3. The invention is described in more detail below with reference to Figs. 1 to 3, which show embodiments of the invention. The invention can, however, be realized in many different forms and should not be construed as limited to the embodiments set forth herein.
As shown in Fig. 1, the invention provides a depressive emotion analysis method based on an emotion decay factor, comprising the following steps:
Step S1: collect and label the microblog texts of a specific group of people;
Step S2: preprocess the microblog texts;
Step S3: design the emotion classifier;
Step S4: establish the depression index learning model and characterize the individual's degree of depression with the depression index;
Step S5: introduce the emotion decay factor to portray the fluctuation of the individual's depressive emotion.
Step S1 further comprises the following steps:
Step S11: mobilize a specific group of people to fill in a self-rating depression scale online, and obtain each individual's scale score and microblog nickname;
Step S12: collect the microblog texts of this group, and have the microblog content labeled by an expert system.
Step S2 further comprises the following steps:
Step S21: word segmentation;
Step S22: stop-word removal;
Step S23: construction of the depressive emotion dictionary. Example: the idiom "worrying about imaginary troubles" (庸人自扰) is split in step S21 into "mediocre person" (庸人) and "self-disturbance" (自扰), and step S23 restores it to a single word.
Step S23 further comprises the following steps:
Step S231: collect popular Internet words that express depressive emotion;
Step S232: extract commonly used depressive emotion symbols from Sina Weibo;
Step S233: add the collected Internet slang and depressive emotion symbols to an existing sentiment dictionary to construct the depressive emotion dictionary, and restore the depressive words that were split apart in step S21. The existing sentiment dictionary may be, for example, the HowNet dictionary or the Chinese affective lexicon ontology of Dalian University of Technology; the invention places no limitation on this.
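The restoration described in step S233 can be sketched as a greedy merge of adjacent tokens against the dictionary. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `restore_split_words`, the `max_len` window, and the Chinese tokens (an assumed rendering of the idiom in the step S23 example) are all hypothetical.

```python
def restore_split_words(tokens, lexicon, max_len=4):
    """Greedily merge adjacent tokens whose concatenation is a lexicon entry.

    tokens  : list of strings produced by a word segmenter
    lexicon : set of dictionary entries that must stay whole
    max_len : largest number of adjacent tokens tried in one merge (assumption)
    """
    merged = []
    i = 0
    while i < len(tokens):
        # Try the widest window first so that longer entries win.
        for size in range(min(max_len, len(tokens) - i), 1, -1):
            candidate = "".join(tokens[i:i + size])
            if candidate in lexicon:
                merged.append(candidate)
                i += size
                break
        else:
            # No merge applies at this position; keep the token unchanged.
            merged.append(tokens[i])
            i += 1
    return merged


# Hypothetical dictionary entry and segmenter output.
lexicon = {"庸人自扰"}
tokens = ["他", "庸人", "自扰"]
print(restore_split_words(tokens, lexicon))
```

A real pipeline would more likely load the dictionary into the segmenter as a user lexicon so that the split never happens; the post-hoc merge is shown only because the text describes restoration after segmentation.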
Step S3 further comprises the following steps:
Step S31: feature selection. The chi-square (CHI) statistic is used to measure the correlation between a feature and a class. The main idea is to assume that the feature $t$ and the class $c_i$ follow a chi-square distribution; the larger the CHI statistic, the stronger the correlation between the feature and the class and the greater the feature's contribution to the class. It is calculated as:
$$\chi^2(t, c_i) = \frac{N\,(AD - BC)^2}{(A + C)(B + D)(A + B)(C + D)}$$
where $t$ is a feature, $c_i$ is a class, $N$ is the total number of documents, $A$ is the number of documents that belong to class $c_i$ and contain feature $t$, $B$ is the number of documents that do not belong to class $c_i$ but contain feature $t$, $C$ is the number of documents that belong to class $c_i$ but do not contain feature $t$, and $D$ is the number of documents that neither belong to class $c_i$ nor contain feature $t$.
Finally, the maximum value over the classes is selected as the global CHI statistic of feature $t$:
$$\chi^2_{\max}(t) = \max_{i} \chi^2(t, c_i)$$
Example: suppose there are N microblogs, M of which are about sports, and we want to examine the correlation between the word "basketball" and the class "sports". Four observed values are available:
1. the number of microblogs that contain "basketball" and belong to the "sports" class, denoted A;
2. the number of microblogs that contain "basketball" but do not belong to the "sports" class, denoted B;
3. the number of microblogs that do not contain "basketball" but belong to the "sports" class, denoted C;
4. the number of microblogs that neither contain "basketball" nor belong to the "sports" class, denoted D.
With the formula, the chi-square value of the word "basketball" and the class "sports" can be calculated. The CHI values of other words, such as "volleyball", "product", and "bank", with the "sports" class can then be calculated in the same way. Finally, the words are sorted by CHI value and as many of the highest-ranked words as needed are selected as feature items.
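The four observed values A, B, C, and D can be turned into a CHI score with a few lines of code. The sketch below uses the standard chi-square feature selection formula with N = A + B + C + D; the function names and the counts in the usage lines are made up for illustration and do not come from the patent.

```python
def chi_square(a, b, c, d):
    """CHI statistic of one feature for one class.

    a: documents in the class that contain the feature
    b: documents outside the class that contain the feature
    c: documents in the class that do not contain the feature
    d: documents outside the class that do not contain the feature
    """
    n = a + b + c + d
    denominator = (a + c) * (b + d) * (a + b) * (c + d)
    if denominator == 0:
        # Degenerate table (an empty row or column): treat as uncorrelated.
        return 0.0
    return n * (a * d - b * c) ** 2 / denominator


def global_chi(tables):
    """Global CHI of a feature: the maximum over its per-class (a, b, c, d) tables."""
    return max(chi_square(*table) for table in tables)


# Hypothetical counts for the word "basketball" and the class "sports".
print(chi_square(40, 10, 10, 40))
# A word spread evenly over both classes scores 0.
print(chi_square(25, 25, 25, 25))
```

Ranking all candidate words by `global_chi` and keeping the top entries corresponds to the final sorting step of the example.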
Step S32: feature weighting. For example, the TF-IDF (term frequency-inverse document frequency) method is used to weight the features of the text, where TF denotes term frequency and IDF denotes inverse document frequency. The feature weight is proportional to the term frequency and inversely related to the number of documents that contain the feature:
$$W_{ik} = tf_{ik} \cdot idf_{ik}$$
where $tf_{ik}$ is the number of times feature word $t_i$ appears in text $d_k$, and $idf_{ik}$ is the inverse document frequency of feature word $t_i$, computed from $D_k$, the total number of texts in the text set, and $d_{ik}$, the number of texts in the text set that contain feature $t_i$.
Example: suppose there are 2 texts containing 5 distinct words in total:
Text 1: phone Huawei spends Huawei
Text 2: apple watch phone watch
The TF values for text 1 are: phone 1/4, Huawei 1/2, spends 1/4, apple 0, watch 0. The IDF values of these five words are log(4/2.5+0.01), log(4/1.5+0.01), log(4/1.5+0.01), log(4/1.5+0.01), and log(4/1.5+0.01), respectively. The TF-IDF value of the feature word "phone" in text 1, that is, its feature weight, is therefore 1/4*log(4/2.5+0.01).
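The weighting step can be illustrated with a small TF-IDF routine. The source does not reproduce the exact IDF formula, so the sketch assumes one common variant, idf = log(D/d + 0.01), with relative term frequency as TF; it therefore does not reproduce the IDF values quoted in the example, and the function name `tf_idf` and the `smoothing` parameter are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs, smoothing=0.01):
    """Return one {term: weight} dict per tokenized document.

    TF is the relative frequency of the term in the document.
    IDF is log(D / d + smoothing), an assumed variant, where D is the number
    of documents and d is the number of documents containing the term.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each term once per document

    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term] + smoothing)
            for term, count in counts.items()
        })
    return weights


docs = [
    ["phone", "Huawei", "spends", "Huawei"],
    ["apple", "watch", "phone", "watch"],
]
for doc_weights in tf_idf(docs):
    print(doc_weights)
```

Under this variant, "phone" appears in both texts and so receives a much smaller weight than "Huawei", which appears twice in text 1 and nowhere else.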
Step S33: construct the term vector space from the feature selection result of step S31 and the feature weighting result of step S32. Specifically, each microblog is represented in the form (L T:W), where L is the label of the microblog, T is a feature item, and W is the weight of that feature item.
Example: for the sentence "The one who loves you most is me; closing my eyes, I thought I could forget, but the tears that flowed down did not deceive me", feature selection yields 5 feature words: "deceive", "love", "tears", "close eyes", and "flow down". The sentence can then be represented as "1.0 28:0.4528 39:0.2295 49:0.3215 862:0.5811 1832:0.54878", where 1.0 is the label, 28 is the index of the feature word "deceive", and 0.4528 is the weight of that feature word.
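The (L T:W) representation matches the sparse "label index:weight" line format used by common SVM tools. A minimal sketch of writing one such line follows; the function name `to_sparse_line` is hypothetical, and the index and weight pairs are read from the example above on the assumption that the weights have four or five decimal places.

```python
def to_sparse_line(label, weights_by_index):
    """Format one microblog as 'label index:weight index:weight ...'.

    weights_by_index maps a feature index (int) to its weight (float).
    Indices are written in ascending order, as sparse formats usually require.
    """
    pairs = " ".join(
        f"{index}:{weight:g}" for index, weight in sorted(weights_by_index.items())
    )
    return f"{label:.1f} {pairs}"


# Index/weight pairs as read from the example sentence.
features = {28: 0.4528, 39: 0.2295, 49: 0.3215, 862: 0.5811, 1832: 0.54878}
print(to_sparse_line(1.0, features))
```

Only features that actually occur in the microblog are written, which keeps each line short even when the feature vocabulary has thousands of entries.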
Step S34: after the text has been preprocessed and the feature vector space obtained, a text classifier is constructed to classify the microblog texts to be classified and obtain the depressive emotion state of each microblog text. The term vector space obtained serves as the input of the classifier, and the final output is a label indicating whether the text is depressive.
Example: for a preferred embodiment, the preprocessed microblog texts are classified with an SVM emotion classifier, and the results are shown in Table 1. Owing to space limitations, only the first 15 microblogs and the emotional states output by the classifier are presented.
Table 1. Microblog content of the preferred embodiment of the invention and the output states of the emotion classifier
Step S4 further comprises the following steps:
Step S41: calculate the depression index:
$$DI = \frac{N_d}{N_t}$$
where $N_d$ is the number of depressive microblogs in the classification result of step S3 and $N_t$ is the total number of microblogs.
Example: if an individual has 150 microblogs in total, 15 of which are depressive, the individual's depression index is 0.1.
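The depression index is simply the share of microblogs that the classifier labels as depressive. A minimal sketch, assuming the classifier output is a list of 0/1 labels (the function name `depression_index` is hypothetical):

```python
def depression_index(labels):
    """Fraction of microblogs labeled 1 (depressive) among all microblogs."""
    if not labels:
        raise ValueError("at least one microblog is required")
    return sum(labels) / len(labels)


# The example from the text: 150 microblogs, 15 of them depressive.
labels = [1] * 15 + [0] * 135
print(depression_index(labels))
```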
Step S42: test the correlation between DI and Score with the Pearson method and establish the depression index learning model according to the correlation test result. Specifically:
Step S421: perform a Pearson correlation test between DI and Score. First, with Score as the dependent variable and the depression index as the independent variable, draw the scatter plot of the two, as shown in Fig. 2, where DI is the depression index, Score is the individual's depression scale score (that is, the individual's degree of depression), and an Index value of 1 indicates depressive emotion while 0 indicates normal mood. Then perform the Pearson correlation test between DI and Score. DI and Score are found to be significantly correlated at the 0.01 level (two-tailed) with r = 0.544, indicating a strong correlation between the two.
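The correlation coefficient r in step S421 is the ordinary Pearson coefficient. The sketch below computes r only; it does not compute the two-tailed significance level mentioned in the text, and the DI and Score values are invented solely to exercise the function.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: depression index per individual and scale score.
di = [0.05, 0.10, 0.20, 0.40]
score = [30.0, 35.0, 45.0, 65.0]
print(pearson_r(di, score))
```

The made-up scores lie exactly on a line, so r comes out at essentially 1; real questionnaire data would scatter, as the reported r = 0.544 shows.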
Step S422: according to the correlation test result, give the relational expression between the depression index DI and the degree of depression E(DI).
Step S43: characterize the individual's degree of depression with the depression index: when the depression index is less than 0.1, the individual has no depressive emotion; otherwise the individual has depressive emotion, and the larger the depression index, the more severe the degree of depression. For example, if an individual's depression index is 0.1 or 0.05, the individual's depressive tendency is classed as normal.
Step S5 further comprises the following steps:
Step S51: introduce the emotion decay factor and construct the emotion decay formula to obtain the depressive emotion value corresponding to each microblog:
$$f(t) = f(t-1) + (-1)^{n} e^{-\lambda t}$$
where the time $t$ is defined as the time interval between two adjacent microblogs and takes the values t = 0, 1, 2, ..., n, and the initial state of any individual is f(t=0) = 0; $f(t)$ is the depressive emotion value corresponding to the microblog at time $t$; $f(t-1)$ is the depressive emotion value corresponding to the microblog text at the previous time; and $\lambda$ is the emotion decay factor, which represents the decay rate of the emotion. Assuming that depressive emotion follows a half-life law, $\lambda = 0.5$ is taken. The value of $n$ depends on the microblog states at two adjacent time points.
Specifically, the value of $n$ is calculated from the microblog state $c$. When two or more consecutive 0 states occur and the state at the next time $t_i$ is c = 1, then $t_i = 1$. Conversely, when two or more consecutive 1 states occur and the state at the next time $t_i$ is c = 0, the value of $t_i$ is not reset to 1 but continues to increase from the previous time. While the two states alternate, the value of $f(t-1)$ remains unchanged and is still the depressive emotion value of the previous time.
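A single update of the emotion decay formula can be sketched as follows. Because the rule that maps the microblog state c to the exponent n is not reproduced in the text, the function takes n and t as inputs and makes no assumption about how they are derived; the function name `decay_step` is hypothetical.

```python
import math


def decay_step(f_prev, n, t, lam=0.5):
    """One update f(t) = f(t-1) + (-1)**n * exp(-lam * t).

    f_prev : depressive emotion value of the previous microblog, f(t-1)
    n      : integer exponent determined by the microblog state (rule not shown here)
    t      : time index of the current microblog
    lam    : emotion decay factor; 0.5 is the value taken in the text
    """
    return f_prev + (-1) ** n * math.exp(-lam * t)


# Starting from the initial state f(0) = 0:
print(decay_step(0.0, n=2, t=1))  # even n raises the value
print(decay_step(0.0, n=1, t=1))  # odd n lowers the value
```

With lam = 0.5 the contribution exp(-lam * t) halves every ln 2 / 0.5, roughly 1.39, time units, which is what a half-life law means for this exponential form.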
Step S52: judge the individual's depression status, specifically comprising the following steps:
Step S521: calculate the depressive emotion value of each microblog.
Example: according to step S3, the state sequence output by the emotion classifier for the preferred embodiment of the invention is {1,0,1,0,0,1,1,0,0,0,1,0,0,1,0}. Applying the formula above, the resulting emotional state sequence is {0,1,0,1,0,0,1,1,0,0,0,1,0,0,1,0}, and the depressive emotion values corresponding to this sequence are shown in Table 2.
Table 2. Depressive emotion values corresponding to the depressive emotion state sequence of the preferred embodiment of the invention
Step S522: calculate the mean depressive emotion value:
$$Avg = \frac{1}{n - i + 1} \sum_{t=i}^{n} f(t)$$
where t = i means that the examination starts from the i-th microblog, f(t=i) is the depressive emotion value of the i-th microblog, and Avg is the mean depressive emotion value from the i-th microblog to the n-th microblog.
Example: based on step S521, the mean depressive emotion value of the preferred embodiment from t = 0 to t = 15 is calculated to be 0.7631.
Step S523: judge the individual's depression status as follows: if the individual's mean depressive emotion value lies in the interval [-1.6, 0.2), the mood is normal; if it lies in the interval [0.2, 2], there is a depressive tendency.
Example: from step S522, the mean depressive emotion value of the preferred embodiment is 0.7631, which lies in the interval [0.2, 2], so this individual has a depressive tendency during this period.
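Steps S522 and S523 amount to averaging a window of depressive emotion values and comparing the result with two intervals. A minimal sketch, using the interval bounds stated in the text; the function names and the short value list are hypothetical.

```python
def mean_emotion(values, start=0):
    """Mean depressive emotion value from index `start` to the last microblog."""
    window = values[start:]
    if not window:
        raise ValueError("the window contains no microblogs")
    return sum(window) / len(window)


def judge_status(avg):
    """Map a mean depressive emotion value to the status named in the text."""
    if -1.6 <= avg < 0.2:
        return "normal"
    if 0.2 <= avg <= 2:
        return "depressive tendency"
    raise ValueError("mean lies outside the intervals given in the text")


print(judge_status(0.7631))                    # the mean reported for the embodiment
print(mean_emotion([0.0, 1.0, 2.0], start=1))  # made-up values
```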
Step S53: from the depressive emotion values, draw the individual's depressive emotion trend graph over a period of time to portray the fluctuation of the individual's depressive emotion. Fig. 3 shows the depressive emotion trend graph of the preferred embodiment over a period of time.
The depressive emotion analysis method based on an emotion decay factor builds a specialized depressive emotion dictionary from existing sentiment dictionaries and the features of Internet slang, so that depressive emotion feature words are identified more accurately and recognition accuracy improves. The invention builds a depressive emotion classifier on a support vector machine to classify each of an individual's microblog texts; once the classification results are obtained, the individual's degree of depression is measured by the depression index, and finally the continuity of the individual's depressive emotion fluctuation is taken into account. The invention also proposes a mathematical model that introduces an emotion decay factor to calculate depressive emotion values, portraying the course of an individual's depressive emotion realistically and effectively, which is of positive significance for the prevention and treatment of depression.
Although the content of the invention has been described in detail through the above preferred embodiments, it should be understood that the above description is not to be regarded as limiting the invention. Various modifications and substitutions will be apparent to those skilled in the art after reading the above. The scope of protection of the invention should therefore be defined by the appended claims.
Claims (10)
1. A depressive emotion analysis method based on an emotion decay factor, characterized by comprising the following steps:
S1. Mobilize a group of people to fill in a self-rating depression scale online, obtain each individual's depression scale score and microblog nickname, collect the individual's microblog texts, and have the microblog content labeled by an expert system;
S2. Preprocess the microblog texts, including word segmentation, stop-word removal, and construction of the depressive emotion dictionary;
S3. Perform feature selection and feature weighting, construct a term vector space from their results, and construct a text classifier that classifies the microblog texts to be classified, obtaining the depressive emotion state of each microblog text;
S4. Calculate the depression index, perform a Pearson correlation test between the depression index and the individual's depression scale score, establish the relationship between the degree of depression and the depression index according to the test result, and characterize the individual's degree of depression with the depression index;
S5. Introduce the emotion decay factor to obtain the depressive emotion value corresponding to each microblog, judge the individual's depression status, and portray the fluctuation of the individual's depressive emotion.
2. The depressive emotion analysis method based on an emotion decay factor according to claim 1, characterized in that the construction of the depressed-mood dictionary further comprises the following steps:
Step S231, collecting popular internet words that express depressed mood;
Step S232, extracting commonly used depressed-mood symbols from microblogs;
Step S233, on the basis of an existing sentiment dictionary, adding the collected internet words and depressed-mood symbols to construct the depressed-mood dictionary, and restoring the depressed-mood words that were split apart during word segmentation.
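One way to picture the restoration described in step S233 is a greedy merge of adjacent tokens whose concatenation appears in the dictionary. This is only a sketch under that assumption: the patent does not state how restoration is performed, and the function name `restore_split_words`, the `max_parts` limit, and the sample lexicon are hypothetical.

```python
def restore_split_words(tokens, lexicon, max_parts=3):
    """Re-join adjacent tokens when their concatenation is a dictionary entry.

    Longer candidates are tried first, so the longest dictionary match wins.
    """
    merged = []
    i = 0
    while i < len(tokens):
        for size in range(min(max_parts, len(tokens) - i), 1, -1):
            candidate = "".join(tokens[i:i + size])
            if candidate in lexicon:
                merged.append(candidate)
                i += size
                break
        else:
            # No dictionary entry starts here; keep the token unchanged.
            merged.append(tokens[i])
            i += 1
    return merged

# Hypothetical lexicon entry: "心累" (internet slang for feeling worn out).
restored = restore_split_words(["今天", "心", "累"], {"心累"})
```

In practice the same effect is often achieved by loading the dictionary into the segmenter before segmentation; the post-hoc merge above is just the simplest form to show.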
3. The depressive emotion analysis method based on an emotion decay factor according to claim 1, characterized in that the feature selection further comprises the following steps:
performing feature selection on the text using the CHI (chi-square) method, with the calculation formula:
$\chi^2(t, c_i) = \frac{N (AD - CB)^2}{(A + C)(B + D)(A + B)(C + D)}$
where $t$ is a feature, $c_i$ is a class, $N$ is the number of documents, $A$ is the number of documents that belong to class $c_i$ and contain feature $t$, $B$ is the number of documents that do not belong to class $c_i$ but contain feature $t$, $C$ is the number of documents that belong to class $c_i$ but do not contain feature $t$, and $D$ is the number of documents that neither belong to class $c_i$ nor contain feature $t$;
finally, the maximum value over the classes for feature $t$ is selected as the global CHI statistic, with the formula:
$\chi^2_{\max}(t) = \max_{i} \chi^2(t, c_i)$
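The CHI feature-selection step can be sketched as follows. The sketch assumes the standard chi-square statistic over the four document counts A, B, C, D of the claim, with N taken as their sum; the function names are hypothetical.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic of one feature for one class.

    a: documents in the class that contain the feature
    b: documents outside the class that contain the feature
    c: documents in the class that lack the feature
    d: documents outside the class that lack the feature
    """
    n = a + b + c + d  # assumption: N is the total number of documents
    denominator = (a + c) * (b + d) * (a + b) * (c + d)
    if denominator == 0:
        return 0.0  # feature or class is degenerate; no evidence either way
    return n * (a * d - c * b) ** 2 / denominator

def global_chi(counts_per_class):
    """Global statistic of a feature: the maximum over all classes."""
    return max(chi_square(*counts) for counts in counts_per_class)
```

Features would then be ranked by `global_chi` and the top-scoring ones kept; the cut-off is not specified in the claim.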
4. The depressive emotion analysis method based on an emotion decay factor according to claim 3, characterized in that the feature weighting further comprises the following steps:
performing feature weighting on the text using the term frequency-inverse document frequency (TF-IDF) method, with the formula:
$W_{ik} = tf_{ik} \cdot idf_{ik}$,
where $tf_{ik}$ denotes the number of times the feature word $t_i$ appears in text $d_k$, and $idf_{ik}$ denotes the inverse document frequency of the feature word $t_i$; the formula for the inverse document frequency $idf_{ik}$ is: [formula missing from the source text], where $D_k$ denotes the total number of texts in the text set and $d_{ik}$ denotes the number of texts in the text set that contain the feature $t_i$.
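A minimal sketch of the TF-IDF weighting follows. Because the inverse document frequency formula is not reproduced in this text, the sketch assumes the common form idf = ln(D_k / d_ik); the patent's actual formula may differ, for example by a smoothing term. The function name `tfidf_weights` is hypothetical.

```python
import math
from collections import Counter

def tfidf_weights(docs):
    """Return one {term: weight} dict per document.

    docs is a list of token lists. The weight is tf * idf, with
    idf = ln(total_docs / docs_containing_term), which is an assumed form.
    """
    total_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in docs:
        term_freq = Counter(tokens)
        weights.append({
            term: count * math.log(total_docs / doc_freq[term])
            for term, count in term_freq.items()
        })
    return weights

# Hypothetical segmented microblogs.
sample = tfidf_weights([["sad", "tired", "sad"], ["happy", "tired"]])
```

Under this form a term that appears in every text receives weight 0, so it contributes nothing to distinguishing depressed from non-depressed microblogs.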
5. The depressive emotion analysis method based on an emotion decay factor according to claim 4, characterized in that
the construction of the word vector space further comprises the following step: constructing the word vector space from the result of the feature selection and the result of the feature weighting, and representing each microblog in the form (L T:W), where L denotes the label of the microblog, T denotes a feature item, and W is the weight of the feature item.
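The (L T:W) representation resembles the sparse "label index:value" layout used by LIBSVM-style tools. The sketch below assumes that each feature item is mapped to an integer index; the helper name `format_sample` and the `feature_index` mapping are hypothetical and not part of the patent.

```python
def format_sample(label, weights, feature_index):
    """Serialize one microblog as 'L T:W T:W ...'.

    label: 0 or 1
    weights: {term: weight} for the microblog
    feature_index: {term: integer index} for the selected features
    Terms that were not selected as features are dropped.
    """
    pairs = sorted(
        (feature_index[term], weight)
        for term, weight in weights.items()
        if term in feature_index
    )
    return " ".join([str(label)] + [f"{index}:{weight:.4f}" for index, weight in pairs])

line = format_sample(1, {"sad": 1.3863, "tired": 0.0}, {"sad": 1, "tired": 2})
```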
6. The depressive emotion analysis method based on an emotion decay factor according to claim 1, 3, or 4, characterized in that,
after the text has been preprocessed and the word vector space obtained, a text classifier is constructed to classify the microblog texts to be classified, obtaining the depressed-mood state of each microblog text; wherein the word vector space serves as the input of the classifier, the final output is a label of 0 or 1, and the number of depressed microblogs is obtained from the classification result.
7. The depressive emotion analysis method based on an emotion decay factor according to claim 6, characterized in that
step S4 further comprises:
the depression index is calculated as $DI = \frac{N_d}{N_t}$, where $N_d$ denotes the number of depressed microblogs in the classification result of step S3 and $N_t$ denotes the total number of microblogs;
the relational expression between the depression index DI and the degree of depression E(DI) is: [formula missing from the source text]
when the depression index is less than 0.1, the individual has no depressive emotion; otherwise the individual has depressive emotion, and the larger the depression index, the more severe the degree of depression.
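The depression index and its 0.1 threshold can be sketched as below, assuming the classifier labels of claim 6 (1 for a depressed microblog, 0 otherwise). The function names are hypothetical, and the graded mapping E(DI) is left out because its formula is not reproduced in this text.

```python
def depression_index(labels):
    """DI = N_d / N_t: the share of microblogs labeled 1 (depressed)."""
    if not labels:
        raise ValueError("at least one classified microblog is required")
    depressed = sum(1 for label in labels if label == 1)
    return depressed / len(labels)

def has_depressive_emotion(di, threshold=0.1):
    """Per the claim, an index below 0.1 means no depressive emotion."""
    return di >= threshold

# Hypothetical classification result for ten microblogs of one individual.
di = depression_index([0, 1, 0, 0, 1, 0, 0, 0, 0, 0])
```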
8. The depressive emotion analysis method based on an emotion decay factor according to claim 1, characterized in that
step S5 further comprises:
introducing the emotion decay factor and constructing an emotion decay formula to obtain the depressive emotion value corresponding to each microblog, the formula being $f(t) = f(t-1) + (-1)^n e^{-\lambda t}$, where the time $t$ is defined as the time interval between two adjacent microblogs, so that $t$ takes the values $t = 0, 1, 2, \ldots, n$, and the initial state of any individual is $f(t=0) = 0$; $f(t)$ denotes the depressive emotion value corresponding to the microblog at time $t$; $f(t-1)$ denotes the depressive emotion value corresponding to the microblog text at the previous time; $\lambda$ is the emotion decay factor, which represents the rate at which emotion decays; assuming that depressed mood follows a half-life law, $\lambda = 0.5$ is taken; and the value of $n$ depends on the microblog states at the two adjacent time points.
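The decay recurrence f(t) = f(t-1) + (-1)^n e^{-λt} can be sketched as follows. Two points are assumptions rather than statements of the patent: the sign convention (a depressed microblog, state 1, raises the value and a non-depressed one, state 0, lowers it), since the formula for n is not reproduced in this text, and the intervals t, which are supplied by the caller instead of being derived from the reset rule of claim 9. The function name `depressive_values` is hypothetical.

```python
import math

def depressive_values(states, intervals, decay=0.5):
    """Depressive emotion value after each microblog, starting from f(0) = 0.

    states: classifier labels, 1 (depressed) or 0 (not depressed)
    intervals: the value of t used for each microblog
    decay: the emotion decay factor lambda (0.5 in the claim)
    """
    if len(states) != len(intervals):
        raise ValueError("states and intervals must have the same length")
    values = []
    previous = 0.0
    for state, t in zip(states, intervals):
        sign = 1.0 if state == 1 else -1.0  # assumed parity of n
        previous = previous + sign * math.exp(-decay * t)
        values.append(previous)
    return values

# Hypothetical sequence: two depressed microblogs, then one non-depressed.
trace = depressive_values([1, 1, 0], [1, 2, 1])
```

Because the contribution of each microblog shrinks as t grows, a long run of same-state microblogs moves the value less and less, which is the damping behaviour the decay factor is meant to capture.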
9. The depressive emotion analysis method based on an emotion decay factor according to claim 8, characterized in that
in step S5, the value of $n$ is calculated as: [formula missing from the source text]
where $c$ denotes the microblog state; when there are two or more consecutive 0 states and the state at the next time $t_i$ is $c = 1$, then $t_i = 1$; when there are two or more consecutive 1 states and the state at the next time $t_i$ is $c = 0$, the value of $t_i$ is not reset to 1 but continues to increase from the previous time; during both of these state transitions, the value of $f(t-1)$ remains unchanged and is still the depressive emotion value of the previous time.
10. The depressive emotion analysis method based on an emotion decay factor according to claim 1, characterized in that
in step S5, judging the individual's depression status further comprises the following steps:
after the depressive emotion value of each microblog has been calculated, the mean depressive emotion value is calculated as follows:
$Avg = \frac{1}{n - i + 1} \sum_{t=i}^{n} f(t)$
where $t = i$ indicates that the examination starts from the $i$-th microblog, $f(t=i)$ denotes the depressive emotion value of the $i$-th microblog, and $Avg$ denotes the mean depressive emotion value from the $i$-th microblog to the $n$-th microblog;
the individual's depression status is judged on the basis of the mean depressive emotion value:
if the individual's mean depressive emotion value lies in the interval [-1.6, 0.2), the individual's depression status is normal mood; if the mean depressive emotion value lies in the interval [0.2, 2], the individual's depression status is a tendency toward depression.
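The final judgment can be sketched as a mean over the depressive emotion values followed by the two interval checks. The function name `classify_mood`, the zero-based `start` offset, and the label strings are hypothetical; the arithmetic mean over microblogs i through n is assumed from the wording of the claim.

```python
def classify_mood(values, start=0):
    """Average the values from position `start` on and map the mean to a status.

    Returns (mean, status). Means outside [-1.6, 2] are not covered by the
    claim, so they are reported as such instead of being forced into a class.
    """
    window = values[start:]
    if not window:
        raise ValueError("no depressive emotion values to average")
    mean = sum(window) / len(window)
    if -1.6 <= mean < 0.2:
        status = "normal mood"
    elif 0.2 <= mean <= 2:
        status = "tendency toward depression"
    else:
        status = "outside the claimed intervals"
    return mean, status

# Hypothetical depressive emotion values for three microblogs.
mean_value, status = classify_mood([0.6, 1.0, 0.4])
```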
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910285499.8A CN109977231B (en) | 2019-04-10 | 2019-04-10 | Depressed mood analysis method based on emotional decay factor |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910285499.8A CN109977231B (en) | 2019-04-10 | 2019-04-10 | Depressed mood analysis method based on emotional decay factor |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109977231A | 2019-07-05 |
CN109977231B | 2021-04-02 |
Family
ID=67083941
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910285499.8A Active CN109977231B (en) | 2019-04-10 | 2019-04-10 | Depressed mood analysis method based on emotional decay factor |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109977231B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111312394A (en) * | 2020-01-15 | 2020-06-19 | 东北电力大学 | Psychological health condition evaluation system based on combined emotion and processing method thereof |
CN115495572A (en) * | 2022-08-01 | 2022-12-20 | 广州大学 | Auxiliary management method for depressed mood based on composite mood analysis |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140095150A1 (en) * | 2012-10-03 | 2014-04-03 | Kanjoya, Inc. | Emotion identification system and method |
CN104794208A (en) * | 2015-04-24 | 2015-07-22 | 清华大学 | Sentiment classification method and system based on contextual information of microblog text |
CN106547875A (en) * | 2016-11-02 | 2017-03-29 | 哈尔滨工程大学 | A kind of online incident detection method of the microblogging based on sentiment analysis and label |
CN106708805A (en) * | 2016-12-30 | 2017-05-24 | 深圳天珑无线科技有限公司 | Text statistics-based psychoanalysis method and device |
CN107885849A (en) * | 2017-11-13 | 2018-04-06 | 成都蓝景信息技术有限公司 | A kind of moos index analysis system based on text classification |
CN108652648A (en) * | 2018-03-16 | 2018-10-16 | 合肥数翼信息科技有限公司 | Depression monitoring device for depression of old people |
CN109543110A (en) * | 2018-11-28 | 2019-03-29 | 南京航空航天大学 | A kind of microblog emotional analysis method and system |
Non-Patent Citations (2)
Title |
---|
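Shi Zhiwei (施志伟): "A text-based model for recognizing depressive emotional tendency", Computer Systems & Applications (《计算机系统应用》) * |
Yang Lin (杨琳): "User behavior analysis and prediction based on social networks", China Master's Theses Full-text Database, Information Science and Technology (《中国优秀硕士学位论文全文数据库 信息科技辑》) * |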
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111312394A (en) * | 2020-01-15 | 2020-06-19 | 东北电力大学 | Psychological health condition evaluation system based on combined emotion and processing method thereof |
CN111312394B (en) * | 2020-01-15 | 2023-09-29 | 东北电力大学 | Psychological health assessment system based on combined emotion and processing method thereof |
CN115495572A (en) * | 2022-08-01 | 2022-12-20 | 广州大学 | Auxiliary management method for depressed mood based on composite mood analysis |
Also Published As
Publication number | Publication date |
---|---|
CN109977231B (en) | 2021-04-02 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||