CN104731773A - Text sentiment analysis method and text sentiment analysis system - Google Patents

Text sentiment analysis method and text sentiment analysis system

Info

Publication number
CN104731773A
Authority
CN
China
Prior art keywords
text
entity
block
emotion
short text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510185153.2A
Other languages
Chinese (zh)
Inventor
张翔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qianhai Shenzhen Panoramic financial information Co., Ltd.
Original Assignee
SHENZHEN SECURITIES INFORMATION CO Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SHENZHEN SECURITIES INFORMATION CO Ltd filed Critical SHENZHEN SECURITIES INFORMATION CO Ltd
Priority to CN201510185153.2A priority Critical patent/CN104731773A/en
Publication of CN104731773A publication Critical patent/CN104731773A/en
Pending legal-status Critical Current

Landscapes

  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a text sentiment analysis method and a text sentiment analysis system. The method includes: segmenting a text at punctuation marks to obtain at least one short text block; merging short text blocks that contain the same entity of interest to obtain long text blocks; performing sentiment analysis on the long text blocks to obtain a sentiment score for each long text block; and aggregating the sentiment scores of the long text blocks that contain the same entity of interest to obtain a sentiment score for that entity. The method and system make it possible to judge sentiment separately for multiple entities in a text, improving the accuracy of sentiment analysis and enabling high-precision automatic sentiment analysis.

Description

Text sentiment analysis method and system
Technical field
The present invention relates to the field of natural language processing, and in particular to a text sentiment analysis method and system.
Background
Sentiment analysis, also known as opinion mining, viewpoint analysis, or subjectivity analysis, aims to extract the opinions and sentiment polarity that users express in text. In recent years, the volume of subjective text on the web has grown steadily, and such sentiment-bearing text has great potential value in news, e-commerce, government affairs, and other areas. A traditional enterprise can use sentiment analysis tools to learn quickly how users rate its products and to locate problems; the financial industry can use them to learn quickly how the market views and rates particular industries or companies. In short, sentiment analysis has significant practical value in fields such as public security, business intelligence, and public opinion monitoring.
However, prior-art sentiment analysis methods can only produce a single sentiment judgment for an article. When an article mentions multiple entities, the prior art cannot judge sentiment for each of them separately. In particular, when an article expresses different sentiment toward different entities, the accuracy of prior-art methods is very low.
Summary of the invention
To overcome the above deficiencies of the prior art, the object of the present invention is to provide a text sentiment analysis method and system that can judge sentiment separately for each of multiple entities contained in a text.
To achieve this object, the invention provides a text sentiment analysis method, comprising:
Splitting the text at punctuation marks to obtain at least one short text block;
Merging the short text blocks that contain the same entity of interest to obtain long text blocks;
Performing sentiment analysis on each long text block to obtain the sentiment score of that long text block;
Aggregating the sentiment scores of the long text blocks that contain the same entity of interest to obtain the sentiment score of that entity.
The invention also provides a text sentiment analysis system, comprising a text segmentation unit, a text merging unit, a sentiment analysis unit, and a comprehensive calculation unit, wherein:
The text segmentation unit splits the text at punctuation marks to obtain at least one short text block;
The text merging unit merges the short text blocks that contain the same entity of interest to obtain long text blocks;
The sentiment analysis unit performs sentiment analysis on the long text blocks to obtain the sentiment score of each long text block;
The comprehensive calculation unit aggregates the sentiment scores of the long text blocks that contain the same entity of interest to obtain the sentiment score of that entity.
As the above technical solution shows, in embodiments of the invention the original text is split into multiple short text blocks, the short text blocks are merged into long text blocks, sentiment analysis is performed on each long text block, and finally the sentiment scores of the long text blocks containing the same entity of interest are aggregated to determine that entity's sentiment score. Sentiment can thus be judged separately for each of multiple entities in a text. This solves the prior-art problem that only a single sentiment judgment can be made for a text and that sentiment cannot be judged separately for multiple entities in it, and it achieves high-precision automatic sentiment analysis.
Brief description of the drawings
Fig. 1 is a flowchart of the text sentiment analysis method according to an embodiment of the invention;
Fig. 2 is a flowchart of text merging according to an embodiment of the invention;
Fig. 3 is a flowchart of text merging according to another embodiment of the invention;
Fig. 4 is a flowchart of sentiment analysis according to an embodiment of the invention;
Fig. 5 is a flowchart of sentiment analysis according to another embodiment of the invention;
Fig. 6 is a structural diagram of the text sentiment analysis system of the invention.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings. Those skilled in the art can readily understand other advantages and effects of the invention from the content disclosed in this specification. The invention may also be implemented or applied through other specific embodiments, and the details in this specification may be modified and changed in various ways, based on different viewpoints and applications, without departing from the spirit of the invention.
Fig. 1 is a flowchart of the text sentiment analysis method according to an embodiment of the invention. Referring to Fig. 1, the text sentiment analysis method provided by the invention may comprise the following steps:
Step 101: split the text at punctuation marks to obtain at least one short text block;
Step 103: merge the short text blocks that contain the same entity of interest to obtain long text blocks;
Step 105: perform sentiment analysis on each long text block to obtain the sentiment score of that long text block;
Step 107: aggregate the sentiment scores of the long text blocks that contain the same entity of interest to obtain the sentiment score of that entity.
In one embodiment, merging the short text blocks comprises: when at least one of two adjacent short text blocks contains no entity of interest, and either the earlier block ends with a comma or the later block contains no Chinese characters, merging the two adjacent short text blocks.
In one embodiment, merging the short text blocks comprises: when two adjacent short text blocks each contain one and the same entity of interest and no other, merging the two adjacent short text blocks; or, when two non-adjacent short text blocks each contain one and the same entity of interest and no other, and none of the short text blocks between them contains any entity of interest, merging the two non-adjacent short text blocks together with all the short text blocks between them.
In one embodiment, aggregating the sentiment scores of the long text blocks that contain the same entity of interest comprises: calculating the average sentiment score of all long text blocks that contain that entity to obtain the sentiment score of the entity.
In one embodiment, aggregating the sentiment scores of the long text blocks that contain the same entity of interest further comprises: calculating the average string length of all long text blocks that contain that entity to obtain the importance score of the entity.
The algorithm of the text sentiment analysis method is described in more detail below with reference to a concrete example.
One difficulty in text sentiment analysis is judging sentiment for multiple entities in the same text. For example, suppose the original text is:
"LeTV battles Xiaomi: LeTV Net wins the lawsuit, the Xiaomi box is ruled infringing, and must bear joint liability for unauthorized broadcasting. Regarding this ruling, Xiaomi Technology public relations director Liu Fei said, the Xiaomi company reserves its opinion. LeTV Net said, LeTV has always respected and protected intellectual property rights."
Clearly, this text evaluates LeTV (乐视) positively and Xiaomi (小米) negatively. A prior-art text sentiment analysis method, however, can only give one sentiment score for the whole text and cannot judge sentiment separately for LeTV and Xiaomi, so its accuracy is very low.
To improve the accuracy of sentiment analysis, the embodiment first extracts the entities of interest in the text (LeTV and Xiaomi in the example above). Before the entities are extracted, an entity vocabulary can be built and the names of the entities of interest imported into it. The entity vocabulary is shown in Table 1.
No.    Entity of interest
0      LeTV (乐视)
1      Xiaomi (小米)
Table 1: Entity vocabulary
Next, the positions in the text of the entities listed in the entity vocabulary are located, and each entity found is extracted and marked. The positions can be found programmatically using regular expressions. For example, after the vocabulary entities LeTV and Xiaomi are extracted from the above text and marked, the text reads as follows:
"<oe>LeTV</oe> battles <oe>Xiaomi</oe>: <oe>LeTV</oe> Net wins the lawsuit, the <oe>Xiaomi</oe> box is ruled infringing, and must bear joint liability for unauthorized broadcasting. Regarding this ruling, <oe>Xiaomi</oe> Technology public relations director Liu Fei said, the <oe>Xiaomi</oe> company reserves its opinion. <oe>LeTV</oe> Net said, <oe>LeTV</oe> has always respected and protected intellectual property rights."
Next, the text is split at punctuation marks into multiple short text blocks, which are stored in their original order. For example, the above text is split at punctuation marks into the following 9 short text blocks, Block 1 to Block 9, stored in the order in which they appear in the original text.
Block 1: <oe>LeTV</oe> battles <oe>Xiaomi</oe>:
Block 2: <oe>LeTV</oe> Net wins the lawsuit,
Block 3: the <oe>Xiaomi</oe> box is ruled infringing,
Block 4: and must bear joint liability for unauthorized broadcasting.
Block 5: Regarding this ruling,
Block 6: <oe>Xiaomi</oe> Technology public relations director Liu Fei said,
Block 7: the <oe>Xiaomi</oe> company reserves its opinion.
Block 8: <oe>LeTV</oe> Net said,
Block 9: <oe>LeTV</oe> has always respected and protected intellectual property rights.
When text is split at punctuation, splitting indiscriminately at every punctuation mark would break up the sentence units of the original text and thus reduce the accuracy and efficiency of sentiment analysis. The inventors found that the text enclosed by certain punctuation marks, such as book-title marks 《 》, square brackets [ ], and braces { }, usually carries no sentiment orientation and contributes little to sentiment analysis. After repeated comparison and testing, the inventors found that splitting at the comma ",", full stop "。", question mark "?", exclamation mark "!", enumeration comma "、", quotation marks, round brackets, and colon gives the highest accuracy and efficiency of sentiment analysis.
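The marking and splitting steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `tag_entities`, `split_blocks`, and `SPLIT_CHARS` are hypothetical names, the split set uses the full-width (Chinese) forms of the punctuation marks listed above, and the sample string is an illustrative Chinese phrase rather than a quotation from the patent.

```python
import re

# Split points named in the description: comma, full stop, question mark,
# exclamation mark, enumeration comma, quotation marks, round brackets, colon.
# Assumption: full-width forms only; ASCII forms could be added if needed.
SPLIT_CHARS = ",。?!、“”‘’():"
_SPLIT_RE = re.compile(f"([{re.escape(SPLIT_CHARS)}])")


def tag_entities(text: str, entities: list[str], tag: str = "oe") -> str:
    """Wrap every occurrence of a vocabulary entry in <oe>...</oe> markers."""
    if not entities:
        return text
    # Try longer names first so that a longer entry wins over its own prefix.
    ordered = sorted(entities, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(name) for name in ordered))
    return pattern.sub(lambda m: f"<{tag}>{m.group(0)}</{tag}>", text)


def split_blocks(text: str) -> list[str]:
    """Cut the text after each split character, keeping it on the block it ends."""
    blocks: list[str] = []
    current = ""
    for part in _SPLIT_RE.split(text):
        current += part
        if _SPLIT_RE.fullmatch(part):  # part is a single split character
            blocks.append(current)
            current = ""
    if current:  # trailing text without closing punctuation
        blocks.append(current)
    return blocks


vocabulary = ["乐视", "小米"]
marked = tag_entities("乐视斗小米:乐视网胜诉,", vocabulary)
print(split_blocks(marked))
# ['<oe>乐视</oe>斗<oe>小米</oe>:', '<oe>乐视</oe>网胜诉,']
```

Marking before splitting keeps the markers inside each block, so later steps can read the entities of a block directly from its text. Two split characters in a row yield a block that holds only punctuation.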
To further improve the efficiency of sentiment analysis, the sentence units of the original text should be preserved as far as possible. The embodiment achieves this by merging text blocks; the merged long text blocks largely restore the sentence units of the original text.
Fig. 2 is a flowchart of text merging according to an embodiment of the invention.
Referring to Fig. 2, the method comprises:
Step 201: traverse the short text blocks in the order of the original text;
Step 203: if the current short text block or the next short text block contains no entity of interest, go to step 205; otherwise, return to step 201;
Step 205: if the current short text block ends with a comma, or the next short text block contains no Chinese characters, go to step 207; otherwise, return to step 201;
Step 207: merge the current short text block with the next short text block.
That is, when at least one of two adjacent short text blocks contains no entity of interest, and either the earlier block ends with a comma or the later block contains no Chinese characters, the two adjacent short text blocks are merged.
For example, merging the short text blocks Block 1 to Block 9 of the above example according to the rules beginning at step 201 yields the following 7 long text blocks, Block 1 to Block 7:
Block 1: <oe>LeTV</oe> battles <oe>Xiaomi</oe>:
Block 2: <oe>LeTV</oe> Net wins the lawsuit,
Block 3: the <oe>Xiaomi</oe> box is ruled infringing, and must bear joint liability for unauthorized broadcasting.
Block 4: Regarding this ruling, <oe>Xiaomi</oe> Technology public relations director Liu Fei said,
Block 5: the <oe>Xiaomi</oe> company reserves its opinion.
Block 6: <oe>LeTV</oe> Net said,
Block 7: <oe>LeTV</oe> has always respected and protected intellectual property rights.
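The merging rule of Fig. 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: blocks are plain strings carrying the `<oe>` markers shown above, `entities_in` and `merge_entity_free_neighbours` are hypothetical helper names, "Chinese character" is approximated by the basic CJK Unified Ideographs range, and a block produced by a merge is treated as the current block when the next one is examined. The sample strings are illustrative Chinese phrases, not quotations from the patent.

```python
import re

ENTITY_RE = re.compile(r"<oe>(.*?)</oe>")
CHINESE_RE = re.compile(r"[\u4e00-\u9fff]")  # basic CJK Unified Ideographs block


def entities_in(block: str) -> set[str]:
    """Return the distinct marked entities that occur in a block."""
    return set(ENTITY_RE.findall(block))


def merge_entity_free_neighbours(blocks: list[str]) -> list[str]:
    """Merge adjacent blocks when at least one of them has no entity and
    either the earlier one ends with a comma or the later one has no
    Chinese characters (steps 203 to 207)."""
    merged: list[str] = []
    for block in blocks:
        if merged:
            prev = merged[-1]
            one_has_no_entity = not entities_in(prev) or not entities_in(block)
            joinable = prev.endswith((",", ",")) or not CHINESE_RE.search(block)
            if one_has_no_entity and joinable:
                merged[-1] = prev + block  # the merged block becomes "current"
                continue
        merged.append(block)
    return merged


blocks = ["<oe>小米</oe>盒子被判侵权,", "需承担盗播连带责任。", "对于这一判决结果,"]
print(merge_entity_free_neighbours(blocks))
# ['<oe>小米</oe>盒子被判侵权,需承担盗播连带责任。', '对于这一判决结果,']
```

In the demonstration, the third block is not attached to the merged block because the merged block ends with a full stop and the third block does contain Chinese characters.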
To further improve the accuracy and efficiency of sentiment analysis, the embodiment also proposes a second text block merging method, which merges nearby text blocks that contain the same entity of interest. Each text block produced by this merge contains a single entity of interest, so sentiment toward that entity can be judged accurately.
Fig. 3 is a flowchart of text merging according to another embodiment of the invention.
Referring to Fig. 3, the method comprises:
Step 301: traverse the short text blocks in the order of the original text;
Step 303: if the current short text block contains entity of interest X and no other entity of interest, go to step 305; otherwise, return to step 301;
Step 305: if the next short text block contains entity of interest X and no other entity of interest, go to step 307; otherwise, go to step 309;
Step 307: merge the current short text block with the next short text block;
Step 309: if the next short text block through the n-th short text block contain no entity of interest, and the (n+1)-th short text block contains entity of interest X and no other entity of interest, go to step 311; otherwise, return to step 301;
Step 311: merge the current short text block through the (n+1)-th short text block.
That is, when two adjacent short text blocks each contain one and the same entity of interest and no other, the two adjacent short text blocks are merged; or, when two non-adjacent short text blocks each contain one and the same entity of interest and no other, and none of the short text blocks between them contains any entity of interest, the two non-adjacent short text blocks are merged together with all the short text blocks between them.
It should be noted that the invention places no specific restriction on the order or manner in which the merging methods of Fig. 2 and Fig. 3 are executed. Preferably, the short text blocks are first merged by the method of Fig. 2 and then further merged by the method of Fig. 3.
For example, merging the text blocks Block 1 to Block 7 obtained above again, this time according to the method beginning at step 301, yields the following 4 long text blocks, Block 1 to Block 4:
Block 1: <oe>LeTV</oe> battles <oe>Xiaomi</oe>:
Block 2: <oe>LeTV</oe> Net wins the lawsuit,
Block 3: the <oe>Xiaomi</oe> box is ruled infringing, and must bear joint liability for unauthorized broadcasting. Regarding this ruling, <oe>Xiaomi</oe> Technology public relations director Liu Fei said, the <oe>Xiaomi</oe> company reserves its opinion.
Block 4: <oe>LeTV</oe> Net said, <oe>LeTV</oe> has always respected and protected intellectual property rights.
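The merging rule of Fig. 3 can be sketched in Python in the same style. Again this is only an illustration under stated assumptions: `entities_in` and `merge_same_entity` are hypothetical names, a block "contains only entity X" when the set of distinct marked entities in it is exactly {X}, and the rule is applied repeatedly so that a merged block can absorb further blocks for the same entity.

```python
import re

ENTITY_RE = re.compile(r"<oe>(.*?)</oe>")


def entities_in(block: str) -> set[str]:
    """Return the distinct marked entities that occur in a block."""
    return set(ENTITY_RE.findall(block))


def merge_same_entity(blocks: list[str]) -> list[str]:
    """Merge a single-entity block with the following blocks for the same
    entity, bridging any run of entity-free blocks in between (steps 303 to 311)."""
    merged: list[str] = []
    i = 0
    while i < len(blocks):
        current = blocks[i]
        target = entities_in(current)
        j = i + 1
        if len(target) == 1:
            while j < len(blocks):
                k = j
                # Skip a (possibly empty) run of blocks without any entity.
                while k < len(blocks) and not entities_in(blocks[k]):
                    k += 1
                if k < len(blocks) and entities_in(blocks[k]) == target:
                    current += "".join(blocks[j:k + 1])
                    j = k + 1
                else:
                    break
        merged.append(current)
        i = j
    return merged


blocks = ["<oe>乐视</oe>网表示,", "<oe>乐视</oe>一直以来就尊重并保护知识产权。"]
print(len(merge_same_entity(blocks)))  # 1
```

Entity-free blocks that are not followed by a block for the same entity are left as separate blocks, which matches the "otherwise" branch of step 309.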
As can be seen, by merging text blocks the invention effectively extracts, for each entity of interest in the original text, the text that concerns it, finding the text blocks that describe only that entity. This solves the prior-art problem that sentiment cannot be analysed separately for multiple entities in a text. Moreover, merging short text blocks into long text blocks effectively increases text length, which also addresses the prior-art difficulty of analysing the sentiment of short texts (especially short texts that contain no sentiment words).
After the short text blocks have been merged into long text blocks, sentiment analysis is performed on the merged long text blocks one by one to obtain their sentiment scores. To further improve efficiency and accuracy, this embodiment does not analyse long text blocks that contain two or more entities of interest; only the merged long text blocks that contain exactly one entity of interest are analysed. The sentiment scores of the long text blocks that contain the same entity of interest are then aggregated to obtain that entity's sentiment score.
Fig. 4 is a flowchart of sentiment analysis according to an embodiment of the invention.
Referring to Fig. 4, the method comprises:
Step 401: perform sentiment analysis on each long text block that contains exactly one entity of interest to obtain the sentiment score of that long text block;
Step 403: average the sentiment scores of the long text blocks that contain the same entity of interest to obtain the sentiment score of that entity.
For example, among the long text blocks Block 1 to Block 4 obtained by the above merging, Block 1 contains two entities of interest (LeTV and Xiaomi) and is therefore not analysed; Block 2 and Block 4 each contain the single entity of interest LeTV, and Block 3 contains the single entity of interest Xiaomi. Sentiment analysis is performed on Block 2 to Block 4 separately to calculate their sentiment scores. The results are as follows:
Block 2: LeTV: Sentiment_value: 0.0166; Sentiment_tag: non-neg;
Block 3: Xiaomi: Sentiment_value: -0.4198; Sentiment_tag: neg;
Block 4: LeTV: Sentiment_value: 0.0226; Sentiment_tag: non-neg;
Then the sentiment scores of the text blocks that contain the same entity of interest are aggregated to obtain that entity's sentiment score. Specifically, the sentiment scores of the text blocks that contain a given entity of interest are averaged. In the example above, Block 2 and Block 4 contain the same entity of interest, LeTV; averaging their sentiment scores gives a sentiment score of 0.0196 for LeTV. The sentiment scores of the other entities of interest in the text are obtained in the same way:
Xiaomi: Sentiment_value: -0.4198; Sentiment_tag: neg;
LeTV: Sentiment_value: 0.0196; Sentiment_tag: non-neg.
The analysis results show that the original text evaluates LeTV positively and Xiaomi negatively.
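The averaging of step 403 can be illustrated with a short Python sketch that uses the per-block scores listed above. The function name `entity_sentiment` and the input layout (one `(entity, score)` pair per single-entity long text block) are assumptions made for this illustration; the scorer that produces the per-block values is not specified here and is left out.

```python
from collections import defaultdict
from statistics import fmean


def entity_sentiment(block_scores: list[tuple[str, float]]) -> dict[str, float]:
    """Average the sentiment scores of all blocks that share an entity."""
    by_entity: dict[str, list[float]] = defaultdict(list)
    for entity, score in block_scores:
        by_entity[entity].append(score)
    return {entity: fmean(scores) for entity, scores in by_entity.items()}


# Per-block scores for Block 2, Block 3, and Block 4 of the example.
scores = [("LeTV", 0.0166), ("Xiaomi", -0.4198), ("LeTV", 0.0226)]
result = entity_sentiment(scores)
print({entity: round(value, 4) for entity, value in result.items()})
# {'LeTV': 0.0196, 'Xiaomi': -0.4198}
```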
Another embodiment of the invention provides a sentiment analysis method that, while analysing sentiment toward the entities of interest, also calculates the importance of each entity of interest in the original text.
Prior-art methods judge the importance of an entity mainly by counting how many times the entity occurs or by checking whether it appears in the title. These methods have limitations. First, they require extra computing time and are relatively inefficient. Second, their range of application is narrow and their accuracy is low, so they cannot analyse importance accurately. For example, when entity A and entity B occur the same number of times in an article, but entity A is described at length while entity B is mentioned only in passing, judging importance by occurrence count is clearly inaccurate. In addition, when abbreviations or alternative names of an entity are used extensively, or when an entity appears in the title but is not the focus of the text, prior-art methods cannot judge its importance accurately.
To solve the prior-art problem that the importance of each entity of interest in the original text cannot be judged, the invention also provides a sentiment analysis method that calculates the importance of each entity of interest in the original text while analysing sentiment toward it.
Fig. 5 is a flowchart of sentiment analysis according to another embodiment of the invention. This method is an improvement on the embodiment of Fig. 4; the parts that are the same as in the Fig. 4 embodiment are not repeated here.
Referring to Fig. 5, in addition to the steps of the Fig. 4 embodiment, the method comprises:
Step 501: calculate the string length of each long text block that contains exactly one entity of interest;
Step 503: average the string lengths of the long text blocks that contain the same entity of interest to obtain the importance score of that entity.
Specifically, after the short text blocks have been merged into long text blocks, the string length of each merged long text block is also calculated, block by block. The string lengths of the long text blocks that contain the same entity of interest are then aggregated to obtain that entity's importance score. In general, the greater the string length of the text that contains a given entity of interest, the more important that entity is judged to be in the text.
Similarly, to improve the efficiency and accuracy of the importance judgment, this embodiment does not calculate string length for long text blocks that contain two or more entities of interest; string length is calculated only for the merged long text blocks that contain exactly one entity of interest.
For example, for the long text blocks Block 2 to Block 4 in the Fig. 4 example above, while sentiment analysis is performed on each block according to step 401 to calculate its sentiment score, importance analysis is also performed on each block to calculate its string length. The string lengths are those of the original Chinese text blocks. The results are as follows:
Block 2: LeTV: Sentiment_value: 0.0166; Sentiment_tag: non-neg; String length: 7.0;
Block 3: Xiaomi: Sentiment_value: -0.4198; Sentiment_tag: neg; String length: 48.0;
Block 4: LeTV: Sentiment_value: 0.0226; Sentiment_tag: non-neg; String length: 23.0;
Then the sentiment scores and string lengths of the text blocks that contain the same entity of interest are aggregated to obtain that entity's sentiment score and importance. Specifically, the sentiment scores and the string lengths of the text blocks that contain a given entity of interest are each averaged. In the example above, Block 2 and Block 4 contain the same entity of interest, LeTV; averaging their sentiment scores gives a sentiment score of 0.0196 for LeTV, and averaging their string lengths gives an importance of 15.0 for LeTV. The sentiment scores and importance of the other entities of interest in the text are obtained in the same way:
Xiaomi: Sentiment_value: -0.4198; Sentiment_tag: neg; String length: 24.0
LeTV: Sentiment_value: 0.0196; Sentiment_tag: non-neg; String length: 15.0.
The analysis results show that the original text evaluates LeTV positively and Xiaomi negatively. In addition, the importance of LeTV is 15 and the importance of Xiaomi is 24, somewhat higher than that of LeTV. While analysing the text to calculate sentiment scores, the invention also calculates the importance of the entities of interest, so a single analysis pass outputs both the sentiment score and the importance score. This saves a great deal of computing time and greatly improves the efficiency and accuracy of sentiment analysis. In practice, the invention achieves an accuracy of more than 90% for sentiment analysis of entities of interest and more than 80% for analysis of their importance in the original text.
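The joint calculation of steps 403 and 503 can be sketched in Python as one grouping pass that returns both averages. The names `BlockResult` and `summarise` are hypothetical, and the demonstration uses only the two LeTV blocks with the per-block values listed above.

```python
from dataclasses import dataclass
from statistics import fmean


@dataclass(frozen=True)
class BlockResult:
    entity: str       # the single entity of interest in the long text block
    sentiment: float  # sentiment score of the block
    length: int       # string length of the block


def summarise(results: list[BlockResult]) -> dict[str, tuple[float, float]]:
    """Return {entity: (mean sentiment score, mean string length)}."""
    grouped: dict[str, list[BlockResult]] = {}
    for r in results:
        grouped.setdefault(r.entity, []).append(r)
    return {
        entity: (fmean(r.sentiment for r in rs), fmean(r.length for r in rs))
        for entity, rs in grouped.items()
    }


# Block 2 and Block 4 of the example, both about LeTV.
letv_blocks = [BlockResult("LeTV", 0.0166, 7), BlockResult("LeTV", 0.0226, 23)]
sentiment, importance = summarise(letv_blocks)["LeTV"]
print(round(sentiment, 4), importance)  # 0.0196 15.0
```

Keeping the score and the length on the same record means the importance value comes from data that the sentiment step already holds, which is the efficiency argument made above.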
Based on the above detailed analysis, an embodiment of the invention also provides a text sentiment analysis system.
Fig. 6 is a structural diagram of the text sentiment analysis system of the invention. As shown in Fig. 6, the system comprises a text segmentation unit, a text merging unit, a sentiment analysis unit, and a comprehensive calculation unit, wherein:
The text segmentation unit splits the text at punctuation marks to obtain at least one short text block;
The text merging unit merges the short text blocks that contain the same entity of interest to obtain long text blocks;
The sentiment analysis unit performs sentiment analysis on the long text blocks to obtain the sentiment score of each long text block;
The comprehensive calculation unit aggregates the sentiment scores of the long text blocks that contain the same entity of interest to obtain the sentiment score of that entity.
In one embodiment, the text merging unit comprises a first merging unit which, when at least one of two adjacent short text blocks contains no entity of interest, and either the earlier block ends with a comma or the later block contains no Chinese characters, merges the two adjacent short text blocks.
In one embodiment, the text merging unit further comprises a second merging unit which, when two adjacent short text blocks each contain one and the same entity of interest and no other, merges the two adjacent short text blocks; or which, when two non-adjacent short text blocks each contain one and the same entity of interest and no other, and none of the short text blocks between them contains any entity of interest, merges the two non-adjacent short text blocks together with all the short text blocks between them.
In one embodiment, the comprehensive calculation unit comprises a sentiment calculation unit, which calculates the average sentiment score of all long text blocks that contain the same entity of interest to obtain the sentiment score of that entity.
In one embodiment, the comprehensive calculation unit comprises an importance calculation unit, which calculates the average string length of all long text blocks that contain the same entity of interest to obtain the importance score of that entity.
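One way to picture how the units fit together is the following Python sketch, in which each unit is a pluggable callable. It is an assumption-laden outline rather than the system of Fig. 6 itself: the class name `TextSentimentSystem`, its field names, and the `entities_of` helper are hypothetical, and the sentiment scorer is supplied by the caller because no particular scorer is prescribed here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TextSentimentSystem:
    split: Callable[[str], list[str]]        # text segmentation unit
    merge: Callable[[list[str]], list[str]]  # text merging unit
    score: Callable[[str], float]            # sentiment analysis unit
    entities_of: Callable[[str], set[str]]   # helper: entities found in a block

    def analyse(self, text: str) -> dict[str, float]:
        """Return the mean sentiment score per entity of interest."""
        per_entity: dict[str, list[float]] = {}
        for block in self.merge(self.split(text)):
            found = self.entities_of(block)
            if len(found) != 1:
                continue  # skip blocks with no entity or with several entities
            (entity,) = found
            per_entity.setdefault(entity, []).append(self.score(block))
        # Comprehensive calculation unit: average the scores per entity.
        return {e: sum(s) / len(s) for e, s in per_entity.items()}
```

A caller would pass in concrete splitting, merging, and scoring functions; swapping the scorer does not affect the block handling.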
In summary, in embodiments of the invention, splitting and merging text blocks effectively extracts the text that concerns a given entity of interest, finding the text blocks that describe only that entity; sentiment analysis is then performed on these blocks one by one, so that the sentiment score and sentiment tag of each entity of interest in the text can be obtained. The invention solves the prior-art problem that only a single sentiment judgment can be given for a text and that sentiment cannot be analysed separately for multiple entities of interest in it.
In addition, by aggregating the lengths of the text blocks that contain a given entity of interest, the invention analyses the share of the full text devoted to that entity in order to judge whether the original text describes it at length, and can thus judge the importance of the entity of interest in the original text.
The above are only preferred embodiments of the invention and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within the scope of protection of the invention.

Claims (10)

1. A text sentiment analysis method, characterized in that the method comprises:
segmenting the text according to punctuation marks to obtain at least one short text block;
merging the short text blocks that contain the same concern entity to obtain long text blocks;
performing sentiment analysis on the long text blocks to obtain the emotion score of each long text block;
combining the emotion scores of the long text blocks that contain the same concern entity to obtain the emotion score of that concern entity.
2. The text sentiment analysis method according to claim 1, characterized in that merging the short text blocks that contain the same concern entity comprises:
merging two positionally adjacent short text blocks when at least one of them contains no concern entity and, in addition, either the preceding short text block ends with a comma or the following short text block contains no Chinese characters.
3. The text sentiment analysis method according to claim 1 or 2, characterized in that merging the short text blocks that contain the same concern entity comprises:
merging two positionally adjacent short text blocks when both contain one and only one concern entity and that entity is common to both; or
when two non-adjacent short text blocks both contain one and only one concern entity, common to both, and none of the short text blocks between them contains any concern entity, merging the two non-adjacent short text blocks together with all of the short text blocks between them.
4. The text sentiment analysis method according to claim 1, characterized in that combining the emotion scores of the long text blocks that contain the same concern entity comprises:
calculating the average emotion score of all long text blocks containing the same concern entity to obtain the emotion score of that concern entity.
5. The text sentiment analysis method according to claim 1, characterized in that combining the emotion scores of the long text blocks that contain the same concern entity comprises:
calculating the average string length of all long text blocks containing the same concern entity to obtain the importance score of that concern entity.
6. A text sentiment analysis system, characterized in that the system comprises a text segmentation unit, a text merging unit, a sentiment analysis unit, and a comprehensive calculation unit, wherein:
the text segmentation unit is configured to segment the text according to punctuation marks to obtain at least one short text block;
the text merging unit is configured to merge the short text blocks that contain the same concern entity to obtain long text blocks;
the sentiment analysis unit is configured to perform sentiment analysis on the long text blocks to obtain the emotion score of each long text block;
the comprehensive calculation unit is configured to combine the emotion scores of the long text blocks that contain the same concern entity to obtain the emotion score of that concern entity.
7. The text sentiment analysis system according to claim 6, characterized in that the text merging unit comprises:
a first merging unit, configured to merge two positionally adjacent short text blocks when at least one of them contains no concern entity and, in addition, either the preceding short text block ends with a comma or the following short text block contains no Chinese characters.
8. The text sentiment analysis system according to claim 6 or 7, characterized in that the text merging unit comprises:
a second merging unit, configured to merge two positionally adjacent short text blocks when both contain one and only one concern entity and that entity is common to both; or
when two non-adjacent short text blocks both contain one and only one concern entity, common to both, and none of the short text blocks between them contains any concern entity, to merge the two non-adjacent short text blocks together with all of the short text blocks between them.
9. The text sentiment analysis system according to claim 6, characterized in that the comprehensive calculation unit comprises:
an emotion calculation unit, configured to calculate the average emotion score of all long text blocks containing the same concern entity to obtain the emotion score of that concern entity.
10. The text sentiment analysis system according to claim 6, characterized in that the comprehensive calculation unit comprises:
an importance calculation unit, configured to calculate the average string length of all long text blocks containing the same concern entity to obtain the importance score of that concern entity.
CN201510185153.2A 2015-04-17 2015-04-17 Text sentiment analysis method and text sentiment analysis system Pending CN104731773A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510185153.2A CN104731773A (en) 2015-04-17 2015-04-17 Text sentiment analysis method and text sentiment analysis system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510185153.2A CN104731773A (en) 2015-04-17 2015-04-17 Text sentiment analysis method and text sentiment analysis system

Publications (1)

Publication Number Publication Date
CN104731773A true CN104731773A (en) 2015-06-24

Family

ID=53455671

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510185153.2A Pending CN104731773A (en) 2015-04-17 2015-04-17 Text sentiment analysis method and text sentiment analysis system

Country Status (1)

Country Link
CN (1) CN104731773A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106776538A (en) * 2016-11-23 2017-05-31 国网福建省电力有限公司 The information extracting method of enterprise's noncanonical format document
CN109543180A (en) * 2018-11-08 2019-03-29 中山大学 A kind of text emotion analysis method based on attention mechanism
CN110163688A (en) * 2019-05-30 2019-08-23 复旦大学 Commodity network public sentiment detection system
CN111527492A (en) * 2018-02-05 2020-08-11 国际商业机器公司 Superposition and entanglement of social emotion and natural language generated quanta
CN113343693A (en) * 2020-03-03 2021-09-03 阿里巴巴集团控股有限公司 Named entity identification method, device, equipment and machine readable medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110209043A1 (en) * 2010-02-21 2011-08-25 International Business Machines Corporation Method and apparatus for tagging a document
US20110295594A1 (en) * 2010-05-28 2011-12-01 International Business Machines Corporation System, method, and program for processing text using object coreference technology

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106776538A (en) * 2016-11-23 2017-05-31 国网福建省电力有限公司 The information extracting method of enterprise's noncanonical format document
CN111527492A (en) * 2018-02-05 2020-08-11 国际商业机器公司 Superposition and entanglement of social emotion and natural language generated quanta
CN111527492B (en) * 2018-02-05 2024-03-01 国际商业机器公司 Superposition and entanglement of quanta generated by social emotion and natural language
CN109543180A (en) * 2018-11-08 2019-03-29 中山大学 A kind of text emotion analysis method based on attention mechanism
CN109543180B (en) * 2018-11-08 2020-12-04 中山大学 Text emotion analysis method based on attention mechanism
CN110163688A (en) * 2019-05-30 2019-08-23 复旦大学 Commodity network public sentiment detection system
CN113343693A (en) * 2020-03-03 2021-09-03 阿里巴巴集团控股有限公司 Named entity identification method, device, equipment and machine readable medium

Similar Documents

Publication Publication Date Title
CN106708966A (en) Similarity calculation-based junk comment detection method
AU2017243270B2 (en) Method and device for extracting core words from commodity short text
CN104731773A (en) Text sentiment analysis method and text sentiment analysis system
CN103544255B (en) Text semantic relativity based network public opinion information analysis method
CN103885937B (en) Method for judging repetition of enterprise Chinese names on basis of core word similarity
CN107807962B (en) A method of similarity mode being carried out to legal decision document using LDA topic model
CN110489560A (en) The little Wei enterprise portrait generation method and device of knowledge based graphical spectrum technology
CN109325019B (en) Data association relationship network construction method
CN106776897B (en) User portrait label determination method and device
CN111199474A (en) Risk prediction method and device based on network diagram data of two parties and electronic equipment
CN105045847B (en) A kind of method that Chinese institutional units title is extracted from text message
CN111222976A (en) Risk prediction method and device based on network diagram data of two parties and electronic equipment
CN103336766A (en) Short text garbage identification and modeling method and device
CN103389998A (en) Novel Internet commercial intelligence information semantic analysis technology based on cloud service
CN109583738A (en) A kind of device and method for bond risk control
CN105468649B (en) Method and device for judging matching of objects to be displayed
CN107368592B (en) Text feature model modeling method and device for network security report
CN109033132A (en) The method and device of text and the main body degree of correlation are calculated using knowledge mapping
CN103106211B (en) Emotion recognition method and emotion recognition device for customer consultation texts
CN106250365A (en) The extracting method of item property Feature Words in consumer reviews based on text analyzing
CN107818173B (en) Vector space model-based Chinese false comment filtering method
CN113535813A (en) Data mining method and device, electronic equipment and storage medium
Urata et al. Trade creation and diversion effects of regional trade agreements on commodity trade
CN109408643B (en) Fund similarity calculation method, system, computer equipment and storage medium
CN105573968A (en) Text indexing method based on rules

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C41 Transfer of patent application or patent right or utility model
TA01 Transfer of patent application right

Effective date of registration: 20160415

Address after: 518054 Guangdong city of Shenzhen province Qianhai cooperation zone before the Deep Bay Road No. 1 building 201 room A (in Shenzhen Qianhai City Secretary of Commerce Co., Ltd.)

Applicant after: Qianhai Shenzhen Panoramic financial information Co., Ltd.

Address before: 518028 Guangdong city of Shenzhen province Futian District Hongli West Industrial Zone on the step 203 building 606 room

Applicant before: Shenzhen Securities Information Co.,Ltd.

C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20150624