CN110569497A - Opinion vocabulary expansion system and opinion vocabulary expansion method - Google Patents

Publication number
CN110569497A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811341060.4A
Other languages
Chinese (zh)
Inventor
萧瑞祥
王雅诗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tamkang University
Original Assignee
Tamkang University
Application filed by Tamkang University
Publication of CN110569497A

Abstract

The invention discloses an opinion vocabulary expansion system and an opinion vocabulary expansion method, wherein the opinion vocabulary expansion method comprises the following steps: calculating a plurality of domain representative words representing a target domain from the plurality of words; extracting a plurality of candidate opinion vocabularies from the vocabularies according to a part-of-speech combination; dividing the candidate opinion vocabularies into a plurality of clusters according to the similarity of the candidate opinion vocabularies; and selecting a plurality of positive seed words and a plurality of negative seed words from the plurality of domain representative words, and calculating the emotional tendency of each candidate opinion word of each cluster according to the positive seed words and the negative seed words.

Description

Opinion vocabulary expansion system and opinion vocabulary expansion method
Technical Field
The invention relates to an opinion vocabulary expansion system, in particular to an opinion vocabulary expansion system based on part of speech combination. The invention also relates to an opinion vocabulary expansion method adopted by the opinion vocabulary expansion system.
Background
The expansion and construction of an opinion vocabulary is the foundation of opinion analysis, and determining the part of speech of opinion vocabularies is also an important part of it. Generally, there are three ways to expand and build opinion vocabularies: (1) manual approach: the required opinion vocabularies are collected and built by hand; (2) dictionary-based approach: existing opinion vocabularies are expanded by using an existing dictionary together with synonym and antonym resources, or any other resource that records relations between words; (3) corpus-based approach: rules such as the part of speech and context of the opinion vocabularies to be captured are learned through statistics or observation, and the required opinion vocabularies are then found in the corpus by formulating rules.
However, the manual approach to expanding and building opinion vocabularies is inefficient and cannot effectively increase their coverage; the dictionary-based and corpus-based approaches likewise cannot effectively increase coverage.
The part of speech of opinion vocabularies is generally determined by the same three approaches. The manual approach achieves higher accuracy but is inefficient, while the dictionary-based and corpus-based approaches suffer from low precision.
Therefore, how to provide an opinion vocabulary analysis technique that effectively overcomes these limitations of the prior art has become an urgent problem.
Disclosure of Invention
In view of the above problems in the prior art, it is an object of the present invention to provide an opinion vocabulary expansion system and an opinion vocabulary expansion method, so as to solve various problems in the prior art.
According to one aspect of the present invention, an opinion vocabulary expansion system is provided, which comprises a target domain vocabulary calculation module, an opinion vocabulary extraction module, an opinion vocabulary similarity grouping module, and an opinion vocabulary emotional tendency analysis module. The target domain vocabulary calculation module can calculate a plurality of domain representative vocabularies which represent a target domain from a plurality of vocabularies. The opinion vocabulary extraction module can extract a plurality of candidate opinion vocabularies from the plurality of vocabularies according to a part-of-speech combination. The opinion vocabulary similarity grouping module can divide the candidate opinion vocabularies into a plurality of clusters according to the similarity of the candidate opinion vocabularies. The opinion vocabulary emotional tendency analysis module can select a plurality of positive seed vocabularies and a plurality of negative seed vocabularies from the plurality of domain representative vocabularies, and can calculate the emotional tendency of each candidate opinion vocabulary of each cluster according to the positive seed vocabularies and the negative seed vocabularies.
According to another aspect of the present invention, a method for expanding opinion vocabulary is provided, which comprises the following steps: calculating a plurality of domain representative words representing a target domain from the plurality of words; extracting a plurality of candidate opinion vocabularies from the vocabularies according to a part-of-speech combination; dividing the candidate opinion vocabularies into a plurality of clusters according to the similarity of the candidate opinion vocabularies; and selecting a plurality of positive seed words and a plurality of negative seed words from the plurality of domain representative words, and calculating the emotional tendency of each candidate opinion word of each cluster according to the positive seed words and the negative seed words.
In view of the above, the opinion vocabulary expansion system and the opinion vocabulary expansion method according to the present invention may have one or more of the following advantages:
(1) In an embodiment of the invention, the opinion vocabulary expansion system can extract candidate opinion vocabularies by special part-of-speech combinations including idiom types and adjective types, so that the coverage rate of the opinion vocabularies can be greatly improved.
(2) In an embodiment of the invention, the opinion vocabulary expansion system can analyze the emotional tendency of the opinion vocabulary through a more effective emotional tendency analysis step, so that the part-of-speech judgment accuracy of the opinion vocabulary can be greatly improved.
(3) In an embodiment of the invention, the opinion vocabulary expansion system can adopt a specially designed mechanism to more rapidly perform the expansion and establishment of the opinion vocabulary and the part of speech judgment of the opinion vocabulary, thereby greatly improving the efficiency.
Drawings
FIG. 1 is a block diagram of an opinion vocabulary expansion system according to a first embodiment of the present invention.
FIG. 2 is a flowchart of the first embodiment of the present invention.
FIG. 3 is a block diagram of an opinion vocabulary expansion system according to a second embodiment of the present invention.
FIG. 4 is a flowchart of the second embodiment of the present invention.
Description of reference numerals: 1-opinion vocabulary expansion system; 11-data preprocessing module; 12-target domain vocabulary calculation module; 13-opinion vocabulary extraction module; 14-opinion vocabulary similarity clustering module; 15-invalid opinion vocabulary filtering module; 16-opinion vocabulary emotional tendency analysis module; D-review database; S21-S25, S41-S46-steps.
Detailed Description
Embodiments of the opinion vocabulary expansion system and opinion vocabulary expansion method according to the present invention will be described below with reference to the related drawings, in which components may be exaggerated or reduced in size or in scale for clarity and convenience in illustration. In the following description and/or claims, when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may be present; when an element is referred to as being "directly connected" or "directly coupled" to another element, there are no intervening elements present, and other words used to describe the relationship between the elements or layers should be interpreted in the same manner. For ease of understanding, the same components in the following embodiments are illustrated with the same reference numerals.
Please refer to FIG. 1, which is a block diagram of an opinion vocabulary expansion system according to a first embodiment of the present invention. As shown in the figure, the opinion vocabulary expansion system 1 may include a data preprocessing module 11, a target domain vocabulary calculation module 12, an opinion vocabulary extraction module 13, an opinion vocabulary similarity clustering module 14, and an invalid opinion vocabulary filtering module 15.
The data preprocessing module 11 can obtain a plurality of product review articles from the review database D; the plurality of product review articles may be obtained by an automated web crawler. Then, the data preprocessing module 11 may perform word segmentation and part-of-speech tagging on the product review articles through a word segmenter to generate a plurality of words; in one embodiment, the word segmenter may be a word segmentation algorithm (e.g., Jieba).
The target domain vocabulary calculation module 12 can calculate a plurality of domain representative vocabularies representing a target domain from the vocabularies; in one embodiment, the target domain vocabulary calculation module 12 may calculate the domain representative vocabularies representing the target domain from the plurality of vocabularies by a term frequency-inverse document frequency (TF-IDF) algorithm.
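As a rough illustration of this step, the following minimal sketch scores each word of each article with TF-IDF and keeps the highest-scoring words per article. The function name `tfidf_top_words`, the `top_k` parameter, the natural-log IDF variant, and the sample articles are assumptions made for illustration and are not taken from the patent.

```python
import math
from collections import Counter

def tfidf_top_words(docs, top_k=2):
    """Return the top_k highest TF-IDF words of each tokenized, non-empty document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each word appears.
    df = Counter(word for doc in docs for word in set(doc))
    top_words = []
    for doc in docs:
        counts = Counter(doc)
        scores = {
            word: (count / len(doc)) * math.log(n_docs / df[word])
            for word, count in counts.items()
        }
        # Sort by descending score; break ties alphabetically for determinism.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        top_words.append(ranked[:top_k])
    return top_words

# Hypothetical, already-segmented review articles.
docs = [
    ["service", "meal", "service", "price"],
    ["skin", "effect", "skin", "price"],
    ["meal", "service", "taste"],
]
print(tfidf_top_words(docs, top_k=1))  # [['service'], ['skin'], ['taste']]
```

Words that appear in every article receive an IDF of zero under this variant, so they never outrank words that are specific to a few articles.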
The opinion vocabulary extraction module 13 can extract a plurality of candidate opinion vocabularies from the plurality of vocabularies according to the part-of-speech combination; in one embodiment, the part-of-speech combination may be generated according to the definitions of a word segmentation algorithm (e.g., Jieba); for example, the part-of-speech combination may include an adverbial verb type, a verb type, an adverb plus verb type, and an adverb plus adverbial verb type, and may further be extended into an idiom type and an adjective type.
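One way to picture extraction by part-of-speech combination is a left-to-right scan over tagged tokens that prefers longer combinations. The sketch below assumes the input is a list of (word, tag) pairs using Jieba-style tags (`d` adverb, `v` verb, `vd` adverbial verb); the pattern list, the function name `extract_candidates`, and the English sample tokens are hypothetical and only illustrate the idea.

```python
# Illustrative part-of-speech combinations (an assumption, not the patent's table).
PATTERNS = [("d", "vd"), ("d", "v"), ("vd",), ("v",)]

def extract_candidates(tagged, patterns=PATTERNS):
    """Scan (word, tag) pairs left to right, trying longer patterns first."""
    ordered = sorted(patterns, key=len, reverse=True)
    candidates = []
    i = 0
    while i < len(tagged):
        for pattern in ordered:
            window = tagged[i:i + len(pattern)]
            if tuple(tag for _, tag in window) == pattern:
                candidates.append(" ".join(word for word, _ in window))
                i += len(pattern)  # consume the matched tokens
                break
        else:
            i += 1  # no pattern starts at this token
    return candidates

# Hypothetical tagged sentence (English glosses stand in for segmented Chinese).
tagged = [("skin", "n"), ("very", "d"), ("like", "v"), ("this", "r"), ("recommend", "v")]
print(extract_candidates(tagged))  # ['very like', 'recommend']
```

Trying longer patterns first keeps an adverb attached to the verb it modifies instead of emitting the bare verb.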
The opinion vocabulary similarity clustering module 14 may divide the candidate opinion vocabularies into a plurality of clusters according to their similarity; in one embodiment, the opinion vocabulary similarity clustering module 14 may use a Single-Pass algorithm together with a Levenshtein distance algorithm to calculate the similarity of the candidate opinion vocabularies and divide them into the clusters.
The invalid opinion vocabulary filtering module 15 may calculate the pointwise mutual information (PMI) of each pairwise combination of the candidate opinion vocabularies and the domain representative vocabularies, so as to filter out part of the invalid opinion vocabularies from the candidate opinion vocabularies.
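The filtering idea can be sketched with document-level co-occurrence counts. The text does not state how the PMI values are turned into a keep-or-discard decision, so the rule used here (keep a candidate when its best PMI against any domain representative word exceeds a threshold) is an assumption, as are the function names and the sample data.

```python
import math

def pmi(word_a, word_b, docs):
    """PMI of two words, with probabilities estimated from document co-occurrence."""
    n = len(docs)
    count_a = sum(1 for d in docs if word_a in d)
    count_b = sum(1 for d in docs if word_b in d)
    count_ab = sum(1 for d in docs if word_a in d and word_b in d)
    if count_ab == 0:
        return float("-inf")  # the two words never co-occur
    return math.log2(count_ab * n / (count_a * count_b))

def filter_candidates(candidates, domain_words, docs, threshold=0.0):
    """Assumed rule: keep candidates whose best PMI with a domain word exceeds threshold."""
    return [
        c for c in candidates
        if max(pmi(c, w, docs) for w in domain_words) > threshold
    ]

# Hypothetical articles, each reduced to its set of words.
docs = [
    {"skin", "smooth"},
    {"skin", "smooth", "effect"},
    {"service", "slow"},
    {"service", "today"},
    {"skin", "today"},
    {"effect", "today"},
]
kept = filter_candidates(["smooth", "slow", "today"], ["skin", "service"], docs)
print(kept)  # ['smooth', 'slow']
```

In this toy data "today" appears with the domain words no more often than chance would predict, so it is dropped as an invalid candidate.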
As can be seen from the above, the opinion vocabulary expansion system 1 may extract candidate opinion vocabularies by special part-of-speech combinations including an adverbial verb type, a verb type, an adverb plus verb type, and an adverb plus adverbial verb type, and the part-of-speech combinations may further be extended into idiom types and adjective types, thereby greatly increasing the coverage of the opinion vocabularies.
Of course, the above description is only an example, and the components of the opinion vocabulary expansion system 1 and the coordination relationship thereof may also vary according to the actual requirement, and the invention is not limited thereto.
Please refer to FIG. 2, which is a flowchart of the first embodiment of the present invention. As shown in the figure, the opinion vocabulary expansion method adopted by the opinion vocabulary expansion system 1 can comprise the following steps:
Step S21: Word segmentation and part-of-speech tagging are performed on a plurality of product review articles through a word segmenter to generate a plurality of vocabularies.
Step S22: A plurality of domain representative vocabularies representing a target domain are calculated from the plurality of vocabularies.
Step S23: A plurality of candidate opinion vocabularies are extracted from the vocabularies according to a part-of-speech combination.
Step S24: The candidate opinion vocabularies are divided into a plurality of clusters according to their similarity.
Step S25: The pointwise mutual information of each pairwise combination of the candidate opinion vocabularies and the domain representative vocabularies is calculated, so as to filter out part of the invalid opinion vocabularies from the candidate opinion vocabularies.
Please refer to FIG. 3, which is a block diagram of an opinion vocabulary expansion system according to a second embodiment of the present invention; this embodiment takes the food field and the makeup field as examples. As shown in the figure, the opinion vocabulary expansion system 1 may include a data preprocessing module 11, a target domain vocabulary calculation module 12, an opinion vocabulary extraction module 13, an opinion vocabulary similarity clustering module 14, an invalid opinion vocabulary filtering module 15 and an opinion vocabulary emotional tendency analysis module 16.
The data preprocessing module 11 can obtain a plurality of food and makeup product review articles from the review database D; the multiple food and makeup product review articles can be obtained through an automatic web crawler, and invalid information in the articles can be filtered. Since word segmentation and part-of-speech tagging are required before the subsequent steps are performed, the data preprocessing module 11 can generate a plurality of words by segmenting and part-of-speech tagging the plurality of food and cosmetic product review articles through a word segmentation algorithm (e.g., Jieba), wherein the part-of-speech tagging is shown in table 1 below:
TABLE 1
The definitions of the symbols appearing in table 1 are derived from the part-of-speech table of Jieba, and should be well known to those skilled in the art, and therefore are not described herein in detail.
To determine how strongly a vocabulary is related to the food and makeup fields, the vocabularies that can represent each field must first be calculated. The target domain vocabulary calculation module 12 can mark the vocabularies tagged as nouns in all segmented food and makeup product review articles as the working vocabularies for subsequent operations, so as to assist the subsequent steps in matching candidate opinion vocabularies. Next, the target domain vocabulary calculation module 12 may use a term frequency-inverse document frequency (TF-IDF) algorithm to obtain the TF-IDF value of each selected vocabulary in each food and makeup product review article, and record the top several representative vocabularies of each article. Then, the target domain vocabulary calculation module 12 may use the number of times a representative vocabulary becomes an article representative word as a threshold, and determine the domain tendency of the representative vocabulary according to the probability that it appears in review articles of the food field and of the makeup field, so as to find a plurality of domain representative vocabularies for the two fields. In this embodiment, this stage excludes vocabularies whose probability of becoming a representative vocabulary is close to 50% in both fields, since such vocabularies represent neither the food field nor the makeup field. The output of this stage is shown in Table 2:
Vocabulary          Food tendency    Beauty makeup tendency    Main representative field
Effect              0.20%            99.80%                    Beauty makeup
Service             99.96%           0.04%                     Food
Skin and skin       0.01%            99.99%                    Beauty makeup
Dining with food    99.95%           0.05%                     Food
Skin(s)             0.04%            99.96%                    Beauty makeup
TABLE 2
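The domain-tendency step that produces output like Table 2 might be sketched as follows. The counts, the minimum-count threshold `min_count`, and the width of the excluded band around 50% (`margin`) are hypothetical values chosen for illustration; the text only states that a count threshold is used and that words close to 50% are excluded.

```python
def domain_tendency(rep_counts, min_count=5, margin=0.1):
    """Assign each word to a field from how often it was an article representative word.

    rep_counts maps word -> (times representative in food articles,
                             times representative in makeup articles).
    """
    result = {}
    for word, (food, makeup) in rep_counts.items():
        total = food + makeup
        if total < min_count:
            continue  # too rarely a representative word (assumed threshold)
        food_share = food / total
        if abs(food_share - 0.5) < margin:
            continue  # close to 50%: represents neither field
        field = "food" if food_share > 0.5 else "beauty makeup"
        result[word] = (field, food_share)
    return result

# Hypothetical counts of how often each word was a representative word.
rep_counts = {
    "service": (2499, 1),
    "effect": (1, 499),
    "price": (260, 240),
    "rare": (1, 0),
}
for word, (field, share) in domain_tendency(rep_counts).items():
    print(f"{word}: {field} (food tendency {share:.2%})")
```

Here "price" is dropped because its tendency is near 50%, and "rare" is dropped because it was a representative word too few times.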
The opinion vocabulary extraction module 13 can extract a plurality of candidate opinion vocabularies from the plurality of vocabularies according to the part-of-speech combination; in one embodiment, the part-of-speech combination may be generated according to the definitions of a word segmentation algorithm (e.g., Jieba), as shown in Table 3:
TABLE 3
The definitions of the symbols appearing in table 3 are derived from the part-of-speech table of Jieba, and should be well known to those skilled in the art, and therefore are not described herein in detail.
The part-of-speech combination can be further extended into idiom types, as shown in Table 4:
TABLE 4
The part-of-speech combination may be further extended into adjective types, as shown in Table 5:
Rule     Example
N+A      Texture/freshness
A+N      Skin tone/evenness
V+A      Not enough/lasting
A+V      Easy/absorb
ADV+A    Super smooth and tender
TABLE 5
Taking the adjective as the reference point, the opinion vocabulary extraction module 13 searches the words before and after it for the related vocabulary rules; combinations that contain a noun are reduced to the adjective alone in subsequent steps, while the other combinations keep the form of opinion phrases.
In Tables 4 and 5, N represents a noun; I represents an interjection; ADV represents an adverb; U represents an auxiliary word; V represents a verb; A represents an adjective.
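The adjective-centered rules of Table 5 can be illustrated with a small sketch that checks each adjacent pair of tagged words against the five rules, reduces noun combinations to the adjective, and keeps the other combinations as phrases. The function name `adjective_candidates`, the pairwise scan, and the sample tokens are assumptions for illustration.

```python
# The five two-word rules listed in Table 5.
ADJ_RULES = {("N", "A"), ("A", "N"), ("V", "A"), ("A", "V"), ("ADV", "A")}

def adjective_candidates(tagged):
    """Match adjacent (word, tag) pairs against the adjective rules."""
    results = []
    for (w1, t1), (w2, t2) in zip(tagged, tagged[1:]):
        if (t1, t2) not in ADJ_RULES:
            continue
        if "N" in (t1, t2):
            # Noun combinations are reduced to the adjective alone.
            results.append(w1 if t1 == "A" else w2)
        else:
            # Other combinations are kept as opinion phrases.
            results.append(f"{w1} {w2}")
    return results

# Hypothetical tagged tokens (English glosses stand in for segmented Chinese).
tagged = [("texture", "N"), ("fresh", "A"), ("super", "ADV"), ("smooth", "A")]
print(adjective_candidates(tagged))  # ['fresh', 'super smooth']
```

Because the scan looks at every adjacent pair, one adjective can contribute to more than one match; a fuller implementation would need to decide how such overlaps are resolved.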
The opinion vocabulary similarity clustering module 14 may calculate the similarities of the candidate opinion vocabularies by using a Single-Pass algorithm and a Levenshtein distance algorithm, and may divide the candidate opinion vocabularies into the clusters, wherein the formula of the Levenshtein distance algorithm is as follows:
$$\text{Levenshtein distance} = 1 - \frac{\text{number of edits}}{\operatorname{Max}(\text{length of string 1},\ \text{length of string 2})} \qquad (1)$$
The "number of edits" in the numerator of formula (1) is the number of operations needed to make the two strings of the compared pair [string 1, string 2] identical, where the editing operations are character insertion, character deletion, and character replacement; Max (length of string 1, length of string 2) in the denominator is the length of the longer string in the compared pair.
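Formula (1) can be written out as a short sketch. The edit count uses the standard dynamic-programming computation of the Levenshtein distance; the function names and the convention that two empty strings have similarity 1.0 are assumptions added here.

```python
def edit_count(s1, s2):
    """Minimum number of insertions, deletions and replacements turning s1 into s2."""
    prev = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, start=1):
        curr = [i]
        for j, c2 in enumerate(s2, start=1):
            curr.append(min(
                prev[j] + 1,               # delete c1
                curr[j - 1] + 1,           # insert c2
                prev[j - 1] + (c1 != c2),  # replace, or keep if equal
            ))
        prev = curr
    return prev[-1]

def similarity(s1, s2):
    """Formula (1): 1 - number of edits / Max(length of string 1, length of string 2)."""
    longest = max(len(s1), len(s2))
    if longest == 0:
        return 1.0  # assumed convention for two empty strings
    return 1 - edit_count(s1, s2) / longest

print(edit_count("kitten", "sitting"))  # 3
print(similarity("abcd", "abcf"))       # 0.75
```

The value ranges from 0 (no characters can be kept) to 1 (identical strings), so a higher value means the two strings are more alike.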
The Single-Pass algorithm may comprise the following steps:
Step one: A vocabulary is taken from the vocabulary set; when no clustering result exists yet, this vocabulary becomes the first cluster and also the representative word of that cluster.
Step two: Each remaining vocabulary is taken out in turn, and its string similarity (Levenshtein distance) to the representative word of each existing cluster is calculated.
Step three: If the similarity reaches the threshold, the vocabulary is added to that cluster, and the representative word of the cluster is recalculated, with high frequency as the basis for selection.
Step four: If the vocabulary cannot be assigned to any existing cluster, it forms a new cluster of its own and serves as that cluster's representative word.
Step five: Steps two to four are repeated until all vocabularies have been clustered.
In the above manner, the opinion vocabulary similarity clustering module 14 may divide the candidate opinion vocabularies into the clusters.
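The five steps above can be sketched as follows. The similarity function follows formula (1); the threshold value, the choice of joining the most similar cluster when several qualify, and the use of a word-frequency table `freq` to pick the representative word are assumptions made to obtain a runnable example.

```python
def similarity(s1, s2):
    """1 - edits / max length, with edits computed by dynamic programming."""
    prev = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, start=1):
        curr = [i]
        for j, c2 in enumerate(s2, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (c1 != c2)))
        prev = curr
    longest = max(len(s1), len(s2))
    return 1.0 if longest == 0 else 1 - prev[-1] / longest

def single_pass(words, freq, threshold=0.5):
    """Cluster words in one pass by comparing each word with cluster representatives."""
    clusters = []  # each cluster: {"rep": representative word, "members": [...]}
    for word in words:
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = similarity(word, cluster["rep"])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            # Steps one and four: the word starts its own cluster.
            clusters.append({"rep": word, "members": [word]})
        else:
            # Step three: join the cluster, then re-pick the most frequent member.
            best["members"].append(word)
            best["rep"] = max(best["members"], key=lambda w: freq.get(w, 0))
    return clusters

# Hypothetical candidate words and corpus frequencies.
words = ["smooth", "smoth", "slow", "sloow", "fresh"]
freq = {"smooth": 10, "smoth": 1, "slow": 3, "sloow": 7, "fresh": 2}
for cluster in single_pass(words, freq):
    print(cluster["rep"], cluster["members"])
```

Because every word is compared only with one representative per cluster, the cost grows with the number of clusters rather than with the number of words already processed, although the result depends on the order in which words arrive.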
The invalid opinion vocabulary filtering module 15 may calculate the pointwise mutual information (PMI) of each pairwise combination of the candidate opinion vocabularies and the domain representative vocabularies, so as to filter out part of the invalid opinion vocabularies from the candidate opinion vocabularies.
Finally, the opinion vocabulary emotional tendency analysis module 16 may select a plurality of positive seed vocabularies and a plurality of negative seed vocabularies from the plurality of domain representative vocabularies, and may calculate emotional tendency of each candidate opinion vocabulary of each cluster according to the plurality of positive seed vocabularies and the plurality of negative seed vocabularies through an emotional tendency point mutual information (SO-PMI) algorithm; the emotional tendency point mutual information (SO-PMI) algorithm adopted in this embodiment is shown in formula (2):
Here, SO-PMI(word) denotes the result of the emotional tendency point mutual information calculation for a candidate opinion vocabulary.
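Formula (2) is not reproduced in this text. As a hedged illustration only, the sketch below uses the commonly cited form of SO-PMI, in which the PMI values of a word with the positive seeds are summed and the PMI values with the negative seeds are subtracted; whether this matches the patent's formula (2) exactly is an assumption, as are the smoothing constant `eps`, the function names, and the sample data.

```python
import math

def pmi(word_a, word_b, docs, eps=0.01):
    """Smoothed PMI from document co-occurrence; eps avoids log(0) and division by zero."""
    n = len(docs)
    count_a = sum(1 for d in docs if word_a in d)
    count_b = sum(1 for d in docs if word_b in d)
    count_ab = sum(1 for d in docs if word_a in d and word_b in d)
    return math.log2((count_ab + eps) * n / ((count_a + eps) * (count_b + eps)))

def so_pmi(word, positive_seeds, negative_seeds, docs):
    """Assumed common form: sum of PMI with positive seeds minus sum with negative seeds."""
    positive = sum(pmi(word, seed, docs) for seed in positive_seeds)
    negative = sum(pmi(word, seed, docs) for seed in negative_seeds)
    return positive - negative

# Hypothetical articles and seed words.
docs = [
    {"smooth", "good"},
    {"smooth", "good"},
    {"sticky", "bad"},
    {"sticky", "bad"},
    {"smooth", "bad"},
]
for word in ("smooth", "sticky"):
    score = so_pmi(word, ["good"], ["bad"], docs)
    print(word, "positive" if score > 0 else "negative")
```

A score above zero suggests a positive tendency and a score below zero a negative one; how values near zero are treated is left open here.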
In this embodiment, the seed set is shown in table 6:
TABLE 6
Therefore, the opinion vocabulary expansion system 1 can extract candidate opinion vocabularies through special part-of-speech combinations, and the part-of-speech combinations can further extend idiom types and adjective types, so that the coverage rate of the opinion vocabularies can be greatly improved; in addition, the opinion vocabulary expansion system 1 can perform emotion tendency analysis of the opinion vocabulary through more effective emotion tendency analysis steps, so that the accuracy and efficiency of part of speech judgment of the opinion vocabulary can be greatly improved. Therefore, the opinion vocabulary expansion system 1 can effectively improve the deficiency of the prior art.
It is worth mentioning that the amplification and establishment of the opinion vocabulary are usually performed manually, in a dictionary-based manner or in a corpus-based manner at present; however, the manual method for expanding and building the opinion vocabulary is inefficient, and cannot effectively increase the coverage of the opinion vocabulary, and the dictionary-based method and the corpus-based method also have the problem of being unable to effectively increase the coverage of the opinion vocabulary. On the contrary, according to the embodiment of the invention, the opinion vocabulary expansion system can extract the candidate opinion vocabulary through the special part-of-speech combination including idiom type and adjective type, thereby greatly improving the coverage rate of the opinion vocabulary.
At present, the part of speech of opinion vocabularies is generally determined manually, by a dictionary-based approach, or by a corpus-based approach. The manual approach achieves higher accuracy but is inefficient, while the dictionary-based and corpus-based approaches suffer from low precision. In contrast, according to the embodiment of the invention, the opinion vocabulary expansion system can analyze the emotional tendency of the opinion vocabularies through a more effective emotional tendency analysis step, so that the part-of-speech judgment accuracy of the opinion vocabularies can be greatly improved; the opinion vocabulary expansion system can also adopt a specially designed mechanism to perform the expansion and construction of the opinion vocabularies and their part-of-speech judgment more quickly, so that the efficiency can be greatly improved. From the above, the present invention satisfies the patentability requirement of inventive step.
Please refer to FIG. 4, which is a flowchart of the second embodiment of the present invention. As shown in the figure, the opinion vocabulary expansion method adopted by the opinion vocabulary expansion system 1 can comprise the following steps:
Step S41: Word segmentation and part-of-speech tagging are performed on a plurality of product review articles through a word segmentation algorithm (e.g., Jieba) to generate a plurality of vocabularies.
Step S42: A plurality of domain representative vocabularies representing a target domain are calculated from the plurality of vocabularies through a term frequency-inverse document frequency (TF-IDF) algorithm.
Step S43: A plurality of candidate opinion vocabularies are extracted from the vocabularies according to a part-of-speech combination.
Step S44: The candidate opinion vocabularies are divided into a plurality of clusters according to their similarity through a Single-Pass algorithm and a Levenshtein distance algorithm.
Step S45: The pointwise mutual information (PMI) of each pairwise combination of the candidate opinion vocabularies and the domain representative vocabularies is calculated, so as to filter out part of the invalid opinion vocabularies from the candidate opinion vocabularies.
Step S46: A plurality of positive seed vocabularies and a plurality of negative seed vocabularies are selected from the plurality of domain representative vocabularies, and the emotional tendency of each candidate opinion vocabulary of each cluster is calculated according to the positive seed vocabularies and the negative seed vocabularies through an emotional tendency point mutual information (SO-PMI) algorithm.
In summary, according to the embodiment of the invention, the opinion vocabulary expansion system can extract the candidate opinion vocabularies by the special part-of-speech combination including idiom type and adjective type, so as to greatly improve the coverage of the opinion vocabularies.
In addition, according to the embodiment of the invention, the opinion vocabulary expansion system can carry out emotional tendency analysis on the opinion vocabularies through more effective emotional tendency analysis steps, so that the part-of-speech judgment accuracy of the opinion vocabularies can be greatly improved.
In addition, according to the embodiment of the invention, the opinion vocabulary expansion system can adopt a specially designed mechanism to more rapidly perform the augmentation and establishment of the opinion vocabulary and the part of speech judgment of the opinion vocabulary, so that the efficiency can be greatly improved.
The foregoing is by way of example only, and not limiting. Any other equivalent modifications or variations without departing from the spirit and scope of the present invention should be included in the protection scope of the present application.

Claims (20)

1. An opinion vocabulary expansion system, comprising:
The target field vocabulary calculation module is used for calculating a plurality of field representative vocabularies representing a target field from the vocabularies;
an opinion vocabulary extraction module, which extracts a plurality of candidate opinion vocabularies from the vocabularies according to a part-of-speech combination;
The opinion vocabulary similarity grouping module is used for grouping the candidate opinion vocabularies into a plurality of clusters according to the similarity of the candidate opinion vocabularies; and
And the opinion vocabulary emotional tendency analysis module selects a plurality of positive seed vocabularies and a plurality of negative seed vocabularies from the plurality of domain representative vocabularies and calculates the emotional tendency of each candidate opinion vocabulary of each cluster according to the positive seed vocabularies and the negative seed vocabularies.
2. the system of claim 1, further comprising a data preprocessing module for generating the words by segmenting and word tagging product review articles with a word segmenter.
3. the opinion vocabulary expansion system of claim 2 wherein the word segmenter is a word segmentation algorithm.
4. The system of claim 1, further comprising an invalid opinion vocabulary filtering module for respectively calculating mutual information between each point of the candidate opinion vocabularies and each of the domain representative vocabularies in pairwise combination to filter out part of the invalid opinion vocabularies from the candidate opinion vocabularies.
5. The opinion vocabulary expansion system of claim 1 wherein the part-of-speech combinations are generated according to a definition of a word-breaking algorithm.
6. the opinion vocabulary expansion system of claim 5 wherein the part-of-speech combinations include an verb type, a verb type, an ancillary plus verb type, and an ancillary plus ancillary verb type.
7. The system of claim 6, wherein the part of speech combination further comprises a word type and an adjective type.
8. The system of claim 1, wherein the target domain vocabulary calculation module calculates the domain representatives representing the target domain from the plurality of vocabularies by a word frequency-inverse file frequency algorithm.
9. The system of claim 1, wherein the opinion vocabulary similarity clustering module calculates the similarity of the candidate opinion vocabularies by using a one-time clustering algorithm and a Levensian distance algorithm, and divides the candidate opinion vocabularies into the clusters.
10. The system of claim 1, wherein the opinion vocabulary emotion tendency analysis module calculates emotion tendencies of the candidate opinion vocabularies of each cluster according to the positive seed vocabularies and the negative seed vocabularies through an emotion tendency point mutual information algorithm.
11. An opinion vocabulary expansion method is characterized by comprising the following steps:
calculating a plurality of domain representative words representing a target domain from the plurality of words;
extracting a plurality of candidate opinion vocabularies from the vocabularies according to a part-of-speech combination;
Dividing the candidate opinion vocabularies into a plurality of clusters according to the similarity of the candidate opinion vocabularies; and
Selecting a plurality of positive seed words and a plurality of negative seed words from the plurality of domain representative words, and calculating the emotional tendency of each candidate opinion word of each cluster according to the positive seed words and the negative seed words.
12. The method of claim 11, further comprising the steps of:
And performing word segmentation and part-of-speech tagging on a plurality of product review articles through a word segmentation device to generate a plurality of words.
13. The method of claim 12, wherein the word segmentation unit is a word segmentation algorithm.
14. The method of claim 11, further comprising the steps of:
And respectively calculating point-to-point mutual information of various pairwise combinations of the candidate opinion vocabularies and the field representative vocabularies so as to filter partial invalid opinion vocabularies from the candidate opinion vocabularies.
15. The method of claim 11, wherein the part-of-speech combination is generated according to a definition of a word-breaking algorithm.
16. The method of claim 15, wherein the part-of-speech combinations include an verb type, a verb type, an ancillary plus verb type, and an ancillary plus verb type.
17. The method of claim 16, wherein the part-of-speech combination further includes a type of a word and a type of an adjective.
18. The method of claim 11, wherein the plurality of domain representative words representing the target domain are calculated by a term frequency-inverse document frequency (TF-IDF) algorithm.
19. The method of claim 11, wherein the similarity of the candidate opinion vocabularies is calculated by a single-pass clustering algorithm and a Levenshtein distance algorithm, and the candidate opinion vocabularies are divided into the clusters accordingly.
20. The method of claim 11, wherein the emotional tendency of each candidate opinion vocabulary in each cluster is calculated by a semantic orientation pointwise mutual information (SO-PMI) algorithm.
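As a rough illustration of the clustering and emotional tendency steps recited in the method claims, the following Python sketch groups candidate words by Levenshtein distance in a single pass and scores a word against seed words with pointwise mutual information. The claims name only the algorithms; the distance threshold, the use of document-level co-occurrence, the treatment of unseen pairs as neutral, and all function names and sample data are assumptions made for this example.

```python
import math


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def single_pass_cluster(words, max_distance=2):
    """Scan the words once; put each word into the first cluster whose
    first member is within max_distance, else start a new cluster.
    The threshold value is an assumption, not taken from the claims."""
    clusters = []
    for word in words:
        for cluster in clusters:
            if levenshtein(word, cluster[0]) <= max_distance:
                cluster.append(word)
                break
        else:
            clusters.append([word])
    return clusters


def pmi(w1, w2, docs):
    """Pointwise mutual information from document-level co-occurrence.
    docs is a list of token sets (an assumed representation)."""
    n = len(docs)
    c1 = sum(1 for d in docs if w1 in d)
    c2 = sum(1 for d in docs if w2 in d)
    c12 = sum(1 for d in docs if w1 in d and w2 in d)
    if c1 == 0 or c2 == 0 or c12 == 0:
        return 0.0  # assumption: pairs never seen together count as neutral
    return math.log2((c12 / n) / ((c1 / n) * (c2 / n)))


def so_pmi(word, positive_seeds, negative_seeds, docs):
    """Semantic orientation: summed PMI with positive seeds minus summed
    PMI with negative seeds. A positive score suggests positive tendency."""
    pos = sum(pmi(word, s, docs) for s in positive_seeds)
    neg = sum(pmi(word, s, docs) for s in negative_seeds)
    return pos - neg


# Hypothetical tokenized product reviews and seed words.
docs = [
    {"battery", "great", "durable"},
    {"screen", "great", "sharp"},
    {"battery", "poor", "laggy"},
    {"camera", "poor", "blurry"},
]
clusters = single_pass_cluster(["durable", "durabel", "sharp", "sharpe", "blurry"])
score_durable = so_pmi("durable", ["great"], ["poor"], docs)
score_laggy = so_pmi("laggy", ["great"], ["poor"], docs)
```

In this toy data, `score_durable` is positive and `score_laggy` is negative, which matches the intuition that a word co-occurring with positive seeds leans positive. Using the first member of each cluster as its representative keeps the pass single, at the cost of making the result depend on input order.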
CN201811341060.4A 2018-06-06 2018-11-12 Opinion vocabulary expansion system and opinion vocabulary expansion method Pending CN110569497A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
TW107119472A TWI675304B (en) 2018-06-06 2018-06-06 Opinion dictionary expansion system and method thereof
TW107119472 2018-06-06

Publications (1)

Publication Number Publication Date
CN110569497A true CN110569497A (en) 2019-12-13

Family

ID=68772434

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811341060.4A Pending CN110569497A (en) 2018-06-06 2018-11-12 Opinion vocabulary expansion system and opinion vocabulary expansion method

Country Status (2)

Country Link
CN (1) CN110569497A (en)
TW (1) TWI675304B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140081626A1 (en) * 2012-09-18 2014-03-20 Adobe Systems Incorporated Natural Language Vocabulary Generation and Usage
CN105117428A (en) * 2015-08-04 2015-12-02 电子科技大学 Web comment sentiment analysis method based on word alignment model
CN106610955A (en) * 2016-12-13 2017-05-03 成都数联铭品科技有限公司 Dictionary-based multi-dimensional emotion analysis method
CN106776551A (en) * 2016-12-06 2017-05-31 桂林电子科技大学 A kind of analysis method of english composition emotion viewpoint
CN107832297A (en) * 2017-11-09 2018-03-23 电子科技大学 A kind of field sentiment dictionary construction method of Feature Oriented word granularity

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8352405B2 (en) * 2011-04-21 2013-01-08 Palo Alto Research Center Incorporated Incorporating lexicon knowledge into SVM learning to improve sentiment classification
US8676730B2 (en) * 2011-07-11 2014-03-18 Accenture Global Services Limited Sentiment classifiers based on feature extraction

Also Published As

Publication number Publication date
TWI675304B (en) 2019-10-21
TW202001619A (en) 2020-01-01

Similar Documents

Publication Publication Date Title
CN109146610B (en) Intelligent insurance recommendation method and device and intelligent insurance robot equipment
CN106844346B (en) Short text semantic similarity discrimination method and system based on deep learning model Word2Vec
CN109241538B (en) Chinese entity relation extraction method based on dependency of keywords and verbs
CN109190117B (en) Short text semantic similarity calculation method based on word vector
CN109299480B (en) Context-based term translation method and device
CN107463548B (en) Phrase mining method and device
US20120030157A1 (en) Training data generation apparatus, characteristic expression extraction system, training data generation method, and computer-readable storage medium
CN106095749A (en) A kind of text key word extracting method based on degree of depth study
CN113535974B (en) Diagnostic recommendation method and related device, electronic equipment and storage medium
CN106445921B (en) Utilize the Chinese text terminology extraction method of quadratic mutual information
EP3232336A1 (en) Method and device for recognizing stop word
Torres-Moreno Artex is another text summarizer
JP6558863B2 (en) Model creation device, estimation device, method, and program
CN106570120A (en) Process for realizing searching engine optimization through improved keyword optimization
CN113033183A (en) Network new word discovery method and system based on statistics and similarity
CN110929022A (en) Text abstract generation method and system
CN109255014A (en) The recognition methods of file keyword accuracy is promoted based on many algorithms
CN113239150A (en) Text matching method, system and equipment
CN110569497A (en) Opinion vocabulary expansion system and opinion vocabulary expansion method
Kim et al. Word2Vec based spelling correction method of Twitter message
CN110597982A (en) Short text topic clustering algorithm based on word co-occurrence network
CN106933797B (en) Target information generation method and device
CN112699831B (en) Video hotspot segment detection method and device based on barrage emotion and storage medium
JP2014149869A (en) Chinese character compound word division device
JP2015046183A (en) Dialogue device, method, and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20191213
