WO2010132062A1 - System and Methods for Sentiment Analysis - Google Patents

System and Methods for Sentiment Analysis

Info

Publication number
WO2010132062A1
WO2010132062A1 (PCT/US2009/044197)
Authority
WO
WIPO (PCT)
Prior art keywords
sentences
comparative
entities
sentiment
entity
Prior art date
Application number
PCT/US2009/044197
Other languages
English (en)
Inventor
Liu Bing
Ding Xiaowen
Original Assignee
The Board Of Trustees Of The University Of Illinois
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by The Board Of Trustees Of The University Of Illinois filed Critical The Board Of Trustees Of The University Of Illinois
Priority to PCT/US2009/044197 priority Critical patent/WO2010132062A1/fr
Publication of WO2010132062A1 publication Critical patent/WO2010132062A1/fr

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis

Definitions

  • the present disclosure relates generally to data mining techniques, and more specifically to a system and methods for sentiment analysis.
  • FIG. 1 depicts an illustrative embodiment of a method for assigning entities
  • FIG. 2 depicts an illustrative embodiment of a method for identifying entities using a variety of seeds
  • FIG. 3 depicts an illustrative diagrammatic representation of a machine in the form of a computer system within which a set of instructions, when executed, may cause the machine to perform any one or more of the methodologies disclosed herein;
  • Table 1 depicts an illustrative embodiment of a plurality of data sets associated with two forums
  • Table 2 depicts an illustrative embodiment of experimental results for entity identification
  • Table 3 depicts an illustrative embodiment of experimental results for entity assignment
  • Table 4 depicts an illustrative embodiment of part-of-speech (POS) tags.
  • One embodiment of the present disclosure entails identifying a plurality of entities in opinionated text generated by a plurality of users, each user expressing one or more opinions about at least one of the plurality of entities, identifying a plurality of comparative sentences and a plurality of non-comparative sentences in the opinionated text, identifying inferior and superior entities from the plurality of entities according to a plurality of comparative opinions determined from the plurality of comparative sentences, determining a semantic orientation for each of the plurality of non-comparative sentences, and assigning at least a portion of the superior and inferior entities to one of the plurality of comparative sentences and the plurality of non-comparative sentences according to the determined semantic orientation of the plurality of non-comparative sentences, the plurality of comparative opinions, and sentiment consistency between consecutive sentences in the opinionated text.
  • An embodiment of the present disclosure entails a computer-readable storage medium having computer instructions to identify a plurality of entities in opinionated text, identify a plurality of comparative sentences and a plurality of non-comparative sentences in the opinionated text, identify inferior and superior entities from the plurality of entities according to a plurality of comparative opinions determined from the plurality of comparative sentences, and assign at least a portion of the superior and inferior entities to one of the plurality of comparative sentences and the plurality of non-comparative sentences according to the plurality of comparative opinions, sentiment consistency between consecutive sentences in the opinionated text, and a semantic orientation of the plurality of non-comparative sentences.
  • Another embodiment of the present disclosure entails an evaluation system having a controller to identify a plurality of entities in opinionated text, identify inferior and superior entities from the plurality of entities according to a plurality of comparative opinions determined from a plurality of comparative sentences in the opinionated text, and assign at least a portion of the superior and inferior entities to one of the plurality of comparative sentences and a plurality of non-comparative sentences of the opinionated text according to the plurality of comparative opinions, sentiment consistency between consecutive sentences in the opinionated text, and a semantic orientation of the plurality of non-comparative sentences.
  • a popular semantic level analysis is sentiment analysis or opinion mining, which tries to discover user opinions about products and services.
  • Such studies are mainly conducted in the context of product reviews [4, 6, 8, 9, 18, 19, 20, 22] due to the fact that reviews are focused on the entities being reviewed and contain little irrelevant information.
  • the first problem is similar to a named entity recognition (NER) problem.
  • common NER methods do not work well because of the ungrammatical nature of the forum posts, over-capitalization and under-capitalization. Over-capitalization means that the user may capitalize every word in the sentence, and under-capitalization means that the first letters of many entity names are not capitalized. These cause serious problems for existing entity recognition methods.
  • the second problem bears some resemblance to pronoun resolution [3, 24, 25] in natural language processing (NLP), which identifies what each pronoun in a sentence refers to. Pronoun resolution is still a major challenge in NLP.
  • Example 1 "(1) I bought Camera-A yesterday. (2) I took some pictures in the evening in my living room. (3) The images are very clear.
  • a simple approach to identifying the entities talked about in each sentence is the following: The algorithm sequentially processes each sentence. Whenever an entity name is encountered in a sentence, it is assumed that the sentence talks about that entity. It is also assumed that the subsequent sentences talk about that entity as well until a new entity name occurs. Then the new entity is the one talked about in its sentence. The subsequent sentences also talk about the new entity, and so on. This simple strategy works reasonably well in practice. However, it breaks down when a comparative sentence is encountered.
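  • As a rough illustration only (not the full method of the present disclosure, which adds the comparative-sentence handling described below), the following Python sketch implements this simple "most recently mentioned entity" strategy; the function name, entity list and sentences are hypothetical.

```python
# Minimal sketch of the simple "most recent entity" assignment strategy.
# Entity names and sentences are hypothetical examples.

def assign_entities(sentences, known_entities):
    """Assign to each sentence the entity mentioned most recently."""
    assignments = []
    current = None
    for sent in sentences:
        for entity in known_entities:
            if entity.lower() in sent.lower():
                current = entity          # a new entity name switches the topic
        assignments.append(current)       # later sentences inherit the entity
    return assignments

sentences = [
    "I bought Camera-A yesterday.",
    "I took some pictures in the evening in my living room.",
    "The images are very clear.",
]
print(assign_entities(sentences, ["Camera-A", "Camera-B"]))
# ['Camera-A', 'Camera-A', 'Camera-A']
```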
  • Example 2 "(1) I bought Camera-A yesterday. (2) I took a few pictures in the evening in my living room. (3) The images are very clear. (4) They are definitely better than those from my old Camera-B. (5) The pictures of that camera were blurring for night shots, but for day shots it was ok"
  • Example 2 is the same as Example 1 except for the last sentence. Obviously, sentence (5) of Example 2 talks about Camera-B. The above algorithm does not work with Example 2. Since the disclosed method does not rely on pronouns, it has two advantages.
  • sentence (5) in Example 2, which expresses a negative sentiment in its first clause, should refer to the inferior product.
  • This phenomenon can be called sentiment consistency, which says that consecutive sentiment expressions should be consistent with each other. It would be ambiguous if this consistency were not observed in writing.
  • a sentiment analysis method for direct opinions can be adapted to solve the aforementioned problem.
  • by a direct opinion [17] is meant a sentence or a clause that directly expresses a positive or negative opinion on an entity or a feature of the entity, such as illustrated by sentence (5) of Example 1.
  • Direct opinions are in contrast to comparative opinions.
  • a comparative opinion does not directly express a positive or negative opinion on anything but expresses a preference of some entities.
  • sentence (4) expresses a comparative opinion, i.e., "Camera-A" is superior or preferred to "Camera-B" when comparing their images. It can be observed that sentence (4) alone does not say any camera is good or bad, but just states a comparison.
  • NER aims to identify entities such as persons, organizations and locations in natural language text.
  • references [6, 11] have studied the problem in the context of comparative sentences. The methods described in these references exploit specific structures of such sentences for extraction.
  • the present disclosure is more general and not focused on comparative sentences.
  • the present disclosure can also be different from classic NER as there is only an interest in product-type entities.
  • Reference [2] provides good surveys of existing information extraction algorithms.
  • Conditional random fields (CRF) [16] have been shown to perform the best so far. It will be shown that the method described in the present disclosure outperforms CRF dramatically for our task.
  • sentiment classification investigates ways to classify whole product reviews as positive, negative, or neutral [19, 22]. Sentiment classification is not applicable to the sentences and clauses considered in the present disclosure. Sentence-level and clause-level sentiment classification has been studied in, e.g., [15, 21, 23].
  • the present disclosure relates to feature-based sentiment analysis or opinion mining [4, 9, 18, 20], which finds sentiments expressed on product features.
  • In the above example, "photo quality" and "battery" are product features. The sentiment on "photo quality" is positive and the sentiment on "battery" is negative.
  • Sentiment words are words that express desired or undesired states. Positive words express desired states, e.g., "great" and "good". Negative words express undesired states, e.g., "bad" and "poor". Identifying sentiment words has been studied in [5, 9, 13, 14]. Several lists have been compiled.
  • Context-dependent opinions are determined based on the pair.
  • the present disclosure does not use this context definition.
  • a specification language is disclosed to enable the user to add/delete complex sentiment indicators, which can be words, phrases or other language constructs without touching the underlying program.
  • the present disclosure shows that a sentiment analysis method for analyzing direct opinions can be adapted to analyzing comparative sentences to mine comparative opinions.
  • references [11, 12] propose a method to find comparative and superlative sentences.
  • the teachings in these references do not determine superior entities expressed in comparative sentences. They only extract some useful items from sentences. Such items alone are not sufficient in determining the superior entities.
  • Reference [1] proposes a method to extract items from superlative sentences. It does not study sentiments either.
  • in reference [7], the authors tried to identify which entity has more of a certain property in a comparative sentence. Again, it is not concerned with the problem of identifying the superior entities.
  • Reference [8] studied the sentiment analysis of comparative sentences. However, it needs a large volume of external information, i.e., product reviews.
  • the basic information unit of forums, blogs and discussion boards consists of a start post and a list of follow-up posts or replies.
  • This basic information unit is often called a thread.
  • a thread t can thus be modeled as a sequence of posts, <p1, p2, ..., pn>, where p1 is the start post.
  • Each post consists of a sequence of sentences, <s1, s2, ..., sm>.
  • An entity can be a person, a product, an organization, an event, etc.
  • Entity identification: identify the set of entities E discussed in the posts of the threads.
  • Entity assignment: determine the entities in E that each sentence si of each post pj in t (t ∈ T) talks about.
  • Direct opinion: a positive or negative opinion on an entity or some feature of the entity without mentioning any other similar entities.
  • a sentence comparing the picture quality of Camera-Y with that of Camera-X is a comparative sentence, which states that "Camera-Y" is superior or preferred to "Camera-X" when comparing their "picture quality".
  • the algorithm is thus iterative. Pattern mining is employed at each iteration to find more entities based on already found entities. The iterative process ends when no new entity names are found. Pruning methods are also proposed to remove those unlikely entities.
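  • A minimal sketch of this iterative loop; the mine_patterns/extract/prune callables are placeholders for Steps 1-4 described below, and the tiny stubs merely exercise the control flow.

```python
# Sketch of the iterative discovery loop: mine patterns from sentences that
# mention the current entity set, extract new candidates, prune them, and
# repeat until no new entity names are found (a fixed point).

def discover_entities(sentences, seeds, mine_patterns, extract, prune):
    entities = set(seeds)
    while True:
        patterns = mine_patterns(sentences, entities)   # Steps 1-2
        candidates = extract(sentences, patterns)       # Step 3
        new = prune(candidates, sentences) - entities   # Step 4
        if not new:
            return entities
        entities |= new

# Trivial stubs, just to show the loop shape:
ents = discover_entities(
    sentences=["I love my n95", "the n95 beats the k700"],
    seeds={"n95"},
    mine_patterns=lambda sents, ents: ["<pattern>"],
    extract=lambda sents, pats: {"n95", "k700"},
    prune=lambda cands, sents: set(cands),
)
print(ents)   # {'n95', 'k700'} (set order may vary)
```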
  • Step 1 - Data preparation for sequential pattern mining: This step performs two tasks. It first finds all sentences that contain any one of the seed entities, e1, e2, ..., en, in the dataset, and then generates a sequence for each occurrence of ei for pattern mining.
  • the present disclosure can use only a window of 5 words before each entity name and 5 words after each entity name.
  • Each word of a seed entity name is replaced with a generic (unique) name "ENTITYXYZ”. Utilizing this generic word can ensure that generic patterns about any entities are found.
  • each entity name can consist of more than one word.
  • the part-of-speech (POS) tag of each word can also be used.
  • each element of the sequence can be a pair, POS tag of the word and the word.
  • Example 3 The sentence that follows has POS tags attached.
  • n95 is a phone model (an entity).
  • the window is (n95 has been replaced with ENTITYXYZ): mad/JJ everyone/NN doesnt/NN have/VBP a/DT ENTITYXYZ/CD phone/NN fetish/NN ducky/JJ
  • the resulting sequence is: <{JJ, mad} {NN, everyone} {NN, doesnt} {VBP, have} {DT, a} {CD, ENTITYXYZ} {NN, phone} {NN, fetish} {JJ, ducky}>
  • Table 4 depicts POS Tags used above and throughout the rest of the disclosure.
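  • A minimal sketch of this data-preparation step, assuming the sentence has already been POS-tagged; the tokens are those of Example 3, and the function name and itemset representation are illustrative choices rather than the disclosed implementation.

```python
# Sketch of Step 1: build an itemset sequence around each seed-entity
# occurrence, using a window of 5 words on each side and the generic token
# ENTITYXYZ in place of the seed entity. Each element is a {POS-tag, word} set.

WINDOW = 5

def build_sequences(tagged_tokens, seed_entities):
    """tagged_tokens: list of (word, pos_tag) pairs for one sentence."""
    seeds = {e.lower() for e in seed_entities}
    sequences = []
    for i, (word, tag) in enumerate(tagged_tokens):
        if word.lower() in seeds:
            lo, hi = max(0, i - WINDOW), min(len(tagged_tokens), i + WINDOW + 1)
            seq = []
            for j in range(lo, hi):
                w, t = tagged_tokens[j]
                if j == i:
                    w = "ENTITYXYZ"    # generic name so mined patterns generalize
                seq.append({t, w})
            sequences.append(seq)
    return sequences

# Pre-tagged tokens from Example 3 ("n95" is the seed entity):
tokens = [("mad", "JJ"), ("everyone", "NN"), ("doesnt", "NN"), ("have", "VBP"),
          ("a", "DT"), ("n95", "CD"), ("phone", "NN"), ("fetish", "NN"),
          ("ducky", "JJ")]
print(build_sequences(tokens, ["n95"]))
```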
  • Step 2 - Sequential pattern mining: Given the set of sequences generated from step 1, a sequential pattern mining algorithm is applied to generate sequential patterns. Sequential pattern mining is a popular data mining algorithm [17], which finds all patterns that appear frequently in the data. The frequency threshold, called the minimum support, is set by the user. The present disclosure uses 0.01 as the minimum support. In the present disclosure, each pattern contains {POStag, ENTITYXYZ} and has a length greater than or equal to 2.
  • An example pattern is:
  • Step 3 Pattern matching to extract candidate entities: For each sentence in the test dataset, a system can match the generated patterns to extract a set of candidate entities. The patterns can be sorted based on their supports. In order not to generate too many spurious candidates, the matching process in a sentence terminates after five patterns have been matched.
  • Example 4 The following sentence is presented with POS tags attached:
  • The/DT misses/VBZ has/VBZ currently/RB got/VBN a/DT Nokia/NNP 7390/CD at/IN the/DT end/NN of/IN the/DT day ,/VBG all/DT she/PRP does/VBZ is/VBZ text/NN and/CC make/VB calls,/NN but/CC the/DT reception/NN is/VBZ serious,/VBG where/WRB my/PRP$ 6233/CD would/MD get/VB full/JJ bars/NNS hers/PRP would/MD only/RB get/VB 1/CD or/CC 2./CD
  • the pattern <{DT}, {NNP, ENTITYXYZ}, {CD}> can match the sentence segment: a/DT Nokia/NNP 7390/CD to produce the candidate entity: "Nokia".
  • the pattern <{DT}, {NNP}, {CD, ENTITYXYZ}, {IN}> can match the sentence segment: a/DT Nokia/NNP 7390/CD at/IN to produce the candidate entity: "7390".
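  • The sketch below illustrates this matching step on the two patterns above; for brevity it matches pattern elements against contiguous tokens, whereas a full sequential-pattern matcher would also allow gaps, and it omits the support-based sorting and the five-pattern limit.

```python
# Sketch of Step 3: slide a mined pattern over a POS-tagged sentence and
# extract the word aligned with ENTITYXYZ as a candidate entity.

def match_pattern(pattern, tagged_tokens):
    """pattern: list of itemsets, e.g. [{"DT"}, {"NNP", "ENTITYXYZ"}, {"CD"}]."""
    candidates = []
    n, m = len(tagged_tokens), len(pattern)
    for start in range(n - m + 1):
        ok, candidate = True, None
        for k, element in enumerate(pattern):
            word, tag = tagged_tokens[start + k]
            if tag not in element:        # the token's POS tag must be in the itemset
                ok = False
                break
            if "ENTITYXYZ" in element:    # this position holds the entity
                candidate = word
        if ok and candidate:
            candidates.append(candidate)
    return candidates

tokens = [("a", "DT"), ("Nokia", "NNP"), ("7390", "CD"), ("at", "IN")]
print(match_pattern([{"DT"}, {"NNP", "ENTITYXYZ"}, {"CD"}], tokens))          # ['Nokia']
print(match_pattern([{"DT"}, {"NNP"}, {"CD", "ENTITYXYZ"}, {"IN"}], tokens))  # ['7390']
```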
  • Step 4 - Candidate pruning: The above pattern matching method can extract many wrong entities.
  • a pruning method based on POS check is proposed by the present disclosure. It remedies some errors made by a POS tagger system. Since an entity is always associated with a POS tag in the present patterns, this method checks in the dataset to see whether the POS tag is the most frequent one for this candidate. If it is not, the candidate entity can be eliminated (a possible POS tagging error).
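  • A minimal sketch of this POS-consistency check; the data structures and corpus here are hypothetical stand-ins, not the disclosed implementation.

```python
# Sketch of the pruning check: keep a candidate entity only if the POS tag it
# was extracted with is also its most frequent tag across the dataset.

from collections import Counter, defaultdict

def prune_by_pos(candidates, tagged_corpus):
    """candidates: list of (word, extracted_tag); tagged_corpus: (word, tag) pairs."""
    tag_counts = defaultdict(Counter)
    for word, tag in tagged_corpus:
        tag_counts[word.lower()][tag] += 1
    kept = []
    for word, extracted_tag in candidates:
        counts = tag_counts[word.lower()]
        if counts and counts.most_common(1)[0][0] == extracted_tag:
            kept.append((word, extracted_tag))
        # otherwise the extraction likely rests on a POS-tagging error: drop it
    return kept

corpus = [("a", "DT"), ("Nokia", "NNP"), ("7390", "CD"), ("Nokia", "NNP")]
print(prune_by_pos([("Nokia", "NNP"), ("Nokia", "CD")], corpus))
# [('Nokia', 'NNP')]  -- the CD-tagged extraction is pruned
```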
  • Example 5 Given the sentence:
  • Step 5 Finding additional entities using brand and model relation.
  • the second task in this step is to use the Brand to identify additional models.
  • a regular expression is used which assumes that a model name must have a digit.
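  • A sketch of such a regular expression; the only constraint taken from the disclosure is that a model name must contain a digit, so the surrounding character class is an assumption.

```python
import re

# Assumed shape of a model-name token: word characters (optionally with '-' or
# '+'), containing at least one digit somewhere.
MODEL_RE = re.compile(r"\b(?=\w*\d)[A-Za-z0-9][\w+-]*\b")

def find_model_candidates(text):
    return MODEL_RE.findall(text)

print(find_model_candidates("My Nokia 7390 beats the e398 and the k700."))
# ['7390', 'e398', 'k700']
```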
  • Step 6 - Finding more entities using syntactic patterns: Using some syntactic patterns can help find competing entities (brands and models). The syntactic patterns exploit conjunctions and comparisons in sentences. In the present disclosure, C denotes a discovered entity and CN a competitor. The following eight patterns are used:
  • The/DT correct/JJ comparison/NN was/VBD made/VBN many/JJ times/NNS as/IN e398/CD vs./IN k700/CD ./.
  • Comparative sentences express similarity and differences of more than one entity. There can be three main types of comparatives:
  • Non-equal gradable: relations of the type "greater or less than" that express a total ordering of some entities with regard to some shared features or attributes. For example, the sentence "Camera-X's battery life is longer than that of Camera-Y" orders Camera-X and Camera-Y based on their shared feature "battery life".
  • Non-gradable: comparisons of two or more entities that do not grade them.
  • the sentence, "Camera-X and Camera-Y have different shapes”, expresses a comparison of the shapes of the two cameras but does not grade them.
  • a superlative sentence expresses a relation of the type "greater or less than all others", i.e., it ranks one entity over all other entities. For example, the sentence "Of Camera-A, Camera-B and Camera-C, Camera-A is the best" ranks Camera-A above the other two cameras.
  • FIG. 1 depicts an illustrative embodiment of an algorithm based on the above disclosure.
  • the flowchart of FIG. 1 follows the simple method provided above but with special handling of comparative sentences as discussed above.
  • the input is a post, and the output is the entities discussed in each sentence.
  • the algorithm is simplified for presentation clarity.
  • the start post and quotes in replies are also considered, as entities may be inherited from them.
  • Comparative sentences here also cover superlative sentences that contain more than one entity. For a superlative sentence with only a single entity, it is treated as a normal sentence.
  • the notations used in the algorithm are:
  • opinion(): the sentiment analysis function that analyzes a non-comparative sentence.
  • compOpinion(): the sentiment analysis function that finds superior and inferior entities from a comparative sentence.
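  • A minimal sketch of how opinion() and compOpinion() can be combined, per the sentiment-consistency observation above, for a sentence that follows a comparative sentence; both functions are crude stubs standing in for the analyses described in the sections below.

```python
# Sketch of the assignment rule for a sentence following a comparative
# sentence: by sentiment consistency, a positive (or neutral) follow-up is
# assigned the superior entity and a negative follow-up the inferior entity.

def opinion(sentence):
    """Stub: +1 (positive), -1 (negative) or 0 (neutral)."""
    return -1 if "blurring" in sentence or "bad" in sentence else 1

def compOpinion(sentence, entities):
    """Stub: return (superior, inferior) entities of a comparative sentence."""
    return entities[0], entities[1]

def assign_after_comparative(comp_sentence, next_sentence, entities):
    superior, inferior = compOpinion(comp_sentence, entities)
    return superior if opinion(next_sentence) >= 0 else inferior

comp = "They are definitely better than those from my old Camera-B."
nxt = "The pictures of that camera were blurring for night shots."
print(assign_after_comparative(comp, nxt, ["Camera-A", "Camera-B"]))  # Camera-B
```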
  • SENTIMENT ANALYSIS
  • Sentiment orientations of opinions can identify whether the opinions are positive, negative or neutral. Since the present disclosure is not concerned with entity features as in references [4, 9], entity features are not used in the analysis. In an application, entity features can be discovered in various ways if needed, e.g., the method in references [9, 20]. There are three main sentiment indicators, i.e., sentiment words and phrases, negations, and but-clauses. They are discussed below.
  • Sentiment Indicators
  • Sentiment words and phrases: In most cases, sentiments in sentences are expressed with sentiment (or opinion) words, e.g., "great", "good", "bad", and "poor". Although words that express sentiments are usually adjectives and adverbs, verbs and nouns can be used to express sentiments/opinions too. Researchers have compiled sets of such words. Such lists are collectively called the sentiment lexicon. Apart from individual words, there are sentiment phrases and idioms, e.g., "cost someone an arm and a leg". Furthermore, some phrases may involve sentiment words, but the whole phrases have no opinion. For example, the phrase "a great deal of" does not have an opinion although it contains the positive sentiment word "great".
  • Such phrases are called non-sentiment phrases involving sentiment words.
  • Negations: Sentiment words and phrases form the basis of opinions in a sentence. Negations reverse their orientations. Apart from "not", many other words and phrases can be used to express negations. Furthermore, "not" may not express negation in some cases, e.g., in "not only ... but also". Such phrases are called non-negations involving negation words.
  • But-clauses: "but" means contrary. For example, the sentence "The picture quality is great, but not the battery life" expresses a positive sentiment on "picture quality" but a negative sentiment on "battery life". The following rule states the effect of "but": the orientation before "but" is opposite to that after "but". Apart from the word "but", many other words and phrases behave similarly, e.g., "though" and "except that". Similar to opinions and negations, not every "but" changes sentiment direction. For example, "but" in the pattern "not only ... but also" does not. Such phrases are called non-but phrases involving "but".
  • Specification for Sentiment Indicators
  • each indicator word is represented as a rule.
  • Each rule consists of two parts, an item on the left and an action on the right.
  • the <item> is either an individual word or a word attached with a type, which may be any one of the part-of-speech (POS) tags.
  • the specification can consist of a set of rules. Each rule has two parts, a phrase on the left and an action on the right. Each phrase can have a target word, indicated by [T], to which the action is applied.
  • the idea is that the left-hand- side of the rule is first matched in the sentence and then the action of the rule is applied to the target in the sentence.
  • A phrase in a rule can be composed of indicator symbols, words, and distances, together with a target:
  • IndicatorSym: the indicator symbols Po, Ne, Neu, Ng and But, derived from the individual indicator words discussed above.
  • a "type" may also be attached, specifying the POS tag of the word.
  • Word: any word, with an optional type.
  • Distance: the number of words (or gap) that can appear between two non-distance items in the phrase. "num-num" means from the first number to the second number of words (num is an integer).
  • Target: the core item of the phrase, indicating which word the rule is applied to.
  • the action on the right states that the action symbol should be associated with the target.
  • the action symbol can be any of the outcomes or their negations, i.e.,
  • the ordering of rules can be significant. When the first rule for a target word is matched and applied, the rest will not be tried.
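  • A minimal sketch of this first-match-wins rule application; the rule set is hypothetical, and only the Ng action symbol corresponds to a symbol named in the disclosure ("NoNg" stands in for the negation of Ng).

```python
# Sketch: each rule pairs a phrase pattern (a word sequence, with None marking
# the target slot) with an action symbol; the first rule that matches around a
# target word is applied and the remaining rules are skipped.

RULES = [
    (("not", "only", None), "NoNg"),   # "not only ...": not a real negation
    (("not", None), "Ng"),             # otherwise a plain negation applies
]

def apply_rules(words, target_index):
    for pattern, action in RULES:
        slot = pattern.index(None)
        start = target_index - slot
        if start < 0 or start + len(pattern) > len(words):
            continue
        window = words[start:start + len(pattern)]
        if all(p is None or p == w.lower() for p, w in zip(pattern, window)):
            return action                      # first matching rule wins
    return None

words = "this is not only good but also cheap".split()
print(apply_rules(words, words.index("good")))       # 'NoNg' (rule order matters)
print(apply_rules("this is not good".split(), 3))    # 'Ng'
```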
  • Step 1 - Part-of-speech tagging: The tags are used for matching <type>'s in the rules.
  • Step 2 Applying indicator word rules: All sentiment words, negation words and but-like words in the sentence are identified in this step. After this step, one can obtain
  • the picture quality is not[Ng] good[Po], reaction is too slow[Neu], but[But] the battery life is long[Neu].
  • Step 3 - Applying phrase rules: This step identifies all phrases in the sentence and performs the actions specified in the rules. After this step, the running example sentence becomes:
  • the picture quality is not[Ng] good[Po], reaction is too slow[NE], but[But] the battery life is long[Neu].
  • Step 4 - Handling negations: A negation in a sentence reverses the orientation of an opinion. For neutral, it is turned to negative. After negation handling, the running example sentence becomes ("good" is now turned to negative from positive):
  • Step 5 Aggregating opinions: This step first finds but-symbols ("But” or "BUT"), which indicate sentiment changes. The sentiments on the two sides of a but- symbol are opposite to each other. For illustration purposes, only the sentiment in the first clause of the sentence is used.
  • Opinion aggregation: All opinion indicators in the first clause of the sentence are aggregated to arrive at the final sentiment. The algorithm simply sums up all indicators [9]. A positive (or negative) indicator is assigned 1 (or -1). If the final sum is greater than 0, the clause is positive; if the sum is less than 0, the clause is negative; otherwise it is neutral. For our example, the sentiment of the first part (before "but") is positive.
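  • A minimal sketch of Steps 4 and 5 (negation handling, but-clause handling and aggregation); the word lists are tiny hypothetical stand-ins for the sentiment lexicon, and the example sentence is not the running example above.

```python
# Sketch: flip orientation after a negation word, keep only the clause before
# a but-like word, and sum +1/-1 indicators; the sign gives the orientation.

POSITIVE = {"good", "great", "clear", "long"}
NEGATIVE = {"bad", "poor", "slow", "blurring"}
NEGATIONS = {"not", "never", "hardly"}
BUT_WORDS = {"but", "though", "except"}

def clause_orientation(words):
    score, negate = 0, False
    for w in words:
        w = w.lower().strip(".,!?")
        if w in NEGATIONS:
            negate = True                          # flips the next sentiment word
        elif w in POSITIVE or w in NEGATIVE:
            val = 1 if w in POSITIVE else -1
            score += -val if negate else val
            negate = False
    return score

def sentence_orientation(sentence):
    words = sentence.split()
    for i, w in enumerate(words):                  # keep only the first clause
        if w.lower().strip(",") in BUT_WORDS:
            words = words[:i]
            break
    s = clause_orientation(words)
    return "positive" if s > 0 else "negative" if s < 0 else "neutral"

print(sentence_orientation(
    "The images are very clear, but the battery life is poor."))   # positive
```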
  • Identifying superior and inferior entities as expressed in a comparative sentence is called comparative opinion mining.
  • the sentiment analysis method above can be adapted to find superior and inferior entities in comparative sentences. This is due to the following observation,
  • Positive and negative sentiment words have their corresponding comparative and superlative forms indicating superior and inferior states respectively.
  • the positive sentiment word, "good” has its comparative and superlative forms, “better” and “best”, which indicate superior (and inferior) entities.
  • comparatives and superlatives are special forms of adjectives and adverbs. In general, comparatives are formed by adding the suffix "-er" and superlatives are formed by adding the suffix "-est" to the base (or original) adjectives and adverbs. Adjectives and adverbs with two syllables or more and not ending in "y" do not form comparatives or superlatives this way.
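  • A sketch of deriving comparative and superlative forms from base sentiment words using the regular suffix rules; irregular forms are listed explicitly, the word lists are small hypothetical samples, and multi-syllable words that take "more"/"most" are ignored here.

```python
# Sketch: generate "-er"/"-est" forms of base sentiment words so that the
# resulting comparative/superlative lexicon can flag superior/inferior states.

IRREGULAR = {"good": ("better", "best"), "bad": ("worse", "worst")}

def comparative_forms(word):
    if word in IRREGULAR:
        return IRREGULAR[word]
    if word.endswith("y"):                     # e.g. "happy" -> "happier"
        stem = word[:-1] + "i"
        return stem + "er", stem + "est"
    if word.endswith("e"):                     # e.g. "nice" -> "nicer"
        return word + "r", word + "st"
    return word + "er", word + "est"           # e.g. "long" -> "longer"

for w in ["good", "bad", "long", "happy"]:
    print(w, comparative_forms(w))
```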
  • the heuristic rules used in the present disclosure are as follows (if a sentence matches any one of the rules, it is considered a comparative or a superlative sentence): a) pronoun + compkey + prodname; b) prodname + compkey + pronoun; c) prodname + compkey + prodname; d) pronoun + superkey; e) prodname + superkey; f) as + JJ + as (except "as long as" and "as far as"), where compkey is a comparative keyword, prodname is a product name and superkey is a superlative keyword.
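  • A sketch of this heuristic test; the keyword and pronoun lists are small hypothetical samples, and the adjacency required by patterns (a)-(e) is relaxed here to "appears somewhere before/after the keyword" for brevity.

```python
import re

# Sketch of the comparative/superlative-sentence heuristics (rules a-f above).
COMP_KEYWORDS = {"better", "worse", "longer", "faster", "than"}
SUPER_KEYWORDS = {"best", "worst", "longest", "fastest"}
PRONOUNS = {"it", "this", "that", "they", "these", "those"}

def is_comparative_sentence(sentence, known_entities):
    tokens = [t.strip(".,!?").lower() for t in sentence.split()]
    names = {e.lower() for e in known_entities}

    def is_np(tok):                            # pronoun or product name
        return tok in PRONOUNS or tok in names

    for i, tok in enumerate(tokens):
        before = any(is_np(t) for t in tokens[:i])
        after = any(is_np(t) for t in tokens[i + 1:])
        if tok in COMP_KEYWORDS and (before or after):   # rules (a)-(c), relaxed
            return True
        if tok in SUPER_KEYWORDS and before:             # rules (d)-(e), relaxed
            return True
    # rule (f): "as <adjective> as", excluding "as long as" and "as far as"
    return bool(re.search(r"\bas\s+(?!long\b|far\b)\w+\s+as\b", sentence.lower()))

print(is_comparative_sentence("Camera-A is better than Camera-B.",
                              ["Camera-A", "Camera-B"]))   # True
```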
  • Identify superior entities: As mentioned earlier, the above sentiment analysis method for mining direct opinions can be used to identify superior/preferred entities, since a gradable comparative sentence typically has entities on the two sides of the comparative keyword, e.g., "Camera-X is better than Camera-Y". Based on sentiment analysis, if the sentence is positive, then the entities before the comparative keyword are superior; otherwise they are inferior (with negation considered). Superlative sentences can be handled in a similar way. Note that equative and non-gradable comparisons do not express preferences.
  • EMPIRICAL EVALUATION
  • This section evaluates the proposed techniques for the two tasks, entity identification and entity assignment. The disclosure below presents the datasets and corresponding experimental results.
  • HowardForums is a message board dedicated to mobile phones, while AVSforum is a message board dedicated to Home Theater and the products used. Data from AVSforum are discussions about Plasma and LCD TVs.
  • Table 1 shows the characteristics of the two data sets.
  • NET is a Named Entity Tagger, which can be used in the present case as product names are named entities.
  • the CRF system used in the description below is from
  • the training data for CRF is the data obtained from step 2 of our algorithm. Recall that the data from step 2 is automatically generated. The entities in those sentences are regarded as positive data and all the other words in the sentences are regarded as negative data.
  • the test data is the whole set for all the systems. Using the whole set as the test data is reasonable because the present system does not use any manually labeled training data.
  • Table 3 gives the experimental results for entity assignment, which include the results of two baseline methods.
  • the disclosed method uses ED to denote the proposed technique. Below, the columns are explained one-by-one, and also discussion is provided on the results.
  • Two sets of experiments were conducted. The first set is denoted by “Next Sentences” in Table 3. "Next Sentences” means that only the comparative sentences and their subsequent sentences are considered. This set of experiments thus shows how effective the ED technique is in its intended task. The second set of experiments is denoted by "All Sentences", which considers all sentences. It shows how the ED method affects the overall implicit entity assignment task.
  • Column 1 (Baseline1 - next sentences): Baseline1 works as follows: if a sentence does not mention any product name, one simply takes the last product of the previous sentence. Note that the product of the previous sentence can be inherited from its own previous sentence, and so on. The accuracy measure is used here because one can gauge how accurate the assignments of products to sentences are.
  • Column 2 (Baseline2 - next sentences): In the Baseline2 method, if a sentence does not mention a product name, it simply takes the first product of the previous sentence. One can observe that Baseline2 is always more accurate than Baseline1 because, in most cases, the first product is the superior product in a comparative sentence and the next sentence also tends to talk about that product.
  • Column 3 (ED (k-com) - next sentences): It gives the result for each data set using the proposed ED method, assuming that the comparative and superlative sentences are known; k-com denotes this assumption.
  • FIG. 3 depicts an exemplary diagrammatic representation of a machine in the form of a computer system 300 within which a set of instructions, when executed, may cause the machine to perform any one or more of the methodologies discussed above.
  • the machine operates as a standalone device.
  • the machine may be connected (e.g., using a network) to other machines.
  • the machine may operate in the capacity of a server or a client user machine in server-client user network environment, or as a peer machine in a peer-to-peer (or distributed) network environment.
  • the machine may comprise a server computer, a client user computer, a personal computer (PC), a tablet PC, a laptop computer, a desktop computer, a control system, a network router, switch or bridge, or any machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine.
  • a device of the present disclosure includes broadly any electronic device that provides voice, video or data communication.
  • the term "machine” shall also be taken to include any collection of machines that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein.
  • the computer system 300 may include a processor 302 (e.g., a central processing unit (CPU), a graphics processing unit (GPU), or both), a main memory 304 and a static memory 306, which communicate with each other via a bus 308.
  • the computer system 300 may further include a video display unit 310 (e.g., a liquid crystal display (LCD), a flat panel, a solid state display, or a cathode ray tube (CRT)).
  • the computer system 300 may include an input device 312 (e.g., a keyboard), a cursor control device 314 (e.g., a mouse), a disk drive unit 316, a signal generation device 318 (e.g., a speaker or remote control) and a network interface device 320.
  • the disk drive unit 316 may include a machine-readable medium 322 on which is stored one or more sets of instructions (e.g., software 324) embodying any one or more of the methodologies or functions described herein, including those methods illustrated above.
  • the instructions 324 may also reside, completely or at least partially, within the main memory 304, the static memory 306, and/or within the processor 302 during execution thereof by the computer system 300.
  • the main memory 304 and the processor 302 also may constitute machine-readable media.
  • Dedicated hardware implementations including, but not limited to, application specific integrated circuits, programmable logic arrays and other hardware devices can likewise be constructed to implement the methods described herein.
  • Applications that may include the apparatus and systems of various embodiments broadly include a variety of electronic and computer systems. Some embodiments implement functions in two or more specific interconnected hardware modules or devices with related control and data signals communicated between and through the modules, or as portions of an application-specific integrated circuit. Thus, the example system is applicable to software, firmware, and hardware implementations.
  • the methods described herein are intended for operation as software programs running on a computer processor.
  • software implementations can include, but are not limited to, distributed processing or component/object distributed processing, parallel processing, or virtual machine processing, and can also be constructed to implement the methods described herein.
  • the present disclosure contemplates a machine readable medium containing instructions 324, or that which receives and executes instructions 324 from a propagated signal so that a device connected to a network environment 326 can send or receive voice, video or data, and to communicate over the network 326 using the instructions 324.
  • the instructions 324 may further be transmitted or received over a network 326 via the network interface device 320.
  • machine-readable medium 322 is shown in an example embodiment to be a single medium, the term “machine-readable medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more sets of instructions.
  • the term “machine-readable medium” shall also be taken to include any medium that is capable of storing, encoding or carrying a set of instructions for execution by the machine and that cause the machine to perform any one or more of the methodologies of the present disclosure.
  • machine-readable medium shall accordingly be taken to include, but not be limited to: solid-state memories such as a memory card or other package that houses one or more read-only (non-volatile) memories, random access memories, or other re-writable (volatile) memories; magneto-optical or optical medium such as a disk or tape; and/or a digital file attachment to e-mail or other self-contained information archive or set of archives is considered a distribution medium equivalent to a tangible storage medium. Accordingly, the disclosure is considered to include any one or more of a machine-readable medium or a distribution medium, as listed herein and including art-recognized equivalents and successor media, in which the software implementations herein are stored.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

A system that incorporates the teachings of the present disclosure may include, for example, an evaluation system having a controller to identify a plurality of entities in opinionated text, identify inferior and superior entities from the plurality of entities according to a plurality of comparative opinions determined from a plurality of comparative sentences in the opinionated text, and assign at least a portion of the superior and inferior entities to one of the plurality of comparative sentences and a plurality of non-comparative sentences of the opinionated text according to the plurality of comparative opinions, sentiment consistency between consecutive sentences in the opinionated text, and a semantic orientation of the plurality of non-comparative sentences. Additional embodiments are disclosed.
PCT/US2009/044197 2009-05-15 2009-05-15 Système et procédés d'analyse de sentiments WO2010132062A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/US2009/044197 WO2010132062A1 (fr) 2009-05-15 2009-05-15 Système et procédés d'analyse de sentiments

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/US2009/044197 WO2010132062A1 (fr) 2009-05-15 2009-05-15 Système et procédés d'analyse de sentiments

Publications (1)

Publication Number Publication Date
WO2010132062A1 true WO2010132062A1 (fr) 2010-11-18

Family

ID=40962445

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2009/044197 WO2010132062A1 (fr) 2009-05-15 2009-05-15 Système et procédés d'analyse de sentiments

Country Status (1)

Country Link
WO (1) WO2010132062A1 (fr)


Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
BING LIU: "Sentiment Analysis and Subjectivity", 24 August 2009 (2009-08-24), XP002542660, Retrieved from the Internet <URL:http://www.cs.uic.edu/~liub/FBS/NLP-handbook-sentiment-analysis.pdf> [retrieved on 20090824] *
MURTHY GANAPATHIBHOTLA AND BING LIU: "Mining Opinions in Comparative Sentences", PROCEEDINGS OF THE 22ND INTERNATIONAL CONFERENCE ON COMPUTATIONAL LINGUISTICS (COLING-2008), 18 August 2008 (2008-08-18) - 22 August 2008 (2008-08-22), Manchester, UK, pages 241 - 248, XP002542647 *
NITIN JINDAL AND BING LIU: "Mining Comparative Sentences and Relations", PROCEEDINGS OF 21ST NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-2006), 16 July 2006 (2006-07-16) - 20 July 2006 (2006-07-20), Boston, Massachusetts, USA, pages 1331 - 1336, XP002542644 *
TETSUYA NASUKAWA AND JEONGHEE YI: "Sentiment analysis: capturing favorability using natural language processing", PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON KNOWLEDGE CAPTURE, 23 October 2003 (2003-10-23) - 25 October 2003 (2003-10-25), Sanibel Island, FL, USA, pages 70 - 77, XP002542648 *
XIAOWEN DING, BING LIU AND LEI ZHANG: "Entity discovery and assignment for opinion mining applications", PROCEEDINGS OF THE 15TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 28 June 2009 (2009-06-28) - 1 July 2009 (2009-07-01), Paris, France, pages 1125 - 1133, XP002542645 *
XIAOWEN DING, BING LIU AND PHILIP S. YU: "A holistic lexicon-based approach to opinion mining", PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON WEB SEARCH AND WEB DATA MINING, 11 February 2008 (2008-02-11), pages 231 - 239, XP002542646 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2013168043A (ja) * 2012-02-16 2013-08-29 Nec Corp 不満抽出装置,不満抽出方法および不満抽出プログラム
WO2016066228A1 (fr) * 2014-10-31 2016-05-06 Longsand Limited Classification focalisée de sentiments
CN112446202A (zh) * 2019-08-16 2021-03-05 阿里巴巴集团控股有限公司 文本的分析方法和装置
CN112199956A (zh) * 2020-11-02 2021-01-08 天津大学 一种基于深度表示学习的实体情感分析方法
CN113298365A (zh) * 2021-05-12 2021-08-24 北京信息科技大学 一种基于lstm的文化附加值评估方法
CN113298365B (zh) * 2021-05-12 2023-12-01 北京信息科技大学 一种基于lstm的文化附加值评估方法
CN117973946A (zh) * 2024-03-29 2024-05-03 云南与同加科技有限公司 一种面向教学的数据处理方法及系统

Similar Documents

Publication Publication Date Title
Ding et al. Entity discovery and assignment for opinion mining applications
US11379512B2 (en) Sentiment-based classification of media content
US9948595B2 (en) Methods and apparatus for inserting content into conversations in on-line and digital environments
CN107256267B (zh) 查询方法和装置
US9626622B2 (en) Training a question/answer system using answer keys based on forum content
US9720904B2 (en) Generating training data for disambiguation
WO2018149115A1 (fr) Procédé et appareil de fourniture de resultats de recherche
US20130060769A1 (en) System and method for identifying social media interactions
Castellanos et al. LCI: a social channel analysis platform for live customer intelligence
US20150213361A1 (en) Predicting interesting things and concepts in content
Chen et al. Mining user requirements to facilitate mobile app quality upgrades with big data
KR102355212B1 (ko) 마이닝된 하이퍼링크 텍스트 스니펫을 통한 이미지 브라우징
CA2865186A1 (fr) Procede et systeme concernant l&#39;analyse de sentiment d&#39;un contenu electronique
US20160179966A1 (en) Method and system for generating augmented product specifications
US10740406B2 (en) Matching of an input document to documents in a document collection
US9811515B2 (en) Annotating posts in a forum thread with improved data
CN106663123B (zh) 以评论为中心的新闻阅读器
CN110245357B (zh) 主实体识别方法和装置
WO2010132062A1 (fr) Système et procédés d&#39;analyse de sentiments
Wong et al. An unsupervised method for joint information extraction and feature mining across different web sites
Nigam et al. Towards a robust metric of polarity
US20230112385A1 (en) Method of obtaining event information, electronic device, and storage medium
US20230090601A1 (en) System and method for polarity analysis
US9305103B2 (en) Method or system for semantic categorization
CN111368036B (zh) 用于搜索信息的方法和装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09789688

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 09789688

Country of ref document: EP

Kind code of ref document: A1