WO2022221184A1 - Opinion summarization tool - Google Patents

Opinion summarization tool

Info

Publication number
WO2022221184A1
WO2022221184A1 PCT/US2022/024244 US2022024244W
Authority
WO
WIPO (PCT)
Prior art keywords
neural network
generator
model
trained neural
opposing
Prior art date
Application number
PCT/US2022/024244
Other languages
French (fr)
Inventor
Christopher Malon
Original Assignee
Nec Laboratories America, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nec Laboratories America, Inc. filed Critical Nec Laboratories America, Inc.
Publication of WO2022221184A1 publication Critical patent/WO2022221184A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0282Rating or review of business operators or products
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/096Transfer learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0455Auto-encoder networks; Encoder-decoder networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Definitions

  • Bidirectional recurrent neural networks such as bidirectional LSTMs have been used to learn internal representations of wider sentential contexts.
  • Context2Vec is a neural network model that embeds entire sentential contexts and target words in the same low-dimensional space.
  • the model can learn a generic task-independent embedding function for variable-length sentential contexts around target words based on a continuous bag of words (CBOW) architecture.
  • CBOW continuous bag of words
  • a method for extracting and counting frequent opinions.
  • the method includes performing a frequency analysis on an inputted list of product reviews for a single item and an inputted corpus of reviews for a product category containing the single item to identify one or more frequent phrases.
  • the method further includes fine-tuning a pretrained transformer model to produce a trained neural network claim generator model, T1, and generating a trained neural network opposing claim generator model based on the trained neural network claim generator model.
  • the method further includes generating a pair of opposing claims for each of the one or more frequent phrases related to the product review using the trained neural network claim generator model and the trained neural network opposing claim generator model, wherein a generated positive claim is entailed by the product reviews for the single item and a negative claim refutes the positive claim, and outputting a count of sentences entailing the positive claim and a count of sentences entailing the negative claim.
  • the system includes, one or more processors, computer memory, and a display screen in electronic communication with the computer memory and the one or more processors, wherein the computer memory includes a frequency analyzer configured to perform a frequency analysis on an inputted list of product reviews for a single item and an inputted corpus of reviews for a product category containing the single item to identify one or more frequent phrases, a trained neural network claim generator model, a trained neural network opposing claim generator model, wherein the trained neural network claim generator model and the trained neural network opposing claim generator model are configured to generate a pair of opposing claims for each of the one or more frequent phrases related to the product reviews, wherein a positive claim generated by the trained neural network claim generator is entailed by the product reviews for the single item and a negative claim generated by the trained neural network opposing claim generator refutes the positive claim, and an entailment module configured to output a count of sentences entailing the positive claim and a count of sentences entailing the negative claim.
  • a frequency analyzer configured to perform a frequency analysis on an input
  • non-transitory computer readable storage medium comprising a computer readable program for extracting and counting frequent opinions.
  • the computer readable program when executed on a computer causes the computer to perform the steps of: performing a frequency analysis on an inputted list of product reviews for a single item and an inputted corpus of reviews for a product category containing the single item to identify one or more frequent phrases; fine tuning a pretrained transformer model to produce a trained neural network claim generator model, T 1 ; generating a trained neural network opposing claim generator model based on the trained neural network claim generator model; generating a pair of opposing claims for each of the one or more frequent phrases related to the product review using the trained neural network claim generator model and the trained neural network opposing claim generator model, wherein a generated positive claim is entailed by the product reviews for the single item and a negative claim refutes the positive claim; and outputting a count of sentences entailing the positive claim and a count of sentences entailing the negative claim.
  • FIG. 1 is a block/flow diagram illustrating a high-level system/method for summarizing opinions from customer reviews, in accordance with an embodiment of the present invention
  • FIG. 2 is a block/flow diagram illustrating a system/method for applying a claim generator and opposing claim generator, in accordance with an embodiment of the present invention
  • FIG. 3 is a flow diagram illustrating a system/method for training a claim generator, in accordance with an embodiment of the present invention.
  • FIG. 4 is a flow diagram illustrating a system/method for training an opposing claim generator, in accordance with an embodiment of the present invention.
  • FIG. 5 illustrates a computer system for opinion summarization, in accordance with an embodiment of the present invention; and
  • FIG. 6 is a block diagram of exemplary reviews and output, in accordance with an embodiment of the present invention.
  • systems and methods are provided for extracting and counting frequent opinions within a corpus of customer reviews.
  • Embodiments of the invention relate to summarizing and counting contrasting opinions about aspects of a product or service given by key words or phrases, within a set of customer reviews.
  • a list of frequent phrases and sentences containing those phrases is determined by comparing frequencies in a collection of reviews about a single product to overall frequencies of the phrases in the corpus. For each phrase, a first generation module fills in a template with a word to write a sentence using the phrase. A second generation module writes an opposing sentence using the phrase. Sentences in the review corpus that semantically imply the first generated sentence or opposing generated sentence can be counted and extracted using an entailment module. [0018] In various embodiments, statements are compared using an entailment module.
  • Entailment is the classification task of predicting the logical relation between a pair of sentences. Given a pair (p, h) of a premise statement, p, and a hypothesis, h, a distribution is computed over the probabilities that p entails h, that p refutes h, or neither.
  • the method utilizes a trained entailment module, M, where M(p; h; r) predicts the logarithm of the probability that premise, p, and hypothesis, h, are in relation, r.
  • a trained entailment module, M, and a trained language model, T0, are used to train a claim generation module that generates fluent claims that are semantically implied by the review text. For each generated fluent claim, a semantically opposing claim is also generated to capture disagreements among the reviews. In various embodiments, by using a claim generation module, one can pinpoint opinions about key phrases beyond just “positive” and “negative” sentiment.
  • a pretrained transformer model for masked language modeling such as Bidirectional Encoder Representations from Transformers (BERT) or XLNet, can be fine-tuned to become the claim generation model, where the fine-tuned claim generation model predicts the sequence of words for the claim.
  • the vocabulary for such a model includes tokens for common words and pieces of words, and several special tokens used in training. Unannotated reviews can be used for training, without any ground truth summaries or ground truth sentiment data.
  • the pairs of opposing claims are output to the user, along with counts and/or quotes of the original review statements that support each of the claims.
  • an opinion summarization tool 100 can count opinions about a product and provide summaries of the opinions.
  • a list of reviews of a single product can be inputted to the frequency analyzer of the opinion summarization tool 100.
  • a corpus of reviews of a set of products in the same category as the single product can be inputted to the frequency analyzer of the opinion summarization tool 100.
  • the frequency analyzer can perform frequency analysis on both the list of reviews of the single product and the corpus of reviews of the set of products in the same category.
  • the frequency analysis can find frequent phrases for the single product, and identify key words and/or phrases, ki, used in the list of reviews of the single product.
  • the frequency analysis can also find the frequency of the phrases in the corpus. Given a set of reviews of a single product, a set of key words and/or phrases can be selected. The key words may be input directly by a user or determined, for example, by taking the phrases that occur in that product’s reviews more frequently than in the review corpus overall by some factor.
  • a claim generator may output a pair of opposing claims related to the single product, where a pair of opposing claims can be generated for each of the phrases that equal or exceed a threshold frequency.
  • the claim, g, and opposing claim g’ are the output of the claim generator module.
  • entailment can be applied to each claim of the opposing claims and the sentences of the reviews of the single product by an entailment module, M.
  • the method utilizes a trained entailment module, M, where M(p; h; r) predicts the logarithm of the probability that premise p and h are in relation r.
  • pairs of opposing claims and a count and/or quotation of the sentences in the reviews that supported either of the opposing claims can be outputted to the user.
  • the entailment module, M is then applied against every sentence of reviews of the product.
  • Pairs of opposing claims involving the frequent phrases and a count or quotation of the sentences that supported them can be outputted.
  • FIG. 2 is a block/flow diagram illustrating a system/method for applying a claim generator and opposing claim generator, in accordance with an embodiment of the present invention.
  • the claim generator and opposing claim generator 200 can be applied to output a pair of opposing statements about a key phrase.
  • a key word or phrase can be inputted to the claim generator.
  • the key word or phrase, ki can be selected by a user or determined by the frequency analysis, such as by taking a word or phrase that occurs more frequently in the product's reviews than in the corpus overall.
  • reviews of a product containing the key word or phrase can be inputted to the claim generator.
  • Sentences r(1), . . . , r(n) containing the key word or phrase are extracted from the product’s reviews.
  • the part of speech or type of word or phrase that has been inputted by the user or selected based on frequency can be determined in order to determine which template to apply.
  • Existing systems, such as Spacy that can label words of sentences with their part of speech can be used.
  • zero or more summary templates, f, can be defined in block 230 for use in block 240.
  • Each summary template, f, quotes the key word or phrase, ki, or a word or phrase derived from ki, and has zero or one positions which are masked and are to be filled in with a word.
  • frequent phrases for the product are used to fill in the templates.
  • the templates are defined as follows. If ki is a noun or noun phrase, the template is “The ki is [MASK].” If ki is an adjective, the templates are “It is ki.” and “The [MASK] is ki.” Additional templates may be defined similarly. The templates may involve different words derived from ki, for instance by changing ki from plural to singular if ki is a noun, or by conjugating ki into past tense if ki is a verb.
  • for templates with zero masked positions, an opposing template can be provided by inserting a negation word, such as “It is not ki” for “It is ki,” because the claim generator and opposing claim generator are not used for such templates.
  • a summary template, f, is taken, with a slot for the key word or phrase, and zero or one masked tokens to be filled in.
  • a sequence of tokens, x, can be constructed by concatenating the classification token, each of the sentences, r(1), . . . , r(n), the separator token, the template output, fj, and another copy of the separator token.
  • the separator token is a special token that tells the model that different sentences have different purposes.
  • the separator token also keeps sentences from being concatenated or mixed up.
  • An input token is part of an input sequence.
  • the claim generator 200 can be trained 300 to complete a template with fluent, logically implied statements.
  • one masked word in the template can be predicted using a refined claim generator, T1, applied to the input reviews. In this sequence, suppose a mask token appears at position, m, within the template output, fj. Claim generator, T1, outputs a distribution over words at position m.
  • substitutions of the predicted words from the claim generator into the template can be reranked using entailment against the review sentences that include the key phrase, where entailment decides whether the completed template is logically implied by each of the review sentences.
  • the average log likelihood of a sentence implying the completed template, output by the entailment module, M, is used for reranking.
  • the predicted words from the opposing claim generator, T2, can be reranked using entailment against the reviews, where entailment decides whether the completed template is logically refuted by the statement ranked highest in block 260.
  • FIG. 3 is a flow diagram illustrating a system/method for training a claim generator, in accordance with an embodiment of the present invention.
  • the claim generator model 200 can be trained 300 to complete a template with fluent, logically implied statements.
  • a trained entailment classification model, M can be inputted to the claim generator training method 350.
  • a pretrained masked language model, T 0 can be inputted to the claim generator training method 350.
  • a pretrained transformer model, T that has been pretrained for masked language modeling, such as BERT, can be fine-tuned.
  • the vocabulary for such a model includes tokens for common words and pieces of words, and several special tokens used in training, including a classification token and a separator token.
  • T0 can be written for the pretrained model before fine-tuning, and T1 can be written for the fine-tuned model.
  • T0 can be used for training, and T1 can be generated from T0 by backpropagation.
  • the claim generator training method can receive and implement the inputs.
  • a claim generator model T1 is fine-tuned from the pretrained masked language model, T0.
  • the claim generator training method 350 can apply one or more templates to a key phrase, kj. Let fj be the output of a summary template on kj.
  • the claim generator training method 350 can construct an input sequence of words. Construct x(j) for blocks 250 and 370 by concatenating the classification token of the vocabulary of T, each of the sentences, rj(k), the separator token of the vocabulary of T, the template output fj, and another copy of the separator token. One position, mj, is a masked token, coming from inside fj.
  • the separator token is a special token that tells the model that different sentences have different purposes. The separator token also keeps sentences from being concatenated or mixed up.
  • An input token is part of an input sequence.
  • the claim generator training method 350 can apply a claim generator model, T1, to the input sequence.
  • the claim generator model may be initialized using a copy of the parameters of the pretrained masked language model, T0, and then fine-tune its own parameters during the course of the training 300 described in this figure.
  • the claim generator model can apply a Gumbel softmax function to obtain one or more words for the masked position in the input sequence.
  • the Gumbel softmax estimator outputs a single choice. Let gj be the result of substituting wj into the masked position of the template output fj.
  • the Gumbel-Max offers an efficient way of sampling from the probability distribution over subwords (replacement values for the MASK) output by the model T1 by adding a random variable to the log of the probabilities and taking the argmax.
  • the second step is to replace the argmax with a softmax to make this operation differentiable as well.
  • the softmax here has a temperature parameter τ. Setting τ to 0 makes the distribution identical to the categorical one and the samples are perfectly discrete. For small τ, the gradients have high variance, which is an issue of stochastic neural networks.
  • an entailment loss can be computed for the claim generator model for the entailment of the completed template against the input sentences and a language modeling loss.
  • the training method 300 utilizes a trained entailment module, M, where M(p; h; r) predicts the logarithm of the probability that premise p and h are in relation r.
  • M(p; h; r) predicts the logarithm of the probability that premise p and h are in relation r.
  • This loss reflects the average log likelihood that each review statement supports the claim gj, according to entailment module, M, which is for entailment/implication.
  • a second loss reflects the word log likelihood estimates according to the language model T0:
  • the loss is calculated from the teacher model in a teacher-student arrangement. Low log probabilities result in a high loss.
  • the loss(es), L, L1, L2 can be used for backpropagation to refine the pretrained claim generator model. Training proceeds by backpropagation through M, Gτ, and T1, during which the parameters of entailment module, M, are held fixed. An annealing schedule may be used to lower the temperature τ of the Gumbel softmax.
  • the trained claim generator model, T1, can be provided to the user.
  • one or more summary templates can be defined in block 330 for use in block 360.
  • Each summary template quotes the key word or phrase, ki, or a word or phrase derived from ki, and has zero or one positions which are masked and are to be filled in with a word.
  • template, f, can be: “The [KEYWORD] was [MASK].”
  • the refined claim generator model, T1, might output: “The delivery was fast,” because it has been trained (in block 400) with a loss from the pretrained language model, T0, which finds that “fast” is a fluent completion of the template, and with a loss from the entailment module, M, which finds that this set of sentences entails “The delivery was fast.” The output would then be: “The delivery was fast.”
  • T0 can be written for the pretrained model before fine-tuning, and T1 can be written for the fine-tuned model.
  • T0 can be used for training, and T1 can be generated from T0 by backpropagation.
  • a second model T2 can be fine-tuned from T0, as shown in FIG. 4.
  • the refined contrary claim generation model, T2, might output: “The delivery is slow,” because “slow” is a fluent completion of the template according to language model T0, and the entailment module, M, would find that “The delivery is fast” refutes “The delivery is slow.”
  • FIG. 4 is a flow diagram illustrating a system/method for training an opposing claim generator, in accordance with an embodiment of the present invention.
  • the opposing claim generator 200 can be trained 500 to output a best ranked contrary statement from the top predictions from the entailment module. Backpropagation can be used for training the opposing claim generator.
  • a trained entailment classification model can be inputted to the opposing claim generator training method 550.
  • a trained masked language model, T0, can be inputted to the opposing claim generator training method 550.
  • a set of templates and rules can be inputted to the opposing claim generator training method 550.
  • a data set of review sentences can be inputted to the opposing claim generator training method 550.
  • a dataset, D, is provided in which the ith example consists of a keyword or phrase ki and a set of sentences ri(1), . . . , ri(ni), each from a review of the same product and each containing ki.
  • the opposing claim generator 550 can receive the inputs.
  • the opposing claim generator can apply one or more templates to a key phrase selected by the user or by frequency analysis.
  • the claim generator T1 is applied to the inputted template and the jth example of D to output a sentence gj.
  • the opposing claim generator can construct an input sequence by concatenating the classification token, the sentence gj, the separator token, the template fj, and a second copy of the separator token. Exactly one position, m′j, of y(j) is masked.
  • the opposing claim generator can apply an opposing claim generator model, T2, to the input sequence.
  • the opposing claim generator model may be initialized using a copy of the parameters of the trained masked language model T0 and then fine-tune its own parameters during the course of the training described in this figure.
  • the opposing claim generator can apply a Gumbel softmax function to obtain one or more words for the masked position.
  • a single word is obtained from the straight-through Gumbel softmax estimator Gτ applied to T2 at position m′j.
  • Gτ Gumbel softmax estimator
  • the refined contrary language model, T2, might output: “The delivery is slow,” because “slow” is a fluent completion of the template according to language model T0, and the entailment module, M, would find that “The delivery is fast” refutes “The delivery is slow.”
  • FIG. 5 illustrates a computer system for opinion summarization, in accordance with an embodiment of the present invention.
  • the computer system for opinion summarization 700 can include one or more processors 710, which can be central processing units (CPUs), graphics processing units (GPUs), and combinations thereof, and a computer memory 720 in electronic communication with the one or more processors 710, where the computer memory 720 can be random access memory (RAM), solid state drives (SSDs), hard disk drives (HDDs), optical disk drives (ODD), etc.
  • the memory 720 can be configured to store the opinion summarization tool 100, including a trained claim generator model 750, trained opposing claim generator model 760, trained entailment model 770, and review corpus 780.
  • the trained claim generator model 750 can be a neural network configured to generate claims utilizing one or more templates.
  • the opposing claim generator model 760 can be a neural network configured to generate opposing claims utilizing one or more templates.
  • the entailment model 770 can be configured to calculate entailment and entailment loss for each of the claims generated by the claim generator model 750 or opposing claims generated by the opposing claim generator model 760.
  • a display module can be configured to present an ordered list of the claims and opposing claims to a user as a summary of the reviews.
  • the memory 720 and one or more processors 710 can be in electronic communication with a display screen 730 over a system bus and I/O controllers, where the display screen 730 can present the ranked list of claims.
  • FIG. 6 is a block diagram of exemplary reviews and output, in accordance with an embodiment of the present invention.
  • a list of reviews of a single product 110 can be fed into the system/method.
  • the list of reviews of a single product 110 can include a number of different statements regarding the particular product (or service) provided by customers of the product (or service).
  • a corpus of reviews of similar products (or services) 120 can be fed into the system/method.
  • the list of reviews can include a number of different statements regarding products (or services) that are similar to the product or service being reviewed.
  • the trained neural network claim generator model 750 can be a neural network configured to generate claims utilizing one or more templates.
  • a claim generated by the claim generator model 750 that summarizes the input claims can be output 140.
  • An opposing claim generated by the opposing claim generator model can also be output to form a pair of opposing claims.
  • Embodiments described herein may be entirely hardware, entirely software or including both hardware and software elements.
  • the present invention is implemented in software, which includes but is not limited to firmware, resident software, microcode, etc.
  • Embodiments may include a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system.
  • a computer- usable or computer readable medium may include any apparatus that stores, communicates, propagates, or transports the program for use by or in connection with the instruction execution system, apparatus, or device.
  • the medium can be magnetic, optical, electronic, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium.
  • the medium may include a computer-readable storage medium such as a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk, etc.
  • a computer-readable storage medium such as a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk, etc.
  • Each computer program may be tangibly stored in a machine-readable storage media or device (e.g., program memory or magnetic disk) readable by a general or special purpose programmable computer, for configuring and controlling operation of a computer when the storage media or device is read by the computer to perform the procedures described herein.
  • a data processing system suitable for storing and/or executing program code may include at least one processor coupled directly or indirectly to memory elements through a system bus.
  • the memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code to reduce the number of times code is retrieved from bulk storage during execution.
  • I/O devices including but not limited to keyboards, displays, pointing devices, etc. may be coupled to the system either directly or through intervening I/O controllers.
  • Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks. Modems, cable modem and Ethernet cards are just a few of the currently available types of network adapters.
  • the term “hardware processor subsystem” or “hardware processor” can refer to a processor, memory, software or combinations thereof that cooperate to perform one or more specific tasks.
  • the hardware processor subsystem can include one or more data processing elements (e.g., logic circuits, processing circuits, instruction execution devices, etc.).
  • the one or more data processing elements can be included in a central processing unit, a graphics processing unit, and/or a separate processor- or computing element-based controller (e.g., logic gates, etc.).
  • the hardware processor subsystem can include one or more on-board memories (e.g., caches, dedicated memory arrays, read only memory, etc.).
  • the hardware processor subsystem can include one or more memories that can be on or off board or that can be dedicated for use by the hardware processor subsystem (e.g., ROM, RAM, basic input/output system (BIOS), etc.).
  • the hardware processor subsystem can include and execute one or more software elements.
  • the one or more software elements can include an operating system and/or one or more applications and/or specific code to achieve a specified result.
  • the hardware processor subsystem can include dedicated, specialized circuitry that performs one or more electronic processing functions to achieve a specified result. Such circuitry can include one or more application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), and/or programmable logic arrays (PLAs).
  • ASICs application-specific integrated circuits
  • FPGAs field-programmable gate arrays
  • PLAs programmable logic arrays
  • such phrasing is intended to encompass the selection of the first listed option (A) only, or the selection of the second listed option (B) only, or the selection of the third listed option (C) only, or the selection of the first and the second listed options (A and B) only, or the selection of the first and third listed options (A and C) only, or the selection of the second and third listed options (B and C) only, or the selection of all three options (A and B and C).
  • This may be extended for as many items listed.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Accounting & Taxation (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Systems and methods for opinion summarization are provided for extracting and counting frequent opinions. The method includes performing a frequency analysis on an inputted list of product reviews for a single item and an inputted corpus of reviews for a product category containing the single item to identify one or more frequent phrases; fine-tuning a pretrained transformer model to produce a trained neural network claim generator model; and generating a trained neural network opposing claim generator model based on the trained neural network claim generator model. The method further includes generating a pair of opposing claims for each of the one or more frequent phrases, wherein a generated positive claim is entailed by the product reviews for the single item and a negative claim refutes the positive claim, and outputting a count of sentences entailing the positive claim and a count of sentences entailing the negative claim.

Description

OPINION SUMMARIZATION TOOL RELATED APPLICATION INFORMATION [0001] This application claims priority to U.S. Patent Application No. 17/716,347, filed on April 8, 2022, U.S. Provisional Patent Application No. 63/173,528, filed on April 12, 2021, both incorporated herein by reference in their entirety. BACKGROUND Technical Field [0002] The present invention relates to extracting and counting frequent opinions and more particularly for extracting and counting frequent opinions within a corpus of customer reviews. Description of the Related Art [0003] Capturing word embeddings from very large corpora increases the value of word embeddings to both unsupervised and semi-supervised NLP tasks. Bidirectional recurrent neural networks such as bidirectional LSTMs have been used to learn internal representations of wider sentential contexts. Context2Vec is a neural network model that embeds entire sentential contexts and target words in the same low- dimensional space. The model can learn a generic task-independent embedding function for variable-length sentential contexts around target words based on a continuous bag of words (CBOW) architecture. [0004] Objective functions define the objective of the optimization. An optimization problem can be stated as: min(Φ(U(x),x)), where Φ is the objective function that depends on the state variables, U, and the design variables, x. The target of an objective function can be minimized or maximized. SUMMARY [0005] According to an aspect of the present invention, a method is provided for extracting and counting frequent opinions. The method includes performing a frequency analysis on an inputted list of product reviews for a single item and an inputted corpus of reviews for a product category containing the single item to identify one or more frequent phrases. The method further includes fine tuning a pretrained transformer model to produce a trained neural network claim generator model, T1, and generating a trained neural network opposing claim generator model based on the trained neural network claim generator model. The method further includes generating a pair of opposing claims for each of the one or more frequent phrases related to the product review using the trained neural network claim generator model and the trained neural network opposing claim generator model, wherein a generated positive claim is entailed by the product reviews for the single item and a negative claim refutes the positive claim, and outputting a count of sentences entailing the positive claim and a count of sentences entailing the negative claim. [0006] According to another aspect of the present invention, a system is provided for opinion summarization. 
The system includes, one or more processors, computer memory, and a display screen in electronic communication with the computer memory and the one or more processors, wherein the computer memory includes a frequency analyzer configured to perform a frequency analysis on an inputted list of product reviews for a single item and an inputted corpus of reviews for a product category containing the single item to identify one or more frequent phrases, a trained neural network claim generator model, a trained neural network opposing claim generator model, wherein the trained neural network claim generator model and the trained neural network opposing claim generator model are configured to generate a pair of opposing claims for each of the one or more frequent phrases related to the product reviews, wherein a positive claim generated by the trained neural network claim generator is entailed by the product reviews for the single item and a negative claim generated by the trained neural network opposing claim generator refutes the positive claim, and an entailment module configured to output a count of sentences entailing the positive claim and a count of sentences entailing the negative claim. [0007] According to yet another aspect of the present invention, non-transitory computer readable storage medium comprising a computer readable program for extracting and counting frequent opinions is provided. The computer readable program when executed on a computer causes the computer to perform the steps of: performing a frequency analysis on an inputted list of product reviews for a single item and an inputted corpus of reviews for a product category containing the single item to identify one or more frequent phrases; fine tuning a pretrained transformer model to produce a trained neural network claim generator model, T1; generating a trained neural network opposing claim generator model based on the trained neural network claim generator model; generating a pair of opposing claims for each of the one or more frequent phrases related to the product review using the trained neural network claim generator model and the trained neural network opposing claim generator model, wherein a generated positive claim is entailed by the product reviews for the single item and a negative claim refutes the positive claim; and outputting a count of sentences entailing the positive claim and a count of sentences entailing the negative claim. [0008] These and other features and advantages will become apparent from the following detailed description of illustrative embodiments thereof, which is to be read in connection with the accompanying drawings. BRIEF DESCRIPTION OF DRAWINGS [0009] The disclosure will provide details in the following description of preferred embodiments with reference to the following figures wherein: [0010] FIG. 1 is a block/flow diagram illustrating a high-level system/method for summarizing opinions from customer reviews, in accordance with an embodiment of the present invention; [0011] FIG. 2 is a block/flow diagram illustrating a system/method for applying a claim generator and opposing claim generator, in accordance with an embodiment of the present invention; [0012] FIG. 3 is a flow diagram illustrating a system/method for training a claim generator, in accordance with an embodiment of the present invention; [0013] FIG. 4 is a flow diagram illustrating a system/method for training an opposing claim generator, in accordance with an embodiment of the present invention; [0014] FIG. 
5 is a computer system for opinion summarization is illustrated, in accordance with an embodiment of the present invention; and [0015] FIG. 6 is a block diagram of exemplary reviews and output, in accordance with an embodiment of the present invention. DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS [0016] In accordance with embodiments of the present invention, systems and methods are provided for extracting and counting frequent opinions within a corpus of customer reviews. Embodiments of the invention relate to summarizing and counting contrasting opinions about aspects of a product or service given by key words or phrases, within a set of customer reviews. [0017] In various embodiments, a list of frequent phrases and sentences containing those phrases is determined by comparing frequencies in a collection of reviews about a single product to overall frequencies of the phrases in the corpus. For each phrase, a first generation module fills in a template with a word to write a sentence using the phrase. A second generation module writes an opposing sentence using the phrase. Sentences in the review corpus that semantically imply the first generated sentence or opposing generated sentence can be counted and extracted using an entailment module. [0018] In various embodiments, statements are compared using an entailment module. This can allow diverse ways of describing an aspect of a product or service compared to keyword matching, and an opinion may be isolated with regard to an aspect, even if other opinions are contained in the same sentence. [0019] Entailment is the classification task of predicting the logical relation between a pair of sentences. Given a pair (p, h) of a premise statement, p, and a hypothesis, h, a distribution is computed over the probabilities that p entails h, that p refutes h, or neither. In various embodiments, the method utilizes a trained entailment module, M, where M(p; h; r) predicts the logarithm of the probability that premise, p, and hypothesis, h, are in relation, r. After training, the parameters of the trained entailment module, M, are frozen for the subsequent steps. [0020] In one or more embodiments, a trained entailment module, M, and a trained language model, T0, are used to train a claim generation module that generates fluent claims that are semantically implied by the review text. For each generated fluent claim, a semantically opposing claim is also generated to capture disagreements among the reviews. [0021] In various embodiments, by using a claim generation module, one can pinpoint opinions about key phrases beyond just “positive” and “negative” sentiment. A pretrained transformer model for masked language modeling, such as Bidirectional Encoder Representations from Transformers (BERT) or XLNet, can be fine-tuned to become the claim generation model, where the fine-tuned claim generation model predicts the sequence of words for the claim. The vocabulary for such a model includes tokens for common words and pieces of words, and several special tokens used in training. Unannotated reviews can be used for training, without any ground truth summaries or ground truth sentiment data. [0022] In various embodiments, the pairs of opposing claims are output to the user, along with counts and/or quotes of the original review statements that support each of the claims. [0023] Referring now in detail to the figures in which like numerals represent the same or similar elements and initially to FIG. 
1, a high-level system/method for summarizing opinions from customer reviews is described, in accordance with an embodiment of the present invention. [0024] In one or more embodiments, an opinion summarization tool 100 can count opinions about a product and provide summaries of the opinions. [0025] At block 110, a list of reviews of a single product can be inputted to the frequency analyzer of the opinion summarization tool 100. [0026] At block 120, a corpus of reviews of a set of products in the same category as the single product can be inputted to the frequency analyzer of the opinion summarization tool 100. [0027] At block 130, the frequency analyzer can perform frequency analysis on both the list of reviews of the single product and the corpus of reviews of the set of products in the same category. The frequency analysis can find frequent phrases for the single product, and identify key words and/or phrases, ki, used in the list of reviews of the single product. The frequency analysis can also find the frequency of the phrases in the corpus. Given a set of reviews of a single product, a set of key words and/or phrases can be selected. The key words may be input directly by a user or determined, for example, by taking the phrases that occur in that product’s reviews more frequently than in the review corpus overall by some factor. [0028] At block 140, a claim generator may output a pair of opposing claims related to the single product, where a pair of opposing claims can be generated for each of the phrases that equal or exceed a threshold frequency. The claim, g, and opposing claim g’ are the output of the claim generator module. [0029] At block 150, entailment can be applied to each claim of the opposing claims and the sentences of the reviews of the single product by an entailment module, M. In various embodiments, the method utilizes a trained entailment module, M, where M(p; h; r) predicts the logarithm of the probability that premise p and h are in relation r. [0030] At block 160, pairs of opposing claims and a count and/or quotation of the sentences in the reviews that supported either of the opposing claims can be outputted to the user. To generate a summary for the user, the entailment module, M, is then applied against every sentence of reviews of the product. A count of sentences, x, such that: [0031] argmax_r M(x; g; r) = entails; [0032] and a count of sentences, x, such that [0033] argmax_r M(x; g’; r) = entails; [0034] is output to the user, along with the text of g and g’. This summarizes the contrasting opinions about key word or phrase, k. [0035] Pairs of opposing claims involving the frequent phrases and a count or quotation of the sentences that supported them can be outputted. [0036] FIG. 2 is a block/flow diagram illustrating a system/method for applying a claim generator and opposing claim generator, in accordance with an embodiment of the present invention. [0037] In one or more embodiments, the claim generator and opposing claim generator 200 can be applied to output a pair of opposing statements about a key phrase. [0038] At block 210, a key word or phrase can be inputted to the claim generator. The key word or phrase, ki, can be selected by a user or determined by the frequency analysis, such as by taking a word or phrase that occurs more frequently in the product's reviews than in the corpus overall. [0039] At block 220, reviews of a product containing the key word or phrase can be inputted to the claim generator.
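One way the frequency analysis of block 130 and the sentence gathering of blocks 210 and 220 could be realized is sketched below in Python. The function names, the unigram-only tokenization, the add-one smoothing, and the naive sentence splitting are illustrative assumptions; the disclosed method may equally operate on multi-word phrases.

    from collections import Counter
    import re

    def frequent_key_words(product_reviews, corpus_reviews, ratio=5.0, min_count=3):
        """Hypothetical key-word selector: keep words whose relative frequency in the
        single product's reviews exceeds their relative frequency in the category
        corpus by at least `ratio`."""
        tokenize = lambda text: re.findall(r"[a-z']+", text.lower())
        product_counts = Counter(t for r in product_reviews for t in tokenize(r))
        corpus_counts = Counter(t for r in corpus_reviews for t in tokenize(r))
        product_total = sum(product_counts.values())
        corpus_total = sum(corpus_counts.values())
        keys = []
        for word, count in product_counts.items():
            if count < min_count:
                continue
            product_freq = count / product_total
            corpus_freq = (corpus_counts.get(word, 0) + 1) / (corpus_total + 1)  # add-one smoothing
            if product_freq / corpus_freq >= ratio:
                keys.append(word)
        return keys

    def sentences_with_keyword(product_reviews, keyword):
        """Gather review sentences mentioning the key word (naive sentence split)."""
        sentences = [s.strip() for r in product_reviews for s in re.split(r"(?<=[.!?])\s+", r)]
        return [s for s in sentences if keyword in s.lower()]

The selected key words and the sentences mentioning them then feed the claim generation steps described next.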
Sentences r(1); : : : ; r(n) containing the key word or phrase are extracted from the product’s reviews. [0040] At block 230, the part of speech or type of word or phrase that has been inputted by the user or selected based on frequency can be determined in order to determine which template to apply. Existing systems, such as Spacy, that can label words of sentences with their part of speech can be used. For each part of speech or phrase type, zero or more summary templates, f, can be defined in block 230 for use in block 240. Each summary template, f, quotes the key word or phrase, ki, or a word or phrase derived from ki, and has zero or one positions which are masked and are to be filled in with a word. In one or more embodiments, frequent phrases for the product are used to fill in the templates. [0041] In one or more embodiments, the templates are defined as follows. If ki is a noun or noun phrase, the template is “The ki is [MASK].” If ki is an adjective, the templates are “It is ki.” and “The [MASK] is ki.” [0042] Additional templates may be defined similarly. The templates may involve different words derived from ki, for instance by changing ki from plural to singular if ki is a noun, or by conjugating ki into past tense if ki is a verb. For templates with zero masked positions, an opposing template can be provided by inserting a negation word, such as “It is not ki” for “It is ki,” because the claim generator and opposing claim generator are not used for such templates. [0043] At block 240, for each key word or phrase, k, a summary template, f, is taken, with a slot for the key word or phrase, and zero or one masked tokens to be filled in. [0044] A sequence of tokens, x, can be constructed by concatenating the classification token, each of the sentences, r(1); : : : ; r(n), the separator token, the template output, fj, and another copy of the separator token. The separator token is a special token that tells the model that different sentences have different purposes. The separator token also keeps sentences from being concatenated or mixed up. An input token is part of an input sequence. In one or more embodiments, the claim generator 200 can be trained 300 to complete a template with fluent, logically implied statements. [0045] At block 250, one masked word in the template can be predicted using a refined claim generator, T1, applied to the input reviews. In this sequence, suppose a mask token appears at position, m, within the template output, fj. Claim generator, T1, outputs a distribution over words at position m. Let w1; : : : ; ws be the s most probable words, and let g1; : : : ; gs be the claims achieved by substituting into the masked position of template output, fj. [0046] At block 260, substitutions of the predicted words from the claim generator into the template can be reranked using entailment against the review sentences that include the key phrase, where entailment decides whether the completed template is logically implied by each of the review sentences. The average log likelihood of a sentence implying the completed template, output by the entailment module, M, is used for reranking. Then take the word w = wi and claim g = gi that maximizes:
[0047] (1/n) Σk=1..n M(r(k); gi; entails)
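A minimal sketch of the reranking of block 260 is given below, with an off-the-shelf natural language inference checkpoint standing in for the trained entailment module M. The checkpoint name and the index of its "entailment" label are assumptions that must be checked against whatever model is actually used.

    import torch
    import torch.nn.functional as F
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    # Assumed stand-in for the entailment module M; label order varies by checkpoint.
    NLI_MODEL = "roberta-large-mnli"
    ENTAILMENT_INDEX = 2  # assumption: index of the "entailment" label for this checkpoint

    tokenizer = AutoTokenizer.from_pretrained(NLI_MODEL)
    nli_model = AutoModelForSequenceClassification.from_pretrained(NLI_MODEL).eval()

    def entail_logprob(premise, hypothesis):
        """Approximates M(premise; hypothesis; entails): log-probability of entailment."""
        inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
        with torch.no_grad():
            logits = nli_model(**inputs).logits[0]
        return F.log_softmax(logits, dim=-1)[ENTAILMENT_INDEX].item()

    def rerank(review_sentences, candidate_claims):
        """Return the candidate claim g_i maximizing (1/n) * sum_k M(r(k); g_i; entails)."""
        def average_support(claim):
            return sum(entail_logprob(r, claim) for r in review_sentences) / len(review_sentences)
        return max(candidate_claims, key=average_support)

For example, rerank(["Really rapid delivery.", "Loved the speedy delivery."], ["The delivery was fast.", "The delivery was red."]) would be expected to favor the first candidate.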
[0048] Given claim g, let y be the sequence of tokens constructed by concatenating the classification token, the sentences, r(i), claim, g, the separator token, the template output f, and another copy of the separator token. [0049] At block 270, one masked word in the template can be predicted using an opposing claim generator, T2, applied to y. In this sequence, a mask token appears at position m’. Opposing claim generator, ^^ ^ ^ outputs a distribution over words at position m’. Let w’1 ; : : : ; w’s be the s most probable words, and let g’1 ; : : : ; g’s be the claims achieved by substituting into the masked position of f. [0050] At block 280, the predicted words from the opposing claim generator, T2, can be reranked using entailment against the reviews, where entailment decides whether the completed template is logically refuted by the statement ranked highest in block 260. The log likelihood of refutation is used for ranking. Take the word w’ = w’i and claim g’ = g’i that maximizes M(g; g’i; refutes). [0051] At block 290, the highest ranked statement, g, from block 260 and the highest ranked opposing statement, g’, from block 280 can be presented to the user. If the top entailment probability from block 260 is too low, the system/method can end without outputting statements. [0052] FIG. 3 is a flow diagram illustrating a system/method for training a claim generator, in accordance with an embodiment of the present invention; [0053] In one or more embodiments, the claim generator model 200 can be trained 300 to complete a template with fluent, logically implied statements. [0054] At block 310, a trained entailment classification model, M, can be inputted to the claim generator training method 350. [0055] At block 320, a pretrained masked language model, T0, can be inputted to the claim generator training method 350. In one or more embodiments, a pretrained transformer model, T, that has been pretrained for masked language modeling, such as BERT, can be fine-tuned. The vocabulary for such a model includes tokens for common words and pieces of words, and several special tokens used in training, including a classification token and a separator token. [0056] In various embodiments, T0, written for the pretrained model can be written before fine-tuning, and T1 can be written for the fine-tuned model. T0 can be used for training, and T1 can be generated from T0 by backpropagation. Given a sequence of tokens, x, at each position, i, T0 outputs a distribution Ti 0 over tokens, v, in a vocabulary, , in which Ti 00(v) predicts: [0057] log $^^^ = %|^'^^ ; [0058] where x−i denotes the sequence where the ith token is masked. [0059] At block 330, a set of templates and rules can be inputted to the claim generator training method 350. [0060] At block 340, a data set of review sentences, where given key phrases occur, can be inputted to the claim generator training method 350. A dataset, (, is provided in which the ith example consists of a keyword or phrase ki and a set of sentences
Figure imgf000014_0001
; : : : ; ^^^ ^ )^, each from a review of the same product and each containing ki. [0061] At block 350, the claim generator training method can receive and implement the inputs. A claim generator model T1 is finetuned from the pretrained masked language model, T0. [0062] At block 360, the claim generator training method 350 can apply one or more templates to a key phrase, kj. Let fj be the output of a summary template on kj. If there is no masked token in fj, this example is not used for training the claim generator model T1. [0063] At block 370, the claim generator training method 350 can construct an input sequence of words. Construct x(j) for blocks 250 and 370 by concatenating the classification token of the vocabulary of T, each of the sentences, rj (k), the separator token of the vocabulary of T, the template output fj, and another copy of the separator token. One position ^ ^
Figure imgf000014_0002
is a masked token, coming from inside fj. The separator token is a special token that tells the model that different sentences have different purposes. The separator token also keeps sentences from being concatenated or mixed up. An input token is part of an input sequence. [0064] At block 380, the claim generator training method 350 can apply a claim generator model, T1, to the input sequence. The claim generator model may be initialized using a copy of the parameters of the pretrained masked language model, T0, and then fine-tune its own parameters during the course of the training 300 described in this figure. [0065] At block 390, the claim generator model can apply a Gumbel softmax function to obtain one or more words for the masked position in the input sequence. In various embodiments, a straight through Gumbel softmax estimator, Gτ, with temperature τ is used to obtain a single word, +^ = , in such a way
Figure imgf000015_0001
that it may backpropagate through Gτ. The Gumbel softmax estimator outputs a single choice. Let gj be the result of substituting wj into the masked position of the template output fj. [0066] The Gumbel-Max offers an efficient way of sampling from the probability distribution over subwords (replacement values for the MASK) output by the model T1 by adding a random variable to the log of the probabilities and taking the argmax. The second step is to replace the argmax with a softmax to make this operation differentiable as well. The softmax here has a temperature parameter τ. Setting τ to 0 makes the distribution identical to the categorical one and the samples are perfectly discrete. For small τ, the gradients have high variance, which is an issue of stochastic neural networks. Therefore, there is a trade-off between variance and bias for Gumbel softmax. [0067] At block 400, an entailment loss can be computed for the claim generator model for the entailment of the completed template against the input sentences and a language modeling loss. In various embodiments, the training method 300 utilizes a trained entailment module, M, where M(p; h; r) predicts the logarithm of the probability that premise p and h are in relation r. [0068] The entailment loss is given by:
[0069] L1 = −(1/nj) Σk M(rj(k), gj, entails),
[0070] Here, rj(k) refers to sentence number k of the reviews that mention keyword kj.
[0071] The entailment loss results from substituting wj into the MASK location. This loss reflects the average log likelihood that each review statement supports the claim gj, according to the entailment module, M, using the entailment/implication relation. A second loss reflects the word log likelihood estimate according to the language model T0:
[0072] L2 = −T0mj(wj).
[0073] The total loss is a linear combination L = λL1 + L2 of these two losses, where λ is a weight applied to L1. The loss is calculated from the teacher model in a teacher-student arrangement. Low log probabilities result in a high loss.
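By way of non-limiting illustration only, the combined training loss of paragraphs [0067]-[0073] might be computed as in the following Python-style sketch. The helpers entail_log_prob (standing in for the entailment module M) and lm_log_probs (a vector of log probabilities output by T0 at the masked position) are hypothetical stand-ins rather than the API of any particular library; in practice, the relaxed word sample would be fed to M through embeddings so that gradients can flow.

import torch

def claim_generator_loss(review_sents, claim_g, word_onehot, lm_log_probs,
                         entail_log_prob, lam=1.0):
    # L1: negative average log likelihood that each review sentence rj(k)
    # entails the completed claim gj, according to the entailment module M.
    entail_terms = [entail_log_prob(r, claim_g, "entails") for r in review_sents]
    L1 = -torch.stack(entail_terms).mean()

    # L2: negative log likelihood of the substituted word wj under the frozen
    # pretrained masked language model T0 at the masked position mj.
    # word_onehot is the (relaxed) one-hot sample from the Gumbel softmax.
    L2 = -(word_onehot * lm_log_probs).sum()

    # Total loss L = λ·L1 + L2, with lam the weight applied to L1.
    return lam * L1 + L2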
[0074] In various embodiments, the losses L, L1, and L2 can be used for backpropagation to refine the pretrained claim generator model. Training proceeds by backpropagation through M, Gτ, and T1, during which the parameters of the entailment module, M, are held fixed. An annealing schedule may be used to lower the temperature τ of the Gumbel softmax.
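A non-limiting sketch of a straight-through Gumbel softmax estimator Gτ, together with a simple exponential annealing schedule for the temperature τ, is shown below. The code assumes PyTorch tensors of masked-position logits and is illustrative only; the function names are introduced for this sketch and do not belong to any particular library.

import math
import torch

def straight_through_gumbel_softmax(logits, tau):
    # Gumbel-Max: perturb the log probabilities with Gumbel noise and take the argmax.
    gumbel = -torch.log(-torch.log(torch.rand_like(logits) + 1e-20) + 1e-20)
    y_soft = torch.softmax((logits + gumbel) / tau, dim=-1)  # differentiable relaxation
    index = y_soft.argmax(dim=-1, keepdim=True)
    y_hard = torch.zeros_like(logits).scatter_(-1, index, 1.0)  # discrete one-hot choice
    # Straight-through estimator: the forward pass uses the hard sample,
    # while gradients flow through the soft sample.
    return y_hard - y_soft.detach() + y_soft

def anneal_tau(step, tau_start=1.0, tau_min=0.1, rate=1e-4):
    # Lower the temperature tau over training steps, trading variance for bias.
    return max(tau_min, tau_start * math.exp(-rate * step))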
[0075] The trained claim generator model, T1, can be provided to the user.
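Returning to the inference-time use of the generators described in connection with blocks 250-290 and paragraphs [0048]-[0051], the following non-limiting sketch illustrates the generate-then-rerank procedure. The helpers topk_fills (returning the s most probable fillers proposed by a generator) and entail_log_prob (standing in for the entailment module M) are hypothetical and introduced only for illustration.

def generate_claim_pair(reviews, template_out, topk_fills, entail_log_prob, s=10):
    # Blocks 250-260: the claim generator T1 proposes the s most probable words
    # for the masked position; the completed claims are reranked by how strongly
    # the review sentences entail them.
    claims = [template_out.replace("[MASK]", w)
              for w in topk_fills("T1", reviews, template_out, k=s)]
    best_claim = max(claims,
                     key=lambda g: sum(entail_log_prob(r, g, "entails") for r in reviews))

    # Blocks 270-280: the opposing generator T2 proposes fillers for the same
    # template; candidates are reranked by the log likelihood that the best
    # claim refutes the completed template.
    opposing = [template_out.replace("[MASK]", w)
                for w in topk_fills("T2", [best_claim], template_out, k=s)]
    best_opposing = max(opposing,
                        key=lambda gp: entail_log_prob(best_claim, gp, "refutes"))

    # Block 290: present the highest ranked claim and its opposing claim.
    return best_claim, best_opposing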
[0076] In a non-limiting exemplary embodiment, for each part of speech or phrase type, one or more summary templates can be defined in block 330 for use in block 360. Each summary template quotes the key word or phrase, ki, or a word or phrase derived from ki, and has zero or one positions which are masked and are to be filled in with a word.
[0077] For example, a template, f, can be: “The [KEYWORD] was [MASK].”
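By way of non-limiting illustration, applying such a template and constructing the model input of blocks 360-370 might look like the following sketch. The tokenize function and the keyword "battery" are hypothetical, and [CLS]/[SEP]/[MASK] denote the BERT-style classification, separator, and mask tokens referred to above.

def apply_template(template, keyword):
    # Block 360: substitute the key phrase into the summary template,
    # leaving the masked position to be filled by the claim generator.
    return template.replace("[KEYWORD]", keyword)

def build_input_tokens(review_sents, template_out, tokenize,
                       cls_token="[CLS]", sep_token="[SEP]"):
    # Block 370 / paragraph [0063]: concatenate the classification token,
    # the review sentences, the separator token, the template output,
    # and a second copy of the separator token.
    tokens = [cls_token]
    for sent in review_sents:
        tokens += tokenize(sent)
    tokens += [sep_token] + tokenize(template_out) + [sep_token]
    return tokens

# Example with a hypothetical keyword:
# apply_template("The [KEYWORD] was [MASK].", "battery") -> "The battery was [MASK]."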
[0078] Given a KEYWORD of “delivery” and reviews of:
[0079] “The delivery was super fast and came in perfect shape.”
[0080] “The delivery came on time.”
[0081] “Really rapid delivery.”
[0082] “Loved the speedy delivery.”
[0083] In block 380, the refined claim generator model, T1, might output: “The delivery was fast,” because it has been trained (in block 400) with a loss from the pretrained language model, T0, which finds that “fast” is a fluent completion of the template, and with a loss from the entailment module, M, which finds that this set of sentences entails “The delivery was fast.”
[0084] The output would then be: “The delivery was fast.” In various embodiments, T0 can be written for the pretrained model before fine-tuning, and T1 can be written for the fine-tuned model. T0 can be used for training, and T1 can be generated from T0 by backpropagation. For an opposing claim generator, a second model T2 can be fine-tuned from T0, as shown in FIG. 4.
[0085] The refined contrary claim generation model, T2, might output: “The delivery is slow,” because “slow” is a fluent completion of the template according to language model T0, and the entailment module, M, would find that “The delivery is fast” refutes “The delivery is slow.”
[0086] FIG. 4 is a flow diagram illustrating a system/method for training an opposing claim generator, in accordance with an embodiment of the present invention.
[0087] In one or more embodiments, the opposing claim generator 200 can be trained 500 to output a best ranked contrary statement from the top predictions from the entailment module. Backpropagation can be used for training the opposing claim generator.
[0088] At block 510, a trained entailment classification model can be inputted to the opposing claim generator training method 550.
[0089] At block 520, a trained masked language model, T0, can be inputted to the opposing claim generator training method 550.
[0090] At block 530, a set of templates and rules can be inputted to the opposing claim generator training method 550.
[0091] At block 540, a data set of review sentences can be inputted to the opposing claim generator training method 550. A dataset, D, is provided in which the ith example consists of a keyword or phrase ki and a set of sentences ri(1), . . . , ri(ni), each from a review of the same product and each containing ki.
[0092] At block 550, the opposing claim generator training method 550 can receive the inputs.
[0093] At block 560, the opposing claim generator can apply one or more templates to a key phrase selected by the user or by frequency analysis.
[0094] At block 565, the claim generator, T1, is applied to the inputted template and the jth example of D to output a sentence gj.
[0095] At block 570, the opposing claim generator can construct an input sequence y(j) by concatenating the classification token, the sentence gj, the separator token, the template fj, and a second copy of the separator token. Exactly one position m'j of y(j) is masked.
[0096] At block 580, the opposing claim generator can apply an opposing claim generator model, T2, to the input sequence. The opposing claim generator model may be initialized using a copy of the parameters of the trained masked language model T0 and then fine-tune its own parameters during the course of the training described in this figure.
[0097] At block 590, the opposing claim generator can apply a Gumbel softmax function to obtain one or more words for the masked position. A single word, w'j = Gτ(T2m'j(y(j))), is obtained from the straight-through Gumbel softmax estimator Gτ applied to T2 at position m'j.
[0098] Let g'j be the result of substituting w'j into the masked position of the template output fj.
[0099] The refined contrary language model, T2, might output: “The delivery is slow,” because “slow” is a fluent completion of the template according to language model T0, and the entailment module, M, would find that “The delivery is fast” refutes “The delivery is slow.”
[0100] At block 600, the opposing claim generator can compute a loss for the refutation of the completed template by the output of the claim generator model, and a language modeling loss. Define the loss:
[0101] L3 = −M(gj, g'j, refutes);
[0102] In various embodiments, the loss(es) can be used for backpropagation to refine the opposing claim generator model.
[0104] L3 = −M(gj, g'j, refutes),
[0105] reflecting the log likelihood that g'j contradicts gj. As before, define a second loss using the language model T0:
[0106] L4 = −T0m'j(w'j).
[0107] The total loss for training T2 is the linear combination L = λL3 + L4 of these two losses. The loss(es) can be used for backpropagation through M, Gτ, and T2 with respect to y(j), during which the parameters of M are held fixed and the temperature τ may be annealed. A non-limiting illustrative sketch of this training objective is provided following paragraph [0112] below.
[0108] FIG. 5 illustrates a computer system for opinion summarization, in accordance with an embodiment of the present invention.
[0109] In one or more embodiments, the computer system for opinion summarization 700 can include one or more processors 710, which can be central processing units (CPUs), graphics processing units (GPUs), and combinations thereof, and a computer memory 720 in electronic communication with the one or more processors 710, where the computer memory 720 can be random access memory (RAM), solid state drives (SSDs), hard disk drives (HDDs), optical disk drives (ODDs), etc. The memory 720 can be configured to store the opinion summarization tool 100, including a trained claim generator model 750, a trained opposing claim generator model 760, a trained entailment model 770, and a review corpus 780. The trained claim generator model 750 can be a neural network configured to generate claims utilizing one or more templates. The opposing claim generator model 760 can be a neural network configured to generate opposing claims utilizing one or more templates. The entailment model 770 can be configured to calculate entailment and entailment loss for each of the claims generated by the claim generator model 750 or opposing claims generated by the opposing claim generator model 760. A display module can be configured to present an ordered list of the claims and opposing claims to a user as a summary of the reviews. The memory 720 and one or more processors 710 can be in electronic communication with a display screen 730 over a system bus and I/O controllers, where the display screen 730 can present the ranked list of claims.
[0110] FIG. 6 is a block diagram of exemplary reviews and output, in accordance with an embodiment of the present invention.
[0111] In one or more embodiments, a list of reviews of a single product 110 can be fed into the system/method. The list of reviews of a single product 110 can include a number of different statements regarding the particular product (or service) provided by customers of the product (or service). A corpus of reviews of similar products (or services) 120 can be fed into the system/method. This corpus of reviews can include a number of different statements regarding products (or services) that are similar to the product or service being reviewed. The trained neural network claim generator model 750 can be a neural network configured to generate claims utilizing one or more templates. A claim generated by the claim generator model 750 that summarizes the input claims can be output 140. An opposing claim generated by the opposing claim generator model can also be output to form a pair of opposing claims.
[0112] Embodiments described herein may be entirely hardware, entirely software, or include both hardware and software elements. In a preferred embodiment, the present invention is implemented in software, which includes but is not limited to firmware, resident software, microcode, etc.
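Referring back to blocks 580-600 and paragraphs [0100]-[0107], the opposing claim generator's training objective might be sketched, in a non-limiting illustrative way, as follows. As before, entail_log_prob stands in for the entailment module M and lm_log_probs for the frozen language model T0 evaluated at the masked position; neither name belongs to any particular library.

def opposing_generator_loss(claim_g, opposing_claim, word_onehot, lm_log_probs,
                            entail_log_prob, lam=1.0):
    # L3: negative log likelihood, under the entailment module M, that the
    # generated claim gj refutes the opposing completion g'j.
    L3 = -entail_log_prob(claim_g, opposing_claim, "refutes")

    # L4: negative log likelihood of the substituted word w'j under the frozen
    # pretrained language model T0 at the masked position m'j.
    # word_onehot is the (relaxed) one-hot sample from the Gumbel softmax.
    L4 = -(word_onehot * lm_log_probs).sum()

    # Total loss for training T2 is the linear combination λ·L3 + L4;
    # backpropagation flows through M, Gτ, and T2 while M's parameters stay fixed.
    return lam * L3 + L4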
[0113] Embodiments may include a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system. A computer- usable or computer readable medium may include any apparatus that stores, communicates, propagates, or transports the program for use by or in connection with the instruction execution system, apparatus, or device. The medium can be magnetic, optical, electronic, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium. The medium may include a computer-readable storage medium such as a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk, etc. [0114] Each computer program may be tangibly stored in a machine-readable storage media or device (e.g., program memory or magnetic disk) readable by a general or special purpose programmable computer, for configuring and controlling operation of a computer when the storage media or device is read by the computer to perform the procedures described herein. The inventive system may also be considered to be embodied in a computer-readable storage medium, configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner to perform the functions described herein. [0115] A data processing system suitable for storing and/or executing program code may include at least one processor coupled directly or indirectly to memory elements through a system bus. The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code to reduce the number of times code is retrieved from bulk storage during execution. Input/output or I/O devices (including but not limited to keyboards, displays, pointing devices, etc.) may be coupled to the system either directly or through intervening I/O controllers. [0116] Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks. Modems, cable modem and Ethernet cards are just a few of the currently available types of network adapters. [0117] As employed herein, the term “hardware processor subsystem” or “hardware processor” can refer to a processor, memory, software or combinations thereof that cooperate to perform one or more specific tasks. In useful embodiments, the hardware processor subsystem can include one or more data processing elements (e.g., logic circuits, processing circuits, instruction execution devices, etc.). The one or more data processing elements can be included in a central processing unit, a graphics processing unit, and/or a separate processor- or computing element-based controller (e.g., logic gates, etc.). The hardware processor subsystem can include one or more on-board memories (e.g., caches, dedicated memory arrays, read only memory, etc.). In some embodiments, the hardware processor subsystem can include one or more memories that can be on or off board or that can be dedicated for use by the hardware processor subsystem (e.g., ROM, RAM, basic input/output system (BIOS), etc.). 
[0118] In some embodiments, the hardware processor subsystem can include and execute one or more software elements. The one or more software elements can include an operating system and/or one or more applications and/or specific code to achieve a specified result. [0119] In other embodiments, the hardware processor subsystem can include dedicated, specialized circuitry that performs one or more electronic processing functions to achieve a specified result. Such circuitry can include one or more application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), and/or programmable logic arrays (PLAs). [0120] These and other variations of a hardware processor subsystem are also contemplated in accordance with embodiments of the present invention. [0121] Reference in the specification to “one embodiment” or “an embodiment” of the present invention, as well as other variations thereof, means that a particular feature, structure, characteristic, and so forth described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, the appearances of the phrase “in one embodiment” or “in an embodiment”, as well any other variations, appearing in various places throughout the specification are not necessarily all referring to the same embodiment. However, it is to be appreciated that features of one or more embodiments can be combined given the teachings of the present invention provided herein. [0122] It is to be appreciated that the use of any of the following “/”, “and/or”, and “at least one of”, for example, in the cases of “A/B”, “A and/or B” and “at least one of A and B”, is intended to encompass the selection of the first listed option (A) only, or the selection of the second listed option (B) only, or the selection of both options (A and B). As a further example, in the cases of “A, B, and/or C” and “at least one of A, B, and C”, such phrasing is intended to encompass the selection of the first listed option (A) only, or the selection of the second listed option (B) only, or the selection of the third listed option (C) only, or the selection of the first and the second listed options (A and B) only, or the selection of the first and third listed options (A and C) only, or the selection of the second and third listed options (B and C) only, or the selection of all three options (A and B and C). This may be extended for as many items listed. [0123] The foregoing is to be understood as being in every respect illustrative and exemplary, but not restrictive, and the scope of the invention disclosed herein is not to be determined from the Detailed Description, but rather from the claims as interpreted according to the full breadth permitted by the patent laws. It is to be understood that the embodiments shown and described herein are only illustrative of the present invention and that those skilled in the art may implement various modifications without departing from the scope and spirit of the invention. Those skilled in the art could implement various other feature combinations without departing from the scope and spirit of the invention. Having thus described aspects of the invention, with the details and particularity required by the patent laws, what is claimed and desired protected by Letters Patent is set forth in the appended claims.

Claims

WHAT IS CLAIMED IS: 1. A method (100) for extracting and counting frequent opinions, comprising: performing a frequency analysis (130) on an inputted list of product reviews (110) for a single item and an inputted corpus of reviews (120) for a product category containing the single item to identify one or more frequent phrases; fine tuning (350) a pretrained transformer model to produce a trained neural network claim generator model, T1; generating (550) a trained neural network opposing claim generator model based on the trained neural network claim generator model; generating a pair of opposing claims (140) for each of the one or more frequent phrases related to the product review using the trained neural network claim generator model and the trained neural network opposing claim generator model, wherein a generated positive claim is entailed (150) by the product reviews for the single item and a negative claim refutes the positive claim; and outputting a count of sentences (160) entailing the positive claim and a count of sentences entailing the negative claim.
2. The method as recited in claim 1, wherein the trained neural network claim generator model, T1, is trained utilizing a loss measuring whether the generated text is entailed by the product reviews.
3. The method as recited in claim 2, wherein the pretrained transformer model is a Bidirectional Encoder Representations from Transformers (BERT).
4. The method as recited in claim 2, wherein the trained neural network claim generator model, T1, is trained utilizing a loss measuring whether the generated text is fluent according to the pre-trained language model.
5. The method as recited in claim 2, wherein the positive claim is generated by the substitution of words from the trained neural network claim generator model, T1, into a template.
6. The method as recited in claim 5, wherein the substitution words are predicted by a Gumbel softmax function, and the words are reranked using entailment against the review sentences.
7. A computer system (700) for opinion summarization, comprising: one or more processors (710) ; computer memory (720); and a display screen (730) in electronic communication with the computer memory (720) and the one or more processors (710); wherein the computer memory (720) includes a frequency analyzer (790) configured to perform a frequency analysis (130) on an inputted list of product reviews (110) for a single item and an inputted corpus (780) of reviews (120) for a product category containing the single item to identify one or more frequent phrases; a trained neural network claim generator model (750); a trained neural network opposing claim generator model (760), wherein the trained neural network claim generator model (750) and the trained neural network opposing claim generator model (760) are configured to generate a pair of opposing claims (140) for each of the one or more frequent phrases related to the product reviews, wherein a positive claim generated by the trained neural network claim generator is entailed (150) by the product reviews (120) for the single item and a negative claim generated by the trained neural network opposing claim generator (760) refutes the positive claim, and an entailment module (770) configured to output a count of sentences (160) entailing the positive claim and a count of sentences entailing the negative claim.
8. The computer system as recited in claim 7, wherein the positive claim is generated by the claim generator based on a first fine-tuned pretrained transformer model, T1.
9. The computer system as recited in claim 8, wherein the first pretrained transformer model is a Bidirectional Encoder Representations from Transformers (BERT).
10. The computer system as recited in claim 8, wherein the negative claim is generated by the opposing claim generator based on a second fine-tuned pretrained transformer model, T2.
11. The computer system as recited in claim 10, wherein the second pretrained transformer model is a BERT.
12. The computer system as recited in claim 8, wherein the claim generator model is configured to generate the positive claim by the substitution of words into a template.
13. The computer system as recited in claim 12, wherein the substitution words are predicted by a Gumbel softmax function, and the words are reranked using entailment against the review sentences.
14. A non-transitory computer readable storage medium comprising a computer readable program for extracting and counting frequent opinions, wherein the computer readable program when executed on a computer causes the computer to perform the steps of: performing a frequency analysis (130) on an inputted list of product reviews (110) for a single item and an inputted corpus of reviews (120) for a product category containing the single item to identify one or more frequent phrases; fine tuning (350) a pretrained transformer model to produce a trained neural network claim generator model, T1; generating (550) a trained neural network opposing claim generator model based on the trained neural network claim generator model; generating a pair of opposing claims (140) for each of the one or more frequent phrases related to the product review using the trained neural network claim generator model and the trained neural network opposing claim generator model, wherein a generated positive claim is entailed (150) by the product reviews for the single item and a negative claim refutes the positive claim; and outputting a count of sentences (160) entailing the positive claim and a count of sentences entailing the negative claim.
15. The non-transitory computer readable storage medium comprising a computer readable program, as recited in claim 14, wherein the trained neural network claim generator model, T1, is trained utilizing a loss measuring whether the generated text is entailed by the product reviews.
16. The non-transitory computer readable storage medium comprising a computer readable program, as recited in claim 15, wherein the pretrained transformer model is a Bidirectional Encoder Representations from Transformers (BERT).
17. The non-transitory computer readable storage medium comprising a computer readable program, as recited in claim 15, wherein the trained neural network claim generator model, T1, is trained utilizing a loss measuring whether the generated text is fluent according to the pre-trained language model.
18. The non-transitory computer readable storage medium comprising a computer readable program, as recited in claim 17, wherein the positive claim is generated by the substitution of words from the trained neural network claim generator model, T1, into a template.
19. The non-transitory computer readable storage medium comprising a computer readable program, as recited in claim 18, wherein the substitution words are predicted by a Gumbel softmax function, and the words are reranked using entailment against the review sentences.
PCT/US2022/024244 2021-04-12 2022-04-11 Opinion summarization tool WO2022221184A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202163173528P 2021-04-12 2021-04-12
US63/173,528 2021-04-12
US17/716,347 2022-04-08
US17/716,347 US20220327586A1 (en) 2021-04-12 2022-04-08 Opinion summarization tool

Publications (1)

Publication Number Publication Date
WO2022221184A1 true WO2022221184A1 (en) 2022-10-20

Family

ID=83509363

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2022/024244 WO2022221184A1 (en) 2021-04-12 2022-04-11 Opinion summarization tool

Country Status (2)

Country Link
US (1) US20220327586A1 (en)
WO (1) WO2022221184A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190213498A1 (en) * 2014-04-02 2019-07-11 Brighterion, Inc. Artificial intelligence for context classifier
KR20190101156A (en) * 2018-02-22 2019-08-30 삼성전자주식회사 Electric apparatus and method for control thereof
US20200019611A1 (en) * 2018-07-12 2020-01-16 Samsung Electronics Co., Ltd. Topic models with sentiment priors based on distributed representations
US10789430B2 (en) * 2018-11-19 2020-09-29 Genesys Telecommunications Laboratories, Inc. Method and system for sentiment analysis
CN112395417A (en) * 2020-11-18 2021-02-23 长沙学院 Network public opinion evolution simulation method and system based on deep learning

Also Published As

Publication number Publication date
US20220327586A1 (en) 2022-10-13

Similar Documents

Publication Publication Date Title
US11803751B2 (en) Training text summarization neural networks with an extracted segments prediction objective
Bojanowski et al. Alternative structures for character-level RNNs
US9659248B1 (en) Machine learning and training a computer-implemented neural network to retrieve semantically equivalent questions using hybrid in-memory representations
US9046932B2 (en) System and method for inputting text into electronic devices based on text and text category predictions
US9672476B1 (en) Contextual text adaptation
US11755909B2 (en) Method of and system for training machine learning algorithm to generate text summary
WO2020140073A1 (en) Neural architecture search through a graph search space
WO2018215404A1 (en) Feedforward generative neural networks
US11074412B1 (en) Machine learning classification system
US10915707B2 (en) Word replaceability through word vectors
Khalil et al. Niletmrg at semeval-2016 task 5: Deep convolutional neural networks for aspect category and sentiment extraction
WO2021195095A1 (en) Neural architecture search with weight sharing
US20220383119A1 (en) Granular neural network architecture search over low-level primitives
US20230205994A1 (en) Performing machine learning tasks using instruction-tuned neural networks
US10929453B2 (en) Verifying textual claims with a document corpus
US20220327586A1 (en) Opinion summarization tool
WO2023192674A1 (en) Attention neural networks with parallel attention and feed-forward layers
US20240013769A1 (en) Vocabulary selection for text processing tasks using power indices
US11687723B2 (en) Natural language processing with missing tokens in a corpus
Singh et al. Robustness tests of nlp machine learning models: Search and semantically replace
Bai et al. A public Chinese dataset for language model adaptation
US20240185065A1 (en) Training text summarization neural networks with an extracted segments prediction objective
US20240078379A1 (en) Attention neural networks with n-grammer layers
US11314725B2 (en) Integrated review and revision of digital content
US20240078431A1 (en) Prompt-based sequential learning

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22788714

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22788714

Country of ref document: EP

Kind code of ref document: A1